Joint Single Cell Profile of RNA and Histone Marks To Reveal Epigenetic and Transcriptional Dynamics (TrES-Seq)
Code repository related to analysing TrES-Seq data
The analysis consists of two main parts. First, the launcher scripts in ProcessingScripts take FASTQ reads, align them, and process the output into usable matrices and fragment files. These files can also be downloaded directly from GEO under GSE324511. They are then used as input for the scripts in FiguresScripts, which reproduce the figures shown in the publication.
A Nextflow pipeline that automates the workflow from raw FASTQ files through to processed matrices is available here: TrESFlow
Raw files are available on GEO under accession number GSE324511.
Processed files are available as supplementary files in the same GEO record: scRNA-seq matrices in 10x mtx format (RNA, produced with STARSolo) and fragment files (H3K27ac/H3K27me3, produced with SnapATAC2).
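As a quick sanity check on a downloaded fragment file, the sketch below counts fragments per cell barcode using only the Python standard library. It assumes the common tab-separated layout (chromosome, start, end, barcode, count) with optional `#` comment lines; the function name `count_fragments_per_barcode` is illustrative and is not part of this repository.

```python
import gzip
from collections import Counter


def count_fragments_per_barcode(path):
    """Count fragment rows per barcode in a plain or gzipped fragment file.

    Assumes tab-separated columns: chrom, start, end, barcode, count.
    Each row is counted once, regardless of its duplicate count column.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and comment/header lines.
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            counts[fields[3]] += 1  # column 4 holds the cell barcode
    return counts
```

Calling `count_fragments_per_barcode(path).most_common(10)` lists the ten barcodes with the most fragments, which is a simple way to confirm the file was downloaded and parsed as expected.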
All analyses were run using the environment defined in env. To recreate that environment, run:
conda env create -f env/environment.yaml
To preprocess the raw FASTQ files, you will need to install Codon and Seq in addition to this environment. To install them, follow the instructions in the links.
To run a script, first open it and edit the paths to its input files and, for preprocessing scripts only, to the accompanying scripts it requires.
Before starting the preprocessing, use the scripts in the PeprocessingScripts/GenerateSTARSoloGenome folder to create the STAR genomes that will be used during processing. Then, to process a specific dataset, use Launch.sh inside the corresponding folder in ProcessingScripts. The attached whitelists (WLs) are required to demultiplex the reads. You can then use PostProcessing.py to obtain the relevant matrices from the aligned reads created in the previous step.
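To illustrate why the whitelists are needed, the sketch below shows the general idea of whitelist-based barcode filtering. This is not the repository's implementation, which relies on the Codon/Seq scripts. It assumes one barcode per line in the whitelist file and uses exact matching only, whereas real demultiplexing tools often tolerate mismatches. The function names are hypothetical.

```python
def load_whitelist(path):
    """Read a whitelist file (assumed: one barcode per line) into a set."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def split_by_whitelist(barcodes, whitelist):
    """Separate observed barcodes into those found in the whitelist and the rest.

    Exact matching only; a simplification for illustration.
    """
    kept, rejected = [], []
    for barcode in barcodes:
        if barcode in whitelist:
            kept.append(barcode)
        else:
            rejected.append(barcode)
    return kept, rejected
```

Reads whose barcode falls in the rejected group cannot be assigned to a known cell or sample, which is why preprocessing cannot run without the whitelist files.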