This repository contains the following Snakemake pipelines and scripts, to be run in this order:
1. ds_detection
2. augment_transcriptome
3. de_analysis
Download the respective gene annotations and genome files for the species of interest and place them into a location of your choice:
- GFF3 gene annotation file (e.g. gencode.vM29.primary_assembly.annotation.gff3). Unzip the file using the gunzip tool.
- GTF gene annotation file (e.g. gencode.vM29.primary_assembly.annotation.gtf.gz).
- FASTA genome file (e.g. GRCm39.primary_assembly.genome.fa.gz).

You may obtain the annotation files from Gencode (mouse).
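
As a rough illustration of the preparation step above, the following Python sketch decompresses a downloaded GFF3 file and reports which expected reference files are still missing from a directory. The helper names `gunzip_keep_original` and `missing_reference_files` are hypothetical and not part of this repository. The sketch also assumes the GFF3 was downloaded in gzipped form; running the gunzip tool directly is equivalent.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_keep_original(gz_path):
    """Decompress a .gz file next to the original and return the new path.

    Hypothetical helper: behaves like `gunzip -k`, keeping the .gz file.
    """
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path.name}")
    # with_suffix("") drops only the trailing ".gz"
    out_path = gz_path.with_suffix("")
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def missing_reference_files(ref_dir, expected_names):
    """Return the expected file names that are not present in ref_dir."""
    ref_dir = Path(ref_dir)
    return [name for name in expected_names if not (ref_dir / name).is_file()]
```

For example, `gunzip_keep_original("refs/gencode.vM29.primary_assembly.annotation.gff3.gz")` would write the unzipped GFF3 alongside the archive, where `refs/` stands in for whatever location you chose.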
git clone https://github.com/ys-lim/SpliCeAT.git
The pipelines expect RNA-seq alignments/BAM files to be labelled as sample_Aligned.sortedByCoord.out.bam (as in STAR output format). Nevertheless, modifications can be made (at the user’s discretion) in the Snakemake rules to account for alignments generated by other tools (e.g. HISAT2).
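
The sketch below illustrates the naming convention described above. It is an alternative to editing the Snakemake rules, not the approach this repository prescribes: it symlinks an existing BAM file under the STAR-style name so the file name matches what the pipelines look for. The function names and the idea of a separate link directory are assumptions made for illustration. The sketch only changes the file name. It does not sort or index the alignments, so the input is assumed to be coordinate-sorted already.

```python
from pathlib import Path

# Suffix STAR appends to coordinate-sorted BAM output
STAR_SUFFIX = "_Aligned.sortedByCoord.out.bam"


def expected_bam_name(sample):
    """Build the file name the pipelines expect for a given sample."""
    return f"{sample}{STAR_SUFFIX}"


def sample_from_bam(path):
    """Return the sample name if the file follows STAR naming, else None."""
    name = Path(path).name
    if name.endswith(STAR_SUFFIX) and len(name) > len(STAR_SUFFIX):
        return name[: -len(STAR_SUFFIX)]
    return None


def link_as_star_name(bam_path, sample, out_dir):
    """Symlink an existing BAM into out_dir under the STAR-style name.

    Hypothetical helper for alignments produced by other tools
    (e.g. HISAT2); the original file is left untouched.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    link = out_dir / expected_bam_name(sample)
    if not link.exists():
        link.symlink_to(Path(bam_path).resolve())
    return link
```

Symlinking leaves the original alignments where they are, which makes the renaming easy to undo. If a pipeline step also needs a BAM index, that file would have to be linked or regenerated separately.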