Installation

External software

BWA-mem v0.7.12
GMAP 2014-12-28
samtools v0.1.19
BBT v3.0.0b (required) (if TAP is run in targeted mode)

*tested versions indicated, may not be the most recent version. For BBT, v3.0.0b is required.

External files

To run PAVFinder, the following reference sequence and annotation files are required:

genome FASTA and index files

For example, to prepare hg19 files from UCSC:

wget ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr*.fa.gz
zcat chr*.fa.gz > hg19.fa && rm chr*.fa.gz
samtools faidx hg19.fa
bwa index hg19.fa
gmap_build -D . -d hg19 hg19.fa

annotation VCF and index files

For example, to prepare Refseq genes from UCSC:

wget http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/refGene.txt.gz
zcat refGene.txt.gz | cut -f2- | ~rchiu/bin/genePredToGtf file stdin refGene.gtf
sort -k1,1 -k4,4n refGene.gtf > refGene.sorted.gtf
bgzip refGene.sorted.gtf
tabix -p gff refGene.sorted.gtf.gz

transcriptome FASTA and index files
To create a reference transcriptome FASTA, PAVFinder provides a utility for making it (available under 'bin' for pip virtualenv install)

extract_transcript_sequence.py <gtf> <fasta_output> <genome_fasta> --index

A transcript FASTA and a corresponding BWA index will be genearated

If running TAP, after installation of PVT(below) full paths of the above reference and annotation files can be specified in /path/to/virtualenv/config/tap.cfg which can be passed to TAP via --params

genome_index = /real/path/to/gmapdb hg19
transcripts_fasta = /real/path/to/transcripts.fa
genome_fasta = /real/path/to/genome.fa
gtf = /real/path/to/gtf
suppl_annot = /real/path/to/supplementary_gtf (optional)

Installing the Python PVT package

pip install virtualenv
virtualenv <DIR>
source <DIR>/bin/activate
pip install -U cython
pip install git+https://github.com/BirolLab/pavfinder_transcriptome.git#egg=pavfinder_transcriptome

After successful installation, the following will be available in different sub-directories under the virtualenv directory:

Python scripts for detecting structural (find_sv.py) and splice (map_splice.py) variants, and the TAP pipeline script (tap.py) will be in "bin"
A template config file (tap.cfg) for running TAP will be in "config"
Sample data will be in "test"

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Installation

External software

External files

Installing the Python PVT package

FilesExpand file tree

INSTALL.md

Latest commit

History

INSTALL.md

File metadata and controls

Installation

External software

External files

Installing the Python PVT package