Demiulge is a tool for deconvolving the strain abundances of arbitrary species in mixed samples (e.g. wastewater). It is the successor to NextVpower.
"Deconvolution" here means "demixing": estimating the relative abundance of the lineages, genotypes, or phylogenetic groups of a given taxon within a single sample.
What Demiulge can demix:
- Human Adenovirus (HAdV)
- Human Astrovirus (HAstV)
- Norovirus
- Rotavirus
- Pathogens in Nextclade, using barcodes extracted by BarcodeExtracter.py
- ...
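To make the idea of demixing concrete, here is a minimal sketch under stated assumptions; it is not Demiulge's actual solver. It assumes each lineage is described by a 0/1 "barcode" over a shared list of mutation sites, and that the sample provides an observed mutation frequency per site. The function names (`project_to_simplex`, `demix`), the lineage labels and the numbers are all made up for illustration. Demiulge lists cvxpy among its dependencies; this sketch uses plain projected gradient descent instead so that it runs without any third-party package.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:  # holds for j = 1..rho, so the last hit is rho
            theta = t
    return [max(v_i - theta, 0.0) for v_i in v]


def demix(barcodes, observed, n_iter=2000):
    """Estimate lineage abundances x that minimise
    0.5 * sum_s (sum_l x[l] * barcodes[l][s] - observed[s])**2
    subject to x >= 0 and sum(x) == 1 (hypothetical helper)."""
    lineages = list(barcodes)
    n_sites = len(observed)
    # Safe step size: 1 / (squared Frobenius norm) <= 1 / Lipschitz constant.
    frobenius_sq = sum(b * b for l in lineages for b in barcodes[l])
    step = 1.0 / (frobenius_sq or 1.0)
    x = [1.0 / len(lineages)] * len(lineages)
    for _ in range(n_iter):
        predicted = [
            sum(x[k] * barcodes[l][s] for k, l in enumerate(lineages))
            for s in range(n_sites)
        ]
        residual = [p - o for p, o in zip(predicted, observed)]
        gradient = [
            sum(barcodes[l][s] * residual[s] for s in range(n_sites))
            for l in lineages
        ]
        x = project_to_simplex([x_k - step * g_k for x_k, g_k in zip(x, gradient)])
    return dict(zip(lineages, x))


# Made-up barcodes: rows are lineages, columns are four mutation sites.
barcodes = {
    "A": [1, 1, 0, 0],
    "B": [0, 1, 1, 0],
    "C": [0, 0, 1, 1],
}
# Site frequencies of a synthetic sample mixed as 70% A and 30% B.
observed = [0.7, 1.0, 0.3, 0.0]

abundances = demix(barcodes, observed)
print({lineage: round(value, 3) for lineage, value in abundances.items()})
```

The two constraints (non-negative abundances that sum to one) are what separate demixing from an ordinary least-squares fit; a convex solver such as cvxpy can express the same problem more directly.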
Using BarcodeBuilder.py, you can build barcodes locally for the lineages and species you are interested in from sequences you provide. You can then demix your high-throughput targeted sequencing samples with Demiulge.
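As a rough illustration of what a lineage barcode can look like, the sketch below turns per-sequence mutation sets into a 0/1 lineage-by-site table. This is an assumption about the general idea only: the input format, the `min_fraction` threshold, the function name `build_barcodes` and the mutation labels are hypothetical and are not taken from BarcodeBuilder.py, whose real inputs and outputs are documented in its own help text.

```python
from collections import Counter


def build_barcodes(lineage_sequences, min_fraction=0.9):
    """Build a 0/1 barcode per lineage (hypothetical helper).

    lineage_sequences maps a lineage name to a list of mutation sets,
    one set per sequence assigned to that lineage. A mutation enters the
    lineage's barcode if at least min_fraction of its sequences carry it.
    """
    defining = {}
    for lineage, mutation_sets in lineage_sequences.items():
        counts = Counter(m for mutations in mutation_sets for m in mutations)
        defining[lineage] = {
            m for m, c in counts.items() if c / len(mutation_sets) >= min_fraction
        }
    # Shared, sorted column order across all lineages.
    sites = sorted(set().union(*defining.values()))
    table = {
        lineage: [1 if site in mutations else 0 for site in sites]
        for lineage, mutations in defining.items()
    }
    return sites, table


# Made-up input: two lineages with two sequences each.
lineage_sequences = {
    "L1": [{"A100G", "C200T"}, {"A100G", "C200T", "G300A"}],
    "L2": [{"A100G", "T400C"}, {"A100G", "T400C"}],
}

sites, table = build_barcodes(lineage_sequences)
print(sites)
for lineage, row in table.items():
    print(lineage, row)
```

In this toy input, `G300A` appears in only one of the two `L1` sequences, so it falls below the 0.9 threshold and is left out of the barcode.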
For demixing, in Python 3:
- numpy
- pandas
- cvxpy
For building phylogenetic trees:
For processing NGS data:
- Demix from an input sample mutation table file and save the result to demix_result_example.tsv:
python Demiulge.py -i PP_raw_example.tsv -o demix_result_example.tsv
- Demix from the input *.vcf files in a folder, filtering out mutation sites with a mutation rate lower than 0.1 or a depth lower than 10, and save the result to demix_result_vcf_example.tsv:
python Demiulge.py -i vcf_example -r 0.1 -d 10 -o demix_result_vcf_example.tsv
- Set the barcode filter criteria: filter out lineages with fewer than 0.02 of total "key" mutation sites (default 0.01), and retain "key" mutation sites present in more than 0.1 of total lineages (default 0.05):
python Demiulge.py -i PP_raw_example.tsv -n 0.02 -k 0.1 -o demix_result_example_300_30.tsv
- Add annotations to the sample mutation table according to the variation annotation table, then demix:
python Demiulge.py -i vcf_example -o demix_result_vcf_example.tsv --ann_vcsample Annotated_PP_raw_example.tsv
- Demix from the input *.vcf files in a folder, save the result to demix_result_vcf_example.tsv, and save intermediate data to files:
python Demiulge.py -i vcf_example -o demix_result_vcf_example.tsv --vcsample PP_raw_example.tsv --fbarcode MMFF_example.tsv --fsample PPFF_example.tsv --potentials potential_sites_example.tsv
For detailed usage, run python Demiulge.py -h or see the source code.
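The mutation-rate and depth filter from the VCF example above (-r 0.1 -d 10) can be pictured with the following sketch. It assumes the VCF records have already been parsed into (mutation, depth, alternate-read count) tuples, because the fields that carry depth and allele counts differ between variant callers. The function name `filter_sites`, its parameter names and the records are hypothetical; this is not Demiulge's internal code.

```python
def filter_sites(records, min_rate=0.1, min_depth=10):
    """Keep mutation sites with enough coverage and a high enough
    mutation rate (hypothetical helper).

    records is an iterable of (mutation, depth, alt_reads) tuples.
    Returns {mutation: mutation_rate} for the sites that pass.
    """
    kept = {}
    for mutation, depth, alt_reads in records:
        if depth == 0 or depth < min_depth:
            continue  # too little coverage to trust the site
        rate = alt_reads / depth
        if rate < min_rate:
            continue  # mutation rate below the threshold
        kept[mutation] = rate
    return kept


# Made-up records: (mutation, total depth, reads supporting the mutation).
records = [
    ("A100G", 250, 200),  # rate 0.80, passes both filters
    ("C200T", 8, 8),      # depth 8 is below 10, dropped
    ("G300A", 500, 10),   # rate 0.02 is below 0.1, dropped
    ("T400C", 40, 6),     # rate 0.15, passes both filters
]

kept = filter_sites(records, min_rate=0.1, min_depth=10)
print(kept)
```

Filtering on depth before computing the rate avoids dividing by zero for uncovered sites and keeps low-coverage sites from contributing noisy frequencies.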
An example pipeline for demixing HAdV-F is shown in the analysis pipeline.
This project has not been published yet, but you can still try it on amplicon sequencing data of your target species.
This project is licensed under the MIT License - see the LICENSE file for details.