Auto-NCBI

Scripts to automatically retrieve and process information from NCBI.

ncbi_seq_retrieve

Retrieve NCBI sequences, together with host and organism taxonomy information, from a list of tax_ids.

Usage

To retrieve GenBank information for nucleotide sequences:

python ncbi_seq_retrieve.py -in file_with_access_ids.txt -db nucleotide -ot gb

To retrieve the records in XML format instead, add the parameter -tf xml.

To retrieve CDS translated to amino acids from nucleotide sequences:

python ncbi_seq_retrieve.py -in file_with_access_ids.txt -db nucleotide -ot fasta_cds_aa

To retrieve untranslated CDS instead, replace fasta_cds_aa with fasta_cds_na.

To retrieve nucleotide or amino acid sequences:

python ncbi_seq_retrieve.py -in file_with_access_ids.txt -db (nucleotide or protein) -ot fasta

To retrieve the sequences in XML format instead, add the parameter -tf xml.
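For orientation, the -ot and -tf values shown above coincide with the rettype and retmode values accepted by the NCBI E-utilities efetch endpoint. The sketch below only builds such a request URL and sends nothing. Whether ncbi_seq_retrieve.py calls efetch in this way is an assumption, build_efetch_url is a hypothetical helper that is not part of the script, and the two accession IDs are just examples.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(ids, db="nucleotide", rettype="gb", retmode="text"):
    """Build an efetch URL for a batch of accession IDs (hypothetical helper)."""
    params = {
        "db": db,              # assumed to correspond to -db
        "id": ",".join(ids),   # efetch accepts a comma-separated ID list
        "rettype": rettype,    # assumed to correspond to -ot
        "retmode": retmode,    # assumed to correspond to -tf
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


# Example: translated CDS for two nucleotide accessions.
url = build_efetch_url(["NC_045512.2", "MN908947.3"], rettype="fasta_cds_aa")
print(url)
```

Switching retmode to "xml" in this sketch would mirror adding -tf xml on the command line.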

To retrieve taxonomy information for NCBI accession IDs:

python ncbi_seq_retrieve.py -in file_with_access_ids.txt -db (nucleotide or protein) -ot gb -tx True

To retrieve taxonomy information for the hosts of NCBI accession IDs (ideal for viruses):

python ncbi_seq_retrieve.py -in file_with_access_ids.txt -db (nucleotide or protein) -ot gb -tx True -th True

Some considerations

A file of IDs from nucleotide sequences cannot be used with the protein database, and vice versa. The help function prints a table showing which text formats are allowed for each output type and which output types are allowed for each database.

split_by_tax

Sample a FASTA file based on taxonomy and virus name. Each sequence header should follow the pattern <ncbi-access>|<tax>|<sequence name>[<virus name>]. For example, YP_010037467.1|Alphacoronavirus|polyprotein 1ab [Alphacoronavirus sp.] can be used to sample by genus.
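To check that headers follow this pattern before running the script, something like the following sketch may help. It is not taken from split_by_tax.py: parse_header and the field names access, tax, name and virus are hypothetical, chosen only for illustration.

```python
import re

# Pattern: <ncbi-access>|<tax>|<sequence name>[<virus name>]
# The leading ">" of a FASTA header line is optional here.
HEADER_RE = re.compile(
    r"^>?(?P<access>[^|]+)\|(?P<tax>[^|]+)\|(?P<name>.*?)\s*\[(?P<virus>[^\]]+)\]\s*$"
)


def parse_header(header):
    """Split a header into its four fields; raise ValueError if malformed."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"header does not follow the expected pattern: {header!r}")
    return match.groupdict()


fields = parse_header(
    ">YP_010037467.1|Alphacoronavirus|polyprotein 1ab [Alphacoronavirus sp.]"
)
print(fields["tax"], "/", fields["virus"])
```

The virus name is taken from the last bracketed group, so a sequence name that itself contains brackets is still handled.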

Usage

python split_by_tax.py input.fasta output_directory seed

With test data:

python split_by_tax.py test_data/ncbi_virus.fa test_out 123
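The seed argument makes the sampling repeatable. The sketch below illustrates the idea under a loud assumption: it keeps one header per (tax, virus name) pair, which may not be the selection rule split_by_tax.py actually applies. sample_one_per_virus and the three example headers are made up for illustration.

```python
import random
from collections import defaultdict


def sample_one_per_virus(headers, seed):
    """Pick one header per (tax, virus name) pair, reproducibly for a given seed."""
    groups = defaultdict(list)
    for header in headers:
        _access, tax, rest = header.lstrip(">").split("|", 2)
        # The virus name is the text inside the last pair of brackets.
        virus = rest[rest.rindex("[") + 1 : rest.rindex("]")]
        groups[(tax, virus)].append(header)
    rng = random.Random(seed)  # private generator, so the draw is repeatable
    # Sort the keys so the draw order does not depend on input order.
    return {key: rng.choice(groups[key]) for key in sorted(groups)}


# Made-up headers in the expected pattern.
headers = [
    "A1|GenusX|protein a [Virus one]",
    "A2|GenusX|protein b [Virus one]",
    "A3|GenusY|protein c [Virus two]",
]
picked = sample_one_per_virus(headers, seed=123)
print(len(picked), "groups sampled")
```

Running the function twice with the same seed returns the same selection, which is the property a fixed seed such as 123 is meant to provide.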
