Description
NCBI sometimes merges old taxIDs into new ones. When a sequence header carries one of the old taxIDs, TreeSAPP cannot match it to a lineage and erroneously assigns the sequence to Root.
For example, in the SoxZ package:
1525715.IX54_08960 is read as having the taxID 1525715;
however, 1525715 has been merged into 1545044.
TreeSAPP classifies this sequence as Root.
However, the taxonomy should be:
Bacteria; Pseudomonadota; Alphaproteobacteria; Rhodobacterales; Paracoccaceae; Paracoccus; Paracoccus sanguinis
https://www.ncbi.nlm.nih.gov/protein/694216822
When the protein accession is listed without a taxID prefix, the issue does not occur. It seems to mainly affect sequences that originate from eggNOG.
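One possible workaround, sketched below, is to translate old taxIDs through the `merged.dmp` file from the NCBI taxdump archive (each line has the form `old_taxid | new_taxid |`) before looking up the lineage. This is not how TreeSAPP currently handles headers; the function names `load_merged`, `taxid_from_header`, and `resolve_taxid` are hypothetical, and the header pattern assumes the eggNOG-style `<taxID>.<locus_tag>` layout shown above.

```python
import re
from typing import Dict, Iterable, Optional


def load_merged(lines: Iterable[str]) -> Dict[str, str]:
    """Parse merged.dmp lines ('old\t|\tnew\t|') into {old_taxid: new_taxid}."""
    merged = {}
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) >= 2 and fields[0] and fields[1]:
            merged[fields[0]] = fields[1]
    return merged


def taxid_from_header(header: str) -> Optional[str]:
    """Return the numeric prefix of a '<taxID>.<locus_tag>' header, else None."""
    match = re.match(r"^>?(\d+)\.", header)
    return match.group(1) if match else None


def resolve_taxid(taxid: str, merged: Dict[str, str]) -> str:
    """Follow merges until a taxID that has not been merged is reached."""
    seen = set()
    while taxid in merged and taxid not in seen:
        seen.add(taxid)  # guard against an unexpected cycle
        taxid = merged[taxid]
    return taxid


# Example using the merge reported in this issue (assumed merged.dmp line).
merged = load_merged(["1525715\t|\t1545044\t|\n"])
old_taxid = taxid_from_header("1525715.IX54_08960")
current_taxid = resolve_taxid(old_taxid, merged)
```

With the real file, `load_merged(open("merged.dmp"))` would build the same mapping, and `current_taxid` could then be used for the lineage lookup instead of the taxID parsed from the header.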
- TreeSAPP Version [e.g. 0.11.4]