Contribution: Pairwise genetic distances using empirical protein substitution model #5

@GavinHuttley

Category

Phylogenetic analysis

Project

cogent3

Description

How to calculate a pairwise distance matrix using an empirical protein substitution model with piqtree.

Code

# code from this piqtree issue https://github.com/iqtree/piqtree/issues/392
import cogent3 as c3

aln = c3.get_dataset("brca1")

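# piqtree needs at least 3 sequences to fit a tree, so select an outgroup
outgroup = "Wombat"
# To build the full matrix, loop over every pair of names other than the outgroup.
# itertools.combinations yields tuples, so convert each pair to a list before adding the outgroup:
#   for ingroups in map(list, itertools.combinations([n for n in aln.names if n != outgroup], 2)): ...
# The steps below are the body of that loop, shown for a single pair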
ingroups = ["Human", "Mouse"]
saln = aln.take_seqs(ingroups + [outgroup])
tree = c3.make_tree(tip_names=saln.names) # tree has no lengths, it's just the topology
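# NOTE: "GTR" is a nucleotide model, which matches the DNA brca1 alignment used here.
# For a protein alignment, pass the name of an empirical amino acid model instead
# (for example "LG", assuming your piqtree version accepts that name).
app = c3.get_app("piq_fit_tree", model="GTR", tree=tree)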
result = app(saln)
dmat = result.tip_to_tip_distances()
h_m_dist = dmat["Human", "Mouse"]
# store the value in a dict under both orderings of the pair, e.g.
# dists[tuple(ingroups)] = h_m_dist
# dists[tuple(reversed(ingroups))] = h_m_dist
# then repeat for every remaining pair
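
The loop described in the comments above could be sketched roughly as follows. This is only an illustration of the bookkeeping, not code from the linked piqtree issue: `pairwise_distances`, `pair_distance` and `toy_distance` are hypothetical names, and `toy_distance` is a stand-in that returns made-up numbers so the sketch runs without piqtree installed. In real use, `pair_distance` would wrap the `take_seqs`, `make_tree`, `piq_fit_tree` and `tip_to_tip_distances` steps shown above.

```python
import itertools


def pairwise_distances(names, outgroup, pair_distance):
    """Collect a symmetric dict of distances for every ingroup pair.

    pair_distance is a hypothetical callback taking (name_a, name_b, outgroup)
    and returning the estimated distance between name_a and name_b.
    """
    ingroup_names = [n for n in names if n != outgroup]
    dists = {}
    for a, b in itertools.combinations(ingroup_names, 2):
        d = pair_distance(a, b, outgroup)
        # store under both orderings so lookups work in either direction
        dists[a, b] = d
        dists[b, a] = d
    return dists


def toy_distance(a, b, outgroup):
    # placeholder only: NOT a genetic distance, just a deterministic number
    return abs(len(a) - len(b)) / 10


dists = pairwise_distances(
    ["Human", "Mouse", "Rat", "Wombat"], outgroup="Wombat", pair_distance=toy_distance
)
```

Keying the dict by name tuples in both orders keeps the result symmetric and leaves the outgroup out of the final matrix, since it is only needed to give each fit three sequences.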
