The software should be able to read phylogenetic output files produced by ARETE. These will comprise a reference tree and a set of (typically thousands of) gene trees that contain subsets of the leaves in the reference tree. Leaf labels produced by the pipeline should be consistent between the reference and gene trees.
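The label-consistency requirement can be checked before any tree comparison is attempted. The following is a minimal sketch, assuming plain unquoted Newick strings without bracketed comments; the function names `leaf_labels` and `unexpected_leaves` are hypothetical and are not part of ARETE or rSPR.

```python
import re

# A leaf label directly follows "(" or ","; internal labels and support
# values follow ")" and are therefore not captured by this pattern.
# Assumption: labels are unquoted and contain no "(", ")", ",", ":" or ";".
_LEAF_PATTERN = re.compile(r"[(,]\s*([^(),:;\s]+)")


def leaf_labels(newick: str) -> set:
    """Return the set of leaf labels found in one Newick string."""
    return set(_LEAF_PATTERN.findall(newick))


def unexpected_leaves(reference_newick: str, gene_newick: str) -> set:
    """Return gene-tree leaf labels that are absent from the reference tree."""
    return leaf_labels(gene_newick) - leaf_labels(reference_newick)
```

An empty result from `unexpected_leaves` means the gene tree contains only leaves that are present in the reference tree; any label it returns points to an inconsistency that should be reported before the trees are compared.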
Output trees will have been inferred using IQ-TREE or (more frequently) FastTree.
- Trees will have support values; in FastTree output these are scaled from 0 to 1.
- They will generally not be rooted. The user may specify an outgroup when they invoke the pipeline, but in general this will be impractical as outgroups may be complicated and uncertain. Midpoint rooting is acceptable as a default if rooting is necessary.
- Multifurcations and zero-length branches are possible (and probable), so the software should expect and handle these.
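The properties listed above can be surveyed with a small amount of code before deciding what conversions are needed. This is a rough sketch, assuming well-formed, unquoted Newick input without comments; the `Node` class and the `parse_newick` and `summarize` functions are hypothetical helpers written for illustration, and an established tree library would be a reasonable substitute.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str = ""                  # leaf name, or support value on internal nodes
    length: Optional[float] = None   # branch length to the parent, if present
    children: List["Node"] = field(default_factory=list)


def parse_newick(text: str) -> Node:
    """Parse one well-formed, unquoted Newick string into nested Nodes."""
    text = text.strip()
    pos = 0

    def read_until(stops: str) -> str:
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in stops:
            pos += 1
        return text[start:pos].strip()

    def parse_node() -> Node:
        nonlocal pos
        node = Node()
        if text[pos] == "(":
            while text[pos] != ")":
                pos += 1             # skip "(" or ","
                node.children.append(parse_node())
            pos += 1                 # skip ")"
        node.label = read_until("(),:;")
        if pos < len(text) and text[pos] == ":":
            pos += 1
            node.length = float(read_until("(),;"))
        return node

    return parse_node()


def summarize(root: Node, eps: float = 0.0) -> dict:
    """Count multifurcations and zero-length branches; collect support values."""
    stats = {"multifurcations": 0, "zero_length": 0, "supports": []}
    stack = [(root, True)]
    while stack:
        node, is_root = stack.pop()
        if node.length is not None and node.length <= eps:
            stats["zero_length"] += 1
        if node.children:
            # An unrooted tree is usually written with three children at the top,
            # so only more than three counts as a multifurcation there.
            if len(node.children) > (3 if is_root else 2):
                stats["multifurcations"] += 1
            if node.label and not is_root:
                try:
                    stats["supports"].append(float(node.label))
                except ValueError:
                    pass             # internal label that is not a plain number
        stack.extend((child, False) for child in node.children)
    return stats
```

For example, `summarize(parse_newick(tree))["supports"]` can be checked with `all(0 <= s <= 1 for s in supports)` to confirm the 0 to 1 scale. The `eps` parameter is an assumption added so that very short branches can optionally be treated as zero-length.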
My recommendation is to build a wrapper script (probably in Python) that performs the necessary conversions and invokes rSPR, to modify the rSPR code itself so that it performs some of these tasks internally, or to combine the two approaches.
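The wrapper approach might look something like the sketch below. It assumes that the external program reads Newick trees, one per line, from standard input and writes its results to standard output; the command is passed in by the caller, and no rSPR options are assumed here, so the rSPR documentation should be consulted for the actual invocation. The function name `run_tree_tool` is hypothetical.

```python
import subprocess
from typing import Sequence


def run_tree_tool(cmd: Sequence[str],
                  newick_trees: Sequence[str],
                  timeout: float = 60.0) -> str:
    """Pipe Newick trees (one per line) to an external command; return its stdout.

    Assumption: the command reads trees from standard input. The command and
    any options are supplied by the caller; none are hard-coded here.
    """
    payload = "\n".join(tree.strip() for tree in newick_trees) + "\n"
    result = subprocess.run(
        list(cmd),
        input=payload,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,   # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Hypothetical usage, assuming an executable named "rspr" is on the PATH:
#   output = run_tree_tool(["rspr"], [reference_newick, gene_newick])
```

Keeping the command as a parameter lets the same wrapper be exercised with a stand-in program during development, and leaves room for the conversion steps (rooting, rescaling support values, and so on) to run before the trees are passed on.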