Merged
36 commits
69ea588
wip: gh actions for docs
tschuelia Dec 29, 2024
bdb8913
wrong branch name...
tschuelia Dec 29, 2024
a912ba0
wrong docs version name
tschuelia Dec 29, 2024
83c51ac
please work, I'm hungry
tschuelia Dec 29, 2024
f16bd52
wip: docs for v1
tschuelia Dec 29, 2024
4e9e755
reset docs to current state
tschuelia Dec 29, 2024
68f8b82
more docs
tschuelia Dec 29, 2024
2d7cc87
add version flag
tschuelia Dec 29, 2024
29b9124
wip
tschuelia Feb 25, 2025
b0217a1
add licence-files
tschuelia Feb 25, 2025
32e41ec
switch to pyproject.toml
tschuelia Feb 25, 2025
0ade8a3
fix python version
tschuelia Feb 25, 2025
dd28a31
add repo link to docs
tschuelia Feb 27, 2025
c21ff90
update pre-commit; test coverage
tschuelia Feb 27, 2025
29b7cd6
wip: refactoring
tschuelia Feb 27, 2025
402df36
more test cases: reduced MSAs
tschuelia Feb 27, 2025
a850fb4
fix test data
tschuelia Feb 27, 2025
b268635
better parametrize test cases
tschuelia Feb 28, 2025
06eb6b9
correctly setup ruff pre-commit; some test refactoring
tschuelia Feb 28, 2025
c57ccaf
wip: tests
tschuelia Feb 28, 2025
db7bde1
more error handling + test coverage
tschuelia Feb 28, 2025
8b1bd59
handle none prefix
tschuelia Feb 28, 2025
2e19061
wip: fix gh actions
tschuelia Feb 28, 2025
f25caf3
don't extract unused abs rfdist
tschuelia Mar 3, 2025
d41e15e
separate test data for macOS and linux (RAxML-NG yields slightly diff…
tschuelia Mar 3, 2025
670c8f1
rename parse function
tschuelia Mar 3, 2025
ced7e59
update gitignore
tschuelia Mar 4, 2025
287a238
fix doc references
tschuelia Mar 4, 2025
90b6e9f
reintroduce shap command line flag (runtime impact)
tschuelia Mar 4, 2025
a0fbb02
update usage docs
tschuelia Mar 5, 2025
0b96375
refactoring
tschuelia Mar 5, 2025
20e115c
refactoring
tschuelia Mar 5, 2025
72e05ee
refactoring
tschuelia Mar 5, 2025
be52c49
update docs
tschuelia Mar 5, 2025
c99aaf7
update docs
tschuelia Mar 5, 2025
6d8fe77
rename package and adjust version retrieval for pypi
tschuelia Mar 5, 2025
30 changes: 30 additions & 0 deletions .github/workflows/dev-docs.yml
@@ -0,0 +1,30 @@
name: Dev Docs

on:
  push:
    branches:
      - dev

jobs:
  dev-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v4
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mkdocs-material mkdocstrings-python mike markdown-callouts
      - run: mike deploy --push dev
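
Note on the cache key: the workflow derives `cache_id` from `date --utc '+%V'`, which is the ISO 8601 week number, so the key changes once per week and `restore-keys` can fall back to an older `mkdocs-material-` cache in between. The following is a minimal Python sketch of the same key derivation, for illustration only; the helper name `weekly_cache_key` is hypothetical and not part of this PR.

```python
from datetime import datetime, timezone


def weekly_cache_key(now: datetime, prefix: str = "mkdocs-material-") -> str:
    """Build a cache key that rotates once per ISO week, like `date --utc '+%V'`."""
    # isocalendar() returns (ISO year, ISO week, ISO weekday); %V is the
    # zero-padded ISO week number, 01 through 53.
    iso_week = now.astimezone(timezone.utc).isocalendar()[1]
    return f"{prefix}{iso_week:02d}"


if __name__ == "__main__":
    # Prints the key the workflow would compute for the current week.
    print(weekly_cache_key(datetime.now(timezone.utc)))
```

Because the key carries only the week number and not the year, the key for a given week is identical from one year to the next.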
28 changes: 0 additions & 28 deletions .github/workflows/docs.yml

This file was deleted.

34 changes: 34 additions & 0 deletions .github/workflows/release-docs.yml
@@ -0,0 +1,34 @@
name: Release Docs

on:
  release

jobs:
  dev-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - name: Set release notes tag
        run: |
          export RELEASE_TAG_VERSION=${{ github.event.release.tag_name }}
          echo "RELEASE_TAG_VERSION=${RELEASE_TAG_VERSION:1}" >> $GITHUB_ENV
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v4
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mkdocs-material mkdocstrings-python mike markdown-callouts
      - run: |
          mike deploy --push --update-aliases ${RELEASE_TAG_VERSION} latest
          mike set-default --push latest
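
Note on the tag handling: the `Set release notes tag` step uses the Bash substring expansion `${RELEASE_TAG_VERSION:1}`, which drops the first character of the tag unconditionally. That presumably relies on every release tag starting with `v` (an assumption about this repository's tagging scheme); a tag such as `2.0.0` would become `.0.0`. The Python sketch below contrasts the two behaviors. Both function names are hypothetical and only illustrate the difference.

```python
def strip_first_char(tag: str) -> str:
    """Mirror Bash `${tag:1}`: drop the first character, whatever it is."""
    return tag[1:]


def strip_v_prefix(tag: str) -> str:
    """Drop a leading 'v' only if present (str.removeprefix needs Python 3.9+)."""
    return tag.removeprefix("v")


if __name__ == "__main__":
    for tag in ("v2.0.0", "2.0.0"):
        print(tag, "->", strip_first_char(tag), "|", strip_v_prefix(tag))
```

If tags without the `v` prefix are ever expected, a prefix-aware variant would be the safer choice; otherwise the existing one-liner is sufficient.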
2 changes: 1 addition & 1 deletion .github/workflows/test-pythia.yml
@@ -30,7 +30,7 @@ jobs:
           cat tests/test_config.py
       - name: Run Pythia tests
         run: |
-          PYTHONPATH=. pytest
+          PYTHONPATH=. pytest -svx --color=yes

   Install-using-conda:
     runs-on: ${{ matrix.os }}
18 changes: 17 additions & 1 deletion .gitignore
@@ -1,6 +1,22 @@
**/*.reduced.*
**/*.log
**/*.csv
**/*.shap.pdf
**/*.pythia.trees

**/*.iqtree.ckp.gz
**/*.iqtree.iqtree
**/*.iqtree.trees
**/*.iqtree.treefile

**/*.raxml.bestModel
**/*.raxml.bestTree
**/*.raxml.bestTreeCollapsed
**/*.raxml.mlTrees
**/*.raxml.plausibleTrees
**/*.raxml.rba
**/*.raxml.startTree
**/*.rfDistances


# Byte-compiled / optimized / DLL files
__pycache__/
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -7,10 +7,10 @@ repos:
       - id: trailing-whitespace
       - id: detect-private-key
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.7.4
+    rev: v0.9.7
     hooks:
       # Run the linter.
       - id: ruff
-        args: [--select, I, --fix ]
+        args: [--fix, --exit-non-zero-on-fix]
       # Run the formatter.
       - id: ruff-format
2 changes: 2 additions & 0 deletions docs/api/config.md
@@ -3,8 +3,10 @@

 options:
 show_root_heading: true
+modernize_annotations: true

 ::: pypythia.config.DEFAULT_RAXMLNG_EXE

 options:
 show_root_heading: true
+modernize_annotations: true
2 changes: 2 additions & 0 deletions docs/api/custom_errors.md
@@ -5,10 +5,12 @@
 show_root_heading: true
 merge_init_into_class: false
 group_by_category: true
+modernize_annotations: true

 ::: pypythia.custom_errors.RAxMLNGError

 options:
 show_root_heading: true
 merge_init_into_class: false
 group_by_category: true
+modernize_annotations: true
2 changes: 2 additions & 0 deletions docs/api/custom_types.md
@@ -5,10 +5,12 @@
 show_root_heading: true
 merge_init_into_class: false
 group_by_category: true
+modernize_annotations: true

 ::: pypythia.custom_types.FileFormat

 options:
 show_root_heading: true
 merge_init_into_class: false
 group_by_category: true
+modernize_annotations: true
6 changes: 5 additions & 1 deletion docs/api/msa.md
@@ -5,18 +5,22 @@
 show_root_heading: true
 merge_init_into_class: false
 group_by_category: true
+modernize_annotations: true

-::: pypythia.msa.parse
+::: pypythia.msa.parse_msa

 options:
 show_root_heading: true
+modernize_annotations: true

 ::: pypythia.msa.remove_full_gap_sequences

 options:
 show_root_heading: true
+modernize_annotations: true

 ::: pypythia.msa.deduplicate_sequences

 options:
 show_root_heading: true
+modernize_annotations: true
6 changes: 4 additions & 2 deletions docs/api/prediction.md
@@ -1,10 +1,12 @@

-::: pypythia.prediction.predict_difficulty
+::: pypythia.prediction.collect_features

 options:
 show_root_heading: true
+modernize_annotations: true

-::: pypythia.prediction.collect_features
+::: pypythia.prediction.predict_difficulty

 options:
 show_root_heading: true
+modernize_annotations: true
1 change: 1 addition & 0 deletions docs/api/predictor.md
@@ -5,3 +5,4 @@
 show_root_heading: true
 merge_init_into_class: false
 group_by_category: true
+modernize_annotations: true
8 changes: 8 additions & 0 deletions docs/api/raxmlng.md
@@ -5,8 +5,16 @@
 show_root_heading: true
 merge_init_into_class: false
 group_by_category: true
+modernize_annotations: true

 ::: pypythia.raxmlng.run_raxmlng_command

 options:
 show_root_heading: true
+modernize_annotations: true
+
+::: pypythia.raxmlng.get_raxmlng_rfdist_results
+
+options:
+show_root_heading: true
+modernize_annotations: true
36 changes: 0 additions & 36 deletions docs/conf.py

This file was deleted.

54 changes: 27 additions & 27 deletions docs/index.md
@@ -1,43 +1,43 @@
# Home
Welcome to the PyPythia documentation. Pythia is a lightweight python library to predict the difficulty of Multiple Sequence Alignments (MSAs).

Welcome to the PyPythia documentation.


### C Library

The same functionality is also available as C library [here](https://github.com/tschuelia/difficulty_prediction).
Since the C library depends on [Coraxlib](https://codeberg.org/Exelixis-Lab/coraxlib) it is not as easy and fast to use as this python library.
If you are only interested in the difficulty of your MSA, we recommend using this Python library.
If you want to incorporate the difficulty prediction in a phylogenetic tool, we recommend using the faster C library.

Pythia is a lightweight Python library to predict the difficulty of Multiple Sequence Alignments (MSAs). Phylogenetic
analyses under the Maximum-Likelihood (ML) model are time and resource intensive. To adequately capture the vastness of
tree space, one needs to infer multiple independent trees. On some datasets, multiple tree inferences converge to
similar tree topologies, on others to multiple, topologically highly distinct yet statistically indistinguishable
topologies. Pythia predicts the degree of difficulty of analyzing a dataset prior to initiating ML-based tree
inferences. Predicting the difficulty using Pythia is substantially faster than inferring multiple ML trees using
RAxML-NG. Pythia can be used to increase user awareness with respect to the amount of signal and uncertainty to be
expected in phylogenetic analyses, and hence inform an appropriate (post-)analysis setup. Further, it can be used to
select appropriate search algorithms for easy-, intermediate-, and hard-to-analyze datasets. Pythia supports DNA, AA,
and morphological data in Phylip and FASTA format.

### Support

If you encounter any trouble using Pythia, have a question, or you find a bug, please feel free to open an issue here.

If you encounter any trouble using Pythia, have a question, or you find a bug, please feel free to open an
issue [here](https://github.com/tschuelia/PyPythia/issues).

### Publication

The paper explaining the details of Pythia is published in MBE:
Haag, J., Höhler, D., Bettisworth, B., & Stamatakis, A. (2022). **From Easy to Hopeless - Predicting the Difficulty of Phylogenetic Analyses.** *Molecular Biology and Evolution*, 39(12). [https://doi.org/10.1093/molbev/msac254](https://doi.org/10.1093/molbev/msac254)
Haag, J., Höhler, D., Bettisworth, B., & Stamatakis, A. (2022). **From Easy to Hopeless - Predicting the Difficulty of
Phylogenetic Analyses.** *Molecular Biology and Evolution*, 39(12). [https://doi.org/10.1093/molbev/msac254](https://doi.org/10.1093/molbev/msac254)

> [!WARNING]
> Since this publication, we made some considerable changes to Pythia.
> The most important change is that we switched from using a Random Forest Regressor to using a LightGBM Gradient Boosted Tree Regressor.
> This affects all Pythia versions >= 1. If you use Pythia in your work, please state the correct learning algorithm. If you are unsure, feel free to reach out to me 🙂
> The most important change is that we switched from using a Random Forest Regressor to using a LightGBM Gradient
> Boosted Tree Regressor.
> This affects all Pythia versions >= 1. If you use Pythia in your work, please state the correct learning algorithm. If
> you are unsure, feel free to reach out to me 🙂
>
> There will soon be a new pre-print that explains the changes in detail, stay tuned!


### CPythia

### References

* A. M. Kozlov, D. Darriba, T. Flouri, B. Morel, and A. Stamatakis (2019)
**RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference**
*Bioinformatics*, 35(21): 4453–4455.
[https://doi.org/10.1093/bioinformatics/btz305](https://doi.org/10.1093/bioinformatics/btz305)

* D. Höhler, W. Pfeiffer, V. Ioannidis, H. Stockinger, A. Stamatakis (2022)
**RAxML Grove: an empirical phylogenetic tree database**
*Bioinformatics*, 38(6):1741–1742.
[https://doi.org/10.1093/bioinformatics/btab863](https://doi.org/10.1093/bioinformatics/btab863)

For full documentation visit [mkdocs.org](https://www.mkdocs.org).
The same functionality is also available as a C library [here](https://github.com/tschuelia/difficulty_prediction).
Since the C library depends on [Coraxlib](https://codeberg.org/Exelixis-Lab/coraxlib), it is not as easy and fast to use
as this Python library.
If you are only interested in the difficulty of your MSA, we recommend using this Python library.
If you want to incorporate the difficulty prediction in a phylogenetic tool, we recommend using the faster C library.