Jenkins-based automation of phenotype semantic similarity on PHENIO with Semsimian.
This repository provides two equivalent ways to run the phenotype comparison pipeline:
- Jenkinsfile: CI pipeline that runs comparisons and publishes results to Zenodo.
- run_pipeline.py: local runner intended to mirror the Jenkins behavior.
Each run produces three comparison tarballs:
- HP_vs_HP_semsimian_phenio.tar.gz
- HP_vs_MP_semsimian_phenio.tar.gz
- HP_vs_ZP_semsimian_phenio.tar.gz
Each tarball includes the similarity TSV, a YAML log file, and the information-content file used.
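To check what a given tarball contains, the following minimal sketch lists its regular files using Python's standard `tarfile` module. The helper name `list_tarball_members` is hypothetical and not part of this repository, and the file names inside the archive are not specified here, so the function simply reports whatever it finds.

```python
import tarfile


def list_tarball_members(tarball_path):
    """Return the sorted names of regular files in a gzip-compressed tarball."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        # Skip directories and links; keep only regular files.
        return sorted(m.name for m in tar.getmembers() if m.isfile())


if __name__ == "__main__":
    # Assumes the tarball is in the current directory.
    for name in list_tarball_members("HP_vs_HP_semsimian_phenio.tar.gz"):
        print(name)
```

A downloaded archive should show three entries: the similarity TSV, the YAML log, and the information-content file.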
The results of this process are available on Zenodo:
- DOI: https://doi.org/10.5281/zenodo.18474575
- Latest record example: https://zenodo.org/records/18474576
Both the Jenkins pipeline and run_pipeline.py publish a new Zenodo version on each run:
- Create a new version draft from an existing record.
- Remove any files inherited from the previous version.
- Upload the new tarballs to the draft bucket.
- Publish the draft.
The version name is set to the run date in YYYY-MM-DD format by default (for example 2025-07-24).
No credentials are stored in this repository. Zenodo credentials are provided at runtime.
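The publishing steps above can be sketched in Python as follows. This is not the code used by the Jenkinsfile or run_pipeline.py; it is an illustration that assumes Zenodo's deposit REST API (the `newversion` and `publish` actions, the per-deposition `files` endpoint, and the draft's `bucket` link). The function name `publish_new_version` and the `session` parameter are hypothetical; `session` is expected to behave like a `requests.Session`.

```python
import datetime
import os


def publish_new_version(session, base_url, record_id, token, tarballs, version=None):
    """Create, fill, and publish a new Zenodo version; return the response JSON."""
    headers = {"Authorization": f"Bearer {token}"}
    if version is None:
        # Default version name: the run date in YYYY-MM-DD format (UTC assumed).
        version = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d")

    # 1. Create a new version draft from the existing record.
    resp = session.post(
        f"{base_url}/deposit/depositions/{record_id}/actions/newversion",
        headers=headers,
    )
    resp.raise_for_status()
    draft_url = resp.json()["links"]["latest_draft"]

    resp = session.get(draft_url, headers=headers)
    resp.raise_for_status()
    draft = resp.json()

    # 2. Remove any files inherited from the previous version.
    for inherited in draft.get("files", []):
        resp = session.delete(f"{draft_url}/files/{inherited['id']}", headers=headers)
        resp.raise_for_status()

    # 3. Upload the new tarballs to the draft bucket.
    bucket_url = draft["links"]["bucket"]
    for path in tarballs:
        with open(path, "rb") as handle:
            resp = session.put(
                f"{bucket_url}/{os.path.basename(path)}", data=handle, headers=headers
            )
            resp.raise_for_status()

    # Set the version name on the draft, keeping the inherited metadata.
    metadata = dict(draft["metadata"], version=version)
    resp = session.put(draft_url, json={"metadata": metadata}, headers=headers)
    resp.raise_for_status()

    # 4. Publish the draft.
    resp = session.post(f"{draft_url}/actions/publish", headers=headers)
    resp.raise_for_status()
    return resp.json()
```

A call might look like `publish_new_version(requests.Session(), "https://zenodo.org/api", "18474576", os.environ["ZENODO_TOKEN"], tarballs)`, which reads the token from the environment at runtime so that it never has to be written to disk.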
Run the Jenkins pipeline with these build parameters:
- ZENODO_TOKEN (required): Zenodo API token. This can be obtained from the "Applications" section of your Zenodo account menu.
- ZENODO_RECORD_ID (required, default 18474576): Zenodo record ID to version.
The pipeline uses today’s date, in UTC as reported by the Jenkins host, as the version name. You can change
the record ID per run by overriding ZENODO_RECORD_ID.
The Python script mirrors the Jenkins pipeline and supports the same Zenodo publishing flow.
Run all comparisons with default settings:

    python3 run_pipeline.py

Run all comparisons and publish to Zenodo:

    python3 run_pipeline.py \
        --zenodo-record-id 18474576 \
        --zenodo-token "$ZENODO_TOKEN"

Run a single comparison:

    python3 run_pipeline.py --comparison hp-hp
    python3 run_pipeline.py --comparison hp-mp
    python3 run_pipeline.py --comparison hp-zp

Use a different working directory:

    python3 run_pipeline.py --working-dir /path/to/workdir

Use a local PHENIO database:

    python3 run_pipeline.py --custom-phenio /path/to/phenio.db

Download data but skip comparisons:

    python3 run_pipeline.py --test-mode

Publish with an explicit version name and API base URL:

    python3 run_pipeline.py \
        --zenodo-record-id 18474576 \
        --zenodo-token "$ZENODO_TOKEN" \
        --zenodo-version 2025-07-24 \
        --zenodo-base-url https://zenodo.org/api

Common options for run_pipeline.py:
- --working-dir: Directory for pipeline execution (default ./working)
- --comparison: all, hp-hp, hp-mp, or hp-zp (default all)
- --resnik-threshold: Minimum ancestor information content (default 1.5)
- --custom-phenio: Path to a local PHENIO SQLite database
- --skip-setup: Skip tool downloads and data fetch
- --test-mode: Download data but skip comparisons
- --zenodo-record-id: Zenodo record ID (required to publish)
- --zenodo-token: Zenodo API token (required to publish)
- --zenodo-version: Zenodo version name (default: today's date as YYYY-MM-DD)
- --zenodo-base-url: Zenodo API base URL (default https://zenodo.org/api)
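For readers who want to see how these options fit together, here is a minimal `argparse` sketch that mirrors the flags and defaults listed above. It is not the parser from run_pipeline.py; the function name `build_parser` is hypothetical, and using the local date for the `--zenodo-version` default is an assumption.

```python
import argparse
import datetime


def build_parser():
    """Build a parser mirroring the documented run_pipeline.py options."""
    parser = argparse.ArgumentParser(description="Phenotype comparison pipeline (sketch)")
    parser.add_argument("--working-dir", default="./working",
                        help="Directory for pipeline execution")
    parser.add_argument("--comparison", default="all",
                        choices=["all", "hp-hp", "hp-mp", "hp-zp"],
                        help="Which comparison to run")
    parser.add_argument("--resnik-threshold", type=float, default=1.5,
                        help="Minimum ancestor information content")
    parser.add_argument("--custom-phenio",
                        help="Path to a local PHENIO SQLite database")
    parser.add_argument("--skip-setup", action="store_true",
                        help="Skip tool downloads and data fetch")
    parser.add_argument("--test-mode", action="store_true",
                        help="Download data but skip comparisons")
    parser.add_argument("--zenodo-record-id",
                        help="Zenodo record ID (required to publish)")
    parser.add_argument("--zenodo-token",
                        help="Zenodo API token (required to publish)")
    # Assumption: the default is computed from the local date when the parser is built.
    parser.add_argument("--zenodo-version",
                        default=datetime.date.today().strftime("%Y-%m-%d"),
                        help="Zenodo version name")
    parser.add_argument("--zenodo-base-url", default="https://zenodo.org/api",
                        help="Zenodo API base URL")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Publishing is only possible when both the record ID and the token are given.
    can_publish = bool(args.zenodo_record_id and args.zenodo_token)
    print(args.comparison, args.working_dir, can_publish)
```

Because `--comparison` uses `choices`, an unsupported value such as `hp-xp` is rejected by the parser before any work starts.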