OneEHR

OneEHR is a unified Python platform for longitudinal EHR experiments across ML, DL, and LLM agents. It provides common infrastructure for preprocessing, modeling, testing, and analysis, all on one shared run contract. It is the first toolkit bridging classical machine learning, deep learning, and agentic AI for clinical prediction.

Key Features

38 model architectures — tabular ML, recurrent/non-recurrent DL, irregular-time, KG-enhanced, and survival models
Unified ML/DL/LLM comparison — all predictions in one predictions.parquet with bootstrap CI and statistical tests
Dataset converters — built-in support for MIMIC-III, MIMIC-IV, and eICU
Medical code ontologies — ICD-9/10 mapping, CCS grouping, ATC drug hierarchy
Survival analysis — DeepSurv, DeepHit, concordance index, Kaplan-Meier visualization
Fairness & interpretability — demographic parity, equalized odds, SHAP, LIME, integrated gradients, attention visualization
Publication-quality figures — ROC, PR, calibration, DCA, forest plots, KM curves with Nature/Lancet style presets
Reproducibility by design — single TOML config = complete experiment specification
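
The bootstrap confidence intervals mentioned above can be illustrated with a minimal, standard-library sketch. This is not OneEHR's implementation: the function names `auroc` and `bootstrap_ci`, the choice of the percentile method, and the toy labels and scores are all assumptions made for illustration.

```python
import random


def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC, resampling rows with replacement."""
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip degenerate resamples that contain a single class
        stats.append(auroc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Toy data, invented for this example.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]
point = auroc(y_true, y_score)
low, high = bootstrap_ci(y_true, y_score)
print(f"AUROC {point:.3f} (95% CI {low:.3f} to {high:.3f})")
```

With only eight rows the interval is wide; on real test sets, resampling at the patient level is the usual choice so that repeated rows from one patient stay together.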

Workflow At A Glance

oneehr preprocess --config experiment.toml   # Bin features, split patients
oneehr train      --config experiment.toml   # Train ML/DL models
oneehr test       --config experiment.toml   # Evaluate on test set
oneehr analyze    --config experiment.toml   # Cross-system comparison
oneehr plot       --config experiment.toml   # Publication figures

All commands operate on the same run directory under {output.root}/{output.run_name}/.

Install

OneEHR requires Python 3.12+.

pip install oneehr

# Or from source:
uv venv .venv --python 3.12
uv pip install -e .
oneehr --help

Quickstart

Use the bundled TJH COVID-19 ICU example:

# Convert source data (only needed once)
python examples/tjh/convert.py

# Run the full pipeline
oneehr preprocess --config examples/tjh/mortality_patient.toml
oneehr train      --config examples/tjh/mortality_patient.toml
oneehr test       --config examples/tjh/mortality_patient.toml
oneehr analyze    --config examples/tjh/mortality_patient.toml

Or use the Python API:

import oneehr

config = oneehr.load_config("examples/tjh/mortality_patient.toml")
oneehr.preprocess(config)
oneehr.train(config)
oneehr.test(config)
oneehr.analyze(config)

Dataset Converters

Convert standard clinical datasets into OneEHR's three-table format:

# MIMIC-III
oneehr convert --dataset mimic3 --raw-dir /path/to/mimic3 --output-dir data/mimic3/ --task mortality

# MIMIC-IV
oneehr convert --dataset mimic4 --raw-dir /path/to/mimic4 --output-dir data/mimic4/ --task mortality

# eICU
oneehr convert --dataset eicu --raw-dir /path/to/eicu --output-dir data/eicu/ --task mortality

Each converter produces labels for mortality, readmission, and length-of-stay tasks.

Models

OneEHR ships 38 model architectures:

| Category | Models |
| --- | --- |
| Tabular ML | XGBoost, CatBoost, Random Forest, Decision Tree, GBDT, Logistic Regression |
| Recurrent | GRU, LSTM, RNN, GRU-D, Dipole, HiTANet, M3Care, PAI |
| Non-recurrent | CNN, TCN, Transformer, SAnD, MLP, Deepr, EHR-Mamba, Jamba, LSAN |
| Irregular-time | mTAND, Raindrop, ContiFormer, TECO |
| EHR-specialised | AdaCare, StageNet, RETAIN, ConCare, GRASP, MCGRU, DrAgent, PRISM, SAFARI |
| KG-enhanced | GraphCare, KerPrint, ProtoEHR |
| Survival | DeepSurv, DeepHit |

Models with static branches (ConCare, GRASP, MCGRU, DrAgent, PRISM, SAFARI, TECO) automatically use patient-level static features when static.csv is provided.

Task Types

| Task | Config | Description |
| --- | --- | --- |
| Binary classification | `kind = "binary"` | Mortality, readmission, etc. |
| Multiclass | `kind = "multiclass"` | Phenotyping, diagnosis groups |
| Regression | `kind = "regression"` | Length of stay, lab value prediction |
| Survival | `kind = "survival"` | Time-to-event with censoring |
| Multi-label | `kind = "multilabel"` | ICD coding, multi-diagnosis |
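
For the survival task kind, ranking quality under censoring is commonly summarized by Harrell's concordance index, listed under Key Features. The sketch below is a plain-Python illustration of that metric, not OneEHR's metric code; `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ordered correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] == 1) and a strictly shorter time than subject j.
    Higher risk is expected to go with shorter time.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for this example: two events, two censored subjects.
times = [5, 8, 12, 20]
events = [1, 0, 1, 0]
risks = [0.9, 0.3, 0.2, 0.4]
c_index = concordance_index(times, events, risks)
print(c_index)
```

Here the first subject is ranked above all three later subjects, while the third subject is ranked below the censored fourth, so three of the four comparable pairs are concordant.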

Medical Code Ontologies

from oneehr.medcode import ICD9, ICD10, CodeMapper, CCSGrouper, ATCHierarchy

# ICD code utilities
ICD9.chapter("401.9")    # → "Circulatory system"
ICD10.category("I10.0")  # → "I10"

# Aggregate codes by ontology for dimensionality reduction
mapper = CodeMapper()
mapper.add_icd_chapter_mapping(version=9)
# events_df is an existing events DataFrame; it is not defined in this snippet
mapped_events = mapper.apply(events_df)
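
To show what chapter-level aggregation does to a code list, here is a standalone sketch that maps ICD-9 codes to chapters by numeric range and counts them. It does not use or reproduce `oneehr.medcode`; the function `icd9_chapter`, the chapter labels, and the sample codes are assumptions for illustration, with ranges taken from the standard ICD-9 chapter layout.

```python
from collections import Counter

# Standard ICD-9 chapter ranges on the three-digit category.
ICD9_CHAPTERS = [
    (1, 139, "Infectious and parasitic diseases"),
    (140, 239, "Neoplasms"),
    (240, 279, "Endocrine, nutritional, metabolic, and immunity disorders"),
    (280, 289, "Blood and blood-forming organs"),
    (290, 319, "Mental disorders"),
    (320, 389, "Nervous system and sense organs"),
    (390, 459, "Circulatory system"),
    (460, 519, "Respiratory system"),
    (520, 579, "Digestive system"),
    (580, 629, "Genitourinary system"),
    (630, 679, "Pregnancy, childbirth, and the puerperium"),
    (680, 709, "Skin and subcutaneous tissue"),
    (710, 739, "Musculoskeletal system and connective tissue"),
    (740, 759, "Congenital anomalies"),
    (760, 779, "Conditions originating in the perinatal period"),
    (780, 799, "Symptoms, signs, and ill-defined conditions"),
    (800, 999, "Injury and poisoning"),
]


def icd9_chapter(code: str) -> str:
    """Map an ICD-9 code (dotted or undotted) to a chapter label."""
    code = code.strip().upper()
    if code.startswith(("V", "E")):
        return "Supplementary classification"
    major = int(code.split(".")[0][:3])  # three-digit category, e.g. 401
    for low, high, name in ICD9_CHAPTERS:
        if low <= major <= high:
            return name
    raise ValueError(f"unrecognized ICD-9 code: {code}")


# Five distinct codes collapse to four chapter-level features.
codes = ["401.9", "428.0", "250.00", "486", "V58.61"]
chapter_counts = Counter(icd9_chapter(c) for c in codes)
print(chapter_counts)
```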

Configuration

OneEHR uses TOML as the experiment contract:

[dataset] — input table paths (dynamic, static, label)
[preprocess] — binning, feature engineering, preprocessing pipeline
[task] — task kind and prediction mode (patient or time)
[split] — patient-level train/val/test splitting
[[models]] — model selection with per-model params
[trainer] — DL training config (mixed precision, LR schedulers, early stopping)
[[systems]] — LLM/agent system definitions
[output] — run root and run name

Tutorials

| Tutorial | Description |
| --- | --- |
| 01 Quickstart | End-to-end TJH mortality prediction |
| 02 Custom Dataset | Bring your own data + medical code mapping |
| 03 Model Comparison | ML vs DL with bootstrap CI and statistical tests |
| 04 Fairness & Explainability | Bias detection + feature importance |
| 05 Survival Analysis | DeepSurv, C-index, Kaplan-Meier curves |

Documentation

Full documentation: medxlab.github.io/OneEHR

Build docs locally:

uv pip install -e ".[docs]"
uv run mkdocs serve

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Validation

pytest tests/ -v                                                    # 114 tests
oneehr preprocess --config examples/tjh/mortality_patient.toml      # End-to-end
