This document contains performance benchmarks and optimization targets for the Terraphim Medical AI system.
| Component | Target | Status |
|---|---|---|
| KG Queries | <5ms | PASS (~2.6µs) |
| Thesaurus Lookup | <5ms | PASS (~0.9ms) |
| Role Graph Search | <5ms | PASS (~2.6µs) |
| Learning Inference | <10ms | PASS (~240ns) |
| End-to-End Pipeline | <10ms | PASS (~0.6-1.3ms) |
The thesaurus expansion benchmarks measure synonym lookup performance for medical terminology.
```
thesaurus_expand/Oncologist/malignant neoplasm: ~1.14ms
thesaurus_expand/Oncologist/lung cancer:        ~1.08ms
thesaurus_expand/Oncologist/tumor:              ~982µs
thesaurus_expand/Cardiologist/heart failure:    ~1.01ms
thesaurus_expand/Cardiologist/hypertension:     ~874µs
```
Status: PASS - All lookups are under the 5ms target.
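As a rough sketch of the operation being measured (the types and data below are invented for illustration, not the actual `terraphim-medical-agents` API), synonym expansion reduces to a hash-map lookup on the normalized term:

```rust
use std::collections::HashMap;

/// Hypothetical thesaurus: maps a normalized term to its synonyms.
struct Thesaurus {
    synonyms: HashMap<String, Vec<String>>,
}

impl Thesaurus {
    fn new() -> Self {
        let mut synonyms = HashMap::new();
        synonyms.insert(
            "lung cancer".to_string(),
            vec![
                "pulmonary carcinoma".to_string(),
                "bronchogenic carcinoma".to_string(),
            ],
        );
        Thesaurus { synonyms }
    }

    /// O(1) average-case lookup; the ~1ms figures above also cover
    /// normalization and result collection over a full UMLS slice.
    fn expand(&self, term: &str) -> Vec<String> {
        self.synonyms
            .get(&term.to_lowercase())
            .cloned()
            .unwrap_or_default()
    }
}
```

Because the lookup itself is constant-time, the millisecond-scale results are dominated by string normalization and cloning, not by thesaurus size.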
The role graph search benchmarks measure SNOMED-based treatment discovery performance.
```
role_graph_search/Oncologist/lung cancer:      ~2.09µs
role_graph_search/Oncologist/carcinoma:        ~851ns
role_graph_search/Cardiologist/heart failure:  ~2.61µs
role_graph_search/Cardiologist/hypertension:   ~852ns
```
Status: PASS - All searches are well under the 5ms target.
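Schematically, treatment discovery is a short graph walk: resolve the query term to a concept, then follow its treatment edges. The structure below is illustrative only (the real role graph is richer), with one real SNOMED CT code used as sample data:

```rust
use std::collections::HashMap;

/// Illustrative two-layer role graph: term -> SNOMED-style concept id,
/// concept id -> candidate treatments.
struct RoleGraph {
    concepts: HashMap<&'static str, u64>,
    treatments: HashMap<u64, Vec<&'static str>>,
}

impl RoleGraph {
    fn demo() -> Self {
        let mut concepts = HashMap::new();
        // 84114007 is the SNOMED CT concept for "Heart failure (disorder)".
        concepts.insert("heart failure", 84114007);
        let mut treatments = HashMap::new();
        treatments.insert(84114007, vec!["ACE inhibitor", "beta blocker"]);
        RoleGraph { concepts, treatments }
    }

    /// Two hash lookups per query; this is why searches land in the
    /// microsecond-to-nanosecond range.
    fn search(&self, term: &str) -> Vec<&'static str> {
        self.concepts
            .get(term)
            .and_then(|id| self.treatments.get(id))
            .cloned()
            .unwrap_or_default()
    }
}
```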
The learning inference benchmarks measure case-based recommendation performance.
```
pipeline_stages/stage_3_learning:          ~239ns
learning_inference/recommend/10_patterns:  ~203ns
learning_inference/recommend/50_patterns:  ~178ns
learning_inference/recommend/100_patterns: ~173ns
learning_inference/recommend/500_patterns: ~177ns
```
Status: PASS - Inference is well under the 10ms target and scales well with pattern count.
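The flat timings from 10 to 500 patterns are consistent with recommendation cost depending only on the patterns indexed under the queried condition, not the total pattern count. A hypothetical sketch (names and scoring invented for illustration):

```rust
use std::collections::HashMap;

/// A learned pattern: a treatment and its observed success rate
/// for a given condition (illustrative shape, not the real type).
#[derive(Clone)]
struct Pattern {
    treatment: &'static str,
    success_rate: f64,
}

struct Learner {
    // Indexing patterns by condition keeps recommendation cost
    // independent of the total pattern count.
    by_condition: HashMap<&'static str, Vec<Pattern>>,
}

impl Learner {
    /// Recommend the treatment with the highest observed success rate
    /// for this condition, if any patterns exist.
    fn recommend(&self, condition: &str) -> Option<&'static str> {
        self.by_condition
            .get(condition)?
            .iter()
            .max_by(|a, b| a.success_rate.total_cmp(&b.success_rate))
            .map(|p| p.treatment)
    }
}
```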
The end-to-end benchmarks measure the complete workflow from query to recommendations.
```
end_to_end_workflow/full_pipeline/lung cancer:        ~587µs
end_to_end_workflow/full_pipeline/malignant neoplasm: ~1.35ms
end_to_end_workflow/full_pipeline/breast cancer:      ~957µs
end_to_end_workflow/full_pipeline/tumor:              ~937µs
```
Status: PASS - Full pipeline is well under the 10ms target.
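The stage timings explain the end-to-end numbers: the millisecond-scale thesaurus stage dominates, while graph search and inference add only microseconds. A toy composition of the three stages (all data and function names invented, each stage stubbed with a single entry):

```rust
use std::collections::HashMap;

// Stage 1: thesaurus expansion (stubbed with one entry).
fn expand(term: &str) -> Vec<String> {
    let mut m = HashMap::new();
    m.insert("tumor", vec!["neoplasm".to_string()]);
    m.get(term).cloned().unwrap_or_default()
}

// Stage 2: role graph search (stubbed with one entry).
fn graph_search(term: &str) -> Vec<String> {
    let mut m = HashMap::new();
    m.insert("neoplasm", vec!["biopsy".to_string()]);
    m.get(term).cloned().unwrap_or_default()
}

// Stage 3: learning inference -- trivially picks the first candidate here.
fn recommend(candidates: &[String]) -> Option<String> {
    candidates.first().cloned()
}

/// Query -> synonyms -> candidate treatments -> recommendation.
fn full_pipeline(query: &str) -> Option<String> {
    let mut candidates = Vec::new();
    for synonym in expand(query) {
        candidates.extend(graph_search(&synonym));
    }
    recommend(&candidates)
}
```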
To run all benchmarks:
```
cargo bench --package terraphim-medical-agents
```

To run a specific benchmark:

```
# Thesaurus lookup benchmarks
cargo bench --package terraphim-medical-agents --bench thesaurus_lookup

# Role graph search benchmarks
cargo bench --package terraphim-medical-agents --bench role_graph_search

# End-to-end pipeline benchmarks
cargo bench --package terraphim-medical-agents --bench end_to_end
```

For quick test runs (reduced sample size):

```
cargo bench --package terraphim-medical-agents -- --quick
```

Key optimizations already in place:

- Embedded Data: UMLS slices are embedded at compile time using `include_bytes!`, eliminating I/O overhead at runtime.
- Lazy Loading: The knowledge graph is loaded lazily via `OnceLock`, avoiding initialization costs until first use.
- Efficient Data Structures:
  - `HashMap` for O(1) synonym lookups
  - `RwLock` for concurrent read access
  - `Arc` for cheap cloning of shared data
- Zero-Copy Where Possible: String references are used where feasible to minimize allocations.
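These pieces can fit together roughly as follows. This is a minimal sketch with an assumed embedded data format and hypothetical names, not the crate's actual types:

```rust
use std::collections::HashMap;
use std::sync::{Arc, OnceLock, RwLock};

// Stand-in for `include_bytes!`-embedded UMLS data (hypothetical
// format: one `term|synonym` pair per line).
static EMBEDDED: &[u8] = b"tumor|neoplasm\ntumor|mass\n";

static THESAURUS: OnceLock<Arc<RwLock<HashMap<String, Vec<String>>>>> = OnceLock::new();

/// Lazily parse the embedded slice on first access; subsequent calls
/// are a cheap `OnceLock` read plus an `Arc` clone, and readers can
/// take the `RwLock` concurrently.
fn thesaurus() -> Arc<RwLock<HashMap<String, Vec<String>>>> {
    THESAURUS
        .get_or_init(|| {
            let mut map: HashMap<String, Vec<String>> = HashMap::new();
            for line in std::str::from_utf8(EMBEDDED).unwrap().lines() {
                if let Some((term, syn)) = line.split_once('|') {
                    map.entry(term.to_string())
                        .or_default()
                        .push(syn.to_string());
                }
            }
            Arc::new(RwLock::new(map))
        })
        .clone()
}
```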
Potential future optimizations:

- Caching: Add an LRU cache for frequently accessed thesaurus expansions.
- SIMD: Use SIMD instructions for string comparisons in synonym matching.
- Memory Pooling: Use object pools for frequently allocated objects in the hot path.
- AOT Compilation: Pre-compile the knowledge graph to a binary format for faster loading.
- Lock-Free Data Structures: Consider lock-free hash maps for read-heavy workloads.
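To make the caching idea concrete, here is a minimal LRU sketch for thesaurus expansions. A production version would use the `lru` crate or an intrusive linked list rather than this O(n) recency update:

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal LRU cache sketch for thesaurus expansions.
struct LruCache {
    capacity: usize,
    map: HashMap<String, Vec<String>>,
    order: VecDeque<String>, // front = most recently used
}

impl LruCache {
    fn new(capacity: usize) -> Self {
        LruCache { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Return the cached expansion and mark the key as recently used.
    fn get(&mut self, key: &str) -> Option<Vec<String>> {
        if self.map.contains_key(key) {
            self.touch(key);
            self.map.get(key).cloned()
        } else {
            None
        }
    }

    /// Insert an expansion, evicting the least recently used entry
    /// when the cache is full.
    fn put(&mut self, key: String, value: Vec<String>) {
        if self.map.len() >= self.capacity && !self.map.contains_key(&key) {
            if let Some(evicted) = self.order.pop_back() {
                self.map.remove(&evicted);
            }
        }
        self.map.insert(key.clone(), value);
        self.touch(&key);
    }

    fn touch(&mut self, key: &str) {
        self.order.retain(|k| k != key);
        self.order.push_front(key.to_string());
    }
}
```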
To profile the benchmarks:
```
# Build with debug symbols
cargo bench --package terraphim-medical-agents --no-run

# Run with perf (Linux)
perf record target/release/deps/thesaurus_lookup-* --bench
perf report

# Run with Instruments (macOS)
instruments -t "Time Profiler" target/release/deps/thesaurus_lookup-* --bench
```

Benchmarks are run on every commit. Results are tracked to detect performance regressions.
To compare benchmark results between commits:
```
# Save baseline
cargo bench --package terraphim-medical-agents -- --save-baseline=main

# Compare against baseline
cargo bench --package terraphim-medical-agents -- --baseline=main
```

Benchmarks were run on:
- CPU: AMD Ryzen/Intel equivalent
- RAM: 16GB DDR4
- Storage: NVMe SSD
- OS: Linux 6.8.0
Results may vary on different hardware configurations.