
Performance Benchmarks

This document contains performance benchmarks and optimization targets for the Terraphim Medical AI system.

Targets

Component             Target   Status
KG Queries            <5ms     PASS (~2.6µs)
Thesaurus Lookup      <5ms     PASS (~0.9ms)
Role Graph Search     <5ms     PASS (~2.6µs)
Learning Inference    <10ms    PASS (~240ns)
End-to-End Pipeline   <10ms    PASS (~0.6-1.3ms)

Benchmark Results

Thesaurus Lookup

The thesaurus expansion benchmarks measure synonym lookup performance for medical terminology.

thesaurus_expand/Oncologist/malignant neoplasm:  ~1.14ms
thesaurus_expand/Oncologist/lung cancer:         ~1.08ms
thesaurus_expand/Oncologist/tumor:               ~982µs
thesaurus_expand/Cardiologist/heart failure:     ~1.01ms
thesaurus_expand/Cardiologist/hypertension:      ~874µs

Status: PASS - All lookups are under the 5ms target.
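A minimal sketch of how such a lookup can be structured, assuming a HashMap-backed thesaurus (the `Thesaurus` type, `expand` method, and sample data below are illustrative, not the actual Terraphim API):

```rust
use std::collections::HashMap;

/// Hypothetical thesaurus type: maps a normalized term to its synonyms.
struct Thesaurus {
    synonyms: HashMap<String, Vec<String>>,
}

impl Thesaurus {
    /// Average-case O(1) hash lookup; in a sketch like this the cost is
    /// dominated by normalization and result cloning, not the probe itself.
    fn expand(&self, term: &str) -> Vec<String> {
        self.synonyms
            .get(&term.to_lowercase())
            .cloned()
            .unwrap_or_default()
    }
}

/// Build a tiny illustrative thesaurus.
fn demo_thesaurus() -> Thesaurus {
    let mut synonyms = HashMap::new();
    synonyms.insert(
        "lung cancer".to_string(),
        vec!["pulmonary carcinoma".to_string(), "lung neoplasm".to_string()],
    );
    Thesaurus { synonyms }
}

fn main() {
    let thesaurus = demo_thesaurus();
    assert_eq!(thesaurus.expand("Lung Cancer").len(), 2);
    assert!(thesaurus.expand("unknown term").is_empty());
}
```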

Role Graph Search

The role graph search benchmarks measure SNOMED-based treatment discovery performance.

role_graph_search/Oncologist/lung cancer:        ~2.09µs
role_graph_search/Oncologist/carcinoma:          ~851ns
role_graph_search/Cardiologist/heart failure:    ~2.61µs
role_graph_search/Cardiologist/hypertension:     ~852ns

Status: PASS - All searches are well under the 5ms target.
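The nanosecond-to-microsecond range is what you would expect when treatment discovery reduces to a single adjacency lookup. A minimal sketch under that assumption (the `RoleGraph` type and sample edges are illustrative, not the actual SNOMED-backed implementation):

```rust
use std::collections::HashMap;

/// Hypothetical role graph: maps a condition to related treatment concepts.
struct RoleGraph {
    edges: HashMap<&'static str, Vec<&'static str>>,
}

impl RoleGraph {
    /// A single hash-map adjacency lookup per query.
    fn search(&self, condition: &str) -> &[&'static str] {
        self.edges.get(condition).map(Vec::as_slice).unwrap_or(&[])
    }
}

/// Build a tiny illustrative graph.
fn demo_graph() -> RoleGraph {
    let mut edges = HashMap::new();
    edges.insert("heart failure", vec!["ACE inhibitor", "beta blocker", "diuretic"]);
    edges.insert("hypertension", vec!["ACE inhibitor", "calcium channel blocker"]);
    RoleGraph { edges }
}

fn main() {
    let graph = demo_graph();
    assert_eq!(graph.search("hypertension").len(), 2);
    assert!(graph.search("asthma").is_empty());
}
```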

Learning Inference

The learning inference benchmarks measure case-based recommendation performance.

pipeline_stages/stage_3_learning:                ~239ns
learning_inference/recommend/10_patterns:        ~203ns
learning_inference/recommend/50_patterns:        ~178ns
learning_inference/recommend/100_patterns:       ~173ns
learning_inference/recommend/500_patterns:       ~177ns

Status: PASS - Inference is well under the 10ms target, and per-query time stays effectively flat from 10 to 500 stored patterns.
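Case-based recommendation can be pictured as scoring stored patterns and returning the best match. The sketch below is illustrative only (the `Pattern` type, `recommend` function, and scores are assumptions, not the actual learning module):

```rust
/// Hypothetical learned pattern: a condition paired with a treatment
/// and an outcome score accumulated from past cases.
struct Pattern {
    condition: &'static str,
    treatment: &'static str,
    score: f64,
}

/// Recommend the best-scoring treatment for a condition via a single
/// linear scan over the stored patterns.
fn recommend<'a>(patterns: &'a [Pattern], condition: &str) -> Option<&'a str> {
    patterns
        .iter()
        .filter(|p| p.condition == condition)
        .max_by(|a, b| a.score.partial_cmp(&b.score).unwrap())
        .map(|p| p.treatment)
}

/// Build a tiny illustrative pattern store.
fn demo_patterns() -> Vec<Pattern> {
    vec![
        Pattern { condition: "lung cancer", treatment: "chemotherapy", score: 0.72 },
        Pattern { condition: "lung cancer", treatment: "immunotherapy", score: 0.81 },
        Pattern { condition: "hypertension", treatment: "ACE inhibitor", score: 0.65 },
    ]
}

fn main() {
    let patterns = demo_patterns();
    assert_eq!(recommend(&patterns, "lung cancer"), Some("immunotherapy"));
    assert_eq!(recommend(&patterns, "asthma"), None);
}
```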

End-to-End Pipeline

The end-to-end benchmarks measure the complete workflow from query to recommendations.

end_to_end_workflow/full_pipeline/lung cancer:           ~587µs
end_to_end_workflow/full_pipeline/malignant neoplasm:    ~1.35ms
end_to_end_workflow/full_pipeline/breast cancer:         ~957µs
end_to_end_workflow/full_pipeline/tumor:                 ~937µs

Status: PASS - Full pipeline is well under the 10ms target.
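Conceptually, the full pipeline is the three stages composed in sequence, so end-to-end latency is roughly the sum of the stage latencies, dominated by the ~1ms thesaurus stage. The stage functions below are stub placeholders, not the real implementations:

```rust
/// Stage 1 (stub): thesaurus expansion of the query.
fn expand(query: &str) -> Vec<String> {
    vec![query.to_string(), format!("{query} (synonym)")]
}

/// Stage 2 (stub): role-graph lookup of candidate treatments.
fn graph_search(terms: &[String]) -> Vec<String> {
    terms.iter().map(|t| format!("treatment for {t}")).collect()
}

/// Stage 3 (stub): learned ranking of the candidates.
fn rank(mut candidates: Vec<String>) -> Vec<String> {
    candidates.sort();
    candidates
}

/// End-to-end: each query flows expansion -> search -> ranking.
fn run_pipeline(query: &str) -> Vec<String> {
    rank(graph_search(&expand(query)))
}

fn main() {
    let recommendations = run_pipeline("lung cancer");
    assert_eq!(recommendations.len(), 2);
}
```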

Running Benchmarks

To run all benchmarks:

cargo bench --package terraphim-medical-agents

To run a specific benchmark:

# Thesaurus lookup benchmarks
cargo bench --package terraphim-medical-agents --bench thesaurus_lookup

# Role graph search benchmarks
cargo bench --package terraphim-medical-agents --bench role_graph_search

# End-to-end pipeline benchmarks
cargo bench --package terraphim-medical-agents --bench end_to_end

For quick test runs (reduced sample size):

cargo bench --package terraphim-medical-agents -- --quick

Optimization Notes

Current Optimizations

  1. Embedded Data: UMLS slices are embedded at compile time using include_bytes!, eliminating I/O overhead at runtime.

  2. Lazy Loading: The knowledge graph is loaded lazily via OnceLock, avoiding initialization costs until first use.

  3. Efficient Data Structures:

    • HashMap for O(1) synonym lookups
    • RwLock for concurrent read access
    • Arc for cheap cloning of shared data

  4. Zero-Copy Where Possible: String references are used where feasible to minimize allocations.
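The embedded-data and lazy-loading patterns (items 1 and 2) combine naturally. A minimal sketch, with a byte literal standing in for the real include_bytes! payload and a hypothetical parse step:

```rust
use std::sync::OnceLock;

// In the real crate the bytes would come from something like
// `include_bytes!("umls_slice.bin")`; a literal stands in here so the
// sketch is self-contained.
static EMBEDDED_SLICE: &[u8] = b"lung cancer|pulmonary carcinoma";

static KNOWLEDGE_GRAPH: OnceLock<Vec<String>> = OnceLock::new();

/// First call pays the parse cost; every later call is a cheap read of the
/// already-initialized static.
fn knowledge_graph() -> &'static Vec<String> {
    KNOWLEDGE_GRAPH.get_or_init(|| {
        String::from_utf8_lossy(EMBEDDED_SLICE)
            .split('|')
            .map(str::to_string)
            .collect()
    })
}

fn main() {
    assert_eq!(knowledge_graph().len(), 2);
    // Repeated access returns the same allocation; nothing is re-parsed.
    assert!(std::ptr::eq(knowledge_graph(), knowledge_graph()));
}
```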

Potential Future Optimizations

  1. Caching: Add an LRU cache for frequently accessed thesaurus expansions.

  2. SIMD: Use SIMD instructions for string comparisons in synonym matching.

  3. Memory Pooling: Use object pools for frequently allocated objects in the hot path.

  4. AOT Compilation: Pre-compile the knowledge graph to a binary format for faster loading.

  5. Lock-Free Data Structures: Consider lock-free hash maps for read-heavy workloads.
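The first idea (an LRU cache for hot thesaurus expansions) can be sketched with the standard library alone; a production version would more likely use the `lru` crate, which evicts in O(1) rather than via the linear scan used here:

```rust
use std::collections::HashMap;

/// Minimal LRU cache using a recency counter (a sketch only; eviction is a
/// linear scan, which is fine for illustration but not for a hot path).
struct LruCache<K, V> {
    map: HashMap<K, (V, u64)>,
    capacity: usize,
    tick: u64,
}

impl<K: std::hash::Hash + Eq + Clone, V> LruCache<K, V> {
    fn new(capacity: usize) -> Self {
        Self { map: HashMap::new(), capacity, tick: 0 }
    }

    /// Fetch a value, refreshing its recency stamp.
    fn get(&mut self, key: &K) -> Option<&V> {
        self.tick += 1;
        let tick = self.tick;
        self.map.get_mut(key).map(|(v, t)| { *t = tick; &*v })
    }

    /// Insert a value, evicting the least recently used entry when full.
    fn put(&mut self, key: K, value: V) {
        self.tick += 1;
        if self.map.len() >= self.capacity && !self.map.contains_key(&key) {
            let oldest = self.map.iter().min_by_key(|(_, (_, t))| *t).map(|(k, _)| k.clone());
            if let Some(oldest) = oldest {
                self.map.remove(&oldest);
            }
        }
        self.map.insert(key, (value, self.tick));
    }
}

fn main() {
    let mut cache: LruCache<&str, Vec<&str>> = LruCache::new(2);
    cache.put("tumor", vec!["neoplasm"]);
    cache.put("lung cancer", vec!["pulmonary carcinoma"]);
    let _ = cache.get(&"tumor"); // touch "tumor" so it stays hot
    cache.put("hypertension", vec!["high blood pressure"]); // evicts "lung cancer"
    assert!(cache.get(&"lung cancer").is_none());
    assert!(cache.get(&"tumor").is_some());
}
```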

Profiling

To profile the benchmarks:

# Build the benchmark binaries without running them; set debug = true
# under [profile.bench] in Cargo.toml if you need full debug symbols
cargo bench --package terraphim-medical-agents --no-run

# Run with perf (Linux)
perf record target/release/deps/thesaurus_lookup-* --bench
perf report

# Run with Instruments (macOS)
instruments -t "Time Profiler" target/release/deps/thesaurus_lookup-* --bench

Continuous Benchmarking

Benchmarks are run on every commit. Results are tracked to detect performance regressions.

To compare benchmark results between commits:

# Save baseline
cargo bench --package terraphim-medical-agents -- --save-baseline=main

# Compare against baseline
cargo bench --package terraphim-medical-agents -- --baseline=main

Hardware Specifications

Benchmarks were run on:

  • CPU: AMD Ryzen/Intel equivalent
  • RAM: 16GB DDR4
  • Storage: NVMe SSD
  • OS: Linux 6.8.0

Results may vary on different hardware configurations.