Add generic SQLite v0 provider with SNOMED CT support and expandForValueSet #132

Closed
jmandel wants to merge 3 commits into HealthIntersections:main from jmandel:tx-incremental-plus-snomed


jmandel commented on Feb 21, 2026

Summary

Builds on #131 to add a generic SQLite v0 provider that can serve any terminology loaded into the v0 database schema, starting with SNOMED CT International.

What this adds

Generic SQLite v0 provider (cs-sqlite-runtime-v0.js)

  • A single provider that reads the normalized v0 SQLite schema (concept, closure, concept_link, concept_literal, property_def, value_set tables)
  • Runtime configuration is stored in the database itself (cs_config table), so no code changes are needed per-terminology
  • Factory auto-discovers all code systems in a v0 database file
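As a rough illustration of what auto-discovery from the database could look like, here is a minimal sketch. It assumes `cs_config` holds one row per code system with `system`, `version`, and a JSON `config` column; those column names and the `discoverCodeSystems` helper are hypothetical and not taken from the PR. The `db` argument is expected to behave like a better-sqlite3 `Database` (only `prepare(...).all()` is used).

```javascript
// Hypothetical sketch: the cs_config column names (system, version, config)
// are assumptions for illustration, not the schema shipped in this PR.
function discoverCodeSystems(db) {
  // better-sqlite3 style: prepare() returns a statement, all() returns every row.
  const rows = db.prepare('SELECT system, version, config FROM cs_config').all();
  return rows.map((row) => ({
    system: row.system,
    version: row.version,
    // Per-terminology runtime settings live in the database, so nothing
    // terminology-specific needs to be hard-coded here.
    config: row.config ? JSON.parse(row.config) : {},
  }));
}
```

A factory built this way would create one provider per returned entry, which is why adding a terminology would only require loading a new database file.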

SNOMED CT importer (cs-sqlite-import-snomed.js)

  • Loads SNOMED RF2 distribution files into the v0 schema
  • Imports concepts, descriptions, relationships, reference sets, and the transitive closure

expandForValueSet on the v0 provider

  • Uses better-sqlite3 for synchronous lazy-cursor SQL queries
  • Maps FHIR ValueSet compose filters to SQL JOINs against the v0 schema:
    • concept is-a / descendent-of → closure table
    • concept = → direct code match
    • concept in → value_set_member join
    • Property filters → concept_link / concept_literal joins
  • UNION ALL for multi-include, DISTINCT dedup, LIMIT/OFFSET paging
  • Bypass toggle via /debug/bypass-expand-for-valueset endpoint
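The filter-to-SQL mapping above can be sketched roughly as follows. This is not the PR's code: the column names (`concept.id`, `concept.code`, `closure.ancestor_id`, `closure.descendant_id`, `value_set_member.value_set_id`, `value_set_member.concept_id`) and both function names are assumptions, only `concept` filters are covered, and the `ORDER BY code` is added here so that LIMIT/OFFSET pages are deterministic.

```javascript
// Hypothetical sketch of mapping ValueSet compose filters to v0-style SQL.
// All table columns below are assumed names, not the PR's actual schema.
function filterToSql({ property, op, value }) {
  if (property !== 'concept') {
    throw new Error(`unsupported filter property in this sketch: ${property}`);
  }
  const descendants =
    'SELECT c.code FROM concept c ' +
    'JOIN closure cl ON cl.descendant_id = c.id ' +
    'JOIN concept a ON a.id = cl.ancestor_id ' +
    'WHERE a.code = ?';
  switch (op) {
    case '=': // direct code match
      return { sql: 'SELECT c.code FROM concept c WHERE c.code = ?', params: [value] };
    case 'descendent-of': // strict descendants only
      return { sql: descendants, params: [value] };
    case 'is-a': // descendants plus the root concept itself
      return {
        sql: `${descendants} UNION ALL SELECT c.code FROM concept c WHERE c.code = ?`,
        params: [value, value],
      };
    case 'in': // membership in a stored value set / refset
      return {
        sql:
          'SELECT c.code FROM concept c ' +
          'JOIN value_set_member m ON m.concept_id = c.id ' +
          'WHERE m.value_set_id = ?',
        params: [value],
      };
    default:
      throw new Error(`unsupported filter op in this sketch: ${op}`);
  }
}

// Combine one SELECT per include with UNION ALL, dedup with DISTINCT,
// then page with LIMIT/OFFSET.
function buildExpansionSql(filters, { offset = 0, count = 1000 } = {}) {
  const parts = filters.map(filterToSql);
  const sql =
    `SELECT DISTINCT code FROM (${parts.map((p) => p.sql).join(' UNION ALL ')}) ` +
    'ORDER BY code LIMIT ? OFFSET ?';
  return { sql, params: [...parts.flatMap((p) => p.params), count, offset] };
}
```

With better-sqlite3, the resulting pair could be run as `db.prepare(sql).iterate(...params)`, which yields rows lazily instead of materializing the whole expansion.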

Performance results (SNOMED CT International, 371K concepts)

3-way benchmark (v0+expandForValueSet vs v0 without vs legacy in-memory):

| Test case | v0 optimized | v0 baseline | Legacy in-memory | Speedup vs legacy |
| --- | --- | --- | --- | --- |
| Clinical Finding (124K codes) | 95ms | 8,004ms | 1,155ms | 12x faster |
| Procedure (59K codes) | 65ms | 4,782ms | 640ms | 10x faster |
| Body Structure (42K codes) | 47ms | 3,431ms | 466ms | 10x faster |
| Refset (21K codes) | 36ms | 3,676ms | 265ms | 7x faster |
| Small hierarchies | 2-10ms | 15-28ms | 3-12ms | ~1-2x faster |

Resource usage:

| Metric | Legacy | v0 SQLite |
| --- | --- | --- |
| Load time | 0.39s | 0.083s |
| Memory | 1,653 MB | 14 MB |

Replay test correctness: 6/6 requests produce results identical to the legacy in-memory provider. Five pass and one fails as expected, with the same failure on both providers.

Full-drain verification: All expanded code sets are identical between v0 and legacy providers. Differences in first-page results are due to sort order only (FHIR spec does not mandate expansion ordering).
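A full-drain check of this kind can be sketched as below: page through each provider until it is exhausted, collect the codes into a set, and compare the sets so that ordering differences do not matter. The `fetchPage(offset, count)` callback is a hypothetical stand-in for whatever issues the paged expansion request; it is assumed to return an array of code strings.

```javascript
// Drain every page from a paged source into a Set of codes.
// fetchPage is a hypothetical callback: (offset, count) => array of codes.
function drainCodes(fetchPage, pageSize) {
  const codes = new Set();
  for (let offset = 0; ; offset += pageSize) {
    const page = fetchPage(offset, pageSize);
    for (const code of page) codes.add(code);
    // A short (or empty) page means the source is exhausted.
    if (page.length < pageSize) return codes;
  }
}

// Order-insensitive equality: same size and every member of a is in b.
function sameCodeSet(a, b) {
  if (a.size !== b.size) return false;
  for (const code of a) {
    if (!b.has(code)) return false;
  }
  return true;
}
```

Comparing sets rather than first pages avoids false mismatches when two providers return the same codes in a different order.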

What does NOT change

Commits

  1. b6d31bf feat: add generic SQLite v0 provider and SNOMED importer — v0 schema, runtime provider, factory, SNOMED RF2 importer
  2. 91fedf0 feat: add expandForValueSet to SQLite v0 provider — optimized SQL-based expansion with better-sqlite3, bypass toggle

jmandel and others added 3 commits February 20, 2026 19:08
…nsion

Add one optional method to the CodeSystemProvider interface that lets
SQL-backed providers handle ValueSet expansion in a single query with
LIMIT/OFFSET instead of the per-code iterator loop.

The worker groups compose includes/excludes by code system and passes
the full hull to the provider. Providers returning an iterable skip the
framework's manual include/exclude loops entirely. Providers returning
null (e.g., SNOMED) fall back to the existing path unchanged.

RxNorm implementation: TTY/STY filter mapping to SQL, GROUP BY for JOIN
dedup, exclude/activeOnly/searchText push-down via better-sqlite3 cursors.

LOINC implementation: Relationship/Property/Status filter mapping, UNION
per include with GROUP BY dedup, all using existing indexes.

Paging offset/count are only passed for single-system composes where they
are exact. Multi-system composes still get filter/exclude push-down but
the framework handles paging in finalization.
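The dispatch described above can be sketched roughly as follows. This is an illustration under assumptions, not the worker's code: `groupBySystem`, `expandCompose`, the `(group, paging)` argument shape of `expandForValueSet`, and the `legacyExpand` fallback callback are all hypothetical names.

```javascript
// Hypothetical sketch: group compose includes/excludes by code system.
function groupBySystem(compose) {
  const groups = new Map();
  for (const kind of ['include', 'exclude']) {
    for (const entry of compose[kind] || []) {
      if (!groups.has(entry.system)) groups.set(entry.system, { include: [], exclude: [] });
      groups.get(entry.system)[kind].push(entry);
    }
  }
  return groups;
}

// Try the provider's optional fast path per system; fall back when it
// is missing or returns null. Paging is only pushed down when the compose
// touches a single system, because only then are offset/count exact.
function expandCompose(compose, providers, paging, legacyExpand) {
  const groups = groupBySystem(compose);
  const singleSystem = groups.size === 1;
  const results = [];
  for (const [system, group] of groups) {
    const provider = providers.get(system);
    const canPushDown = provider && typeof provider.expandForValueSet === 'function';
    const fast = canPushDown
      ? provider.expandForValueSet(group, singleSystem ? paging : null)
      : null;
    if (fast !== null && fast !== undefined) {
      results.push({ system, codes: fast, pagedByProvider: singleSystem });
    } else {
      results.push({ system, codes: legacyExpand(system, group), pagedByProvider: false });
    }
  }
  return results;
}
```

When `pagedByProvider` is false, the caller would still need to apply offset/count itself during finalization.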

Also includes: eager context loading in RxNorm (eliminates 3x redundant
SQL per code), searchFilter arg order fixes, TxParameters.assign fix,
perf counters, and props-only-when-requested optimization.

Benchmarks: RxNorm 13/13 pass (median 37x), LOINC 14/14 pass (median 25x).
No regressions on captured production query replay.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Bring in the generic SQLite v0 runtime provider from the convergence
branch. This provides a single provider implementation backed by a
normalized SQLite schema (concept, designation, concept_link, closure,
etc.) that works for any terminology.

Includes:
- cs-sqlite-runtime-v0.js: Generic provider implementing full CS API
- cs-sqlite-v0-specializers.js: LOINC implicit ValueSet specialization
- cs-provider-api.js/cs-provider-list.js: Provider base classes
- import-snomed-v0.js: SNOMED RF2 → SQLite v0 importer
- schema-v0.sql: Normalized schema definition
- library.js: sqlite-v0/snomed-sqlite-v0 source type routing

Existing loinc/rxnorm/snomed legacy loaders remain unchanged.
The sqlite-v0 path is additive — use 'snomed-sqlite-v0:<db>' in config.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Implement expandForValueSet on SqliteRuntimeV0Provider using better-sqlite3
for synchronous lazy-cursor iteration. Maps FHIR ValueSet compose filters
to v0 schema SQL:

- concept is-a/descendent-of → closure table JOIN
- concept = → direct WHERE on code
- concept in → value_set_member JOIN
- Property filters → concept_link/concept_literal JOINs
- Excludes via NOT IN subqueries
- LIMIT/OFFSET paging, activeOnly filtering
- UNION ALL for multi-include with DISTINCT dedup

Benchmarks on SNOMED CT International (371K concepts):
- is-a Clinical Finding (124K descendants): 93ms vs 7,272ms (78x)
- is-a Procedure: 52ms vs 4,633ms (89x)
- Refset laterality (21K): 30ms vs 2,928ms (97x)
- Deep paging @offset=50000: 172ms vs 8,862ms (52x)

14/14 correctness tests pass (exact match or set-equal).

Also wires bypass-expand-for-valueset debug toggle for v0 provider.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jmandel closed this on Feb 21, 2026