Add expandForValueSet to CodeSystemProvider for SQL-backed expansion#131
Open
jmandel wants to merge 2 commits intoHealthIntersections:mainfrom
Open
Add expandForValueSet to CodeSystemProvider for SQL-backed expansion#131jmandel wants to merge 2 commits intoHealthIntersections:mainfrom
jmandel wants to merge 2 commits intoHealthIntersections:mainfrom
Conversation
…nsion Add one optional method to the CodeSystemProvider interface that lets SQL-backed providers handle ValueSet expansion in a single query with LIMIT/OFFSET instead of the per-code iterator loop. The worker groups compose includes/excludes by code system and passes the full hull to the provider. Providers returning an iterable skip the framework's manual include/exclude loops entirely. Providers returning null (e.g., SNOMED) fall back to the existing path unchanged. RxNorm implementation: TTY/STY filter mapping to SQL, GROUP BY for JOIN dedup, exclude/activeOnly/searchText push-down via better-sqlite3 cursors. LOINC implementation: Relationship/Property/Status filter mapping, UNION per include with GROUP BY dedup, all using existing indexes. Paging offset/count are only passed for single-system composes where they are exact. Multi-system composes still get filter/exclude push-down but the framework handles paging in finalization. Also includes: eager context loading in RxNorm (eliminates 3x redundant SQL per code), searchFilter arg order fixes, TxParameters.assign fix, perf counters, and props-only-when-requested optimization. Benchmarks: RxNorm 13/13 pass (median 37x), LOINC 14/14 pass (median 25x). No regressions on captured production query replay. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
417330e to
e300d0d
Compare
- scripts/test-expand-for-valueset.js: RxNorm benchmark (13 tests) - scripts/test-loinc-expand.js: LOINC benchmark (14 tests) - scripts/replay-sampled-terminology.js: replay test harness - test-expand-results.txt: RxNorm results (7-341x speedup) - test-loinc-expand-results.txt: LOINC results (2-106x speedup) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds one optional method to the
CodeSystemProviderinterface —expandForValueSet(spec)— that lets SQL-backed providers handle ValueSet expansion in a single query instead of the per-code iterator loop.The worker groups compose includes/excludes by code system and passes the full hull (includes, excludes, activeOnly, paging) to the provider. Providers that implement this method return an async iterable of fully-resolved entries; the worker feeds them through
includeCode()for dedup, import filtering, and FHIR construction. Providers that don't implement it (e.g., SNOMED) are completely unaffected — the existing iterator-oracle path runs unchanged.What changed
Interface (
tx/cs/cs-api.js)expandForValueSet(spec)method with JSDoc contractAsyncIterable<ExpandedEntry>ornull(fall back to existing path)Worker (
tx/workers/expand.js)expandForValueSetfirst; marks handled systems to skip framework's manual include/exclude loopsoffset/countonly passed for single-system composes (multi-system getsnull— framework applies paging in finalization)RxNorm (
tx/cs/cs-rxnorm.js)expandForValueSetusingbetter-sqlite3lazy cursorsWHERE IN, STY filter →JOIN rxnsty, concepts →WHERE RXCUI INGROUP BY RXCUIwhen JOINs cause row multiplicationLOINC (
tx/cs/cs-loinc.js)expandForValueSetusingbetter-sqlite3lazy cursorsOther
tx/perf-counters.js) for timing expandForValueSet pathslocateMany/filterPagestubs on SNOMED, LOINC (may be useful for future IPC)searchFilterarg order,validate.jsmessages,TxParameters.assignpaging fieldsBenchmark results
RxNorm (13 tests, all pass, median 37x speedup):
LOINC (14 tests, all pass, median 25x speedup):
LOINC jest: 37/37 pass. Replay: no regressions on captured production queries.
Design doc
See
docs/PROVIDER_INTERFACE_EVOLUTION.mdfor full rationale, SQL strategies, paging contract, and index decisions.