Ensure date basis merges do not de-duplicate dates in population counts #1041

Open: lukedegruchy wants to merge 4 commits into main from ld-20260528-cql-types-and-fhir-resources-sets

@lukedegruchy (Contributor) commented May 29, 2026

  • This is a tactical fix: the upgrade to CQL 5.0.0 will be large and invasive, and that work is already under way in clinical-reasoning, so this PR avoids large changes that could derail it.
  • Comment out the CqlType-specific code in HashSetForFhirResourcesAndCqlTypes, which fixes the de-duplication of dates.
  • Add tests for duplicate strings and dates, plus tests for boolean basis, and ensure they pass.
  • Add tests for integer basis, but leave them disabled.
  • Add a PRP file with a plan to address the other de-duplication cases.

Closes #1040

lukedegruchy and others added 3 commits May 28, 2026 14:45
Pin the expected behaviour for CDO-714: when a Measure's population basis is a
non-boolean primitive type and a CQL expression yields duplicate values within
a single subject, each occurrence must be counted. The population count should
match the expression's list length, not the deduped set size.
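
The difference between the two counts can be illustrated with a minimal sketch. It uses only plain Java collections; the class and method names (`DedupCountSketch`, `countByList`, `countBySet`) are hypothetical and do not appear in the PR, and the integer list simply mirrors the `{ 42, 42, 43 }` fixture values described in this commit.

```java
import java.util.HashSet;
import java.util.List;

public class DedupCountSketch {
    // Expected behaviour: one count per element the expression returned.
    static int countByList(List<?> results) {
        return results.size();
    }

    // Deduped behaviour: value-equal elements collapse into one set entry.
    static int countBySet(List<?> results) {
        return new HashSet<Object>(results).size();
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(42, 42, 43);
        System.out.println("list length: " + countByList(values)); // 3
        System.out.println("set size:    " + countBySet(values));  // 2
    }
}
```

In this sketch the list-based count is what the tests pin, while the set-based count reproduces the undercount.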

New tests under cqf-fhir-cr (no production code changes):

- DuplicateTypeIntraSubjectTest exercises the single-Measure path against three
  basis flavours, each tripping a distinct dedup code path inside
  HashSetForFhirResourcesAndCqlTypes:

    * CQL Date  ({ @2025-07-03, @2025-07-03, @2025-07-04 }) -- fails today via
      the CqlType / EqualEvaluator.equal path.

    * CQL Integer ({ 42, 42, 43 }) -- fails today via default HashSet semantics
      using Integer's value-based equals/hashCode.

    * FHIR IPrimitiveType<String> (Patient.address[0].line yielding duplicate
      "dup-line" primitives) -- passes today as a regression guard: HAPI's Base
      does not override equals, so identity semantics keep distinct instances.

- MultiMeasureDuplicateTypeIntraSubjectTest runs all four basis Measures
  (date, integer, FHIR string, boolean) through R4MultiMeasureService in a
  single call to confirm the bug reproduces along that orchestration path and
  that the boolean baseline stays at 1.

Shared fixture at
cqf-fhir-cr/src/test/resources/.../measure/r4/DuplicateTypeIntraSubject/.

Also adds MultiMeasure.SelectedMeasureReport#logReportJson() so MultiMeasure
tests can dump the MeasureReport JSON at any point in the fluent chain,
matching the single-Measure SelectedMeasureReport#logReportJson helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Disable the CqlType dedup branches in HashSetForFhirResourcesAndCqlTypes so
distinct runtime.Date (and other no-equals-override CqlType) instances survive
default HashSet identity semantics. The two @2025-07-03 instances in the
CDO-714 fixture's initial-population now count as 2 separate occurrences, and
the population reports 3 instead of 2.

Scope is intentionally tactical. A survey of what the CQL engine returns
shows the bug surface spans every CQL primitive type (Boolean, Integer, Long,
Decimal, String, Date, DateTime, Time) plus every CQL composite (Quantity,
Ratio, Code, Concept, CodeSystem, ValueSet, Interval, Tuple). The disabled
CqlType branches alone unblock CQL Date, the FHIR-string regression case, and
the boolean baseline -- the realistic CDO-714 surface today. Java primitives
that the engine returns (notably java.lang.Integer, java.lang.String) still
dedup via default HashSet using their own value-based equals/hashCode, so the
integer assertion in DuplicateTypeIntraSubjectTest and the integer Measure in
the MultiMeasure chain are @Disabled / removed and explicitly point at the
PRP below.
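
Why the tactical fix helps dates but not integers or strings can be sketched with plain Java. This is an illustration under stated assumptions, not the PR's code: `PlainDate` is a hypothetical stand-in for a type that does not override equals/hashCode (as the commit message describes for runtime.Date), and `IdentityVsValueSketch` and its methods are invented names.

```java
import java.util.HashSet;
import java.util.Set;

public class IdentityVsValueSketch {
    // Hypothetical stand-in: no equals/hashCode override, so HashSet
    // falls back to Object identity and keeps distinct instances apart.
    static final class PlainDate {
        final String iso;
        PlainDate(String iso) { this.iso = iso; }
    }

    static int distinctPlainDates() {
        Set<PlainDate> set = new HashSet<>();
        set.add(new PlainDate("2025-07-03"));
        set.add(new PlainDate("2025-07-03")); // separate instance, kept
        set.add(new PlainDate("2025-07-04"));
        return set.size();
    }

    static int distinctIntegers() {
        Set<Integer> set = new HashSet<>();
        set.add(42);
        set.add(42); // value-equal, collapsed
        set.add(43);
        return set.size();
    }

    static int distinctStrings() {
        Set<String> set = new HashSet<>();
        set.add(new String("dup-line"));
        set.add(new String("dup-line")); // value-equal, collapsed
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPlainDates()); // 3
        System.out.println(distinctIntegers());   // 2
        System.out.println(distinctStrings());    // 1
    }
}
```

Any type with value-based equals/hashCode still collapses in a default HashSet, which is the gap the deferred cases fall into.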

The holistic basis-aware-storage redesign that would cover all the deferred
cases is captured in PRPs/prp-population-basis-primitive-duplicate-counting.md.
It is sequenced after the CQL 5.0 (cql1) ExpressionResult type changes land,
since cql1 will materially shift the return-type table.

CqlDate-specific unit tests in HashSetForFhirResourcesAndCqlTypesTest are
@Disabled with the same PRP pointer -- they assert the dedup behaviour the
tactical fix removes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Formatting check succeeded!

@lukedegruchy lukedegruchy changed the title from "Ld 20260528 cql types and fhir resources sets" to "Ensure date basis merges do not de-duplicate dates in population counts" May 29, 2026
@lukedegruchy lukedegruchy marked this pull request as ready for review May 29, 2026 19:05
Linked issue (may be closed by merging this pull request): MeasureReport deduping duplicate results generated by the cql
