Skip to content

v9.7.0#2622

Draft
johngrimes wants to merge 31 commits into
mainfrom
release/9.7.0
Draft

v9.7.0#2622
johngrimes wants to merge 31 commits into
mainfrom
release/9.7.0

Conversation

@johngrimes
Copy link
Copy Markdown
Member

No description provided.

piotrszul and others added 19 commits May 11, 2026 20:00
Introduces let() binding in SqlFunctions to materialise non-deterministic
Spark column expressions exactly once per row, and applies it throughout
the fhirpath and sql packages to prevent TraceExpression side effects
from firing multiple times where the same operand appears in both branches
of a when() expression.

Adds a RepeatedSqlEvaluation checkstyle rule (RegexpMultiline) to catch
accidental duplicate SQL evaluation at compile time, scoped to the
fhirpath and sql package trees. Includes regression tests for all fixed
evaluation paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix join() to use the lambda-bound parameter instead of getValue(),
  preventing duplicate evaluation of non-deterministic operands, and
  add a single-fire regression test with a string-array dataset.
- Replace nullif(c, array()) in normaliseNull() with let() + size()
  check to avoid relying on element-type equality, which fails for
  MapType array elements in ANSI mode.
- Document the 400-character false-negative trade-off in the
  RepeatedSqlEvaluation checkstyle rule comment.
- Add @throws AnalysisException to SqlFunctions.let() Javadoc for
  the aggregate/window constraint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix "Two" → "Three" design constraints count in checkstyle comment
- Remove incorrect cdUnit as let() lambda parameter example; clarify
  that leftR/rightR are short local variables, not lambda parameters
- Update PathlingContext Javadoc: add toBoolean() to list of affected
  helpers and change "post-3.0" to "Spark 3.0+" for consistency
- Update SqlFunctions class Javadoc to mention union alongside deduplication
- Add testNormaliseNull() and nullArray() case to testSingular() in
  DefaultRepresentationTest to cover semantic correctness after rewrite
- Add trace-count regression guards for: convertToDateTime, convertToTime,
  and IMPLIES right-operand in TraceFunctionTest

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backtick-quoted code references and removed bare semicolons from
explanatory comments in ColumnRepresentation and TraceFunctionTest
to avoid triggering java:S125.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extend the let() materialisation pattern to five more sites where a
Column parameter was referenced multiple times in a single Spark SQL
expression tree, causing nondeterministic expressions (e.g. trace())
to fire once per reference instead of once per row.

Fixed sites:
- ArrayElementWiseColumnEquality.performArrayComparison()
- QuantityComparator.wrap()
- TemporalComparator.implementWithSql()
- ReferenceValue.validateTypeFormat()
- ValidationLogic.validateConversionToBoolean()

Also adds a binary let(Column, Column, BinaryOperator<Column>) overload
to SqlFunctions to reduce verbosity when materialising two operands.

Each fix is covered by a new trace-count regression test that wraps the
input column in TraceExpression and asserts exactly one fire per row.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
`sanitiseRow` only handled nested `Row` values but fell through for
`scala.collection.Seq` values (how Spark represents array fields), so
synthetic fields like `_fid` and null-valued fields leaked into the JSON
output whenever a FHIRPath expression returned a type containing an
array of structs (e.g. `CodeableConcept.coding`).

Adds a new branch that iterates over `Seq` elements, recursively
sanitises any `Row` elements, and updates the parent field's `ArrayType`
elementType to the sanitised element schema so that `Row.json()`
positional mapping remains correct.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pre-size the ArrayList with the known sequence length, remove a redundant
what-comment, and extract the shared coding row fixture into a helper to
eliminate copy-paste between two test classes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Locks in that sanitiseRow correctly renders JSON for an array of structs
where elements differ in which fields are null, and therefore have
different post-sanitisation schemas.
Added suppressions for newly reported CVEs across core libraries, server,
and site scopes following contextual impact assessment. All suppressed
findings are either not bundled in the distribution or have unreachable
vulnerable code paths.

Upgraded mermaid from 11.12.2 to 11.15.0 via package.json override to
fix four MEDIUM CVEs (CSS/HTML injection and DoS in diagram rendering)
in the deployed static site.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three unit tests (SqlQueryResultStreamerTest, ViewRegistrationServiceTest,
LibraryReferenceResolverTest.CanonicalReferences) called spark.stop() in
@afterall, which destroyed the JVM-wide SparkContext and caused
ViewDefinitionSearchTest and ViewDefinitionCreateTest to fail intermittently
depending on test execution order.

Converted all three tests to @SpringBootUnitTest so they receive the shared
SparkSession via Spring injection, consistent with every other Spark-dependent
test in the server module. The manually created sessions and @afterall
teardowns are removed entirely.

Closes #2615

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
BulkSubmitProviderTest was installing a Mockito mock as the active Spring
SecurityContext in @beforeeach but never clearing it. Under JUnit 5 parallel
execution the mock leaked onto adjacent threads, causing
SearchProviderAuthTest to inherit a mock context in which setAuthentication()
is a no-op, so checkHasAuthority() would throw "Token not present".

Closes #2617.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hadoop Path.toUri() drops the empty authority on file:// paths built via
new Path(parent, child), yielding file:/path. Files discovered later via
fs.listFiles + fs.makeQualified preserve the empty authority and come
back as file:///path. UrlAllowlist's string-prefix match then rejects
the downloaded file URLs against the staging-directory prefix, failing
the import with an AccessDeniedError after the bulk export has already
completed. Build the prefix via fs.getFileStatus so both sides use the
same canonical URI form.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The shared static warehouse @tempdir is cleaned in @AfterEach, but
Spark's catalog cache and Delta's global DeltaLog cache still hold
references to the deleted tables. The next test rebuilds the warehouse
from test fixtures, but isDeltaTable returns false against the stale
log, so the import falls through to an ERROR_IF_EXISTS write that
collides with the freshly-copied directory and fails with
DELTA_PATH_EXISTS. Clear both caches before deleting files so cleanup
restores both the on-disk and in-memory state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The test runs under the integration-test profile with PNP credentials
configured and auth enabled, but its requests were missing the
Authorization header. The pre-existing 401 was hidden by an earlier
PNP allowlist bug; with that fix in place, the auth interlock now
rejects the request before the poisoning scenario can exercise.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The integration tests configured pathling.bulk-submit.allowable-sources
to bare http://localhost via @TestPropertySource. The URI-aware
UrlAllowlist resolves that prefix to effective port 80 and no longer
matches the dynamic http://localhost:{wireMockPort} the tests
actually use. Move the property into @DynamicPropertySource so it
picks up the WireMock port at runtime.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The test relied on the WebTestClient default response timeout of 5 s,
which is shorter than the cold-start latency of the first POST against a
freshly started Spring Boot context with a Delta-backed warehouse. Match
the 60 s timeout already used by the sibling integration tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix: Resolve intermittent test failures caused by shared state leaks
@johngrimes johngrimes self-assigned this May 19, 2026
@johngrimes johngrimes added the release Pull request that represents a new release label May 19, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in Pathling May 19, 2026
@johngrimes johngrimes moved this from Backlog to In progress in Pathling May 19, 2026
Bumps the org.hl7.fhir.r4 and org.hl7.fhir.utilities overrides from
6.9.6 to 6.9.7 in the core library POM to address CVE-2026-45367
(ReDoS via FHIRPath matches()/replaceMatches()).

Adds explicit tomcat-embed-core/el/websocket overrides at 10.1.55 in
the server POM to address five Tomcat CVEs (CVE-2026-41293 CRITICAL,
CVE-2026-43512 CRITICAL, CVE-2026-41284 HIGH, CVE-2026-42498 HIGH,
CVE-2026-43513 HIGH) introduced by Spring Boot 3.5.14 bundling 10.1.54.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@piotrszul piotrszul deployed to maven-central May 19, 2026 12:02 — with GitHub Actions Active
johngrimes and others added 4 commits May 21, 2026 12:06
Extract switch-based field dispatch and the array-of-struct branch
into separate helper methods so sanitiseRow itself stays simple and
the type-dispatch chain is expressed as a Java 21 switch expression.
Resolves SonarCloud java:S3776 and java:S6880 findings on
SingleInstanceEvaluator.
Previously FhirViewShareableComplianceTest only ran via the
sof-compliance-test surefire execution, which disables exclusions and
sets testFailureIgnore=true so the SoF compliance report can capture
every result without failing the build. As a side effect, regressions
in supported features were not detected by normal CI runs.

The default test execution now picks up the compliance suite with the
maintained exclusion list, so a failure in a supported case breaks the
build. The unconstrained report run moves behind an opt-in
sofComplianceReport profile, activated by the release workflow so the
compliance report continues to be produced.

%rowIndex cases are added to the exclusion set because that feature
is not yet supported and is tracked separately.
The shareable compliance suite previously ran only in a report-only
execution that ignored test failures, so regressions in features
Pathling supports could land unnoticed. The suite now runs in the
default build with the maintained exclusion list, failing the build
on any regression in a supported case. The report-only run moves to
an opt-in profile activated by the release workflow, so the SoF
compliance report continues to be produced.

%rowIndex cases are added to the exclusion list because that feature
is not yet supported and is tracked separately.
When a repeat directive's traversal followed a path whose runtime value was
null (e.g. multi-path repeat where one branch produced no value at certain
nodes), the extractor returned a null array and Spark's Concat propagated the
null upward, producing wrong results. Wrapping the extractor's output in
Coalesce(_, []) keeps the typed empty array in place of nulls and lets the
surrounding Concat assemble the projection correctly.

Resolves the previously failing repeat compliance case
"multi-path repeat inside forEach" in the expanded SQL on FHIR v2 test suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
piotrszul and others added 5 commits May 22, 2026 18:03
When a repeat directive recursively traversed past the encoder's
maxNestingLevel, Pathling's FHIRPath evaluator continued resolving the
path against HAPI definitions while the Catalyst schema no longer had
the field. The FIELD_NOT_FOUND fallback emitted an untyped empty array,
which crashed StructProduct with "NullType cannot be cast to StructType"
whenever a sibling typed array combined with the empty result.

The expected element type is now derived from the repeat's projection
clause (declared sqlType, FHIR type, or materialised column type,
wrapped in ArrayType for collection columns) and threaded through
transformTree. When the root traversal hits FIELD_NOT_FOUND, the
fallback emits a typed empty array matching the declared column shape,
so downstream StructProduct sees a consistent element type and combines
correctly.

Resolves the previously failing repeat compliance cases (repeat inside
repeat, triple-nested repeat, repeat with forEach with repeat) in the
expanded SQL on FHIR v2 test suite. The symmetric forEach-past-cap case
is tracked separately and excluded from the regression suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d-empty

Add regression tests covering three type-mapping bugs silently fixed by
routing through FhirPathType (BASE64BINARY, DECIMAL, INSTANT). Add a
behavioural test for the typed-empty fallback path in transformTree.
Improve comments in RepeatSelection, Expressions, and ProjectedColumn
to capture non-obvious intent. Update design.md to reflect as-built
decisions (getSqlType on existing records, FhirPathType delegation).
Archive the completed change and sync the repeat-directive delta spec
to the main specs directory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add regression cases for EqualityOperator.handleNonEquivalentTypes
covering both left and right traced operands. The branch is currently
correct because isEmpty() references its operand once, but the
asymmetry with handleEquivalentTypes (which already let()-wraps both
sides) means a future refactor that adds a second operand reference
would silently re-introduce issue #2594. The new assertions document
the single-fire invariant at the FHIRPath surface.
Eliminate trace() entry duplication in ColumnRepresentation
fix: Ensure repeat directive implementation passes expanded test coverage
Extend sanitiseRow to recurse into arrays of structs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release Pull request that represents a new release

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

2 participants