GeoTIFF: VRT backend parity with .ovr sidecar interactions (#2321 sub-task 4) #2335
Merged
Sub-task 4 of #2321. Locks eager vs dask parity on the VRT read surface that is most likely to drift between backends: metadata (transform, crs, crs_wkt, georef_status), windowed-coord shifts, and the .tif.ovr sidecar lookup vs an equivalent inline-overview source. The matrix mirrors test_backend_parity_matrix.py with a small declarative fixture/backend layout, a shared materialise + parity helper, and one parametrised cell per (fixture, backend) pair. Twelve cells in total.
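The declarative layout described above can be sketched roughly as follows. This is an illustration only, not the PR's code: the class names, the two un-windowed fixture ids, the builder functions, and the window tuples are all hypothetical, and the builders return placeholder paths instead of writing files.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Optional, Tuple

# Hypothetical stand-ins for the PR's fixture/backend specs.
Window = Tuple[int, int, int, int]  # (row_start, col_start, row_stop, col_stop)


@dataclass(frozen=True)
class FixtureSpec:
    fix_id: str
    builder: Callable[[], str]       # returns the path of a built VRT
    window: Optional[Window] = None  # None means "read the full extent"


@dataclass(frozen=True)
class BackendSpec:
    backend_id: str
    chunks: Optional[Tuple[int, int]]  # None means an eager read


def build_two_tile() -> str:
    return "two_tile.vrt"  # placeholder; a real builder would write files


def build_sidecar() -> str:
    return "sidecar.vrt"   # placeholder


# Four fixture entries share two builders; window values are made up.
FIXTURES = [
    FixtureSpec("two-tile-float32", build_two_tile),
    FixtureSpec("two-tile-float32-window-spans-seam", build_two_tile, (0, 8, 16, 24)),
    FixtureSpec("sidecar-uint16", build_sidecar),
    FixtureSpec("sidecar-uint16-window", build_sidecar, (0, 0, 8, 8)),
]
BACKENDS = [BackendSpec("eager", None), BackendSpec("dask", (16, 16))]

# One cell per (fixture, backend) pair, each with a readable id.
CELLS = list(product(FIXTURES, BACKENDS))
CELL_IDS = [f"{fixture.fix_id}-{backend.backend_id}" for fixture, backend in CELLS]
```

With pytest, a list like `CELLS` would typically be fed to `pytest.mark.parametrize(..., ids=CELL_IDS)` so each pair shows up as its own labelled test. The sketch covers only the matrix portion, so its cell count is not meant to match the PR's total.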
brendancol (Contributor, Author) left a comment on May 23, 2026
PR Review: VRT backend parity with .ovr sidecar interactions (#2321 sub-task 4)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
- `test_vrt_backend_parity_2321.py:317-328` -- the `sidecar-uint16-window` cell only checks eager-vs-dask parity for its windowed transform; it does not pin absolute values. The float32 windowed test (`test_windowed_vrt_shifts_coords_and_transform_consistently`) does pin absolute coords and the pixel-size half of the transform tuple, so a regression where BOTH backends shift coords the same wrong way would slip past the sidecar cell. Either reuse that absolute pin (the bundled sidecar fixture has known pixel size 0.001 and origin -120.0, 45.0), or document explicitly that the sidecar cell is parity-only.
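To make the gap concrete, here is a minimal sketch of why a parity-only check can miss a shared regression. It assumes a north-up raster whose origin (-120.0, 45.0) is the top-left corner as (x, y) with square 0.001 pixels; the window offsets and every function name are hypothetical, and the "regression" is simulated.

```python
import math

PIXEL = 0.001                     # pixel size quoted in the review
ORIGIN_X, ORIGIN_Y = -120.0, 45.0 # assumed to be the top-left corner (x, y)


def windowed_origin(row_off, col_off):
    """Top-left corner of a window starting at pixel (row_off, col_off).

    Assumes a north-up grid: x grows with columns, y shrinks with rows.
    """
    return ORIGIN_X + col_off * PIXEL, ORIGIN_Y - row_off * PIXEL


def swapped_offsets_origin(row_off, col_off):
    """Simulated regression: row and column offsets are swapped."""
    return windowed_origin(col_off, row_off)


def matches_pin(origin, row_off, col_off):
    """Absolute check: compare an origin against the analytically known one."""
    expected = windowed_origin(row_off, col_off)
    return all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(origin, expected))


# Both "backends" carry the same bug, so they agree with each other.
eager = swapped_offsets_origin(4, 10)
dask = swapped_offsets_origin(4, 10)

parity_ok = eager == dask                 # parity check passes
absolute_ok = matches_pin(eager, 4, 10)   # absolute pin fails
```

The parity comparison only proves the two reads agree; the pin against known pixel size and origin is what ties them to the right answer.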
Nits (optional improvements)
- `test_vrt_backend_parity_2321.py:355-366` -- the `vrt_fixture` resolver globs `*.vrt` and returns the first match, then re-reads it via `open_geotiff` purely to recover the dtype. The dtype is already determined by the builder; storing it next to the path on first build (or hard-coding per `_FixtureSpec`) avoids a stray `open_geotiff` round-trip on every cache hit.
- `test_vrt_backend_parity_2321.py:431` -- `test_sidecar_vrt_attrs_match_inline` builds both VRTs fresh on every cell (no session cache) because it uses `tmp_path` directly. Two parametrised cells (eager + dask) and the build is cheap, so this is fine, but a session-scoped fixture would shave a few writes. Pure perf nit.
- `test_vrt_backend_parity_2321.py:299-329` -- the four `_FIXTURES` entries collapse onto two builders via `builder.__name__`. The collapse logic lives in the `vrt_fixture` resolver and is implicit; a one-line comment on `_FIXTURES` saying "fix_id is unique per (builder, window); the resolver caches per builder" would save a reader a hop.
What looks good
- The harness mirrors `test_backend_parity_matrix.py` structure faithfully: same `_materialise` / `_assert_pixels_equal` shape, same dataclass-driven matrix, same labelled assertion messages. A future move to a shared parity harness is mechanical, as the docstring claims.
- `test_assert_metadata_parity_flags_transform_drift` is the right kind of sanity check: it locks the helper's own behaviour so a regression that quietly drops one of the metadata assertions cannot let the matrix pass with empty checks.
- The "windowed-cell straddles the tile seam" comment on `two-tile-float32-window-spans-seam` is the kind of intent-explaining note that catches the next reviewer up immediately.
- Re-using the bundled `overview_external_ovr_uint16.tif` / `.tif.ovr` fixture (no new on-disk fixtures added) keeps the corpus small and matches the brief's "only if no existing case exercises the path."
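The "test the helper itself" idea praised above might look something like the sketch below. It is not the PR's `_assert_metadata_parity`: the function names are hypothetical, the attrs are plain dicts, and the attribute values (including the `georef_status` string) are placeholders. Only the four attribute names come from the PR description.

```python
# Attribute names taken from the PR description; everything else is assumed.
METADATA_KEYS = ("transform", "crs", "crs_wkt", "georef_status")


def assert_metadata_parity(reference, candidate, label):
    """Raise AssertionError naming the first metadata attr that differs."""
    for key in METADATA_KEYS:
        ref_value = reference.get(key)
        cand_value = candidate.get(key)
        if ref_value != cand_value:
            raise AssertionError(
                f"[{label}] attr {key!r} differs: {ref_value!r} != {cand_value!r}"
            )


def helper_flags_transform_drift():
    """Self-test: the helper must fail when only the transform drifts."""
    base = {
        "transform": (0.001, 0.0, -120.0, 0.0, -0.001, 45.0),  # placeholder
        "crs": 4326,                                           # placeholder
        "crs_wkt": "PLACEHOLDER_WKT",
        "georef_status": "placeholder-status",
    }
    # Change only the transform; all other attrs stay identical.
    drifted = dict(base, transform=(0.001, 0.0, -119.0, 0.0, -0.001, 45.0))
    try:
        assert_metadata_parity(base, drifted, "self-test")
    except AssertionError as exc:
        return "transform" in str(exc)
    return False
```

If someone later drops `"transform"` from the key list, `helper_flags_transform_drift()` returns `False`, so the matrix cannot silently pass with a hollowed-out helper.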
Checklist
- Algorithm matches reference -- tests-only, no algorithm changes.
- All implemented backends produce consistent results -- exactly what the new cells assert.
- NaN handling is correct -- `_assert_pixels_equal` uses NaN-aware comparison for floats.
- Edge cases covered by tests -- window straddles tile seam; sidecar with and without window; cross-fixture pyramid comparison.
- Dask chunk boundaries handled correctly -- `chunks=(16, 16)` on a 16x32 mosaic yields a 1x2 grid; window cells force the graph to read both backing sources.
- No premature materialization or unnecessary copies -- `_materialise` is the only `.compute()` site, called once at assertion time.
- [n/a] Benchmark exists or is not needed -- tests-only.
- [n/a] README feature matrix updated -- tests-only.
- Docstrings present and accurate -- module-level docstring explicitly names the parent issue and the acceptance bar.
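The chunk-grid arithmetic in the checklist can be verified with a few lines. The sketch assumes the 16x32 mosaic is two 16x16 tiles placed side by side; the window column range used here is hypothetical, and both helper names are made up for illustration.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each axis for a given array shape."""
    return tuple(math.ceil(size / chunk) for size, chunk in zip(shape, chunks))


def tiles_touched(col_start, col_stop, tile_width):
    """Indices of side-by-side tiles overlapped by columns [col_start, col_stop)."""
    first = col_start // tile_width
    last = (col_stop - 1) // tile_width
    return list(range(first, last + 1))


grid = chunk_grid((16, 32), (16, 16))   # 16 rows x 32 cols, 16x16 chunks
seam_window = tiles_touched(8, 24, 16)  # hypothetical window across the seam
one_tile_window = tiles_touched(0, 16, 16)
```

Here `grid` is `(1, 2)`, and a window covering columns 8 to 24 touches tiles 0 and 1, which is the condition that forces a read from both backing sources.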
- Cache (path, dtype) in the vrt_fixture resolver instead of re-opening the VRT on every cache hit just to recover the dtype.
- Add a dedicated test that pins the absolute coord/transform shift for the sidecar windowed cell. The parametrised matrix only checks eager-vs-dask equality; the absolute pin catches the regression where both backends drift the same way.
- Clarify that fix_id collapses to builder name in the cache via a comment on _FIXTURES.
brendancol (Contributor, Author) left a comment on May 23, 2026
PR Review (second pass): after follow-up commits
Status of original findings
- Suggestion (sidecar window absolute pin): fixed in `test_sidecar_window_shifts_to_known_coords`. The bundled fixture's pixel size (0.001) and origin (-120.0, 45.0) are pinned, so a regression that drifts BOTH backends the same way now surfaces.
- Nit (cache dtype lookup): fixed. `vrt_fixture` now caches `(path, dtype)` in an in-process dict; cache hits no longer round-trip through `open_geotiff`.
- Nit (per-cell sidecar build in `test_sidecar_vrt_attrs_match_inline`): deferred. Two parametrised cells, builds are cheap, session-scoping adds complexity for no measurable gain.
- Nit (comment on `_FIXTURES` collapse): fixed. One-line comment above `_FIXTURES` says the fix_id is unique per (builder, window) and the resolver caches per builder.
Blockers
None.
Suggestions
None.
Nits
None.
What looks good
- The new sidecar absolute-shift test follows the same pattern as `test_windowed_vrt_shifts_coords_and_transform_consistently`, so the two windowed surfaces (float32 mosaic + sidecar) now have symmetric coverage.
- The cache rework removes a stray `open_geotiff` call per cache hit, which is the kind of small cleanup that pays off when the matrix grows.
All 13 cells pass locally.
The cache dict introduced in the previous review-fix commit lived inside the function-scoped ``vrt_fixture`` and was rebuilt every test, so every cell that shared a builder still triggered a fresh write. On POSIX that is just wasted I/O. On Windows ``to_geotiff`` renames a ``.tmp`` file over the existing target, and the previous cell's read may still hold the target open, producing PermissionError / OSError. Move the cache to a session-scoped fixture so the build runs once per builder for the entire pytest session.
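A build-once cache of the kind this commit describes could be sketched as below. This is an assumption-laden illustration, not the PR's fixture: `BuildCache` and `build_demo_vrt` are hypothetical names, the "VRT" is a stub text file, and the dtype string is a placeholder. The builder mimics the write-to-`.tmp`-then-rename pattern mentioned above using `os.replace`.

```python
import os
import tempfile


class BuildCache:
    """Runs each builder at most once and remembers its (path, dtype) result."""

    def __init__(self, root):
        self.root = root
        self._entries = {}
        self.builds = 0  # counts real builds, for demonstration only

    def get(self, builder):
        key = builder.__name__  # entries sharing a builder share one build
        if key not in self._entries:
            self._entries[key] = builder(self.root)
            self.builds += 1
        return self._entries[key]


def build_demo_vrt(root):
    """Hypothetical builder: write a stub file via a .tmp rename."""
    path = os.path.join(root, "demo.vrt")
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        handle.write("<VRTDataset/>")
    os.replace(tmp_path, path)  # the rename only ever happens once per session
    return path, "float32"      # dtype recorded at build time, not re-read


def demo():
    with tempfile.TemporaryDirectory() as root:
        cache = BuildCache(root)
        first = cache.get(build_demo_vrt)
        second = cache.get(build_demo_vrt)  # cache hit, no second write
        return first == second, cache.builds, os.path.exists(first[0])
```

In a pytest suite, one instance of such a cache would usually be created by a fixture declared with `scope="session"`, with `tmp_path_factory.mktemp(...)` supplying the root directory, so every cell that shares a builder reuses the same files.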
Summary
- Adds `xrspatial/geotiff/tests/test_vrt_backend_parity_2321.py`, a small parity matrix that asserts pixel and metadata equality between eager and dask VRT reads, plus sidecar `.ovr` vs inline-overview equivalence.
- Mirrors `test_backend_parity_matrix.py`: declarative fixture and backend specs, a shared `_assert_metadata_parity` helper, one parametrised cell per (fixture, backend) pair.

Backend coverage
Tests-only. Backends exercised by the new cells: eager and dask.
GPU + VRT parity already has dedicated coverage in `test_vrt_lazy_chunks_1814.py` and is out of scope here.

Test plan
- `pytest xrspatial/geotiff/tests/test_vrt_backend_parity_2321.py -v`: 12 cells pass locally.
- `test_assert_metadata_parity_flags_transform_drift` confirms the harness itself fails on a transform-only drift.
- The bundled sidecar fixture (`overview_external_ovr_uint16.tif` + `.tif.ovr`) and the inline-overview comparison fixture both already live under `golden_corpus/fixtures/`. No new on-disk fixtures added.

Closes #2330. Sub-task 4 of #2321.