geotiff: extend round-trip invariants to corpus fixtures (#1986) #2086

Merged
brendancol merged 2 commits into main from issue-1986 on May 18, 2026
Conversation

@brendancol
Contributor

Summary

Closes #1986.

Extends the canonical round-trip invariants module from PR #2007 with the deferred cases. Each new test pulls a fixture from the #1930 golden corpus (no new fixtures added per the issue's constraint), runs read -> write -> read, and pins the invariant in the class docstring.

  • planar-separate multiband: pixels byte-equal in memory, on-disk layout drifts to chunky.
  • internal-IFD overviews: base IFD bytes byte-equal, overview factor list preserved.
  • COG: base bytes byte-equal, factors preserved, LAYOUT=COG marker documented as semantic drift (xrspatial writer does not emit the GDAL ghost-IFD block).
  • sparse tiled: elided zero tiles materialise to zeros on read; rewrite is a normal tiled GeoTIFF.
  • VRT mosaic: open_geotiff(.vrt) matches np.concatenate of sources; rewrite is a plain GeoTIFF.
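
The planar-separate case is the least intuitive of these, so here is a minimal sketch of what "pixels byte-equal in memory, on-disk layout drifts to chunky" means. It uses plain Python stand-ins rather than xrspatial code: `decode_planar`, `decode_chunky`, and `encode_chunky` are hypothetical helpers written only for this illustration, and the byte values are made up.

```python
# Illustrative only: plain-Python stand-ins, not the xrspatial reader/writer.

def decode_planar(raw: bytes, n_pixels: int, n_bands: int) -> list:
    """Band-sequential (planar-separate) layout: R0 R1 ... G0 G1 ... B0 B1 ..."""
    return [tuple(raw[b * n_pixels + p] for b in range(n_bands))
            for p in range(n_pixels)]

def decode_chunky(raw: bytes, n_pixels: int, n_bands: int) -> list:
    """Pixel-interleaved (chunky) layout: R0 G0 B0 R1 G1 B1 ..."""
    return [tuple(raw[p * n_bands + b] for b in range(n_bands))
            for p in range(n_pixels)]

def encode_chunky(pixels: list) -> bytes:
    """Model a writer that always emits the chunky layout."""
    return bytes(v for px in pixels for v in px)

# A 4-pixel, 3-band strip stored planar-separate (made-up values).
planar_on_disk = bytes([10, 20, 30, 40,   # band 0
                        11, 21, 31, 41,   # band 1
                        12, 22, 32, 42])  # band 2

pixels = decode_planar(planar_on_disk, n_pixels=4, n_bands=3)       # read
chunky_on_disk = encode_chunky(pixels)                              # write
pixels_again = decode_chunky(chunky_on_disk, n_pixels=4, n_bands=3) # read

assert pixels_again == pixels             # in-memory pixels survive the cycle
assert chunky_on_disk != planar_on_disk   # the on-disk byte order does not
```

The two assertions mirror the two halves of the invariant: equality is checked on decoded pixels, and the layout difference is recorded as expected drift rather than a failure.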

For corpus-backed cases, the fixed-point check runs from the second cycle onward, because the fixtures still carry the deprecated #1984 geographic attrs (geog_citation, angular_units, ...), which are dropped on the first write.
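
A rough sketch of why the fixed-point comparison starts at the second cycle, under the assumption that a write simply omits the deprecated attrs. `write_then_read` and `cycles` are hypothetical stand-ins for the real read/write calls, and the `crs` and `nodata` keys and all values are invented for the example; only `geog_citation` and `angular_units` come from the description above.

```python
# Hypothetical model of the attrs seen across successive write -> read cycles.
# Not xrspatial code; the real tests go through the actual reader and writer.

DEPRECATED_ATTRS = {"geog_citation", "angular_units"}  # names from the PR text

def write_then_read(attrs: dict) -> dict:
    """Model one cycle: deprecated attrs are not re-emitted by the writer."""
    return {k: v for k, v in attrs.items() if k not in DEPRECATED_ATTRS}

def cycles(attrs: dict, n: int) -> list:
    """Return the attrs before any cycle and after each of n cycles."""
    states = [attrs]
    for _ in range(n):
        states.append(write_then_read(states[-1]))
    return states

# Made-up fixture attrs that still carry the deprecated keys.
fixture_attrs = {
    "crs": 4326,
    "nodata": 0,
    "geog_citation": "WGS 84",
    "angular_units": "degree",
}

s0, s1, s2 = cycles(fixture_attrs, 2)

assert s1 != s0   # first cycle is lossy: the deprecated attrs drop
assert s2 == s1   # from the second cycle onward the output is a fixed point
```

Comparing `s0` to `s1` would fail by design, so the convergence assertion is made between the first and second rewrites instead.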

Test plan

  • pytest xrspatial/geotiff/tests/test_round_trip_invariants.py -- 20 passed
  • Existing incident round-trip files still pass (test_metadata_round_trip_1484.py, test_no_georef_writer_round_trip_1949.py, test_int_coords_round_trip_hotfix_1962.py)

Adds five corpus-backed test classes to the canonical round-trip
invariants module, closing the deferred-cases list:

* planar-separate multiband (uses ``planar_separate_uint8_rgb``)
* internal-IFD overviews (uses ``overview_internal_uint16``)
* COG layout (uses ``cog_internal_overview_uint16``)
* sparse tiled (uses ``sparse_tiled_uint16``)
* VRT mosaic (uses ``dtype_uint8`` and ``dtype_uint16``)

Each case pins the canonical invariant in its docstring: byte-equal
pixels with documented on-disk drift (planar layout collapses to
chunky, sparse tiles materialise, ``LAYOUT=COG`` marker does not
re-emit, VRT XML does not round-trip). Fixed-point convergence is
checked from the second cycle onward because the corpus fixtures
carry deprecated #1984 attrs that drop on the first write.
@github-actions (bot) added the "performance" label (PR touches performance-sensitive code) on May 18, 2026
#1986)

* Fix module docstring: the COG bullet incorrectly said the
  ``LAYOUT=COG`` ghost-IFD marker re-emits on round-trip; the test
  actually asserts the opposite. Reword to match the invariant.
* Tighten the overview bullet to "factor list preserved" since the
  test does not read overview pixels; per-pixel overview parity is
  the oracle's job (#1930).
* Document why ``TestVRTRoundTripFromCorpus`` can fixed-point on the
  first cycle (rasterio-written intermediate, not a corpus fixture).
* Drop the duplicated ``LAYOUT=COG`` precondition in the COG test;
  that property is already pinned by the corpus oracle module.
* Move ``write_vrt`` to module-level imports for consistency.
@brendancol merged commit 0749e15 into main on May 18, 2026
5 checks passed
Labels

performance (PR touches performance-sensitive code)

Development

Successfully merging this pull request may close these issues.

geotiff: canonical round-trip invariants (byte- vs semantic-equivalent)
