Epic: GeoTIFF correctness and backend parity release gate #2341

@brendancol

Goal

Make the promised GeoTIFF paths resistant to silent wrongness by asserting parity across pixels, coordinates, and metadata attrs.

The highest release risk is not an obvious crash. It is a plausible-looking raster whose transform, crs, nodata, georef_status, or coords are wrong but which downstream spatial functions still trust.

Scope

This epic covers correctness gates for the release-promised paths:

  • Local eager GeoTIFF reads/writes.
  • COG reads/writes for stable codecs.
  • Windowed reads.
  • Dask reads where promised.
  • Sidecar overview reads where promised.
  • VRT simple mosaics where promised by the VRT epic.
  • GPU only as experimental smoke/parity checks, not as a stable release blocker unless a GPU issue affects CPU behavior.

Priority Risks

  • Pixels match but attrs or coords are wrong.
  • Eager and dask paths disagree.
  • GPU fallback paths produce different metadata than CPU paths.
  • Windowed reads return unshifted transforms.
  • Overview reads lose CRS/transform/nodata metadata.
  • Nodata masking/promotion differs across backends.
  • Integer nodata, NaN nodata, MinIsWhite, and masked_nodata lifecycle drift.
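To make the "unshifted transform" risk concrete, here is a minimal sketch of the arithmetic a windowed read has to get right. It assumes attrs['transform'] is a 6-tuple in affine order (a, b, c, d, e, f), where x = a*col + b*row + c and y = d*col + e*row + f, and that coords are pixel centers. Neither assumption is confirmed by this issue, and the helper names are hypothetical.

```python
def shift_transform(transform, row_off, col_off):
    """Transform of a window whose top-left pixel is (row_off, col_off).

    Assumes affine ordering (a, b, c, d, e, f); this is NOT GDAL's
    geotransform ordering.
    """
    a, b, c, d, e, f = transform
    return (a, b, c + a * col_off + b * row_off,
            d, e, f + d * col_off + e * row_off)


def pixel_center_coords(transform, height, width):
    """1-D x and y pixel-center coords for a north-up transform (b == d == 0)."""
    a, b, c, d, e, f = transform
    if b != 0 or d != 0:
        raise ValueError("rotated or sheared transforms need 2-D coords")
    xs = [c + a * (col + 0.5) for col in range(width)]
    ys = [f + e * (row + 0.5) for row in range(height)]
    return xs, ys


# Made-up 10 m grid; a 2x3 window starting at row 20, col 30.
full = (10.0, 0.0, 500000.0, 0.0, -10.0, 4100000.0)
window = shift_transform(full, row_off=20, col_off=30)
window_xs, window_ys = pixel_center_coords(window, height=2, width=3)

# The window's coords must equal the matching slice of the full raster's coords.
full_xs, full_ys = pixel_center_coords(full, height=100, width=100)
assert window_xs == full_xs[30:33]
assert window_ys == full_ys[20:22]
```

A windowed read that returns the original transform would still produce the right pixels and shape, which is why the check compares both the transform and the derived coords.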

Work Items

  • Define a small geotiff_release_gate test set that runs in normal CI.
  • Add helper assertions for full raster equivalence:
    • values
    • dims
    • coords
    • attrs['transform']
    • attrs['crs'] / attrs['crs_wkt']
    • attrs['nodata']
    • attrs['masked_nodata']
    • attrs['georef_status']
    • attrs['raster_type']
    • relevant rich metadata when promised
  • Add eager vs dask parity tests for representative files.
  • Add windowed-read tests that assert shifted coords and transform.
  • Add overview/sidecar tests that assert metadata parity, not just pixel parity.
  • Add read/write/read round-trip tests for stable codecs.
  • Add negative tests for ambiguous metadata that must fail closed.
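The attrs part of the helper assertions could look roughly like the sketch below. It works on plain dicts so it runs without dependencies, and it covers only attrs; values, dims, and coords would need their own checks. The function names, the GATED_ATTRS tuple, and the strict type comparison are assumptions for illustration, not an existing API.

```python
import math

# Attr keys taken from the work-item list above.
GATED_ATTRS = ("transform", "crs", "crs_wkt", "nodata",
               "masked_nodata", "georef_status", "raster_type")

_MISSING = object()


def _same(a, b):
    """Equality that treats NaN == NaN and compares sequences element-wise."""
    if isinstance(a, float) and isinstance(b, float) and math.isnan(a) and math.isnan(b):
        return True
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(_same(x, y) for x, y in zip(a, b))
    # Deliberately strict: int 0 and float 0.0 count as different, so that
    # nodata dtype promotion shows up as a mismatch. This may be too strict
    # if backends legitimately return different scalar types.
    return type(a) is type(b) and a == b


def attr_mismatches(actual, expected, keys=GATED_ATTRS):
    """Return one human-readable line per gated attr that differs."""
    problems = []
    for key in keys:
        got = actual.get(key, _MISSING)
        want = expected.get(key, _MISSING)
        if got is _MISSING and want is _MISSING:
            continue
        if got is _MISSING or want is _MISSING:
            problems.append(f"{key}: present on only one side")
        elif not _same(got, want):
            problems.append(f"{key}: {got!r} != {want!r}")
    return problems


def assert_attrs_equal(actual, expected, keys=GATED_ATTRS):
    """Raise AssertionError listing every differing attr, not just the first."""
    problems = attr_mismatches(actual, expected, keys)
    if problems:
        raise AssertionError("attrs differ:\n  " + "\n  ".join(problems))


# Placeholder attrs; the value types are illustrative only.
eager = {"transform": (10.0, 0.0, 500000.0, 0.0, -10.0, 4100000.0),
         "nodata": float("nan"), "masked_nodata": True}
lazy = dict(eager, nodata=float("nan"))
assert_attrs_equal(lazy, eager)  # NaN nodata compares equal here
```

Reporting all mismatches at once, rather than stopping at the first, should make a parity failure easier to triage when several attrs drift together.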

Non-Goals

  • Exhaustive codec matrix in the release gate.
  • Making GPU a stable backend.
  • Full GDAL/rasterio parity for every edge case.

Acceptance Criteria

  • A stable-path regression cannot pass by returning correct pixel values with wrong metadata.
  • Eager and dask promised paths agree on values, coords, dims, and attrs.
  • Windowed reads have correct shape, coords, and shifted transform.
  • Overview and sidecar reads preserve promised geospatial metadata.
  • Nodata lifecycle behavior is pinned for representative int, float, and masked cases.
  • Unsupported or ambiguous metadata fails loudly instead of flattening or guessing.
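As one illustration of "fails loudly instead of flattening or guessing", the sketch below refuses to pick a single nodata value when the inputs disagree. The scenario (several mosaic sources that each declare a nodata value), the function name, and the exception type are all hypothetical; the issue does not specify which ambiguous cases must fail closed.

```python
import math


class AmbiguousMetadataError(ValueError):
    """Hypothetical error for metadata that cannot be resolved without guessing."""


def single_nodata(source_nodata):
    """Return the one nodata value shared by all sources, or raise.

    source_nodata is a list with one declared nodata value per source
    (None meaning "no nodata declared"). An empty list returns None.
    """
    if not source_nodata:
        return None
    first = source_nodata[0]
    for index, value in enumerate(source_nodata[1:], start=1):
        both_nan = (isinstance(first, float) and isinstance(value, float)
                    and math.isnan(first) and math.isnan(value))
        if not both_nan and value != first:
            # Fail closed: do not silently keep the first value.
            raise AmbiguousMetadataError(
                f"source 0 declares nodata={first!r} but source {index} "
                f"declares nodata={value!r}"
            )
    return first
```

A negative test for this would assert that the error is raised, for example with `pytest.raises(AmbiguousMetadataError)`, rather than asserting on whichever value a permissive implementation happened to return.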
