Skip to content

geotiff: oracle normalises multi-band axis order (#1930)#2047

Merged
brendancol merged 3 commits into
mainfrom
1930-oracle-axis-order
May 18, 2026
Merged

geotiff: oracle normalises multi-band axis order (#1930)#2047
brendancol merged 3 commits into
mainfrom
1930-oracle-axis-order

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Closes the JPEG-YCbCr xfail that the eager and dask phase-3 backend modules of issue #1930 shared. rasterio reads every TIFF as (bands, H, W); xrspatial reads multi-band rasters as (H, W, B). The convention difference is documented, not a bug, so the oracle now normalises before comparing.

  • Add _normalise_axis_order in _oracle.py. It transposes the trailing-band side to leading-band when H and W line up after the move. The single-band (1, H, W) vs (H, W) squeeze branch is preserved, and a genuine 3-D shape mismatch falls through unchanged so the comparison still raises with the real shapes.
  • Both the bit-exact pixel check (_assert_pixels) and the lossy shape-only check (_assert_shape_only) route through the helper, so the JPEG-YCbCr corpus cell (lossy) and any future multi-band strict fixture get the same treatment.
  • Eager-numpy and dask-numpy backend modules: remove compression_jpeg_uint8_ycbcr from _PARITY_GAPS (the test now passes). Docstrings move the entry to "Resolved gaps", mirroring the masked-nodata pattern from PR geotiff: oracle understands masked-nodata candidates (#1930) #2046.
  • Dask+GPU module: the axis-order gap is closed, but JPEG-YCbCr still fails earlier on the GPU decode path with OSError: broken data stream. Move the entry from _PARITY_GAPS to _DASK_GPU_SKIPS so the test stays xfailed with a reason that matches the pure-GPU module's _GPU_SKIPS.
  • GPU module: refresh the _GPU_SKIPS comment so it no longer says the axis-order gap is open on the CPU paths.

Test plan

  • pytest xrspatial/geotiff/tests/test_golden_corpus_*_1930.py xrspatial/geotiff/tests/golden_corpus/ -- 160 passed, 8 skipped, 6 xfailed, no failures, no XPASS.
  • New tests in golden_corpus/test_oracle.py:
    • synthesised (B, H, W) ref vs (H, W, B) candidate passes strict
    • pixel mismatch between the two layouts still raises
    • single-band (1, H, W) vs (H, W) squeeze still works
    • lossy shape-only path accepts matching shapes across layouts
    • lossy still rejects a genuine shape mismatch
    • unit pin on _normalise_axis_order's four supported cases plus the fall-through
  • _build_candidate gains a band_axis=('leading'|'trailing') knob; a new _write_multiband_tiff helper writes the rasterio side of the fixture pair.

Part of #1930 (multi-band axis-order parity gap). The citation-CRS gap is tracked separately.

Closes the JPEG-YCbCr xfail shared by the eager and dask phase-3
backend modules.

rasterio reads every TIFF as (bands, H, W); xrspatial reads
multi-band rasters as (H, W, B). The convention difference is
documented, not a bug, so the oracle now normalises before
comparing. The new _normalise_axis_order helper transposes the
trailing-band side to leading-band whenever H and W line up after
the move; the single-band squeeze branch is preserved, and a
genuine 3-D shape mismatch falls through unchanged so the
comparison still raises with the real shapes.

Both the bit-exact pixel check and the lossy shape-only check
route through the helper, so the JPEG-YCbCr corpus cell (lossy)
and any future multi-band strict fixture get the same treatment.

Per-module changes:

* eager-numpy and dask-numpy: remove compression_jpeg_uint8_ycbcr
  from _PARITY_GAPS (the test now passes); docstring moves the
  entry to "Resolved gaps", mirroring the masked-nodata pattern
  from PR #2046.
* dask+GPU: the axis-order gap is closed, but JPEG-YCbCr still
  fails earlier on the GPU decode path with OSError: broken data
  stream. Move the entry from _PARITY_GAPS to _DASK_GPU_SKIPS to
  capture the actual failure mode (matches the pure-GPU module's
  _GPU_SKIPS rationale).
* gpu: refresh the _GPU_SKIPS comment so it no longer claims the
  axis-order gap is open on the CPU paths.

New test coverage in golden_corpus/test_oracle.py:

* synthesised (B, H, W) ref vs (H, W, B) candidate passes strict;
* pixel mismatch between the two layouts still raises;
* single-band (1, H, W) vs (H, W) squeeze still works;
* lossy shape-only path accepts matching shapes across layouts;
* lossy still rejects a genuine shape mismatch;
* unit-level pin on _normalise_axis_order's four supported cases
  plus the fall-through.

_build_candidate gains a band_axis=('leading'|'trailing') knob and
a new _write_multiband_tiff helper writes the rasterio side of the
fixture pair.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 18, 2026
…1930)

- _normalise_axis_order now short-circuits when ref and candidate
  already share a shape. This makes the H == W == B ambiguity
  (e.g. a 3-band 3x3 raster, shape (3, 3, 3) regardless of layout)
  resolve as a no-op rather than an unnecessary transpose, and
  cleans up the no-op path for already-(B, H, W) / (B, H, W) and
  2-D / 2-D inputs as well.
- Docstring spells out the two-arm symmetry and the H == W == B
  limit so future readers do not have to grep the implementation.
- Test pin: the 2-D pass-through path is exercised directly, and
  the H == W == B case is pinned to return the inputs untouched
  via identity assertions on the helper's outputs.

No behaviour change for any corpus fixture; the JPEG-YCbCr cell
still passes on the eager and dask paths and still xfails on the
GPU and dask+GPU paths for the unrelated decode-error reason.
After the oracle's axis-order normaliser landed, the eager / dask
JPEG xfails resolved cleanly. The GPU and dask+GPU paths still
xfailed because the GPU JPEG-YCbCr decoder is not implemented.
Address both:

* GPU module: introduce ``_GPU_CPU_FALLBACK`` to mark fixtures
  whose codec is genuinely not implemented on the GPU. The parity
  test routes those through ``on_gpu_failure='auto'`` instead of
  ``'strict'``, which exercises the documented CPU fallback
  contract. The fallback yields a CPU-decoded DataArray that the
  oracle compares cleanly against the rasterio reference. JPEG
  cell now PASSES.

* dask+GPU module: move JPEG from ``_DASK_GPU_SKIPS`` to
  ``_INTENTIONAL_SKIPS``. The chunked GPU path cannot fall back
  per chunk -- the decode error surfaces at ``.compute()`` time
  regardless of ``on_gpu_failure`` mode, and there is no
  foreseeable fix short of implementing nvCOMP JPEG-YCbCr
  decode. Plain skip is more honest than xfail(strict=True).

Result: GPU and dask+GPU modules carry no JPEG xfails. The only
remaining corpus xfails are the four ``crs_citation_only`` entries
across the four backend modules, owned by the parallel PR #2054.
@brendancol brendancol merged commit 434b0fc into main May 18, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant