diff --git a/CHANGELOG.md b/CHANGELOG.md index 11c61c20..3a16349a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,7 @@ ### Unreleased #### Bug fixes and improvements +- Reconcile `xrspatial.geotiff.SUPPORTED_FEATURES` with the GeoTIFF release-contract tiering proposed in epic #2340. Adds `reader.windowed` at `stable` (covered by the existing window-read suite) and `reader.dask` at `stable` (covered by the cross-backend parity matrix in `test_backend_parity_matrix.py` and `test_backend_full_parity_2211.py`). Demotes `reader.allow_rotated` and `reader.allow_unparseable_crs` from `advanced` to `experimental` to match the epic's placement of permissive read-side escape hatches in the Experimental tier. A new shape test (`test_supported_features_shape_2348.py`) pins the structural invariants of the mapping (every entry carries a tier label; the tier set is closed at `{stable, advanced, experimental, internal_only}`; the dict literal contains no duplicate keys) so future drift fails CI. Runtime behaviour of every read and write path is unchanged; this is metadata-only. Callers gating on the exact string value of these two demoted entries will need to update their checks. (#2348) - Promote the local COG read and write paths to the `stable` tier in `xrspatial.geotiff.SUPPORTED_FEATURES`. `SUPPORTED_FEATURES['writer.cog']` and `SUPPORTED_FEATURES['reader.local_cog']` now report `stable`; `reader.http_cog` stays `advanced` while the HTTP transport surface is contracted separately. The stable COG contract covers axis-aligned 2D / 3D rasters, the CPU writer and CPU reader, the lossless codecs (`none`, `deflate`, `lzw`, `zstd`, `packbits`), internal overviews, and normal CRS / transform / dtype / nodata / band / pixel-is-area / pixel-is-point round-trip. GPU COG paths, experimental codecs, rotated transforms, external `.tif.ovr` sidecars, file-like destinations with `cog=True`, BigTIFF COG, and HTTP COG remain outside the contract. Backed by the writer compliance suite (#2292), the cross-backend parity gate (#2293), and the per-tile byte-budget contract (#2294 / #2298). The reference docs (`docs/source/reference/geotiff.rst`) and the COG overview notebook spell out the full contract. (#2300) - Resolve the GeoTIFF writer's `GeographicTypeGeoKey` / `ProjectedCSTypeGeoKey` decision via pyproj instead of an EPSG number range. The legacy heuristic (4326 + 4000-4999 -> geographic, else projected) silently mis-tagged geographic CRSes registered outside the 4000-4999 block (NAD83(2011) = 6318, GDA2020 = 7844, WGS 84 (G2139) = 9057, etc.) as projected and projected codes inside the block (4087 / 4088 / 4499) as geographic, corrupting the CRS at write time. The writer now calls `pyproj.CRS.from_epsg(...).is_geographic`. When pyproj can't classify a code (uninstalled, or installed but the local PROJ database lacks the entry), the writer raises the new `UnknownCRSModelTypeError` rather than guessing -- a small vetted allowlist (4326, 4269, 4267, 4258, 4283, 4322, 4230, 4019, 4047) is still honoured for the pyproj-missing case. `pyproj` is now listed under the `geotiff` extra. (#2277) - Shut down the per-tile compression `ThreadPoolExecutor` on every exit path of the streaming tiled-write code in `to_geotiff`. The old code only called `shutdown(wait=True)` after the tile-row loop completed, so any mid-stream raise (compression failure, dask compute failure, file write failure) bypassed shutdown and leaked worker threads. The loop now runs inside `try/finally` and the finally calls `shutdown(wait=True, cancel_futures=True)` so queued tiles get dropped on the error path instead of blocking the unwind. The pool's workers carry an `xrspatial-geotiff-tile-compress` `thread_name_prefix` so leak-detection tests can tell them apart from dask's own offload/scheduler pools. (#2276) diff --git a/docs/source/reference/release_gate_geotiff.rst b/docs/source/reference/release_gate_geotiff.rst index f40f721a..1e64bc9b 100644 --- a/docs/source/reference/release_gate_geotiff.rst +++ b/docs/source/reference/release_gate_geotiff.rst @@ -56,6 +56,21 @@ Local GeoTIFF read and write ``nodata`` all survive read. - ``xrspatial/geotiff/tests/test_backend_pixel_parity_matrix_1813.py``, ``xrspatial/geotiff/tests/test_backend_parity_matrix.py`` + * - ``reader.windowed`` + - stable + - ``open_geotiff(window=(x0, y0, w, h))`` returns the requested + pixel sub-rectangle for tiled and stripped layouts; out-of-bounds + and zero-area windows raise rather than silently clamp; coords + on georeferenced inputs match the eager full-read slice. + - ``xrspatial/geotiff/tests/test_window_out_of_bounds_1634.py``, + ``xrspatial/geotiff/tests/test_no_georef_windowed_coords_1710.py`` + * - ``reader.dask`` + - stable + - ``open_geotiff(chunks=...)`` returns a Dask-backed + :class:`xarray.DataArray` that computes to the same pixels, + coords, and ``attrs`` as the eager numpy read. + - ``xrspatial/geotiff/tests/test_backend_parity_matrix.py``, + ``xrspatial/geotiff/tests/test_backend_full_parity_2211.py`` * - ``writer.local_file`` - stable - ``to_geotiff`` writes a file that ``open_geotiff`` reads back diff --git a/xrspatial/geotiff/_attrs.py b/xrspatial/geotiff/_attrs.py index 61825456..369dac84 100644 --- a/xrspatial/geotiff/_attrs.py +++ b/xrspatial/geotiff/_attrs.py @@ -228,11 +228,39 @@ # * GPU COG read / write (``writer.gpu``, ``reader.gpu``). # * Experimental codecs (``lerc``, ``jpeg2000``, ``j2k``, ``lz4``) and the # internal-only ``jpeg`` codec. -# * Rotated transforms (``reader.allow_rotated``). +# * Rotated transforms (``reader.allow_rotated``); wave-1 of epic +# #2340 additionally demoted this entry from ``advanced`` to +# ``experimental`` -- see the alignment block immediately below. # * External ``.tif.ovr`` sidecars (``reader.sidecar_ovr``). # * File-like destinations with ``cog=True``. # * BigTIFF COG (tracked separately). # * HTTP / range COG (``reader.http_cog``); see the per-key comment below. +# +# Epic #2340 alignment (release contract) +# --------------------------------------- +# The epic tiers the public geotiff surface as Stable / Advanced / +# Experimental / Internal-only for the next release. The mapping below +# is the source of truth that the docs page, the user-guide notebook, +# and the writer gates read from, so it has to match the epic. +# +# Wave-1 reconciliation under #2340: +# +# * Add ``reader.windowed`` at ``stable``. Windowed reads have a +# release-gate suite (``test_window_*`` + ``test_no_georef_windowed_coords_1710`` +# and the GPU window cases) and the epic places them in Stable. +# * Add ``reader.dask`` at ``stable``. Dask reads are parity-tested +# against the eager numpy reader via +# ``test_backend_parity_matrix.py`` and ``test_backend_full_parity_2211.py``, +# which is the bar the epic asks for ("dask reads only where +# parity-tested"). +# * Demote ``reader.allow_rotated`` from ``advanced`` to ``experimental``. +# The opt-in lets a caller bypass the read-side rotated-transform +# check that ``RotatedTransformError`` defends; the epic places +# rotated writes as Unsupported and the read-side escape hatch sits +# naturally in Experimental. +# * Demote ``reader.allow_unparseable_crs`` from ``advanced`` to +# ``experimental``. The epic places "explicit-CRS permissive escape +# hatches" in Experimental. SUPPORTED_FEATURES = { # Codecs. Tier 1 lossless integer + float byte-for-byte round-trip. 'codec.none': 'stable', @@ -250,12 +278,25 @@ 'codec.jpeg': 'internal_only', # Read paths. 'reader.local_file': 'stable', + # Windowed reads (#2340): release-gate covered by the window-read + # suite in ``xrspatial/geotiff/tests/`` (eager numpy, dask, GPU, + # HTTP, and VRT branches each have a window-validation test). + 'reader.windowed': 'stable', + # Dask reads (#2340): parity-tested against the eager numpy reader + # by ``test_backend_parity_matrix.py`` and ``test_backend_full_parity_2211.py``, + # which is the gate the epic asks for. GPU-backed dask reads remain + # under ``reader.gpu`` and stay experimental. + 'reader.dask': 'stable', 'reader.fsspec': 'advanced', 'reader.http': 'advanced', 'reader.vrt': 'advanced', 'reader.sidecar_ovr': 'advanced', - 'reader.allow_rotated': 'advanced', - 'reader.allow_unparseable_crs': 'advanced', + # Permissive escape hatches (#2340): demoted from ``advanced`` to + # ``experimental``. Both opt-ins let a caller bypass a read-side + # check (rotated transform / unparseable CRS) the writer normally + # rejects; the epic places these in Experimental. + 'reader.allow_rotated': 'experimental', + 'reader.allow_unparseable_crs': 'experimental', # COG reader paths (issue #2291): split out from the previous # single COG concept (which only carried ``writer.cog``) so the # local and HTTP reader variants can promote independently of the diff --git a/xrspatial/geotiff/tests/test_supported_features_shape_2348.py b/xrspatial/geotiff/tests/test_supported_features_shape_2348.py new file mode 100644 index 00000000..2f25424f --- /dev/null +++ b/xrspatial/geotiff/tests/test_supported_features_shape_2348.py @@ -0,0 +1,155 @@ +"""Structural invariants for ``SUPPORTED_FEATURES`` (issue #2348). + +Background +---------- +Epic #2340 reconciles ``SUPPORTED_FEATURES`` with the GeoTIFF release +contract. The tier label on each entry is the source of truth that the +docs page, the user-guide notebook table, and the writer gates all +read from. If a new entry lands without a tier, or with a tier outside +the closed set ``{stable, advanced, experimental, internal_only}``, +the downstream consumers silently mis-render or skip it. + +The tier-coverage test (``test_supported_features_tiers_2137.py``) +exercises the codec gate. This file pins the structural invariants of +the mapping itself so a stray entry without a tier, or with a +free-form tier string, fails CI before it reaches the consumers. + +What this test pins +------------------- +* The mapping is a dict, non-empty, keyed by ``"."`` + strings. +* Every value is one of ``stable`` / ``advanced`` / ``experimental`` + / ``internal_only``. The tier set is closed. +* Every key is unique (defended against accidental dict-literal + duplicates that Python silently dedupes). +* The wave-1 reconciliation under #2340 lands the expected + promotions / demotions (``reader.windowed`` and ``reader.dask`` + added at ``stable``; ``reader.allow_rotated`` and + ``reader.allow_unparseable_crs`` at ``experimental``). +""" +from __future__ import annotations + +import ast +from pathlib import Path + +import pytest + +from xrspatial.geotiff import SUPPORTED_FEATURES + +_VALID_TIERS = frozenset({'stable', 'advanced', 'experimental', 'internal_only'}) + + +def test_supported_features_is_non_empty_dict(): + """Catches an accidental refactor that swaps the dict for a + different container or empties it.""" + assert isinstance(SUPPORTED_FEATURES, dict) + assert len(SUPPORTED_FEATURES) > 0 + + +def test_every_entry_has_a_tier(): + """Every value is a non-empty string. Catches ``None`` / + placeholder values that would otherwise silently disable the + docs / notebook renderer for that row.""" + for name, tier in SUPPORTED_FEATURES.items(): + assert isinstance(tier, str), (name, type(tier).__name__) + assert tier, name + + +def test_tier_set_is_closed(): + """Every tier value is one of the four documented labels. The + set is closed; introducing a new tier requires updating the + docs, the notebook, and this test together.""" + seen = set(SUPPORTED_FEATURES.values()) + extras = seen - _VALID_TIERS + assert not extras, ( + f"SUPPORTED_FEATURES carries unrecognised tier labels {sorted(extras)!r}; " + f"valid labels are {sorted(_VALID_TIERS)!r}. Adding a new tier requires " + f"updating xrspatial.geotiff.__init__ docs, the user-guide notebook table, " + f"and this test in the same commit." + ) + + +def test_every_tier_label_is_used(): + """Every documented tier label appears on at least one entry. + Catches accidental drift where a tier is documented in the + package docstring but no feature references it (which makes the + label dead code).""" + seen = set(SUPPORTED_FEATURES.values()) + missing = _VALID_TIERS - seen + assert not missing, ( + f"the following tier labels are documented but unused: {sorted(missing)!r}. " + f"Either remove the label from the docs / notebook or add an entry that uses it." + ) + + +def test_keys_follow_group_dot_name_shape(): + """Every key is ``"."`` with a non-empty group and + name. The renderer in the user-guide notebook splits on ``.`` + once to group rows; an entry without a dot would land in an + "(unknown)" bucket.""" + for key in SUPPORTED_FEATURES: + assert isinstance(key, str), type(key).__name__ + head, _, tail = key.partition('.') + assert head and tail, key + assert key.count('.') >= 1, key + + +def test_keys_are_unique_in_source(): + """Dict literals silently dedupe duplicate keys ('a': 'x', 'a': + 'y' -> {'a': 'y'}). Parse the source and assert no key appears + twice so a future copy/paste typo fails CI instead of silently + overwriting the earlier tier.""" + src = Path(__file__).resolve().parents[1] / '_attrs.py' + tree = ast.parse(src.read_text()) + + found_keys: list[str] | None = None + for node in ast.walk(tree): + if not isinstance(node, ast.Assign): + continue + targets = node.targets + if (len(targets) == 1 + and isinstance(targets[0], ast.Name) + and targets[0].id == 'SUPPORTED_FEATURES' + and isinstance(node.value, ast.Dict)): + found_keys = [] + for k in node.value.keys: + # Dict-literal keys are ast.Constant on 3.8+. + if isinstance(k, ast.Constant) and isinstance(k.value, str): + found_keys.append(k.value) + break + + assert found_keys is not None, ( + "could not locate the ``SUPPORTED_FEATURES = {...}`` literal in " + "xrspatial/geotiff/_attrs.py; this test parses the source so a " + "duplicate key cannot be hidden by Python's dict-literal dedup." + ) + duplicates = [k for k in found_keys if found_keys.count(k) > 1] + assert not duplicates, ( + f"duplicate keys in the SUPPORTED_FEATURES literal: " + f"{sorted(set(duplicates))!r}. Python silently dedupes these so the " + f"later tier wins; remove the duplicates or rename them." + ) + + +@pytest.mark.parametrize("key,tier", [ + ('reader.windowed', 'stable'), + ('reader.dask', 'stable'), + ('reader.allow_rotated', 'experimental'), + ('reader.allow_unparseable_crs', 'experimental'), +]) +def test_epic_2340_wave_1_reconciliation(key, tier): + """Wave-1 reconciliation under epic #2340 lands the expected + promotions and demotions. Pinned so a future revert that bumps + these back to ``advanced`` or drops them entirely fails this + test before it reaches the docs / release notes. + """ + assert key in SUPPORTED_FEATURES, ( + f"{key!r} dropped from SUPPORTED_FEATURES; epic #2340 introduced " + f"this entry at tier {tier!r}. Restore it or update the test if " + f"the reconciliation has been revised." + ) + assert SUPPORTED_FEATURES[key] == tier, ( + f"{key!r} expected tier {tier!r}, got {SUPPORTED_FEATURES[key]!r}. " + f"Epic #2340 set this tier; a promotion / demotion needs to be " + f"justified in the changelog and reflected here." + )