Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
### Unreleased

#### Bug fixes and improvements
- Reconcile `xrspatial.geotiff.SUPPORTED_FEATURES` with the GeoTIFF release-contract tiering proposed in epic #2340. Adds `reader.windowed` at `stable` (covered by the existing window-read suite) and `reader.dask` at `stable` (covered by the cross-backend parity matrix in `test_backend_parity_matrix.py` and `test_backend_full_parity_2211.py`). Demotes `reader.allow_rotated` and `reader.allow_unparseable_crs` from `advanced` to `experimental` to match the epic's placement of permissive read-side escape hatches in the Experimental tier. A new shape test (`test_supported_features_shape_2348.py`) pins the structural invariants of the mapping (every entry carries a tier label; the tier set is closed at `{stable, advanced, experimental, internal_only}`; the dict literal contains no duplicate keys) so future drift fails CI. Runtime behaviour of every read and write path is unchanged; this is metadata-only. Callers gating on the exact string value of these two demoted entries will need to update their checks. (#2348)
- Promote the local COG read and write paths to the `stable` tier in `xrspatial.geotiff.SUPPORTED_FEATURES`. `SUPPORTED_FEATURES['writer.cog']` and `SUPPORTED_FEATURES['reader.local_cog']` now report `stable`; `reader.http_cog` stays `advanced` while the HTTP transport surface is contracted separately. The stable COG contract covers axis-aligned 2D / 3D rasters, the CPU writer and CPU reader, the lossless codecs (`none`, `deflate`, `lzw`, `zstd`, `packbits`), internal overviews, and normal CRS / transform / dtype / nodata / band / pixel-is-area / pixel-is-point round-trip. GPU COG paths, experimental codecs, rotated transforms, external `.tif.ovr` sidecars, file-like destinations with `cog=True`, BigTIFF COG, and HTTP COG remain outside the contract. Backed by the writer compliance suite (#2292), the cross-backend parity gate (#2293), and the per-tile byte-budget contract (#2294 / #2298). The reference docs (`docs/source/reference/geotiff.rst`) and the COG overview notebook spell out the full contract. (#2300)
- Resolve the GeoTIFF writer's `GeographicTypeGeoKey` / `ProjectedCSTypeGeoKey` decision via pyproj instead of an EPSG number range. The legacy heuristic (4326 + 4000-4999 -> geographic, else projected) silently mis-tagged geographic CRSes registered outside the 4000-4999 block (NAD83(2011) = 6318, GDA2020 = 7844, WGS 84 (G2139) = 9057, etc.) as projected and projected codes inside the block (4087 / 4088 / 4499) as geographic, corrupting the CRS at write time. The writer now calls `pyproj.CRS.from_epsg(...).is_geographic`. When pyproj can't classify a code (uninstalled, or installed but the local PROJ database lacks the entry), the writer raises the new `UnknownCRSModelTypeError` rather than guessing -- a small vetted allowlist (4326, 4269, 4267, 4258, 4283, 4322, 4230, 4019, 4047) is still honoured for the pyproj-missing case. `pyproj` is now listed under the `geotiff` extra. (#2277)
- Shut down the per-tile compression `ThreadPoolExecutor` on every exit path of the streaming tiled-write code in `to_geotiff`. The old code only called `shutdown(wait=True)` after the tile-row loop completed, so any mid-stream raise (compression failure, dask compute failure, file write failure) bypassed shutdown and leaked worker threads. The loop now runs inside `try/finally` and the finally calls `shutdown(wait=True, cancel_futures=True)` so queued tiles get dropped on the error path instead of blocking the unwind. The pool's workers carry an `xrspatial-geotiff-tile-compress` `thread_name_prefix` so leak-detection tests can tell them apart from dask's own offload/scheduler pools. (#2276)
Expand Down
15 changes: 15 additions & 0 deletions docs/source/reference/release_gate_geotiff.rst
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,21 @@ Local GeoTIFF read and write
``nodata`` all survive read.
- ``xrspatial/geotiff/tests/test_backend_pixel_parity_matrix_1813.py``,
``xrspatial/geotiff/tests/test_backend_parity_matrix.py``
* - ``reader.windowed``
- stable
- ``open_geotiff(window=(x0, y0, w, h))`` returns the requested
pixel sub-rectangle for tiled and stripped layouts; out-of-bounds
and zero-area windows raise rather than silently clamp; coords
on georeferenced inputs match the eager full-read slice.
- ``xrspatial/geotiff/tests/test_window_out_of_bounds_1634.py``,
``xrspatial/geotiff/tests/test_no_georef_windowed_coords_1710.py``
* - ``reader.dask``
- stable
- ``open_geotiff(chunks=...)`` returns a Dask-backed
:class:`xarray.DataArray` that computes to the same pixels,
coords, and ``attrs`` as the eager numpy read.
- ``xrspatial/geotiff/tests/test_backend_parity_matrix.py``,
``xrspatial/geotiff/tests/test_backend_full_parity_2211.py``
* - ``writer.local_file``
- stable
- ``to_geotiff`` writes a file that ``open_geotiff`` reads back
Expand Down
47 changes: 44 additions & 3 deletions xrspatial/geotiff/_attrs.py
Original file line number Diff line number Diff line change
Expand Up @@ -228,11 +228,39 @@
# * GPU COG read / write (``writer.gpu``, ``reader.gpu``).
# * Experimental codecs (``lerc``, ``jpeg2000``, ``j2k``, ``lz4``) and the
# internal-only ``jpeg`` codec.
# * Rotated transforms (``reader.allow_rotated``).
# * Rotated transforms (``reader.allow_rotated``); wave-1 of epic
# #2340 additionally demoted this entry from ``advanced`` to
# ``experimental`` -- see the alignment block immediately below.
# * External ``.tif.ovr`` sidecars (``reader.sidecar_ovr``).
# * File-like destinations with ``cog=True``.
# * BigTIFF COG (tracked separately).
# * HTTP / range COG (``reader.http_cog``); see the per-key comment below.
#
# Epic #2340 alignment (release contract)
# ---------------------------------------
# The epic tiers the public geotiff surface as Stable / Advanced /
# Experimental / Internal-only for the next release. The mapping below
# is the source of truth that the docs page, the user-guide notebook,
# and the writer gates read from, so it has to match the epic.
#
# Wave-1 reconciliation under #2340:
#
# * Add ``reader.windowed`` at ``stable``. Windowed reads have a
# release-gate suite (``test_window_*`` + ``test_no_georef_windowed_coords_1710``
# and the GPU window cases) and the epic places them in Stable.
# * Add ``reader.dask`` at ``stable``. Dask reads are parity-tested
# against the eager numpy reader via
# ``test_backend_parity_matrix.py`` and ``test_backend_full_parity_2211.py``,
# which is the bar the epic asks for ("dask reads only where
# parity-tested").
# * Demote ``reader.allow_rotated`` from ``advanced`` to ``experimental``.
# The opt-in lets a caller bypass the read-side rotated-transform
# check that ``RotatedTransformError`` defends; the epic places
# rotated writes as Unsupported and the read-side escape hatch sits
# naturally in Experimental.
# * Demote ``reader.allow_unparseable_crs`` from ``advanced`` to
# ``experimental``. The epic places "explicit-CRS permissive escape
# hatches" in Experimental.
SUPPORTED_FEATURES = {
# Codecs. Tier 1 lossless integer + float byte-for-byte round-trip.
'codec.none': 'stable',
Expand All @@ -250,12 +278,25 @@
'codec.jpeg': 'internal_only',
# Read paths.
'reader.local_file': 'stable',
# Windowed reads (#2340): release-gate covered by the window-read
# suite in ``xrspatial/geotiff/tests/`` (eager numpy, dask, GPU,
# HTTP, and VRT branches each have a window-validation test).
'reader.windowed': 'stable',
# Dask reads (#2340): parity-tested against the eager numpy reader
# by ``test_backend_parity_matrix.py`` and ``test_backend_full_parity_2211.py``,
# which is the gate the epic asks for. GPU-backed dask reads remain
# under ``reader.gpu`` and stay experimental.
'reader.dask': 'stable',
'reader.fsspec': 'advanced',
'reader.http': 'advanced',
'reader.vrt': 'advanced',
'reader.sidecar_ovr': 'advanced',
'reader.allow_rotated': 'advanced',
'reader.allow_unparseable_crs': 'advanced',
# Permissive escape hatches (#2340): demoted from ``advanced`` to
# ``experimental``. Both opt-ins let a caller bypass a read-side
# check (rotated transform / unparseable CRS) the writer normally
# rejects; the epic places these in Experimental.
'reader.allow_rotated': 'experimental',
'reader.allow_unparseable_crs': 'experimental',
# COG reader paths (issue #2291): split out from the previous
# single COG concept (which only carried ``writer.cog``) so the
# local and HTTP reader variants can promote independently of the
Expand Down
155 changes: 155 additions & 0 deletions xrspatial/geotiff/tests/test_supported_features_shape_2348.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,155 @@
"""Structural invariants for ``SUPPORTED_FEATURES`` (issue #2348).

Background
----------
Epic #2340 reconciles ``SUPPORTED_FEATURES`` with the GeoTIFF release
contract. The tier label on each entry is the source of truth that the
docs page, the user-guide notebook table, and the writer gates all
read from. If a new entry lands without a tier, or with a tier outside
the closed set ``{stable, advanced, experimental, internal_only}``,
the downstream consumers silently mis-render or skip it.

The tier-coverage test (``test_supported_features_tiers_2137.py``)
exercises the codec gate. This file pins the structural invariants of
the mapping itself so a stray entry without a tier, or with a
free-form tier string, fails CI before it reaches the consumers.

What this test pins
-------------------
* The mapping is a dict, non-empty, keyed by ``"<group>.<name>"``
strings.
* Every value is one of ``stable`` / ``advanced`` / ``experimental``
/ ``internal_only``. The tier set is closed.
* Every key is unique (defended against accidental dict-literal
duplicates that Python silently dedupes).
* The wave-1 reconciliation under #2340 lands the expected
promotions / demotions (``reader.windowed`` and ``reader.dask``
added at ``stable``; ``reader.allow_rotated`` and
``reader.allow_unparseable_crs`` at ``experimental``).
"""
from __future__ import annotations

import ast
from pathlib import Path

import pytest

from xrspatial.geotiff import SUPPORTED_FEATURES

_VALID_TIERS = frozenset({'stable', 'advanced', 'experimental', 'internal_only'})


def test_supported_features_is_non_empty_dict():
"""Catches an accidental refactor that swaps the dict for a
different container or empties it."""
assert isinstance(SUPPORTED_FEATURES, dict)
assert len(SUPPORTED_FEATURES) > 0


def test_every_entry_has_a_tier():
"""Every value is a non-empty string. Catches ``None`` /
placeholder values that would otherwise silently disable the
docs / notebook renderer for that row."""
for name, tier in SUPPORTED_FEATURES.items():
assert isinstance(tier, str), (name, type(tier).__name__)
assert tier, name


def test_tier_set_is_closed():
"""Every tier value is one of the four documented labels. The
set is closed; introducing a new tier requires updating the
docs, the notebook, and this test together."""
seen = set(SUPPORTED_FEATURES.values())
extras = seen - _VALID_TIERS
assert not extras, (
f"SUPPORTED_FEATURES carries unrecognised tier labels {sorted(extras)!r}; "
f"valid labels are {sorted(_VALID_TIERS)!r}. Adding a new tier requires "
f"updating xrspatial.geotiff.__init__ docs, the user-guide notebook table, "
f"and this test in the same commit."
)


def test_every_tier_label_is_used():
"""Every documented tier label appears on at least one entry.
Catches accidental drift where a tier is documented in the
package docstring but no feature references it (which makes the
label dead code)."""
seen = set(SUPPORTED_FEATURES.values())
missing = _VALID_TIERS - seen
assert not missing, (
f"the following tier labels are documented but unused: {sorted(missing)!r}. "
f"Either remove the label from the docs / notebook or add an entry that uses it."
)


def test_keys_follow_group_dot_name_shape():
"""Every key is ``"<group>.<name>"`` with a non-empty group and
name. The renderer in the user-guide notebook splits on ``.``
once to group rows; an entry without a dot would land in an
"(unknown)" bucket."""
for key in SUPPORTED_FEATURES:
assert isinstance(key, str), type(key).__name__
head, _, tail = key.partition('.')
assert head and tail, key
assert key.count('.') >= 1, key


def test_keys_are_unique_in_source():
"""Dict literals silently dedupe duplicate keys ('a': 'x', 'a':
'y' -> {'a': 'y'}). Parse the source and assert no key appears
twice so a future copy/paste typo fails CI instead of silently
overwriting the earlier tier."""
src = Path(__file__).resolve().parents[1] / '_attrs.py'
tree = ast.parse(src.read_text())

found_keys: list[str] | None = None
for node in ast.walk(tree):
if not isinstance(node, ast.Assign):
continue
targets = node.targets
if (len(targets) == 1
and isinstance(targets[0], ast.Name)
and targets[0].id == 'SUPPORTED_FEATURES'
and isinstance(node.value, ast.Dict)):
found_keys = []
for k in node.value.keys:
# Dict-literal keys are ast.Constant on 3.8+.
if isinstance(k, ast.Constant) and isinstance(k.value, str):
found_keys.append(k.value)
break

assert found_keys is not None, (
"could not locate the ``SUPPORTED_FEATURES = {...}`` literal in "
"xrspatial/geotiff/_attrs.py; this test parses the source so a "
"duplicate key cannot be hidden by Python's dict-literal dedup."
)
duplicates = [k for k in found_keys if found_keys.count(k) > 1]
assert not duplicates, (
f"duplicate keys in the SUPPORTED_FEATURES literal: "
f"{sorted(set(duplicates))!r}. Python silently dedupes these so the "
f"later tier wins; remove the duplicates or rename them."
)


@pytest.mark.parametrize("key,tier", [
('reader.windowed', 'stable'),
('reader.dask', 'stable'),
('reader.allow_rotated', 'experimental'),
('reader.allow_unparseable_crs', 'experimental'),
])
def test_epic_2340_wave_1_reconciliation(key, tier):
"""Wave-1 reconciliation under epic #2340 lands the expected
promotions and demotions. Pinned so a future revert that bumps
these back to ``advanced`` or drops them entirely fails this
test before it reaches the docs / release notes.
"""
assert key in SUPPORTED_FEATURES, (
f"{key!r} dropped from SUPPORTED_FEATURES; epic #2340 introduced "
f"this entry at tier {tier!r}. Restore it or update the test if "
f"the reconciliation has been revised."
)
assert SUPPORTED_FEATURES[key] == tier, (
f"{key!r} expected tier {tier!r}, got {SUPPORTED_FEATURES[key]!r}. "
f"Epic #2340 set this tier; a promotion / demotion needs to be "
f"justified in the changelog and reflected here."
)
Loading