From 60569ddc145c798fc6042602d707c7dc8f600f42 Mon Sep 17 00:00:00 2001
From: Brendan Collins
Date: Sun, 17 May 2026 05:16:31 -0700
Subject: [PATCH 1/2] geotiff: document attrs tier classification in _attrs.py (#1984)

Extends the module docstring with the three-tier attrs contract from
issue #1984:

- Canonical keys (crs, crs_wkt, transform, nodata, raster_type,
  extra_tags, gdal_metadata, gdal_metadata_xml, x_resolution,
  y_resolution, resolution_unit, _xrspatial_geotiff_contract) are owned
  by xrspatial and survive round-trip.
- Compatibility aliases (nodatavals, _FillValue) are read for ecosystem
  interop but writers never emit them when the canonical key is set.
- Best-effort pass-through keys (GeoKey-derived fields,
  image_description, extra_samples, colormap variants) are preserved
  when the writer can reconstruct them from canonical state.

Docstring-only change. The contract version key is mentioned for
reference; it is populated by a later PR in the series.
---
 xrspatial/geotiff/_attrs.py | 56 +++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/xrspatial/geotiff/_attrs.py b/xrspatial/geotiff/_attrs.py
index 37f8615c1..9f1d72553 100644
--- a/xrspatial/geotiff/_attrs.py
+++ b/xrspatial/geotiff/_attrs.py
@@ -14,6 +14,62 @@
 entries.
 
 Extracted in step 5 of issue #1813.
+
+Attrs contract (issue #1984)
+----------------------------
+
+The keys written into ``DataArray.attrs`` by the read paths fall into
+three tiers. Writers honour the same split: canonical keys are emitted,
+compatibility aliases are read but never written when the canonical key
+is present, and pass-through keys are kept when the writer can
+reconstruct them from canonical state.
+
+The contract version is recorded in ``attrs['_xrspatial_geotiff_contract']``
+(currently ``1``). Consumers can branch on this integer if the tier
+split changes in a future release.
+
+Canonical (xrspatial owns these; round-trip stable):
+
+- ``crs``: EPSG integer code for the horizontal CRS.
+- ``crs_wkt``: PROJ WKT string for the horizontal CRS. Always emitted.
+- ``transform``: tuple of ``(origin_x, origin_y, pixel_width, pixel_height)``.
+- ``nodata``: declared file sentinel as stored in the GDAL_NODATA tag. The
+  declared-vs-masked split is tracked in issue #1988; the canonical
+  semantics here describe the intended behaviour once #1988 lands.
+- ``raster_type``: ``'area'`` (implicit / RasterPixelIsArea) or ``'point'``
+  (explicit / RasterPixelIsPoint).
+- ``extra_tags``: list of ``(tag_id, type_id, count, value)`` tuples for
+  TIFF tags outside the structured set.
+- ``gdal_metadata``: dict parsed from the GDAL_METADATA XML tag.
+- ``gdal_metadata_xml``: raw GDAL_METADATA XML string.
+- ``x_resolution``, ``y_resolution``, ``resolution_unit``: TIFF
+  XResolution / YResolution / ResolutionUnit values.
+- ``_xrspatial_geotiff_contract``: integer version of this contract.
+
+Compatibility alias (read for ecosystem interop; writers must not emit
+when the canonical key is present):
+
+- ``nodatavals``: rioxarray per-band tuple form of ``nodata``.
+- ``_FillValue``: CF-convention name for ``nodata``.
+
+Best-effort pass-through (preserved when the writer can reconstruct
+from canonical state, otherwise dropped on round-trip):
+
+- ``crs_name``: human-readable CRS name from the GeoKey directory.
+- ``geog_citation``: GeographicTypeGeoKey citation string.
+- ``datum_code``: GeogGeodeticDatumGeoKey value.
+- ``angular_units``: GeogAngularUnitsGeoKey value.
+- ``linear_units``: ProjLinearUnitsGeoKey value.
+- ``semi_major_axis``: GeogSemiMajorAxisGeoKey value.
+- ``inv_flattening``: GeogInvFlatteningGeoKey value.
+- ``projection_code``: ProjectedCSTypeGeoKey value.
+- ``vertical_crs``: VerticalCSTypeGeoKey value.
+- ``vertical_citation``: VerticalCitationGeoKey value.
+- ``vertical_units``: VerticalUnitsGeoKey value.
+- ``image_description``: TIFF ImageDescription tag.
+- ``extra_samples``: TIFF ExtraSamples tag.
+- ``colormap``, ``colormap_rgba``, ``cmap``: palette data attached to
+  single-band paletted images.
 """
 
 from __future__ import annotations

From f2e81f1cc0b1bce65421ebaecf9e8c74b8a8cfa8 Mon Sep 17 00:00:00 2001
From: Brendan Collins
Date: Sun, 17 May 2026 10:13:56 -0700
Subject: [PATCH 2/2] geotiff: address #2000 review nits in attrs tier docstring

- transform: corrected layout from (origin_x, origin_y, pixel_width,
  pixel_height) to the rasterio-style 6-tuple (pixel_width, 0.0,
  origin_x, 0.0, pixel_height, origin_y) that
  _transform_tuple_from_pixel_geometry actually emits, and noted that
  it is omitted on files with no GeoTIFF transform tags.
- crs_wkt: replaced "Always emitted" (the reader only emits it when
  geo_info.crs_wkt is not None) with the more accurate "Present on read
  whenever any CRS information is available". Also dropped the
  unconfirmed "PROJ" qualifier on the WKT dialect.
- gdal_metadata_xml: noted that writers prefer this over the parsed
  dict when both attrs are present.
- extra_tags: noted that the attr is omitted when no out-of-band tags
  are present, matching the conditional emission in
  _populate_attrs_from_geo_info.
---
 xrspatial/geotiff/_attrs.py | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/xrspatial/geotiff/_attrs.py b/xrspatial/geotiff/_attrs.py
index 9f1d72553..a0d5aa1e5 100644
--- a/xrspatial/geotiff/_attrs.py
+++ b/xrspatial/geotiff/_attrs.py
@@ -31,17 +31,23 @@
 Canonical (xrspatial owns these; round-trip stable):
 
 - ``crs``: EPSG integer code for the horizontal CRS.
-- ``crs_wkt``: PROJ WKT string for the horizontal CRS. Always emitted.
-- ``transform``: tuple of ``(origin_x, origin_y, pixel_width, pixel_height)``.
+- ``crs_wkt``: WKT string for the horizontal CRS. Present on read whenever
+  any CRS information is available.
+- ``transform``: rasterio-style 6-tuple
+  ``(pixel_width, 0.0, origin_x, 0.0, pixel_height, origin_y)``. Omitted
+  for files with no GeoTIFF transform tags (ModelTransformation,
+  ModelPixelScale, or ModelTiepoint).
 - ``nodata``: declared file sentinel as stored in the GDAL_NODATA tag. The
   declared-vs-masked split is tracked in issue #1988; the canonical
   semantics here describe the intended behaviour once #1988 lands.
 - ``raster_type``: ``'area'`` (implicit / RasterPixelIsArea) or ``'point'``
   (explicit / RasterPixelIsPoint).
 - ``extra_tags``: list of ``(tag_id, type_id, count, value)`` tuples for
-  TIFF tags outside the structured set.
+  TIFF tags outside the structured set. Omitted when no out-of-band
+  tags are present.
 - ``gdal_metadata``: dict parsed from the GDAL_METADATA XML tag.
-- ``gdal_metadata_xml``: raw GDAL_METADATA XML string.
+- ``gdal_metadata_xml``: raw GDAL_METADATA XML string. Writers prefer this
+  over ``gdal_metadata`` when both are present.
 - ``x_resolution``, ``y_resolution``, ``resolution_unit``: TIFF
   XResolution / YResolution / ResolutionUnit values.
 - ``_xrspatial_geotiff_contract``: integer version of this contract.
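
Illustrative note, not part of either patch: the sketch below shows one way
a consumer might read attrs under the tier split documented above. The
helper names (contract_version, resolve_nodata, pixel_to_world) and the
example attrs dict are hypothetical and do not exist in xrspatial. The
sketch assumes the 6-tuple follows the usual rasterio affine ordering
(a, b, c, d, e, f), as the PATCH 2/2 docstring states, and that the
contract key may be absent because PATCH 1/2 notes it is populated by a
later PR.

```python
from typing import Any, Mapping, Optional, Sequence, Tuple


def contract_version(attrs: Mapping[str, Any]) -> Optional[int]:
    """Return the attrs contract version, or None when the key is absent."""
    version = attrs.get("_xrspatial_geotiff_contract")
    return None if version is None else int(version)


def resolve_nodata(attrs: Mapping[str, Any]) -> Optional[float]:
    """Prefer the canonical ``nodata`` key, then the compatibility aliases."""
    if attrs.get("nodata") is not None:
        return attrs["nodata"]
    band_values = attrs.get("nodatavals")
    if band_values:
        # rioxarray-style per-band tuple; this sketch takes the first band.
        return band_values[0]
    return attrs.get("_FillValue")


def pixel_to_world(
    transform: Sequence[float], col: float, row: float
) -> Tuple[float, float]:
    """Apply the 6-tuple (a, b, c, d, e, f) to a (col, row) pixel index."""
    a, b, c, d, e, f = transform
    return (a * col + b * row + c, d * col + e * row + f)


# Hand-built attrs for illustration only; not the output of a real read.
example_attrs = {
    "_xrspatial_geotiff_contract": 1,
    "nodata": -9999.0,
    "nodatavals": (0.0,),  # alias is ignored because ``nodata`` is set
    "transform": (30.0, 0.0, 500000.0, 0.0, -30.0, 4200000.0),
}

if contract_version(example_attrs) in (None, 1):
    sentinel = resolve_nodata(example_attrs)
    x, y = pixel_to_world(example_attrs["transform"], col=2, row=3)
```

Checking the canonical key first mirrors the writer-side rule that aliases
are never emitted when the canonical key is present; how a real consumer
should treat a missing or newer contract version is left open here.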