rasterize: propagate like.attrs/coords and emit _FillValue (#2018) #2024
When the caller passed ``like=template``, ``rasterize()`` dropped all of the template's ``attrs`` (including ``crs``, ``res``, ``transform``, ``nodatavals``) and rebuilt the output coords via ``np.linspace`` from re-derived bounds. The output then no longer ``xr.align``-ed with the template, and chained pipelines like ``slope(rasterize(gdf, like=elevation))`` silently saw a no-CRS, no-res raster and recomputed cell-size from coords -- the same class of bug as the #1407 sky_view_factor cellsize issue.

Three fixes, all at the same site (line 2227, where the output ``xr.DataArray`` is built):

1. ``_extract_grid_from_like`` now also returns the input ``x`` and ``y`` coords and ``attrs``. ``rasterize()`` copies ``like.attrs`` onto the output.
2. When the resolved output grid matches ``like`` (same width, height, and bounds), the output reuses ``like.coords['x']`` and ``like.coords['y']`` directly so the result is bit-identical to the template and ``xr.align`` keeps working.
3. The ``fill`` value is now emitted as ``attrs['_FillValue']`` and ``attrs['nodatavals']`` when ``fill`` is not NaN. Downstream tools (``to_geotiff``, custom masks) can identify nodata pixels.

All four backends (numpy, cupy, dask+numpy, dask+cupy) route through the same final ``xr.DataArray`` constructor, so the fix is in one place and behaves identically across backends.

Adds ``TestMetadataPropagation`` to ``test_rasterize.py`` with 9 cases covering attrs propagation, bit-identical coord reuse, fill-value emission, isolation from the template's attrs dict, and parity across all four backends.

Closes #2018.
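
The coord-reuse and attrs-copy behaviour described above can be sketched with plain NumPy arrays and dicts standing in for ``like.coords`` and ``like.attrs``. This is a minimal illustration, not the code in the PR: ``resolve_output_grid`` and its ``grid_matches_like`` flag are hypothetical names, and the fallback assumes a north-up raster with cell-centre coords.

```python
import numpy as np


def resolve_output_grid(like_x, like_y, like_attrs, width, height, bounds,
                        grid_matches_like):
    """Hypothetical helper: choose output coords and attrs for the raster.

    like_x / like_y / like_attrs stand in for the template's coords and
    attrs; grid_matches_like is assumed to be computed by the caller.
    """
    xmin, ymin, xmax, ymax = bounds
    if grid_matches_like and like_x is not None and like_y is not None:
        # Reuse the template arrays so the output is bit-identical to it.
        x, y = like_x, like_y
    else:
        # Fallback: rebuild cell-centre coords from bounds (assumes a
        # north-up grid, so y runs from top to bottom).
        px = (xmax - xmin) / width
        py = (ymax - ymin) / height
        x = np.linspace(xmin + px / 2, xmax - px / 2, width)
        y = np.linspace(ymax - py / 2, ymin + py / 2, height)
    # Shallow copy so mutating the output attrs never touches the template.
    attrs = dict(like_attrs) if like_attrs is not None else {}
    return x, y, attrs


like_x = np.array([0.5, 1.5, 2.5])
like_y = np.array([1.5, 0.5])
like_attrs = {"crs": "EPSG:4326", "res": (1.0, 1.0)}
x, y, attrs = resolve_output_grid(
    like_x, like_y, like_attrs, 3, 2, (0.0, 0.0, 3.0, 2.0), True
)
```

Returning the template arrays themselves, rather than recomputing them, is what makes the exact-equality guarantee hold without any floating-point reasoning.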
PR Review: rasterize: propagate like.attrs/coords and emit _FillValue (#2018)
Nice fix, the bug is real and the one-site fix at the final
Blockers (must fix before merge)
Suggestions (should fix, not blocking)
Nits (optional improvements)
What looks good
Checklist
Strip inherited nodata / _FillValue / nodatavals from like.attrs before emitting a fresh triplet keyed off the actual fill, so stale sentinels from a prior round-trip can't outlive the new fill. Detect numpy-typed NaN via float() + np.isnan() so np.float32(np.nan) is treated as NaN. Propagate non-dim coords (e.g. rioxarray's spatial_ref). Replace the float-equal bounds check in the coord-reuse predicate with a check on "caller didn't override bounds/resolution". Adds tests for inherited nodata, numpy-scalar NaN, spatial_ref propagation, and a geotiff round-trip pinning the user-visible #2018 fix.
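
A rough sketch of the stale-sentinel stripping and NaN detection described in this commit message follows. The helper names ``is_nan_fill`` and ``build_output_attrs`` are hypothetical, and the assumption that all three keys of the triplet carry the same fill value is an illustration choice, not something the commit message states. The ``float()`` conversion matters because ``np.float32`` is not a subclass of Python's ``float``, so an ``isinstance``-based NaN check would miss ``np.float32(np.nan)``.

```python
import numpy as np

# Nodata-related keys named in the commit message.
_NODATA_KEYS = ("nodata", "_FillValue", "nodatavals")


def is_nan_fill(fill):
    """Return True when fill is NaN, including numpy scalar NaNs."""
    try:
        return bool(np.isnan(float(fill)))
    except (TypeError, ValueError):
        # Values that cannot be converted to float are not NaN.
        return False


def build_output_attrs(like_attrs, fill):
    """Hypothetical helper: copy template attrs, then refresh nodata keys."""
    # Drop inherited sentinels so a stale value cannot outlive the new fill.
    attrs = {k: v for k, v in (like_attrs or {}).items()
             if k not in _NODATA_KEYS}
    if not is_nan_fill(fill):
        # Assumption: every key in the triplet is keyed off the same fill.
        attrs["nodata"] = fill
        attrs["_FillValue"] = fill
        attrs["nodatavals"] = (fill,)
    return attrs


inherited = {"crs": "EPSG:4326", "_FillValue": 0, "nodatavals": (0,)}
fresh = build_output_attrs(inherited, -9999)
nan_case = build_output_attrs(inherited, np.float32(np.nan))
```

With a NaN fill the inherited sentinels are still removed, so the output never advertises a nodata value that the raster does not actually use.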
Summary
Closes #2018.
`rasterize()` was returning a DataArray with empty `attrs` and freshly-built `linspace` coords on every code path, even when the caller passed `like=template`. The result silently broke chained spatial pipelines.

- `slope(rasterize(gdf, like=elevation))` saw no `crs`, no `res`, no `transform`, and recomputed cell size from coords that were not bit-identical to `elevation`. Same class of bug as the #1407 sky_view_factor cellsize issue ("sky_view_factor: horizon angle ignores cell size, wrong by factor of cell_size").
- `rasterize(..., fill=-9999, dtype=np.int32)` produced an integer raster with no `_FillValue` and no `nodatavals` attr, so downstream tools could not identify nodata pixels.

All four backends (numpy, cupy, dask+numpy, dask+cupy) route through the same final `xr.DataArray(...)` constructor, so the fix is at one site.

Changes
- `_extract_grid_from_like` now also returns the template's x/y coords and attrs.
- The `xr.DataArray(...)` constructor reuses `like.coords` directly when the resolved output grid matches `like`, so the output is bit-identical to the template and `xr.align` keeps working.
- `like.attrs` are copied onto the output (defensive `dict(...)` so mutating the output doesn't leak into the template).
- `fill` is emitted as `attrs['_FillValue']` and `attrs['nodatavals'] = (fill,)` when `fill` is not NaN. The default NaN fill leaves attrs empty (no pollution).

Test plan
Adds `TestMetadataPropagation` to `xrspatial/tests/test_rasterize.py` with 9 cases:

- `test_like_propagates_attrs` -- `crs` / `transform` / `res` from `like` land on output
- `test_like_preserves_coords_bit_identical` -- output x/y coords are `np.array_equal` to `like.coords`
- `test_like_attrs_isolated_from_template` -- mutating output attrs does not mutate `like.attrs`
- `test_fill_value_recorded_when_not_nan` -- `fill=-9999` -> `_FillValue == -9999`, `nodatavals == (-9999,)`
- `test_fill_value_omitted_for_nan` -- default fill leaves `_FillValue` and `nodatavals` absent
- `test_no_like_no_attrs_pollution` -- no `like`, NaN fill -> `attrs == {}`
- `test_like_attrs_propagated_dask` -- dask+numpy backend behaves identically
- `test_like_attrs_propagated_cupy` -- cupy backend behaves identically (skipped if no CUDA)
- `test_like_attrs_propagated_dask_cupy` -- dask+cupy backend behaves identically (skipped if no CUDA)
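
To illustrate why the coord test checks `np.array_equal` rather than a tolerance-based comparison, the sketch below perturbs one coordinate by a single unit in the last place. The arrays are made up for the example and only simulate the kind of drift that rebuilding coords could introduce; they are not taken from the PR's tests.

```python
import numpy as np

# Stand-in for the template's x coords (cell centres, 1-unit resolution).
template_x = np.array([0.5, 1.5, 2.5, 3.5])

# Simulate a rebuilt coordinate array that is off by one ULP in one cell.
rebuilt_x = template_x.copy()
rebuilt_x[2] = np.nextafter(rebuilt_x[2], np.inf)

# A tolerance-based check accepts the drift ...
tolerant_match = bool(np.allclose(template_x, rebuilt_x))
# ... while exact comparison, which bit-identical reuse must pass, rejects it.
exact_match = bool(np.array_equal(template_x, rebuilt_x))
# Reusing (or copying) the template coords passes the exact check.
reused_match = bool(np.array_equal(template_x, template_x.copy()))
```

An exact comparison in the test is the stricter choice: it fails as soon as the output coords stop being the template's own values.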