diff --git a/.claude/sweep-test-coverage-state.csv b/.claude/sweep-test-coverage-state.csv index f1abe453d..4dbe605b0 100644 --- a/.claude/sweep-test-coverage-state.csv +++ b/.claude/sweep-test-coverage-state.csv @@ -1,3 +1,4 @@ module,last_inspected,issue,severity_max,categories_found,notes geotiff,2026-05-15,1974,HIGH,1;2;3;4,"Pass 16 (2026-05-15): added test_max_cloud_bytes_dispatcher_silent_drop_2026_05_15.py closing Cat 4 HIGH parameter-coverage gap on the open_geotiff dispatcher's max_cloud_bytes kwarg. The kwarg was added in #1928 (eager fsspec budget) and re-ordered into the canonical reader signature by #1957, but open_geotiff only forwards it to _read_to_array on the eager non-VRT branch (__init__.py:431). The GPU branch at line 410, the dask branch at line 422, and the VRT branch at line 362 never reference the kwarg, so open_geotiff(p, max_cloud_bytes=8, gpu=True) / open_geotiff(p, max_cloud_bytes=8, chunks=N) / open_geotiff(vrt, max_cloud_bytes=8) all silently drop the budget. Same class of dispatcher-silently-drops-backend-kwarg bug fixed by #1561 / #1605 / #1685 / #1810 for other kwargs; the two sibling kwargs on_gpu_failure (line 339) and missing_sources (line 355) already raise ValueError when used on a path where they do not apply. 11 tests: 4 xfail(strict=True) pinning the fix surface (gpu, dask, vrt, dask+gpu), 3 passing pins on the current silent-drop behaviour so the fix is visible as a diff, 4 positive pins that the eager local + file-like paths accept the kwarg (docstring no-op contract). Filed issue #1974 for the dispatcher fix (sweep is test-only). Cat 4 HIGH (silent backend-kwarg drop). Pass 15 (2026-05-15): added test_write_vrt_bool_nodata_1921.py closing Cat 1 HIGH backend-parity gap on bool nodata rejection. 
Issue #1911 added the isinstance(nodata, (bool, np.bool_)) -> TypeError guard at to_geotiff and build_geo_tags, but the sibling writers were left unchecked: write_vrt(nodata=True) silently emits True into the VRT XML (str(True) drops the sentinel because no reader parses 'True' as numeric); write_geotiff_gpu direct call relies on the build_geo_tags defense-in-depth rather than an entry-point check, so a future refactor moving that guard would regress the GPU writer with no test coverage. 17 new tests: 4 xfail (strict=True) pinning the write_vrt fix surface (issue #1921), 1 passing pin on the current buggy str(True) emission so the fix is visible as a diff, 6 numeric/None happy-path tests on write_vrt, 4 GPU writer direct-call bool-reject tests (4 dtypes x 1 call), 1 to_geotiff(gpu=True) dispatcher thread-through. Filed issue #1921 for the write_vrt fix (sweep is test-only). Cat 1 HIGH (write_vrt backend parity bug) + Cat 1 MEDIUM (write_geotiff_gpu defense-in-depth pin). Pass 14 (2026-05-15): added test_dask_streaming_write_degenerate_2026_05_15.py closing Cat 3 HIGH and Cat 2 HIGH/MEDIUM gaps on the dask streaming write path (to_geotiff with dask-backed DataArray, #1084). test_streaming_write.py covered 100x100 with a NaN block plus a 2x2 small raster but had nothing 1-pixel-row, 1-pixel-column, all-NaN, all-Inf, or +/-Inf-mixed. The streaming tile-row segmenter (#1485) on a 1-pixel-tall raster and the streaming nodata-mask coercion on an all-NaN chunk were reachable only with a dask input and had no direct coverage; a regression on either would not surface from the eager numpy path or the write_geotiff_gpu path (pass 5 covered the GPU writer's degenerate shapes). 
16 new tests, all passing: 1x1 chunk-matches-shape + nodata-attr round-trip + uint16, 1xN single chunk + chunks-split-columns + wide-segmented-by-buffer (#1485 streaming_buffer_bytes=1 forces the segmenter), Nx1 single chunk + chunks-split-rows, all-NaN with finite sentinel + all-NaN without sentinel, mixed NaN/+Inf/-Inf preserving Inf bit-exact + sentinel masking NaN only, all-+Inf and all--Inf, predictor=3 (float predictor) round-trip on float32 + float64 plus int-dtype ValueError. predictor=3 streaming coverage extends the small-chunk and int-rejection geometry around test_predictor_fp_write_1313.test_predictor3_streaming_dask (which already covers a 128x192 predictor=3 dask streaming write with a Predictor-tag assertion). Cat 3 HIGH (1x1/1xN/Nx1) + Cat 2 HIGH (all-NaN with sentinel) + Cat 2 MEDIUM (mixed-Inf, all-Inf) + Cat 4 MEDIUM (predictor=3 streaming). Pass 13 (2026-05-13): added test_size_param_validation_gpu_vrt_1776.py closing Cat 4 HIGH parameter-coverage gap on size-arg validation. Issue #1752 added tile_size validation to to_geotiff and chunks validation to read_geotiff_dask, but the matching kwargs on three sibling entry points were left unchecked: write_geotiff_gpu(tile_size=) raised ZeroDivisionError for 0, struct.error for -1, TypeError for 256.0; read_geotiff_gpu(chunks=) and read_vrt(chunks=) raised ZeroDivisionError for 0 and silently accepted negative values. Factored two shared validators (_validate_tile_size_arg, _validate_chunks_arg) and called them up front from each entry point. 34 new tests, all passing on GPU host: tile_size matrix on write_geotiff_gpu (0/-1/256.0/True/False/positive/np.int64), chunks matrix on read_geotiff_gpu and read_vrt (0/-1/(0,N)/(N,-1)/wrong-length/bool/non-int/(N,float)/positive/np.int64), dispatcher thread-through tests (open_geotiff(gpu=True, chunks=0), to_geotiff(gpu=True, tile_size=0)). Pre-existing 13 #1752 tests still pass after refactor. Filed issue #1776. 
Pass 12 (2026-05-12): added test_gpu_writer_overview_mode_and_compression_level_1740.py closing Cat 4 HIGH and Cat 4 MEDIUM parameter-coverage gaps. (1) write_geotiff_gpu(overview_resampling='mode') and the dedicated _block_reduce_2d_gpu mode-fallback branch (_gpu_decode.py:3051-3056) had zero direct tests; six of the seven overview_resampling modes were covered (mean/nearest by test_features, min/max/median by pass 6, cubic by test_signature_parity_1631) but mode was the odd one out -- a regression dropping the mode dispatch from _block_reduce_2d_gpu would fall through to the mean reshape branch and emit wrong overview pixels for integer rasters. (2) write_geotiff_gpu(compression_level=) documented as accepted-but-ignored had no test; the CPU writer rejects out-of-range levels with ValueError, the GPU writer is documented not to -- a regression wiring the GPU writer up to the CPU range validator would silently break every to_geotiff(gpu=True, compression_level=X) caller for in-range levels and noisily for out-of-range. 19 tests, all passing on GPU host: _block_reduce_2d_gpu(method='mode') CPU-parity on 4x4 deterministic + random 8x8 + dtype-preserved across u8/u16/i16/i32, write_geotiff_gpu(cog=True, overview_resampling='mode') end-to-end round trip, to_geotiff(gpu=True, ..., overview_resampling='mode') dispatcher thread-through, GPU-vs-CPU pixel parity on 8x8 input, write_geotiff_gpu(compression_level=) in-range matrix on zstd/deflate, out-of-range matrix (zstd=999/-5, deflate=50/0) accepted without raising + round-trip preserved, to_geotiff(gpu=True, compression_level=999) dispatcher thread-through, companion CPU rejects-OOR pin to lock the asymmetry. Mutation against the mode branch (drop the 'if method == mode' block in _block_reduce_2d_gpu) flipped 9 mode tests red. Filed issue #1740. 
Pass 11 (2026-05-12): added test_gpu_writer_cpu_fallback_codecs_2026_05_12.py closing a Cat 4 HIGH parameter-coverage gap on write_geotiff_gpu compression= modes for the CPU-fallback codecs (lzw, packbits, lz4, lerc, jpeg2000/j2k). Pass 7 (test_gpu_writer_compression_modes_2026_05_11) covered only none/deflate/zstd/jpeg; the remaining five codecs route through dedicated branches in gpu_compress_tiles (_gpu_decode.py:2974-3019) with CPU fallbacks (lerc_compress, jpeg2000_compress, cpu_compress) that had zero direct tests via write_geotiff_gpu. A regression in routing/tag-wiring/fallback dispatch would ship silently because the internal reader uses the same compression-tag table. 17 tests, all passing on GPU host: lzw/packbits/lz4 round-trip + compression-tag pin on uint16, lerc lossless float32 + uint16 round-trip + tag pin, jpeg2000 uint8 single-band + RGB multi-band lossless round-trip + j2k-alias parity + tag pin, GPU-vs-CPU writer pixel parity for lzw/packbits, to_geotiff(gpu=True, compression=lzw/packbits) dispatcher thread-through. Mutation against compression dispatch (swap lzw bytes to zstd; swap lerc bytes to deflate) flipped round-trip tests red. Filed issue #1706. Pass 10 (2026-05-12): added test_kwarg_behaviour_2026_05_12_v2.py closing two Cat 4 HIGH parameter-coverage gaps. (1) write_geotiff_gpu(predictor=True/2/3) had zero direct tests; the GPU writer threads predictor= through normalize_predictor and gpu_compress_tiles into five CUDA encode kernels (_predictor_encode_kernel_u8/u16/u32/u64 for predictor=2, _fp_predictor_encode_kernel for predictor=3) and a regression dropping the encode-kernel calls would ship corrupt files. (2) read_vrt(window=) had no behaviour tests (only a signature pin in test_signature_annotations_1654); the kwarg is documented and _vrt.read_vrt implements full windowed-read semantics (clip, multi-source overlap, src/dst scaling, GeoTransform origin shift on coords + attrs['transform']). 
23 tests, all passing on GPU host: predictor=True/2 round-trips on u8/u16/i32 + 3-band RGB samples_per_pixel stride; predictor=3 lossless round-trip on f32 and f64; predictor=3 int-dtype ValueError (CPU/GPU parity); CPU/GPU pixel-exact parity for pred=2 u16 and pred=3 f32; read_vrt(window=) subregion + full + clamp-overflow + clamp-negative + 2x1 mosaic seam straddle + offset past seam + transform-attr origin shift + y/x coords half-pixel shift + window+band + window+chunks (dask) + window+gpu (cupy) + window+gpu+chunks (dask+cupy). Mutation against the encode dispatch flipped 7 predictor tests red. Filed issue #1690. Pass 9 (2026-05-12): added test_kwarg_behaviour_2026_05_12.py closing three Cat 4 MEDIUM parameter-coverage gaps plus one Cat 4 LOW error path. write_vrt documented kwargs (relative/crs_wkt/nodata) had a smoke-test pinning that the kwargs are accepted but no test verified the override *effect* -- a regression dropping the override branch and silently using the default-from-first-source would ship undetected. read_geotiff_gpu(dtype=) cast had zero direct tests; the eager path has TestDtypeEager and dask has TestDtypeDask but the GPU branch had no equivalent. write_geotiff_gpu(bigtiff=) threads through to _assemble_tiff(force_bigtiff=) but no test asserted the on-disk header byte switches; the CPU writer had it via test_features::test_force_bigtiff_via_public_api. write_vrt(source_files=[]) ValueError was uncovered. 
26 tests, all passing on GPU host: write_vrt relative=True/False XML attribute + path inspection + parse-back round-trip, write_vrt crs_wkt= override distinct-from-default XML check, write_vrt nodata= override + default-from-source coverage, write_vrt([]) ValueError + no-file side effect, read_geotiff_gpu dtype= matrix (float64->float32, float64->float16, uint16->int32, uint16->uint8, float-to-int raise, dtype=None preserves native), open_geotiff(gpu=True, dtype=) dispatcher, read_geotiff_gpu(chunks=, dtype=) dask+GPU branch, write_geotiff_gpu bigtiff=True/False/None header verification, to_geotiff(gpu=True, bigtiff=True) dispatcher thread-through. Pass 8 (2026-05-11): added test_lz4_compression_level_2026_05_11.py closing Cat 4 MEDIUM parameter-coverage gap on compression='lz4' + compression_level=. _LEVEL_RANGES advertises lz4: (0, 16) but only deflate (1, 9) and zstd (1, 22) had direct level boundary + round-trip + reject tests. The range check is the gatekeeper -- lz4_compress silently accepts any int level -- so a regression dropping 'lz4' from _LEVEL_RANGES would ship undetected. 18 tests, all passing: round-trip at levels 0/1/9/16 (lossless), default-level no-arg path, higher-level-not-larger smoke check on compressible input, out-of-range reject at -1/-10/17/100 on eager path, valid-range message format pin (lz4 valid: 0-16), dask streaming round-trip at 0/1/8/16, dask streaming out-of-range reject at -1/17/50 (separate _LEVEL_RANGES call site). Pass 7 (2026-05-11): added test_gpu_writer_compression_modes_2026_05_11.py closing Cat 4 HIGH gap on write_geotiff_gpu compression= modes. The writer documents zstd (default, fastest GPU), deflate, jpeg, and none, but only deflate + none had round-trip tests; the default zstd and the jpeg (nvJPEG/Pillow) paths shipped without targeted coverage. 
11 new tests, all passing on GPU host: zstd round-trip + default-codec pinning, jpeg round-trip on 3-band RGB uint8 + 1-band greyscale, TIFF compression-tag header check across none/deflate/zstd/jpeg, plain deflate + none round-trips outside the COG/sentinel paths, and a cross-codec lossless parity check (zstd/deflate/none agree pixel-exact). nvJPEG path was exercised live, not just the Pillow fallback. Pass 6 (2026-05-11): added test_overview_resampling_min_max_median_2026_05_11.py covering Cat 4 HIGH parameter-coverage gap on overview_resampling=min/max/median. CPU end-to-end paths were already covered by test_cog_overview_nodata_1613::test_cpu_cog_overview_aggregations_ignore_sentinel; the GPU end-to-end paths and the direct CPU+GPU block-reducer branches had no targeted tests, so a regression on those code paths would ship undetected. 26 tests, all passing on GPU host: block-reducer unit tests (finite + partial-NaN), end-to-end COG writes for both to_geotiff and write_geotiff_gpu, CPU/GPU parity for to_geotiff(gpu=True), CPU nodata-sentinel regression check, and ValueError error-path tests for unknown method names on both backends. Pass 5 (2026-05-11): added test_degenerate_shapes_backends_2026_05_11.py covering Cat 3 HIGH geometric gaps (1x1 / 1xN / Nx1 reads on dask+numpy, GPU, dask+cupy backends; 1x1 / 1xN / Nx1 writes through write_geotiff_gpu) and Cat 2 MEDIUM NaN/Inf gaps (all-NaN read on GPU + dask+cupy, Inf / -Inf reads on all non-eager backends, NaN sentinel mask on dask read path including sentinel block split across chunk boundary). 23 tests, all passing on GPU host. Prior passes still hold: pass 4 (r4) closed read_geotiff_gpu/dask name= + max_pixels= kwargs (Cat 4), pass 3 (r3) closed read_vrt GPU/dask+GPU backend dispatch (Cat 1) and dtype/name kwargs (Cat 4)." +rasterize,2026-05-17,,HIGH,1;2;3;4,"Pass 1 (2026-05-17): added test_rasterize_coverage_2026_05_17.py with 34 tests, all passing on a CUDA host. 
Closes four documented public-API gaps left after the pass-0 audit. (1) Cat 3 HIGH 1x1 single-pixel raster -- test_rasterize.py covers 1xN strips and Nx1 strips but never width=1 AND height=1, so the polygon scanline / line Bresenham / point burn kernels all ship without the single-cell degenerate case; the new TestSinglePixelRaster class pins polygon/point/line on eager numpy plus polygon parity across cupy / dask+numpy / dask+cupy. (2) Cat 4 HIGH like= template-raster parameter is documented at rasterize.py:2038 and implemented by _extract_grid_from_like (line 1930) but no test exercises it; TestLikeParameter pins dtype/bounds/coords inheritance, the three override branches (dtype, bounds, width/height), the three validation branches (not-DataArray, 3D, wrong dim names) and like= on all four backends. Mutation against the like-dtype branch (rasterize.py:2183-2184) flipped the inheritance test red. (3) Cat 4 HIGH resolution= happy path -- only the oversize-rejection error path was tested (line 304); TestResolutionParameter pins the scalar branch, the tuple branch, the ceil-and-clamp-to-1 semantics, and resolution= on all four backends. (4) Cat 4 HIGH non-empty GeometryCollection unpacking is documented at rasterize.py:1995 and implemented by _classify_geometries_loop (line 228) but only the empty-GC case was tested (line 269); TestGeometryCollection pins polygon+point and polygon+line+point collections on eager numpy plus parity across cupy / dask+numpy / dask+cupy so the loop classifier's polygon/line/point sub-bucketing has direct coverage. Cat 1 MEDIUM gap closed: eager cupy all_touched=True parity vs eager numpy (TestEagerCupyAllTouched) -- the existing test only covered dask+cupy all_touched, leaving the direct GPU all_touched kernel untested. Cat 2 MEDIUM gap closed: int32 dtype with default NaN fill silently casts to the int32-min sentinel (TestIntegerDtypeNanFill) -- pin the cast so any future ValueError-raises switch is visible as a code-review diff. 
Pre-existing 143 passing + 2 skipped tests in test_rasterize.py untouched." reproject,2026-05-10,,HIGH,1;4;5,"Added 39 tests: LiteCRS direct coverage, itrf_transform behaviour/roundtrip/array, itrf_frames, geoid_height numerical correctness + raster happy-path, vertical helpers (ellipsoidal<->orthometric/depth), reproject() lat/lon and latitude/longitude dim propagation. Note: _merge_arrays_cupy is imported but unused (no cupy merge dispatch in merge()); flagged as feature gap not test gap." diff --git a/xrspatial/tests/test_rasterize_coverage_2026_05_17.py b/xrspatial/tests/test_rasterize_coverage_2026_05_17.py new file mode 100644 index 000000000..96c2c83d4 --- /dev/null +++ b/xrspatial/tests/test_rasterize_coverage_2026_05_17.py @@ -0,0 +1,603 @@ +"""Coverage-gap tests for xrspatial.rasterize (deep-sweep test-coverage, pass 1). + +Closes documented but untested public-API surface flagged by the +test-coverage sweep on 2026-05-17: + +- Cat 3 HIGH -- 1x1 single-pixel raster across numpy / cupy / dask+numpy / + dask+cupy (test_rasterize.py covers 1xN strips and Nx1 strips but never + the single-pixel degenerate case). +- Cat 4 HIGH -- ``like=`` template-raster parameter forwards through + ``_extract_grid_from_like`` into width/height/bounds/dtype resolution, + but no test in test_rasterize.py exercises it. The dtype-inheritance + branch and the bounds-from-like branch ship without coverage on any + backend. +- Cat 4 HIGH -- ``resolution=`` parameter happy-path: only the + oversize-rejection error path is tested; the scalar / tuple branches + and the ceil-and-clamp-to-1 logic have no positive coverage on any + backend. +- Cat 4 HIGH -- Non-empty ``GeometryCollection`` unpacking is + implemented by ``_classify_geometries_loop`` but only the empty-GC + case is tested. All four backends route through this path. 
+- Cat 1 MEDIUM -- eager cupy ``all_touched=True`` is covered only on the + dask+cupy path; the eager cupy branch invokes a different kernel and + had no direct test. +- Cat 2 MEDIUM -- integer dtype with the default ``fill=nan`` is + unpinned behaviour: ``np.full(..., np.nan).astype(int)`` silently + casts to the platform-specific int-min sentinel. Pin the observed + cast (numpy backend) so a future refactor that switches to an explicit + raise surfaces in CI. + +The "fix" in this sweep is *adding tests*. No source changes. CUDA is +available on this host so cupy / dask+cupy tests execute live. +""" +from __future__ import annotations + +import numpy as np +import pytest +import xarray as xr + +try: + from shapely.geometry import ( + box, GeometryCollection, LineString, Point, + ) + has_shapely = True +except ImportError: + has_shapely = False + +if has_shapely: + from xrspatial.rasterize import rasterize + +pytestmark = pytest.mark.skipif( + not has_shapely, reason="shapely not installed" +) + +try: + import cupy + has_cupy = True +except ImportError: + has_cupy = False + +try: + import dask # noqa: F401 (availability probe only) + has_dask = True +except ImportError: + has_dask = False + +try: + from numba import cuda + has_cuda = has_cupy and cuda.is_available() +except Exception: + has_cuda = False + +skip_no_cuda = pytest.mark.skipif( + not has_cuda, reason="CUDA / CuPy not available") +skip_no_dask = pytest.mark.skipif( + not has_dask, reason="dask not installed") + + +def _as_numpy(result): + """Materialise any backend's DataArray data to a numpy array.""" + data = result.data + if hasattr(data, 'compute'): + data = data.compute() + if has_cupy and isinstance(data, cupy.ndarray): + return cupy.asnumpy(data) + return np.asarray(data) + + +# --------------------------------------------------------------------------- +# Cat 3 HIGH -- 1x1 single-pixel raster +# --------------------------------------------------------------------------- + +class 
TestSinglePixelRaster: + """rasterize() on a 1x1 grid: the smallest legal degenerate shape. + + test_single_row_raster / test_single_column_raster cover the 1xN and + Nx1 cases but the 1x1 case is its own degeneracy: the bounds-to-pixel + transform collapses to a single (xmin, ymin)..(xmax, ymax) cell and + every kernel (polygon scanline, line Bresenham, point burn) has to + handle a 1-element output array. A regression that mis-handled the + height==1 or width==1 branch would already be caught, but the + height==1 AND width==1 case has no test today. + """ + + def test_polygon_eager_numpy(self): + """A polygon covering the bounds burns the single pixel.""" + r = rasterize([(box(0, 0, 5, 5), 7.0)], + width=1, height=1, bounds=(0, 0, 5, 5)) + assert r.shape == (1, 1) + assert r.values[0, 0] == 7.0 + # Coords are the cell centre. + assert r.coords['x'].values[0] == pytest.approx(2.5) + assert r.coords['y'].values[0] == pytest.approx(2.5) + + def test_polygon_eager_numpy_fill_when_outside(self): + """Polygon outside the single pixel leaves the fill value.""" + r = rasterize([(box(10, 10, 20, 20), 7.0)], + width=1, height=1, bounds=(0, 0, 5, 5), fill=-1.0) + assert r.shape == (1, 1) + assert r.values[0, 0] == -1.0 + + def test_point_eager_numpy(self): + """A point inside the single pixel burns it.""" + r = rasterize([(Point(2.5, 2.5), 9.0)], + width=1, height=1, bounds=(0, 0, 5, 5), fill=0) + assert r.values[0, 0] == 9.0 + + def test_line_eager_numpy(self): + """A line crossing the single pixel burns it.""" + r = rasterize([(LineString([(0.0, 2.5), (5.0, 2.5)]), 3.0)], + width=1, height=1, bounds=(0, 0, 5, 5), fill=0) + assert r.values[0, 0] == 3.0 + + @skip_no_cuda + def test_polygon_eager_cupy_matches_numpy(self): + """1x1 raster on cupy matches numpy.""" + np_r = rasterize([(box(0, 0, 5, 5), 7.0)], + width=1, height=1, bounds=(0, 0, 5, 5)) + cp_r = rasterize([(box(0, 0, 5, 5), 7.0)], + width=1, height=1, bounds=(0, 0, 5, 5), + use_cuda=True) + assert cp_r.shape == 
(1, 1) + # Pin the absolute value too: a co-regression in eager numpy and + # eager cupy (both writing fill instead of the burn value) would + # otherwise slip past a pure parity check. + assert _as_numpy(cp_r)[0, 0] == 7.0 + np.testing.assert_array_equal(np_r.values, _as_numpy(cp_r)) + + @skip_no_dask + def test_polygon_dask_numpy_matches_numpy(self): + """1x1 raster on dask+numpy matches numpy.""" + np_r = rasterize([(box(0, 0, 5, 5), 7.0)], + width=1, height=1, bounds=(0, 0, 5, 5)) + dk_r = rasterize([(box(0, 0, 5, 5), 7.0)], + width=1, height=1, bounds=(0, 0, 5, 5), + chunks=(1, 1)) + assert dk_r.shape == (1, 1) + assert _as_numpy(dk_r)[0, 0] == 7.0 + # Dask single-chunk pipeline must produce the same value. + np.testing.assert_array_equal(np_r.values, _as_numpy(dk_r)) + + @skip_no_cuda + @skip_no_dask + def test_polygon_dask_cupy_matches_numpy(self): + """1x1 raster on dask+cupy matches numpy.""" + np_r = rasterize([(box(0, 0, 5, 5), 7.0)], + width=1, height=1, bounds=(0, 0, 5, 5)) + dkcp_r = rasterize([(box(0, 0, 5, 5), 7.0)], + width=1, height=1, bounds=(0, 0, 5, 5), + chunks=(1, 1), use_cuda=True) + assert dkcp_r.shape == (1, 1) + assert _as_numpy(dkcp_r)[0, 0] == 7.0 + np.testing.assert_array_equal(np_r.values, _as_numpy(dkcp_r)) + + +# --------------------------------------------------------------------------- +# Cat 4 HIGH -- ``like=`` template-raster parameter +# --------------------------------------------------------------------------- + +class TestLikeParameter: + """``like=`` inherits width/height/bounds/dtype from a template. + + The public docstring at rasterize.py:2038 promises a "Template raster. + Width, height, bounds, and dtype are copied from this array (any can + still be overridden explicitly)". No test in test_rasterize.py + invokes the function with ``like=``, so each of the four inheritance + branches and the three validation branches in + ``_extract_grid_from_like`` ship without direct coverage. 
+ """ + + @staticmethod + def _template(height=4, width=6, dtype=np.float32): + """A small north-up template with float32 dtype and explicit coords.""" + # y descends top-to-bottom (north-up convention used elsewhere). + return xr.DataArray( + np.zeros((height, width), dtype=dtype), + dims=['y', 'x'], + coords={ + 'y': np.linspace(height - 0.5, 0.5, height), + 'x': np.linspace(0.5, width - 0.5, width), + }, + ) + + def test_like_inherits_width_height_dtype(self): + """Output shape and dtype match the template.""" + template = self._template(height=4, width=6, dtype=np.float32) + r = rasterize([(box(0, 0, 6, 4), 9.0)], like=template, fill=0) + assert r.shape == (4, 6) + assert r.dtype == np.float32 + + def test_like_inherits_bounds_from_coords(self): + """Bounds are reconstructed from the template's coordinate centres.""" + template = self._template(height=4, width=6, dtype=np.float32) + r = rasterize([(box(0, 0, 6, 4), 9.0)], like=template, fill=0) + # Cell centres should match the template exactly (so the half-pixel + # offsets that _extract_grid_from_like applies are consistent). + np.testing.assert_allclose( + r.coords['y'].values, template.coords['y'].values) + np.testing.assert_allclose( + r.coords['x'].values, template.coords['x'].values) + + def test_like_dtype_override(self): + """Explicit ``dtype=`` wins over the template dtype.""" + template = self._template(dtype=np.float32) + r = rasterize([(box(0, 0, 6, 4), 9.0)], + like=template, dtype=np.float64, fill=0) + assert r.dtype == np.float64 + + def test_like_bounds_override(self): + """Explicit ``bounds=`` wins over the template bounds (width/height + from template are still honoured).""" + template = self._template(height=4, width=6, dtype=np.float32) + r = rasterize([(box(0, 0, 2, 2), 1.0)], + like=template, bounds=(0, 0, 2, 2), fill=0) + # Shape stays from template, but the coords are recomputed off the + # overridden bounds so the pixel size shrinks. 
+ assert r.shape == (4, 6) + # width=6 over x in [0, 2] -> px=1/3, centres at [1, 3, 5, 7, 9, 11]/6. + expected_x = np.array([1, 3, 5, 7, 9, 11]) / 6. + np.testing.assert_allclose(r.coords['x'].values, expected_x) + # height=4 over y in [0, 2] -> py=0.5, centres descend 1.75 -> 0.25. + expected_y = np.array([1.75, 1.25, 0.75, 0.25]) + np.testing.assert_allclose(r.coords['y'].values, expected_y) + + def test_like_width_height_override(self): + """Explicit ``width``/``height`` win over the template shape.""" + template = self._template(height=4, width=6, dtype=np.float32) + r = rasterize([(box(0, 0, 6, 4), 1.0)], + like=template, width=3, height=2, fill=0) + assert r.shape == (2, 3) + # Dtype still inherited. + assert r.dtype == np.float32 + + @skip_no_cuda + def test_like_with_use_cuda(self): + """``like=`` works on the cupy backend (dtype + shape inherited).""" + template = self._template(dtype=np.float32) + r = rasterize([(box(0, 0, 6, 4), 9.0)], + like=template, fill=0, use_cuda=True) + assert r.shape == template.shape + assert r.dtype == np.float32 + assert isinstance(r.data, cupy.ndarray) + + @skip_no_dask + def test_like_with_chunks(self): + """``like=`` works on the dask+numpy backend.""" + template = self._template(dtype=np.float32) + r = rasterize([(box(0, 0, 6, 4), 9.0)], + like=template, fill=0, chunks=(2, 3)) + assert r.shape == template.shape + assert r.dtype == np.float32 + # Dask-backed. + assert hasattr(r.data, 'dask') + + @skip_no_cuda + @skip_no_dask + def test_like_with_dask_cupy(self): + """``like=`` works on the dask+cupy backend.""" + template = self._template(dtype=np.float32) + r = rasterize([(box(0, 0, 6, 4), 9.0)], + like=template, fill=0, chunks=(2, 3), + use_cuda=True) + assert r.shape == template.shape + assert r.dtype == np.float32 + + def test_like_rejects_non_dataarray(self): + """Passing a numpy array as ``like`` raises ``TypeError``. + + Targets the ``isinstance(like, xr.DataArray)`` guard in + ``_extract_grid_from_like``. 
+ """ + with pytest.raises(TypeError, match="must be an xr.DataArray"): + rasterize([(box(0, 0, 5, 5), 1.0)], + like=np.zeros((3, 3))) + + def test_like_rejects_3d(self): + """A 3D DataArray is rejected by the 2D shape guard. + + Note: this and ``test_like_rejects_wrong_dim_names`` both target + the same compound ``ndim != 2 or 'y' not in dims or 'x' not in + dims`` branch. The two tests are kept distinct to document both + sub-conditions; either would suffice for line coverage. + """ + bad = xr.DataArray(np.zeros((2, 3, 3)), dims=['b', 'y', 'x']) + with pytest.raises(ValueError, match="must be 2D"): + rasterize([(box(0, 0, 5, 5), 1.0)], like=bad) + + def test_like_rejects_wrong_dim_names(self): + """A 2D DataArray without 'y' and 'x' dims is rejected. + + Companion to ``test_like_rejects_3d``; targets the dim-name + sub-condition of the same compound guard. + """ + bad = xr.DataArray(np.zeros((3, 3)), dims=['lat', 'lon']) + with pytest.raises(ValueError, match="'y' and 'x'"): + rasterize([(box(0, 0, 5, 5), 1.0)], like=bad) + + +# --------------------------------------------------------------------------- +# Cat 4 HIGH -- ``resolution=`` parameter happy path +# --------------------------------------------------------------------------- + +class TestResolutionParameter: + """``resolution=`` resolves to width/height via ceil(extent / res). + + Only the oversize-rejection error path (test_oversize_resolution_rejected) + is tested in test_rasterize.py. The scalar and tuple branches in + rasterize.py:2158-2164 and the ``max(..., 1)`` clamp have no positive + coverage, on any backend. + """ + + def test_scalar_resolution_eager(self): + """A single float resolution applies to both axes.""" + r = rasterize([(box(0, 0, 4, 4), 1.0)], + resolution=1.0, bounds=(0, 0, 4, 4), fill=0) + assert r.shape == (4, 4) + # Pixel covers (0..1)..(3..4); polygon fills all 16. 
+ assert int((r.values == 1.0).sum()) == 16 + + def test_tuple_resolution_asymmetric(self): + """A tuple resolution can give different x and y pixel counts.""" + r = rasterize([(box(0, 0, 10, 8), 1.0)], + resolution=(2.0, 4.0), bounds=(0, 0, 10, 8), fill=0) + # width = ceil(10 / 2) = 5 + # height = ceil( 8 / 4) = 2 + assert r.shape == (2, 5) + + def test_resolution_ceils_partial_extent(self): + """Non-integer division ceils up to a full pixel.""" + r = rasterize([(box(0, 0, 3, 3), 1.0)], + resolution=1.5, bounds=(0, 0, 3.5, 3.5), fill=0) + # ceil(3.5 / 1.5) = ceil(2.333) = 3 + assert r.shape == (3, 3) + + def test_resolution_clamps_to_at_least_one_pixel(self): + """A resolution larger than the extent clamps to a 1x1 output + rather than 0x0.""" + # extent 0.5 / resolution 10.0 = 0.05 -> ceil = 1 -> max(1, 1) = 1. + r = rasterize([(box(0, 0, 1, 1), 5.0)], + resolution=10.0, bounds=(0, 0, 0.5, 0.5), fill=0) + assert r.shape == (1, 1) + + @skip_no_cuda + def test_scalar_resolution_cupy_matches_numpy(self): + """resolution= on the cupy backend gives the same shape and values.""" + np_r = rasterize([(box(0, 0, 5, 5), 1.0)], + resolution=1.0, bounds=(0, 0, 5, 5), fill=0) + cp_r = rasterize([(box(0, 0, 5, 5), 1.0)], + resolution=1.0, bounds=(0, 0, 5, 5), fill=0, + use_cuda=True) + assert cp_r.shape == (5, 5) + # Positive pin: polygon covers the full 5x5 grid. 
+ assert int((_as_numpy(cp_r) == 1.0).sum()) == 25 + np.testing.assert_array_equal(np_r.values, _as_numpy(cp_r)) + + @skip_no_dask + def test_scalar_resolution_dask_matches_numpy(self): + """resolution= on the dask+numpy backend gives matching output.""" + np_r = rasterize([(box(0, 0, 5, 5), 1.0)], + resolution=1.0, bounds=(0, 0, 5, 5), fill=0) + dk_r = rasterize([(box(0, 0, 5, 5), 1.0)], + resolution=1.0, bounds=(0, 0, 5, 5), fill=0, + chunks=(2, 2)) + assert dk_r.shape == (5, 5) + assert int((_as_numpy(dk_r) == 1.0).sum()) == 25 + np.testing.assert_array_equal(np_r.values, _as_numpy(dk_r)) + + @skip_no_cuda + @skip_no_dask + def test_scalar_resolution_dask_cupy_matches_numpy(self): + """resolution= on the dask+cupy backend gives matching output.""" + np_r = rasterize([(box(0, 0, 5, 5), 1.0)], + resolution=1.0, bounds=(0, 0, 5, 5), fill=0) + dkcp_r = rasterize([(box(0, 0, 5, 5), 1.0)], + resolution=1.0, bounds=(0, 0, 5, 5), fill=0, + chunks=(2, 2), use_cuda=True) + assert dkcp_r.shape == (5, 5) + assert int((_as_numpy(dkcp_r) == 1.0).sum()) == 25 + np.testing.assert_array_equal(np_r.values, _as_numpy(dkcp_r)) + + +# --------------------------------------------------------------------------- +# Cat 4 HIGH -- Non-empty GeometryCollection unpacking +# --------------------------------------------------------------------------- + +class TestGeometryCollection: + """Non-empty GeometryCollections should be recursively unpacked. + + rasterize.py:1995 documents: "GeometryCollection -- recursively + unpacked". The fast-path classifier (``_classify_geometries_vectorized``) + falls through to ``_classify_geometries_loop`` whenever any element is + a GeometryCollection (line 199), so this path has its own polygon / + line / point sub-bucketing logic that test_rasterize.py only + exercises with empty collections (test_unsupported_geom_type_skipped + at line 269). A regression in the loop classifier (dropping a + geometry type, mis-counting indices) would ship undetected. 
+ """ + + @staticmethod + def _mixed_collection(): + """Polygon + Point inside a single GeometryCollection.""" + return GeometryCollection([box(0, 0, 5, 5), Point(7.5, 7.5)]) + + def test_polygon_and_point_in_collection_eager(self): + """Both the polygon and the point inside the GC are burned.""" + gc = self._mixed_collection() + r = rasterize([(gc, 1.0)], width=10, height=10, + bounds=(0, 0, 10, 10), fill=0) + vals = r.values + # Polygon covers rows 5..9, cols 0..4 -> 25 pixels. + # Point at (7.5, 7.5) -> one additional pixel. + assert int((vals == 1.0).sum()) == 26 + # The point pixel is in the upper-right quadrant (y descends). + # Row 2 (y in [7, 8]), col 7 (x in [7, 8]). + assert vals[2, 7] == 1.0 + # The polygon pixel sample at (2.5, 2.5) -> row 7 col 2. + assert vals[7, 2] == 1.0 + + def test_polygon_line_point_in_collection(self): + """All three primitive types inside a single GC are rasterized. + + Uses a 45-degree diagonal line so the Bresenham branch actually + steps in both axes (a horizontal/vertical line is the trivial + degenerate case). + """ + gc = GeometryCollection([ + box(0, 0, 4, 4), + # Diagonal from (col=5, row=4) to (col=9, row=0) inclusive: + # Bresenham steps (5,4) (6,3) (7,2) (8,1) (9,0). + LineString([(5.5, 5.5), (9.5, 9.5)]), + Point(7.5, 8.5), + ]) + r = rasterize([(gc, 1.0)], width=10, height=10, + bounds=(0, 0, 10, 10), fill=0) + vals = r.values + # Polygon: 16 cells (4x4). Line: 5 cells along the diagonal. + # Point: 1. No overlaps. + assert int((vals == 1.0).sum()) == 16 + 5 + 1 + # Specific spot checks. + assert vals[8, 2] == 1.0 # polygon interior + # Mid-line cell: row=3, col=6 -- only the diagonal Bresenham + # branch can light this exact cell. 
+ assert vals[3, 6] == 1.0 + assert vals[1, 7] == 1.0 # point cell + + @skip_no_cuda + def test_collection_eager_cupy_matches_numpy(self): + """GeometryCollection unpacking is identical on cupy.""" + gc = self._mixed_collection() + np_r = rasterize([(gc, 1.0)], width=10, height=10, + bounds=(0, 0, 10, 10), fill=0) + cp_r = rasterize([(gc, 1.0)], width=10, height=10, + bounds=(0, 0, 10, 10), fill=0, use_cuda=True) + # 25 polygon cells + 1 point cell = 26 (eager case pins this too). + assert int((_as_numpy(cp_r) == 1.0).sum()) == 26 + np.testing.assert_array_equal(np_r.values, _as_numpy(cp_r)) + + @skip_no_dask + def test_collection_dask_numpy_matches_numpy(self): + """GeometryCollection unpacking is identical on dask+numpy.""" + gc = self._mixed_collection() + np_r = rasterize([(gc, 1.0)], width=10, height=10, + bounds=(0, 0, 10, 10), fill=0) + dk_r = rasterize([(gc, 1.0)], width=10, height=10, + bounds=(0, 0, 10, 10), fill=0, chunks=(5, 5)) + assert int((_as_numpy(dk_r) == 1.0).sum()) == 26 + np.testing.assert_array_equal(np_r.values, _as_numpy(dk_r)) + + @skip_no_cuda + @skip_no_dask + def test_collection_dask_cupy_matches_numpy(self): + """GeometryCollection unpacking is identical on dask+cupy.""" + gc = self._mixed_collection() + np_r = rasterize([(gc, 1.0)], width=10, height=10, + bounds=(0, 0, 10, 10), fill=0) + dkcp_r = rasterize([(gc, 1.0)], width=10, height=10, + bounds=(0, 0, 10, 10), fill=0, + chunks=(5, 5), use_cuda=True) + assert int((_as_numpy(dkcp_r) == 1.0).sum()) == 26 + np.testing.assert_array_equal(np_r.values, _as_numpy(dkcp_r)) + + +# --------------------------------------------------------------------------- +# Cat 1 MEDIUM -- eager cupy ``all_touched=True`` +# --------------------------------------------------------------------------- + +class TestEagerCupyAllTouched: + """``all_touched=True`` switches polygons to a different inclusion rule. 
+ + test_rasterize.py covers all_touched on the eager numpy backend + (test_all_touched_fills_more_pixels at line 351) and the dask+cupy + backend (test_all_touched_parity at line 1369), but skips the eager + cupy path which invokes the GPU all_touched kernel directly. This + test pins eager-cupy/eager-numpy parity for that mode. + """ + + @skip_no_cuda + def test_eager_cupy_all_touched_matches_numpy(self): + # A tiny 0.2x0.2 polygon centred on the shared pixel corner at + # (2, 2) of a 5x5 grid: with all_touched=False the centre-test misses every + # cell, with all_touched=True the kernel picks up the four cells + # whose corners the polygon overlaps. Eager cupy must match + # eager numpy in all_touched mode. + geom = box(1.9, 1.9, 2.1, 2.1) + np_r = rasterize([(geom, 1.0)], width=5, height=5, + bounds=(0, 0, 5, 5), fill=0, + all_touched=True) + cp_r = rasterize([(geom, 1.0)], width=5, height=5, + bounds=(0, 0, 5, 5), fill=0, + all_touched=True, use_cuda=True) + np.testing.assert_array_equal(np_r.values, _as_numpy(cp_r)) + # Sanity: the touched mode lights the four corner cells. + assert int((np_r.values == 1.0).sum()) == 4 + + @skip_no_cuda + def test_eager_cupy_all_touched_superset_of_default(self): + """all_touched=True burns >= the cells that all_touched=False burns.""" + # Small fractional polygon -- default centre-test fills zero + # cells, all_touched fills the four cells whose corners the + # polygon overlaps. + geom = box(1.9, 1.9, 2.1, 2.1) + cp_default = rasterize([(geom, 1.0)], width=5, height=5, + bounds=(0, 0, 5, 5), fill=0, + use_cuda=True) + cp_touched = rasterize([(geom, 1.0)], width=5, height=5, + bounds=(0, 0, 5, 5), fill=0, + all_touched=True, use_cuda=True) + default_mask = (_as_numpy(cp_default) == 1.0) + touched_mask = (_as_numpy(cp_touched) == 1.0) + # all_touched must fill everywhere the default mode filled. + assert np.all(touched_mask[default_mask]) + # And strictly more, given the centre-miss polygon. 
+ assert touched_mask.sum() > default_mask.sum() + + +# --------------------------------------------------------------------------- +# Cat 2 MEDIUM -- integer dtype with the default NaN fill +# --------------------------------------------------------------------------- + +class TestIntegerDtypeNanFill: + """Pin the observed behaviour when ``dtype`` is integer but ``fill`` + defaults to ``np.nan``. + + Scope: numpy backend only. ``np.full((H, W), np.nan).astype(np.int32)`` + silently casts NaN to a platform-dependent sentinel: x86 yields + ``INT32_MIN`` while Apple Silicon yields ``0``. The result of this cast is + undefined in C and unspecified by numpy, so the test pins "rasterize emits the + same cast numpy emits" rather than a specific number. The cupy and + dask+cupy backends allocate their own backing arrays and the + CUDA-side NaN-to-int cast may differ from numpy's by CUDA version; + a cross-backend parametrization is deferred to a follow-up sweep + that can investigate per-backend cast semantics. This is + undocumented but must remain stable on the numpy backend: a future + refactor that switched to raising + ``ValueError("integer dtype requires explicit fill")`` would break + every caller that currently passes ``dtype=np.int32`` without + overriding ``fill``. Pin the cast so the choice is visible as a + code-review diff. + """ + + def test_int32_dtype_with_default_nan_fill_pins_sentinel(self): + """NaN fill on int32 dtype takes numpy's platform NaN-cast.""" + r = rasterize([(box(0, 0, 3, 3), 7.0)], + width=5, height=5, bounds=(0, 0, 5, 5), + dtype=np.int32) + assert r.dtype == np.int32 + # Derive the sentinel from numpy itself: whatever the platform + # produces when casting NaN to int32 is what rasterize must + # produce too. x86 -> INT32_MIN, Apple Silicon -> 0. + with np.errstate(invalid="ignore"): + sentinel = np.array([np.nan], dtype=np.float64).astype(np.int32)[0] + # Lower-left quadrant covered by polygon. 
+ assert r.values[4, 0] == 7 + # Outside the polygon (top-right corner) takes the platform NaN-cast. + assert r.values[0, 4] == sentinel + + def test_int32_dtype_with_explicit_int_fill(self): + """Explicit int fill is honoured exactly (no NaN cast surprise).""" + r = rasterize([(box(0, 0, 3, 3), 7.0)], + width=5, height=5, bounds=(0, 0, 5, 5), + fill=-1, dtype=np.int32) + assert r.dtype == np.int32 + assert r.values[4, 0] == 7 + assert r.values[0, 4] == -1