polygonize: cupy backend honours float tolerance from numba path (#2151) by brendancol · Pull Request #2153 · xarray-contrib/xarray-spatial

brendancol · 2026-05-19T19:04:37Z

Summary

The cupy polygonize backend grouped float pixels into regions by exact equality (data == v), while the numba CPU path used by the numpy / dask / dask+cupy backends compared neighbouring pixels with _is_close (atol=1e-8, rtol=1e-5).
On float rasters with near-equal values the cupy backend produced more polygons than the other three backends, e.g. values 1.0 and 1.000001 were split into different regions on cupy but merged on numpy.
Fix: in _calculate_regions_cupy, for float dtypes greedily chain sorted unique values using the same tolerance and label each chained group as one region. Integer dtypes keep the existing exact-equality path.

Test plan

pytest xrspatial/tests/test_polygonize.py (all 115 tests pass, 13 skipped)
Added regression test_polygonize_cupy_float_tolerance_matches_numpy_2151 covering the divergence reported in polygonize: cupy backend uses exact equality where numpy uses tolerance #2151
Manually re-ran the issue reproducer; both backends now return 3 polygons with values [1.0, 2.0, 3.0]

Discovered by the /sweep-accuracy audit (Cat 5 - Backend Inconsistency).

…#2151) The cupy backend binned float pixels into regions by exact value while the numba CPU path (used by the numpy, dask and dask+cupy backends) compares neighbouring pixels with `_is_close` (atol=1e-8, rtol=1e-5). On float rasters with near-equal values this produced different polygon sets across backends. For float dtypes, group sorted unique values in the cupy backend using the same tolerance and label each group as a single region. Integer dtypes keep the existing exact-equality path. Adds a regression test asserting that the cupy backend yields the same polygon count and per-value areas as the numpy backend on a float raster with values 1.0 and 1.000001. Also records this finding in the accuracy sweep state CSV.

…parity The first pass at #2151 added value-only greedy chaining of unique float values in _calculate_regions_cupy. That fixed the reported pattern (adjacent near-equal pixels) but still diverged from the CPU path when three or more transitively-close values were present in the raster without intermediate adjacent pixels. Example: [1.0, 1.000018] [1.000009, 3.0 ] CPU spatial CCL yields three regions: 1.0 + 1.000009 (close, adjacent), 1.000018 alone (not close to either neighbour), and 3.0 alone. The value-grouping heuristic chained 1.0 -> 1.000009 -> 1.000018 and labeled the connected mask as a single region, returning two regions on cupy. Route float dtypes through _polygonize_numpy in _polygonize_cupy so the numba _is_close predicate is applied to spatially adjacent pixels, the only place CPU and GPU semantics can agree without re-implementing CCL. The data is already transferred to host for boundary tracing, so the fall-back cost is one extra _calculate_regions call. Drop _group_float_values_by_tolerance and the float branch of _calculate_regions_cupy as dead code. Extend the regression test to parametrize over the original adjacent pattern and the transitivity edge case.

brendancol · 2026-05-20T14:08:49Z

PR Review: polygonize: cupy backend honours float tolerance from numba path (#2151)

I ran this review on commit 2e9b364 and then pushed a follow-up commit (7bcecfb) that I think is needed for the stated parity goal. Details below.

Blocker (fixed in follow-up)

xrspatial/polygonize.py:591-614,650-693. value-grouping heuristic still diverges from CPU CCL on transitively-close values. The greedy chain merges 1.0, 1.000009, 1.000018 into a single tolerance group because each consecutive pair is within tolerance, but the numba CPU path runs spatial CCL with a pairwise _is_close check. On
```
[1.0,        1.000018]
[1.000009,   3.0     ]
```
CPU returns 3 regions (1.0+1.000009 adjacent and close, 1.000018 not close to either neighbour, 3.0 alone). The new value-grouping path returns 2 regions, since cp.isin(data, [1.0, 1.000009, 1.000018]) makes the top-row pair land in the same connected component on cupy.

The push at 7bcecfb routes float dtypes through _polygonize_numpy inside _polygonize_cupy. Data is already transferred to host for boundary tracing in this hybrid backend, so the cost is one extra _calculate_regions call on CPU rather than a per-value GPU label pass. The _group_float_values_by_tolerance helper and the float branch of _calculate_regions_cupy are removed as dead code.

Suggestions

The new test_polygonize_cupy_float_tolerance_matches_numpy_2151 only covered the adjacent-near-equal pattern from the issue. The follow-up parametrizes it with adjacent_near_equal and transitive_non_adjacent so the second case is locked in.

Nits

xrspatial/polygonize.py:1681-1682. the existing docstring already documents that cupy data is transferred to CPU for boundary tracing; after the follow-up, that note also covers the float CCL path, so no additional text is needed.

What looks good

The diagnosis in the PR body is accurate: the cupy backend's exact-equality binning was the source of the divergence reported in polygonize: cupy backend uses exact equality where numpy uses tolerance #2151.
The sweep-state CSV update is consistent with the previous entries.
The existing tests (test_polygonize_cupy_matches_numpy, test_polygonize_dask_cupy_matches_numpy) cover integer rasters, and the integer GPU path is unchanged.

Checklist

Algorithm matches the numba CPU _is_close predicate (with follow-up)
All implemented backends produce consistent results (with follow-up)
NaN handling preserved (float path falls through _polygonize_numpy, which masks NaN)
Edge cases covered: parametrized regression test now exercises the transitivity case
No premature materialization or unnecessary copies
Benchmark not needed (correctness fix, integer GPU path unchanged)
Docstrings updated to describe the integer-GPU / float-CPU split
Local pytest xrspatial/tests/test_polygonize.py passes (116 passed, 13 skipped) on the post-fix branch

github-actions Bot added the performance PR touches performance-sensitive code label May 19, 2026

brendancol merged commit 4e22efa into main May 20, 2026
4 checks passed

brendancol deleted the deep-sweep-accuracy-polygonize-2026-05-19 branch May 27, 2026 14:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

polygonize: cupy backend honours float tolerance from numba path (#2151)#2153

polygonize: cupy backend honours float tolerance from numba path (#2151)#2153
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-polygonize-2026-05-19

brendancol commented May 19, 2026

Uh oh!

Uh oh!

brendancol commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

brendancol commented May 19, 2026

Summary

Test plan

Uh oh!

Uh oh!

brendancol commented May 20, 2026

PR Review: polygonize: cupy backend honours float tolerance from numba path (#2151)

Blocker (fixed in follow-up)

Suggestions

Nits

What looks good

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant