Skip to content

polygonize: cupy backend honours float tolerance from numba path (#2151)#2153

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-polygonize-2026-05-19
May 20, 2026
Merged

polygonize: cupy backend honours float tolerance from numba path (#2151)#2153
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-polygonize-2026-05-19

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

  • The cupy polygonize backend grouped float pixels into regions by exact equality (data == v), while the numba CPU path used by the numpy / dask / dask+cupy backends compared neighbouring pixels with _is_close (atol=1e-8, rtol=1e-5).
  • On float rasters with near-equal values the cupy backend produced more polygons than the other three backends, e.g. values 1.0 and 1.000001 were split into different regions on cupy but merged on numpy.
  • Fix: in _calculate_regions_cupy, for float dtypes greedily chain sorted unique values using the same tolerance and label each chained group as one region. Integer dtypes keep the existing exact-equality path.

Fixes #2151.

Test plan

Discovered by the /sweep-accuracy audit (Cat 5 - Backend Inconsistency).

…#2151)

The cupy backend binned float pixels into regions by exact value while
the numba CPU path (used by the numpy, dask and dask+cupy backends)
compares neighbouring pixels with `_is_close` (atol=1e-8, rtol=1e-5).
On float rasters with near-equal values this produced different polygon
sets across backends.

For float dtypes, group sorted unique values in the cupy backend using
the same tolerance and label each group as a single region.  Integer
dtypes keep the existing exact-equality path.

Adds a regression test asserting that the cupy backend yields the same
polygon count and per-value areas as the numpy backend on a float
raster with values 1.0 and 1.000001.

Also records this finding in the accuracy sweep state CSV.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 19, 2026
…parity

The first pass at #2151 added value-only greedy chaining of unique float
values in _calculate_regions_cupy.  That fixed the reported pattern
(adjacent near-equal pixels) but still diverged from the CPU path when
three or more transitively-close values were present in the raster
without intermediate adjacent pixels.  Example:

    [1.0,        1.000018]
    [1.000009,   3.0      ]

CPU spatial CCL yields three regions: 1.0 + 1.000009 (close, adjacent),
1.000018 alone (not close to either neighbour), and 3.0 alone.  The
value-grouping heuristic chained 1.0 -> 1.000009 -> 1.000018 and labeled
the connected mask as a single region, returning two regions on cupy.

Route float dtypes through _polygonize_numpy in _polygonize_cupy so the
numba _is_close predicate is applied to spatially adjacent pixels, the
only place CPU and GPU semantics can agree without re-implementing CCL.
The data is already transferred to host for boundary tracing, so the
fall-back cost is one extra _calculate_regions call.

Drop _group_float_values_by_tolerance and the float branch of
_calculate_regions_cupy as dead code.  Extend the regression test to
parametrize over the original adjacent pattern and the transitivity
edge case.
@brendancol brendancol merged commit 4e22efa into main May 20, 2026
4 checks passed
@brendancol
Copy link
Copy Markdown
Contributor Author

PR Review: polygonize: cupy backend honours float tolerance from numba path (#2151)

I ran this review on commit 2e9b364 and then pushed a follow-up commit (7bcecfb) that I think is needed for the stated parity goal. Details below.

Blocker (fixed in follow-up)

  • xrspatial/polygonize.py:591-614,650-693. value-grouping heuristic still diverges from CPU CCL on transitively-close values. The greedy chain merges 1.0, 1.000009, 1.000018 into a single tolerance group because each consecutive pair is within tolerance, but the numba CPU path runs spatial CCL with a pairwise _is_close check. On

    [1.0,        1.000018]
    [1.000009,   3.0     ]
    

    CPU returns 3 regions (1.0+1.000009 adjacent and close, 1.000018 not close to either neighbour, 3.0 alone). The new value-grouping path returns 2 regions, since cp.isin(data, [1.0, 1.000009, 1.000018]) makes the top-row pair land in the same connected component on cupy.

    The push at 7bcecfb routes float dtypes through _polygonize_numpy inside _polygonize_cupy. Data is already transferred to host for boundary tracing in this hybrid backend, so the cost is one extra _calculate_regions call on CPU rather than a per-value GPU label pass. The _group_float_values_by_tolerance helper and the float branch of _calculate_regions_cupy are removed as dead code.

Suggestions

  • The new test_polygonize_cupy_float_tolerance_matches_numpy_2151 only covered the adjacent-near-equal pattern from the issue. The follow-up parametrizes it with adjacent_near_equal and transitive_non_adjacent so the second case is locked in.

Nits

  • xrspatial/polygonize.py:1681-1682. the existing docstring already documents that cupy data is transferred to CPU for boundary tracing; after the follow-up, that note also covers the float CCL path, so no additional text is needed.

What looks good

  • The diagnosis in the PR body is accurate: the cupy backend's exact-equality binning was the source of the divergence reported in polygonize: cupy backend uses exact equality where numpy uses tolerance #2151.
  • The sweep-state CSV update is consistent with the previous entries.
  • The existing tests (test_polygonize_cupy_matches_numpy, test_polygonize_dask_cupy_matches_numpy) cover integer rasters, and the integer GPU path is unchanged.

Checklist

  • Algorithm matches the numba CPU _is_close predicate (with follow-up)
  • All implemented backends produce consistent results (with follow-up)
  • NaN handling preserved (float path falls through _polygonize_numpy, which masks NaN)
  • Edge cases covered: parametrized regression test now exercises the transitivity case
  • No premature materialization or unnecessary copies
  • Benchmark not needed (correctness fix, integer GPU path unchanged)
  • Docstrings updated to describe the integer-GPU / float-CPU split
  • Local pytest xrspatial/tests/test_polygonize.py passes (116 passed, 13 skipped) on the post-fix branch

@brendancol brendancol deleted the deep-sweep-accuracy-polygonize-2026-05-19 branch May 27, 2026 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

polygonize: cupy backend uses exact equality where numpy uses tolerance

1 participant