reproject: add bounds_policy parameter (#2187) #2199
…2187) _compute_output_grid silently clamped geographic bounds and fell back to 2/98 percentile bounds when the projected extent blew up. Both branches cropped real data without telling the caller. Add an explicit bounds_policy parameter with four options: auto (default, current behaviour), raw (no heuristic), clamp (geographic clamp only), and percentile (force 2/98 fallback). When auto / clamp / percentile actually alters the bounds, emit a UserWarning naming the policy and reporting the per-side delta vs raw. Plumbed through reproject() and merge(). Tests cover the four policies, the warning behaviour, an explicit-bounds bypass, and the dask backend.
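To make the cropping concrete, here is a minimal sketch of how a 2/98 percentile extent differs from the raw min/max extent of projected edge samples. The helper names (`raw_bounds`, `percentile_bounds`, `_finite_pairs`) are hypothetical and are not taken from `_grid.py`; the actual trigger that `auto` uses to decide the extent has "blown up" is not shown in this conversation and is not modelled here.

```python
import numpy as np


def _finite_pairs(xs, ys):
    """Drop samples where either projected coordinate is non-finite."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    keep = np.isfinite(xs) & np.isfinite(ys)
    return xs[keep], ys[keep]


def raw_bounds(xs, ys):
    """(left, bottom, right, top) from the plain min/max of the samples."""
    xs, ys = _finite_pairs(xs, ys)
    return float(xs.min()), float(ys.min()), float(xs.max()), float(ys.max())


def percentile_bounds(xs, ys, lo=2.0, hi=98.0):
    """(left, bottom, right, top) from the lo/hi percentiles of the samples."""
    xs, ys = _finite_pairs(xs, ys)
    left, right = np.percentile(xs, [lo, hi])
    bottom, top = np.percentile(ys, [lo, hi])
    return float(left), float(bottom), float(right), float(top)


# One sample projects far away (for example near a projection singularity).
sample_xs = np.append(np.arange(100.0), 1e7)
sample_ys = np.arange(101.0)

raw = raw_bounds(sample_xs, sample_ys)
pct = percentile_bounds(sample_xs, sample_ys)
```

The percentile extent discards the far-away sample, but it also trims valid samples on every side, which is the silent cropping the issue describes.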
brendancol
left a comment
PR Review: reproject bounds_policy parameter
Blockers
None.
Suggestions
- xrspatial/reproject/_grid.py:230-238: the bounds_policy == "raw" branch appends four corner points to xs/ys, but when the policy is "raw" the geographic clamp at line 187 is skipped, so src_left/right/bottom/top already equal the raw values. The four extra points are duplicates of corners already in edge_xs/edge_ys. Drop the block, or move the clamp gating so the extra corners actually contribute. As written it is dead code that obscures intent.
- xrspatial/reproject/__init__.py:1847: inside merge() the per-input _compute_output_grid call passes bounds_policy through, so each input that crosses a singularity emits its own warning. A mosaic of N near-antimeridian rasters yields N near-identical warnings. Suppress the warning during the per-input gather pass and only emit it on the final merged-bounds call at line 1864.
Nits
- xrspatial/reproject/_grid.py:187: bounds_policy='clamp' is silently a no-op when source_crs is projected. Note that next to the "clamp" entry in the reproject() docstring so users do not assume something is happening on a UTM input.
- xrspatial/tests/test_reproject.py::TestBoundsPolicy: no test covers bounds_policy='clamp' on a mid-latitude geographic input where the clamp runs but trims nothing. A short test would pin the silent-no-op behaviour so it does not regress into spurious warnings.
- xrspatial/reproject/_grid.py:199-204: the if src_left >= src_right fallback resets clamp_applied = False after the clamp would have inverted the range. The condition can only fire on inputs where right is within 0.02 deg of left, which _validate_grid_params rejects upstream. The branch is unreachable on valid inputs. Either drop it or add a comment explaining why it stays.
What looks good
- Validation lives at the API boundary in both reproject() and merge(), with the error message listing the valid tokens.
- Warning text names the policy, the trigger, and the per-side delta vs the raw bounds. A caller can recover the uncropped bounds from the warning without re-running with 'raw'.
- The bounds=... short-circuit at line 177 means explicit bounds skip the policy logic entirely, and test_explicit_bounds_skips_policy_logic exercises that.
- Default value preserves historical behaviour, so no existing caller breaks.
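As an illustration of the second point, the sketch below shows one way a warning can carry per-side deltas and how a caller could recover the uncropped bounds from them. The function names, the `trigger` argument, and the message wording are assumptions for illustration only; the PR's actual warning text is not quoted in this review.

```python
import warnings

SIDES = ("left", "bottom", "right", "top")


def warn_if_bounds_altered(policy, trigger, raw, final):
    """Warn when `final` differs from `raw`; both are (left, bottom, right, top).

    Returns the per-side delta (final - raw) so callers can inspect it.
    """
    deltas = {side: f - r for side, r, f in zip(SIDES, raw, final)}
    if any(d != 0 for d in deltas.values()):
        detail = ", ".join(f"{side}={d:+g}" for side, d in deltas.items())
        warnings.warn(
            f"bounds_policy={policy!r} altered the output bounds "
            f"({trigger}); delta vs raw: {detail}",
            UserWarning,
            stacklevel=2,
        )
    return deltas


def recover_raw_bounds(final, deltas):
    """Undo the per-side deltas to get the uncropped bounds back."""
    return tuple(f - deltas[side] for side, f in zip(SIDES, final))


example_raw = (-200.0, -95.0, 200.0, 95.0)
example_final = (-180.0, -90.0, 180.0, 90.0)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    example_deltas = warn_if_bounds_altered(
        "clamp", "geographic clamp", example_raw, example_final
    )
```

Because the delta is reported as final minus raw on each side, subtracting it from the final bounds reproduces the raw extent without a second reprojection.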
Checklist
- Algorithm matches reference: heuristics unchanged in shape, just gated by the new parameter.
- All implemented backends produce consistent results: bounds policy runs in pure numpy at graph-build time.
- NaN handling is correct: existing np.isfinite filters on transformed coords are preserved.
- Edge cases covered by tests: invalid-token rejection, explicit-bounds bypass, benign-input no-warning, dask backend.
- Dask chunk boundaries handled correctly: policy fires once at graph build, not per chunk.
- No premature materialization or unnecessary copies.
- Benchmark not needed (parameter on existing path).
- README feature matrix not needed (no new function, no backend change).
- Docstrings present and accurate.
…docstring + tests (#2187)
brendancol
left a comment
Follow-up review after f959b9f
Dispositions for the first-pass findings:
- Suggestion 1 (dead 'raw' corner block in _grid.py:230-238): fixed -- removed the block. When 'raw' is set, the clamp above is skipped, so the edge samples already use the original src_*_raw values. Replaced with a clarifying comment.
- Suggestion 2 (per-input warnings in merge()): fixed -- per-input gather now captures bounds_policy warnings, deduplicates them, and emits a single summary warning naming the count and first trigger. Non-policy warnings are re-emitted untouched.
- Nit 1 ('clamp' docstring on projected CRS): fixed -- docstring now says it is a no-op for projected source CRSes.
- Nit 2 (missing test for 'clamp' no-op): fixed -- added test_clamp_policy_noop_on_benign_geographic and test_clamp_policy_noop_on_projected_source. Also added test_merge_dedupes_per_input_warnings to pin the dedup behaviour.
- Nit 3 (unreachable if src_left >= src_right branch): kept with comment -- left the defensive guard in place and added a comment that _validate_grid_params rejects the degenerate inputs that would reach it. Cheaper than carrying the dependency on upstream validation.
Tests: 13 in TestBoundsPolicy (was 10), full test_reproject.py at 281 pass (was 278).
No further findings on the diff.
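For readers following the Suggestion 2 disposition, the capture-and-summarise pattern can be sketched with the standard library's `warnings.catch_warnings(record=True)`. This is not the code in `merge()`: `gather_with_dedup`, the `compute_grid` callback, and the substring used to recognise a policy warning are all hypothetical.

```python
import warnings

POLICY_MARKER = "bounds_policy"  # hypothetical: how a policy warning is recognised


def gather_with_dedup(inputs, compute_grid):
    """Run compute_grid per input, collapsing policy warnings into one summary."""
    results = []
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        for item in inputs:
            results.append(compute_grid(item))

    policy_messages = []
    for w in caught:
        is_policy = (
            issubclass(w.category, UserWarning)
            and POLICY_MARKER in str(w.message)
        )
        if is_policy:
            policy_messages.append(str(w.message))
        else:
            # Re-emit unrelated warnings with their original location.
            warnings.warn_explicit(w.message, w.category, w.filename, w.lineno)

    if policy_messages:
        warnings.warn(
            f"{len(policy_messages)} input(s) had their bounds altered by "
            f"bounds_policy; first: {policy_messages[0]}",
            UserWarning,
            stacklevel=2,
        )
    return results
```

Recording inside a `catch_warnings` block keeps the per-input pass quiet, and re-emitting the non-policy entries afterwards means unrelated diagnostics are not swallowed.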
# Conflicts: # xrspatial/tests/test_reproject.py
Closes #2187.
Summary
- Add bounds_policy parameter to reproject() and merge() to control the output-bounds heuristics in _compute_output_grid. Options: auto (default, matches historical behaviour), raw (no heuristic), clamp (geographic clamp only), percentile (force 2/98 fallback).
- Emit UserWarning when auto, clamp, or percentile actually alters the bounds. The warning names the policy and reports the per-side delta vs the raw projected extent. Silent cropping was the bug the issue flagged.
- Plumbed through reproject() and merge(). Validate unknown tokens at the API boundary. Skipped when explicit bounds is supplied.
Backend coverage
- … test_raw_policy_dask_backend)
Test plan
- pytest xrspatial/tests/test_reproject.py::TestBoundsPolicy -x -q (10 new tests)
- pytest xrspatial/tests/test_reproject.py -x -q (278 tests, no regressions)
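The token validation mentioned in the summary can be sketched as follows. The constant name, the helper name, and the choice of `ValueError` are assumptions; the conversation only states that unknown tokens are rejected at the API boundary with a message listing the valid ones.

```python
VALID_BOUNDS_POLICIES = ("auto", "raw", "clamp", "percentile")


def validate_bounds_policy(bounds_policy="auto"):
    """Reject unknown tokens early, listing the accepted ones in the error."""
    if bounds_policy not in VALID_BOUNDS_POLICIES:
        raise ValueError(
            f"bounds_policy must be one of {VALID_BOUNDS_POLICIES}, "
            f"got {bounds_policy!r}"
        )
    return bounds_policy
```

Calling a helper like this at the top of both public entry points keeps a typo such as 'percentiles' from silently falling through to the default behaviour deeper in the grid computation.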