Pin actionable failure modes for unsupported COG writer inputs (#2301)#2307
Conversation
…y-contrib#2301) Add test_cog_invalid_input_errors_2286.py covering the seven rows in the issue: experimental codecs, internal-only JPEG, rotated transforms (rotated_affine attr, rotated 6-tuple transform, rotated Affine object, skewed Affine object), file-like / BytesIO destinations, CuPy + cog, object dtype, and conflicting CRS attrs. Each test pins the exception type and a substring of the message that names the violated constraint or the opt-in flag. Most rows pin behaviour the writer already enforced. The one writer-side change is in to_geotiff: a rasterio Affine in attrs['transform'] with non-zero rotation/shear used to slip past transform_from_attr because Affine iterates as a 9-element augmented matrix and the 6-tuple gate returned None for that length. The writer fell back to no-georef output and silently dropped the rotation. The new validation hook detects the Affine duck-type via the .b / .d attrs and raises the same diagnostic the 6-tuple branch already produced, so the rejection message stays consistent across both shapes.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Pin actionable failure modes for unsupported COG writer inputs (#2301)
Blockers
None.
Suggestions
-
xrspatial/geotiff/_writers/eager.py:378-380— thetry: _b = float(...); _d = float(...) except (TypeError, ValueError): _b = _d = 0.0swallow path silently turns a malformedattrs['transform'](e.g.transform.bset to a string) into "no rotation, proceed with the write". That used to be the no-op fall-through intransform_from_attr, but here it bypasses every downstream gate that would have caught the bad value (the resolver andcoords_to_transform). Consider re-raising as aValueErrornaming the attr instead of zero-defaulting. Low risk in practice (realAffineobjects always have float.b/.d) but the dead branch reads as "fail open" inside a fail-closed validator. -
xrspatial/geotiff/_writers/eager.py:360-389— the duck-typed Affine detection fixes the publicto_geotiffentry point, butwrite_geotiff_gpuis also a public-ish entry (used by the auto-dispatch path and exported on the package). If a caller invokeswrite_geotiff_gpudirectly with a rotated Affine in attrs, the new gate doesn't fire there. Out of scope for this PR per the issue's "public entry" wording, but worth a follow-up note since both entry points were intended to "hit the same gate" per the existing_validate_no_rotated_affinecomment two lines up.
Nits
-
xrspatial/geotiff/tests/test_cog_invalid_input_errors_2286.pyline ~188 (test_bytesio_destination_with_cog_raises) lacks thetmp_pathfixture and writes nothing to disk; the test name suffix_2301is also missing on the BytesIO buffer (other tests include it on file paths). Consistency nit only. -
xrspatial/geotiff/tests/test_cog_invalid_input_errors_2286.pyline ~296 (test_crs_kwarg_overrides_attrs_silently) usescrs_wkt='GEOGCS["foo"]'to make the conflict check no-op. The comment says "unparseable; check no-ops" but the check actually short-circuits atif context.get('crs_kwarg') is not None: returnbefore pyproj parsing. The comment overstates how the bypass works. -
xrspatial/geotiff/_writers/eager.py:382—_ROT_TOL = 1e-12is repeated verbatim from_coords.py:333. A shared constant would prevent the two gates drifting apart on a future tolerance tweak.
What looks good
- The new check is exactly as narrow as the issue asks: it touches one validator block inside
to_geotiff, leaves every other path alone, and reuses the existing 6-tuple error message verbatim so existingpytest.raises(match=...)callers in the codebase still hit. - Test coverage is thorough: 18 tests covering all seven matrix rows plus three sanity-guard rows that pin the happy paths (axis-aligned Affine writes, BytesIO without
cog=True, CRS-kwarg override). The sanity guards protect against a future "tighten the rejection" change widening the bucket and breaking real callers. - The CuPy +
cog=Truerow is correctly handled as a no-op pin with a comment explaining the tier-promotion decision is out of scope. Many reviewers would have flipped this to apytest.raisesto "match the issue body", which would have changed semantics on a currently-succeeding path. - The 6-tuple-rotated and Affine-rotated tests share the exact same
'rotation/shear'and'axis-aligned'assertions, which locks the two code paths to one error message. - Temporary file names include the issue suffix
_2301to avoid collisions with parallel test runs (matches the project convention).
Checklist
- Algorithm matches reference: N/A (validation hook, not numerical code)
- All implemented backends produce consistent results: covered (CuPy row pinned as currently-succeeds; rotated check applies pre-dispatch so GPU path is also gated when reached via
to_geotiff) - NaN handling: N/A
- Edge cases covered by tests: axis-aligned Affine sanity, drop_rotation opt-in, BytesIO without cog, crs-kwarg override
- Dask chunk boundaries handled correctly: N/A
- No premature materialization or unnecessary copies: N/A
- Benchmark exists or is not needed: not needed (validation-only)
- README feature matrix updated: N/A (no new functions)
- Docstrings present and accurate
…b#2301) * Promote ROTATION_SHEAR_TOL to a module-level constant in _coords.py so the 6-tuple gate in transform_from_attr and the Affine duck-type gate in to_geotiff stay in lockstep on future tolerance tweaks. * Fail closed when an attrs['transform'] object has .b / .d that cannot be converted to float. The previous zero-default branch treated a malformed affine as axis-aligned and bypassed every downstream georef gate. The fail-closed branch surfaces a clear ValueError naming the unconvertable values. * Add a test covering the new fail-closed branch with a class that quacks like Affine but holds a non-numeric ``b`` term. * Tighten the test_crs_kwarg_overrides_attrs_silently docstring to describe the actual short-circuit path (kwarg presence, not WKT parseability) so the test rationale matches the code it pins.
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review after 40f5cd1
All three suggestions / nits from the first pass are addressed in the follow-up commit:
- Suggestion 1 (fail-closed on malformed
.b/.d): the zero-default branch is replaced with a typedValueErrornaming the unconvertable values. A new test (test_affine_attr_with_unconvertable_b_d_raises) pins the behaviour. - Suggestion 2 (
write_geotiff_gpudirect-caller gap): deferred. The issue scopes #2301 to the publicto_geotiffentry point, and the auto-dispatch already routes GPU writes throughto_geotiffso the gate fires before GPU dispatch. Direct callers ofwrite_geotiff_gpuare a follow-up, not a regression introduced here. - Nit 1 (BytesIO test consistency): not changed. The test path lives in a
BytesIObuffer with no on-disk component, so the_2301filename suffix convention does not apply. Leaving as-is. - Nit 2 (test comment overstates the bypass path): the docstring on
test_crs_kwarg_overrides_attrs_silentlynow describes the actual short-circuit (kwarg presence in_check_write_conflicting_crs), not the WKT parseability. - Nit 3 (shared
_ROT_TOL): promoted toROTATION_SHEAR_TOLat module level in_coords.py; both the 6-tuple branch intransform_from_attrand the Affine duck-type branch into_geotiffimport the same constant.
166 tests pass on the related test files (the new 19 plus the COG, rotated-transform, georef-resolver, fail-closed, and ambiguous-metadata suites). No remaining blockers from my side.
Summary
Pins typed, actionable exceptions for unsupported
to_geotiff(..., cog=True)input combinations, per the matrix in #2301. Most rows already raised; the test file just locks the type and a substring of the message in. One row needed a writer-side hook.Behavior matrix
allow_experimental_codecs=TrueValueErrornaming codec + flag + tierallow_internal_only_jpeg=TrueValueErrornaming codec + flagattrs['rotated_affine']set withoutdrop_rotation=TrueValueError(#2216)attrs['transform']6-tuple with rotation/shearValueError(transform_from_attr, #1987 PR 3)attrs['transform']rasterioAffinewith rotation/shearto_geotiffcog=TrueValueErrorcog=TrueValueErrornaming dtypeattrs['crs']vsattrs['crs_wkt']ConflictingCRSError(#1987 PR 6)The one production change
A rasterio
Affineiterates as a 9-element augmented matrix, so the 6-tuple branch intransform_from_attrreturnedNonefor it and the rotation/shear gate never fired. The writer then fell back to coord-derived or no-georef output and the rotation was lost on disk.to_geotiffnow duck-types the Affine via.b/.dattrs and raises the same diagnostic the 6-tuple branch already produces, so the rejection message stays identical across both input shapes.Test plan
test_cog.py,test_allow_rotated_*.py,test_cog_writer_compliance.pystill passtest_georef_resolver_parity_2211.py,test_remaining_fail_closed_1987.py,test_ambiguous_metadata_hooks_1987.pystill passCloses #2301