Tolerate unreadable .ovr sidecar on base read (#2416) #2419
Merged
The release contract puts reader.local_file at the stable tier and reader.sidecar_ovr at advanced. Before this change, the eager CPU path, the eager GPU path, and the metadata-only helper all parsed a sibling .ovr before IFD selection, so a stale or corrupt sidecar (a gdaladdo dropping, a partial download, a third-party tool artefact) turned an advanced-tier fault into a stable-tier base read failure. Catch sidecar load failures during discovery, emit a RuntimeWarning, and fall through to base-file-only behaviour. CloudSizeLimitError still propagates because the byte budget is a caller-set contract. Requesting a specific external overview level surfaces the underlying parse error -- silent fallback only applies when the caller did not ask for the broken surface. Mirrors the contract that discover_remote_sidecar already uses on the dask metadata path.
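The fallback rule described above can be sketched as follows. This is a minimal illustration, not the xrspatial implementation: `load_sidecar`, `discover_sidecar`, the `loader` parameter, and the local `CloudSizeLimitError` class are hypothetical stand-ins, and treating `overview_level >= 1` as "the caller asked for the external overview" is a simplification of the real level numbering. Only the control flow follows the description.

```python
import os
import warnings


class CloudSizeLimitError(Exception):
    """Stand-in for the library's byte-budget error (hypothetical definition)."""


def load_sidecar(path):
    """Hypothetical parser: accepts only files that start with a TIFF byte-order mark."""
    with open(path, "rb") as fh:
        header = fh.read(2)
    if header not in (b"II", b"MM"):
        raise ValueError(f"{path!r} is not a TIFF file")
    return {"path": path}


def discover_sidecar(base_path, overview_level=None, loader=load_sidecar):
    """Return parsed sidecar info, or None to signal base-file-only behaviour."""
    ovr_path = base_path + ".ovr"
    if not os.path.exists(ovr_path):
        return None
    try:
        return loader(ovr_path)
    except CloudSizeLimitError:
        # The byte budget is a caller-set contract, so it always propagates.
        raise
    except Exception as exc:
        if overview_level is not None and overview_level >= 1:
            # The caller asked for the sidecar surface: show the real error.
            raise
        warnings.warn(
            f"Ignoring unreadable sidecar {ovr_path!r}: {exc}",
            RuntimeWarning,
            stacklevel=2,
        )
        return None
```

The order of the `except` clauses matters here: the budget error has to be matched before the broad `Exception` handler, otherwise it would be downgraded to a warning along with the parse errors.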
brendancol commented May 26, 2026 (Contributor, Author)
brendancol left a comment
Review
The change is tight and the test surface is solid. A few items below; none are blockers.
Blockers
None.
Suggestions
- `xrspatial/geotiff/__init__.py:290-301` and `xrspatial/geotiff/_backends/gpu.py:419-435`: the `_read_geo_info` and GPU eager paths catch a bare `Exception` and do not re-raise `CloudSizeLimitError` the way `_reader._read_to_array` does. In practice both paths are local-file-only at this call site, so the budget error cannot fire today, but the asymmetry is easy to miss the next time someone routes a cloud source through here. Either re-raise `CloudSizeLimitError` for symmetry with `_reader.py:244` or add a one-line comment at each site noting "local-file-only here; budget breach cannot fire" so the next reader does not wonder.
- `xrspatial/geotiff/_reader.py:248` (and the two parallel sites): `stacklevel=2` points at the immediate caller of `_read_to_array`, not at the user's `open_geotiff` / `read_geotiff_dask` call. The warning location in the user's traceback may end up inside the library rather than at the user's call line. Consider bumping to `stacklevel=3` or higher after walking the actual call chain.
Nits
- The warning text says "Request a specific external overview level to surface the error instead." That is accurate but a bit indirect. A short hint like "Delete the `.ovr` file or pass `overview_level=N >= 1` to surface the parse error" is more actionable. Optional.
- `xrspatial/geotiff/tests/test_sidecar_bad_does_not_break_base_2416.py` has a `_gpu_available()` helper that duplicates the same pattern used in `test_sidecar_ovr_2112.py`. Not worth a refactor in this PR, but a candidate for a shared `conftest.py` fixture if anyone consolidates later.
What looks good
- Three call sites covered: `_reader._read_to_array`, `__init__._read_geo_info`, `_backends.gpu.read_geotiff_gpu`. The contract is consistent across CPU eager, GPU eager, and dask metadata paths.
- `CloudSizeLimitError` re-raise on the CPU eager path keeps the caller-set byte budget visible.
- Tests pin the level-1-still-raises behaviour, the `overview_level=0` path, the `CloudSizeLimitError` pass-through (via monkeypatch on the local-file path), and five corrupt-payload shapes.
- Module-level `import warnings` added to `_reader.py` rather than function-scoped, avoiding the `UnboundLocalError` trap that bit the GPU file before the fix.
- Release-contract row updated so the tier promise is documented, not just implemented.
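For readers who have not hit the `UnboundLocalError` trap mentioned above, a minimal reproduction of the general pattern follows. The exact shape of the original GPU-file bug is not shown in this PR, so both functions are hypothetical illustrations.

```python
import warnings


def read_with_function_scoped_import(sidecar_ok):
    """Buggy shape: an import inside one branch makes `warnings` a local name."""
    if sidecar_ok:
        import warnings  # binds the *local* `warnings`, only on this branch
        warnings.warn("sidecar loaded", RuntimeWarning)
        return "sidecar"
    # The compiler treats `warnings` as local to the whole function, and on
    # this branch it was never assigned, so the lookup raises UnboundLocalError.
    warnings.warn("sidecar unreadable", RuntimeWarning)
    return "base"


def read_with_module_import(sidecar_ok):
    """Fixed shape: rely on the module-level import only."""
    if not sidecar_ok:
        warnings.warn("sidecar unreadable", RuntimeWarning)
        return "base"
    return "sidecar"
```

Calling `read_with_function_scoped_import(False)` raises `UnboundLocalError` even though `warnings` is imported at module level, because the in-function `import` shadows the global name for the entire function body.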
Checklist
- Algorithm matches reference: contract aligns with `_sidecar.discover_remote_sidecar`'s existing dask-path behaviour.
- Backends consistent: CPU eager, GPU eager, metadata-only, dask metadata helper all follow the same rule.
- NaN handling: not applicable (read-path control flow).
- Edge cases tested: empty file, short file, gzip magic, PNG magic, plain text; plus `overview_level=0` and `overview_level=1`.
- Dask chunk boundaries: not applicable.
- No premature materialization.
- Benchmark: not applicable (no perf-sensitive change).
- README feature matrix: not applicable (no new function).
- Docstrings present.
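The five corrupt-payload shapes in the edge-case row can be illustrated with a header check. The byte strings below are assumptions chosen to match the names in the checklist, not the literal payloads used in the test file, and `looks_like_tiff` is a hypothetical helper that inspects only the first four bytes (byte-order mark plus magic number 42 for TIFF or 43 for BigTIFF).

```python
import struct

# Assumed payloads, one per shape named in the checklist.
CORRUPT_PAYLOADS = {
    "empty": b"",
    "short": b"II",
    "gzip_magic": b"\x1f\x8b\x08\x00" + b"\x00" * 16,
    "png_magic": b"\x89PNG\r\n\x1a\n" + b"\x00" * 16,
    "plain_text": b"this is not a TIFF\n",
}


def looks_like_tiff(data):
    """Check the 4-byte header: byte-order mark, then magic 42 (TIFF) or 43 (BigTIFF)."""
    if len(data) < 4:
        return False
    if data[:2] == b"II":
        fmt = "<H"  # little-endian
    elif data[:2] == b"MM":
        fmt = ">H"  # big-endian
    else:
        return False
    (magic,) = struct.unpack(fmt, data[2:4])
    return magic in (42, 43)
```

None of the five payloads passes this check, so a parser that validates the header fails early on each of them. A parametrized test can then write each payload to a sibling `.ovr` and assert that the base read still succeeds.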
- Re-raise CloudSizeLimitError on the GPU eager path and the metadata-only _read_geo_info helper, mirroring the eager CPU path in _reader._read_to_array. The exception cannot fire on a local mmap source today, but keeping the symmetry explicit prevents a silent regression if a future patch widens either call site to a cloud source. Add CloudSizeLimitError to the module-level imports in __init__.py and _backends/gpu.py.
- Bump warnings.warn stacklevel from 2 to 3 so the warning location resolves at the user's open_geotiff / read_geotiff_dask call site rather than inside the library.
- Rewrite the warning text to give an actionable next step: "Delete the .ovr file or pass overview_level>=1 to surface the parse error." Replaces the indirect previous wording.
- New test pinning the CloudSizeLimitError re-raise on the metadata-only path so the symmetry stays covered.
brendancol commented May 26, 2026 (Contributor, Author)
brendancol left a comment
Follow-up review after 2ddafa0
Both Suggestions and Nit 1 from the prior review are applied; Nit 2 is dismissed:
Fixed
- Suggestion 1: Re-raise `CloudSizeLimitError` on the GPU eager path (`_backends/gpu.py:421-429`) and the metadata-only helper (`__init__.py:292-300`). Both sites now match `_reader._read_to_array:244` for symmetry. `CloudSizeLimitError` added to module-level imports in `__init__.py` and `_backends/gpu.py`.
- Suggestion 2: `warnings.warn` `stacklevel` bumped from 2 to 3 at all three call sites so the warning resolves at the user's `open_geotiff` / `read_geotiff_dask` call rather than inside the library.
- Nit 1: Warning text rewritten to give an actionable next step ("Delete the .ovr file or pass overview_level>=1 to surface the parse error.").
Dismissed
- Nit 2 (shared `_gpu_available()` fixture): out of scope for this PR. The duplication is in a parallel test file (`test_sidecar_ovr_2112.py`); consolidating it would touch unrelated tests.
New test
`test_read_geo_info_cloud_size_limit_error_is_not_silenced` pins the re-raise on the metadata-only path so the new symmetry stays covered.
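A test of this kind might look roughly like the sketch below. It is not the test added in the PR: `sidecar`, `read_geo_info`, and the local `CloudSizeLimitError` are hypothetical stand-ins, and `unittest.mock.patch.object` is used in place of pytest's `monkeypatch` so the example runs on its own.

```python
import types
import warnings
from unittest import mock


class CloudSizeLimitError(Exception):
    """Stand-in for the library's byte-budget error (hypothetical definition)."""


# Hypothetical namespace standing in for the sidecar module.
sidecar = types.SimpleNamespace(load=lambda path: {"path": path})


def read_geo_info(path):
    """Hypothetical metadata-only helper that tolerates a bad sidecar."""
    try:
        sidecar.load(path + ".ovr")
    except CloudSizeLimitError:
        raise  # the budget breach must reach the caller
    except Exception as exc:
        warnings.warn(f"Ignoring unreadable sidecar: {exc}", RuntimeWarning, stacklevel=2)
    return {"source": path}


def test_budget_error_is_not_silenced():
    boom = CloudSizeLimitError("budget exceeded")
    # Force the sidecar loader to raise the budget error for this call only.
    with mock.patch.object(sidecar, "load", side_effect=boom):
        try:
            read_geo_info("base.tif")
        except CloudSizeLimitError as exc:
            assert exc is boom
        else:
            raise AssertionError("CloudSizeLimitError was swallowed")
```

Patching the loader is what lets the test exercise the budget error on a local-file path, where it cannot occur naturally.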
Verification
`pytest xrspatial/geotiff/tests/`: 5732 passed, 68 skipped, 6 xfailed.
Closes #2416.
Summary
- The release contract puts `reader.local_file` at the stable tier and `reader.sidecar_ovr` at advanced. Before this change a stale or malformed sibling `.ovr` would take the stable base read down on all three eager paths (CPU, GPU, metadata-only helper).
- Catch sidecar load failures in `_read_to_array`, `_read_geo_info`, and the GPU backend; emit a `RuntimeWarning` and fall through to base-only. `CloudSizeLimitError` still propagates because the byte budget is a caller-set contract.
- Requesting a specific external overview level (`overview_level >= 1`) still surfaces the underlying parse error. Silent fallback only applies when the caller did not ask for the sidecar surface.
- `discover_remote_sidecar` (dask metadata path) already followed this rule. The eager paths now match.
Backend coverage
- `_read_to_array` change.
- `_backends/gpu.py` change.
- `_read_geo_info` (covered) and `discover_remote_sidecar` (already tolerant since "Remote chunked GeoTIFF reads do not honor external .ovr sidecars", #2239).
Test plan
- `pytest xrspatial/geotiff/tests/test_sidecar_bad_does_not_break_base_2416.py` (12 new tests including 5 parametrized payloads, GPU eager path, CloudSizeLimitError pass-through, explicit-level error surfacing).
- `pytest xrspatial/geotiff/tests/test_sidecar_*.py xrspatial/geotiff/tests/test_remote_sidecar_*.py` (70 existing sidecar tests still pass).
- `pytest xrspatial/geotiff/tests/` -> 5731 passed, 68 skipped, 6 xfailed.