From 031a3bd15825078b06572698e0fa3ea2fee55a79 Mon Sep 17 00:00:00 2001 From: Brendan Collins Date: Mon, 25 May 2026 20:22:15 -0700 Subject: [PATCH 1/3] geotiff tests: consolidate VRT validation cluster (#2395) Folds the validator-side VRT files into one parametrised module: - test_vrt_validation_2321.py - test_vrt_capability_validator_2371.py - test_vrt_unsupported_2370.py - test_vrt_narrow_except_1670.py - test_vrt_path_containment_1671.py All five are removed in this commit. The new ``vrt/test_validation.py`` parametrises the validator-rule matrix, the entry-point matrix (package read_vrt, internal _vrt.read_vrt, open_geotiff), the narrow-except matrix (exception class x default/strict mode), and the path-containment matrix. Also removes the stale ``CLUSTER_AUDIT.md`` from PR 1 that leaked to main, and updates two release-gate citations in ``docs/source/reference/release_gate_geotiff.rst`` so the ``test_release_gate_cites_only_existing_test_files`` gate still passes. Adds a temporary ``CLUSTER_AUDIT_PR2.md`` mapping every old ``file::test`` to its new ``file::test_id``; deleted in a follow-up commit on this branch before merge per the epic #2390 contract. Tests-only restructure. No changes to ``xrspatial/geotiff/`` source modules. ``find xrspatial/geotiff/tests -name 'test_*.py' | wc -l`` drops from 356 to 352 (5 deleted, 1 added). 
--- .../source/reference/release_gate_geotiff.rst | 4 +- xrspatial/geotiff/tests/CLUSTER_AUDIT.md | 75 - xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md | 130 ++ .../test_vrt_capability_validator_2371.py | 519 ------ .../tests/test_vrt_narrow_except_1670.py | 381 ----- .../tests/test_vrt_path_containment_1671.py | 303 ---- .../tests/test_vrt_unsupported_2370.py | 510 ------ .../geotiff/tests/test_vrt_validation_2321.py | 488 ------ .../geotiff/tests/vrt/test_validation.py | 1469 +++++++++++++++++ 9 files changed, 1601 insertions(+), 2278 deletions(-) delete mode 100644 xrspatial/geotiff/tests/CLUSTER_AUDIT.md create mode 100644 xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md delete mode 100644 xrspatial/geotiff/tests/test_vrt_capability_validator_2371.py delete mode 100644 xrspatial/geotiff/tests/test_vrt_narrow_except_1670.py delete mode 100644 xrspatial/geotiff/tests/test_vrt_path_containment_1671.py delete mode 100644 xrspatial/geotiff/tests/test_vrt_unsupported_2370.py delete mode 100644 xrspatial/geotiff/tests/test_vrt_validation_2321.py create mode 100644 xrspatial/geotiff/tests/vrt/test_validation.py diff --git a/docs/source/reference/release_gate_geotiff.rst b/docs/source/reference/release_gate_geotiff.rst index b616af39..71e87d24 100644 --- a/docs/source/reference/release_gate_geotiff.rst +++ b/docs/source/reference/release_gate_geotiff.rst @@ -521,7 +521,7 @@ VRT supported subset - stable - Relative source paths are constrained to the VRT's directory tree and cannot escape via ``..``. - - ``xrspatial/geotiff/tests/test_vrt_path_containment_1671.py`` + - ``xrspatial/geotiff/tests/vrt/test_validation.py`` - `#2344`_ * - VRT resampling algorithm allow-list - advanced @@ -557,7 +557,7 @@ VRT supported subset - stable - VRT-specific failures surface as typed exceptions rather than as generic ``Exception``. 
- - ``xrspatial/geotiff/tests/test_vrt_narrow_except_1670.py`` + - ``xrspatial/geotiff/tests/vrt/test_validation.py`` - `#2321`_ * - VRT presence gate - stable diff --git a/xrspatial/geotiff/tests/CLUSTER_AUDIT.md b/xrspatial/geotiff/tests/CLUSTER_AUDIT.md deleted file mode 100644 index 75b1b411..00000000 --- a/xrspatial/geotiff/tests/CLUSTER_AUDIT.md +++ /dev/null @@ -1,75 +0,0 @@ -# CLUSTER_AUDIT.md — PR 1 (Foundation + VRT missing-sources) - -Temporary audit table tracking every old `file::test` and where it lands -in the consolidated layout. Deleted in a follow-up commit on the same -branch before merge per the epic #2390 contract. - -## Foundation moves - -| Old location | New location | Notes | -|---|---|---| -| `conftest.py::make_minimal_tiff` (function) | `_helpers/tiff_builders.py::make_minimal_tiff` | Re-exported from `conftest.py` so existing `from .conftest import make_minimal_tiff` keeps working. | -| `conftest.py::gpu_available` | `_helpers/markers.py::gpu_available` | Re-exported from `conftest.py`. | -| `conftest.py::loopback_available` | `_helpers/markers.py::loopback_available` | Re-exported from `conftest.py`. | -| `conftest.py::requires_gpu` | `_helpers/markers.py::requires_gpu` | Marker name unchanged; re-exported. | -| `conftest.py::requires_loopback` | `_helpers/markers.py::requires_loopback` | Marker name unchanged; re-exported. | -| `conftest.py::requires_integration` | `_helpers/markers.py::requires_integration` | Marker name unchanged; re-exported. | -| `conftest.py::pytest_collection_modifyitems` | `conftest.py::pytest_collection_modifyitems` | Left in place; PR 11 removes it per the epic. | -| `_tiff_surgery.py` (whole module) | `_helpers/tiff_surgery.py` | Verbatim relocation. Direct importers updated. 
| - -### Updated import sites - -| File | Change | -|---|---| -| `test_local_tile_byte_cap_1664.py` | `from ._tiff_surgery import ...` -> `from ._helpers.tiff_surgery import ...` | -| `test_gpu_tile_byte_cap_2026_05_18.py` | `from ._tiff_surgery import ...` -> `from ._helpers.tiff_surgery import ...` | - -All other test files import `make_minimal_tiff`, `gpu_available`, and -the `requires_*` markers from `conftest.py` (or -`xrspatial.geotiff.tests.conftest`), which now re-exports them. No -further changes needed. - -## VRT missing-sources cluster - -### `test_vrt_missing_sources_policy_1799.py` (deleted) - -| Old `file::test` | New `file::test_id` | Notes | -|---|---|---| -| `test_vrt_missing_sources_policy_1799.py::test_read_vrt_missing_sources_warns_and_records_hole` | `vrt/test_missing_sources.py::TestWarnPolicyEmitsWarningAndFillsNodata::test_eager_byte_warn_records_hole` | Byte-band variant carried over verbatim. Asserts on the `"could not be read"` message and `vrt_holes`. | -| `test_vrt_missing_sources_policy_1799.py::test_read_vrt_missing_sources_raise_fails_fast` | `vrt/test_missing_sources.py::TestExplicitRaisePolicy::test_eager_byte_explicit_raise` | Byte-band variant. Renamed; assertion unchanged (raises `OSError` or `ValueError`). | -| `test_vrt_missing_sources_policy_1799.py::test_read_vrt_missing_sources_validates_policy` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_eager_byte_invalid_policy` | Byte-band invalid-policy smoke check. Parametrised matrix below covers more bad values across both readers; this stays as a literal port to keep byte-band coverage. 
| - -### `test_vrt_missing_sources_policy_2367.py` (deleted) - -| Old `file::test` | New `file::test_id` | Notes | -|---|---|---| -| `test_vrt_missing_sources_policy_2367.py::TestDefaultPolicyRaises::test_default_raises_filenotfound_naming_source[eager_read_vrt]` | `vrt/test_missing_sources.py::TestDefaultPolicyRaises::test_default_raises_filenotfound_naming_source[eager]` | Renamed reader id from `eager_read_vrt` to `eager`. Filename in the VRT helper renamed `missing_2367.tif` -> `missing_source.tif` (no issue numbers in fixtures). | -| `test_vrt_missing_sources_policy_2367.py::TestDefaultPolicyRaises::test_default_raises_filenotfound_naming_source[dask_open_geotiff_chunks]` | `vrt/test_missing_sources.py::TestDefaultPolicyRaises::test_default_raises_filenotfound_naming_source[dask]` | Renamed reader id; same coverage. | -| `test_vrt_missing_sources_policy_2367.py::TestExplicitRaisePolicy::test_explicit_raise_matches_default[eager_read_vrt]` | `vrt/test_missing_sources.py::TestExplicitRaisePolicy::test_explicit_raise_matches_default[eager]` | Renamed reader id. | -| `test_vrt_missing_sources_policy_2367.py::TestExplicitRaisePolicy::test_explicit_raise_matches_default[dask_open_geotiff_chunks]` | `vrt/test_missing_sources.py::TestExplicitRaisePolicy::test_explicit_raise_matches_default[dask]` | Renamed reader id. | -| `test_vrt_missing_sources_policy_2367.py::TestWarnPolicyEmitsWarningAndFillsNodata::test_eager_warn_emits_and_fills` | `vrt/test_missing_sources.py::TestWarnPolicyEmitsWarningAndFillsNodata::test_eager_warn_emits_and_fills` | Body unchanged except missing-source filename rename. | -| `test_vrt_missing_sources_policy_2367.py::TestWarnPolicyEmitsWarningAndFillsNodata::test_dask_warn_emits_at_compute_and_fills` | `vrt/test_missing_sources.py::TestWarnPolicyEmitsWarningAndFillsNodata::test_dask_warn_emits_at_compute_and_fills` | Body unchanged except missing-source filename rename. 
| -| `test_vrt_missing_sources_policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[eager_read_vrt-ignore]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[ignore-eager]` | Reader id renamed; parametrize order unchanged. | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[eager_read_vrt-RAISE]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[RAISE-eager]` | | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[eager_read_vrt-raises]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[raises-eager]` | | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[eager_read_vrt-]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[-eager]` | Empty-string bad value. | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[eager_read_vrt-warn ]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[warn -eager]` | Trailing-space bad value. 
| -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[eager_read_vrt-1]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[1-eager]` | | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[dask_open_geotiff_chunks-ignore]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[ignore-dask]` | | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[dask_open_geotiff_chunks-RAISE]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[RAISE-dask]` | | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[dask_open_geotiff_chunks-raises]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[raises-dask]` | | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[dask_open_geotiff_chunks-]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[-dask]` | | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[dask_open_geotiff_chunks-warn ]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[warn -dask]` | | -| `..._policy_2367.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[dask_open_geotiff_chunks-1]` | `vrt/test_missing_sources.py::TestInvalidPolicyRejected::test_invalid_policy_raises_value_error_naming_value[1-dask]` | | - -### Files NOT folded in (justified) - -| File | Reason left in place | -|---|---| -| `test_vrt_missing_sources_default_raise_1843.py` | Different surface area: tests the 
*internal* `xrspatial.geotiff._vrt.read_vrt` entry point (not the public `xrspatial.geotiff.read_vrt`), plus the `XRSPATIAL_GEOTIFF_STRICT=1` env-var override. Neither is in the public-API matrix covered by `vrt/test_missing_sources.py`. A future PR that consolidates the strict-env-var coverage can fold this in then. | - -## Verification - -- Old eager byte coverage: 3 tests preserved (warn / raise / invalid). -- Old eager+dask float coverage: 16 parametrised cases preserved (default x2, explicit raise x2, warn x2, invalid 6 bad values x 2 readers). -- Net file delta: 3 files deleted (`_tiff_surgery.py`, `test_vrt_missing_sources_policy_1799.py`, `test_vrt_missing_sources_policy_2367.py`); 6 files added (`_helpers/{__init__,tiff_builders,tiff_surgery,markers}.py`, `vrt/{__init__,test_missing_sources}.py`). Of those, 2 `test_*.py` files are removed and 1 `test_*.py` file is added under `vrt/`, so `find xrspatial/geotiff/tests -name 'test_*.py' | wc -l` goes from 357 to 356 (net -1). The spec called for a drop of 2; the new VRT module replaces both old files but is itself a `test_*.py` file, so consolidation by definition lands one net deletion per cluster. diff --git a/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md b/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md new file mode 100644 index 00000000..af276240 --- /dev/null +++ b/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md @@ -0,0 +1,130 @@ +# CLUSTER_AUDIT_PR2.md — VRT validation cluster + +Temporary audit table tracking every old `file::test` and where it lands +in the consolidated `vrt/test_validation.py`. Deleted in a follow-up +commit on the same branch before merge per the epic #2390 contract. 
+ +## File mapping summary + +| Old file | New file | Status | +|---|---|---| +| `test_vrt_validation_2321.py` | `vrt/test_validation.py` | folded | +| `test_vrt_capability_validator_2371.py` | `vrt/test_validation.py` | folded | +| `test_vrt_unsupported_2370.py` | `vrt/test_validation.py` | folded | +| `test_vrt_narrow_except_1670.py` | `vrt/test_validation.py` | folded | +| `test_vrt_path_containment_1671.py` | `vrt/test_validation.py` | folded | + +## Test mapping (old → new) + +### From `test_vrt_validation_2321.py` + +| Old test | New location | Notes | +|---|---|---| +| `test_vrt_unsupported_error_is_geotiff_metadata_error` | `test_vrt_unsupported_error_subclass_contract` | identical assertions | +| `test_zero_bands_raises_vrt_unsupported` | `TestValidatorRules::test_zero_bands_rejected` | identical assertion | +| `test_zero_bands_parity_across_entry_points` | `test_zero_bands_parity_across_entry_points` | helper now `_expect_same_error` | +| `test_complex_dtype_band_rejected_by_validator` | `TestValidatorRules::test_complex_dtype_band_rejected` | identical assertions | +| `test_rotated_transform_rejected_without_opt_in` | `TestValidatorRules::test_rotated_transform_rejected_without_opt_in` | both opt-out and opt-in paths preserved | +| `test_negative_src_rect_size_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[negative-src-size]]` | parametrised | +| `test_negative_src_rect_offset_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[negative-src-offset]]` | parametrised | +| `test_negative_dst_rect_size_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[negative-dst-size]]` | parametrised | +| `test_dst_rect_outside_vrt_extent_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[dst-outside-extent]]` | parametrised | +| `test_zero_pixel_size_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[zero-pixel-size]]` | parametrised | +| 
`test_unsupported_resample_alg_rejected_at_validate` | `TestValidatorRules::test_unsupported_resample_alg_rejected` | identical assertion | +| `test_mixed_band_nodata_rejected_without_opt_in` | `test_mixed_band_nodata_rejected_without_opt_in` | identical assertions | +| `test_unparseable_crs_rejected_without_opt_in` | `TestValidatorRules::test_unparseable_crs_rejected_without_opt_in` | both opt-out and opt-in paths preserved | +| `test_resample_parity_across_entry_points` | `test_resample_parity_across_entry_points` | uses `_expect_same_error` | +| `test_rotated_parity_across_entry_points` | `test_rotated_parity_across_entry_points` | uses `_expect_same_error` | +| `test_unsupported_resample_chunked_raises_at_build` | `test_unsupported_resample_chunked_raises_at_build` | identical assertion | +| `test_well_formed_vrt_validates_silently` | `TestValidatorRules::test_well_formed_vrt_validates_silently` | identical assertions | + +### From `test_vrt_capability_validator_2371.py` + +| Old test | New location | Notes | +|---|---|---| +| `test_validate_vrt_capability_alias_resolves_to_validate_parsed_vrt` | `test_validate_vrt_capability_is_validate_parsed_vrt` | identical assertion | +| `test_nested_vrt_rejected_at_validator` | `test_nested_vrt_message_names_outer_and_inner` | identical assertions (message, outer path, inner basename, keyword) | +| `test_nested_vrt_uppercase_extension_rejected` | `test_nested_vrt_uppercase_extension_rejected` | identical assertion | +| `test_nested_vrt_rejected_via_public_read_vrt` | `test_nested_vrt_rejected_via_entry_points[entry[package-read_vrt]]` | parametrised entry-point matrix | +| `test_nested_vrt_rejected_via_open_geotiff` | `test_nested_vrt_rejected_via_entry_points[entry[open_geotiff]]` | parametrised | +| `test_nested_vrt_rejected_via_internal_read_vrt` | `test_nested_vrt_rejected_via_entry_points[entry[internal-read_vrt]]` | parametrised | +| `test_warp_options_dataset_level_rejected_at_parse` | 
`test_warp_options_rejected_at_parse[warp[dataset-level]]` | parametrised over dataset / band scope | +| `test_warp_options_dataset_level_rejected_via_public_read_vrt` | `test_warp_options_dataset_rejected_via_entry_points[entry[package-read_vrt]]` | parametrised | +| `test_warp_options_dataset_level_rejected_via_internal_read_vrt` | `test_warp_options_dataset_rejected_via_entry_points[entry[internal-read_vrt]]` | parametrised | +| `test_warp_options_band_level_rejected` | `test_warp_options_rejected_at_parse[warp[band-level]]` | parametrised | +| `test_use_mask_band_true_rejected_at_validator` | `test_use_mask_band_message_names_source` | identical assertions | +| `test_use_mask_band_truthy_spellings_rejected[true/True/TRUE/1]` | `test_use_mask_band_truthy_spellings_rejected[truthy[true]/[True]/[TRUE]/[1]]` | descriptive IDs | +| `test_use_mask_band_false_is_accepted` | `test_use_mask_band_false_is_accepted` | identical | +| `test_use_mask_band_non_canonical_truthy_accepted[yes/on/Y]` | `test_use_mask_band_non_canonical_truthy_accepted[non-canonical[yes]/[on]/[Y]]` | descriptive IDs | +| `test_use_mask_band_rejected_via_public_read_vrt` | `test_use_mask_band_rejected_via_entry_points[entry[package-read_vrt]]` | parametrised | +| `test_use_mask_band_rejected_via_internal_read_vrt` | `test_use_mask_band_rejected_via_entry_points[entry[internal-read_vrt]]` | parametrised | +| `test_per_source_mask_band_rejected_at_validator` | `test_per_source_mask_band_message_names_source` | identical assertions | +| `test_resample_alg_now_rejected_at_internal_read_vrt` | `test_resample_alg_rejected_at_internal_read_vrt` | identical assertion | +| `test_nested_vrt_error_is_value_error` | `test_nested_vrt_error_remains_value_error_subclass` | identical assertions | + +### From `test_vrt_unsupported_2370.py` + +The `_assert_raises_or_xfail` helper from the original file is gone; PR +1's validator landed, so most cases assert directly. 
Two cases that +were already `xfail` in the original (mixed-CRS, mixed-dtype widening) +stay under `pytest.mark.xfail(strict=False)` until the validator delivers +the rejection contract. + +| Old test | New location | Notes | +|---|---|---| +| `test_warped_vrt_subclass_raises` | `test_warped_subclass_band_rejected_via_open_geotiff` | direct assertion (no xfail wrapper) | +| `test_warped_vrt_gdalwarpoptions_raises` | `test_warp_options_rejected_at_parse[warp[dataset-level]]` (and `..._via_entry_points`) | already covered by 2371 fold | +| `test_warped_vrt_open_geotiff_raises` | `test_warped_subclass_band_rejected_via_open_geotiff` | open_geotiff path preserved | +| `test_nested_vrt_source_raises` | `test_nested_vrt_rejected_via_entry_points[entry[package-read_vrt]]` | parametrised matrix | +| `test_nested_vrt_open_geotiff_raises` | `test_nested_vrt_rejected_via_entry_points[entry[open_geotiff]]` | parametrised matrix | +| `test_mixed_source_crs_raises` | `test_mixed_source_crs_rejected` | preserved as `xfail(strict=False)`; same assertion shape | +| `test_mixed_source_dtype_unsupported_complex_raises` | `test_mixed_source_dtype_complex_rejected` | direct assertion | +| `test_mixed_source_dtype_ambiguous_widening_raises` | `test_mixed_source_dtype_ambiguous_widening_rejected` | preserved as `xfail(strict=False)` | +| `test_mixed_source_band_count_raises` | `test_mixed_source_band_count_rejected` | direct assertion | +| `test_complex_mask_source_raises` | `test_dataset_level_mask_band_rejected` | direct assertion | +| `test_unsupported_resample_alg_raises[Bilinear/Cubic/Lanczos/Average/Mode]` | `test_unsupported_resample_alg_rejected_end_to_end[entry[package-read_vrt]-resample[]]` | merged with open_geotiff parametrise | +| `test_unsupported_resample_alg_open_geotiff` | `test_unsupported_resample_alg_rejected_end_to_end[entry[open_geotiff]-resample[cubic]]` | covered by full alg × entry matrix | +| `test_supported_simple_vrt_round_trips_via_open_geotiff` | 
`test_supported_simple_vrt_round_trips_via_open_geotiff` | identical assertion | + +### From `test_vrt_narrow_except_1670.py` + +The matrix is parametrised over exception class × mode rather than one +test per exception. The fixtures `clear_strict_env` and +`set_strict_env` are reused unchanged. + +| Old test | New location | Notes | +|---|---|---| +| `test_runtime_error_propagates_default_mode` | `test_narrow_except_bug_classes_propagate_in_default_mode[bug[runtime-error]]` | parametrised | +| `test_runtime_error_propagates_strict_mode` | `test_narrow_except_runtime_error_propagates_in_strict_mode` | dedicated case | +| `test_file_not_found_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[io[file-not-found]]` | parametrised | +| `test_file_not_found_strict_reraises` | `test_narrow_except_io_or_parse_reraises_in_strict_mode[io[file-not-found]]` | parametrised | +| `test_value_error_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[parse[value-error]]` | parametrised | +| `test_value_error_strict_reraises` | `test_narrow_except_io_or_parse_reraises_in_strict_mode[parse[value-error]]` | parametrised | +| `test_struct_error_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[parse[struct-error]]` | parametrised | +| `test_permission_error_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[io[permission-error]]` | parametrised | +| `test_memory_error_propagates_default_mode` | `test_narrow_except_bug_classes_propagate_in_default_mode[bug[memory-error]]` | parametrised | +| `test_zlib_error_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[codec[zlib-error]]` | parametrised | +| `test_zlib_error_strict_reraises` | `test_narrow_except_io_or_parse_reraises_in_strict_mode[codec[zlib-error]]` | parametrised | +| `test_zstd_error_warns_and_continues_if_available` | `test_narrow_except_zstd_error_warns_in_default_mode` | kept as standalone; 
`pytest.importorskip` replaced with module-level `skipif(not _has_zstandard())` | +| (new) | `test_narrow_except_zstd_error_reraises_in_strict_mode` | added strict-mode case for parity with zlib (closes the matrix; previously only the warn path was covered for zstd) | + +### From `test_vrt_path_containment_1671.py` + +Folded into two classes (`TestPathContainment`, `TestPathContainmentAllowlist`). +The `_clear_allowlist_env` autouse fixture is replaced by an explicit +`clear_allowlist_env` fixture on the non-allowlist tests so the +allowlist class can set the env var via `monkeypatch.setenv` without a +race against the autouse delenv. + +| Old test | New location | Notes | +|---|---|---| +| `test_relative_source_with_dotdot_traversal_rejected` | `TestPathContainment::test_relative_dotdot_traversal_rejected` | identical assertion | +| `test_relative_source_symlink_traversal_rejected` | `TestPathContainment::test_relative_symlink_traversal_rejected` | identical assertion | +| `test_absolute_source_outside_vrt_dir_rejected` | `TestPathContainment::test_absolute_outside_vrt_dir_rejected` | identical assertion | +| `test_absolute_source_inside_vrt_dir_ok` | `TestPathContainment::test_absolute_inside_vrt_dir_ok` | identical assertion | +| `test_absolute_source_allowlisted_root_passes` | `TestPathContainmentAllowlist::test_single_root_allows_outside_absolute` | identical assertion | +| `test_allowlist_supports_multiple_roots` | `TestPathContainmentAllowlist::test_multiple_roots_pathsep_separated` | identical assertion | +| `test_allowlist_does_not_cover_traversal_via_relative_source` | `TestPathContainmentAllowlist::test_relative_source_escape_still_rejected` | identical assertion | +| `test_allowlist_empty_entries_ignored` | `TestPathContainmentAllowlist::test_empty_entries_ignored` | identical assertion | +| `test_normal_relative_source_under_vrt_dir` | `TestPathContainment::test_normal_relative_source_under_vrt_dir_ok` | identical assertion | +| 
`test_error_message_names_rejected_path` | `TestPathContainment::test_error_message_names_rejected_path` | identical assertion | diff --git a/xrspatial/geotiff/tests/test_vrt_capability_validator_2371.py b/xrspatial/geotiff/tests/test_vrt_capability_validator_2371.py deleted file mode 100644 index 6d96cd23..00000000 --- a/xrspatial/geotiff/tests/test_vrt_capability_validator_2371.py +++ /dev/null @@ -1,519 +0,0 @@ -"""Regression tests for issue #2371 (sub-task of epic #2342). - -The centralised VRT capability validator -(``xrspatial.geotiff._vrt_validation.validate_parsed_vrt``, exposed as -``validate_vrt_capability``) now covers four additional rejection -paths and is wired into both the internal ``_vrt.read_vrt`` and the -public ``_backends/vrt.read_vrt`` entry points. The four paths: - -1. Nested VRTs: a ``.vrt`` referenced as a ``SourceFilename`` inside - another VRT. -2. Warped VRTs declaring a ``<GDALWarpOptions>`` block at the dataset - or band level (the band-level ``subClass="VRTWarpedRasterBand"`` - marker is already rejected by the existing parse-time subclass - check). -3. Resample algorithm beyond nearest when SrcRect and DstRect sizes - differ (extended from ``_check_resample_alg_supported`` so the - chunked path also rejects at graph-build time). -4. Complex mask / alpha source semantics: per-source - ``<UseMaskBand>true</UseMaskBand>`` flags and per-source - ``<MaskBand>`` children. - -Each test asserts the rejection fires at validator time (before any -source decode) and that the message names the offending source path -or feature so a caller can locate the bad source without re-parsing -the VRT XML themselves. 
-""" -from __future__ import annotations - -import os - -import numpy as np -import pytest - -from xrspatial.geotiff import open_geotiff -from xrspatial.geotiff._backends.vrt import read_vrt as _public_read_vrt -from xrspatial.geotiff._errors import ( - GeoTIFFAmbiguousMetadataError, - UnsupportedGeoTIFFFeatureError, - VRTUnsupportedError, -) -from xrspatial.geotiff._vrt import parse_vrt -from xrspatial.geotiff._vrt import read_vrt as _internal_read_vrt -from xrspatial.geotiff._vrt_validation import ( - validate_parsed_vrt, - validate_vrt_capability, -) -from xrspatial.geotiff._writer import write - - -# --------------------------------------------------------------------------- -# Test fixtures -# --------------------------------------------------------------------------- - - -def _write_src(tmp_path, name: str = 'src_2371.tif', - shape=(4, 4), dtype=np.uint16) -> str: - """Write a small source TIFF and return its path.""" - arr = np.arange(int(np.prod(shape)), dtype=dtype).reshape(shape) - p = str(tmp_path / name) - write(arr, p, compression='none', tiled=False) - return p - - -def _write_vrt(tmp_path, xml: str, name: str = 'mosaic_2371.vrt') -> str: - """Write a VRT XML to disk and return its path.""" - p = str(tmp_path / name) - with open(p, 'w') as f: - f.write(xml) - return p - - -def _parse(tmp_path, xml: str, name: str = 'mosaic_2371.vrt'): - """Write + parse a VRT XML. Returns ``(path, parsed)``.""" - path = _write_vrt(tmp_path, xml, name) - parsed = parse_vrt(xml, os.path.dirname(os.path.abspath(path))) - return path, parsed - - -# --------------------------------------------------------------------------- -# Public-alias contract -# --------------------------------------------------------------------------- - - -def test_validate_vrt_capability_alias_resolves_to_validate_parsed_vrt(): - """``validate_vrt_capability`` is the public alias matching the - issue text. 
It must resolve to the same underlying callable as - ``validate_parsed_vrt`` so both names share one implementation.""" - assert validate_vrt_capability is validate_parsed_vrt - - -# --------------------------------------------------------------------------- -# Rule: nested VRT (a .vrt referenced as a SourceFilename) -# --------------------------------------------------------------------------- - - -def _nested_vrt_xml(inner_vrt_path: str) -> str: - return f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {inner_vrt_path} - 1 - - - - -""" - - -def test_nested_vrt_rejected_at_validator(tmp_path): - """A ``SimpleSource`` referencing another ``.vrt`` file must raise - ``VRTUnsupportedError`` at validate time with both VRT paths in the - message.""" - # Build an inner VRT that on its own is well-formed, then build an - # outer VRT that references it as a source. - src_path = _write_src(tmp_path) - inner_xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - inner_path = _write_vrt(tmp_path, inner_xml, 'inner_2371.vrt') - - outer_path, parsed = _parse( - tmp_path, _nested_vrt_xml(inner_path), 'outer_2371.vrt' - ) - - with pytest.raises(VRTUnsupportedError) as excinfo: - validate_parsed_vrt(parsed, source=outer_path, mode='read') - msg = str(excinfo.value) - # Outer path appears as the failing VRT. - assert outer_path in msg - # Inner path is named so the caller can locate the bad source. - # ``parse_vrt`` canonicalises source filenames via ``os.path.realpath`` - # so the message carries the realpath form, not the raw string the - # test built. On Windows ``str(tmp_path / name)`` can produce a - # short-name path that differs from the realpath form, so compare - # the basename (the part that survives any normalisation). - assert os.path.basename(inner_path) in msg - # Message names the failure mode. 
- assert 'Nested' in msg or 'nested' in msg - - -def test_nested_vrt_uppercase_extension_rejected(tmp_path): - """``.VRT`` (uppercase) trips the same rejection: extension matching - must be case-insensitive so Windows-style emitters are caught.""" - src_path = _write_src(tmp_path) - inner_xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - inner_path = _write_vrt(tmp_path, inner_xml, 'INNER_2371.VRT') - - outer_path, parsed = _parse( - tmp_path, _nested_vrt_xml(inner_path), 'outer_upper_2371.vrt' - ) - with pytest.raises(VRTUnsupportedError, match='[Nn]ested'): - validate_parsed_vrt(parsed, source=outer_path, mode='read') - - -def test_nested_vrt_rejected_via_public_read_vrt(tmp_path): - """The public ``_backends/vrt.read_vrt`` entry point must surface - the same rejection as the direct validator call.""" - src_path = _write_src(tmp_path) - inner_xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - inner_path = _write_vrt(tmp_path, inner_xml, 'inner_pub_2371.vrt') - outer_path = _write_vrt( - tmp_path, - _nested_vrt_xml(inner_path), - 'outer_pub_2371.vrt', - ) - with pytest.raises(VRTUnsupportedError, match='[Nn]ested'): - _public_read_vrt(outer_path) - - -def test_nested_vrt_rejected_via_open_geotiff(tmp_path): - """The dispatched ``open_geotiff('foo.vrt')`` path runs through the - same backend wrapper and must produce the same rejection.""" - src_path = _write_src(tmp_path) - inner_xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - inner_path = _write_vrt(tmp_path, inner_xml, 'inner_og_2371.vrt') - outer_path = _write_vrt( - tmp_path, - _nested_vrt_xml(inner_path), - 'outer_og_2371.vrt', - ) - with pytest.raises(VRTUnsupportedError, match='[Nn]ested'): - open_geotiff(outer_path) - - -def test_nested_vrt_rejected_via_internal_read_vrt(tmp_path): - """The internal ``_vrt.read_vrt`` is now routed through the - validator too (issue #2371 wires the same gate at both entry - points). 
A direct call must produce the rejection without going - through the public backend wrapper.""" - src_path = _write_src(tmp_path) - inner_xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - inner_path = _write_vrt(tmp_path, inner_xml, 'inner_int_2371.vrt') - outer_path = _write_vrt( - tmp_path, - _nested_vrt_xml(inner_path), - 'outer_int_2371.vrt', - ) - with pytest.raises(VRTUnsupportedError, match='[Nn]ested'): - _internal_read_vrt(outer_path) - - -# --------------------------------------------------------------------------- -# Rule: warped VRT (``<GDALWarpOptions>`` block) -# --------------------------------------------------------------------------- - - -_WARP_DATASET_XML = """ - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - 64.0 - NearestNeighbour - - -""" - - -def test_warp_options_dataset_level_rejected_at_parse(tmp_path): - """A dataset-level ``<GDALWarpOptions>`` block raises - ``UnsupportedGeoTIFFFeatureError`` during ``parse_vrt``. The parser - rejects the element via ``_UNSUPPORTED_DATASET_TAGS`` so callers - that route through the validator still see a typed failure (the - parse step runs first, before the validator is reached).""" - path = _write_vrt(tmp_path, _WARP_DATASET_XML, 'warp_ds_2371.vrt') - with pytest.raises( - UnsupportedGeoTIFFFeatureError, match='GDALWarpOptions' - ): - parse_vrt(_WARP_DATASET_XML, os.path.dirname(path)) - - -def test_warp_options_dataset_level_rejected_via_public_read_vrt(tmp_path): - """The public ``_backends/vrt.read_vrt`` entry point surfaces the - same warp rejection.""" - path = _write_vrt(tmp_path, _WARP_DATASET_XML, 'warp_pub_2371.vrt') - with pytest.raises( - UnsupportedGeoTIFFFeatureError, match='GDALWarpOptions' - ): - _public_read_vrt(path) - - -def test_warp_options_dataset_level_rejected_via_internal_read_vrt(tmp_path): - """The internal ``_vrt.read_vrt`` rejects the same input.
Routing - through the validator preserves the parse-time rejection because - ``parse_vrt`` runs before ``validate_parsed_vrt``.""" - path = _write_vrt(tmp_path, _WARP_DATASET_XML, 'warp_int_2371.vrt') - with pytest.raises( - UnsupportedGeoTIFFFeatureError, match='GDALWarpOptions' - ): - _internal_read_vrt(path) - - -_WARP_BAND_XML = """ - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - NearestNeighbour - - -""" - - -def test_warp_options_band_level_rejected(tmp_path): - """A band-level ``<GDALWarpOptions>`` block (rare but possible - depending on the VRT emitter) is rejected via the band-children - sweep in ``_UNSUPPORTED_BAND_TAGS``.""" - path = _write_vrt(tmp_path, _WARP_BAND_XML, 'warp_band_2371.vrt') - with pytest.raises( - UnsupportedGeoTIFFFeatureError, match='GDALWarpOptions' - ): - parse_vrt(_WARP_BAND_XML, os.path.dirname(path)) - - -# --------------------------------------------------------------------------- -# Rule: per-source mask / alpha semantics -# --------------------------------------------------------------------------- - - -def _use_mask_band_xml(src_path: str, flag: str = 'true') -> str: - return f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - {flag} - - -""" - - -def test_use_mask_band_true_rejected_at_validator(tmp_path): - """A ComplexSource declaring ``<UseMaskBand>true</UseMaskBand>`` - must raise ``VRTUnsupportedError`` at validate time, with the - offending source path in the message.""" - src_path = _write_src(tmp_path) - path, parsed = _parse( - tmp_path, _use_mask_band_xml(src_path), 'use_mask_2371.vrt' - ) - with pytest.raises(VRTUnsupportedError) as excinfo: - validate_parsed_vrt(parsed, source=path, mode='read') - msg = str(excinfo.value) - assert 'UseMaskBand' in msg - assert src_path in msg - - -@pytest.mark.parametrize('flag', ['true', 'True', 'TRUE', '1']) -def test_use_mask_band_truthy_spellings_rejected(tmp_path, flag): - """``<UseMaskBand>`` accepts the case-insensitive ``true`` and the - digit ``1`` -- GDAL writes lowercase ``true``, and the digit form - keeps the parser
tolerant of XML emitters that normalise booleans - to ``1``. Anything else falls outside the truthy set and is - treated as not-mask (see ``test_use_mask_band_non_canonical_*`` - below).""" - src_path = _write_src(tmp_path, name=f'src_flag_{flag}_2371.tif') - path, parsed = _parse( - tmp_path, _use_mask_band_xml(src_path, flag=flag), - f'use_mask_{flag}_2371.vrt', - ) - with pytest.raises(VRTUnsupportedError, match='UseMaskBand'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -def test_use_mask_band_false_is_accepted(tmp_path): - """An explicit ``false`` is a no-op and - must not trip the rejection. GDAL never writes ``false`` itself, - but hand-written VRTs occasionally do.""" - src_path = _write_src(tmp_path, name='src_false_2371.tif') - path, parsed = _parse( - tmp_path, _use_mask_band_xml(src_path, flag='false'), - 'use_mask_false_2371.vrt', - ) - # Must not raise. - validate_parsed_vrt(parsed, source=path, mode='read') - - -@pytest.mark.parametrize('flag', ['yes', 'on', 'Y']) -def test_use_mask_band_non_canonical_truthy_accepted(tmp_path, flag): - """Tokens outside the canonical GDAL set (``true`` / ``1``) are - treated as not-mask. The parser deliberately narrows the truthy - set so a hand-edited VRT using a Python-truthy spelling does not - silently flip the read into the rejection path. If GDAL ever - starts emitting one of these, the set should be widened then.""" - src_path = _write_src(tmp_path, name=f'src_ncf_{flag}_2371.tif') - path, parsed = _parse( - tmp_path, _use_mask_band_xml(src_path, flag=flag), - f'use_mask_ncf_{flag}_2371.vrt', - ) - # Must not raise -- non-canonical token is treated as not-mask. 
- validate_parsed_vrt(parsed, source=path, mode='read') - - -def test_use_mask_band_rejected_via_public_read_vrt(tmp_path): - """End-to-end: the public backend entry point surfaces the same - rejection.""" - src_path = _write_src(tmp_path, name='src_pub_mask_2371.tif') - path = _write_vrt( - tmp_path, _use_mask_band_xml(src_path), 'use_mask_pub_2371.vrt' - ) - with pytest.raises(VRTUnsupportedError, match='UseMaskBand'): - _public_read_vrt(path) - - -def test_use_mask_band_rejected_via_internal_read_vrt(tmp_path): - """End-to-end: the internal entry point also surfaces the rejection - via the validator now that #2371 wires it in.""" - src_path = _write_src(tmp_path, name='src_int_mask_2371.tif') - path = _write_vrt( - tmp_path, _use_mask_band_xml(src_path), 'use_mask_int_2371.vrt' - ) - with pytest.raises(VRTUnsupportedError, match='UseMaskBand'): - _internal_read_vrt(path) - - -def _per_source_mask_band_xml(src_path: str) -> str: - """A ComplexSource with a per-source ``<MaskBand>`` child (distinct - from a dataset-level ``<MaskBand>`` sibling). GDAL emits this when
GDAL emits this when - a source TIFF carries an internal mask band that the VRT wires - through.""" - return f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - {src_path} - 1 - - - -""" - - -def test_per_source_mask_band_rejected_at_validator(tmp_path): - """A per-source ```` child raises ``VRTUnsupportedError`` - at validate time naming the source path.""" - src_path = _write_src(tmp_path, name='src_pmask_2371.tif') - path, parsed = _parse( - tmp_path, _per_source_mask_band_xml(src_path), - 'per_src_mask_2371.vrt', - ) - with pytest.raises(VRTUnsupportedError) as excinfo: - validate_parsed_vrt(parsed, source=path, mode='read') - msg = str(excinfo.value) - assert 'MaskBand' in msg - assert src_path in msg - - -# --------------------------------------------------------------------------- -# Rule: resample alg gate now fires at the internal entry point -# --------------------------------------------------------------------------- - - -def test_resample_alg_now_rejected_at_internal_read_vrt(tmp_path): - """The internal ``_vrt.read_vrt`` was previously not routed through - the validator and surfaced unsupported-resample as a - ``NotImplementedError`` at the placement site. 
After #2371 the - validator preempts that gate so the failure is now a typed - ``VRTUnsupportedError`` at graph build / eager setup.""" - src_path = _write_src(tmp_path, name='src_resample_2371.tif') - xml = f""" - 0.0, 2.0, 0.0, 0.0, 0.0, -2.0 - - - {src_path} - 1 - - - Bilinear - - -""" - path = _write_vrt(tmp_path, xml, 'resample_int_2371.vrt') - with pytest.raises(VRTUnsupportedError, match='Bilinear'): - _internal_read_vrt(path) - - -# --------------------------------------------------------------------------- -# Subclassing contract for the new path -# --------------------------------------------------------------------------- - - -def test_nested_vrt_error_is_value_error(tmp_path): - """``VRTUnsupportedError`` already subclasses ``ValueError`` via - ``GeoTIFFAmbiguousMetadataError``. The nested-VRT path uses the - same class, so ``except ValueError`` keeps catching the new - rejection too.""" - src_path = _write_src(tmp_path, name='src_subclass_2371.tif') - inner_xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - inner_path = _write_vrt(tmp_path, inner_xml, 'inner_sub_2371.vrt') - outer_path, parsed = _parse( - tmp_path, _nested_vrt_xml(inner_path), 'outer_sub_2371.vrt' - ) - with pytest.raises(ValueError): # via VRTUnsupportedError -> ValueError - validate_parsed_vrt(parsed, source=outer_path, mode='read') - with pytest.raises(GeoTIFFAmbiguousMetadataError): - validate_parsed_vrt(parsed, source=outer_path, mode='read') diff --git a/xrspatial/geotiff/tests/test_vrt_narrow_except_1670.py b/xrspatial/geotiff/tests/test_vrt_narrow_except_1670.py deleted file mode 100644 index 95cb81f4..00000000 --- a/xrspatial/geotiff/tests/test_vrt_narrow_except_1670.py +++ /dev/null @@ -1,381 +0,0 @@ -"""Regression tests for #1670: narrow ``except Exception`` in VRT reader. 
- -``read_vrt`` historically wrapped each source read in ``except Exception``, -which under default mode swallowed any subclass of ``Exception`` -- including -``RuntimeError``, ``MemoryError``, and other bugs unrelated to the "source -file is unreadable" fallback the catch was meant to cover. - -This module pins the narrowed contract: - -* I/O errors (``OSError`` and its subclasses), parse errors - (``ValueError``, ``struct.error``), and codec-library decode errors - (``zlib.error`` for deflate; ``zstandard.ZstdError`` when zstandard is - installed) warn-and-continue in default mode and re-raise under - ``XRSPATIAL_GEOTIFF_STRICT=1``. -* Other exception types (``RuntimeError`` here as a stand-in for real bugs) - propagate in both modes so callers and CI see them. -""" -from __future__ import annotations - -import struct -import warnings -import zlib - -import pytest - -from xrspatial.geotiff import GeoTIFFFallbackWarning - - -@pytest.fixture -def clear_strict_env(monkeypatch): - """Ensure XRSPATIAL_GEOTIFF_STRICT is unset for default-mode tests.""" - monkeypatch.delenv('XRSPATIAL_GEOTIFF_STRICT', raising=False) - - -@pytest.fixture -def set_strict_env(monkeypatch): - """Set XRSPATIAL_GEOTIFF_STRICT=1 for strict-mode tests.""" - monkeypatch.setenv('XRSPATIAL_GEOTIFF_STRICT', '1') - - -def _write_simple_vrt(tmp_path, src_path, name='mosaic_1670.vrt'): - """Write a 4x4 VRT pointing at a single source path.""" - vrt_path = tmp_path / name - vrt_path.write_text( - '\n' - ' \n' - ' 0, 1, 0, 0, 0, -1\n' - ' \n' - ' -9999\n' - ' \n' - f' {src_path}' - '\n' - ' 1\n' - ' \n' - ' \n' - ' \n' - ' \n' - '\n' - ) - return vrt_path - - -def _patch_read_to_array(monkeypatch, exc): - """Make ``read_to_array`` inside ``_vrt`` raise ``exc`` on every call. - - The VRT reader does a local ``from ._reader import read_to_array`` inside - ``read_vrt``, so we patch the source attribute and the import will pick up - the stub. 
- """ - from xrspatial.geotiff import _reader - - def _boom(*args, **kwargs): - raise exc - - monkeypatch.setattr(_reader, 'read_to_array', _boom) - - -def test_runtime_error_propagates_default_mode( - clear_strict_env, monkeypatch, tmp_path, -): - """A non-I/O bug (RuntimeError) must NOT be absorbed by the VRT fallback. - - Default mode used to catch ``Exception``, which swallowed real bugs. The - narrowed catch only handles I/O / parse errors, so a RuntimeError raised - from ``read_to_array`` should bubble straight up. - """ - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670.tif' - src_path.write_bytes(b'placeholder') # parse will be replaced by stub - vrt_path = _write_simple_vrt(tmp_path, str(src_path)) - - _patch_read_to_array(monkeypatch, RuntimeError("synthetic 1670")) - - with pytest.raises(RuntimeError, match='synthetic 1670'): - read_vrt(str(vrt_path)) - - -def test_runtime_error_propagates_strict_mode( - set_strict_env, monkeypatch, tmp_path, -): - """Strict mode already propagates everything; double-check RuntimeError.""" - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_strict.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_rt_strict.vrt') - - _patch_read_to_array(monkeypatch, RuntimeError("synthetic 1670 strict")) - - with pytest.raises(RuntimeError, match='synthetic 1670 strict'): - read_vrt(str(vrt_path)) - - -def test_file_not_found_warns_and_continues( - clear_strict_env, monkeypatch, tmp_path, -): - """FileNotFoundError is the canonical "source is unreadable" case.""" - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_fnf.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_fnf.vrt') - - _patch_read_to_array( - monkeypatch, FileNotFoundError("synthetic missing 1670")) - - with warnings.catch_warnings(record=True) as w: - 
warnings.simplefilter('always') - # The public ``read_vrt`` defaults to ``missing_sources='raise'`` - # since #1860; this test pins the warn-and-continue path, which - # is now an explicit opt-in. - da = read_vrt(str(vrt_path), missing_sources='warn') - - # Mosaic still loads (with a hole) and the skipped source is reported. - assert da.shape == (4, 4) - fallback_warnings = [ - x for x in w if issubclass(x.category, GeoTIFFFallbackWarning) - ] - assert len(fallback_warnings) >= 1 - msgs = ' '.join(str(x.message) for x in fallback_warnings) - assert 'VRT source' in msgs - assert 'FileNotFoundError' in msgs - - -def test_file_not_found_strict_reraises( - set_strict_env, monkeypatch, tmp_path, -): - """Strict mode re-raises the FileNotFoundError.""" - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_fnf_strict.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_fnf_strict.vrt') - - _patch_read_to_array( - monkeypatch, FileNotFoundError("synthetic missing 1670 strict")) - - with pytest.raises(FileNotFoundError, match='synthetic missing 1670 strict'): - read_vrt(str(vrt_path)) - - -def test_value_error_warns_and_continues( - clear_strict_env, monkeypatch, tmp_path, -): - """A typed ValueError from parse_header / parse_ifd is a documented case.""" - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_val.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_val.vrt') - - _patch_read_to_array( - monkeypatch, ValueError("bad header 1670")) - - with warnings.catch_warnings(record=True) as w: - warnings.simplefilter('always') - # Opt into the lenient path: the public default is now - # ``missing_sources='raise'`` since #1860. 
- da = read_vrt(str(vrt_path), missing_sources='warn') - - assert da.shape == (4, 4) - fallback_warnings = [ - x for x in w if issubclass(x.category, GeoTIFFFallbackWarning) - ] - assert len(fallback_warnings) >= 1 - msgs = ' '.join(str(x.message) for x in fallback_warnings) - assert 'VRT source' in msgs - assert 'ValueError' in msgs - - -def test_value_error_strict_reraises( - set_strict_env, monkeypatch, tmp_path, -): - """Strict mode re-raises the ValueError.""" - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_val_strict.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_val_strict.vrt') - - _patch_read_to_array( - monkeypatch, ValueError("bad header 1670 strict")) - - with pytest.raises(ValueError, match='bad header 1670 strict'): - read_vrt(str(vrt_path)) - - -def test_struct_error_warns_and_continues( - clear_strict_env, monkeypatch, tmp_path, -): - """struct.error still leaks from some parse paths; catch it too. - - Future PRs may convert these to ValueError, but until then the VRT - reader should be robust to the raw ``struct.error`` exception so a - single malformed source does not abort the whole mosaic. - """ - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_se.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_se.vrt') - - _patch_read_to_array( - monkeypatch, struct.error("short buffer 1670")) - - with warnings.catch_warnings(record=True) as w: - warnings.simplefilter('always') - # Opt into the lenient path: the public default is now - # ``missing_sources='raise'`` since #1860. 
- da = read_vrt(str(vrt_path), missing_sources='warn') - - assert da.shape == (4, 4) - fallback_warnings = [ - x for x in w if issubclass(x.category, GeoTIFFFallbackWarning) - ] - assert len(fallback_warnings) >= 1 - msgs = ' '.join(str(x.message) for x in fallback_warnings) - assert 'VRT source' in msgs - assert 'error' in msgs # struct.error type name is 'error' - - -def test_permission_error_warns_and_continues( - clear_strict_env, monkeypatch, tmp_path, -): - """PermissionError is an OSError subclass, so it goes through the fallback.""" - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_perm.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_perm.vrt') - - _patch_read_to_array( - monkeypatch, PermissionError("denied 1670")) - - with warnings.catch_warnings(record=True) as w: - warnings.simplefilter('always') - # Opt into the lenient path: the public default is now - # ``missing_sources='raise'`` since #1860. - da = read_vrt(str(vrt_path), missing_sources='warn') - - assert da.shape == (4, 4) - fallback_warnings = [ - x for x in w if issubclass(x.category, GeoTIFFFallbackWarning) - ] - assert len(fallback_warnings) >= 1 - - -def test_memory_error_propagates_default_mode( - clear_strict_env, monkeypatch, tmp_path, -): - """MemoryError is a real failure, not an "unreadable source"; propagate.""" - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_mem.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_mem.vrt') - - _patch_read_to_array(monkeypatch, MemoryError("OOM 1670")) - - with pytest.raises(MemoryError, match='OOM 1670'): - read_vrt(str(vrt_path)) - - -def test_zlib_error_warns_and_continues( - clear_strict_env, monkeypatch, tmp_path, -): - """A ``zlib.error`` from a corrupt deflate tile should warn-and-skip. 
- - The narrowed catch in PR #1675 covered only ``(OSError, ValueError, - struct.error)``; ``zlib.error`` is a direct ``Exception`` subclass and - used to abort the whole mosaic when a deflate tile was corrupt. The - follow-up adds codec-library decode errors to the allowlist so the - historical warn-and-skip behaviour is restored for corrupt payloads. - """ - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_zlib.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_zlib.vrt') - - _patch_read_to_array(monkeypatch, zlib.error("synthetic deflate 1670")) - - with warnings.catch_warnings(record=True) as w: - warnings.simplefilter('always') - # Opt into the lenient path: the public default is now - # ``missing_sources='raise'`` since #1860. - da = read_vrt(str(vrt_path), missing_sources='warn') - - assert da.shape == (4, 4) - fallback_warnings = [ - x for x in w if issubclass(x.category, GeoTIFFFallbackWarning) - ] - assert len(fallback_warnings) >= 1 - msgs = ' '.join(str(x.message) for x in fallback_warnings) - assert 'VRT source' in msgs - assert 'error' in msgs # zlib.error type name is 'error' - - -def test_zlib_error_strict_reraises( - set_strict_env, monkeypatch, tmp_path, -): - """Under ``XRSPATIAL_GEOTIFF_STRICT=1`` a corrupt deflate tile re-raises.""" - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_zlib_strict.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_zlib_strict.vrt') - - _patch_read_to_array( - monkeypatch, zlib.error("synthetic deflate 1670 strict")) - - with pytest.raises(zlib.error, match='synthetic deflate 1670 strict'): - read_vrt(str(vrt_path)) - - -def test_zstd_error_warns_and_continues_if_available( - clear_strict_env, monkeypatch, tmp_path, -): - """``zstandard.ZstdError`` from a corrupt ZSTD tile should warn-and-skip. 
- - Skipped when ``zstandard`` is not installed -- the wrapper path that - raises this exception is unreachable in that case and there's no class - to instantiate. See ``_CODEC_DECODE_EXCEPTIONS`` in ``_vrt.py``. - """ - pytest.importorskip('zstandard') - from zstandard import ZstdError - - from xrspatial.geotiff import read_vrt - - src_path = tmp_path / 'src_1670_zstd.tif' - src_path.write_bytes(b'placeholder') - vrt_path = _write_simple_vrt( - tmp_path, str(src_path), name='mosaic_1670_zstd.vrt') - - _patch_read_to_array(monkeypatch, ZstdError("synthetic zstd 1670")) - - with warnings.catch_warnings(record=True) as w: - warnings.simplefilter('always') - # Opt into the lenient path: the public default is now - # ``missing_sources='raise'`` since #1860. - da = read_vrt(str(vrt_path), missing_sources='warn') - - assert da.shape == (4, 4) - fallback_warnings = [ - x for x in w if issubclass(x.category, GeoTIFFFallbackWarning) - ] - assert len(fallback_warnings) >= 1 - msgs = ' '.join(str(x.message) for x in fallback_warnings) - assert 'VRT source' in msgs - assert 'ZstdError' in msgs diff --git a/xrspatial/geotiff/tests/test_vrt_path_containment_1671.py b/xrspatial/geotiff/tests/test_vrt_path_containment_1671.py deleted file mode 100644 index 692125c4..00000000 --- a/xrspatial/geotiff/tests/test_vrt_path_containment_1671.py +++ /dev/null @@ -1,303 +0,0 @@ -"""Regression tests for issue #1671: VRT path-traversal containment. - -The previous fix for path traversal (#1185) called ``os.path.realpath`` -on every ``SourceFilename`` but did not enforce that the resolved path -lived under the VRT directory. A crafted VRT could therefore read any -file the process had access to: ``../../etc/passwd``, a symlink that -escapes ``vrt_dir``, or an absolute path anywhere on disk. 
- -The new behaviour rejects: - -* relative sources (``relativeToVRT='1'``) that resolve outside the VRT - directory -* absolute sources (``relativeToVRT='0'``) that resolve outside both - the VRT directory and any allowlisted root - -Operators that legitimately need cross-directory reads opt in via the -``XRSPATIAL_VRT_ALLOWED_ROOTS`` environment variable -(``os.pathsep``-separated list of directory paths). -""" -from __future__ import annotations - -import os -import uuid - -import numpy as np -import pytest -import xarray as xr - -from xrspatial.geotiff import to_geotiff -from xrspatial.geotiff._vrt import parse_vrt -from xrspatial.geotiff._vrt import read_vrt as _read_vrt_internal - - -def _unique_dir(tmp_path, label: str) -> str: - """Return a sub-path under ``tmp_path`` carrying the issue id + a - uuid so parallel test workers cannot collide on the same name.""" - d = tmp_path / f"vrt_1671_{label}_{uuid.uuid4().hex[:8]}" - d.mkdir() - return str(d) - - -def _write_minimal_tif(path: str) -> None: - """Write a 4x4 float32 GeoTIFF the VRT can reference.""" - arr = np.arange(16, dtype=np.float32).reshape(4, 4) - y = np.linspace(1.0, 0.0, 4) - x = np.linspace(0.0, 1.0, 4) - da = xr.DataArray( - arr, dims=['y', 'x'], - coords={'y': y, 'x': x}, - attrs={'crs': 4326}, - ) - to_geotiff(da, path, compression='none') - - -def _build_vrt(vrt_path: str, source_filename: str, relative: str) -> None: - """Write a 4x4 single-band VRT pointing at *source_filename*. - - ``relative`` is the literal value of the ``relativeToVRT`` attribute - ('0' or '1'). 
- """ - xml = ( - '\n' - ' 0, 1, 0, 0, 0, -1\n' - ' \n' - ' \n' - f' ' - f'{source_filename}\n' - ' 1\n' - ' \n' - ' \n' - ' \n' - ' \n' - '\n' - ) - with open(vrt_path, 'w') as f: - f.write(xml) - - -@pytest.fixture(autouse=True) -def _clear_allowlist_env(monkeypatch): - """Make sure no stray XRSPATIAL_VRT_ALLOWED_ROOTS leaks across tests.""" - monkeypatch.delenv('XRSPATIAL_VRT_ALLOWED_ROOTS', raising=False) - - -# --------------------------------------------------------------------------- -# Relative-source traversal -# --------------------------------------------------------------------------- - - -def test_relative_source_with_dotdot_traversal_rejected(tmp_path): - """A relative source resolving outside ``vrt_dir`` raises ValueError. - - ``parse_vrt`` is the resolution layer; the rejection must happen there - so the dangerous path never reaches ``read_to_array``. - """ - vrt_dir = _unique_dir(tmp_path, "trav_rel") - xml = ( - '\n' - ' \n' - ' \n' - ' ' - '../../../../../etc/passwd\n' - ' 1\n' - ' \n' - ' \n' - ' \n' - ' \n' - '\n' - ) - with pytest.raises(ValueError, match="outside the VRT directory"): - parse_vrt(xml, vrt_dir) - - -def test_relative_source_symlink_traversal_rejected(tmp_path): - """A symlink under ``vrt_dir`` that points outside still gets rejected. - - The check uses ``realpath`` so a symlink-based escape is caught the - same way ``../`` is. - """ - vrt_dir = _unique_dir(tmp_path, "trav_sym") - outside_dir = _unique_dir(tmp_path, "trav_sym_outside") - outside_target = os.path.join(outside_dir, 'secret.tif') - _write_minimal_tif(outside_target) - - # Plant a symlink inside vrt_dir that points to the outside file. - # ``os.symlink`` can fail on Windows CI (requires Developer Mode or - # admin privileges) and on some filesystems, so guard it and skip - # rather than fail the suite on those platforms. 
- sym = os.path.join(vrt_dir, 'inside.tif') - try: - os.symlink(outside_target, sym) - except (OSError, NotImplementedError) as e: - pytest.skip(f"symlink not supported in this environment: {e}") - - vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') - _build_vrt(vrt_path, 'inside.tif', relative='1') - - with pytest.raises(ValueError, match="outside the VRT directory"): - _read_vrt_internal(vrt_path) - - -# --------------------------------------------------------------------------- -# Absolute-source rejection by default -# --------------------------------------------------------------------------- - - -def test_absolute_source_outside_vrt_dir_rejected(tmp_path): - """A SourceFilename pointing at an absolute path outside ``vrt_dir`` - is rejected by default.""" - vrt_dir = _unique_dir(tmp_path, "abs_outside") - outside_dir = _unique_dir(tmp_path, "abs_outside_target") - outside_tif = os.path.join(outside_dir, 'data.tif') - _write_minimal_tif(outside_tif) - - vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') - _build_vrt(vrt_path, outside_tif, relative='0') - - with pytest.raises(ValueError, match="outside the VRT directory"): - _read_vrt_internal(vrt_path) - - -def test_absolute_source_inside_vrt_dir_ok(tmp_path): - """An absolute path that happens to resolve under ``vrt_dir`` passes. - - Mirrors the writer's ``relative=False`` round-trip case: the on-disk - VRT carries the absolute path of a TIFF that still lives next to it. 
- """ - vrt_dir = _unique_dir(tmp_path, "abs_inside") - tif_path = os.path.join(vrt_dir, 'data.tif') - _write_minimal_tif(tif_path) - - vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') - _build_vrt(vrt_path, tif_path, relative='0') - - arr, _ = _read_vrt_internal(vrt_path) - assert arr.shape == (4, 4) - - -# --------------------------------------------------------------------------- -# Allowlist opt-in via XRSPATIAL_VRT_ALLOWED_ROOTS -# --------------------------------------------------------------------------- - - -def test_absolute_source_allowlisted_root_passes(tmp_path, monkeypatch): - """Setting XRSPATIAL_VRT_ALLOWED_ROOTS allows cross-directory reads.""" - vrt_dir = _unique_dir(tmp_path, "allow_vrt") - outside_dir = _unique_dir(tmp_path, "allow_data") - outside_tif = os.path.join(outside_dir, 'data.tif') - _write_minimal_tif(outside_tif) - - vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') - _build_vrt(vrt_path, outside_tif, relative='0') - - monkeypatch.setenv('XRSPATIAL_VRT_ALLOWED_ROOTS', outside_dir) - arr, _ = _read_vrt_internal(vrt_path) - assert arr.shape == (4, 4) - - -def test_allowlist_supports_multiple_roots(tmp_path, monkeypatch): - """An ``os.pathsep``-separated list permits sources under any listed root.""" - vrt_dir = _unique_dir(tmp_path, "multi_vrt") - dir_a = _unique_dir(tmp_path, "multi_a") - dir_b = _unique_dir(tmp_path, "multi_b") - tif_b = os.path.join(dir_b, 'data.tif') - _write_minimal_tif(tif_b) - - vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') - _build_vrt(vrt_path, tif_b, relative='0') - - monkeypatch.setenv( - 'XRSPATIAL_VRT_ALLOWED_ROOTS', os.pathsep.join([dir_a, dir_b])) - arr, _ = _read_vrt_internal(vrt_path) - assert arr.shape == (4, 4) - - -def test_allowlist_does_not_cover_traversal_via_relative_source( - tmp_path, monkeypatch, -): - """A relative source that escapes ``vrt_dir`` is rejected even if - the resolved path happens to land under an allowlisted root. - - Relative paths declare intent to stay inside the VRT directory. 
- Honouring that intent prevents an attacker from chaining an - allowlist entry into a relative-source traversal. - """ - vrt_dir = _unique_dir(tmp_path, "rel_with_allow") - outside_dir = _unique_dir(tmp_path, "rel_with_allow_target") - outside_tif = os.path.join(outside_dir, 'data.tif') - _write_minimal_tif(outside_tif) - - # Build a relative path from vrt_dir to outside_tif. - rel = os.path.relpath(outside_tif, vrt_dir) - - vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') - _build_vrt(vrt_path, rel, relative='1') - - monkeypatch.setenv('XRSPATIAL_VRT_ALLOWED_ROOTS', outside_dir) - with pytest.raises(ValueError, match="outside the VRT directory"): - _read_vrt_internal(vrt_path) - - -def test_allowlist_empty_entries_ignored(tmp_path, monkeypatch): - """Empty entries in the allowlist (from stray separators) are skipped.""" - vrt_dir = _unique_dir(tmp_path, "empty_entry_vrt") - outside_dir = _unique_dir(tmp_path, "empty_entry_data") - outside_tif = os.path.join(outside_dir, 'data.tif') - _write_minimal_tif(outside_tif) - - vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') - _build_vrt(vrt_path, outside_tif, relative='0') - - # Leading/trailing/embedded empty entries should not crash the parser - # or accidentally grant access to ``/``. Build the value with - # ``os.pathsep`` so the test stays cross-platform (``:`` on POSIX, - # ``;`` on Windows). 
- sep = os.pathsep - value = f"{sep}{outside_dir}{sep}{sep}" - monkeypatch.setenv('XRSPATIAL_VRT_ALLOWED_ROOTS', value) - arr, _ = _read_vrt_internal(vrt_path) - assert arr.shape == (4, 4) - - -# --------------------------------------------------------------------------- -# Happy-path regression: existing relative-source under vrt_dir still works -# --------------------------------------------------------------------------- - - -def test_normal_relative_source_under_vrt_dir(tmp_path): - """A plain relative source under the VRT directory still reads fine.""" - vrt_dir = _unique_dir(tmp_path, "happy") - tif_path = os.path.join(vrt_dir, 'data.tif') - _write_minimal_tif(tif_path) - - vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') - _build_vrt(vrt_path, 'data.tif', relative='1') - - arr, _ = _read_vrt_internal(vrt_path) - assert arr.shape == (4, 4) - - -def test_error_message_names_rejected_path(tmp_path): - """The ValueError mentions the offending path so operators can diagnose.""" - vrt_dir = _unique_dir(tmp_path, "msg_check") - xml = ( - '\n' - ' \n' - ' \n' - ' ' - '../../etc/shadow\n' - ' 1\n' - ' \n' - ' \n' - ' \n' - ' \n' - '\n' - ) - with pytest.raises(ValueError) as excinfo: - parse_vrt(xml, vrt_dir) - msg = str(excinfo.value) - # The rejected resolved path should appear somewhere in the message. - assert 'shadow' in msg - # And the trusted root should also be cited. - assert os.path.realpath(vrt_dir) in msg diff --git a/xrspatial/geotiff/tests/test_vrt_unsupported_2370.py b/xrspatial/geotiff/tests/test_vrt_unsupported_2370.py deleted file mode 100644 index 21318653..00000000 --- a/xrspatial/geotiff/tests/test_vrt_unsupported_2370.py +++ /dev/null @@ -1,510 +0,0 @@ -"""Negative coverage for unsupported VRT features (issue #2370, epic #2342). - -The release does not promise full GDAL VRT parity. A VRT that asks for -something outside the implemented subset must fail with a clear, -actionable error rather than silently produce wrong data. 
This module -locks in the rejection contract for the following cases: - -* Warped VRT (```` or a - dataset carrying ````). -* Nested VRT (a ``.vrt`` referenced as ```` inside - another ``.vrt``). -* Mixed source CRS across band sources. -* Mixed source dtype across band sources where the output dtype is - ambiguous. -* Mixed band count across sources. -* Complex mask source semantics that the attrs contract cannot - represent. -* Unsupported resample algorithm (anything outside the implemented - nearest-neighbour subset). - -Each test asserts the exception type AND checks the error message -names the unsupported feature so users can fix the input. Where the -current code already rejects the case, the test locks the behaviour -in. Where the centralized validator from sibling PR #2329 is needed -for the rejection, the assertion is wrapped with ``pytest.xfail`` so -this PR can land independently. - -Coverage spans both ``read_vrt`` and ``open_geotiff(... .vrt ...)`` -entry points -- a missing rejection at either path leaves a release -loophole. - -Note on overlap: the resample-algorithm tests intentionally duplicate -the cases in ``test_vrt_resample_alg_1751.py``. That file is the -regression anchor for the original bug; this file is the -rejection-contract anchor for the release. Keeping them separate -makes the intent of each test file legible to future readers. 
-""" -from __future__ import annotations - -import numpy as np -import pytest -import xarray as xr - -from xrspatial.geotiff import open_geotiff, to_geotiff -from xrspatial.geotiff._vrt import read_vrt - - -# --------------------------------------------------------------------------- -# Helpers -# --------------------------------------------------------------------------- - - -PR1_XFAIL = "depends on centralized validate_vrt_capability (#2329)" - - -def _write_src_tif(tmp_path, *, name: str, dtype=np.float32, - shape=(4, 4)) -> str: - """Write a tiny GeoTIFF with the requested dtype/shape and return the path. - - All filenames include ``2370`` so parallel test workers do not collide - with other rockout worktrees. - """ - arr = np.arange(int(np.prod(shape)), dtype=dtype).reshape(shape) - y = np.linspace(1.0, 0.0, shape[0]) - x = np.linspace(0.0, 1.0, shape[1]) - fill = -9999 if np.issubdtype(arr.dtype, np.integer) else -9999.0 - da = xr.DataArray( - arr, dims=['y', 'x'], - coords={'y': y, 'x': x}, - attrs={'nodata': fill, 'crs': 'EPSG:4326'}, - ) - path = str(tmp_path / f'src_2370_{name}.tif') - to_geotiff(da, path) - return path - - -def _write_vrt(tmp_path, xml: str, name: str) -> str: - """Write ``xml`` to ``tmp_path/.vrt`` and return the path.""" - path = str(tmp_path / f'vrt_2370_{name}.vrt') - with open(path, 'w') as fh: - fh.write(xml) - return path - - -def _simple_source_xml(src_path: str, *, band: int = 1) -> str: - """Render a single ```` block over a 4x4 source. - - All callers in this file use the matched 4x4 SrcRect/DstRect - geometry; specialised geometry (size-changing rects, ResampleAlg) - is built inline by the few tests that need it. 
- """ - return f""" - {src_path} - {band} - - - """ - - -def _vrt_xml(*, width: int = 4, height: int = 4, - dtype_name: str = 'Float32', - body: str = '', - extra_dataset_inner: str = '', - srs: str = 'EPSG:4326') -> str: - """Render a minimal VRT XML wrapper.""" - return f""" - {srs} - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - {extra_dataset_inner} - -{body} - -""" - - -def _assert_raises_or_xfail(exc_types: tuple[type[BaseException], ...], - keywords: tuple[str, ...], - call): - """Run ``call`` and check whether the rejection contract is met. - - The contract: ``call`` raises one of ``exc_types`` AND the lowercased - message contains at least one of ``keywords``. If either half is - missing, mark the test ``xfail`` with the PR1 dependency reason -- - so PR1 lands the validator and this test starts passing without an - edit here. - - ``except Exception`` (not ``BaseException``) on the diagnostic - branch keeps ``KeyboardInterrupt`` and ``SystemExit`` propagating - so a test runner can still be interrupted cleanly. - """ - try: - call() - except exc_types as exc: - msg = str(exc).lower() - if any(k in msg for k in keywords): - return - pytest.xfail( - f"{PR1_XFAIL}: raised {type(exc).__name__} but message " - f"did not name expected keyword ({keywords!r}): {msg!r}") - except Exception as exc: # pragma: no cover -- diagnostic - pytest.xfail( - f"{PR1_XFAIL}: raised unexpected {type(exc).__name__}: {exc!r}") - else: - pytest.xfail(f"{PR1_XFAIL}: call did not raise") - - -# --------------------------------------------------------------------------- -# Group 1 -- Warped VRT -# --------------------------------------------------------------------------- - - -def test_warped_vrt_subclass_raises(tmp_path): - """```` is a warped VRT - and must be rejected: read_vrt has no warp pipeline, so honouring the - band would silently emit unprojected pixels labelled as warped. 
- """ - src_path = _write_src_tif(tmp_path, name='warped_sub') - warped_xml = f""" - EPSG:4326 - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - {src_path} - -""" - vrt_path = _write_vrt(tmp_path, warped_xml, 'warped_subclass') - - # Current behaviour: no centralized validator yet -- the reader - # accepts the band as a no-source band (silent zero fill). PR #2329 - # is the one that turns this into the documented rejection. - _assert_raises_or_xfail( - (ValueError, NotImplementedError, RuntimeError), - ('warp', 'vrtwarped'), - lambda: read_vrt(vrt_path), - ) - - -def test_warped_vrt_gdalwarpoptions_raises(tmp_path): - """A VRT containing ```` is by definition a warped - VRT regardless of the band subClass. Must be rejected at parse or - read time with a message that names ``GDALWarpOptions`` or 'warp'. - """ - src_path = _write_src_tif(tmp_path, name='warpopts') - body = _simple_source_xml(src_path) - warp_options = """ - 6.71089e+07 - NearestNeighbour - Float32 - """ - xml = _vrt_xml(body=body, extra_dataset_inner=warp_options) - vrt_path = _write_vrt(tmp_path, xml, 'warp_options') - - _assert_raises_or_xfail( - (ValueError, NotImplementedError, RuntimeError), - ('warp',), - lambda: read_vrt(vrt_path), - ) - - -def test_warped_vrt_open_geotiff_raises(tmp_path): - """``open_geotiff(... warped.vrt ...)`` must reject too -- the - rejection cannot live only in ``read_vrt`` or callers that go - through the public accessor would slip past the contract. 
- """ - src_path = _write_src_tif(tmp_path, name='warped_og') - warped_xml = f""" - EPSG:4326 - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - {src_path} - -""" - vrt_path = _write_vrt(tmp_path, warped_xml, 'warped_og') - - _assert_raises_or_xfail( - (ValueError, NotImplementedError, RuntimeError), - ('warp', 'vrtwarped'), - lambda: open_geotiff(vrt_path), - ) - - -# --------------------------------------------------------------------------- -# Group 2 -- Nested VRT -# --------------------------------------------------------------------------- - - -def test_nested_vrt_source_raises(tmp_path): - """A VRT whose ```` is itself a ``.vrt`` is a nested - VRT. Resolution semantics are GDAL-specific and not promised here, - so it must raise rather than try to parse the inner ``.vrt`` as a - TIFF (which is what would happen today). - """ - # Inner VRT pointing at a real TIFF. - inner_src = _write_src_tif(tmp_path, name='nested_inner') - inner_body = _simple_source_xml(inner_src) - inner_xml = _vrt_xml(body=inner_body) - inner_vrt_path = _write_vrt(tmp_path, inner_xml, 'nested_inner') - - # Outer VRT that references the inner .vrt. - outer_body = _simple_source_xml(inner_vrt_path) - outer_xml = _vrt_xml(body=outer_body) - outer_vrt_path = _write_vrt(tmp_path, outer_xml, 'nested_outer') - - # Today the read trips a generic TIFF parse error that does not - # name '.vrt' or 'nested'; xfail until PR #2329 lands the validator. 
- _assert_raises_or_xfail( - (ValueError, NotImplementedError, RuntimeError, OSError), - ('.vrt', 'nested'), - lambda: read_vrt(outer_vrt_path), - ) - - -def test_nested_vrt_open_geotiff_raises(tmp_path): - """Same nested-VRT rejection contract through the public entry point.""" - inner_src = _write_src_tif(tmp_path, name='nested_og_inner') - inner_body = _simple_source_xml(inner_src) - inner_xml = _vrt_xml(body=inner_body) - inner_vrt_path = _write_vrt(tmp_path, inner_xml, 'nested_og_inner') - - outer_body = _simple_source_xml(inner_vrt_path) - outer_xml = _vrt_xml(body=outer_body) - outer_vrt_path = _write_vrt(tmp_path, outer_xml, 'nested_og_outer') - - _assert_raises_or_xfail( - (ValueError, NotImplementedError, RuntimeError, OSError), - ('.vrt', 'nested'), - lambda: open_geotiff(outer_vrt_path), - ) - - -# --------------------------------------------------------------------------- -# Group 3 -- Mixed source CRS -# --------------------------------------------------------------------------- - - -def test_mixed_source_crs_raises(tmp_path): - """Two band sources with disagreeing CRS (one EPSG:4326 source, one - EPSG:3857 source) cannot mosaic correctly without reprojection. The - VRT XML itself only carries a dataset-level ````, so the - mismatch only surfaces when the validator opens each source TIFF. - Pinned here so the validator PR (#2329) is the one that has to - deliver it. - """ - src_4326 = _write_src_tif(tmp_path, name='crs_4326') - # Build a second source with a different CRS by writing the TIFF - # through ``to_geotiff`` with a different crs attr. 
- arr = np.arange(16, dtype=np.float32).reshape(4, 4) - y = np.linspace(1.0, 0.0, 4) - x = np.linspace(0.0, 1.0, 4) - da_3857 = xr.DataArray( - arr, dims=['y', 'x'], coords={'y': y, 'x': x}, - attrs={'nodata': -9999.0, 'crs': 'EPSG:3857'}, - ) - src_3857 = str(tmp_path / 'src_2370_crs_3857.tif') - to_geotiff(da_3857, src_3857) - - body = ( - _simple_source_xml(src_4326) - + "\n" - + _simple_source_xml(src_3857) - ) - xml = _vrt_xml(body=body, srs='EPSG:4326') - vrt_path = _write_vrt(tmp_path, xml, 'mixed_crs') - - _assert_raises_or_xfail( - (ValueError, NotImplementedError, RuntimeError), - ('crs', 'srs', 'projection', 'epsg'), - lambda: read_vrt(vrt_path), - ) - - -# --------------------------------------------------------------------------- -# Group 4 -- Mixed source dtype across band sources -# --------------------------------------------------------------------------- - - -def test_mixed_source_dtype_unsupported_complex_raises(tmp_path): - """``dataType="CFloat32"`` (and other complex dtype declarations) - already raises ``ValueError`` per issue #1783 because ``read_vrt`` - has no complex code path. Lock the message in: it must name the - rejected dtype so users know what to change. - """ - src_path = _write_src_tif(tmp_path, name='dtype_complex') - body = _simple_source_xml(src_path) - xml = _vrt_xml(body=body, dtype_name='CFloat32') - vrt_path = _write_vrt(tmp_path, xml, 'complex_dtype') - - with pytest.raises(ValueError, match=r'CFloat32') as excinfo: - read_vrt(vrt_path) - # The message should be actionable: it names the rejected dtype AND - # mentions complex. - assert 'complex' in str(excinfo.value).lower() - - -def test_mixed_source_dtype_ambiguous_widening_raises(tmp_path): - """When two bands declare incompatible dtypes (e.g. ``UInt16`` and - ``Float32``) the current code silently widens the output buffer to - a common dtype. 
That widening is fine for compatible mixes, but the - contract for the release is to reject mixed band dtypes unless the - user opted in. Pinned for PR #2329. - """ - src_u16 = _write_src_tif(tmp_path, name='dtype_u16', dtype=np.uint16) - src_f32 = _write_src_tif(tmp_path, name='dtype_f32', dtype=np.float32) - body_b1 = _simple_source_xml(src_u16) - body_b2 = _simple_source_xml(src_f32) - xml = f""" - EPSG:4326 - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - -{body_b1} - - -{body_b2} - -""" - vrt_path = _write_vrt(tmp_path, xml, 'mixed_dtype') - - _assert_raises_or_xfail( - (ValueError, NotImplementedError), - ('dtype', 'datatype', 'mixed'), - lambda: read_vrt(vrt_path), - ) - - -# --------------------------------------------------------------------------- -# Group 5 -- Mixed band count -# --------------------------------------------------------------------------- - - -def test_mixed_source_band_count_raises(tmp_path): - """When the VRT references sources with disagreeing band counts (a - single-band source feeding into a multi-band band layout, or a - multi-band source feeding into a single-band layout where the - requested ``SourceBand`` does not exist), the read must fail with a - message that names the band-count mismatch rather than silently - decoding the wrong band. - """ - # Single-band source. - single_band_src = _write_src_tif(tmp_path, name='band_single') - # Reference SourceBand=2 against a single-band file -- there is no - # band 2 to read. 
- body = _simple_source_xml(single_band_src, band=2) - xml = _vrt_xml(body=body) - vrt_path = _write_vrt(tmp_path, xml, 'band_count') - - _assert_raises_or_xfail( - (ValueError, IndexError, RuntimeError, NotImplementedError), - ('band',), - lambda: read_vrt(vrt_path), - ) - - -# --------------------------------------------------------------------------- -# Group 6 -- Complex mask source semantics -# --------------------------------------------------------------------------- - - -def test_complex_mask_source_raises(tmp_path): - """A dataset-level ```` declares a per-pixel mask that the - GeoTIFF attrs contract does not represent. Reading the mosaic and - dropping the mask silently would produce a result the caller cannot - distinguish from one with no mask. Must be rejected. - """ - src_path = _write_src_tif(tmp_path, name='mask_src') - mask_src = _write_src_tif(tmp_path, name='mask_msk', dtype=np.uint8) - body = _simple_source_xml(src_path) - mask_block = f""" - - - {mask_src} - 1 - - - - - """ - xml = _vrt_xml(body=body, extra_dataset_inner=mask_block) - vrt_path = _write_vrt(tmp_path, xml, 'mask_band') - - # No rejection today -- the mask is silently dropped. Pin the - # contract for PR #2329. - _assert_raises_or_xfail( - (ValueError, NotImplementedError), - ('mask',), - lambda: read_vrt(vrt_path), - ) - - -# --------------------------------------------------------------------------- -# Group 7 -- Unsupported resample algorithm -# --------------------------------------------------------------------------- - - -@pytest.mark.parametrize('alg', ['Bilinear', 'Cubic', 'Lanczos', - 'Average', 'Mode']) -def test_unsupported_resample_alg_raises(tmp_path, alg): - """A ```` value outside the implemented nearest subset, - paired with size-changing SrcRect/DstRect rects, is rejected with - the algorithm name in the message. 
The legacy resample-site check - (#1751) raises ``NotImplementedError``; the centralised validator - (#2329) raises ``VRTUnsupportedError`` (a ``ValueError`` subclass) - at parse time. Either is a valid rejection so long as the message - names the offending algorithm -- matches the dual-type contract in - ``test_unsupported_resample_alg_open_geotiff``. - """ - src_path = _write_src_tif(tmp_path, name=f'res_{alg.lower()}') - inner = f'{alg}' - # DstRect 2x2 vs SrcRect 4x4 forces the resample path so the alg - # check actually fires. - body = f""" - {src_path} - 1 - - - {inner} - """ - xml = _vrt_xml(width=2, height=2, body=body) - vrt_path = _write_vrt(tmp_path, xml, f'resample_{alg.lower()}') - - with pytest.raises((NotImplementedError, ValueError)) as excinfo: - read_vrt(vrt_path) - msg = str(excinfo.value) - assert alg in msg, f"error must name the rejected algorithm: {msg!r}" - - -def test_unsupported_resample_alg_open_geotiff(tmp_path): - """The same rejection must fire through ``open_geotiff`` -- the - public entry point shares the read path, so a regression there - would mean only the low-level helper is safe. - - Either exception type is acceptable: the legacy resample-site check - raises ``NotImplementedError`` (#1751), and the centralised - validator from #2329 raises ``VRTUnsupportedError`` (a - ``ValueError`` subclass) at parse time. Both are valid rejections - so long as the message names the offending algorithm. - """ - src_path = _write_src_tif(tmp_path, name='res_og') - body = f""" - {src_path} - 1 - - - Cubic - """ - xml = _vrt_xml(width=2, height=2, body=body) - vrt_path = _write_vrt(tmp_path, xml, 'resample_og') - - with pytest.raises((NotImplementedError, ValueError), match='Cubic'): - open_geotiff(vrt_path) - - -# --------------------------------------------------------------------------- -# Multi-entrypoint contract -- the trivial passing-case anchor so the -# file always exercises the public ``open_geotiff`` path too. 
If this -# regresses (e.g. extension dispatch breaks for .vrt) every test above -# becomes ambiguous, so guard it explicitly. -# --------------------------------------------------------------------------- - - -def test_supported_simple_vrt_round_trips_via_open_geotiff(tmp_path): - """Sanity anchor: a supported single-source VRT opens cleanly via - ``open_geotiff``, so the negative tests above are exercising a live - extension-dispatch code path rather than a broken accessor. - """ - src_path = _write_src_tif(tmp_path, name='anchor') - body = _simple_source_xml(src_path) - xml = _vrt_xml(body=body) - vrt_path = _write_vrt(tmp_path, xml, 'anchor') - - da = open_geotiff(vrt_path) - assert da.shape == (4, 4) diff --git a/xrspatial/geotiff/tests/test_vrt_validation_2321.py b/xrspatial/geotiff/tests/test_vrt_validation_2321.py deleted file mode 100644 index ce178556..00000000 --- a/xrspatial/geotiff/tests/test_vrt_validation_2321.py +++ /dev/null @@ -1,488 +0,0 @@ -"""Regression tests for issue #2329 / parent #2321 sub-task 2. - -Centralised VRT capability validation: one negative test per rule the -validator enforces. Every test asserts ``VRTUnsupportedError`` is -raised at validator-time (i.e. before any source decode), and that the -direct ``read_vrt`` entry point and the dispatched -``open_geotiff('foo.vrt')`` entry point produce the same error type -and the same message for the same bad input. - -The validator accepts an already-parsed ``VRTDataset``; the tests -exercise it both directly (with a parsed structure) and through the -two public entry points so the wiring is covered too. 
-""" -from __future__ import annotations - -import numpy as np -import pytest - -from xrspatial.geotiff import open_geotiff -from xrspatial.geotiff._backends.vrt import read_vrt -from xrspatial.geotiff._errors import ( - GeoTIFFAmbiguousMetadataError, - MixedBandMetadataError, - RotatedTransformError, - UnparseableCRSError, - VRTUnsupportedError, -) -from xrspatial.geotiff._vrt import parse_vrt -from xrspatial.geotiff._vrt_validation import validate_parsed_vrt -from xrspatial.geotiff._writer import write - - -def _write_src(tmp_path, name: str = 'src.tif', - shape=(4, 4), dtype=np.uint16) -> str: - """Write a small source TIFF and return its path.""" - arr = np.arange(int(np.prod(shape)), dtype=dtype).reshape(shape) - p = str(tmp_path / name) - write(arr, p, compression='none', tiled=False) - return p - - -def _write_vrt(tmp_path, xml: str, name: str = 'test.vrt') -> str: - p = str(tmp_path / name) - with open(p, 'w') as f: - f.write(xml) - return p - - -def _parse(tmp_path, xml: str, name: str = 'test.vrt'): - """Helper: write + parse a VRT XML, return (path, parsed).""" - import os - path = _write_vrt(tmp_path, xml, name) - parsed = parse_vrt(xml, os.path.dirname(os.path.abspath(path))) - return path, parsed - - -# --------------------------------------------------------------------------- -# Subclassing contract -# --------------------------------------------------------------------------- - - -def test_vrt_unsupported_error_is_geotiff_metadata_error(): - """``VRTUnsupportedError`` must subclass - ``GeoTIFFAmbiguousMetadataError`` (and ``ValueError`` via the base) - so ``except ValueError`` callers keep catching VRT failures.""" - assert issubclass(VRTUnsupportedError, GeoTIFFAmbiguousMetadataError) - assert issubclass(VRTUnsupportedError, ValueError) - - -# --------------------------------------------------------------------------- -# Rule 1: band count sanity (zero bands) -# --------------------------------------------------------------------------- - - 
-_NO_BANDS_VRT = """ - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 -""" - - -def test_zero_bands_raises_vrt_unsupported(tmp_path): - path, parsed = _parse(tmp_path, _NO_BANDS_VRT, 'nobands.vrt') - with pytest.raises(VRTUnsupportedError, match='band'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -def test_zero_bands_parity_across_entry_points(tmp_path): - path = _write_vrt(tmp_path, _NO_BANDS_VRT, 'nobands_parity.vrt') - with pytest.raises(VRTUnsupportedError) as a: - read_vrt(path) - with pytest.raises(VRTUnsupportedError) as b: - open_geotiff(path) - assert type(a.value) is type(b.value) - assert str(a.value) == str(b.value) - - -# --------------------------------------------------------------------------- -# Rule 2: dtype compatibility / unsupported dataType -# (``parse_vrt`` already rejects unknown dtype tokens; the validator -# rejects ``Complex`` placeholders that slip past via a typo'd map. -# The negative we exercise here is per-band dtype that is not in the -# supported numpy kind set: complex bands cannot ride through.) -# --------------------------------------------------------------------------- - - -def test_complex_dtype_band_rejected_by_validator(tmp_path): - """Even if a complex numpy dtype appears on a parsed band (e.g. - via a future _DTYPE_MAP entry), the validator must reject the - band before any read execution begins.""" - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'complex.vrt') - # Patch the parsed band dtype to complex128 to simulate a - # hypothetical map regression. The validator should refuse it. 
- parsed.bands[0].dtype = np.dtype('complex128') - with pytest.raises(VRTUnsupportedError, match='dtype'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -# --------------------------------------------------------------------------- -# Rule 3: transform orientation (rotation / shear) -# --------------------------------------------------------------------------- - - -def test_rotated_transform_rejected_without_opt_in(tmp_path): - """A VRT with non-zero rotation/shear terms in its GeoTransform must - be rejected by the validator unless ``allow_rotated=True`` is - passed.""" - src_path = _write_src(tmp_path) - # GeoTransform: origin_x, res_x, skew_x, origin_y, skew_y, res_y - # Non-zero skew_x / skew_y => rotated. - xml = f""" - 0.0, 1.0, 0.1, 0.0, 0.1, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'rotated.vrt') - # Rotated transforms raise the existing typed - # ``RotatedTransformError`` for backward compatibility with - # callers that ``except RotatedTransformError``; the centralised - # validator's value here is in adding the source path to the - # message and lifting the rejection ahead of any decode. - with pytest.raises(RotatedTransformError, match='rotat'): - validate_parsed_vrt(parsed, source=path, mode='read', - allow_rotated=False) - # Opt-in path returns silently. 
- validate_parsed_vrt(parsed, source=path, mode='read', - allow_rotated=True) - - -# --------------------------------------------------------------------------- -# Rule 4: SrcRect sanity (negative size / negative offset) -# --------------------------------------------------------------------------- - - -def test_negative_src_rect_size_rejected(tmp_path): - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'neg_src.vrt') - with pytest.raises(VRTUnsupportedError, match='SrcRect'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -def test_negative_src_rect_offset_rejected(tmp_path): - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'neg_src_off.vrt') - with pytest.raises(VRTUnsupportedError, match='SrcRect'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -# --------------------------------------------------------------------------- -# Rule 5: DstRect sanity (negative size, plus out-of-VRT-extent rect) -# --------------------------------------------------------------------------- - - -def test_negative_dst_rect_size_rejected(tmp_path): - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'neg_dst.vrt') - with pytest.raises(VRTUnsupportedError, match='DstRect'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -def test_dst_rect_outside_vrt_extent_rejected(tmp_path): - """A DstRect that lands entirely outside the VRT's - ``rasterXSize x rasterYSize`` extent contributes nothing and is a - malformed mosaic. 
Reject up front.""" - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'out_dst.vrt') - with pytest.raises(VRTUnsupportedError, match='DstRect'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -# --------------------------------------------------------------------------- -# Rule 6: pixel-size compatibility (zero pixel size in transform) -# --------------------------------------------------------------------------- - - -def test_zero_pixel_size_rejected(tmp_path): - """A GeoTransform with zero res_x or res_y produces a degenerate - coord array and divides by zero in coord generation. Reject up - front.""" - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 0.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'zero_px.vrt') - with pytest.raises(VRTUnsupportedError, match='pixel size'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -# --------------------------------------------------------------------------- -# Rule 7: unsupported resampling algorithm -# --------------------------------------------------------------------------- - - -def test_unsupported_resample_alg_rejected_at_validate(tmp_path): - """A ComplexSource with a non-nearest resampler and differing - SrcRect/DstRect sizes must be rejected by the validator before - any pixels are read.""" - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 2.0, 0.0, 0.0, 0.0, -2.0 - - - {src_path} - 1 - - - Bilinear - - -""" - path, parsed = _parse(tmp_path, xml, 'bilinear.vrt') - with pytest.raises(VRTUnsupportedError, match='[Rr]esampl'): - validate_parsed_vrt(parsed, source=path, mode='read') - - -# --------------------------------------------------------------------------- -# Rule 8: nodata policy (mixed per-band sentinels without opt-in) -# --------------------------------------------------------------------------- - - -def 
test_mixed_band_nodata_rejected_without_opt_in(tmp_path): - """A two-band VRT with disagreeing per-band ```` must - be rejected unless ``band_nodata='first'`` is passed.""" - src_path = _write_src(tmp_path, name='src_mixed.tif') - xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - 0 - - {src_path} - 1 - - - - - - 9999 - - {src_path} - 1 - - - - -""" - path, _ = _parse(tmp_path, xml, 'mixed_nodata.vrt') - # Mixed per-band sentinels raise the existing typed - # ``MixedBandMetadataError`` (registered via the - # ``validate_read_metadata`` hook). The centralised - # ``validate_parsed_vrt`` delegates the disagreement-detection - # logic to that hook so the ``None``-counts-as-undeclared - # semantics stay canonical in one place. The public entry points - # run both the validator and ``validate_read_metadata``, so the - # rejection still surfaces at boundary time. - with pytest.raises(MixedBandMetadataError): - read_vrt(path) - with pytest.raises(MixedBandMetadataError): - open_geotiff(path) - # Opt-in returns silently. - r = read_vrt(path, band_nodata='first') - assert r.shape == (4, 4, 2) - - -# --------------------------------------------------------------------------- -# Rule 9: CRS compatibility -# (the parsed VRTDataset only carries one ``crs_wkt`` field, so the -# CRS-mismatch case happens when a caller's expected CRS differs from -# the file's. The validator surfaces an unparseable CRS string up -# front unless ``allow_unparseable_crs=True``.) -# --------------------------------------------------------------------------- - - -def test_unparseable_crs_rejected_without_opt_in(tmp_path): - src_path = _write_src(tmp_path) - xml = f""" - GARBAGE_CRS_NOT_A_REAL_WKT - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'badcrs.vrt') - # CRS strings that pyproj cannot resolve raise the existing typed - # ``UnparseableCRSError`` so callers that already - # ``except UnparseableCRSError`` keep working. 
- with pytest.raises(UnparseableCRSError): - validate_parsed_vrt(parsed, source=path, mode='read', - allow_unparseable_crs=False) - # Opt-in path returns silently. - validate_parsed_vrt(parsed, source=path, mode='read', - allow_unparseable_crs=True) - - -# --------------------------------------------------------------------------- -# Entry-point parity: direct read_vrt and open_geotiff produce the same -# rejection (same error type and same message) for the same bad input. -# --------------------------------------------------------------------------- - - -_BILINEAR_XML_TEMPLATE = """ - 0.0, 2.0, 0.0, 0.0, 0.0, -2.0 - - - {src} - 1 - - - Bilinear - - -""" - - -def test_resample_parity_across_entry_points(tmp_path): - src_path = _write_src(tmp_path) - xml = _BILINEAR_XML_TEMPLATE.format(src=src_path) - path = _write_vrt(tmp_path, xml, 'bilinear_parity.vrt') - - with pytest.raises(VRTUnsupportedError) as a: - read_vrt(path) - with pytest.raises(VRTUnsupportedError) as b: - open_geotiff(path) - assert type(a.value) is type(b.value) - assert str(a.value) == str(b.value) - - -def test_rotated_parity_across_entry_points(tmp_path): - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 1.0, 0.1, 0.0, 0.1, -1.0 - - - {src_path} - 1 - - - - -""" - path = _write_vrt(tmp_path, xml, 'rotated_parity.vrt') - - # Both entry points raise the same typed error - # (``RotatedTransformError`` here, since rotated transforms keep - # their pre-existing subclass) with the same message. - with pytest.raises(RotatedTransformError) as a: - read_vrt(path) - with pytest.raises(RotatedTransformError) as b: - open_geotiff(path) - assert type(a.value) is type(b.value) - assert str(a.value) == str(b.value) - - -# --------------------------------------------------------------------------- -# Chunked (dask) path: rejection happens at graph build, not in a chunk -# function. Same error type and message as the eager path. 
-# --------------------------------------------------------------------------- - - -def test_unsupported_resample_chunked_raises_at_build(tmp_path): - """The chunked dispatcher must run the validator before building - the dask graph so the failure surfaces at construction time, not - deep in a chunk function during ``compute()``.""" - src_path = _write_src(tmp_path) - xml = _BILINEAR_XML_TEMPLATE.format(src=src_path) - path = _write_vrt(tmp_path, xml, 'bilinear_chunked.vrt') - - with pytest.raises(VRTUnsupportedError): - read_vrt(path, chunks=2) - - -# --------------------------------------------------------------------------- -# Sanity: a well-formed minimal VRT validates with no exception. -# --------------------------------------------------------------------------- - - -def test_well_formed_vrt_validates_silently(tmp_path): - src_path = _write_src(tmp_path) - xml = f""" - 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 - - - {src_path} - 1 - - - - -""" - path, parsed = _parse(tmp_path, xml, 'good.vrt') - # No exception: - validate_parsed_vrt(parsed, source=path, mode='read') - # And the public entry points read it without raising the - # validator error. - result = read_vrt(path) - assert result.shape == (4, 4) diff --git a/xrspatial/geotiff/tests/vrt/test_validation.py b/xrspatial/geotiff/tests/vrt/test_validation.py new file mode 100644 index 00000000..59974152 --- /dev/null +++ b/xrspatial/geotiff/tests/vrt/test_validation.py @@ -0,0 +1,1469 @@ +"""VRT validator + reader-error contract. + +Consolidates the validator-side VRT cluster: + +* ``test_vrt_validation_2321.py`` -- centralised + ``validate_parsed_vrt`` rule rejections. +* ``test_vrt_capability_validator_2371.py`` -- nested/warped VRT, + ``UseMaskBand`` / per-source ``MaskBand``, internal-entry-point + resample-alg gate, and the ``validate_vrt_capability`` alias. 
+* ``test_vrt_unsupported_2370.py`` -- end-to-end negative coverage + (warped, nested, mixed CRS, mixed dtype, mixed band count, mask, + resample alg) through both public entry points. +* ``test_vrt_narrow_except_1670.py`` -- narrowed-``except`` contract + in ``read_vrt`` for source-read failures (warn-and-continue vs + propagate) under default and ``XRSPATIAL_GEOTIFF_STRICT=1`` modes. +* ``test_vrt_path_containment_1671.py`` -- path-traversal rejection + in ``parse_vrt`` / ``_read_vrt_internal`` and the + ``XRSPATIAL_VRT_ALLOWED_ROOTS`` opt-in. + +Conventions: + +* Parametrise IDs are descriptive (``id="reject[bad-resample-alg]"``); + issue numbers do not appear in test or parametrise names. +* Source-TIFF and VRT filenames carry a uuid suffix so parallel test + workers across worktrees never collide on the same on-disk name. +""" +from __future__ import annotations + +import os +import struct +import uuid +import warnings +import zlib + +import numpy as np +import pytest +import xarray as xr + +from xrspatial.geotiff import ( + GeoTIFFFallbackWarning, + open_geotiff, + to_geotiff, +) +from xrspatial.geotiff._backends.vrt import read_vrt as _public_read_vrt +from xrspatial.geotiff._errors import ( + GeoTIFFAmbiguousMetadataError, + MixedBandMetadataError, + RotatedTransformError, + UnparseableCRSError, + UnsupportedGeoTIFFFeatureError, + VRTUnsupportedError, +) +from xrspatial.geotiff._vrt import parse_vrt +from xrspatial.geotiff._vrt import read_vrt as _internal_read_vrt +from xrspatial.geotiff._vrt_validation import ( + validate_parsed_vrt, + validate_vrt_capability, +) +from xrspatial.geotiff._writer import write + +# ``xrspatial.geotiff.read_vrt`` is the public alias re-exported from the +# package init; the backend module ``_backends.vrt.read_vrt`` is the same +# callable. The 2321 fixtures used the package alias; the 2371 fixtures +# imported the backend module directly. Use one name for the rest of the +# module so the parametrise IDs stay legible. 
+_package_read_vrt = _public_read_vrt + + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + + +def _uniq(prefix: str) -> str: + """Short unique tag so on-disk artefact names never collide across + parallel workers or worktrees.""" + return f"{prefix}_{uuid.uuid4().hex[:8]}" + + +def _write_src_uint16(tmp_path, name: str | None = None, + shape=(4, 4)) -> str: + """Write a small uint16 source TIFF via the low-level ``write`` helper. + + Matches the fixture used in 2321 / 2371: ``write(arr, path, ...)`` + bypasses the higher-level ``to_geotiff`` so the validator tests do + not depend on the writer side adding CRS / nodata metadata that + would influence parser behaviour. + """ + name = name or _uniq("src_uint16") + arr = np.arange(int(np.prod(shape)), dtype=np.uint16).reshape(shape) + path = str(tmp_path / f"{name}.tif") + write(arr, path, compression='none', tiled=False) + return path + + +def _write_src_float32_geotiff(tmp_path, name: str | None = None, + *, dtype=np.float32, shape=(4, 4), + crs='EPSG:4326') -> str: + """Write a small GeoTIFF via the high-level ``to_geotiff`` entry point. + + The 2370 fixtures use this shape so the source carries CRS and + nodata attrs the cross-CRS / mixed-dtype tests need. 
+ """ + name = name or _uniq("src_geo") + arr = np.arange(int(np.prod(shape)), dtype=dtype).reshape(shape) + y = np.linspace(1.0, 0.0, shape[0]) + x = np.linspace(0.0, 1.0, shape[1]) + fill = -9999 if np.issubdtype(arr.dtype, np.integer) else -9999.0 + da = xr.DataArray( + arr, dims=['y', 'x'], + coords={'y': y, 'x': x}, + attrs={'nodata': fill, 'crs': crs}, + ) + path = str(tmp_path / f"{name}.tif") + to_geotiff(da, path) + return path + + +def _write_vrt(tmp_path, xml: str, name: str | None = None) -> str: + """Write VRT XML to disk and return its absolute path.""" + name = name or _uniq("mosaic") + path = str(tmp_path / f"{name}.vrt") + with open(path, 'w') as f: + f.write(xml) + return path + + +def _parse(tmp_path, xml: str, name: str | None = None): + """Write + parse a VRT XML, return ``(path, parsed)``.""" + path = _write_vrt(tmp_path, xml, name) + parsed = parse_vrt(xml, os.path.dirname(os.path.abspath(path))) + return path, parsed + + +def _simple_source_xml(src_path: str, *, band: int = 1) -> str: + """Render one ```` over a matched 4x4 SrcRect/DstRect.""" + return f""" + {src_path} + {band} + + + """ + + +def _vrt_xml(*, width: int = 4, height: int = 4, + dtype_name: str = 'Float32', + body: str = '', + extra_dataset_inner: str = '', + srs: str = 'EPSG:4326') -> str: + """Render a minimal VRT XML wrapper.""" + return f""" + {srs} + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + {extra_dataset_inner} + +{body} + +""" + + +# --------------------------------------------------------------------------- +# Error subclass contracts +# --------------------------------------------------------------------------- + + +def test_vrt_unsupported_error_subclass_contract(): + """``VRTUnsupportedError`` must subclass + ``GeoTIFFAmbiguousMetadataError`` (and ``ValueError`` via the base) + so existing ``except ValueError`` callers keep catching VRT + failures.""" + assert issubclass(VRTUnsupportedError, GeoTIFFAmbiguousMetadataError) + assert issubclass(VRTUnsupportedError, ValueError) 
+ + +def test_validate_vrt_capability_is_validate_parsed_vrt(): + """``validate_vrt_capability`` is the public alias matching the + issue text. It resolves to the same callable as + ``validate_parsed_vrt`` so both names share one implementation.""" + assert validate_vrt_capability is validate_parsed_vrt + + +# --------------------------------------------------------------------------- +# Rule-driven validator rejections. +# +# Each row describes a malformed VRT that ``validate_parsed_vrt`` must +# reject at validate time (before any source decode). The error class +# and a substring of the message are pinned per-rule. +# +# A handful of rules raise typed subclasses (``RotatedTransformError``, +# ``UnparseableCRSError``, ``MixedBandMetadataError``) that callers +# already match against; the parametrise carries the expected class so +# the typed contract stays explicit. +# --------------------------------------------------------------------------- + + +_BAND_TEMPLATE_NO_GEOTRANSFORM = """ +""" + +_ZERO_BANDS_VRT = """ + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 +""" + + +def _rotated_vrt(src_path: str) -> str: + return f""" + 0.0, 1.0, 0.1, 0.0, 0.1, -1.0 + + + {src_path} + 1 + + + + +""" + + +def _negative_src_size_vrt(src_path: str) -> str: + return f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + + {src_path} + 1 + + + + +""" + + +def _negative_src_offset_vrt(src_path: str) -> str: + return f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + + {src_path} + 1 + + + + +""" + + +def _negative_dst_size_vrt(src_path: str) -> str: + return f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + + {src_path} + 1 + + + + +""" + + +def _dst_outside_extent_vrt(src_path: str) -> str: + return f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + + {src_path} + 1 + + + + +""" + + +def _zero_pixel_size_vrt(src_path: str) -> str: + return f""" + 0.0, 0.0, 0.0, 0.0, 0.0, -1.0 + + + {src_path} + 1 + + + + +""" + + +def _bilinear_resample_vrt(src_path: str) -> str: + """ComplexSource with non-nearest resampler and size-changing rects.""" 
+ return f""" + 0.0, 2.0, 0.0, 0.0, 0.0, -2.0 + + + {src_path} + 1 + + + Bilinear + + +""" + + +def _bad_crs_vrt(src_path: str) -> str: + return f""" + GARBAGE_CRS_NOT_A_REAL_WKT + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + + {src_path} + 1 + + + + +""" + + +class TestValidatorRules: + """Each rule of ``validate_parsed_vrt`` gets one negative test that + builds a malformed VRT, parses it, and asserts the validator + rejects it with the expected typed error and message keyword.""" + + def test_zero_bands_rejected(self, tmp_path): + path, parsed = _parse(tmp_path, _ZERO_BANDS_VRT) + with pytest.raises(VRTUnsupportedError, match='band'): + validate_parsed_vrt(parsed, source=path, mode='read') + + def test_complex_dtype_band_rejected(self, tmp_path): + """A complex numpy dtype on a parsed band (e.g. via a future + ``_DTYPE_MAP`` regression) must be refused before any decode + runs.""" + src_path = _write_src_uint16(tmp_path) + xml = f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + +{_simple_source_xml(src_path)} + +""" + path, parsed = _parse(tmp_path, xml) + parsed.bands[0].dtype = np.dtype('complex128') + with pytest.raises(VRTUnsupportedError, match='dtype'): + validate_parsed_vrt(parsed, source=path, mode='read') + + def test_rotated_transform_rejected_without_opt_in(self, tmp_path): + """Rotated transforms raise the existing typed + ``RotatedTransformError`` for backward compatibility; the + validator's value here is adding the source path to the message + and lifting the rejection ahead of any decode. ``allow_rotated`` + is the documented opt-in.""" + src_path = _write_src_uint16(tmp_path) + path, parsed = _parse(tmp_path, _rotated_vrt(src_path)) + with pytest.raises(RotatedTransformError, match='rotat'): + validate_parsed_vrt(parsed, source=path, mode='read', + allow_rotated=False) + # Opt-in is silent. 
+ validate_parsed_vrt(parsed, source=path, mode='read', + allow_rotated=True) + + @pytest.mark.parametrize( + "build_xml, match", + [ + pytest.param(_negative_src_size_vrt, 'SrcRect', + id="reject[negative-src-size]"), + pytest.param(_negative_src_offset_vrt, 'SrcRect', + id="reject[negative-src-offset]"), + pytest.param(_negative_dst_size_vrt, 'DstRect', + id="reject[negative-dst-size]"), + pytest.param(_dst_outside_extent_vrt, 'DstRect', + id="reject[dst-outside-extent]"), + pytest.param(_zero_pixel_size_vrt, 'pixel size', + id="reject[zero-pixel-size]"), + ], + ) + def test_geometry_rules_rejected(self, tmp_path, build_xml, match): + src_path = _write_src_uint16(tmp_path) + path, parsed = _parse(tmp_path, build_xml(src_path)) + with pytest.raises(VRTUnsupportedError, match=match): + validate_parsed_vrt(parsed, source=path, mode='read') + + def test_unsupported_resample_alg_rejected(self, tmp_path): + """A non-nearest resampler paired with size-changing rects must + be rejected by the validator before any pixels are read.""" + src_path = _write_src_uint16(tmp_path) + path, parsed = _parse(tmp_path, _bilinear_resample_vrt(src_path)) + with pytest.raises(VRTUnsupportedError, match='[Rr]esampl'): + validate_parsed_vrt(parsed, source=path, mode='read') + + def test_unparseable_crs_rejected_without_opt_in(self, tmp_path): + """CRS strings pyproj cannot resolve raise the existing typed + ``UnparseableCRSError``. 
``allow_unparseable_crs=True`` is the + documented opt-in.""" + src_path = _write_src_uint16(tmp_path) + path, parsed = _parse(tmp_path, _bad_crs_vrt(src_path)) + with pytest.raises(UnparseableCRSError): + validate_parsed_vrt(parsed, source=path, mode='read', + allow_unparseable_crs=False) + validate_parsed_vrt(parsed, source=path, mode='read', + allow_unparseable_crs=True) + + def test_well_formed_vrt_validates_silently(self, tmp_path): + src_path = _write_src_uint16(tmp_path) + xml = f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + +{_simple_source_xml(src_path)} + +""" + path, parsed = _parse(tmp_path, xml) + validate_parsed_vrt(parsed, source=path, mode='read') + result = _package_read_vrt(path) + assert result.shape == (4, 4) + + +# --------------------------------------------------------------------------- +# Mixed per-band nodata: the rejection lives in ``validate_read_metadata`` +# and surfaces through the public entry points, not through +# ``validate_parsed_vrt`` directly. Pin the cross-entry-point behaviour. +# --------------------------------------------------------------------------- + + +def _mixed_nodata_vrt(src_path: str) -> str: + return f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + 0 +{_simple_source_xml(src_path)} + + + 9999 +{_simple_source_xml(src_path)} + +""" + + +def test_mixed_band_nodata_rejected_without_opt_in(tmp_path): + """Two bands with disagreeing per-band ```` raise + ``MixedBandMetadataError`` unless ``band_nodata='first'`` is + passed. 
The validator delegates disagreement detection to the + ``validate_read_metadata`` hook so the ``None``-counts-as-undeclared + semantics live in one place.""" + src_path = _write_src_uint16(tmp_path) + path = _write_vrt(tmp_path, _mixed_nodata_vrt(src_path)) + with pytest.raises(MixedBandMetadataError): + _package_read_vrt(path) + with pytest.raises(MixedBandMetadataError): + open_geotiff(path) + r = _package_read_vrt(path, band_nodata='first') + assert r.shape == (4, 4, 2) + + +# --------------------------------------------------------------------------- +# Nested VRT: a ``.vrt`` referenced as a ````. +# --------------------------------------------------------------------------- + + +def _nested_outer_xml(inner_vrt_path: str) -> str: + return f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + +{_simple_source_xml(inner_vrt_path)} + +""" + + +def _write_inner_vrt(tmp_path, src_path, *, name: str | None = None) -> str: + name = name or _uniq("inner") + inner_xml = f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + +{_simple_source_xml(src_path)} + +""" + return _write_vrt(tmp_path, inner_xml, name) + + +def test_nested_vrt_message_names_outer_and_inner(tmp_path): + """The validator-level rejection must name both the outer VRT path + (the failing file) and the inner VRT (the bad source). Inner path + comparison uses the basename so it survives ``realpath`` + canonicalisation on Windows-style short paths.""" + src_path = _write_src_uint16(tmp_path) + inner_path = _write_inner_vrt(tmp_path, src_path) + outer_path, parsed = _parse(tmp_path, _nested_outer_xml(inner_path)) + + with pytest.raises(VRTUnsupportedError) as excinfo: + validate_parsed_vrt(parsed, source=outer_path, mode='read') + msg = str(excinfo.value) + assert outer_path in msg + assert os.path.basename(inner_path) in msg + assert 'Nested' in msg or 'nested' in msg + + +def test_nested_vrt_uppercase_extension_rejected(tmp_path): + """``.VRT`` (uppercase) trips the same rejection.
Extension matching + is case-insensitive so Windows-style emitters are caught.""" + src_path = _write_src_uint16(tmp_path) + inner_path = _write_inner_vrt(tmp_path, src_path, + name=_uniq("INNER").upper()) + outer_path, parsed = _parse(tmp_path, _nested_outer_xml(inner_path)) + with pytest.raises(VRTUnsupportedError, match='[Nn]ested'): + validate_parsed_vrt(parsed, source=outer_path, mode='read') + + +@pytest.mark.parametrize( + "reader", + [ + pytest.param(_package_read_vrt, id="entry[package-read_vrt]"), + pytest.param(_internal_read_vrt, id="entry[internal-read_vrt]"), + pytest.param(open_geotiff, id="entry[open_geotiff]"), + ], +) +def test_nested_vrt_rejected_via_entry_points(tmp_path, reader): + """All three public-ish entry points must surface the nested-VRT + rejection identically: the validator is wired at both the package + backend wrapper and the internal ``_vrt.read_vrt`` since #2371.""" + src_path = _write_src_uint16(tmp_path) + inner_path = _write_inner_vrt(tmp_path, src_path) + outer_path = _write_vrt(tmp_path, _nested_outer_xml(inner_path)) + with pytest.raises(VRTUnsupportedError, match='[Nn]ested'): + reader(outer_path) + + +def test_nested_vrt_error_remains_value_error_subclass(tmp_path): + """``VRTUnsupportedError`` keeps subclassing ``ValueError`` via + ``GeoTIFFAmbiguousMetadataError`` so ``except ValueError`` callers + still catch the new rejection path.""" + src_path = _write_src_uint16(tmp_path) + inner_path = _write_inner_vrt(tmp_path, src_path) + outer_path, parsed = _parse(tmp_path, _nested_outer_xml(inner_path)) + with pytest.raises(ValueError): + validate_parsed_vrt(parsed, source=outer_path, mode='read') + with pytest.raises(GeoTIFFAmbiguousMetadataError): + validate_parsed_vrt(parsed, source=outer_path, mode='read') + + +# --------------------------------------------------------------------------- +# Warped VRT: ```` block (dataset and band level) and +# ``subClass="VRTWarpedRasterBand"`` are all rejected. 
+# --------------------------------------------------------------------------- + + +_WARP_DATASET_XML = """ + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + 64.0 + NearestNeighbour + + +""" + +_WARP_BAND_XML = """ + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + + NearestNeighbour + + +""" + + +@pytest.mark.parametrize( + "xml, scope", + [ + pytest.param(_WARP_DATASET_XML, 'dataset', + id="warp[dataset-level]"), + pytest.param(_WARP_BAND_XML, 'band', + id="warp[band-level]"), + ], +) +def test_warp_options_rejected_at_parse(tmp_path, xml, scope): + """Both dataset-level and band-level ```` are + rejected by ``parse_vrt`` itself (via ``_UNSUPPORTED_*_TAGS``), so + callers that route through the validator still see a typed + failure -- parse runs before the validator is reached.""" + path = _write_vrt(tmp_path, xml) + with pytest.raises(UnsupportedGeoTIFFFeatureError, match='GDALWarpOptions'): + parse_vrt(xml, os.path.dirname(path)) + + +@pytest.mark.parametrize( + "reader", + [ + pytest.param(_package_read_vrt, id="entry[package-read_vrt]"), + pytest.param(_internal_read_vrt, id="entry[internal-read_vrt]"), + ], +) +def test_warp_options_dataset_rejected_via_entry_points(tmp_path, reader): + path = _write_vrt(tmp_path, _WARP_DATASET_XML) + with pytest.raises(UnsupportedGeoTIFFFeatureError, match='GDALWarpOptions'): + reader(path) + + +def test_warped_subclass_band_rejected_via_open_geotiff(tmp_path): + """``subClass="VRTWarpedRasterBand"`` is the band-level warped + marker; ``open_geotiff`` must reject it too so callers cannot slip + a warped VRT through the public accessor.""" + src_path = _write_src_float32_geotiff(tmp_path) + warped_xml = f""" + EPSG:4326 + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + {src_path} + +""" + vrt_path = _write_vrt(tmp_path, warped_xml) + with pytest.raises((ValueError, NotImplementedError, RuntimeError, + UnsupportedGeoTIFFFeatureError)): + open_geotiff(vrt_path) + + +# --------------------------------------------------------------------------- +# Per-source mask / alpha 
semantics. +# --------------------------------------------------------------------------- + + +def _use_mask_band_xml(src_path: str, flag: str = 'true') -> str: + return f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + + {src_path} + 1 + + + {flag} + + +""" + + +def test_use_mask_band_message_names_source(tmp_path): + """``true`` raises at validate time with + both the feature name and the offending source path in the + message.""" + src_path = _write_src_uint16(tmp_path) + path, parsed = _parse(tmp_path, _use_mask_band_xml(src_path)) + with pytest.raises(VRTUnsupportedError) as excinfo: + validate_parsed_vrt(parsed, source=path, mode='read') + msg = str(excinfo.value) + assert 'UseMaskBand' in msg + assert src_path in msg + + +@pytest.mark.parametrize( + "flag", + ['true', 'True', 'TRUE', '1'], + ids=lambda f: f"truthy[{f}]", +) +def test_use_mask_band_truthy_spellings_rejected(tmp_path, flag): + """The canonical truthy set is GDAL's ``true`` (any case) plus the + digit ``1``. Hand-edited VRTs occasionally normalise booleans to + ``1`` via XML emitters, so both spellings trip the rejection.""" + src_path = _write_src_uint16(tmp_path) + path, parsed = _parse( + tmp_path, _use_mask_band_xml(src_path, flag=flag), + ) + with pytest.raises(VRTUnsupportedError, match='UseMaskBand'): + validate_parsed_vrt(parsed, source=path, mode='read') + + +def test_use_mask_band_false_is_accepted(tmp_path): + """An explicit ``false`` is a no-op. + GDAL never writes it, but hand-written VRTs occasionally do.""" + src_path = _write_src_uint16(tmp_path) + path, parsed = _parse( + tmp_path, _use_mask_band_xml(src_path, flag='false'), + ) + validate_parsed_vrt(parsed, source=path, mode='read') + + +@pytest.mark.parametrize( + "flag", + ['yes', 'on', 'Y'], + ids=lambda f: f"non-canonical[{f}]", +) +def test_use_mask_band_non_canonical_truthy_accepted(tmp_path, flag): + """Tokens outside the canonical GDAL set are treated as not-mask. 
+ The parser deliberately narrows the truthy set so a hand-edited VRT + using a Python-truthy spelling does not silently flip the read into + the rejection path. If GDAL ever starts emitting one of these, the + set should be widened then.""" + src_path = _write_src_uint16(tmp_path) + path, parsed = _parse( + tmp_path, _use_mask_band_xml(src_path, flag=flag), + ) + validate_parsed_vrt(parsed, source=path, mode='read') + + +@pytest.mark.parametrize( + "reader", + [ + pytest.param(_package_read_vrt, id="entry[package-read_vrt]"), + pytest.param(_internal_read_vrt, id="entry[internal-read_vrt]"), + ], +) +def test_use_mask_band_rejected_via_entry_points(tmp_path, reader): + src_path = _write_src_uint16(tmp_path) + path = _write_vrt(tmp_path, _use_mask_band_xml(src_path)) + with pytest.raises(VRTUnsupportedError, match='UseMaskBand'): + reader(path) + + +def test_per_source_mask_band_message_names_source(tmp_path): + """A per-source ```` child (distinct from a dataset-level + sibling) raises at validate time naming the source path.""" + src_path = _write_src_uint16(tmp_path) + xml = f""" + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + + + {src_path} + 1 + + + + {src_path} + 1 + + + +""" + path, parsed = _parse(tmp_path, xml) + with pytest.raises(VRTUnsupportedError) as excinfo: + validate_parsed_vrt(parsed, source=path, mode='read') + msg = str(excinfo.value) + assert 'MaskBand' in msg + assert src_path in msg + + +# --------------------------------------------------------------------------- +# Entry-point parity: ``read_vrt`` (package) and ``open_geotiff`` produce +# the same typed exception with the same message for the same bad input. 
+# --------------------------------------------------------------------------- + + +def _expect_same_error(path, exc_type): + """Run ``_package_read_vrt`` and ``open_geotiff``, assert both raise + the same type with the same message.""" + with pytest.raises(exc_type) as a: + _package_read_vrt(path) + with pytest.raises(exc_type) as b: + open_geotiff(path) + assert type(a.value) is type(b.value) + assert str(a.value) == str(b.value) + + +def test_zero_bands_parity_across_entry_points(tmp_path): + path = _write_vrt(tmp_path, _ZERO_BANDS_VRT) + _expect_same_error(path, VRTUnsupportedError) + + +def test_resample_parity_across_entry_points(tmp_path): + src_path = _write_src_uint16(tmp_path) + path = _write_vrt(tmp_path, _bilinear_resample_vrt(src_path)) + _expect_same_error(path, VRTUnsupportedError) + + +def test_rotated_parity_across_entry_points(tmp_path): + """Rotated transforms keep their pre-existing + ``RotatedTransformError`` subclass; both entry points raise it + with the same message.""" + src_path = _write_src_uint16(tmp_path) + path = _write_vrt(tmp_path, _rotated_vrt(src_path)) + _expect_same_error(path, RotatedTransformError) + + +def test_unsupported_resample_chunked_raises_at_build(tmp_path): + """The chunked dispatcher runs the validator before building the + dask graph so the failure surfaces at construction time, not deep + in a chunk function during ``compute()``.""" + src_path = _write_src_uint16(tmp_path) + path = _write_vrt(tmp_path, _bilinear_resample_vrt(src_path)) + with pytest.raises(VRTUnsupportedError): + _package_read_vrt(path, chunks=2) + + +def test_resample_alg_rejected_at_internal_read_vrt(tmp_path): + """The internal ``_vrt.read_vrt`` is now routed through the + validator (since #2371). 
A direct call produces the typed + ``VRTUnsupportedError`` at graph build / eager setup rather than the + old ``NotImplementedError`` at the placement site.""" + src_path = _write_src_uint16(tmp_path) + path = _write_vrt(tmp_path, _bilinear_resample_vrt(src_path)) + with pytest.raises(VRTUnsupportedError, match='Bilinear'): + _internal_read_vrt(path) + + +# --------------------------------------------------------------------------- +# End-to-end negative coverage: cases that historically required PR-1's +# centralised validator. After #2329 / #2371 the rejections fire, so the +# xfail wrapper from the 2370 file is gone -- assertions are direct. +# --------------------------------------------------------------------------- + + +def test_mixed_source_dtype_complex_rejected(tmp_path): + """``dataType="CFloat32"`` (complex) raises ``ValueError`` per #1783; + the message must name the rejected dtype AND mention 'complex' so + the rejection is actionable.""" + src_path = _write_src_float32_geotiff(tmp_path) + body = _simple_source_xml(src_path) + xml = _vrt_xml(body=body, dtype_name='CFloat32') + vrt_path = _write_vrt(tmp_path, xml) + with pytest.raises(ValueError, match=r'CFloat32') as excinfo: + _package_read_vrt(vrt_path) + assert 'complex' in str(excinfo.value).lower() + + +def test_mixed_source_band_count_rejected(tmp_path): + """A single-band source referenced via ``SourceBand=2`` cannot read + band 2 (it does not exist). 
Must fail with a message that names the + band rather than silently decoding the wrong one.""" + single_band_src = _write_src_float32_geotiff(tmp_path) + body = _simple_source_xml(single_band_src, band=2) + xml = _vrt_xml(body=body) + vrt_path = _write_vrt(tmp_path, xml) + with pytest.raises((ValueError, IndexError, RuntimeError, + NotImplementedError)) as excinfo: + _package_read_vrt(vrt_path) + assert 'band' in str(excinfo.value).lower() + + +@pytest.mark.parametrize( + "alg", + ['Bilinear', 'Cubic', 'Lanczos', 'Average', 'Mode'], + ids=lambda a: f"resample[{a.lower()}]", +) +@pytest.mark.parametrize( + "reader", + [ + pytest.param(_package_read_vrt, id="entry[package-read_vrt]"), + pytest.param(open_geotiff, id="entry[open_geotiff]"), + ], +) +def test_unsupported_resample_alg_rejected_end_to_end(tmp_path, alg, reader): + """A ```` outside the implemented nearest subset, + paired with size-changing rects, is rejected by both entry points. + Either ``NotImplementedError`` (legacy resample-site check) or a + ``ValueError`` subclass (``VRTUnsupportedError`` from the + centralised validator) is acceptable so long as the message names + the offending algorithm.""" + src_path = _write_src_float32_geotiff(tmp_path) + body = f""" + {src_path} + 1 + + + {alg} + """ + xml = _vrt_xml(width=2, height=2, body=body) + vrt_path = _write_vrt(tmp_path, xml) + + with pytest.raises((NotImplementedError, ValueError)) as excinfo: + reader(vrt_path) + assert alg in str(excinfo.value) + + +def test_dataset_level_mask_band_rejected(tmp_path): + """A dataset-level ```` declares a per-pixel mask the + GeoTIFF attrs contract cannot represent. Reading the mosaic and + silently dropping the mask would produce a result the caller cannot + distinguish from one with no mask. 
Must be rejected with 'mask' in + the message.""" + src_path = _write_src_float32_geotiff(tmp_path) + mask_src = _write_src_float32_geotiff(tmp_path, dtype=np.uint8) + body = _simple_source_xml(src_path) + mask_block = f""" + + + {mask_src} + 1 + + + + + """ + xml = _vrt_xml(body=body, extra_dataset_inner=mask_block) + vrt_path = _write_vrt(tmp_path, xml) + with pytest.raises((ValueError, NotImplementedError)) as excinfo: + _package_read_vrt(vrt_path) + assert 'mask' in str(excinfo.value).lower() + + +# The next two cases were ``xfail`` wrappers in the original 2370 file: +# the rejection contract is documented but the centralised validator +# does not yet name the failure mode in a way that ``except (ValueError, +# NotImplementedError)`` catches with the expected keyword. Keep them +# under ``xfail(strict=False)`` so a future validator change starts +# passing them without an edit here. + + +@pytest.mark.xfail( + reason="mixed source CRS rejection awaits a validator pass that " + "opens each source TIFF", + strict=False, +) +def test_mixed_source_crs_rejected(tmp_path): + """Two band sources with disagreeing CRS (EPSG:4326 vs EPSG:3857) + cannot mosaic correctly without reprojection. 
The VRT XML carries + only a dataset-level ````, so the mismatch only surfaces when + the validator opens each source TIFF.""" + src_4326 = _write_src_float32_geotiff(tmp_path, crs='EPSG:4326') + arr = np.arange(16, dtype=np.float32).reshape(4, 4) + y = np.linspace(1.0, 0.0, 4) + x = np.linspace(0.0, 1.0, 4) + da_3857 = xr.DataArray( + arr, dims=['y', 'x'], coords={'y': y, 'x': x}, + attrs={'nodata': -9999.0, 'crs': 'EPSG:3857'}, + ) + src_3857 = str(tmp_path / f"{_uniq('src_3857')}.tif") + to_geotiff(da_3857, src_3857) + + body = _simple_source_xml(src_4326) + "\n" + _simple_source_xml(src_3857) + xml = _vrt_xml(body=body, srs='EPSG:4326') + vrt_path = _write_vrt(tmp_path, xml) + + with pytest.raises((ValueError, NotImplementedError)) as excinfo: + _package_read_vrt(vrt_path) + msg = str(excinfo.value).lower() + assert any(k in msg for k in ('crs', 'srs', 'projection', 'epsg')) + + +@pytest.mark.xfail( + reason="mixed band dtype rejection still silently widens to a " + "common dtype; the validator needs to honour the contract", + strict=False, +) +def test_mixed_source_dtype_ambiguous_widening_rejected(tmp_path): + """Two bands declaring incompatible dtypes (``UInt16`` and + ``Float32``) silently widen the output buffer today. 
The contract + for the release is to reject mixed band dtypes unless the user opts + in.""" + src_u16 = _write_src_float32_geotiff(tmp_path, dtype=np.uint16) + src_f32 = _write_src_float32_geotiff(tmp_path, dtype=np.float32) + body_b1 = _simple_source_xml(src_u16) + body_b2 = _simple_source_xml(src_f32) + xml = f""" + EPSG:4326 + 0.0, 1.0, 0.0, 0.0, 0.0, -1.0 + +{body_b1} + + +{body_b2} + +""" + vrt_path = _write_vrt(tmp_path, xml) + + with pytest.raises((ValueError, NotImplementedError)) as excinfo: + _package_read_vrt(vrt_path) + msg = str(excinfo.value).lower() + assert any(k in msg for k in ('dtype', 'datatype', 'mixed')) + + +def test_supported_simple_vrt_round_trips_via_open_geotiff(tmp_path): + """Sanity anchor: a supported single-source VRT opens cleanly via + ``open_geotiff``, so the negative tests are exercising a live + extension-dispatch path rather than a broken accessor.""" + src_path = _write_src_float32_geotiff(tmp_path) + body = _simple_source_xml(src_path) + xml = _vrt_xml(body=body) + vrt_path = _write_vrt(tmp_path, xml) + da = open_geotiff(vrt_path) + assert da.shape == (4, 4) + + +# --------------------------------------------------------------------------- +# Reader-error narrowing: ``read_vrt`` historically ``except Exception``-ed +# every source read, swallowing real bugs. After #1670 the catch is +# narrowed to I/O / parse / codec-decode errors only. 
+# +# The matrix is parametrised over exception class x mode: +# * ``io_or_parse`` (default mode): warn-and-continue +# * ``io_or_parse`` (strict mode): re-raise +# * ``bug`` (any mode): propagate +# --------------------------------------------------------------------------- + + +@pytest.fixture +def clear_strict_env(monkeypatch): + """``XRSPATIAL_GEOTIFF_STRICT`` unset for default-mode tests.""" + monkeypatch.delenv('XRSPATIAL_GEOTIFF_STRICT', raising=False) + + +@pytest.fixture +def set_strict_env(monkeypatch): + """``XRSPATIAL_GEOTIFF_STRICT=1`` for strict-mode tests.""" + monkeypatch.setenv('XRSPATIAL_GEOTIFF_STRICT', '1') + + +def _write_simple_vrt(tmp_path, src_path, *, name: str | None = None): + """Write a 4x4 single-source float VRT pointing at ``src_path``.""" + name = name or _uniq("simple") + vrt_path = tmp_path / f"{name}.vrt" + vrt_path.write_text( + '\n' + ' \n' + ' 0, 1, 0, 0, 0, -1\n' + ' \n' + ' -9999\n' + ' \n' + f' {src_path}' + '\n' + ' 1\n' + ' \n' + ' \n' + ' \n' + ' \n' + '\n' + ) + return vrt_path + + +def _patch_read_to_array(monkeypatch, exc): + """Make ``_reader.read_to_array`` raise ``exc`` on every call. + + ``read_vrt`` does a local ``from ._reader import read_to_array`` + inside the function body, so patching the source attribute is + enough -- the import picks up the stub at call time.""" + from xrspatial.geotiff import _reader + + def _boom(*args, **kwargs): + raise exc + + monkeypatch.setattr(_reader, 'read_to_array', _boom) + + +def _has_zstandard(): + try: + import zstandard # noqa: F401 + except ImportError: + return False + return True + + +def _zstd_error_or_skip(): + """Resolve the ``zstandard.ZstdError`` class or skip the case. + + Used at test-collection time. The wrapper path that raises this + exception is unreachable when ``zstandard`` is not installed, so + skipping is the right behaviour rather than synthesising a stub. 
+ """ + if not _has_zstandard(): + pytest.skip("zstandard not installed") + from zstandard import ZstdError + return ZstdError("synthetic zstd") + + +_NARROW_EXCEPT_IO_OR_PARSE_CASES = [ + pytest.param( + FileNotFoundError("synthetic missing"), + 'FileNotFoundError', + id="io[file-not-found]", + ), + pytest.param( + ValueError("bad header"), + 'ValueError', + id="parse[value-error]", + ), + pytest.param( + struct.error("short buffer"), + 'error', # ``struct.error`` type name is ``error`` + id="parse[struct-error]", + ), + pytest.param( + PermissionError("denied"), + 'PermissionError', + id="io[permission-error]", + ), + pytest.param( + zlib.error("synthetic deflate"), + 'error', # ``zlib.error`` type name is ``error`` + id="codec[zlib-error]", + ), +] + + +@pytest.mark.parametrize( + "exc, expected_type_name", + _NARROW_EXCEPT_IO_OR_PARSE_CASES, +) +def test_narrow_except_io_or_parse_warns_in_default_mode( + clear_strict_env, monkeypatch, tmp_path, exc, expected_type_name, +): + """I/O and parse failures warn-and-continue when + ``missing_sources='warn'`` is opted in. 
The warning message names + the source and the underlying exception type.""" + from xrspatial.geotiff import read_vrt + + src_path = tmp_path / f"src_{_uniq('narrow')}.tif" + src_path.write_bytes(b'placeholder') + vrt_path = _write_simple_vrt(tmp_path, str(src_path)) + + _patch_read_to_array(monkeypatch, exc) + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter('always') + da = read_vrt(str(vrt_path), missing_sources='warn') + + assert da.shape == (4, 4) + fallback = [ + x for x in w if issubclass(x.category, GeoTIFFFallbackWarning) + ] + assert fallback + msgs = ' '.join(str(x.message) for x in fallback) + assert 'VRT source' in msgs + assert expected_type_name in msgs + + +@pytest.mark.parametrize( + "exc, _expected_type_name", + _NARROW_EXCEPT_IO_OR_PARSE_CASES, +) +def test_narrow_except_io_or_parse_reraises_in_strict_mode( + set_strict_env, monkeypatch, tmp_path, exc, _expected_type_name, +): + """Strict mode re-raises all I/O / parse / codec-decode failures.""" + from xrspatial.geotiff import read_vrt + + src_path = tmp_path / f"src_{_uniq('strict')}.tif" + src_path.write_bytes(b'placeholder') + vrt_path = _write_simple_vrt(tmp_path, str(src_path)) + + _patch_read_to_array(monkeypatch, exc) + + with pytest.raises(type(exc)): + read_vrt(str(vrt_path)) + + +@pytest.mark.parametrize( + "exc", + [ + pytest.param(RuntimeError("synthetic bug"), id="bug[runtime-error]"), + pytest.param(MemoryError("OOM"), id="bug[memory-error]"), + ], +) +def test_narrow_except_bug_classes_propagate_in_default_mode( + clear_strict_env, monkeypatch, tmp_path, exc, +): + """Non-I/O bugs (``RuntimeError`` here as a stand-in, plus + ``MemoryError``) must propagate even in default mode -- they are + real failures, not "unreadable source" cases.""" + from xrspatial.geotiff import read_vrt + + src_path = tmp_path / f"src_{_uniq('bug')}.tif" + src_path.write_bytes(b'placeholder') + vrt_path = _write_simple_vrt(tmp_path, str(src_path)) + + 
_patch_read_to_array(monkeypatch, exc) + + with pytest.raises(type(exc)): + read_vrt(str(vrt_path)) + + +def test_narrow_except_runtime_error_propagates_in_strict_mode( + set_strict_env, monkeypatch, tmp_path, +): + """Strict mode propagates non-I/O bugs too (double-checks the + runtime-error case under the strict flag).""" + from xrspatial.geotiff import read_vrt + + src_path = tmp_path / f"src_{_uniq('bug_strict')}.tif" + src_path.write_bytes(b'placeholder') + vrt_path = _write_simple_vrt(tmp_path, str(src_path)) + + _patch_read_to_array(monkeypatch, RuntimeError("synthetic strict")) + + with pytest.raises(RuntimeError, match='synthetic strict'): + read_vrt(str(vrt_path)) + + +@pytest.mark.skipif(not _has_zstandard(), + reason="zstandard not installed") +def test_narrow_except_zstd_error_warns_in_default_mode( + clear_strict_env, monkeypatch, tmp_path, +): + """``zstandard.ZstdError`` is the zstd-codec equivalent of + ``zlib.error``: warn-and-continue under the lenient policy so a + single corrupt ZSTD tile does not abort the whole mosaic.""" + from zstandard import ZstdError + + from xrspatial.geotiff import read_vrt + + src_path = tmp_path / f"src_{_uniq('zstd')}.tif" + src_path.write_bytes(b'placeholder') + vrt_path = _write_simple_vrt(tmp_path, str(src_path)) + + _patch_read_to_array(monkeypatch, ZstdError("synthetic zstd")) + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter('always') + da = read_vrt(str(vrt_path), missing_sources='warn') + + assert da.shape == (4, 4) + fallback = [ + x for x in w if issubclass(x.category, GeoTIFFFallbackWarning) + ] + assert fallback + msgs = ' '.join(str(x.message) for x in fallback) + assert 'VRT source' in msgs + assert 'ZstdError' in msgs + + +@pytest.mark.skipif(not _has_zstandard(), + reason="zstandard not installed") +def test_narrow_except_zstd_error_reraises_in_strict_mode( + set_strict_env, monkeypatch, tmp_path, +): + from zstandard import ZstdError + + from xrspatial.geotiff import 
read_vrt + + src_path = tmp_path / f"src_{_uniq('zstd_strict')}.tif" + src_path.write_bytes(b'placeholder') + vrt_path = _write_simple_vrt(tmp_path, str(src_path)) + + _patch_read_to_array(monkeypatch, ZstdError("synthetic zstd strict")) + + with pytest.raises(ZstdError, match='synthetic zstd strict'): + read_vrt(str(vrt_path)) + + +# --------------------------------------------------------------------------- +# Path containment: ``parse_vrt`` rejects sources that resolve outside +# ``vrt_dir`` unless the absolute path lands under one of the +# ``XRSPATIAL_VRT_ALLOWED_ROOTS`` entries. +# --------------------------------------------------------------------------- + + +@pytest.fixture +def clear_allowlist_env(monkeypatch): + """Make sure no stray XRSPATIAL_VRT_ALLOWED_ROOTS leaks across tests.""" + monkeypatch.delenv('XRSPATIAL_VRT_ALLOWED_ROOTS', raising=False) + + +def _unique_dir(tmp_path, label: str) -> str: + """Sub-directory carrying a uuid so parallel workers cannot collide.""" + d = tmp_path / f"{label}_{uuid.uuid4().hex[:8]}" + d.mkdir() + return str(d) + + +def _write_minimal_tif(path: str) -> None: + """4x4 float32 GeoTIFF the path-containment tests reference.""" + arr = np.arange(16, dtype=np.float32).reshape(4, 4) + y = np.linspace(1.0, 0.0, 4) + x = np.linspace(0.0, 1.0, 4) + da = xr.DataArray( + arr, dims=['y', 'x'], + coords={'y': y, 'x': x}, + attrs={'crs': 4326}, + ) + to_geotiff(da, path, compression='none') + + +def _build_containment_vrt(vrt_path: str, source_filename: str, + relative: str) -> None: + """Write a 4x4 single-band VRT pointing at ``source_filename``.""" + xml = ( + '\n' + ' 0, 1, 0, 0, 0, -1\n' + ' \n' + ' \n' + f' ' + f'{source_filename}\n' + ' 1\n' + ' \n' + ' \n' + ' \n' + ' \n' + '\n' + ) + with open(vrt_path, 'w') as f: + f.write(xml) + + +class TestPathContainment: + """Sources that resolve outside the VRT directory raise + ``ValueError`` at parse / read time. 
The ``allowlist`` opt-in + expands the trusted root set without weakening the relative-source + rejection.""" + + def test_relative_dotdot_traversal_rejected( + self, clear_allowlist_env, tmp_path, + ): + """A relative source resolving outside ``vrt_dir`` raises at + ``parse_vrt`` so the dangerous path never reaches + ``read_to_array``.""" + vrt_dir = _unique_dir(tmp_path, "trav_rel") + xml = ( + '\n' + ' \n' + ' \n' + ' ' + '../../../../../etc/passwd\n' + ' 1\n' + ' \n' + ' \n' + ' \n' + ' \n' + '\n' + ) + with pytest.raises(ValueError, match="outside the VRT directory"): + parse_vrt(xml, vrt_dir) + + def test_relative_symlink_traversal_rejected( + self, clear_allowlist_env, tmp_path, + ): + """A symlink under ``vrt_dir`` pointing outside is rejected via + ``realpath``.""" + vrt_dir = _unique_dir(tmp_path, "trav_sym") + outside_dir = _unique_dir(tmp_path, "trav_sym_outside") + outside_target = os.path.join(outside_dir, 'secret.tif') + _write_minimal_tif(outside_target) + + sym = os.path.join(vrt_dir, 'inside.tif') + try: + os.symlink(outside_target, sym) + except (OSError, NotImplementedError) as e: + pytest.skip(f"symlink not supported in this environment: {e}") + + vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') + _build_containment_vrt(vrt_path, 'inside.tif', relative='1') + + with pytest.raises(ValueError, match="outside the VRT directory"): + _internal_read_vrt(vrt_path) + + def test_absolute_outside_vrt_dir_rejected( + self, clear_allowlist_env, tmp_path, + ): + vrt_dir = _unique_dir(tmp_path, "abs_outside") + outside_dir = _unique_dir(tmp_path, "abs_outside_target") + outside_tif = os.path.join(outside_dir, 'data.tif') + _write_minimal_tif(outside_tif) + + vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') + _build_containment_vrt(vrt_path, outside_tif, relative='0') + + with pytest.raises(ValueError, match="outside the VRT directory"): + _internal_read_vrt(vrt_path) + + def test_absolute_inside_vrt_dir_ok( + self, clear_allowlist_env, tmp_path, + ): + """An 
absolute path that happens to resolve under ``vrt_dir`` + passes -- mirrors the writer's ``relative=False`` round-trip + case.""" + vrt_dir = _unique_dir(tmp_path, "abs_inside") + tif_path = os.path.join(vrt_dir, 'data.tif') + _write_minimal_tif(tif_path) + + vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') + _build_containment_vrt(vrt_path, tif_path, relative='0') + + arr, _ = _internal_read_vrt(vrt_path) + assert arr.shape == (4, 4) + + def test_normal_relative_source_under_vrt_dir_ok( + self, clear_allowlist_env, tmp_path, + ): + """Happy-path regression: a plain relative source under the VRT + directory still reads fine.""" + vrt_dir = _unique_dir(tmp_path, "happy") + tif_path = os.path.join(vrt_dir, 'data.tif') + _write_minimal_tif(tif_path) + + vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') + _build_containment_vrt(vrt_path, 'data.tif', relative='1') + + arr, _ = _internal_read_vrt(vrt_path) + assert arr.shape == (4, 4) + + def test_error_message_names_rejected_path( + self, clear_allowlist_env, tmp_path, + ): + """The ``ValueError`` mentions both the offending resolved path + and the trusted root so operators can diagnose the rejection.""" + vrt_dir = _unique_dir(tmp_path, "msg_check") + xml = ( + '\n' + ' \n' + ' \n' + ' ' + '../../etc/shadow\n' + ' 1\n' + ' \n' + ' \n' + ' \n' + ' \n' + '\n' + ) + with pytest.raises(ValueError) as excinfo: + parse_vrt(xml, vrt_dir) + msg = str(excinfo.value) + assert 'shadow' in msg + assert os.path.realpath(vrt_dir) in msg + + +class TestPathContainmentAllowlist: + """The ``XRSPATIAL_VRT_ALLOWED_ROOTS`` env var opts in cross-directory + reads. 
The opt-in is for absolute sources only -- relative sources + that try to escape ``vrt_dir`` stay rejected even when the resolved + path lands under an allowlisted root.""" + + def test_single_root_allows_outside_absolute(self, tmp_path, monkeypatch): + vrt_dir = _unique_dir(tmp_path, "allow_vrt") + outside_dir = _unique_dir(tmp_path, "allow_data") + outside_tif = os.path.join(outside_dir, 'data.tif') + _write_minimal_tif(outside_tif) + + vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') + _build_containment_vrt(vrt_path, outside_tif, relative='0') + + monkeypatch.setenv('XRSPATIAL_VRT_ALLOWED_ROOTS', outside_dir) + arr, _ = _internal_read_vrt(vrt_path) + assert arr.shape == (4, 4) + + def test_multiple_roots_pathsep_separated(self, tmp_path, monkeypatch): + vrt_dir = _unique_dir(tmp_path, "multi_vrt") + dir_a = _unique_dir(tmp_path, "multi_a") + dir_b = _unique_dir(tmp_path, "multi_b") + tif_b = os.path.join(dir_b, 'data.tif') + _write_minimal_tif(tif_b) + + vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') + _build_containment_vrt(vrt_path, tif_b, relative='0') + + monkeypatch.setenv( + 'XRSPATIAL_VRT_ALLOWED_ROOTS', + os.pathsep.join([dir_a, dir_b]), + ) + arr, _ = _internal_read_vrt(vrt_path) + assert arr.shape == (4, 4) + + def test_relative_source_escape_still_rejected(self, tmp_path, monkeypatch): + """A relative source that escapes ``vrt_dir`` stays rejected even + when the resolved path lands under an allowlisted root. 
+ Relative paths declare intent to stay inside the VRT directory; + honouring that intent prevents chaining an allowlist entry into + a relative-source traversal.""" + vrt_dir = _unique_dir(tmp_path, "rel_with_allow") + outside_dir = _unique_dir(tmp_path, "rel_with_allow_target") + outside_tif = os.path.join(outside_dir, 'data.tif') + _write_minimal_tif(outside_tif) + + rel = os.path.relpath(outside_tif, vrt_dir) + vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') + _build_containment_vrt(vrt_path, rel, relative='1') + + monkeypatch.setenv('XRSPATIAL_VRT_ALLOWED_ROOTS', outside_dir) + with pytest.raises(ValueError, match="outside the VRT directory"): + _internal_read_vrt(vrt_path) + + def test_empty_entries_ignored(self, tmp_path, monkeypatch): + """Leading / trailing / embedded empty entries from stray + separators are skipped (and never grant access to ``/``).""" + vrt_dir = _unique_dir(tmp_path, "empty_entry_vrt") + outside_dir = _unique_dir(tmp_path, "empty_entry_data") + outside_tif = os.path.join(outside_dir, 'data.tif') + _write_minimal_tif(outside_tif) + + vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') + _build_containment_vrt(vrt_path, outside_tif, relative='0') + + sep = os.pathsep + value = f"{sep}{outside_dir}{sep}{sep}" + monkeypatch.setenv('XRSPATIAL_VRT_ALLOWED_ROOTS', value) + arr, _ = _internal_read_vrt(vrt_path) + assert arr.shape == (4, 4) From 21757752da54c9137b67c32e6a887f25b351ca69 Mon Sep 17 00:00:00 2001 From: Brendan Collins Date: Mon, 25 May 2026 20:25:19 -0700 Subject: [PATCH 2/3] Address review nits and suggestions on VRT validation cluster (#2395) * Drop the dead `_zstd_error_or_skip()` helper; the two zstd tests use module-level `skipif(not _has_zstandard())` and import `ZstdError` inline. * Tighten `test_warped_subclass_band_rejected_via_open_geotiff` with a message-keyword assertion (`'warp'` or `'vrtwarped'`) so a regression that raises a generic `ValueError` without naming the failure mode fails the test. 
* Collapse `_unique_dir` onto `_uniq` so the uuid format is in one place. * Drop the alias dance (`_package_read_vrt = _public_read_vrt`); import the backend `read_vrt` directly as `_package_read_vrt`. --- .../geotiff/tests/vrt/test_validation.py | 38 ++++++++----------- 1 file changed, 15 insertions(+), 23 deletions(-) diff --git a/xrspatial/geotiff/tests/vrt/test_validation.py b/xrspatial/geotiff/tests/vrt/test_validation.py index 59974152..25e20e5a 100644 --- a/xrspatial/geotiff/tests/vrt/test_validation.py +++ b/xrspatial/geotiff/tests/vrt/test_validation.py @@ -41,7 +41,7 @@ open_geotiff, to_geotiff, ) -from xrspatial.geotiff._backends.vrt import read_vrt as _public_read_vrt +from xrspatial.geotiff._backends.vrt import read_vrt as _package_read_vrt from xrspatial.geotiff._errors import ( GeoTIFFAmbiguousMetadataError, MixedBandMetadataError, @@ -58,12 +58,10 @@ ) from xrspatial.geotiff._writer import write -# ``xrspatial.geotiff.read_vrt`` is the public alias re-exported from the -# package init; the backend module ``_backends.vrt.read_vrt`` is the same -# callable. The 2321 fixtures used the package alias; the 2371 fixtures -# imported the backend module directly. Use one name for the rest of the -# module so the parametrise IDs stay legible. -_package_read_vrt = _public_read_vrt +# ``xrspatial.geotiff.read_vrt`` (re-exported from the package init) is the +# same callable as ``_backends.vrt.read_vrt``; the parametrise IDs below +# label the backend-module path as the "package" entry point because that +# is what the public alias resolves to. 
# --------------------------------------------------------------------------- @@ -594,7 +592,10 @@ def test_warp_options_dataset_rejected_via_entry_points(tmp_path, reader): def test_warped_subclass_band_rejected_via_open_geotiff(tmp_path): """``subClass="VRTWarpedRasterBand"`` is the band-level warped marker; ``open_geotiff`` must reject it too so callers cannot slip - a warped VRT through the public accessor.""" + a warped VRT through the public accessor. The message has to name + the failure mode (``warp`` or ``vrtwarped``) so a regression that + raises a generic ``ValueError`` without identifying the cause does + not slip past the gate.""" src_path = _write_src_float32_geotiff(tmp_path) warped_xml = f""" EPSG:4326 @@ -605,8 +606,12 @@ def test_warped_subclass_band_rejected_via_open_geotiff(tmp_path): """ vrt_path = _write_vrt(tmp_path, warped_xml) with pytest.raises((ValueError, NotImplementedError, RuntimeError, - UnsupportedGeoTIFFFeatureError)): + UnsupportedGeoTIFFFeatureError)) as excinfo: open_geotiff(vrt_path) + msg = str(excinfo.value).lower() + assert any(k in msg for k in ('warp', 'vrtwarped')), ( + f"warped-VRT rejection must name 'warp' or 'vrtwarped'; got: {msg!r}" + ) # --------------------------------------------------------------------------- @@ -1033,19 +1038,6 @@ def _has_zstandard(): return True -def _zstd_error_or_skip(): - """Resolve the ``zstandard.ZstdError`` class or skip the case. - - Used at test-collection time. The wrapper path that raises this - exception is unreachable when ``zstandard`` is not installed, so - skipping is the right behaviour rather than synthesising a stub. 
- """ - if not _has_zstandard(): - pytest.skip("zstandard not installed") - from zstandard import ZstdError - return ZstdError("synthetic zstd") - - _NARROW_EXCEPT_IO_OR_PARSE_CASES = [ pytest.param( FileNotFoundError("synthetic missing"), @@ -1235,7 +1227,7 @@ def clear_allowlist_env(monkeypatch): def _unique_dir(tmp_path, label: str) -> str: """Sub-directory carrying a uuid so parallel workers cannot collide.""" - d = tmp_path / f"{label}_{uuid.uuid4().hex[:8]}" + d = tmp_path / _uniq(label) d.mkdir() return str(d) From 61287de412c0d72b45fa8861762fc7203e74011d Mon Sep 17 00:00:00 2001 From: Brendan Collins Date: Mon, 25 May 2026 20:26:14 -0700 Subject: [PATCH 3/3] Remove temporary CLUSTER_AUDIT_PR2.md per epic #2390 contract (#2395) The audit table served its purpose during review (mapping every old ``file::test`` to its new home in ``vrt/test_validation.py``). Per the epic, the audit is deleted on the same branch before merge so it does not land on main. --- xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md | 130 ------------------- 1 file changed, 130 deletions(-) delete mode 100644 xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md diff --git a/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md b/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md deleted file mode 100644 index af276240..00000000 --- a/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR2.md +++ /dev/null @@ -1,130 +0,0 @@ -# CLUSTER_AUDIT_PR2.md — VRT validation cluster - -Temporary audit table tracking every old `file::test` and where it lands -in the consolidated `vrt/test_validation.py`. Deleted in a follow-up -commit on the same branch before merge per the epic #2390 contract. 
- -## File mapping summary - -| Old file | New file | Status | -|---|---|---| -| `test_vrt_validation_2321.py` | `vrt/test_validation.py` | folded | -| `test_vrt_capability_validator_2371.py` | `vrt/test_validation.py` | folded | -| `test_vrt_unsupported_2370.py` | `vrt/test_validation.py` | folded | -| `test_vrt_narrow_except_1670.py` | `vrt/test_validation.py` | folded | -| `test_vrt_path_containment_1671.py` | `vrt/test_validation.py` | folded | - -## Test mapping (old → new) - -### From `test_vrt_validation_2321.py` - -| Old test | New location | Notes | -|---|---|---| -| `test_vrt_unsupported_error_is_geotiff_metadata_error` | `test_vrt_unsupported_error_subclass_contract` | identical assertions | -| `test_zero_bands_raises_vrt_unsupported` | `TestValidatorRules::test_zero_bands_rejected` | identical assertion | -| `test_zero_bands_parity_across_entry_points` | `test_zero_bands_parity_across_entry_points` | helper now `_expect_same_error` | -| `test_complex_dtype_band_rejected_by_validator` | `TestValidatorRules::test_complex_dtype_band_rejected` | identical assertions | -| `test_rotated_transform_rejected_without_opt_in` | `TestValidatorRules::test_rotated_transform_rejected_without_opt_in` | both opt-out and opt-in paths preserved | -| `test_negative_src_rect_size_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[negative-src-size]]` | parametrised | -| `test_negative_src_rect_offset_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[negative-src-offset]]` | parametrised | -| `test_negative_dst_rect_size_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[negative-dst-size]]` | parametrised | -| `test_dst_rect_outside_vrt_extent_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[dst-outside-extent]]` | parametrised | -| `test_zero_pixel_size_rejected` | `TestValidatorRules::test_geometry_rules_rejected[reject[zero-pixel-size]]` | parametrised | -| 
`test_unsupported_resample_alg_rejected_at_validate` | `TestValidatorRules::test_unsupported_resample_alg_rejected` | identical assertion | -| `test_mixed_band_nodata_rejected_without_opt_in` | `test_mixed_band_nodata_rejected_without_opt_in` | identical assertions | -| `test_unparseable_crs_rejected_without_opt_in` | `TestValidatorRules::test_unparseable_crs_rejected_without_opt_in` | both opt-out and opt-in paths preserved | -| `test_resample_parity_across_entry_points` | `test_resample_parity_across_entry_points` | uses `_expect_same_error` | -| `test_rotated_parity_across_entry_points` | `test_rotated_parity_across_entry_points` | uses `_expect_same_error` | -| `test_unsupported_resample_chunked_raises_at_build` | `test_unsupported_resample_chunked_raises_at_build` | identical assertion | -| `test_well_formed_vrt_validates_silently` | `TestValidatorRules::test_well_formed_vrt_validates_silently` | identical assertions | - -### From `test_vrt_capability_validator_2371.py` - -| Old test | New location | Notes | -|---|---|---| -| `test_validate_vrt_capability_alias_resolves_to_validate_parsed_vrt` | `test_validate_vrt_capability_is_validate_parsed_vrt` | identical assertion | -| `test_nested_vrt_rejected_at_validator` | `test_nested_vrt_message_names_outer_and_inner` | identical assertions (message, outer path, inner basename, keyword) | -| `test_nested_vrt_uppercase_extension_rejected` | `test_nested_vrt_uppercase_extension_rejected` | identical assertion | -| `test_nested_vrt_rejected_via_public_read_vrt` | `test_nested_vrt_rejected_via_entry_points[entry[package-read_vrt]]` | parametrised entry-point matrix | -| `test_nested_vrt_rejected_via_open_geotiff` | `test_nested_vrt_rejected_via_entry_points[entry[open_geotiff]]` | parametrised | -| `test_nested_vrt_rejected_via_internal_read_vrt` | `test_nested_vrt_rejected_via_entry_points[entry[internal-read_vrt]]` | parametrised | -| `test_warp_options_dataset_level_rejected_at_parse` | 
`test_warp_options_rejected_at_parse[warp[dataset-level]]` | parametrised over dataset / band scope | -| `test_warp_options_dataset_level_rejected_via_public_read_vrt` | `test_warp_options_dataset_rejected_via_entry_points[entry[package-read_vrt]]` | parametrised | -| `test_warp_options_dataset_level_rejected_via_internal_read_vrt` | `test_warp_options_dataset_rejected_via_entry_points[entry[internal-read_vrt]]` | parametrised | -| `test_warp_options_band_level_rejected` | `test_warp_options_rejected_at_parse[warp[band-level]]` | parametrised | -| `test_use_mask_band_true_rejected_at_validator` | `test_use_mask_band_message_names_source` | identical assertions | -| `test_use_mask_band_truthy_spellings_rejected[true/True/TRUE/1]` | `test_use_mask_band_truthy_spellings_rejected[truthy[true]/[True]/[TRUE]/[1]]` | descriptive IDs | -| `test_use_mask_band_false_is_accepted` | `test_use_mask_band_false_is_accepted` | identical | -| `test_use_mask_band_non_canonical_truthy_accepted[yes/on/Y]` | `test_use_mask_band_non_canonical_truthy_accepted[non-canonical[yes]/[on]/[Y]]` | descriptive IDs | -| `test_use_mask_band_rejected_via_public_read_vrt` | `test_use_mask_band_rejected_via_entry_points[entry[package-read_vrt]]` | parametrised | -| `test_use_mask_band_rejected_via_internal_read_vrt` | `test_use_mask_band_rejected_via_entry_points[entry[internal-read_vrt]]` | parametrised | -| `test_per_source_mask_band_rejected_at_validator` | `test_per_source_mask_band_message_names_source` | identical assertions | -| `test_resample_alg_now_rejected_at_internal_read_vrt` | `test_resample_alg_rejected_at_internal_read_vrt` | identical assertion | -| `test_nested_vrt_error_is_value_error` | `test_nested_vrt_error_remains_value_error_subclass` | identical assertions | - -### From `test_vrt_unsupported_2370.py` - -The `_assert_raises_or_xfail` helper from the original file is gone; PR -1's validator landed, so most cases assert directly. 
Two cases that -were already `xfail` in the original (mixed-CRS, mixed-dtype widening) -stay under `pytest.mark.xfail(strict=False)` until the validator delivers -the rejection contract. - -| Old test | New location | Notes | -|---|---|---| -| `test_warped_vrt_subclass_raises` | `test_warped_subclass_band_rejected_via_open_geotiff` | direct assertion (no xfail wrapper) | -| `test_warped_vrt_gdalwarpoptions_raises` | `test_warp_options_rejected_at_parse[warp[dataset-level]]` (and `..._via_entry_points`) | already covered by 2371 fold | -| `test_warped_vrt_open_geotiff_raises` | `test_warped_subclass_band_rejected_via_open_geotiff` | open_geotiff path preserved | -| `test_nested_vrt_source_raises` | `test_nested_vrt_rejected_via_entry_points[entry[package-read_vrt]]` | parametrised matrix | -| `test_nested_vrt_open_geotiff_raises` | `test_nested_vrt_rejected_via_entry_points[entry[open_geotiff]]` | parametrised matrix | -| `test_mixed_source_crs_raises` | `test_mixed_source_crs_rejected` | preserved as `xfail(strict=False)`; same assertion shape | -| `test_mixed_source_dtype_unsupported_complex_raises` | `test_mixed_source_dtype_complex_rejected` | direct assertion | -| `test_mixed_source_dtype_ambiguous_widening_raises` | `test_mixed_source_dtype_ambiguous_widening_rejected` | preserved as `xfail(strict=False)` | -| `test_mixed_source_band_count_raises` | `test_mixed_source_band_count_rejected` | direct assertion | -| `test_complex_mask_source_raises` | `test_dataset_level_mask_band_rejected` | direct assertion | -| `test_unsupported_resample_alg_raises[Bilinear/Cubic/Lanczos/Average/Mode]` | `test_unsupported_resample_alg_rejected_end_to_end[entry[package-read_vrt]-resample[]]` | merged with open_geotiff parametrise | -| `test_unsupported_resample_alg_open_geotiff` | `test_unsupported_resample_alg_rejected_end_to_end[entry[open_geotiff]-resample[cubic]]` | covered by full alg × entry matrix | -| `test_supported_simple_vrt_round_trips_via_open_geotiff` | 
`test_supported_simple_vrt_round_trips_via_open_geotiff` | identical assertion | - -### From `test_vrt_narrow_except_1670.py` - -The matrix is parametrised over exception class × mode rather than one -test per exception. The fixtures `clear_strict_env` and -`set_strict_env` are reused unchanged. - -| Old test | New location | Notes | -|---|---|---| -| `test_runtime_error_propagates_default_mode` | `test_narrow_except_bug_classes_propagate_in_default_mode[bug[runtime-error]]` | parametrised | -| `test_runtime_error_propagates_strict_mode` | `test_narrow_except_runtime_error_propagates_in_strict_mode` | dedicated case | -| `test_file_not_found_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[io[file-not-found]]` | parametrised | -| `test_file_not_found_strict_reraises` | `test_narrow_except_io_or_parse_reraises_in_strict_mode[io[file-not-found]]` | parametrised | -| `test_value_error_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[parse[value-error]]` | parametrised | -| `test_value_error_strict_reraises` | `test_narrow_except_io_or_parse_reraises_in_strict_mode[parse[value-error]]` | parametrised | -| `test_struct_error_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[parse[struct-error]]` | parametrised | -| `test_permission_error_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[io[permission-error]]` | parametrised | -| `test_memory_error_propagates_default_mode` | `test_narrow_except_bug_classes_propagate_in_default_mode[bug[memory-error]]` | parametrised | -| `test_zlib_error_warns_and_continues` | `test_narrow_except_io_or_parse_warns_in_default_mode[codec[zlib-error]]` | parametrised | -| `test_zlib_error_strict_reraises` | `test_narrow_except_io_or_parse_reraises_in_strict_mode[codec[zlib-error]]` | parametrised | -| `test_zstd_error_warns_and_continues_if_available` | `test_narrow_except_zstd_error_warns_in_default_mode` | kept as standalone; 
`pytest.importorskip` replaced with module-level `skipif(not _has_zstandard())` | -| (new) | `test_narrow_except_zstd_error_reraises_in_strict_mode` | added strict-mode case for parity with zlib (closes the matrix; previously only the warn path was covered for zstd) | - -### From `test_vrt_path_containment_1671.py` - -Folded into two classes (`TestPathContainment`, `TestPathContainmentAllowlist`). -The `_clear_allowlist_env` autouse fixture is replaced by an explicit -`clear_allowlist_env` fixture on the non-allowlist tests so the -allowlist class can set the env var via `monkeypatch.setenv` without a -race against the autouse delenv. - -| Old test | New location | Notes | -|---|---|---| -| `test_relative_source_with_dotdot_traversal_rejected` | `TestPathContainment::test_relative_dotdot_traversal_rejected` | identical assertion | -| `test_relative_source_symlink_traversal_rejected` | `TestPathContainment::test_relative_symlink_traversal_rejected` | identical assertion | -| `test_absolute_source_outside_vrt_dir_rejected` | `TestPathContainment::test_absolute_outside_vrt_dir_rejected` | identical assertion | -| `test_absolute_source_inside_vrt_dir_ok` | `TestPathContainment::test_absolute_inside_vrt_dir_ok` | identical assertion | -| `test_absolute_source_allowlisted_root_passes` | `TestPathContainmentAllowlist::test_single_root_allows_outside_absolute` | identical assertion | -| `test_allowlist_supports_multiple_roots` | `TestPathContainmentAllowlist::test_multiple_roots_pathsep_separated` | identical assertion | -| `test_allowlist_does_not_cover_traversal_via_relative_source` | `TestPathContainmentAllowlist::test_relative_source_escape_still_rejected` | identical assertion | -| `test_allowlist_empty_entries_ignored` | `TestPathContainmentAllowlist::test_empty_entries_ignored` | identical assertion | -| `test_normal_relative_source_under_vrt_dir` | `TestPathContainment::test_normal_relative_source_under_vrt_dir_ok` | identical assertion | -| 
`test_error_message_names_rejected_path` | `TestPathContainment::test_error_message_names_rejected_path` | identical assertion |
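
The containment rules exercised by `TestPathContainment` and `TestPathContainmentAllowlist` can be sketched roughly as follows. This is an illustration only, not the code in `parse_vrt`, which this patch does not show. The helper name `resolve_vrt_source` and its `allowed_roots_env` parameter are hypothetical, and the error wording is assumed from the `match="outside the VRT directory"` assertions in the tests.

```python
import os


def resolve_vrt_source(source, vrt_dir, relative, allowed_roots_env=""):
    """Hypothetical helper: return the resolved source path or raise.

    ``allowed_roots_env`` stands in for the value of
    ``XRSPATIAL_VRT_ALLOWED_ROOTS`` (entries joined with ``os.pathsep``).
    """
    vrt_root = os.path.realpath(vrt_dir)
    candidate = os.path.join(vrt_root, source) if relative else source
    # realpath follows symlinks, so a link inside vrt_dir that points
    # elsewhere resolves to its real (outside) location.
    resolved = os.path.realpath(candidate)

    def _under(root):
        try:
            return os.path.commonpath([resolved, root]) == root
        except ValueError:  # e.g. different drives on Windows
            return False

    if _under(vrt_root):
        return resolved
    # Assumption drawn from the tests: the allowlist applies to absolute
    # sources only; a relative source that escapes vrt_dir stays rejected.
    if not relative:
        for entry in allowed_roots_env.split(os.pathsep):
            # Empty entries from stray separators are skipped.
            if entry and _under(os.path.realpath(entry)):
                return resolved
    raise ValueError(
        f"source {resolved} resolves outside the VRT directory {vrt_root}"
    )
```

Checking the relative flag before consulting the allowlist is what keeps an allowlisted root from being chained into a `..` traversal, which is the case `test_relative_source_escape_still_rejected` pins down.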