geotiff tests: consolidate integration cluster (http, dask pipeline, gpu pipeline)#2411
Conversation
PR 9 of the 11-PR epic #2390. Fold 26 integration-cluster test files into three parametrised files under xrspatial/geotiff/tests/integration/: - integration/test_http_sources.py (17 HTTP files, 218 tests) - integration/test_dask_pipeline.py (7 dask + accessor files, 61 tests) - integration/test_gpu_pipeline.py (1 GPU file, 5 tests) Each old file lands as a named section in its consolidated module. Helpers, fixtures, and classes are suffixed with the section id so cross-section names cannot collide. Top-level autouse fixtures from each source file lose their autouse flag and apply via an explicit @pytest.mark.usefixtures marker on the tests and classes of that section only. Without this, sections that set XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1 would leak that env var into the scheme_case tests that exist precisely to assert SSRF rejection in the production default. Issue-number suffixes on test names (_2266, _2026_05_15, _issue_A) are stripped per epic convention. The audit table in xrspatial/geotiff/tests/CLUSTER_AUDIT_PR9.md maps every old file::test to its new home (204/204 mapped) and is deleted on the final commit before approval. test_http_sources.py carries a module-level @requires_loopback marker so PR 11's removal of the pytest_collection_modifyitems socketserver hack does not regress coverage in sandboxes that deny loopback bind.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: geotiff tests: consolidate integration cluster (http, dask pipeline, gpu pipeline)
Tests-only restructure. 26 source files folded into 3 consolidated files under integration/, 284 tests collected and green. The audit table maps 204/204 old tests. The autouse-to-usefixtures conversion correctly scopes section-specific env-var setting, so the SSRF rejection tests in the scheme_case section still see the production default. Module-level pytestmark = requires_loopback on the HTTP file lines up with PR 11's planned removal of the socketserver hook.
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
-
xrspatial/geotiff/tests/integration/test_gpu_pipeline.py:13: the file defines its own_gpu_available_dask_cupy_combined()helper instead of usingrequires_gpufromxrspatial.geotiff.tests._helpers.markers. The epic calls out the markers module as the one place forrequires_gpu. Swap the local helper forfrom ..._helpers.markers import requires_gpuand apply@requires_gputo the tests (or setpytestmark = requires_gpuat module scope). -
xrspatial/geotiff/tests/integration/test_http_sources.py:12,test_dask_pipeline.py:12,test_gpu_pipeline.py:10: duplicatedfrom __future__ import annotations. One sits in the module header, the second comes from the per-section import dedup. Future imports must be the first executable statement; two is harmless but noisy. Drop the duplicate. -
xrspatial/geotiff/tests/integration/test_http_sources.py:13-30: redundant import aliases for the same module.from xrspatial.geotiff import _reader as _reader_modandfrom xrspatial.geotiff import _reader as reader_modboth exist;from xrspatial.geotiff import open_geotiffappears in three separate lines. The script preserved each source file's import statements verbatim and only deduped exact-string matches. Collapse to one canonical alias each (_reader_modis used most; drop the unaliased and thereader_modvariants and rewrite references).
Nits (optional improvements)
-
The audit table in
CLUSTER_AUDIT_PR9.mdannotates parametrised test ids by matching against the first collected parametrize variant. For tests with multiple parametrize axes (e.g.test_private_host_rejected_regardless_of_scheme_case[127.0.0.1-HTTP]), only one variant is listed. Coverage is still traceable, but a one-line note in the audit's preamble would clarify that one row may cover several parametrize variants. -
test_http_sources.pykeeps multiple section-local_serve_<sid>,_RangeHandler_<sid>, and_stop_<sid>helpers that all do roughly the same thing. Out of scope here (the audit-driven approach mandates faithful preservation), but worth a follow-up issue to factor a shared loopback-server helper into_helpers/.
What looks good
- The autouse-fixture conversion is the load-bearing piece of this PR and it is correct:
_allow_loopback*fixtures that setXRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1no longer leak across section boundaries. Thetest_private_host_rejected_regardless_of_scheme_casefamily in thescheme_casesection still asserts SSRF rejection because the marker is now scoped to its origin section. - 204/204 old test functions accounted for in the audit table.
- Module-level
pytestmark = requires_loopbackontest_http_sources.pyis the right primitive for PR 11's hook removal. - Helper, fixture, and class suffixing prevents cross-section name collisions.
Checklist
- Algorithm matches reference/paper -- N/A, tests-only restructure
- All implemented backends produce consistent results -- preserved, no source changes
- NaN handling is correct -- preserved
- Edge cases are covered by tests -- preserved (audit shows 1:1 mapping)
- Dask chunk boundaries handled correctly -- preserved
- No premature materialization or unnecessary copies -- N/A, tests-only
- Benchmark exists or is not needed -- N/A
- README feature matrix updated -- N/A
- Docstrings present and accurate -- module-level docstrings on each consolidated file
- test_gpu_pipeline.py: drop the section-local _gpu_available helper and use requires_gpu from _helpers/markers.py per the epic's single-source-of-truth rule. Apply as a module-level pytestmark. - Drop duplicated `from __future__ import annotations` in all three consolidated files (the per-section import dedup re-emitted it). - Collapse redundant import aliases in test_http_sources.py: one canonical `_reader as _reader_mod` alias (was both _reader_mod and reader_mod), one combined open_geotiff/read_geotiff_dask line, and one merged _reader symbol list (was 10 partial lines). - Same dedup for test_dask_pipeline.py: one combined open_geotiff, read_geotiff_dask, to_geotiff line. - CLUSTER_AUDIT_PR9.md: add a preamble note clarifying that one row in the New column can cover several parametrize variants of the same test function (a Nit from the review).
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review after 761149f
All three Suggestions and the first Nit applied:
test_gpu_pipeline.pynow importsrequires_gpufrom_helpers/markers.pyand applies it as a module-levelpytestmark. The local_gpu_available_dask_cupy_combinedhelper is gone.- Duplicated
from __future__ import annotationsremoved from all three files. test_http_sources.pyimport block deduped: one_reader as _reader_modalias (was_reader_mod+reader_mod), oneopen_geotiff, read_geotiff_daskline (was three), one merged_readersymbol list (was 10 partial lines).test_dask_pipeline.pyimport block deduped: oneopen_geotiff, read_geotiff_dask, to_geotiffline.CLUSTER_AUDIT_PR9.mdpreamble clarifies that one row can cover several parametrize variants.
Second Nit (shared loopback-server helper in _helpers/) left for a follow-up issue: the audit-driven approach mandates faithful preservation, and a shared helper crosses section boundaries.
Re-ran pytest xrspatial/geotiff/tests/integration/ -q: 284 passed.
PR 9 moved HTTP and dask-pipeline tests into integration/, but the release-gate checklist still cited the old paths. The CI gate test_release_gate_cites_only_existing_test_files asserted on disk existence and failed. Paths updated: - test_dask_overview_level.py -> integration/test_dask_pipeline.py - test_cog_http_* -> integration/test_http_sources.py - test_http_read_all_bounded_2051.py + test_http_dask_allow_rotated_2130.py -> integration/test_http_sources.py - test_cloud_read_byte_limit_1928.py -> integration/test_http_sources.py
# Conflicts: # docs/source/reference/release_gate_geotiff.rst
# Conflicts: # docs/source/reference/release_gate_geotiff.rst
Closes #2402. PR 9 of the 11-PR epic #2390.
Summary
Fold 26 integration-cluster test files into three parametrised files under
xrspatial/geotiff/tests/integration/:integration/test_http_sources.py: 17 HTTP files, 218 testsintegration/test_dask_pipeline.py: 7 dask + accessor files, 61 testsintegration/test_gpu_pipeline.py: 1 GPU file, 5 testsTotal: 284 tests, all green.
How the merge handles cross-file collisions
Each old file lands as a named section in its consolidated module. Helpers, fixtures, and classes are suffixed with the section id (
_<section>) so cross-section names cannot collide.Top-level
autouse=Truefixtures from each source file lose their autouse flag and apply via an explicit@pytest.mark.usefixtures(...)marker on the tests and classes of that section only. Without this, sections that setXRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1would leak that env var into thescheme_casetests, which exist precisely to assert SSRF rejection in the production default.Issue-number suffixes on test names (
_2266,_2026_05_15,_issue_A) are stripped per epic convention. Issue numbers stay in the git log and PR descriptions.@requires_loopback annotation
test_http_sources.pycarries a module-levelpytestmark = requires_loopback, so PR 11's removal of thepytest_collection_modifyitemssocketserver hack inconftest.pydoes not regress coverage on sandboxes that deny loopback bind.Audit
xrspatial/geotiff/tests/CLUSTER_AUDIT_PR9.mdmaps every oldfile::testto its new home (204/204 mapped). Deleted on the final commit before approval.Test plan
pytest xrspatial/geotiff/tests/integration/ -v: 284 passedpytest xrspatial/geotiff/tests/ -q: 5704 passed, 1 unrelated flake intest_streaming_codecs_2026_05_11.py(passes alone)