Skip to content

geotiff tests: consolidate integration cluster (http, dask pipeline, gpu pipeline)#2411

Merged
brendancol merged 6 commits into
mainfrom
issue-2402
May 26, 2026
Merged

geotiff tests: consolidate integration cluster (http, dask pipeline, gpu pipeline)#2411
brendancol merged 6 commits into
mainfrom
issue-2402

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2402. PR 9 of the 11-PR epic #2390.

Summary

Fold 26 integration-cluster test files into three parametrised files under xrspatial/geotiff/tests/integration/:

  • integration/test_http_sources.py: 17 HTTP files, 218 tests
  • integration/test_dask_pipeline.py: 7 dask + accessor files, 61 tests
  • integration/test_gpu_pipeline.py: 1 GPU file, 5 tests

Total: 284 tests, all green.

How the merge handles cross-file collisions

Each old file lands as a named section in its consolidated module. Helpers, fixtures, and classes are suffixed with the section id (_<section>) so cross-section names cannot collide.

Top-level autouse=True fixtures from each source file lose their autouse flag and apply via an explicit @pytest.mark.usefixtures(...) marker on the tests and classes of that section only. Without this, sections that set XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1 would leak that env var into the scheme_case tests, which exist precisely to assert SSRF rejection in the production default.

Issue-number suffixes on test names (_2266, _2026_05_15, _issue_A) are stripped per epic convention. Issue numbers stay in the git log and PR descriptions.

@requires_loopback annotation

test_http_sources.py carries a module-level pytestmark = requires_loopback, so PR 11's removal of the pytest_collection_modifyitems socketserver hack in conftest.py does not regress coverage on sandboxes that deny loopback bind.

Audit

xrspatial/geotiff/tests/CLUSTER_AUDIT_PR9.md maps every old file::test to its new home (204/204 mapped). Deleted on the final commit before approval.

Test plan

  • pytest xrspatial/geotiff/tests/integration/ -v: 284 passed
  • pytest xrspatial/geotiff/tests/ -q: 5704 passed, 1 unrelated flake in test_streaming_codecs_2026_05_11.py (passes alone)

PR 9 of the 11-PR epic #2390.

Fold 26 integration-cluster test files into three parametrised files
under xrspatial/geotiff/tests/integration/:

- integration/test_http_sources.py (17 HTTP files, 218 tests)
- integration/test_dask_pipeline.py (7 dask + accessor files, 61 tests)
- integration/test_gpu_pipeline.py (1 GPU file, 5 tests)

Each old file lands as a named section in its consolidated module.
Helpers, fixtures, and classes are suffixed with the section id so
cross-section names cannot collide.

Top-level autouse fixtures from each source file lose their autouse
flag and apply via an explicit @pytest.mark.usefixtures marker on the
tests and classes of that section only. Without this, sections that
set XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1 would leak that env var
into the scheme_case tests that exist precisely to assert SSRF
rejection in the production default.

Issue-number suffixes on test names (_2266, _2026_05_15, _issue_A)
are stripped per epic convention. The audit table in
xrspatial/geotiff/tests/CLUSTER_AUDIT_PR9.md maps every old
file::test to its new home (204/204 mapped) and is deleted on the
final commit before approval.

test_http_sources.py carries a module-level @requires_loopback marker
so PR 11's removal of the pytest_collection_modifyitems socketserver
hack does not regress coverage in sandboxes that deny loopback bind.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 26, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: geotiff tests: consolidate integration cluster (http, dask pipeline, gpu pipeline)

Tests-only restructure. 26 source files folded into 3 consolidated files under integration/, 284 tests collected and green. The audit table maps 204/204 old tests. The autouse-to-usefixtures conversion correctly scopes section-specific env-var setting, so the SSRF rejection tests in the scheme_case section still see the production default. Module-level pytestmark = requires_loopback on the HTTP file lines up with PR 11's planned removal of the socketserver hook.

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • xrspatial/geotiff/tests/integration/test_gpu_pipeline.py:13: the file defines its own _gpu_available_dask_cupy_combined() helper instead of using requires_gpu from xrspatial.geotiff.tests._helpers.markers. The epic calls out the markers module as the one place for requires_gpu. Swap the local helper for from ..._helpers.markers import requires_gpu and apply @requires_gpu to the tests (or set pytestmark = requires_gpu at module scope).

  • xrspatial/geotiff/tests/integration/test_http_sources.py:12, test_dask_pipeline.py:12, test_gpu_pipeline.py:10: duplicated from __future__ import annotations. One sits in the module header, the second comes from the per-section import dedup. Future imports must be the first executable statement; two is harmless but noisy. Drop the duplicate.

  • xrspatial/geotiff/tests/integration/test_http_sources.py:13-30: redundant import aliases for the same module. from xrspatial.geotiff import _reader as _reader_mod and from xrspatial.geotiff import _reader as reader_mod both exist; from xrspatial.geotiff import open_geotiff appears in three separate lines. The script preserved each source file's import statements verbatim and only deduped exact-string matches. Collapse to one canonical alias each (_reader_mod is used most; drop the unaliased and the reader_mod variants and rewrite references).

Nits (optional improvements)

  • The audit table in CLUSTER_AUDIT_PR9.md annotates parametrised test ids by matching against the first collected parametrize variant. For tests with multiple parametrize axes (e.g. test_private_host_rejected_regardless_of_scheme_case[127.0.0.1-HTTP]), only one variant is listed. Coverage is still traceable, but a one-line note in the audit's preamble would clarify that one row may cover several parametrize variants.

  • test_http_sources.py keeps multiple section-local _serve_<sid>, _RangeHandler_<sid>, and _stop_<sid> helpers that all do roughly the same thing. Out of scope here (the audit-driven approach mandates faithful preservation), but worth a follow-up issue to factor a shared loopback-server helper into _helpers/.

What looks good

  • The autouse-fixture conversion is the load-bearing piece of this PR and it is correct: _allow_loopback* fixtures that set XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1 no longer leak across section boundaries. The test_private_host_rejected_regardless_of_scheme_case family in the scheme_case section still asserts SSRF rejection because the marker is now scoped to its origin section.
  • 204/204 old test functions accounted for in the audit table.
  • Module-level pytestmark = requires_loopback on test_http_sources.py is the right primitive for PR 11's hook removal.
  • Helper, fixture, and class suffixing prevents cross-section name collisions.

Checklist

  • Algorithm matches reference/paper -- N/A, tests-only restructure
  • All implemented backends produce consistent results -- preserved, no source changes
  • NaN handling is correct -- preserved
  • Edge cases are covered by tests -- preserved (audit shows 1:1 mapping)
  • Dask chunk boundaries handled correctly -- preserved
  • No premature materialization or unnecessary copies -- N/A, tests-only
  • Benchmark exists or is not needed -- N/A
  • README feature matrix updated -- N/A
  • Docstrings present and accurate -- module-level docstrings on each consolidated file

- test_gpu_pipeline.py: drop the section-local _gpu_available helper
  and use requires_gpu from _helpers/markers.py per the epic's
  single-source-of-truth rule. Apply as a module-level pytestmark.
- Drop duplicated `from __future__ import annotations` in all three
  consolidated files (the per-section import dedup re-emitted it).
- Collapse redundant import aliases in test_http_sources.py: one
  canonical `_reader as _reader_mod` alias (was both _reader_mod and
  reader_mod), one combined open_geotiff/read_geotiff_dask line, and
  one merged _reader symbol list (was 10 partial lines).
- Same dedup for test_dask_pipeline.py: one combined open_geotiff,
  read_geotiff_dask, to_geotiff line.
- CLUSTER_AUDIT_PR9.md: add a preamble note clarifying that one row
  in the New column can cover several parametrize variants of the
  same test function (a Nit from the review).
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review after 761149f

All three Suggestions and the first Nit applied:

  • test_gpu_pipeline.py now imports requires_gpu from _helpers/markers.py and applies it as a module-level pytestmark. The local _gpu_available_dask_cupy_combined helper is gone.
  • Duplicated from __future__ import annotations removed from all three files.
  • test_http_sources.py import block deduped: one _reader as _reader_mod alias (was _reader_mod + reader_mod), one open_geotiff, read_geotiff_dask line (was three), one merged _reader symbol list (was 10 partial lines).
  • test_dask_pipeline.py import block deduped: one open_geotiff, read_geotiff_dask, to_geotiff line.
  • CLUSTER_AUDIT_PR9.md preamble clarifies that one row can cover several parametrize variants.

Second Nit (shared loopback-server helper in _helpers/) left for a follow-up issue: the audit-driven approach mandates faithful preservation, and a shared helper crosses section boundaries.

Re-ran pytest xrspatial/geotiff/tests/integration/ -q: 284 passed.

PR 9 moved HTTP and dask-pipeline tests into integration/, but the
release-gate checklist still cited the old paths. The CI gate
test_release_gate_cites_only_existing_test_files asserted on disk
existence and failed.

Paths updated:
- test_dask_overview_level.py -> integration/test_dask_pipeline.py
- test_cog_http_* -> integration/test_http_sources.py
- test_http_read_all_bounded_2051.py + test_http_dask_allow_rotated_2130.py
  -> integration/test_http_sources.py
- test_cloud_read_byte_limit_1928.py -> integration/test_http_sources.py
# Conflicts:
#	docs/source/reference/release_gate_geotiff.rst
# Conflicts:
#	docs/source/reference/release_gate_geotiff.rst
@brendancol brendancol merged commit f82f385 into main May 26, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consolidate integration / HTTP / dask-pipeline tests (epic #2390 PR 9)

1 participant