feat(waterdata): Add resume support for partial chunked queries #282
Draft
thodson-usgs wants to merge 6 commits into
For multi-value waterdata queries (e.g. monitoring_location_id with ~300+ sites), the GET URL produced by PR DOI-USGS#233 blows past the server's ~8 KB nginx buffer and the API returns HTTP 414. This PR adds a chunker that transparently splits long list params across sub-requests so each URL fits the byte budget.

The chunker is a decorator applied to ``_fetch_once`` outside the existing ``@filters.chunked`` (CQL chunker), so list-chunking is the outer loop and filter-chunking is the inner loop:

    @chunking.multi_value_chunked(build_request=_construct_api_requests)
    @filters.chunked(build_request=_construct_api_requests)
    def _fetch_once(args): ...

Key design points:

- ``_plan_chunks`` greedy-halves the largest chunk across all dimensions until the worst-case sub-request fits ``url_limit`` (URL + body, via ``_request_bytes``, so POST routes are sized correctly). Cartesian product of per-dim partitions becomes the sub-request set; capped at ``max_chunks=1000``.
- ``_filter_aware_probe_args`` coordinates with ``filters.chunked``: the planner probes URL length using a synthetic clause that matches the inner filter chunker's bail-floor size (longest single clause, scaled by worst-case URL encoding ratio). Without this coordination, the outer planner would raise ``RequestTooLarge`` on combinations the stacked chunkers can actually handle.
- ``QuotaExhausted`` mid-call guard reads ``x-ratelimit-remaining`` after each sub-request; if it drops below ``quota_safety_floor=50``, the wrapper raises with the partial frame, completed-chunk offset, and last observed remaining quota, letting callers salvage or resume after the rate-limit window resets rather than crash into a silent mid-pagination 429.
- ``RequestTooLarge`` is raised when the smallest reducible plan still exceeds ``url_limit`` (every multi-value param at a singleton chunk and any chunkable filter at the inner chunker's bail floor) or when the cartesian product exceeds ``max_chunks``.
- All defaults (``url_limit``, ``max_chunks``, ``quota_safety_floor``) resolve at call time, so monkey-patching ``filters._WATERDATA_URL_BYTE_LIMIT`` for tests / non-default quotas affects the decorator uniformly.

Public additions:

- ``dataretrieval.waterdata.chunking.multi_value_chunked``
- ``dataretrieval.waterdata.chunking.RequestTooLarge``
- ``dataretrieval.waterdata.chunking.QuotaExhausted`` (carries ``partial_frame``, ``partial_response``, ``completed_chunks``, ``total_chunks``, ``remaining``)

Tests (30 new):

- ``_filter_aware_probe_args`` worst-case-clause modelling
- ``_plan_chunks`` greedy halving, RequestTooLarge floor, filter-chunker coordination, ``max_chunks`` cap, lazy-default reads
- ``multi_value_chunked`` pass-through, cartesian-product shape, end-to-end with stacked filter chunker
- ``QuotaExhausted`` header parsing, mid-call abort, last-chunk no-abort, zero-floor disable
- ``RequestTooLarge`` message contents and triggering conditions

End-to-end correctness verified against the live API: identical per-site cell-for-cell output between unchunked (single call) and chunked (forced fan-out via patched limit) paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
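The greedy-halving plan and the cartesian-product fan-out described in this commit can be sketched roughly as below. This is an illustration only, not the PR's ``_plan_chunks``: ``plan_chunks``, ``estimate_bytes``, ``sub_requests``, and the toy size model (base length plus comma-joined values) are hypothetical stand-ins, and the filter-chunker coordination and quota guard are left out.

```python
from itertools import product
from math import prod


class RequestTooLarge(ValueError):
    """Stand-in for the PR's exception of the same name."""


def joined_len(chunk):
    # Length of the values as they would appear comma-joined in a URL.
    return len(",".join(chunk))


def estimate_bytes(base_len, chunks):
    # Toy size model (assumption): base URL plus one joined chunk per dimension.
    return base_len + sum(joined_len(c) for c in chunks)


def plan_chunks(params, base_len=100, url_limit=200, max_chunks=1000):
    """Greedily halve the largest chunk until the worst-case request fits."""
    # Start with a single chunk per dimension holding every value.
    plan = {name: [list(values)] for name, values in params.items()}

    def worst_case():
        # The worst sub-request pairs the longest chunk of every dimension.
        return estimate_bytes(
            base_len, [max(chunks, key=joined_len) for chunks in plan.values()]
        )

    while worst_case() > url_limit:
        # Only chunks with more than one value can still be split.
        candidates = [
            (joined_len(chunk), name, i)
            for name, chunks in plan.items()
            for i, chunk in enumerate(chunks)
            if len(chunk) > 1
        ]
        if not candidates:
            raise RequestTooLarge("singleton chunks still exceed url_limit")
        _, name, i = max(candidates)
        chunk = plan[name].pop(i)
        mid = len(chunk) // 2
        plan[name][i:i] = [chunk[:mid], chunk[mid:]]
        if prod(len(chunks) for chunks in plan.values()) > max_chunks:
            raise RequestTooLarge("plan exceeds max_chunks")
    return plan


def sub_requests(plan):
    """Yield one argument dict per combination of per-dimension chunks."""
    names = list(plan)
    for combo in product(*(plan[n] for n in names)):
        yield dict(zip(names, combo))
```

Splitting only the single largest chunk on each pass keeps the number of sub-requests close to the minimum needed, at the cost of re-evaluating the worst case after every split.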
…queries

When a chunked OGC call fails partway through, the chunker now raises PartialResult (or its QuotaExhausted subclass) carrying the combined partial DataFrame plus a ChunkManifest recording how many sub-requests of the cartesian-product plan completed. The same getter accepts the partial metadata back via a new resume_from= kwarg; the chunker validates the saved plan matches the fresh args and fetches only the remaining combinations. Callers concatenate the saved partial frame with the resume call's return value to reconstruct the full result.

- ChunkManifest dataclass (frozen, hashable) with plan / completed / total / is_complete / remaining
- PartialResult base exception with partial_frame, partial_response, manifest, and a lazy partial_metadata property; QuotaExhausted now subclasses it and carries the manifest instead of bare counts
- BaseMetadata.chunk_manifest exposes the manifest end-to-end
- multi_value_chunked wrapper validates resume_from, skips completed cartesian-product combinations via itertools.islice, wraps any sub-request exception as PartialResult with __cause__ preserved
- resume_from= added to all 11 chunked-getter signatures
- 14 new tests covering manifest properties, resume validation (mismatched plan / no manifest / already-complete / no chunking), PartialResult wrapping of fetch errors, end-to-end quota-exhaust-then-resume, and partial_metadata wrapping
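A minimal sketch of how a frozen manifest, the four resume checks, and the `itertools.islice` skip could fit together. The attribute names (`plan`, `completed`, `total`, `is_complete`, `remaining`) come from the commit message; the nested-tuple plan layout and the helpers `validate_resume` and `remaining_combos` are assumptions made for illustration, not the PR's code.

```python
from dataclasses import dataclass
from itertools import islice, product
from math import prod


@dataclass(frozen=True)
class ChunkManifest:
    """Illustrative manifest; the plan layout here is an assumption."""

    # ((dimension name, (chunk, chunk, ...)), ...), each chunk a tuple of values.
    plan: tuple
    completed: int

    @property
    def total(self):
        return prod(len(chunks) for _, chunks in self.plan)

    @property
    def is_complete(self):
        return self.completed >= self.total

    @property
    def remaining(self):
        return max(self.total - self.completed, 0)


def validate_resume(saved, fresh_plan):
    """Reject the four invalid resume states named in the commit message."""
    if saved is None:
        raise ValueError("resume_from carries no chunk manifest")
    if fresh_plan is None:
        raise ValueError("current args do not chunk; nothing to resume")
    if saved.plan != fresh_plan:
        raise ValueError("current plan differs from the saved plan")
    if saved.is_complete:
        raise ValueError("saved manifest is already complete")


def remaining_combos(manifest):
    """Skip completed combinations; relies on product() order being stable."""
    names = [name for name, _ in manifest.plan]
    combos = product(*(chunks for _, chunks in manifest.plan))
    for combo in islice(combos, manifest.completed, None):
        yield dict(zip(names, combo))
```

Because the plan is stored as nested tuples, the manifest is hashable and two plans compare equal only when both chunk contents and order match, which is what makes a count-based skip safe.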
Worked example in get_daily's docstring showing the canonical PartialResult catch / accumulate-partials / sleep-and-retry pattern, capped at a one-hour deadline matched to the API's hourly rate-limit window so structural failures surface rather than spin forever. Module-level multi_value_chunked docstring and the per-getter resume_from parameter doc now point to get_daily for the example.
The example doesn't belong in get_daily's docstring: it's a topical explanation of an API contract that applies to every chunked getter, not a usage demo of one function. Move it to a dedicated Sphinx user-guide page (waterdata_chunking.rst) covering the chunker's resume contract, the canonical retry-loop pattern with a one-hour deadline, the four resume-validation failure modes, and how to inspect the chunk manifest on successful calls. multi_value_chunked's module docstring and the per-getter resume_from parameter doc now cross-reference the new page.
…tial Both abort sites (quota-exhausted bail and sub-request exception) and the success path now share one helper that combines responses, restores the canonical URL, builds the manifest, and attaches it to the response — eliminating the three near-identical inline blocks. Message formatting moves to a single _partial_result_message() so the three "Catch ... to access .partial_frame and resume" strings collapse to one. The "resume_from" kwarg literal becomes _RESUME_FROM_KEY for consistency with _QUOTA_HEADER, and the args-strip uses the standard dict()+pop() idiom. PR-number references that would rot in public docstrings dropped.
ChunkManifest, PartialResult, QuotaExhausted, and RequestTooLarge are now available from ``dataretrieval.waterdata`` directly. Callers following the resume retry-loop pattern no longer need to reach into the ``dataretrieval.waterdata.chunking`` submodule to catch PartialResult — the public import matches where the getters live. multi_value_chunked's docstring now states explicitly that the wrapper catches every ``Exception`` (not just the three named example types) and that ``BaseException`` subclasses propagate unchanged, so callers know KeyboardInterrupt aborts a chunked call cleanly while a programmer-error TypeError still gets wrapped with its partial state. The userguide example uses the new top-level import.
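The `Exception` versus `BaseException` distinction can be shown with a small sketch. The `PartialResult` class here is a simplified stand-in (the PR's version carries a frame, response, and manifest), and `run_chunks` is a hypothetical helper, not the wrapper's actual code.

```python
class PartialResult(Exception):
    """Simplified stand-in; the PR's class also carries a frame and manifest."""

    def __init__(self, message, completed):
        super().__init__(message)
        self.completed = completed


def run_chunks(chunks, fetch):
    """Call fetch on each chunk, wrapping ordinary errors with progress so far."""
    results = []
    for chunk in chunks:
        try:
            results.append(fetch(chunk))
        except Exception as exc:
            # Catches TypeError, RuntimeError, and other Exception subclasses.
            # KeyboardInterrupt and SystemExit derive from BaseException only,
            # so they are not caught here and propagate unchanged.
            raise PartialResult(f"failed on {chunk!r}", len(results)) from exc
    return results
```

With `raise ... from exc`, the original error stays reachable as `__cause__` on the wrapping exception.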
Pull request overview
This PR adds resumable partial-result support for chunked OGC waterdata queries, building on the multi-value chunking and pagination failure handling work.
Changes:
- Adds `ChunkManifest`, `PartialResult`, `QuotaExhausted`, and `resume_from` support for chunked calls.
- Updates public `waterdata` getter signatures, metadata, docs, and NEWS.
- Adds extensive chunking/resume tests.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `dataretrieval/waterdata/chunking.py` | Implements chunk planning, manifests, partial exceptions, quota handling, and resume behavior. |
| `dataretrieval/waterdata/utils.py` | Wires multi-value chunking around OGC fetches and updates paginated response metadata aggregation. |
| `dataretrieval/waterdata/filters.py` | Shares filter encoding-budget logic and restores canonical URLs/headers on filter-chunked responses. |
| `dataretrieval/waterdata/api.py` | Adds `resume_from` to supported public getter signatures and docs. |
| `dataretrieval/waterdata/__init__.py` | Exports new chunking-related public classes. |
| `dataretrieval/utils.py` | Exposes `chunk_manifest` through `BaseMetadata`. |
| `docs/source/userguide/waterdata_chunking.rst` | Adds user documentation for chunking and resume workflow. |
| `docs/source/userguide/index.rst` | Adds the new chunking guide to the user-guide toctree. |
| `NEWS.md` | Documents the resumable chunked-query behavior. |
| `tests/waterdata_test.py` | Adds planner, manifest, partial-result, quota, and resume tests. |
Comments suppressed due to low confidence (1)
dataretrieval/waterdata/chunking.py:708
`QuotaExhausted.partial_frame` is assembled before the public getter's normal post-processing runs, so callers receive raw `_fetch_once` columns/dtypes for the saved partial data but post-processed columns/dtypes from the successful resume call. This breaks the advertised concat-to-reconstruct workflow for real getters; apply the same `get_ogc_data` post-processing to the exception's partial frame before it reaches callers, or raise the partial exception from a layer that already has processed frames.

    raise QuotaExhausted(
        partial_frame=partial_frame,
        partial_response=partial_response,
        manifest=manifest,
        remaining=remaining,
    )
@@ -1,3 +1,7 @@
**05/17/2026:** Chunked `waterdata` calls that fail partway through are now resumable. Any sub-request failure (quota exhaustion, mid-pagination 5xx/429, transport error) raises `PartialResult` (or its `QuotaExhausted` subclass) carrying the combined partial DataFrame, a `BaseMetadata.partial_metadata` accessor, and a `ChunkManifest` recording how many sub-requests of the cartesian-product plan completed. The same getter accepts the partial metadata via a new `resume_from=` kwarg; the chunker validates the saved plan matches the fresh args and fetches only the remaining cartesian-product combinations. Callers concatenate their saved partial DataFrame with the resume call's return value to reconstruct the full result. The manifest is also attached to `BaseMetadata.chunk_manifest` on successful chunked calls for observability.
Comment on lines +683 to +688

    raise PartialResult(
        partial_frame=partial_frame,
        partial_response=partial_response,
        manifest=manifest,
        cause=exc,
    ) from exc
Stacked on top of #280. This PR's diff includes #280's chunker commit; rebase / squash after #280 merges.
When a chunked OGC `waterdata` call fails partway through (quota exhaustion, mid-pagination 5xx/429, transport error), the chunker now raises `PartialResult` (or its `QuotaExhausted` subclass) carrying the combined partial DataFrame plus a `ChunkManifest` that records how many sub-requests of the cartesian-product plan completed. The same getter accepts the partial metadata back via a new `resume_from=` kwarg; the chunker validates the saved plan matches the fresh args and fetches only the remaining cartesian-product combinations. Callers concatenate their saved partial frame with the resume call's return value to reconstruct the full result.

Caller flow
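A rough sketch of the catch / accumulate / resume loop from the caller's side. To run without network access it uses hypothetical stand-ins: `make_fake_getter` simulates a getter that hits a toy quota, the `PartialResult` class is simplified, the metadata is a plain dict, and rows are plain lists rather than DataFrames. With a real getter the same loop would pass `resume_from=exc.partial_metadata` and combine the pieces with `pd.concat`. The one-hour default deadline follows the hourly rate-limit window mentioned in the commit messages; the 60-second wait is an arbitrary choice.

```python
import time


class PartialResult(Exception):
    """Simplified stand-in for the PR's exception."""

    def __init__(self, partial_frame, partial_metadata):
        super().__init__("chunked call failed partway through")
        self.partial_frame = partial_frame
        self.partial_metadata = partial_metadata


def make_fake_getter(calls_per_window=2):
    """Build a getter that fetches one site per sub-request under a toy quota."""

    def getter(sites, resume_from=None):
        done = resume_from["completed"] if resume_from else 0
        rows = []
        for used, site in enumerate(sites[done:]):
            if used >= calls_per_window:
                # Quota spent: hand back only what this call fetched.
                raise PartialResult(rows, {"completed": done + len(rows)})
            rows.append({"site": site, "value": len(site)})
        return rows, {"completed": len(sites)}

    return getter


def fetch_all(getter, sites, deadline_s=3600.0, wait_s=60.0, sleep=time.sleep):
    """Catch PartialResult, keep the partial rows, and resume until done."""
    frames, resume_from = [], None
    give_up_at = time.monotonic() + deadline_s
    while True:
        try:
            rows, _ = getter(sites, resume_from=resume_from)
            frames.append(rows)
            # Real code would use pd.concat(frames) on DataFrames.
            return [row for frame in frames for row in frame]
        except PartialResult as exc:
            frames.append(exc.partial_frame)
            resume_from = exc.partial_metadata
            if time.monotonic() + wait_s > give_up_at:
                raise  # past the deadline: surface the failure, do not spin
            sleep(wait_s)
```

Each `PartialResult` contributes only the rows its own call fetched, so appending every partial frame and the final return value in order reconstructs the full result without duplicates.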
Design
`ChunkManifest` is a frozen dataclass pinning the normalized cartesian-product plan and the completed count. Pinning the plan (not just the input args) lets resume detect when a caller has changed their inputs between the original call and the retry; same-looking args that chunk differently would silently re-fetch wrong sub-ranges. Resume rejects four invalid states with explicit error messages: no manifest on the metadata, current args don't chunk, current plan differs from the saved plan, or the saved manifest is already complete.

`PartialResult` is the new base exception; `QuotaExhausted` now subclasses it and carries the manifest instead of the bare `completed_chunks` / `total_chunks` counts of #280. Any sub-request exception (including PR #279's `_walk_pages` `RuntimeError`) is wrapped via `raise PartialResult(...) from exc`, so the original cause stays accessible via `__cause__`. On a first-chunk-failed scenario, the chunker synthesizes a minimal `requests.Response` carrying just the canonical URL + manifest so caller-side `BaseMetadata.chunk_manifest` always works.

The manifest is also attached to `BaseMetadata.chunk_manifest` on successful chunked calls so callers can log `md.chunk_manifest` to confirm fan-out and observe sub-request count.

Surface
- `resume_from: BaseMetadata | None = None` added to all 11 chunked-getter signatures (`get_daily`, `get_continuous`, `get_monitoring_locations`, `get_time_series_metadata`, `get_combined_metadata`, `get_latest_continuous`, `get_latest_daily`, `get_field_measurements`, `get_field_measurements_metadata`, `get_peaks`, `get_channel`).
- `ChunkManifest`, `PartialResult`, `_normalize_plan`.

Tests
14 new chunker tests cover: manifest properties (total / is_complete / remaining), frozen-dataclass immutability and hashability, normalized-plan order sensitivity, manifest attached on successful chunked calls, no manifest on pass-through, `BaseMetadata.chunk_manifest` round-trip, `PartialResult` wrapping of fetch exceptions with `__cause__` chained, empty-frame handling on first-chunk failure, `partial_metadata` lazy wrapping, resume happy path (skip completed chunks), and four resume rejection paths (mismatched plan / no manifest / already-complete / no chunking). Plus an end-to-end test that exhausts quota partway through, then resumes to complete the query and verifies frame concat reproduces a single-call equivalent.

All 59 chunker / utility tests pass; 42 chunker-specific tests focused.