fix: correct per-operation tracing lifecycle and HTTP transport edge cases#7
Merged
Merged
Conversation
Pipeline tracing: - The per-operation HttpTracer lifecycle (operation_started / operation_succeeded / operation_failed) was emitted from TracingPolicy, which runs inside the retry and redirect wrappers and is re-entered once per attempt/hop. Gated on the first attempt, a call that failed once and then succeeded on a retry reported operation_failed and never operation_succeeded, and a redirect whose later hop failed reported operation_succeeded. - Add OperationTracingPolicy at a new outermost Stage.OPERATION that brackets the whole call from outside the wrappers, emitting operation_started before the chain and exactly one operation_succeeded / operation_failed on the final outcome. TracingPolicy keeps the per-attempt span and the per-request events (request_sent / response_*); default_pipeline wires both. Transports and utilities: - urllib: stop dropping a valid Content-Length under Content-Encoding. http.client does not decode content encodings, so the served body is the wire payload whose length is the upstream Content-Length; the header is accurate and is now surfaced (the decompressing requests/httpx/aiohttp adapters still drop it). - urllib and asyncio: report "Invalid status code" for an out-of-range status, matching the other adapters. - asyncio: match the chunked transfer-coding by token rather than substring so a coding such as "x-chunked" is not mistaken for chunked framing. - Share one cross-origin helper (http.common.url._origin) between the redirect and auth policies instead of duplicating it. Docs and the committed API surface baseline are updated accordingly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Add CHANGELOG entries for OperationTracingPolicy and the new outermost Stage.OPERATION, the TracingPolicy lifecycle split (with a migration note for pipelines assembled by hand), and the urllib/asyncio transport corrections (Content-Length under Content-Encoding, chunked-framing detection, and out-of-range status wording). - Document that the tracing and logging policies are sync-only and the async pipeline carries no tracing, in both the tracing module docstring and the changelog's scope boundaries. - List Stage.OPERATION among the pillar stages in the staged-builder docstring. - Remove leftover shorthand prefixes from a few test comments, keeping the explanatory text. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The README architecture diagram, stage-spacing note, policy surface table, and observability summary predated the new outermost operation-tracing stage. - Show OPERATION as the outermost pillar in the request-flow diagram. - List OperationTracingPolicy in the pipeline.policies surface table and the observability feature summary. - Correct the stage-spacing wording (values are spaced out rather than a fixed 100 apart) in both the README and the Stage docstring, now that the operation stage sits at 50. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CLAUDE.md's pipeline overview listed the shipped policies and the default stack order without the new operation-tracing stage that brackets the redirect/retry wrappers. - Add operation-tracing to the shipped-policies list and note it ships sync-only alongside logging and tracing. - Put operation-tracing at the head of the documented default stack order. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rationTracingPolicy The async pipeline already emitted attempt-level HttpTracer events (AsyncRetryPolicy's attempt_started/attempt_failed and AsyncRedirectPolicy's request_url_resolved) but never the operation lifecycle, so an async operator saw attempts and redirect hops with no operation_started bracket and no final operation_succeeded/operation_failed. The per-operation lifecycle had been extracted into the sync-only OperationTracingPolicy without an async counterpart, leaving the async tracer stream permanently incomplete. The HttpTracer callbacks are synchronous and transport-agnostic by contract, so there is no async barrier to emitting them — only a missing policy. - Add AsyncOperationTracingPolicy (async twin of OperationTracingPolicy) at the outermost Stage.OPERATION; it awaits only the downstream send. - Wire it as the outermost policy in default_async_pipeline and export it from pipeline.policies. - Cover it: the lifecycle fires exactly once and reflects the final outcome across async retry (success-after-failure and exhaustion) and redirect (success and late-hop failure), the tracing-disabled path, and the default async stack wiring. - Regenerate the public API surface baseline. - Correct the docs that described the async stack as carrying no tracing (CHANGELOG, README, CLAUDE.md, tracing module docstring): only the per-attempt OpenTelemetry span policy and structured logging remain sync-only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ingPolicy Splitting the per-operation lifecycle into OperationTracingPolicy left one case silent: a hand-built pipeline that lists TracingPolicy on its own keeps emitting per-attempt spans but no longer emits operation_started / operation_succeeded / operation_failed. With a real HttpTracer installed, that is a misconfiguration the operator should be told about rather than discover through missing telemetry. TracingPolicy now logs a one-time warning when it runs with a non-noop HttpTracer and no OperationTracingPolicy bracketed the operation (detected via a ctx.data marker the operation policy sets). A no-op tracer has no consumer, so it stays silent; default_pipeline wires both policies, so the common path never warns. The once-guard takes a lock so it stays correct under free-threaded CPython. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up correctness fixes on top of the HTTP-stack hardening in #6.
Pipeline tracing
The per-operation
HttpTracerlifecycle (operation_started/operation_succeeded/operation_failed) was emitted byTracingPolicy, which sits inside the retry and redirect wrappers and is re-entered once per attempt/hop. With the lifecycle gated on the first attempt, a call that failed on its first attempt and then succeeded on a retry emittedoperation_failedand neveroperation_succeeded— and a redirect whose later hop failed emittedoperation_succeeded. The reported operation outcome did not match what the caller actually observed, which skews error-rate / SLO telemetry for any retried or redirected request.A per-attempt policy that runs inside the wrappers cannot know whether a given failure will be retried, so the fix separates the two scopes the
HttpTracerinterface already implies:OperationTracingPolicy(new) sits at a new outermostStage.OPERATION, outside the redirect/retry wrappers, so a single entry brackets the whole operation. It emitsoperation_startedbefore the chain and exactly oneoperation_succeeded/operation_failedon the final outcome.TracingPolicykeeps its per-attempt span and per-request events (request_sent,response_headers_received,response_received).Behavior change for hand-built pipelines. Moving the lifecycle out of
TracingPolicymeans a pipeline that listsTracingPolicyon its own no longer emitsoperation_started/operation_succeeded/operation_failed; it emits only the per-attempt span and per-request events. This is intentional and unavoidable — a per-attempt policy cannot produce a correct operation outcome, so the lifecycle has to live in a separate outer policy.default_pipelinewires both, so callers using it are unaffected; callers composing policies by hand should addOperationTracingPolicyalongsideTracingPolicy. So the change is not silent, aTracingPolicythat runs with a realHttpTracerbut noOperationTracingPolicybracketing it logs a one-time warning.Async tracing
The async pipeline already emitted attempt-level tracer events (
AsyncRetryPolicy→attempt_started/attempt_failed,AsyncRedirectPolicy→request_url_resolved) but had no operation lifecycle at all, so an async operator saw attempts and redirect hops with nooperation_startedbracket and no final outcome. Because theHttpTracercallbacks are synchronous,AsyncOperationTracingPolicy(new) mirrorsOperationTracingPolicyatStage.OPERATION, anddefault_async_pipelinewires it as the outermost policy — completing the async lifecycle so it matches the sync stack. The per-attempt OpenTelemetry span policy (TracingPolicy) andLoggingPolicyremain sync-only.Regression tests cover exception-retry→success and redirect-then-late-hop-failure on both the sync and async stacks.
Transports & utilities
Content-LengthunderContent-Encoding.http.clientdoes not decode content encodings, so the served body is the wire payload whose length equalsContent-Length; the header is accurate and is surfaced. (The decompressing requests/httpx/aiohttp adapters still drop it, correctly.)Invalid status codefor an out-of-range status, matching the other adapters.chunkedtransfer-coding by token, not substring, so a coding likex-chunkedis not mistaken for chunked framing.http.common.url._origin) between the redirect and auth policies instead of duplicating it.Testing
uv run pytest -q→ 1412 passed,uv run mypy --strictclean,uv run ruff check/ruff format --checkclean. The committed API surface baseline is regenerated for the newOperationTracingPolicyandAsyncOperationTracingPolicy.🤖 Generated with Claude Code