Skip to content

fix: correct per-operation tracing lifecycle and HTTP transport edge cases#7

Merged
OmarAlJarrah merged 6 commits into
mainfrom
fix/tracing-and-transport-followups
Jun 10, 2026
Merged

fix: correct per-operation tracing lifecycle and HTTP transport edge cases#7
OmarAlJarrah merged 6 commits into
mainfrom
fix/tracing-and-transport-followups

Conversation

@OmarAlJarrah

@OmarAlJarrah OmarAlJarrah commented Jun 10, 2026

Copy link
Copy Markdown
Member

Summary

Follow-up correctness fixes on top of the HTTP-stack hardening in #6.

Pipeline tracing

The per-operation HttpTracer lifecycle (operation_started / operation_succeeded / operation_failed) was emitted by TracingPolicy, which sits inside the retry and redirect wrappers and is re-entered once per attempt/hop. With the lifecycle gated on the first attempt, a call that failed on its first attempt and then succeeded on a retry emitted operation_failed and never operation_succeeded — and a redirect whose later hop failed emitted operation_succeeded. The reported operation outcome did not match what the caller actually observed, which skews error-rate / SLO telemetry for any retried or redirected request.

A per-attempt policy that runs inside the wrappers cannot know whether a given failure will be retried, so the fix separates the two scopes the HttpTracer interface already implies:

  • OperationTracingPolicy (new) sits at a new outermost Stage.OPERATION, outside the redirect/retry wrappers, so a single entry brackets the whole operation. It emits operation_started before the chain and exactly one operation_succeeded / operation_failed on the final outcome.
  • TracingPolicy keeps its per-attempt span and per-request events (request_sent, response_headers_received, response_received).

Behavior change for hand-built pipelines. Moving the lifecycle out of TracingPolicy means a pipeline that lists TracingPolicy on its own no longer emits operation_started / operation_succeeded / operation_failed; it emits only the per-attempt span and per-request events. This is intentional and unavoidable — a per-attempt policy cannot produce a correct operation outcome, so the lifecycle has to live in a separate outer policy. default_pipeline wires both, so callers using it are unaffected; callers composing policies by hand should add OperationTracingPolicy alongside TracingPolicy. So the change is not silent, a TracingPolicy that runs with a real HttpTracer but no OperationTracingPolicy bracketing it logs a one-time warning.

Async tracing

The async pipeline already emitted attempt-level tracer events (AsyncRetryPolicyattempt_started / attempt_failed, AsyncRedirectPolicyrequest_url_resolved) but had no operation lifecycle at all, so an async operator saw attempts and redirect hops with no operation_started bracket and no final outcome. Because the HttpTracer callbacks are synchronous, AsyncOperationTracingPolicy (new) mirrors OperationTracingPolicy at Stage.OPERATION, and default_async_pipeline wires it as the outermost policy — completing the async lifecycle so it matches the sync stack. The per-attempt OpenTelemetry span policy (TracingPolicy) and LoggingPolicy remain sync-only.

Regression tests cover exception-retry→success and redirect-then-late-hop-failure on both the sync and async stacks.

Transports & utilities

  • urllib: no longer drops a valid Content-Length under Content-Encoding. http.client does not decode content encodings, so the served body is the wire payload whose length equals Content-Length; the header is accurate and is surfaced. (The decompressing requests/httpx/aiohttp adapters still drop it, correctly.)
  • urllib + asyncio: report Invalid status code for an out-of-range status, matching the other adapters.
  • asyncio: match the chunked transfer-coding by token, not substring, so a coding like x-chunked is not mistaken for chunked framing.
  • Share one cross-origin helper (http.common.url._origin) between the redirect and auth policies instead of duplicating it.

Testing

uv run pytest -q1412 passed, uv run mypy --strict clean, uv run ruff check / ruff format --check clean. The committed API surface baseline is regenerated for the new OperationTracingPolicy and AsyncOperationTracingPolicy.

🤖 Generated with Claude Code

OmarAlJarrah and others added 6 commits June 10, 2026 19:45
Pipeline tracing:
- The per-operation HttpTracer lifecycle (operation_started /
  operation_succeeded / operation_failed) was emitted from TracingPolicy,
  which runs inside the retry and redirect wrappers and is re-entered once per
  attempt/hop. Gated on the first attempt, a call that failed once and then
  succeeded on a retry reported operation_failed and never operation_succeeded,
  and a redirect whose later hop failed reported operation_succeeded.
- Add OperationTracingPolicy at a new outermost Stage.OPERATION that brackets
  the whole call from outside the wrappers, emitting operation_started before
  the chain and exactly one operation_succeeded / operation_failed on the final
  outcome. TracingPolicy keeps the per-attempt span and the per-request events
  (request_sent / response_*); default_pipeline wires both.

Transports and utilities:
- urllib: stop dropping a valid Content-Length under Content-Encoding.
  http.client does not decode content encodings, so the served body is the
  wire payload whose length is the upstream Content-Length; the header is
  accurate and is now surfaced (the decompressing requests/httpx/aiohttp
  adapters still drop it).
- urllib and asyncio: report "Invalid status code" for an out-of-range status,
  matching the other adapters.
- asyncio: match the chunked transfer-coding by token rather than substring so
  a coding such as "x-chunked" is not mistaken for chunked framing.
- Share one cross-origin helper (http.common.url._origin) between the redirect
  and auth policies instead of duplicating it.

Docs and the committed API surface baseline are updated accordingly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Add CHANGELOG entries for OperationTracingPolicy and the new outermost
  Stage.OPERATION, the TracingPolicy lifecycle split (with a migration note
  for pipelines assembled by hand), and the urllib/asyncio transport
  corrections (Content-Length under Content-Encoding, chunked-framing
  detection, and out-of-range status wording).
- Document that the tracing and logging policies are sync-only and the async
  pipeline carries no tracing, in both the tracing module docstring and the
  changelog's scope boundaries.
- List Stage.OPERATION among the pillar stages in the staged-builder docstring.
- Remove leftover shorthand prefixes from a few test comments, keeping the
  explanatory text.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The README architecture diagram, stage-spacing note, policy surface table,
and observability summary predated the new outermost operation-tracing stage.

- Show OPERATION as the outermost pillar in the request-flow diagram.
- List OperationTracingPolicy in the pipeline.policies surface table and the
  observability feature summary.
- Correct the stage-spacing wording (values are spaced out rather than a fixed
  100 apart) in both the README and the Stage docstring, now that the operation
  stage sits at 50.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CLAUDE.md's pipeline overview listed the shipped policies and the default
stack order without the new operation-tracing stage that brackets the
redirect/retry wrappers.

- Add operation-tracing to the shipped-policies list and note it ships
  sync-only alongside logging and tracing.
- Put operation-tracing at the head of the documented default stack order.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rationTracingPolicy

The async pipeline already emitted attempt-level HttpTracer events
(AsyncRetryPolicy's attempt_started/attempt_failed and AsyncRedirectPolicy's
request_url_resolved) but never the operation lifecycle, so an async operator
saw attempts and redirect hops with no operation_started bracket and no final
operation_succeeded/operation_failed. The per-operation lifecycle had been
extracted into the sync-only OperationTracingPolicy without an async
counterpart, leaving the async tracer stream permanently incomplete.

The HttpTracer callbacks are synchronous and transport-agnostic by contract, so
there is no async barrier to emitting them — only a missing policy.

- Add AsyncOperationTracingPolicy (async twin of OperationTracingPolicy) at the
  outermost Stage.OPERATION; it awaits only the downstream send.
- Wire it as the outermost policy in default_async_pipeline and export it from
  pipeline.policies.
- Cover it: the lifecycle fires exactly once and reflects the final outcome
  across async retry (success-after-failure and exhaustion) and redirect
  (success and late-hop failure), the tracing-disabled path, and the default
  async stack wiring.
- Regenerate the public API surface baseline.
- Correct the docs that described the async stack as carrying no tracing
  (CHANGELOG, README, CLAUDE.md, tracing module docstring): only the
  per-attempt OpenTelemetry span policy and structured logging remain sync-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ingPolicy

Splitting the per-operation lifecycle into OperationTracingPolicy left one case
silent: a hand-built pipeline that lists TracingPolicy on its own keeps emitting
per-attempt spans but no longer emits operation_started / operation_succeeded /
operation_failed. With a real HttpTracer installed, that is a misconfiguration
the operator should be told about rather than discover through missing telemetry.

TracingPolicy now logs a one-time warning when it runs with a non-noop
HttpTracer and no OperationTracingPolicy bracketed the operation (detected via a
ctx.data marker the operation policy sets). A no-op tracer has no consumer, so
it stays silent; default_pipeline wires both policies, so the common path never
warns. The once-guard takes a lock so it stays correct under free-threaded
CPython.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@OmarAlJarrah OmarAlJarrah merged commit 97f0a10 into main Jun 10, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant