fix: correctness and security hardening across the HTTP stack#6
Merged
Conversation
…and serde Resolve a batch of correctness and security issues found while reviewing the HTTP stack end to end. Pipeline & auth: - Credential policies are origin-aware: after a cross-origin redirect the bearer/basic/key credential is no longer re-stamped onto the foreign host, and the redirect policy strips a caller-set Authorization only on cross-origin hops (same-origin hops keep it). - Retry decides single-use body buffering from the effective per-call retry total and no longer retries non-idempotent methods on read-phase errors; async body buffering runs off the event loop. - Pipeline construction rejects reuse of a policy instance already wired into another pipeline. Transports: - requests: never emit Content-Length and Transfer-Encoding together; frame known-length bodies by length (streamed, not buffered into memory) and chunk unknown-length bodies with any stale Content-Length removed. Close only a session the client owns; preserve repeated response headers; report the negotiated HTTP version; drop a misleading Content-Length under Content-Encoding. - urllib: stop following redirects inside the transport so the 3xx reaches the pipeline; map response-read failures into the SDK error hierarchy. - asyncio: send Host with the port for non-default ports; apply an SSL context only to https URLs; reject chunked and read connection-close framing instead of fabricating an empty body; detect chunked across multiple Transfer-Encoding lines; buffer the request body off the event loop. - httpx/aiohttp: set Content-Length for known-length bodies and pump sync body iterators on a worker thread. - All transports preserve a valid-but-unregistered status code instead of discarding the response. Serde, pagination, streaming, multipart: - Codec decodes Annotated[...] fields (including annotated Tristate) correctly, guards recursion depth, and wraps conversion failures in CodecError. - Link-header pagination reads every Link line and tolerates commas inside the target URI; cursor pagination accepts non-string cursors; JSON page parsing surfaces DeserializationError. - JSONL and SSE decoding map invalid UTF-8 onto the streaming error contract. - Multipart rejects CR/LF/NUL in field names, filenames, media types, custom part headers, and the boundary to prevent header injection. Also fixes the digest nonce-count reset, per-operation tracing event emission, URL-redactor fail-closed behaviour, proxy configuration parsing, and a number of smaller correctness and documentation issues. Adds regression tests throughout; the full suite, mypy --strict, and ruff all pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Decoding a PEP 695 generic dataclass (e.g. `Box[int]`, declared `class Box[T]`) raised `CodecError: name 'T' is not defined` on Python 3.12. The field annotations reference the class type parameters, and 3.12's `get_type_hints` does not place those parameters in scope (3.13+ resolves them automatically). Pass the class `__type_params__` through `get_type_hints`'s `localns` so the annotations resolve on every supported interpreter. Verified against the full suite on 3.12, 3.13, and 3.14. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A focused correctness and security pass over the HTTP stack — the pipeline policies, the four transport adapters, and the serde/pagination/streaming layers. The issues here mostly live in cross-layer interactions and transport wire behaviour that the existing unit tests didn't exercise.
Pipeline & auth
Authorizationonly on cross-origin hops, so ordinary same-origin redirects (e.g. a trailing-slash 301) keep it.retry_totaloverride no longer crashes a streamed body mid-retry. Non-idempotent methods (POST/PATCH) are no longer retried on read-phase errors, where the request may already have been processed. Async body buffering runs off the event loop.Transports
Content-LengthandTransfer-Encoding: chunkedtogether (the previous behaviour sent an unframed body under both headers). Known-length bodies are framed by length and streamed without buffering into memory; unknown-length bodies are chunked with any staleContent-Lengthremoved. Only a session the client created is closed; repeated response headers (e.g.Set-Cookie) are preserved; the negotiated HTTP version is reported; aContent-Lengththat would misdescribe a decompressed body is dropped.Hostwith the port for non-default ports; apply a caller-supplied SSL context only tohttpsURLs; reject chunked responses and read connection-close-framed bodies to EOF instead of fabricating an empty body; detectchunkedacross multipleTransfer-Encodinglines; buffer the request body off the event loop.Content-Lengthfor known-length bodies (instead of always chunking) and pump sync body iterators on a worker thread rather than the event loop.Serde, pagination, streaming, multipart
Annotated[...]fields (including annotatedTristate) correctly, guards recursion depth, and wraps conversion failures inCodecError.Linkline and tolerates commas inside the target URI; cursor pagination accepts non-string cursors; malformed JSON pages surfaceDeserializationError.Also includes the digest nonce-count reset, per-operation (rather than per-attempt) tracing events, fail-closed URL redaction, more robust proxy-configuration parsing, and a number of smaller correctness and documentation fixes.
Public API
One intentional signature change:
HttpTracer.request_sentnow acceptsint | Noneso an unknown-length upload still emits the event. The committed surface baseline is regenerated accordingly.Testing
uv run pytest -q→ 1398 passed (242 new regression tests),uv run mypy --strictclean,uv run ruff check/ruff format --checkclean.🤖 Generated with Claude Code