refactor(provider): unify inbound headers as opts.headers (drop 4th/5th args)#70
Closed
Menci wants to merge 6 commits into
Closed
refactor(provider): unify inbound headers as opts.headers (drop 4th/5th args)#70Menci wants to merge 6 commits into
Menci wants to merge 6 commits into
Conversation
…CallOptions Add optional `clientRequestHeaders` and `clientRequestPathname` to `UpstreamCallOptions` and capture them once on GatewayCtx from the inbound Hono request. Each `attempt.ts` consumes them via a new `buildUpstreamCallOptions(candidate, ctx, recorder)` helper so the build is uniform across messages / responses / chat-completions and across the generate / count-tokens / compact entry points. The captured Headers is the native `Headers` instance the runtime gave us, not a snapshot — header lookups on the hot path stay native. The UpstreamCallOptions field accepts either `Headers` or a plain `Record<string, string>` so synthetic test sites can pass a record without boxing. Synthetic ctxs (in-process tests, translation chains that never originated as a real HTTP request) leave both fields undefined. Consumers that key on the inbound request shape (e.g. native-client shaped-passthrough detection) must treat undefined as "no inbound context, do not shortcut".
…and-forget persistence
Add an optional `waitUntil` slot on `UpstreamCallOptions` so providers can
register fire-and-forget promises that need to outlive the response. Reuses
the existing per-request backgroundScheduler — they share the runtime
primitive (`ExecutionContext.waitUntil` on workerd, a no-op on Node) and the
per-call name reads more naturally at the provider boundary.
Wire it through the three places that build `UpstreamCallOptions`: the
attempt-layer helper (messages / responses / chat-completions), the
passthrough-serve path (embeddings / images / count_tokens), and the
inline construction inside the responses-side image-generation server tool.
Codex's transport is the first consumer. Three best-effort state writes —
the quota snapshot on 2xx, the quota snapshot on 429, and the freshly-minted
access token on the 401-retry path — used to rely on a bare
`.catch(() => {})` fire-and-forget. On workerd a fire-and-forget promise is
cancelled the instant the streaming response returns to the client, so the
write could be dropped mid-flight and the next request would re-mint or
re-race the upstream. Hand each write to the runtime's waitUntil slot via
a single `registerBackgroundWrite` helper. The error swallow stays (these
writes are recoverable noise — CAS losses to concurrent rotations,
transient storage failures — and the request must not crash on them), but
the swallow now happens inside a lifetime the runtime promises us.
The slot is optional on UpstreamCallOptions because synthetic call sites
(in-process tests, internal compaction) legitimately omit it; the optional
chain in `registerBackgroundWrite` degrades cleanly for those.
…king stack trace
The Codex upstream only exposes /responses; the catalog mapper advertises
that single endpoint and getProvidedModels is the only place the data
plane learns what surfaces are reachable. Even so, a stray dispatch from
an upstream routing bug would land in one of the stub `call*` methods on
the provider — and the stubs rejected with a bare `Error`, which escaped
the boundary as a 500 with a raw stack trace.
Replace each stub with a synthetic 405 carrying the gateway's standard
JSON error envelope (`{ error: { type, message } }`). Each call still
flows through the per-call `recordUpstreamLatency` recorder to honour
the wrap-once contract that every boundary-facing response must satisfy,
including synthetic ones that never touch the network.
The existing throw-asserting test is replaced with one that exercises
every unsupported surface (now including callMessages and
callChatCompletions, which the previous assertion missed) and verifies
the 405 envelope.
Reviewed the 3-commit precursor diff as a single final state and
removed the scaffolding that crept in:
- UpstreamCallOptions: drop the speculative `Headers | Record<string, string>`
union on clientRequestHeaders (no caller passes a Record) and the matching
JSDoc block; trim the per-field preamble to one short paragraph per field.
Make `waitUntil` required — every production callsite supplies it via
ctx.backgroundScheduler or its local scheduler, and the test stub already
passes a no-op, so the optional chain in codex/fetch.ts defended a case
that cannot occur.
- codex/fetch.ts: collapse the 9-line comment on registerBackgroundWrite to
one line and drop the now-unneeded `?.()` on opts.call.waitUntil.
- codex/provider.ts: delete the duplicate rationale comment on synthetic405
(the call-site cluster already explains the why).
- gateway-ctx.ts: trim the clientRequest* block and inline the one-shot
`new URL(c.req.url).pathname` instead of allocating a URL twice.
- attempt-helpers.ts: trim the 9-line preamble on buildUpstreamCallOptions
and drop the conditional spread — straight assignment of an optional
source to an optional target is equivalent.
- test-utils/stubs.ts: drop the redundant `waitUntil discards the promise`
sentence; the `() => {}` is self-explanatory.
- codex/fetch_test.ts: enforcingRecorder now supplies waitUntil to match
the tightened UpstreamCallOptions shape.
Also fixes a preexisting import-order lint error in responses/websocket_test.ts
that was inherited from the branch base (origin/main has the fixed order
in a commit our base predates); reordering here keeps `pnpm run lint`
green for this PR.
The `clientRequestPathname` field on `UpstreamCallOptions` was added in aacceae anticipating a consumer that keys on the inbound pathname (e.g. native-client shaped-passthrough detection). No provider in this PR reads it, and the would-be downstream consumer in the Claude Code provider is out of scope here. Carrying a field with no consumer obscures the interface and invites speculative wiring elsewhere — drop it now and add it back the day a real callsite needs it. Removes the field from `UpstreamCallOptions`, the matching `GatewayCtx` mirror, both gateway-ctx constructors, and the projection in `buildUpstreamCallOptions`. `clientRequestHeaders` stays because the upstream provider transports do consult it (Copilot's shaped-passthrough leg in particular).
…th args)
Consolidates the three inbound-headers conduits on the provider boundary
into a single `opts.headers: Headers` on `UpstreamCallOptions`. Before:
callMessages(model, body, signal, headers, anthropicBeta, opts)
// plus opts.clientRequestHeaders
After:
callMessages(model, body, signal, opts)
// opts.headers: Headers (always present)
The 4th `headers` (Record, interceptor-mutated) and 5th `anthropicBeta`
(typed array) arguments are gone from `callMessages` / `callResponses` /
`callChatCompletions` / `callMessagesCountTokens` / `callEmbeddings` /
`callImagesGenerations` / `callImagesEdits`. `opts.clientRequestHeaders`
is renamed to `opts.headers` and narrowed to `Headers` (always present;
fresh `new Headers()` for translated and synthesized callsites).
Provider boundary-ctx headers are now `Headers` too — interceptors mutate
via `.set('x-foo', 'bar')` / `.delete('x-foo')` / `.get('x-foo')` instead
of Record indexing. The same shape goes all the way down to the upstream
fetch helpers: `UpstreamFetchOptions.extraHeaders` is `Headers`, and the
helpers iterate it directly. The `mergeAnthropicBetaHeader` helper is
dead — Azure and custom just pass `opts.headers` through.
The Copilot Messages chain still needs the typed `anthropicBeta` slice
for raw-variant selection (claude carve-outs run before the wire header
is filtered down to the Copilot allow-list). The provider parses it back
out of `opts.headers.get('anthropic-beta')` once at entry and exposes it
as `MessagesBoundaryCtx.anthropicBeta` for variant selection, identical
to the old typed-arg path. The new
`withAnthropicBetaHeaderFiltered` interceptor also has to
`opts.headers.delete('anthropic-beta')` before policy branches: with the
unified bag seeded from the inbound header, leaving the unfiltered raw
value in place would 400 on Copilot's allow-list.
Gateway-side `MessagesInvocation` / `ResponsesInvocation` / etc. `headers`
fields likewise become `Headers`. The Messages attempt seeds the inbound
`anthropic-beta` from `c.req.raw.headers` into `invocation.headers` so it
rides the same bag the interceptor chain mutates. `parseAnthropicBeta` in
the HTTP layer is gone — the typed slice is rebuilt where it is consumed.
Test fixtures, `noopUpstreamCallOptions`, and `stubProvider` updated to
match — `noopUpstreamCallOptions.headers` is a getter that returns a fresh
`Headers` per access so tests that mutate the bag do not bleed state.
9c9a007 to
1259127
Compare
Owner
Author
|
Folded into #62; the unified opts.headers + dropped 4th/5th args + interceptor updates now ship as a single coherent PR. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Consolidates the three current inbound-headers conduits on the provider boundary into a single
opts.headers: Headers. Stacked on #62.Before:
callMessageshadheaders(4th arg, Record),anthropicBeta(5th arg, array), andopts.clientRequestHeaders(Headers|Record). Three things saying "inbound headers". Now unified:Migration
call*methods onModelProviderlose their 4thheadersand 5thanthropicBetaparameters; provider impls read everything offopts.headers.headersfields becomeHeaders; interceptors call.set/.get/.deleteonopts.headersinstead of Record indexing.UpstreamFetchOptions.extraHeadersisHeaderstoo; upstream fetch helpers iterate it directly.withAnthropicBetaHeaderFilterednowheaders.delete('anthropic-beta')before its policy branches — with the unified bag seeded from the inbound header, leaving the unfiltered raw value would 400 against Copilot's allow-list.mergeAnthropicBetaHeaderis dead — Azure and custom just passopts.headersthrough.anthropic-beta(read fromctx.clientRequestHeaders) ontoinvocation.headersso it rides the same bag the interceptor chain mutates. The HTTP-layerparseAnthropicBetahelper is gone; Copilot rebuilds the typed slice fromopts.headers.get('anthropic-beta')at provider entry for its variant-selection branch.noopUpstreamCallOptions.headersis a getter returning a freshHeadersper access so tests that mutate the bag do not bleed state.Naming
Per review direction: field is
headers; context infers in/out direction (inUpstreamCallOptionsinbound, inProviderStreamResult.ok:true.headersoutbound — see #64).Stack
Stacked on #62 (introduces the
clientRequestHeadersfield this PR renames + unifies). GitHub auto-retargets tomainwhen #62 merges.Test plan