Skip to content

refactor(provider): unify inbound headers as opts.headers (drop 4th/5th args)#70

Closed
Menci wants to merge 6 commits into
precursor-provider-interface-plumbingfrom
precursor-headers-unification
Closed

refactor(provider): unify inbound headers as opts.headers (drop 4th/5th args)#70
Menci wants to merge 6 commits into
precursor-provider-interface-plumbingfrom
precursor-headers-unification

Conversation

@Menci

@Menci Menci commented Jun 19, 2026

Copy link
Copy Markdown
Owner

Summary

Consolidates the three current inbound-headers conduits on the provider boundary into a single opts.headers: Headers. Stacked on #62.

Before: callMessages had headers (4th arg, Record), anthropicBeta (5th arg, array), and opts.clientRequestHeaders (Headers|Record). Three things saying "inbound headers". Now unified:

callMessages(model, body, signal, opts)
// opts.headers: Headers (always present)

Migration

  • All call* methods on ModelProvider lose their 4th headers and 5th anthropicBeta parameters; provider impls read everything off opts.headers.
  • Provider boundary-ctx headers fields become Headers; interceptors call .set/.get/.delete on opts.headers instead of Record indexing.
  • UpstreamFetchOptions.extraHeaders is Headers too; upstream fetch helpers iterate it directly.
  • Copilot's withAnthropicBetaHeaderFiltered now headers.delete('anthropic-beta') before its policy branches — with the unified bag seeded from the inbound header, leaving the unfiltered raw value would 400 against Copilot's allow-list.
  • mergeAnthropicBetaHeader is dead — Azure and custom just pass opts.headers through.
  • Messages attempt encodes inbound anthropic-beta (read from ctx.clientRequestHeaders) onto invocation.headers so it rides the same bag the interceptor chain mutates. The HTTP-layer parseAnthropicBeta helper is gone; Copilot rebuilds the typed slice from opts.headers.get('anthropic-beta') at provider entry for its variant-selection branch.
  • noopUpstreamCallOptions.headers is a getter returning a fresh Headers per access so tests that mutate the bag do not bleed state.

Naming

Per review direction: field is headers; context infers in/out direction (in UpstreamCallOptions inbound, in ProviderStreamResult.ok:true.headers outbound — see #64).

Stack

Stacked on #62 (introduces the clientRequestHeaders field this PR renames + unifies). GitHub auto-retargets to main when #62 merges.

Test plan

  • typecheck clean
  • 2704 vitest tests passing
  • eslint clean

Menci added 6 commits June 20, 2026 00:39
…CallOptions

Add optional `clientRequestHeaders` and `clientRequestPathname` to
`UpstreamCallOptions` and capture them once on GatewayCtx from the inbound
Hono request. Each `attempt.ts` consumes them via a new
`buildUpstreamCallOptions(candidate, ctx, recorder)` helper so the build is
uniform across messages / responses / chat-completions and across the
generate / count-tokens / compact entry points.

The captured Headers is the native `Headers` instance the runtime gave us,
not a snapshot — header lookups on the hot path stay native. The
UpstreamCallOptions field accepts either `Headers` or a plain
`Record<string, string>` so synthetic test sites can pass a record without
boxing.

Synthetic ctxs (in-process tests, translation chains that never originated
as a real HTTP request) leave both fields undefined. Consumers that key on
the inbound request shape (e.g. native-client shaped-passthrough detection)
must treat undefined as "no inbound context, do not shortcut".
…and-forget persistence

Add an optional `waitUntil` slot on `UpstreamCallOptions` so providers can
register fire-and-forget promises that need to outlive the response. Reuses
the existing per-request backgroundScheduler — they share the runtime
primitive (`ExecutionContext.waitUntil` on workerd, a no-op on Node) and the
per-call name reads more naturally at the provider boundary.

Wire it through the three places that build `UpstreamCallOptions`: the
attempt-layer helper (messages / responses / chat-completions), the
passthrough-serve path (embeddings / images / count_tokens), and the
inline construction inside the responses-side image-generation server tool.

Codex's transport is the first consumer. Three best-effort state writes —
the quota snapshot on 2xx, the quota snapshot on 429, and the freshly-minted
access token on the 401-retry path — used to rely on a bare
`.catch(() => {})` fire-and-forget. On workerd a fire-and-forget promise is
cancelled the instant the streaming response returns to the client, so the
write could be dropped mid-flight and the next request would re-mint or
re-race the upstream. Hand each write to the runtime's waitUntil slot via
a single `registerBackgroundWrite` helper. The error swallow stays (these
writes are recoverable noise — CAS losses to concurrent rotations,
transient storage failures — and the request must not crash on them), but
the swallow now happens inside a lifetime the runtime promises us.

The slot is optional on UpstreamCallOptions because synthetic call sites
(in-process tests, internal compaction) legitimately omit it; the optional
chain in `registerBackgroundWrite` degrades cleanly for those.
…king stack trace

The Codex upstream only exposes /responses; the catalog mapper advertises
that single endpoint and getProvidedModels is the only place the data
plane learns what surfaces are reachable. Even so, a stray dispatch from
an upstream routing bug would land in one of the stub `call*` methods on
the provider — and the stubs rejected with a bare `Error`, which escaped
the boundary as a 500 with a raw stack trace.

Replace each stub with a synthetic 405 carrying the gateway's standard
JSON error envelope (`{ error: { type, message } }`). Each call still
flows through the per-call `recordUpstreamLatency` recorder to honour
the wrap-once contract that every boundary-facing response must satisfy,
including synthetic ones that never touch the network.

The existing throw-asserting test is replaced with one that exercises
every unsupported surface (now including callMessages and
callChatCompletions, which the previous assertion missed) and verifies
the 405 envelope.
Reviewed the 3-commit precursor diff as a single final state and
removed the scaffolding that crept in:

- UpstreamCallOptions: drop the speculative `Headers | Record<string, string>`
  union on clientRequestHeaders (no caller passes a Record) and the matching
  JSDoc block; trim the per-field preamble to one short paragraph per field.
  Make `waitUntil` required — every production callsite supplies it via
  ctx.backgroundScheduler or its local scheduler, and the test stub already
  passes a no-op, so the optional chain in codex/fetch.ts defended a case
  that cannot occur.
- codex/fetch.ts: collapse the 9-line comment on registerBackgroundWrite to
  one line and drop the now-unneeded `?.()` on opts.call.waitUntil.
- codex/provider.ts: delete the duplicate rationale comment on synthetic405
  (the call-site cluster already explains the why).
- gateway-ctx.ts: trim the clientRequest* block and inline the one-shot
  `new URL(c.req.url).pathname` instead of allocating a URL twice.
- attempt-helpers.ts: trim the 9-line preamble on buildUpstreamCallOptions
  and drop the conditional spread — straight assignment of an optional
  source to an optional target is equivalent.
- test-utils/stubs.ts: drop the redundant `waitUntil discards the promise`
  sentence; the `() => {}` is self-explanatory.
- codex/fetch_test.ts: enforcingRecorder now supplies waitUntil to match
  the tightened UpstreamCallOptions shape.

Also fixes a preexisting import-order lint error in responses/websocket_test.ts
that was inherited from the branch base (origin/main has the fixed order
in a commit our base predates); reordering here keeps `pnpm run lint`
green for this PR.
The `clientRequestPathname` field on `UpstreamCallOptions` was added in
aacceae anticipating a consumer that keys on the inbound pathname (e.g.
native-client shaped-passthrough detection). No provider in this PR reads
it, and the would-be downstream consumer in the Claude Code provider is
out of scope here. Carrying a field with no consumer obscures the
interface and invites speculative wiring elsewhere — drop it now and add
it back the day a real callsite needs it.

Removes the field from `UpstreamCallOptions`, the matching `GatewayCtx`
mirror, both gateway-ctx constructors, and the projection in
`buildUpstreamCallOptions`. `clientRequestHeaders` stays because the
upstream provider transports do consult it (Copilot's
shaped-passthrough leg in particular).
…th args)

Consolidates the three inbound-headers conduits on the provider boundary
into a single `opts.headers: Headers` on `UpstreamCallOptions`. Before:

  callMessages(model, body, signal, headers, anthropicBeta, opts)
  // plus opts.clientRequestHeaders

After:

  callMessages(model, body, signal, opts)
  // opts.headers: Headers (always present)

The 4th `headers` (Record, interceptor-mutated) and 5th `anthropicBeta`
(typed array) arguments are gone from `callMessages` / `callResponses` /
`callChatCompletions` / `callMessagesCountTokens` / `callEmbeddings` /
`callImagesGenerations` / `callImagesEdits`. `opts.clientRequestHeaders`
is renamed to `opts.headers` and narrowed to `Headers` (always present;
fresh `new Headers()` for translated and synthesized callsites).

Provider boundary-ctx headers are now `Headers` too — interceptors mutate
via `.set('x-foo', 'bar')` / `.delete('x-foo')` / `.get('x-foo')` instead
of Record indexing. The same shape goes all the way down to the upstream
fetch helpers: `UpstreamFetchOptions.extraHeaders` is `Headers`, and the
helpers iterate it directly. The `mergeAnthropicBetaHeader` helper is
dead — Azure and custom just pass `opts.headers` through.

The Copilot Messages chain still needs the typed `anthropicBeta` slice
for raw-variant selection (claude carve-outs run before the wire header
is filtered down to the Copilot allow-list). The provider parses it back
out of `opts.headers.get('anthropic-beta')` once at entry and exposes it
as `MessagesBoundaryCtx.anthropicBeta` for variant selection, identical
to the old typed-arg path. The new
`withAnthropicBetaHeaderFiltered` interceptor also has to
`opts.headers.delete('anthropic-beta')` before policy branches: with the
unified bag seeded from the inbound header, leaving the unfiltered raw
value in place would 400 on Copilot's allow-list.

Gateway-side `MessagesInvocation` / `ResponsesInvocation` / etc. `headers`
fields likewise become `Headers`. The Messages attempt seeds the inbound
`anthropic-beta` from `c.req.raw.headers` into `invocation.headers` so it
rides the same bag the interceptor chain mutates. `parseAnthropicBeta` in
the HTTP layer is gone — the typed slice is rebuilt where it is consumed.

Test fixtures, `noopUpstreamCallOptions`, and `stubProvider` updated to
match — `noopUpstreamCallOptions.headers` is a getter that returns a fresh
`Headers` per access so tests that mutate the bag do not bleed state.
@Menci Menci force-pushed the precursor-provider-interface-plumbing branch from 9c9a007 to 1259127 Compare June 20, 2026 04:06
@Menci

Menci commented Jun 20, 2026

Copy link
Copy Markdown
Owner Author

Folded into #62; the unified opts.headers + dropped 4th/5th args + interceptor updates now ship as a single coherent PR.

@Menci Menci closed this Jun 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant