diff --git a/dto/claude.go b/dto/claude.go
index d7fed412aaa..3d79bfa3cf1 100644
--- a/dto/claude.go
+++ b/dto/claude.go
@@ -171,9 +171,17 @@ func (c *ClaudeMessage) ParseContent() ([]ClaudeMediaMessage, error) {
}
type Tool struct {
- Name string `json:"name"`
- Description string `json:"description,omitempty"`
- InputSchema map[string]interface{} `json:"input_schema"`
+ Name string `json:"name"`
+ Description string `json:"description,omitempty"`
+ InputSchema map[string]interface{} `json:"input_schema"`
+ CacheControl *ClaudeCacheControl `json:"cache_control,omitempty"`
+}
+
+// ClaudeCacheControl mirrors Anthropic's prompt-caching marker.
+// See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
+type ClaudeCacheControl struct {
+ Type string `json:"type"`
+ TTL string `json:"ttl,omitempty"`
}
type InputSchema struct {
diff --git a/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/.openspec.yaml b/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/.openspec.yaml
new file mode 100644
index 00000000000..8b769149815
--- /dev/null
+++ b/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/.openspec.yaml
@@ -0,0 +1,2 @@
+schema: spec-driven
+created: 2026-05-20
diff --git a/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/design.md b/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/design.md
new file mode 100644
index 00000000000..87fc5954873
--- /dev/null
+++ b/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/design.md
@@ -0,0 +1,149 @@
+## Context
+
+The gateway today routes `POST /v1/responses` through a single relay dispatch and supports two upstream surface families: OpenAI-compatible (`/v1/chat/completions`, `/v1/responses` on OpenAI itself) and Anthropic Messages (`/v1/messages`). When a `/v1/responses` request is routed to an Anthropic-typed channel, no translation layer exists for either the request body or the streaming response, so the request fails. Adding the missing pipeline lets a single inbound request shape (Responses-API) be served by either upstream family.
+
+The reference behavioral surface (analyzed externally, source-free) establishes a stable contract: a **two-step pivot** through an intermediate Chat-Completions-shaped object, on both the request side and the response side. Reusing that pivot keeps each translator focused and gives a clean composition: Responses ↔ Chat-Completions ↔ Anthropic.
+
+The existing codebase already covers the Chat-Completions ↔ Anthropic legs end-to-end:
+
+- `relay/channel/claude/relay-claude.go::RequestOpenAI2ClaudeMessage` — Chat-Completions request → Anthropic Messages request. Already handles system extraction, tool_use/tool_result ordering, image mapping (data: vs http:), `max_tokens` adjustment for thinking and tools, response_format JSON-mode shim, and merge of consecutive same-role messages.
+- `relay/channel/claude/relay-claude.go::ClaudeStreamHandler` (+ `StreamResponseClaude2OpenAI`, `FormatClaudeResponseInfo`) — streaming Anthropic response → Chat-Completions chunks, including cache-token decomposition and finish_reason mapping.
+- `relay/channel/claude/relay-claude.go::ClaudeHandler` (+ `ResponseClaude2OpenAI`) — non-streaming Anthropic response → Chat-Completions response.
+
+The only legs that do NOT yet exist are: Responses-request → Chat-Completions-request, and Chat-Completions-stream → Responses-events (plus a non-streaming variant of the latter). This change therefore adds exactly those legs as new functions under `service/openaicompat/`, plus one orchestration file under `relay/` that mirrors the existing `relay/chat_completions_via_responses.go` in the opposite direction.
+
+Other anchors used by this change:
+
+- The relay format dispatch keys off `info.RelayMode == relayconstant.RelayModeResponses` and `info.ApiType == appconstant.APITypeAnthropic`; the new translation triggers at that exact branch in `relay/responses_handler.go`.
+- The project's JSON wrapper (`common.Marshal`/`common.Unmarshal`) is mandatory (project Rule 1).
+- Env-var feature flags follow the `common.GetEnvOrDefaultBool("FLAG_NAME", default)` pattern (see `common/env.go`).
+
+## Goals / Non-Goals
+
+**Goals:**
+- Provide a complete, source-free behavioral specification of the two pipelines (request and response).
+- Maintain a clean separation: each translator function takes a body or chunk and returns the next-stage body or chunk, with no I/O side effects.
+- Preserve all existing behavior for non-Anthropic upstreams and for non-Responses inbound requests.
+- Express each behavioral invariant as an objectively checkable requirement in the capability spec.
+- Establish a per-stream state object that survives across chunk callbacks (sequence numbers, item indices, buffered reasoning text, tool-call open/close state).
+
+**Non-Goals:**
+- Picking the final Go package path (left for Phase 3).
+- Specifying internal struct names (left for Phase 3, beyond placeholders).
+- Modifying quota, billing, retry, or auto-ban behavior.
+- Adding new channel adaptors or external dependencies.
+
+## Decisions
+
+### D1. Two-step pivot through a Chat-Completions intermediate
+
+The translator does **not** map Responses-API ↔ Anthropic Messages directly. It maps Responses → ChatCompletions → AnthropicMessages on the request side, and AnthropicMessages → ChatCompletions → Responses on the response side.
+
+- *Why*: The Chat-Completions shape is the most stable and most widely-implemented "lingua franca" inside the gateway (the existing OpenAI-compatible path already uses it). Pivoting through it means the new code only adds two missing legs (Responses↔ChatCompletions on the request side, ChatCompletions→Responses on the response side) and reuses the existing ChatCompletions↔Anthropic legs.
+- *Alternative considered*: Direct Responses↔Anthropic translator. Rejected — doubles the surface area we need to maintain, and creates a second source of truth for tool-use ordering and reasoning passthrough.
+
+### D2. Stateful streaming translators
+
+Streaming translators take `(chunk, state)` and return `(events[], state')`. The state object holds: sequence counter, open item indices, buffered reasoning text, tool-call index → call_id map, "started/completed sent" flags, accumulated usage. Translators only emit events; they do not write to a socket.
+
+- *Why*: Lets the outer SSE handler stay protocol-agnostic and lets us unit-test the translators with deterministic chunk-by-chunk inputs.
+- *Alternative considered*: Pure functional translators with no state. Rejected — Responses-API events carry monotonically increasing `sequence_number` and require open/close bookkeeping across many chunks.
+
+### D3. Open/close discipline for content blocks
+
+The streaming translator enforces the Responses-API contract:
+1. `response.created` and `response.in_progress` fire exactly once each at first usable chunk.
+2. Each `output_item` (message, reasoning, function_call) is bracketed by `output_item.added` and `output_item.done`; deltas only fire between them.
+3. Switching from reasoning to text closes the reasoning block before opening the text block. Switching from text to a tool call closes the text block before opening the tool-call item.
+4. On finish, every open block is closed in deterministic order before `response.completed` fires.
+5. A `null` chunk (end-of-stream sentinel from the SSE reader) triggers the flush path which closes any still-open blocks and emits `response.completed` exactly once.
+
+### D4. Tool-call ID hygiene at the boundary
+
+The Anthropic API requires tool IDs to match `^[a-zA-Z0-9_-]+$` and the Responses API caps tool IDs at 64 characters. The translator follows a three-tier sanitization policy on the upstream Anthropic side:
+
+1. **Pass-through** when the ID already matches the regex AND is ≤ 64 characters.
+2. **Strip-and-keep** when the ID contains some invalid characters: drop every char not in `[a-zA-Z0-9_-]`; if the residue is non-empty AND ≤ 64 characters, use the residue.
+3. **UUID fallback** when the ID is empty, becomes empty after stripping, or exceeds 64 characters: generate a fresh UUID (no deterministic synthesis, no positional encoding).
+
+On the OUTBOUND Responses-side, IDs longer than 64 characters are clamped to the first 64 characters.
+
+- *Why*: pass-through preserves client-supplied IDs that already pass; strip-and-keep recovers common patterns like `call:abc/123` losslessly; UUID fallback is simpler than positional synthesis and avoids leaking message-index/tool-call-index information to clients. Determinism for prompt-cache continuity is unnecessary because the upstream cache key is computed by Anthropic from the prompt content, not from tool-call IDs.
+
+### D5. Tool-result placement repair
+
+Anthropic requires that each `tool_use` block in an assistant message be followed immediately by a separate user message whose content is the matching `tool_result` block. The translator:
+- Splits any user message that mixes `tool_result` with other content; the `tool_result` goes first in its own message.
+- Drops assistant text blocks that appear AFTER a `tool_use` block in the same message (Anthropic rejects them).
+- Merges consecutive same-role messages after the split.
+- If an assistant message contains tool_calls and the next message has no matching tool_result, injects an empty tool_result for each missing call so the upstream does not 400.
+
+### D6. Reasoning passthrough has two modes
+
+- **Reasoning as a separate output item** (preferred for clients that understand Responses-API reasoning items): when the upstream emits `reasoning_content` deltas, the translator opens a `reasoning` output item and emits `reasoning_summary_text.delta` events.
+- **Reasoning embedded as `...` in text content**: legacy upstreams put thinking text inline. The translator recognises `` and `` markers in the text stream and routes the enclosed text into the reasoning channel instead of the text channel.
+
+### D7. Usage propagation is lossless across the pivot
+
+Cache tokens flow through the pivot without being dropped:
+- Anthropic `cache_read_input_tokens` → Chat-Completions `prompt_tokens_details.cached_tokens` → Responses `input_tokens_details.cached_tokens`.
+- Anthropic `cache_creation_input_tokens` → Chat-Completions `prompt_tokens_details.cache_creation_tokens`.
+- `input_tokens = prompt_tokens − cached_tokens − cache_creation_tokens` is the canonical decomposition rule applied at the Chat-Completions → Anthropic hop.
+
+### D8. `max_tokens` adjustment is upstream-friendly
+
+The translator:
+- Falls back to a default `max_tokens` if the client did not provide one.
+- Raises `max_tokens` to a configurable minimum when `tools[]` is non-empty (prevents truncated tool arguments).
+- Raises `max_tokens` above `thinking.budget_tokens + buffer` (Anthropic requires strictly greater).
+
+### D9. System prompt extraction and JSON-mode shim
+
+- All `role: "system"` messages in the intermediate Chat-Completions shape are concatenated and lifted to the Anthropic `system` block list.
+- A Responses-API `instructions` field is treated as a single system message at the head of the message list.
+- `response_format = json_schema` appends a system block telling the model to emit strict JSON matching the supplied schema. `response_format = json_object` appends a generic strict-JSON instruction. (Anthropic has no native equivalent.)
+
+### D10. Image input mapping
+
+- Responses-API `input_image` with `image_url` (string) becomes intermediate `image_url` with `{ url, detail: "auto" }`.
+- Intermediate `image_url` whose URL starts with `data:;base64,...` becomes Anthropic `image` with `source: { type: "base64", media_type, data }`.
+- Intermediate `image_url` whose URL starts with `http://` or `https://` becomes Anthropic `image` with `source: { type: "url", url }`.
+- Any other URL shape is dropped (Anthropic does not support arbitrary file IDs natively).
+
+### D11. Reasoning items in INPUT
+
+When a `reasoning` input item appears between turns, its text is extracted (from `summary[].text` if present, else from `content[].text`) and **buffered** until the next assistant message or function_call; it is then attached as `reasoning_content` to that assistant turn. A `reasoning` item is never emitted as a standalone Chat-Completions message.
+
+### D12. Format detection by endpoint
+
+The dispatch decision uses the endpoint path as the primary key: `/v1/responses` → Responses-API source format, `/v1/messages` → Anthropic source format, `/v1/chat/completions` with a body field that looks like Responses-API → Responses-API source (for CLI clients that send Responses bodies to the chat endpoint).
+
+## Risks / Trade-offs
+
+- **[Risk]** Streaming SSE order is observable to clients; a bug in open/close discipline produces malformed `output_item` brackets that crash strict SDKs.
+ - **Mitigation**: Behavioral assertions in the spec pin down exact event ordering; tests cover the cross-block transitions (reasoning→text, text→tool_call, finish flush, null-flush).
+- **[Risk]** Tool-call ID UUID fallback assigns a fresh UUID when the client's ID fails the regex AND has no usable residue; the client cannot correlate the resulting tool_use back to its original local ID.
+ - **Mitigation**: UUID fallback only triggers when the original ID is unrecoverable. The strip-and-keep tier handles the common case (`call:abc/123` → `callabc123`) without losing correlation. Document the policy in the operator-facing notes.
+- **[Risk]** Token-usage decomposition (`input − cached − cache_creation`) underflows to negative when upstreams report inconsistent values.
+ - **Mitigation**: Clamp to zero; document the invariant in the spec.
+- **[Risk]** The intermediate Chat-Completions pivot adds latency on the request-build path.
+ - **Mitigation**: All translation is pure-CPU JSON shape rewriting; profile after first integration test pass.
+- **[Risk]** The Anthropic `thinking` block requires `max_tokens > budget_tokens`; clients may set both and break the upstream.
+ - **Mitigation**: Translator raises `max_tokens` automatically; documented in the spec.
+- **[Trade-off]** We do not attempt to round-trip every Responses-API field (`store`, `background`, `prompt_cache_key`, `include`). These are stripped silently. Clients that rely on them get no error but no behavior change either. Phase 3 may decide to surface a warning.
+
+## Migration Plan
+
+- This is additive. No data migration. No client-visible change for requests that previously succeeded.
+- Rollout: feature flag `RESPONSES_TO_ANTHROPIC_ENABLED` read via `common.GetEnvOrDefaultBool("RESPONSES_TO_ANTHROPIC_ENABLED", true)`, **default `true`**. Operators who want the prior "not implemented" behavior can set `RESPONSES_TO_ANTHROPIC_ENABLED=false`.
+- Rollback: set the flag to `false`; the gateway falls back to the existing `adaptor.ConvertOpenAIResponsesRequest` path which returns the pre-change error.
+
+## Locked decisions
+
+- **Package placement** — confirmed: shape converters in `service/openaicompat/`, orchestration in `relay/responses_via_chat_completions.go`.
+- **Public translator entry-point names** — confirmed: `ResponsesRequestToChatCompletionsRequest`, `ChatCompletionsStreamToResponsesEvents`, `ChatCompletionsResponseToResponsesResponse`.
+- **Per-stream state struct** — confirmed: `ResponsesStreamState` exported from `service/openaicompat/`.
+- **OAuth tool-name prefix** — confirmed: not applicable; no prefix is applied and no name-mapping table is kept.
+- **JSON-mode system-prompt strings** — confirmed: hard-coded English.
+- **Tool-call ID strategy** — confirmed: pass-through / strip-and-keep / UUID fallback (D4 above). No deterministic positional synthesis.
+- **Feature flag default** — confirmed: `RESPONSES_TO_ANTHROPIC_ENABLED=true` (default ON).
diff --git a/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/proposal.md b/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/proposal.md
new file mode 100644
index 00000000000..418a00150ec
--- /dev/null
+++ b/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/proposal.md
@@ -0,0 +1,78 @@
+## Why
+
+Clients of the gateway today can hit `POST /v1/responses` (OpenAI Responses API shape) and expect to be served by any routed upstream channel. The relay supports OpenAI-compatible upstreams and Anthropic `/v1/messages` upstreams independently, but when a `/v1/responses` request is routed to an Anthropic-typed channel the gateway has no end-to-end translation path: the request shape cannot be forwarded to `/v1/messages` as-is, and the upstream streaming events cannot be re-encoded into Responses-API events without a translation layer.
+
+This change introduces that translation layer so a single Responses-API request can be served transparently by an Anthropic upstream, with full feature parity for streaming text, reasoning (thinking) passthrough, multi-turn tool use, image input, system prompt extraction, JSON-mode hints, and token usage (including prompt cache tokens) propagation.
+
+## What Changes
+
+- **New translation pipeline** for inbound requests: Responses-shaped request → Chat-Completions-shaped intermediate → Anthropic Messages-shaped request, wired into the existing relay format dispatch so that routing a `/v1/responses` request to an Anthropic-typed channel succeeds instead of returning "not implemented".
+- **New translation pipeline** for outbound responses (both streaming SSE and final non-streaming): Anthropic Messages event stream → Chat-Completions chunk shape → Responses-API event stream, including correct `response.created` / `response.in_progress` / `response.output_item.added` / delta / `response.completed` event ordering and sequence numbering.
+- **Reasoning passthrough**: when the upstream emits a `thinking` block, the gateway re-emits it as Responses-API `reasoning` output items with proper `reasoning_summary_text.delta` / `reasoning_summary_text.done` / `reasoning_summary_part.done` / `output_item.done` event sequencing. `...` inline markers in regular text are also recognised and rerouted.
+- **System prompt extraction**: a Responses-API `instructions` field, or a `system` message in an intermediate shape, is lifted into the Anthropic `system` block list with proper cache_control handling.
+- **Tool use round-tripping**: tool declarations, tool calls, and tool results are converted in both directions; tool-use blocks and their tool_result counterparts are placed in adjacent Anthropic messages per Anthropic API rules; missing tool results are auto-injected as empty before forwarding upstream; assistant text emitted after a `tool_use` block is dropped; consecutive same-role messages are merged.
+- **Tool-call ID hygiene**: every tool call must have an ID. IDs that already match the Anthropic-compatible regex `^[a-zA-Z0-9_-]+$` and are ≤ 64 characters are passed through unchanged. IDs that contain invalid characters are sanitized by stripping non-`[a-zA-Z0-9_-]` characters and keeping the result if non-empty; otherwise a fresh UUID is generated as the replacement. IDs longer than 64 characters are clamped at the Responses-side boundary. Nameless tool calls and hosted (no-name) tool declarations are filtered out before forwarding upstream.
+- **`max_tokens` clamp**: `max_tokens` is set from the request, raised to a configurable minimum when tools are present (to avoid truncated tool arguments), and raised above `thinking.budget_tokens + buffer` when the upstream is in thinking mode (Anthropic requires `max_tokens > budget_tokens`).
+- **Image input mapping**: Responses-API `input_image` items are converted to intermediate `image_url`, then to Anthropic `image` blocks; `data:` URLs become `base64` sources and `http(s)` URLs become `url` sources.
+- **Reasoning-effort mapping**: a Chat-Completions-shaped `reasoning_effort` enum (none/low/medium/high/xhigh) is converted to a Claude `thinking.budget_tokens` value when no explicit `thinking` block is present.
+- **Response-format mapping**: `response_format = json_object` or `json_schema` injects an extra system-prompt block instructing the model to return strict JSON (Anthropic has no native equivalent field).
+- **Usage propagation**: prompt cache read/write tokens are propagated through every translation hop. In the upstream-to-OpenAI direction, `cache_read_input_tokens` and `cache_creation_input_tokens` flow into `prompt_tokens_details.cached_tokens` and `prompt_tokens_details.cache_creation_tokens`. In the downstream-to-Responses direction, they flow into `input_tokens_details.cached_tokens`.
+- **Input shape normalization**: a string `input` is wrapped as a single user message with an `input_text` part; an empty array `input[]` is replaced with a single placeholder message so the upstream does not receive `messages: []`; items with a `role` field but no `type` are treated as `message` items.
+- **Reasoning items in input**: a `reasoning` input item is buffered and attached to the next assistant message as `reasoning_content`, never forwarded as a standalone message.
+- **Failure mapping**: upstream `error` and `response.failed` events surface as a documented OpenAI-shaped error chunk (no duplicate emission).
+- The current behavior of returning a 5xx-class "not implemented" error for `/v1/responses` requests routed to Anthropic-typed channels is **REMOVED**.
+
+## Capabilities
+
+### New Capabilities
+- `responses-to-anthropic-translation`: end-to-end translation of OpenAI Responses-API requests and streamed responses to and from the Anthropic Messages-API shape, including request body conversion, response event re-encoding, tool-use round-tripping, reasoning passthrough, image input mapping, system prompt extraction, JSON-mode hint injection, token usage propagation (including prompt-cache token classes), and input-shape normalization.
+
+### Modified Capabilities
+- (none — this introduces a new translation pipeline rather than altering existing spec-level behavior. The change does not modify existing channel BYOK, quota, billing, retry, or auto-ban behavior.)
+
+## Scope
+
+**In scope (this change):**
+- Request shape: Responses-API `{ input, instructions, tools, tool_choice, temperature, top_p, max_tokens, reasoning, reasoning_effort, response_format, thinking, model, stream }`
+- Response stream: text deltas, reasoning deltas, tool-call deltas, finish reasons (`stop`, `length`, `tool_calls`), usage (including cache tokens)
+- Both streaming and non-streaming Responses-API client modes
+- Tool declarations in both `{ type: "function", function: { name, ... } }` and bare `{ type: "function", name, ... }` Responses-API forms; pass-through of built-in (non-function) tool types when target is Anthropic
+- Behavioral parity for the existing flow of intermediate-Chat-Completions ↔ Anthropic Messages, since the Responses-to-Anthropic path piggybacks on it
+
+**Out of scope (explicit non-goals):**
+- File-search / web-search / computer-use / code-interpreter hosted tools on the Responses-API surface beyond pass-through of declarations
+- Anthropic-side `output_config`, structured-output JSON schema enforcement, and provider-specific quirks for non-Anthropic upstreams (these are pre-existing behaviors and are not modified here)
+- Persistent conversation storage (`store: true` semantics); the translator strips this field
+- Background mode (`background: true` Responses-API field)
+- Encrypted content reasoning items (`encrypted_content` summary fallback) beyond the documented text-extraction path
+- Any change to quota, billing, log attribution, or channel selection
+- Any change to the existing OpenAI-compatible `/v1/chat/completions` path
+
+## Impact
+
+- **Affected APIs**: `POST /v1/responses` becomes routable to Anthropic-typed channels.
+- **Affected code areas**:
+ - `service/openaicompat/responses_to_chat.go` (new function `ResponsesRequestToChatCompletionsRequest`)
+ - `service/openaicompat/chat_to_responses.go` (new functions `ChatCompletionsStreamToResponsesEvents` + `ChatCompletionsResponseToResponsesResponse` + per-stream state struct)
+ - `relay/responses_via_chat_completions.go` (new orchestration file, mirror of `relay/chat_completions_via_responses.go`)
+ - `relay/responses_handler.go` (new branch when `info.ApiType == APITypeAnthropic`, calling the new orchestration before falling back to `adaptor.ConvertOpenAIResponsesRequest`)
+- **Reused converters (not duplicated)**:
+ - `relay/channel/claude/relay-claude.go::RequestOpenAI2ClaudeMessage` — Chat-Completions request → Anthropic Messages request (already handles tool ordering, max_tokens adjustment, image mapping, system extraction)
+ - `relay/channel/claude/relay-claude.go::ClaudeStreamHandler` + `StreamResponseClaude2OpenAI` — Claude streaming response → Chat-Completions chunks
+ - `relay/channel/claude/relay-claude.go::ClaudeHandler` + `ResponseClaude2OpenAI` — Claude non-streaming response → Chat-Completions response
+- **Dependencies**: no new third-party dependencies; uses the project's existing JSON wrapper (`common.Marshal`/`common.Unmarshal`) and the standard library UUID/random generator.
+- **Database**: no migrations.
+- **Frontend**: no UI changes; the translation is transparent to clients.
+- **Backward compatibility**: additive. Requests that were previously rejected ("not implemented") now succeed. Requests that previously succeeded (Responses-to-OpenAI-compatible upstreams) are not affected.
+
+## Locked decisions (Phase 3)
+
+- **Package placement**: shape converters land in `service/openaicompat/` parallel to the existing `chat_to_responses.go`/`responses_to_chat.go`; orchestration lands in `relay/responses_via_chat_completions.go` mirroring the existing `relay/chat_completions_via_responses.go`.
+- **Naming**: PascalCase `XToY` style matching project convention: `ResponsesRequestToChatCompletionsRequest`, `ChatCompletionsStreamToResponsesEvents`, `ChatCompletionsResponseToResponsesResponse`. Per-stream state struct: `ResponsesStreamState`.
+- **Reuse strategy**: the `ChatCompletions ↔ AnthropicMessages` legs are NOT reimplemented; the existing Claude adaptor converters listed above are called directly.
+- **Tool-call ID strategy**: pass-through when valid; sanitize non-empty residue when partially invalid; UUID fallback (no deterministic synthesis) when fully invalid. Clamp to 64 characters at the Responses-side boundary.
+- **OAuth tool-name prefix**: NOT applicable to this project (the Anthropic adaptor uses `x-api-key`, not an OAuth flow). The translator hard-codes no prefix; no `prefixedName→originalName` map exists.
+- **JSON-mode prompt text**: hard-coded English, matching the convention of other converters in this codebase.
+- **Test style**: assertion-style using `testify/require` and `t.Errorf`, matching `relay/channel/claude/relay_claude_test.go`. No golden files.
+- **Feature gate**: `RESPONSES_TO_ANTHROPIC_ENABLED`, default `true`. Operators can set the variable to `false` to restore the prior "not implemented" behavior.
+- **Conflict surface**: clean. The only uncommitted change at the time of this proposal is this OpenSpec change itself; no in-flight work touches `relay/responses_handler.go` or `relay/channel/claude/`.
diff --git a/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md b/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md
new file mode 100644
index 00000000000..b0300a40765
--- /dev/null
+++ b/openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md
@@ -0,0 +1,856 @@
+## ADDED Requirements
+
+### Requirement: Endpoint-driven source format detection
+
+The gateway SHALL classify the inbound request's source format from the URL path before consulting the body shape. A request whose path contains `/v1/responses` SHALL be treated as the Responses-API source format. A request whose path contains `/v1/messages` SHALL be treated as the Anthropic-Messages source format. A request whose path contains `/v1/chat/completions` SHALL be treated as the OpenAI Chat-Completions source format, except that when its JSON body has a top-level `input` field that is an array, it SHALL be reclassified as the Responses-API source format.
+
+#### Scenario: `/v1/responses` path is Responses-API source
+
+- **WHEN** a client sends `POST /v1/responses`
+- **THEN** the gateway SHALL select the Responses-API translator chain regardless of body shape
+
+#### Scenario: `/v1/messages` path is Anthropic source
+
+- **WHEN** a client sends `POST /v1/messages`
+- **THEN** the gateway SHALL select the Anthropic-source translator chain regardless of body shape
+
+#### Scenario: `/v1/chat/completions` with Responses-style body
+
+- **WHEN** a client sends `POST /v1/chat/completions` with a JSON body whose `input` field is an array
+- **THEN** the gateway SHALL select the Responses-API source format
+
+#### Scenario: `/v1/chat/completions` with normal body
+
+- **WHEN** a client sends `POST /v1/chat/completions` with a JSON body that has no `input` array and uses `messages[]`
+- **THEN** the gateway SHALL select the OpenAI Chat-Completions source format
+
+### Requirement: Two-step pivot through Chat-Completions intermediate
+
+When the inbound source format and the outbound target format differ, the gateway SHALL perform translation in two hops through a Chat-Completions-shaped intermediate object. The Responses-API to Anthropic-Messages request translation SHALL execute `Responses → ChatCompletions` followed by `ChatCompletions → AnthropicMessages`. The Anthropic-Messages to Responses-API response translation SHALL execute `AnthropicMessages → ChatCompletions` followed by `ChatCompletions → ResponsesEvents`.
+
+#### Scenario: Request pivot is two-hop
+
+- **WHEN** a Responses-API request body is routed to an Anthropic-typed channel
+- **THEN** the request body delivered to the upstream SHALL be the result of applying the Responses→ChatCompletions translator followed by the ChatCompletions→AnthropicMessages translator, in that order
+
+#### Scenario: Response pivot is two-hop
+
+- **WHEN** an Anthropic streaming response chunk is received and the original client expects Responses-API events
+- **THEN** the chunk SHALL be passed through the Anthropic→ChatCompletions translator, and each emitted Chat-Completions chunk SHALL be passed through the ChatCompletions→ResponsesEvents translator before being written to the client
+
+#### Scenario: Same-format requests skip translation
+
+- **WHEN** the source and target formats are identical
+- **THEN** no translator is invoked and the body or chunk passes through unchanged
+
+### Requirement: Responses-API input shape normalization
+
+The gateway SHALL accept the Responses-API `input` field in three shapes and normalize them to an internal array of input items before translation: (a) a non-empty string, (b) an empty or whitespace-only string, (c) an array (possibly empty). A non-empty string SHALL be wrapped as a single user message item whose content is a single `input_text` part with the original text. An empty or whitespace-only string SHALL be wrapped as a single user message item whose content is a single `input_text` part with the placeholder text `"..."`. An empty array SHALL be replaced with a single user message item whose content is a single `input_text` part with the placeholder text `"..."`. A non-empty array SHALL be passed through unchanged. Any other shape SHALL be treated as invalid and SHALL cause the body to be forwarded unchanged (no translation).
+
+#### Scenario: String input is wrapped as user message
+
+- **WHEN** the request body contains `input: "hello world"`
+- **THEN** the normalized input items SHALL be `[{ type: "message", role: "user", content: [{ type: "input_text", text: "hello world" }] }]`
+
+#### Scenario: Empty string input is wrapped as placeholder
+
+- **WHEN** the request body contains `input: ""`
+- **THEN** the normalized input items SHALL be `[{ type: "message", role: "user", content: [{ type: "input_text", text: "..." }] }]`
+
+#### Scenario: Empty array input is replaced with placeholder
+
+- **WHEN** the request body contains `input: []`
+- **THEN** the normalized input items SHALL be `[{ type: "message", role: "user", content: [{ type: "input_text", text: "..." }] }]`
+
+#### Scenario: Non-empty array is passed through
+
+- **WHEN** the request body contains `input: [{ type: "message", role: "user", content: [...] }]`
+- **THEN** the normalized input items SHALL equal the original array
+
+#### Scenario: Non-string non-array input
+
+- **WHEN** the request body contains `input: 42` or `input: { foo: "bar" }`
+- **THEN** the gateway SHALL forward the body unchanged without invoking the Responses→ChatCompletions translator
+
+### Requirement: Responses-API `instructions` becomes a system message
+
+When the Responses-API request body contains a non-empty `instructions` string, the gateway SHALL prepend a single `role: "system"` message whose `content` is that string to the Chat-Completions `messages[]`.
+
+#### Scenario: Instructions prepended as system
+
+- **WHEN** the request body contains `instructions: "You are helpful."`
+- **THEN** the first message in the resulting Chat-Completions `messages[]` SHALL be `{ role: "system", content: "You are helpful." }`
+
+#### Scenario: Empty instructions is skipped
+
+- **WHEN** the request body contains `instructions: ""` or no `instructions` field
+- **THEN** no system message SHALL be prepended on behalf of `instructions`
+
+### Requirement: Input item type detection with role-only fallback
+
+The gateway SHALL determine each input item's type by reading its `type` field. If the `type` field is missing but a `role` field is present, the item SHALL be treated as type `"message"`. If neither field is present, the item SHALL be skipped silently.
+
+#### Scenario: Explicit type wins
+
+- **WHEN** an input item is `{ type: "function_call", call_id: "x", name: "y", arguments: "{}" }`
+- **THEN** the item SHALL be processed as a function call
+
+#### Scenario: Role-only fallback
+
+- **WHEN** an input item is `{ role: "user", content: [{ type: "input_text", text: "hi" }] }` with no `type` field
+- **THEN** the item SHALL be processed as type `"message"`
+
+#### Scenario: Neither type nor role
+
+- **WHEN** an input item is `{ foo: "bar" }`
+- **THEN** the item SHALL be skipped without error
+
+### Requirement: Message item content normalization
+
+For each input item of type `"message"`, the gateway SHALL map content parts to Chat-Completions content parts as follows: `input_text` and `output_text` parts SHALL become `{ type: "text", text }` parts; `input_image` parts SHALL become `{ type: "image_url", image_url: { url, detail } }` parts where `url` is the part's `image_url` field (if a string) or `file_id` field (if no `image_url`), and `detail` is the part's `detail` field or `"auto"` if absent. Parts of any other type SHALL be passed through unchanged.
+
+#### Scenario: input_text becomes text
+
+- **WHEN** a message item has `content: [{ type: "input_text", text: "hello" }]`
+- **THEN** the converted Chat-Completions message content SHALL be `[{ type: "text", text: "hello" }]`
+
+#### Scenario: output_text becomes text
+
+- **WHEN** a message item has `content: [{ type: "output_text", text: "answer" }]`
+- **THEN** the converted Chat-Completions message content SHALL be `[{ type: "text", text: "answer" }]`
+
+#### Scenario: input_image with image_url becomes image_url
+
+- **WHEN** a message item has `content: [{ type: "input_image", image_url: "https://example.com/a.png", detail: "high" }]`
+- **THEN** the converted Chat-Completions message content SHALL be `[{ type: "image_url", image_url: { url: "https://example.com/a.png", detail: "high" } }]`
+
+#### Scenario: input_image with file_id fallback
+
+- **WHEN** a message item has `content: [{ type: "input_image", file_id: "file_abc" }]` and no `image_url`
+- **THEN** the converted content SHALL be `[{ type: "image_url", image_url: { url: "file_abc", detail: "auto" } }]`
+
+#### Scenario: input_image with no url or file_id
+
+- **WHEN** a message item has `content: [{ type: "input_image" }]` with neither `image_url` nor `file_id`
+- **THEN** the converted content SHALL be `[{ type: "image_url", image_url: { url: "", detail: "auto" } }]`
+
+### Requirement: Function-call items become assistant tool_calls
+
+For each input item of type `"function_call"`, the gateway SHALL append the call to a buffered assistant message in the form `{ role: "assistant", content: null, tool_calls: [...] }`. Each tool call SHALL be `{ id: , type: "function", function: { name, arguments } }`. The buffered assistant message SHALL be flushed to the message list when the next non-function-call item is encountered or at end-of-input. Function-call items whose `name` is missing, not a string, or trimmed-empty SHALL be skipped silently.
+
+#### Scenario: Single function call
+
+- **WHEN** input contains `{ type: "function_call", call_id: "c1", name: "search", arguments: "{\"q\":\"x\"}" }` followed by no more items
+- **THEN** the resulting messages SHALL include `{ role: "assistant", content: null, tool_calls: [{ id: "c1", type: "function", function: { name: "search", arguments: "{\"q\":\"x\"}" } }] }`
+
+#### Scenario: Multiple consecutive function calls collapse
+
+- **WHEN** input contains two consecutive function_call items with call_ids `c1` and `c2`
+- **THEN** both calls SHALL be in the same assistant message's `tool_calls` array, in order
+
+#### Scenario: Function call with empty name is dropped
+
+- **WHEN** input contains `{ type: "function_call", call_id: "c1", name: "", arguments: "{}" }`
+- **THEN** the call SHALL NOT appear in any resulting assistant message
+
+#### Scenario: Function call with missing name is dropped
+
+- **WHEN** input contains `{ type: "function_call", call_id: "c1", arguments: "{}" }` with no `name` field
+- **THEN** the call SHALL NOT appear in any resulting assistant message
+
+### Requirement: Function-call-output items become tool messages
+
+For each input item of type `"function_call_output"`, the gateway SHALL flush any buffered assistant message and SHALL append a tool message `{ role: "tool", tool_call_id: , content: