feat(relay): Responses-API → Anthropic translation pivot #5002
nullmastermind wants to merge 3 commits into
Route `/v1/responses` requests to Anthropic-typed channels via a two-step pivot: Responses→ChatCompletions (new `service/openaicompat` translators) then ChatCompletions→Anthropic (existing Claude adaptor). Response side mirrors the pivot in reverse. Previously these requests returned a "not implemented" error. The pivot is feature-gated via `RESPONSES_TO_ANTHROPIC_ENABLED` (default true) so operators can restore the prior behavior without a redeploy.
The Responses-API spec requires each output item to carry its own stable `id` (e.g. `msg_*`, `rs_*`, `fc_*`) and for all child events (`content_part.*`, `output_text.*`, `reasoning_summary_*`, `function_call_arguments.*`) to reference that item id, not the top-level response id. Previously every `item_id` field was set to `state.ResponseID`, causing clients to see mismatched ids between `output_item.added` and the delta/done events that follow. The non-streaming path also emitted output items with no `id` field at all.
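The id relationship described above can be sketched as follows. This is an illustration only: the `event` struct is a pared-down stand-in (real events carry more fields, and `output_item.added` nests the id inside the item object), and the `<prefix>_<base>_<index>` id layout is an assumption rather than the PR's actual format.

```go
package main

import "fmt"

// event is a simplified stand-in for a Responses stream event.
type event struct {
	Type   string
	ItemID string
}

// newItemID derives an id for one output item. The layout is hypothetical.
func newItemID(prefix, base string, outputIndex int) string {
	return fmt.Sprintf("%s_%s_%d", prefix, base, outputIndex)
}

// messageEvents emits the event sequence for one text message item. Every
// child event reuses the id announced by output_item.added instead of the
// top-level response id.
func messageEvents(base string, outputIndex int, deltas []string) []event {
	id := newItemID("msg", base, outputIndex)
	events := []event{
		{Type: "response.output_item.added", ItemID: id},
		{Type: "response.content_part.added", ItemID: id},
	}
	for range deltas {
		events = append(events, event{Type: "response.output_text.delta", ItemID: id})
	}
	return append(events,
		event{Type: "response.output_text.done", ItemID: id},
		event{Type: "response.content_part.done", ItemID: id},
		event{Type: "response.output_item.done", ItemID: id},
	)
}

func main() {
	for _, ev := range messageEvents("abc123", 0, []string{"Hel", "lo"}) {
		fmt.Printf("%-30s item_id=%s\n", ev.Type, ev.ItemID)
	}
}
```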
No actionable comments were generated in the recent review. 🎉
Walkthrough
Adds an end-to-end Responses → Chat-Completions → Anthropic translation path: request normalization, tool-call sanitization, Claude relay enhancements (cache-control, file/media handling, response-format shims, missing tool_result injection), streaming and non-streaming response translators, dispatch wiring with a feature flag, full OpenSpec docs, and extensive tests.
Changes: Responses-to-Anthropic Translation
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 12
🧹 Nitpick comments (2)
openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md (1)
3-4: 💤 Low value
Update the placeholder Purpose section.
The Purpose section contains a TBD placeholder noting it should be updated after archiving. Since this spec is now part of the PR, the Purpose should be filled in with a meaningful description of the specification's intent.
📝 Suggested Purpose text
## Purpose
-TBD - created by archiving change responses-to-anthropic-translation. Update Purpose after archive.
+This specification defines the behavioral requirements for translating OpenAI Responses API requests to Anthropic Messages API format and back, enabling `/v1/responses` requests to be routed to Anthropic-typed upstream channels through a two-hop Chat-Completions intermediate translation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md` around lines 3 - 4, Replace the TBD placeholder in the Purpose section of the spec (look for the "Purpose" heading in responses-to-anthropic-translation spec) with a concise description stating the specification's intent: to define endpoint-driven source format detection for Anthropic response translations, explain the problem it solves, the scope (when/where detection occurs), and the expected outcome (how consumers should use the detected format). Ensure the text is clear, present-tense, and a few sentences long to provide meaningful context now that the spec is being archived.
openspec/specs/responses-to-anthropic-translation/spec.md (1)
3-4: 💤 Low value
Update the placeholder Purpose section.
The Purpose section contains a TBD placeholder noting it should be updated after archiving. Since this spec is now part of the PR, the Purpose should be filled in with a meaningful description of the specification's intent.
📝 Suggested Purpose text
## Purpose
-TBD - created by archiving change responses-to-anthropic-translation. Update Purpose after archive.
+This specification defines the behavioral requirements for translating OpenAI Responses API requests to Anthropic Messages API format and back, enabling `/v1/responses` requests to be routed to Anthropic-typed upstream channels through a two-hop Chat-Completions intermediate translation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openspec/specs/responses-to-anthropic-translation/spec.md` around lines 3 - 4, Replace the placeholder text under the "## Purpose" header in the responses-to-anthropic-translation spec with a concise description of the specification's intent: explain what this translation spec covers, its goals (e.g., mapping response formats from source to Anthropic-compatible outputs), intended consumers, and any scope/limitations; update the header block so "## Purpose" contains this meaningful paragraph instead of "TBD - created by archiving change responses-to-anthropic-translation."
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@relay/channel/claude/relay-claude.go`:
- Around line 142-160: The helper applyCacheControlToLastAssistantContent
currently returns if the last assistant message's Content isn't a
[]dto.ClaudeMediaMessage; change it to also handle the plain-string case
produced by RequestOpenAI2ClaudeMessage by detecting when messages[i].Content is
a string, creating a single-block []dto.ClaudeMediaMessage with Type "text" (or
the appropriate text field), set its CacheControl to
claudeAssistantCacheControlMarker, assign that slice back to
messages[i].Content, and return; keep the existing logic for when Content is
already []dto.ClaudeMediaMessage.
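The string-promotion case requested above might look roughly like this. The types `message`, `mediaBlock`, and `cacheControl` are local stand-ins for the repository's `dto` types, whose exact fields are not shown here, and skipping empty strings is an assumption of this sketch rather than something the review asks for.

```go
package main

import "fmt"

// Local stand-ins for the repository's dto types (assumed shapes).
type cacheControl struct{ Type string }

type mediaBlock struct {
	Type         string
	Text         string
	CacheControl *cacheControl
}

type message struct {
	Role    string
	Content any // either a plain string or []mediaBlock
}

// markLastAssistant attaches a cache marker to the last assistant message,
// promoting plain-string content to a single text block first.
func markLastAssistant(messages []message) {
	for i := len(messages) - 1; i >= 0; i-- {
		if messages[i].Role != "assistant" {
			continue
		}
		switch c := messages[i].Content.(type) {
		case string:
			if c == "" {
				return // assumption: do not create an empty text block
			}
			messages[i].Content = []mediaBlock{{
				Type:         "text",
				Text:         c,
				CacheControl: &cacheControl{Type: "ephemeral"},
			}}
		case []mediaBlock:
			if len(c) > 0 {
				// c shares its backing array with the stored slice.
				c[len(c)-1].CacheControl = &cacheControl{Type: "ephemeral"}
			}
		}
		return
	}
}

func main() {
	msgs := []message{
		{Role: "user", Content: "hi"},
		{Role: "assistant", Content: "hello"},
	}
	markLastAssistant(msgs)
	blocks := msgs[1].Content.([]mediaBlock)
	fmt.Println(blocks[0].Type, blocks[0].Text, blocks[0].CacheControl.Type)
}
```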
In `@relay/responses_via_chat_completions.go`:
- Around line 137-150: This bypass path around
HandleStreamResponseData/HandleClaudeResponseData needs to preserve Claude
refusal marking: after retrieving claudeError from claudeResponse (using
GetClaudeError) and before emitting events/writeEvents and calling
sr.Stop(types.WithClaudeError(...)), detect refusal errors (claudeError.Type ==
"refusal" or equivalent) and set the same marker used elsewhere
(constant.ContextKeyAdminRejectReason with the refusal message) on the
request/context or invoke the shared handling function
(HandleClaudeResponseData/HandleStreamResponseData) that performs that work so
the stop_reason=refusal signal is recorded for moderation/accounting.
- Around line 159-163: The code assigns raw Anthropic/Claude usage
(claudeInfo.Usage) directly to chatChunk.Usage, but the Responses translator
expects OpenAI-style counts and needs cached read/creation tokens folded into
prompt_tokens; update the assignment to convert/normalize Claude usage to OpenAI
semantics (reuse the same conversion logic used by the existing Claude→OpenAI
path), e.g. call a helper like
convertAnthropicUsageToOpenAI/normalizeClaudeUsage(claudeInfo.Usage) before
setting chatChunk.Usage so prompt_tokens includes cached read/creation tokens;
apply the same change for the other branch handling claudeInfo usage.
- Around line 136-176: The handler currently calls sr.Error(e) on JSON parse or
write failures but never sets the shared streamErr variable, so the final EOS
flush (openaicompat.ChatCompletionsStreamToResponsesEvents(nil, state) followed
by writeEvents) still runs and can emit a synthetic response.completed; fix by
assigning an appropriate error to streamErr when calling sr.Error (e.g., set
streamErr = types.WithStreamError(e) or wrap as needed) inside the
helper.StreamScannerHandler error branches (the JSON unmarshal and writeEvents
failure paths) and then skip the unconditional EOS flush by only
generating/writing flushEvents if streamErr == nil (or return early if streamErr
is non-nil) so real failures propagate instead of completing the stream.
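For the usage item in this list, a sketch of folding Anthropic's cache counters into OpenAI-style totals. The struct and function names are hypothetical; the arithmetic assumes, as the comment states, that the downstream translator treats `prompt_tokens` as including cached tokens.

```go
package main

import "fmt"

// claudeUsage mirrors the token counters Anthropic reports (assumed subset).
type claudeUsage struct {
	InputTokens              int
	CacheReadInputTokens     int
	CacheCreationInputTokens int
	OutputTokens             int
}

// openAIUsage uses OpenAI semantics: PromptTokens already includes cached
// tokens, and CachedTokens is reported as a detail of the prompt.
type openAIUsage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
	CachedTokens     int
}

// normalizeUsage folds cache read and cache creation tokens into the prompt
// count so a later "prompt minus cached" subtraction stays non-negative.
func normalizeUsage(u claudeUsage) openAIUsage {
	prompt := u.InputTokens + u.CacheReadInputTokens + u.CacheCreationInputTokens
	return openAIUsage{
		PromptTokens:     prompt,
		CompletionTokens: u.OutputTokens,
		TotalTokens:      prompt + u.OutputTokens,
		CachedTokens:     u.CacheReadInputTokens,
	}
}

func main() {
	u := normalizeUsage(claudeUsage{
		InputTokens:              100,
		CacheReadInputTokens:     40,
		CacheCreationInputTokens: 10,
		OutputTokens:             20,
	})
	fmt.Printf("%+v\n", u)
}
```

Without the fold, a request with 100 uncached and 40 cached input tokens would report `prompt_tokens` of 100, and subtracting the 40 cached tokens downstream would understate the input.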
In `@service/openaicompat/chat_stream_to_responses_test.go`:
- Around line 3-20: The test helper unmarshalEvent (and other test helpers
around lines referenced) currently calls json.Unmarshal directly; replace those
direct uses with common.Unmarshal to follow the repository JSON wrapper policy:
call common.Unmarshal(data, &m) and assert require.NoError(t, err) (similar to
the existing Marshal usage), remove the import of "encoding/json" if no longer
needed, and update any other direct encoding/json deserialization occurrences
(e.g., the area noted at 457-460) to use the appropriate common.Unmarshal/Common
helpers (common.UnmarshalJsonStr / common.DecodeJson) as applicable.
In `@service/openaicompat/chat_stream_to_responses.go`:
- Around line 441-471: The loop in closeAllOpenFunctionCalls iterates
state.FuncCalls (a map) which yields nondeterministic order; to fix, collect the
open function-call entries from state.FuncCalls into a slice, sort that slice by
the tool index / fc.ItemIndex (or by fc.ItemIndex then fc.ID for tie-breaker),
then iterate the sorted slice and call emitEvent as before (keep funcCallItemID,
arguments defaulting to "{}" and setting fc.Done = true). This ensures
ResponsesStreamState.FuncCalls are closed in a stable, deterministic order when
emitting response.function_call_arguments.done and response.output_item.done
events.
- Around line 186-252: The handler fails when a '<think>' or '</think>' token is
split across chunks; modify ResponsesStreamState to hold a small
pendingTagBuffer string and in handleTextDeltaWithInlineThink prepend that
buffer to incoming text, then after processing keep any trailing partial tag
(e.g., leading '<', '</', '<th', etc.) in pendingTagBuffer instead of emitting
it as regular text; update state.InThinkInlineTag, call
ensureMessageOpen/ensureReasoningOpen and
closeMessageIfOpen/closeReasoningIfOpen and emitEvent exactly as before but
operate on the combined buffer, and clear pendingTagBuffer once a full tag is
consumed so tag boundaries across chunks are handled correctly.
- Around line 153-177: The code currently calls emitEvent(state,
"response.created", ...) and discards the return (which still increments the
sequence), causing the first real returned event to have sequence_number 2; in
EmitChatStreamErrorEvent avoid emitting the created prelude as a discarded
side-effect — instead, when state.Started is false, prepare the prelude (set
CreatedAt/ResponseID), and include the "response.created" event in the returned
events slice (using emitEvent) followed by the "response.failed" event, removing
the `_ = emitEvent(...)` discard; ensure you still set state.Started = true and
state.ErrorEmitted = true after preparing the events so sequence numbers are
produced only for returned events.
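The split-tag item in this list can be illustrated with a small helper that holds back a trailing fragment which might be the start of `<think>` or `</think>`. This is a sketch under assumptions: the function name matches the one mentioned in the follow-up commit, but the body here is illustrative and not the PR's code.

```go
package main

import (
	"fmt"
	"strings"
)

var thinkTags = []string{"<think>", "</think>"}

// splitPendingThinkTag returns the part of text that is safe to process now
// and the longest trailing fragment that is a proper prefix of a think tag.
func splitPendingThinkTag(text string) (ready, pending string) {
	maxLen := len("</think>") - 1 // a complete tag is never held back
	if maxLen > len(text) {
		maxLen = len(text)
	}
	for n := maxLen; n > 0; n-- {
		suffix := text[len(text)-n:]
		for _, tag := range thinkTags {
			if len(suffix) < len(tag) && strings.HasPrefix(tag, suffix) {
				return text[:len(text)-n], suffix
			}
		}
	}
	return text, ""
}

func main() {
	pending := ""
	for _, chunk := range []string{"Hi <th", "ink>plan</thi", "nk>done"} {
		// Prepend what was held back from the previous chunk.
		ready, rest := splitPendingThinkTag(pending + chunk)
		pending = rest
		fmt.Printf("process %q, hold back %q\n", ready, pending)
	}
	// At end of stream any leftover fragment is ordinary text.
	if pending != "" {
		fmt.Printf("flush %q\n", pending)
	}
}
```

The caller would run its existing tag-matching logic on `ready` only, so a tag split across two chunks is seen whole on the second chunk.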
In `@service/openaicompat/chat_to_responses.go`:
- Around line 497-503: The fallback ID logic for function_call items uses the
same constant fc_<idBase> when tc.ID is empty, causing duplicate IDs; update the
code that sets fcItemID (the block using tc.ID and idBase) to generate a unique
ID per output item (e.g., append a per-message index, timestamp, or UUID) so
that when tc.ID is blank you produce something like fc_<idBase>_<seq> (while
still ensuring the "fc_" prefix); ensure this uniqueness strategy is applied
where fcItemID is assigned so subsequent uses of fcItemID refer to the unique
value for each output item.
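One possible shape for the fallback, assuming a per-response counter is available at the call site. The function name and the `fc_<idBase>_<seq>` layout follow the suggestion in the comment above; everything else is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// functionCallItemID returns a stable item id for a function_call output
// item. When the upstream tool call has no id, the sequence number keeps
// fallback ids distinct within one response.
func functionCallItemID(toolCallID, idBase string, seq int) string {
	if toolCallID == "" {
		return fmt.Sprintf("fc_%s_%d", idBase, seq)
	}
	if strings.HasPrefix(toolCallID, "fc_") {
		return toolCallID
	}
	return "fc_" + toolCallID
}

func main() {
	ids := []string{"", "call_1", ""}
	for seq, id := range ids {
		fmt.Println(functionCallItemID(id, "abc123", seq))
	}
}
```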
In `@service/openaicompat/responses_to_chat.go`:
- Around line 583-589: The switch branch handling `"input_text", "output_text"`
currently uses `if t, _ := pm["text"].(string); true { ... }` which
unconditionally appends a text MediaContent even when `pm["text"]` is missing or
not a string; change the condition to check the type assertion result (e.g., `if
t, ok := pm["text"].(string); ok { result = append(result, dto.MediaContent{
Type: dto.ContentTypeText, Text: t, }) }`) so only valid non-empty string `text`
values are converted and appended.
- Line 4: Remove the unnecessary "encoding/json" sentinel import from
responses_to_chat.go and delete the sentinel variable that exists solely to keep
that import alive; ensure all JSON usage continues to use the
repository-compliant wrappers (common.Unmarshal, common.Marshal,
common.GetJsonType) and run tests/lint to confirm no direct encoding/json
references remain (locate the import declaration and the sentinel variable near
the JSON handling code in this file and remove both).
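The difference between `if t, _ := …; true` and `if t, ok := …; ok` is easy to see in isolation. The sketch below uses a plain `[]string` result instead of the repository's `dto.MediaContent`, which is an assumption made to keep it self-contained.

```go
package main

import "fmt"

// textParts collects the text of input_text/output_text parts. A part is
// only kept when its "text" field exists and really is a string; with the
// old `; true` condition a missing field would append an empty entry.
func textParts(parts []map[string]any) []string {
	var out []string
	for _, pm := range parts {
		typ, _ := pm["type"].(string)
		switch typ {
		case "input_text", "output_text":
			if t, ok := pm["text"].(string); ok {
				out = append(out, t)
			}
		}
	}
	return out
}

func main() {
	parts := []map[string]any{
		{"type": "input_text", "text": "hello"},
		{"type": "input_text"},                // text missing
		{"type": "output_text", "text": 42},   // text is not a string
		{"type": "input_image", "text": "no"}, // other part type
	}
	fmt.Println(len(textParts(parts)), textParts(parts))
}
```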
In `@service/openaicompat/tool_call_ids.go`:
- Around line 74-79: The current logic in the loop that sets tc.ID calls
sanitizeOneToolID and then writes idMap[origID] = newID, which can overwrite an
existing remap and produce inconsistent IDs for repeated invalid originals;
change the flow in the block handling tc.ID so it first checks if idMap already
contains origID and, if so, reuses idMap[origID] to set tc.ID, otherwise call
sanitizeOneToolID(origID), assign the newID to tc.ID and add idMap[origID] =
newID (without overwriting existing entries). Ensure you update the code paths
that reference sanitizeOneToolID, idMap and tc.ID to follow this
lookup-before-generate behavior.
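A sketch of the lookup-before-generate flow combined with the pass-through → strip-and-keep → fallback strategy from the PR summary. The function names, the `toolu_` prefix, and the use of random hex in place of a UUID are assumptions for illustration; only the regex `^[a-zA-Z0-9_-]{1,64}$` comes from the PR.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"regexp"
)

var (
	validToolID  = regexp.MustCompile(`^[a-zA-Z0-9_-]{1,64}$`)
	invalidChars = regexp.MustCompile(`[^a-zA-Z0-9_-]`)
)

// randomToolID stands in for the UUID fallback (assumed format).
func randomToolID() string {
	buf := make([]byte, 12)
	if _, err := rand.Read(buf); err != nil {
		panic(err)
	}
	return "toolu_" + hex.EncodeToString(buf)
}

// sanitizeOne applies pass-through, then strip-and-keep, then fallback.
func sanitizeOne(id string) string {
	if validToolID.MatchString(id) {
		return id
	}
	stripped := invalidChars.ReplaceAllString(id, "")
	if len(stripped) > 64 {
		stripped = stripped[:64]
	}
	if stripped != "" {
		return stripped
	}
	return randomToolID()
}

// remapToolIDs reuses an existing mapping before generating a new id, so a
// repeated invalid original always maps to the same sanitized value.
func remapToolIDs(ids []string) []string {
	idMap := make(map[string]string)
	out := make([]string, 0, len(ids))
	for _, orig := range ids {
		mapped, seen := idMap[orig]
		if !seen {
			mapped = sanitizeOne(orig)
			idMap[orig] = mapped
		}
		out = append(out, mapped)
	}
	return out
}

func main() {
	fmt.Println(remapToolIDs([]string{"::::", "call_1", "::::", "a:b"}))
}
```

Without the lookup, the two `"::::"` entries would each receive a different random id, and a later `tool_result` remap could only match one of them.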
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e27c3a1a-4f23-4a60-ae5c-dac32a80cfe8
📒 Files selected for processing (23)
- dto/claude.go
- openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/.openspec.yaml
- openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/design.md
- openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/proposal.md
- openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md
- openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/tasks.md
- openspec/specs/responses-to-anthropic-translation/spec.md
- relay/channel/claude/relay-claude.go
- relay/channel/claude/relay_claude_test.go
- relay/helper/stream_scanner.go
- relay/responses_handler.go
- relay/responses_via_chat_completions.go
- relay/responses_via_chat_completions_test.go
- service/channel_affinity_usage_cache_test.go
- service/openaicompat/chat_stream_to_responses.go
- service/openaicompat/chat_stream_to_responses_test.go
- service/openaicompat/chat_to_responses.go
- service/openaicompat/chat_to_responses_test.go
- service/openaicompat/responses_stream_state.go
- service/openaicompat/responses_to_chat.go
- service/openaicompat/responses_to_chat_test.go
- service/openaicompat/tool_call_ids.go
- service/openaicompat/tool_call_ids_test.go
…pivot
- relay-claude.go: applyCacheControlToLastAssistantContent now promotes plain
string assistant content to a single-block []ClaudeMediaMessage so cache_control
is attached on the common text-only case produced by RequestOpenAI2ClaudeMessage.
- responses_via_chat_completions.go: assign streamErr on JSON/SSE-write failures
and skip the EOS flush when set, so upstream errors stop emitting a synthetic
response.completed. Mark Claude refusals on both streaming and non-streaming
paths (parity with HandleStreamResponseData / non-pivot handler). Fold Claude
cache_read/creation tokens into prompt_tokens via a new
normalizeClaudeUsageForOpenAISemantics helper so the Responses translator,
which subtracts cached from prompt, yields correct input/total counts.
- chat_stream_to_responses.go:
  - EmitChatStreamErrorEvent now returns the response.created prelude in the
    events slice instead of discarding it, so error-only streams emit
    response.failed at sequence_number 1.
  - handleTextDeltaWithInlineThink buffers a trailing partial <think>/</think>
    fragment across chunks via a new splitPendingThinkTag helper and
    state.PendingTagBuffer; the buffer is flushed at EOS.
  - closeAllOpenFunctionCalls iterates FuncCalls sorted by tool index for
    deterministic close ordering (was nondeterministic Go map iteration).
- chat_to_responses.go: fallback function_call item IDs are now suffixed with a
per-output-item counter so multiple tool calls without an ID don't collide on
fc_<idBase>.
- responses_to_chat.go: drop unused encoding/json sentinel; fix unconditional
text-part append in convertResponsesContentParts (only emit when type
assertion succeeds).
- tool_call_ids.go: reuse an existing idMap entry before sanitizing again so
repeated invalid originals (e.g. multiple "::::") map to a single sanitized id
consistent with the tool_result remap.
- chat_stream_to_responses_test.go: replace direct json.Unmarshal calls with
common.Unmarshal per the repository JSON wrapper rule.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Hello @Calcium-Ion @seefs001, could you please review this PR when you have a moment? Thank you 🙏
Quick readiness summary:
Happy to address any further feedback.
Summary
Route `/v1/responses` requests targeted at Anthropic-typed channels through a two-step translation pivot instead of returning a `not implemented` error.
- Request side: `Responses → ChatCompletions` (new translator in `service/openaicompat/responses_to_chat.go`) → `ChatCompletions → Anthropic` (existing `relay/channel/claude/relay-claude.go::RequestOpenAI2ClaudeMessage`).
- Response side mirrors the pivot in reverse (`service/openaicompat/chat_stream_to_responses.go`, `chat_to_responses.go`).
- Orchestration lives in `relay/responses_via_chat_completions.go`, mirroring the opposite-direction `relay/chat_completions_via_responses.go`.
- Tool-call ids are sanitized to satisfy Anthropic's regex `^[a-zA-Z0-9_-]{1,64}$`. `service/openaicompat/tool_call_ids.go` applies a pass-through → strip-and-keep → UUID-fallback strategy.
- Each output item (`msg_*`, `rs_*`, `fc_*`) carries its own stable `id`, and child events (`content_part.*`, `output_text.*`, `reasoning_summary_*`, `function_call_arguments.*`) reference that item id rather than the top-level response id. The non-streaming path now also emits `id` on each output item.
- Feature-gated via `RESPONSES_TO_ANTHROPIC_ENABLED` (default `true`): operators can disable it to restore prior behavior without a redeploy.
Why
Currently, OpenAI Responses-API requests routed to Anthropic-typed channels are rejected. This blocks integrations that standardize on the Responses-API client while wanting to use Claude models behind the gateway. Pivoting through the existing battle-tested Chat-Completions ↔ Anthropic path reuses validated translation logic and avoids a parallel Anthropic translator.
The per-item stream ID fix corrects a spec-conformance issue where every `item_id` field was set to `state.ResponseID`, causing clients to see mismatched ids between `output_item.added` and the delta/done events that follow.
Changes
- `relay/responses_via_chat_completions.go` + tests: orchestration for the new pivot path.
- `relay/responses_handler.go`: dispatch Anthropic-typed channels into the pivot when the feature flag is enabled.
- `relay/channel/claude/relay-claude.go` + tests: supporting changes for the pivot.
- `relay/helper/stream_scanner.go`: small change to the streaming helper.
- `service/openaicompat/`: new package containing translators (`responses_to_chat.go`, `chat_to_responses.go`, `chat_stream_to_responses.go`), stream state (`responses_stream_state.go`), and tool-call ID sanitization (`tool_call_ids.go`), each with an accompanying `_test.go`.
- `dto/claude.go`: DTO additions needed by the translators.
- `openspec/specs/responses-to-anthropic-translation/spec.md` + `openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/`: design documentation for the change (proposal, design notes, spec, tasks). Maintainers can drop these if they prefer not to track design docs in-repo.
Test plan
- `go build ./...` passes
- Unit tests: `service/openaicompat/*_test.go`, `relay/responses_via_chat_completions_test.go`, `relay/channel/claude/relay_claude_test.go`
- Send a `/v1/responses` request (streaming and non-streaming, with and without tool calls) to a Claude channel; verify stream events carry per-item ids and tool-call ids satisfy Anthropic's regex
- Set `RESPONSES_TO_ANTHROPIC_ENABLED=false` and confirm the prior "not implemented" response returns
🤖 Generated with Claude Code