feat(relay): Responses-API → Anthropic translation pivot #5002

Open
nullmastermind wants to merge 3 commits into QuantumNous:main from nullmastermind:feat/responses-to-anthropic-pivot
@nullmastermind nullmastermind commented May 20, 2026

Summary

Route /v1/responses requests targeted at Anthropic-typed channels through a two-step translation pivot instead of returning a "not implemented" error.

  • Request path: Responses → ChatCompletions (new translator in service/openaicompat/responses_to_chat.go), then ChatCompletions → Anthropic (existing relay/channel/claude/relay-claude.go::RequestOpenAI2ClaudeMessage).
  • Response path: Claude stream/non-stream handler emits Chat-Completions chunks → re-translated to Responses-API events (service/openaicompat/chat_stream_to_responses.go, chat_to_responses.go).
  • Orchestration: relay/responses_via_chat_completions.go, mirroring the opposite-direction relay/chat_completions_via_responses.go.
  • Tool-call ID sanitization: Anthropic enforces ^[a-zA-Z0-9_-]{1,64}$. service/openaicompat/tool_call_ids.go applies a pass-through → strip-and-keep → UUID-fallback strategy.
  • Per-item stream IDs: each Responses output item (msg_*, rs_*, fc_*) carries its own stable id, and child events (content_part.*, output_text.*, reasoning_summary_*, function_call_arguments.*) reference that item id rather than the top-level response id. The non-streaming path now also emits id on each output item.
  • Feature gate: RESPONSES_TO_ANTHROPIC_ENABLED (default true) — operators can disable to restore prior behavior without a redeploy.
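
The three-tier tool-call ID strategy from the list above can be sketched roughly as follows. This is a minimal illustration, not the code in service/openaicompat/tool_call_ids.go: the names `sanitizeToolID`, `remapToolIDs`, and `fallbackToolID` are hypothetical, and the fallback uses random hex from the standard library as a stand-in for a real UUID.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"regexp"
)

// anthropicIDPattern is the constraint quoted in the PR description.
var anthropicIDPattern = regexp.MustCompile(`^[a-zA-Z0-9_-]{1,64}$`)

// invalidIDChars matches every character outside the allowed alphabet.
var invalidIDChars = regexp.MustCompile(`[^a-zA-Z0-9_-]`)

// fallbackToolID is a hypothetical stand-in for the UUID fallback: "call_"
// plus 16 random bytes in hex (37 characters, all inside the alphabet).
func fallbackToolID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return "call_" + hex.EncodeToString(b)
}

// sanitizeToolID applies the three tiers in order.
func sanitizeToolID(id string) string {
	if anthropicIDPattern.MatchString(id) {
		return id // tier 1: already valid, pass through unchanged
	}
	stripped := invalidIDChars.ReplaceAllString(id, "")
	if len(stripped) > 64 {
		stripped = stripped[:64] // only ASCII remains, so byte slicing is safe
	}
	if stripped != "" {
		return stripped // tier 2: keep whatever valid characters survive
	}
	return fallbackToolID() // tier 3: nothing usable is left
}

// remapToolIDs builds an original-to-sanitized map and reuses an existing
// entry, so a repeated original always resolves to the same sanitized id.
func remapToolIDs(ids []string) map[string]string {
	idMap := make(map[string]string, len(ids))
	for _, id := range ids {
		if _, seen := idMap[id]; seen {
			continue
		}
		idMap[id] = sanitizeToolID(id)
	}
	return idMap
}

func main() {
	for _, id := range []string{"call_abc-123", "call:abc/123", "::::"} {
		fmt.Printf("%q -> %q\n", id, sanitizeToolID(id))
	}
}
```

Looking an ID up in the map before generating a new one is what keeps a tool call and its matching tool result pointing at the same sanitized ID, even when the fallback tier is random.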

Why

Currently, OpenAI Responses-API requests routed to Anthropic-typed channels are rejected. This blocks integrations that standardize on the Responses-API client while wanting to use Claude models behind the gateway. Pivoting through the existing battle-tested Chat-Completions ↔ Anthropic path reuses validated translation logic and avoids a parallel Anthropic translator.

The per-item stream ID fix corrects a spec-conformance issue where every item_id field was set to state.ResponseID, causing clients to see mismatched ids between output_item.added and the delta/done events that follow.
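
To illustrate the invariant this fix restores, here is a reduced sketch in which every event belonging to one output item carries that item's own id. The `event` and `streamState` types and the id format are simplified assumptions for illustration; they are not the types in service/openaicompat/responses_stream_state.go, and real Responses-API events carry more fields than shown here.

```go
package main

import "fmt"

// event is a simplified stand-in for a Responses-API stream event.
type event struct {
	Type   string
	ItemID string
}

// streamState is a hypothetical, reduced per-stream state.
type streamState struct {
	ResponseID string
	nextItem   int
}

// newItemID mints an id with the item-kind prefix (msg, rs, fc). The
// "<prefix>_<response id>_<counter>" format is an assumption for this sketch.
func (s *streamState) newItemID(prefix string) string {
	id := fmt.Sprintf("%s_%s_%d", prefix, s.ResponseID, s.nextItem)
	s.nextItem++
	return id
}

// textItemEvents returns the event sequence for one message item. Every
// child event references the item id minted once at the top.
func textItemEvents(s *streamState) []event {
	itemID := s.newItemID("msg")
	types := []string{
		"response.output_item.added",
		"response.content_part.added",
		"response.output_text.delta",
		"response.output_text.done",
		"response.content_part.done",
		"response.output_item.done",
	}
	events := make([]event, 0, len(types))
	for _, t := range types {
		events = append(events, event{Type: t, ItemID: itemID})
	}
	return events
}

func main() {
	s := &streamState{ResponseID: "resp_1"}
	for _, e := range textItemEvents(s) {
		fmt.Printf("%-30s item_id=%s\n", e.Type, e.ItemID)
	}
}
```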

Changes

  • relay/responses_via_chat_completions.go + tests — orchestration for the new pivot path.
  • relay/responses_handler.go — dispatch Anthropic-typed channels into the pivot when the feature flag is enabled.
  • relay/channel/claude/relay-claude.go + tests — supporting changes for the pivot.
  • relay/helper/stream_scanner.go — small change to streaming helper.
  • service/openaicompat/ — new package containing translators (responses_to_chat.go, chat_to_responses.go, chat_stream_to_responses.go), stream state (responses_stream_state.go), and tool-call ID sanitization (tool_call_ids.go), each with accompanying _test.go.
  • dto/claude.go — DTO additions needed by the translators.
  • openspec/specs/responses-to-anthropic-translation/spec.md + openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/ — design documentation for the change (proposal, design notes, spec, tasks). Maintainers can drop these if they prefer not to track design docs in-repo.

Test plan

  • go build ./... passes
  • Unit tests for the new translators (service/openaicompat/*_test.go, relay/responses_via_chat_completions_test.go, relay/channel/claude/relay_claude_test.go)
  • Manual smoke test: send /v1/responses request (streaming and non-streaming, with and without tool calls) to a Claude channel; verify stream events carry per-item ids and tool-call ids satisfy Anthropic's regex
  • Manual smoke test: set RESPONSES_TO_ANTHROPIC_ENABLED=false and confirm the prior "not implemented" response returns
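
For readers reproducing the flag smoke test, a default-true boolean gate could be read along these lines. How the gateway actually loads RESPONSES_TO_ANTHROPIC_ENABLED is not shown in this PR description, so the helper names and the choice to fall back to the default on unparsable values are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseEnabled interprets a raw flag value. An unset or unparsable value
// falls back to the default of true (the fallback is an assumption here).
func parseEnabled(raw string, present bool) bool {
	if !present {
		return true
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return true
	}
	return enabled
}

// pivotEnabled is a hypothetical helper that reads the flag from the
// process environment.
func pivotEnabled() bool {
	raw, ok := os.LookupEnv("RESPONSES_TO_ANTHROPIC_ENABLED")
	return parseEnabled(raw, ok)
}

func main() {
	fmt.Println("pivot enabled:", pivotEnabled())
}
```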

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Responses API can route to Anthropic backends and return Responses-format streaming or non-streaming results
    • JSON object and JSON schema response formatting for Claude/Anthropic flows
  • Bug Fixes

    • Auto-inserts missing tool results to preserve message ordering
    • Sanitizes tool-call IDs to meet Anthropic constraints
  • Improvements

    • Better streaming reasoning routing and stable event sequencing
    • Improved token-usage propagation including cached-token accounting
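
The cached-token accounting can be illustrated with a small sketch of the normalization the follow-up commit in this PR describes: Claude reports cache reads and cache creation separately from input tokens, while OpenAI-style usage counts them inside prompt_tokens. The struct and function names below are hypothetical, and treating only cache reads as "cached tokens" is an assumption.

```go
package main

import "fmt"

// claudeUsage mirrors the usage fields Anthropic reports (reduced).
type claudeUsage struct {
	InputTokens              int
	CacheReadInputTokens     int
	CacheCreationInputTokens int
	OutputTokens             int
}

// openAIUsage is a reduced OpenAI-style usage record.
type openAIUsage struct {
	PromptTokens     int
	CachedTokens     int
	CompletionTokens int
	TotalTokens      int
}

// normalizeUsage folds cache read and cache creation tokens into the
// prompt count so prompt_tokens covers everything sent upstream.
func normalizeUsage(u claudeUsage) openAIUsage {
	prompt := u.InputTokens + u.CacheReadInputTokens + u.CacheCreationInputTokens
	return openAIUsage{
		PromptTokens:     prompt,
		CachedTokens:     u.CacheReadInputTokens,
		CompletionTokens: u.OutputTokens,
		TotalTokens:      prompt + u.OutputTokens,
	}
}

func main() {
	u := normalizeUsage(claudeUsage{
		InputTokens:              100,
		CacheReadInputTokens:     40,
		CacheCreationInputTokens: 10,
		OutputTokens:             25,
	})
	fmt.Printf("%+v\n", u)
}
```

With these example numbers the prompt count is 100 + 40 + 10 = 150 and the total is 175, so a downstream translator that subtracts the 40 cached tokens from the prompt count does not go below the real uncached input.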

nullmastermind added 2 commits May 20, 2026 22:59
Route `/v1/responses` requests to Anthropic-typed channels via a two-step
pivot: Responses→ChatCompletions (new `service/openaicompat` translators)
then ChatCompletions→Anthropic (existing Claude adaptor). Response side
mirrors the pivot in reverse.

Previously these requests returned a "not implemented" error. The pivot
is feature-gated via `RESPONSES_TO_ANTHROPIC_ENABLED` (default true) so
operators can restore the prior behavior without a redeploy.

The Responses-API spec requires each output item to carry its own stable
`id` (e.g. `msg_*`, `rs_*`, `fc_*`) and for all child events
(`content_part.*`, `output_text.*`, `reasoning_summary_*`,
`function_call_arguments.*`) to reference that item id, not the top-level
response id.

Previously every `item_id` field was set to `state.ResponseID`, causing
clients to see mismatched ids between `output_item.added` and the delta/done
events that follow. The non-streaming path also emitted output items with no
`id` field at all.
@coderabbitai
Contributor

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5a6a3879-24d7-423f-8743-5096c26ec319

📥 Commits

Reviewing files that changed from the base of the PR and between 18e56d9 and 3910958.

📒 Files selected for processing (8)
  • relay/channel/claude/relay-claude.go
  • relay/responses_via_chat_completions.go
  • service/openaicompat/chat_stream_to_responses.go
  • service/openaicompat/chat_stream_to_responses_test.go
  • service/openaicompat/chat_to_responses.go
  • service/openaicompat/responses_stream_state.go
  • service/openaicompat/responses_to_chat.go
  • service/openaicompat/tool_call_ids.go

Walkthrough

Adds an end-to-end Responses → Chat-Completions → Anthropic translation path: request normalization, tool-call sanitization, Claude relay enhancements (cache-control, file/media handling, response-format shims, missing tool_result injection), streaming and non-streaming response translators, dispatch wiring with a feature flag, full OpenSpec docs, and extensive tests.

Changes

Responses-to-Anthropic Translation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Spec, design, and task inventory | `openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/*`, `openspec/specs/responses-to-anthropic-translation/spec.md` | OpenSpec proposal, formal spec, design notes, and task checklist describing the two-hop pivot, streaming semantics, tool-call ID policy, cache-control placement, response-format shims, and rollout via RESPONSES_TO_ANTHROPIC_ENABLED. |
| Streaming state and DTO definitions | `service/openaicompat/responses_stream_state.go`, `dto/claude.go` | New per-stream and per-tool-call state types; Tool extended with optional cache_control (ClaudeCacheControl) for Anthropic prompt-caching metadata. |
| Responses request translation | `service/openaicompat/responses_to_chat.go`, `service/openaicompat/responses_to_chat_test.go` | Adapter converting OpenAIResponsesRequest → GeneralOpenAIRequest (chat-completions intermediate) with input normalization, instructions→system lifting, content-part conversion, function-call buffering, reasoning attachment, tool declaration normalization, and response-format mapping. |
| Tool-call ID sanitization | `service/openaicompat/tool_call_ids.go`, `service/openaicompat/tool_call_ids_test.go` | Sanitize Anthropic `tool_use.id`s with regex enforcement, strip-and-keep, and UUID fallback; default missing tool-call type to function and remap references consistently. |
| Claude relay enhancements | `relay/channel/claude/relay-claude.go`, `relay/channel/claude/relay_claude_test.go` | Classify ContentTypeFile inputs, emit Claude media blocks, append JSON-mode response_format system shim, apply ephemeral cache_control to last tool and last eligible assistant content block (skip thinking), and inject synthetic/empty tool_result blocks when required. |
| Responses-via-Chat-Completions handler | `relay/responses_via_chat_completions.go`, `relay/responses_via_chat_completions_test.go` | Orchestrates the pivot: translate Responses→Chat, sanitize tool IDs, forward via adaptor to Claude, detect stream vs non-stream upstream, and route upstream output through translators with usage normalization and refusal propagation. |
| Chat-Completions streaming → Responses events | `service/openaicompat/chat_stream_to_responses.go`, `service/openaicompat/chat_stream_to_responses_test.go` | Convert Chat-Completions stream chunks into Responses SSE events with sequence numbering, inline `<think>` parsing across chunk boundaries, function-call argument deltas, deterministic EOS flush, usage token decomposition, and error deduplication. |
| Chat-Completions non-stream → Responses | `service/openaicompat/chat_to_responses.go`, `service/openaicompat/chat_to_responses_test.go` | Map non-stream Chat-Completions JSON responses into Responses-API JSON responses, building output[] items and mapping usage with cached-token decomposition and truncation signaling. |
| Dispatch and small helpers | `relay/responses_handler.go`, `relay/helper/stream_scanner.go`, `service/channel_affinity_usage_cache_test.go` | Feature-flag gating for the pivot (shouldUseResponsesToAnthropicPivot), early-return branch to responsesViaChatCompletions, preserve caller-provided StreamStatus, and test cache reset helper for isolation. |
| Tests & test helpers | multiple `*_test.go` files across packages | Comprehensive unit tests for request translation, stream event translation, Claude relay behaviors (cache_control, response_format, missing tool_result), tool ID sanitization, handler wiring, and stream/non-stream conversion scenarios. |
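
One ingredient of a deterministic end-of-stream flush is closing open function calls in a stable order. Go map iteration order is unspecified, so a sketch along these lines collects the open entries and sorts them before emitting anything. The `funcCall` type and `closeOrder` function are hypothetical reductions for illustration, not the PR's actual state types.

```go
package main

import (
	"fmt"
	"sort"
)

// funcCall is a reduced stand-in for per-tool-call stream state.
type funcCall struct {
	ItemIndex int
	ID        string
	Done      bool
}

// closeOrder marks every open call as done and returns the call ids in a
// stable order: by ItemIndex first, then by ID as a tie-breaker.
func closeOrder(calls map[int]*funcCall) []string {
	open := make([]*funcCall, 0, len(calls))
	for _, fc := range calls {
		if !fc.Done {
			open = append(open, fc)
		}
	}
	sort.Slice(open, func(i, j int) bool {
		if open[i].ItemIndex != open[j].ItemIndex {
			return open[i].ItemIndex < open[j].ItemIndex
		}
		return open[i].ID < open[j].ID
	})
	ids := make([]string, 0, len(open))
	for _, fc := range open {
		fc.Done = true
		ids = append(ids, fc.ID)
	}
	return ids
}

func main() {
	calls := map[int]*funcCall{
		2: {ItemIndex: 3, ID: "fc_c"},
		0: {ItemIndex: 1, ID: "fc_a"},
		1: {ItemIndex: 2, ID: "fc_b", Done: true},
	}
	fmt.Println(closeOrder(calls))
}
```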

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • QuantumNous/new-api#3505: Modifies Claude file/media handling in the same relay path; overlapping concerns around ContentTypeFile conversion.
  • QuantumNous/new-api#1531: Related to Claude tool_use/tool_result handling and name population, operating on the same tool-use/result surface.
  • QuantumNous/new-api#1384: Adjacent changes to Claude relay/tool handling and DTOs; complements prompt-cache and tool shaping edits.

Suggested reviewers

  • seefs001
  • Calcium-Ion
  • creamlike1024

"🐰
A pivot stitched from spec to code with care,
Tools scrubbed, caches marked, and streams sorted fair,
Messages hop the bridge and find Anthropic's shore,
Clients clap as Responses flow back once more."

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 12

🧹 Nitpick comments (2)
openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md (1)

3-4: 💤 Low value

Update the placeholder Purpose section.

The Purpose section contains a TBD placeholder noting it should be updated after archiving. Since this spec is now part of the PR, the Purpose should be filled in with a meaningful description of the specification's intent.

📝 Suggested Purpose text
 ## Purpose
-TBD - created by archiving change responses-to-anthropic-translation. Update Purpose after archive.
+This specification defines the behavioral requirements for translating OpenAI Responses API requests to Anthropic Messages API format and back, enabling `/v1/responses` requests to be routed to Anthropic-typed upstream channels through a two-hop Chat-Completions intermediate translation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md`
around lines 3 - 4, Replace the TBD placeholder in the Purpose section of the
spec (look for the "Purpose" heading in responses-to-anthropic-translation spec)
with a concise description stating the specification's intent: to define
endpoint-driven source format detection for Anthropic response translations,
explain the problem it solves, the scope (when/where detection occurs), and the
expected outcome (how consumers should use the detected format). Ensure the text
is clear, present-tense, and a few sentences long to provide meaningful context
now that the spec is being archived.
openspec/specs/responses-to-anthropic-translation/spec.md (1)

3-4: 💤 Low value

Update the placeholder Purpose section.

The Purpose section contains a TBD placeholder noting it should be updated after archiving. Since this spec is now part of the PR, the Purpose should be filled in with a meaningful description of the specification's intent.

📝 Suggested Purpose text
 ## Purpose
-TBD - created by archiving change responses-to-anthropic-translation. Update Purpose after archive.
+This specification defines the behavioral requirements for translating OpenAI Responses API requests to Anthropic Messages API format and back, enabling `/v1/responses` requests to be routed to Anthropic-typed upstream channels through a two-hop Chat-Completions intermediate translation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openspec/specs/responses-to-anthropic-translation/spec.md` around lines 3 -
4, Replace the placeholder text under the "## Purpose" header in the
responses-to-anthropic-translation spec with a concise description of the
specification's intent: explain what this translation spec covers, its goals
(e.g., mapping response formats from source to Anthropic-compatible outputs),
intended consumers, and any scope/limitations; update the header block so "##
Purpose" contains this meaningful paragraph instead of "TBD - created by
archiving change responses-to-anthropic-translation."
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@relay/channel/claude/relay-claude.go`:
- Around line 142-160: The helper applyCacheControlToLastAssistantContent
currently returns if the last assistant message's Content isn't a
[]dto.ClaudeMediaMessage; change it to also handle the plain-string case
produced by RequestOpenAI2ClaudeMessage by detecting when messages[i].Content is
a string, creating a single-block []dto.ClaudeMediaMessage with Type "text" (or
the appropriate text field), set its CacheControl to
claudeAssistantCacheControlMarker, assign that slice back to
messages[i].Content, and return; keep the existing logic for when Content is
already []dto.ClaudeMediaMessage.

In `@relay/responses_via_chat_completions.go`:
- Around line 137-150: This bypass path around
HandleStreamResponseData/HandleClaudeResponseData needs to preserve Claude
refusal marking: after retrieving claudeError from claudeResponse (using
GetClaudeError) and before emitting events/writeEvents and calling
sr.Stop(types.WithClaudeError(...)), detect refusal errors (claudeError.Type ==
"refusal" or equivalent) and set the same marker used elsewhere
(constant.ContextKeyAdminRejectReason with the refusal message) on the
request/context or invoke the shared handling function
(HandleClaudeResponseData/HandleStreamResponseData) that performs that work so
the stop_reason=refusal signal is recorded for moderation/accounting.
- Around line 159-163: The code assigns raw Anthropic/Claude usage
(claudeInfo.Usage) directly to chatChunk.Usage, but the Responses translator
expects OpenAI-style counts and needs cached read/creation tokens folded into
prompt_tokens; update the assignment to convert/normalize Claude usage to OpenAI
semantics (reuse the same conversion logic used by the existing Claude→OpenAI
path), e.g. call a helper like
convertAnthropicUsageToOpenAI/normalizeClaudeUsage(claudeInfo.Usage) before
setting chatChunk.Usage so prompt_tokens includes cached read/creation tokens;
apply the same change for the other branch handling claudeInfo usage.
- Around line 136-176: The handler currently calls sr.Error(e) on JSON parse or
write failures but never sets the shared streamErr variable, so the final EOS
flush (openaicompat.ChatCompletionsStreamToResponsesEvents(nil, state) followed
by writeEvents) still runs and can emit a synthetic response.completed; fix by
assigning an appropriate error to streamErr when calling sr.Error (e.g., set
streamErr = types.WithStreamError(e) or wrap as needed) inside the
helper.StreamScannerHandler error branches (the JSON unmarshal and writeEvents
failure paths) and then skip the unconditional EOS flush by only
generating/writing flushEvents if streamErr == nil (or return early if streamErr
is non-nil) so real failures propagate instead of completing the stream.

In `@service/openaicompat/chat_stream_to_responses_test.go`:
- Around line 3-20: The test helper unmarshalEvent (and other test helpers
around lines referenced) currently calls json.Unmarshal directly; replace those
direct uses with common.Unmarshal to follow the repository JSON wrapper policy:
call common.Unmarshal(data, &m) and assert require.NoError(t, err) (similar to
the existing Marshal usage), remove the import of "encoding/json" if no longer
needed, and update any other direct encoding/json deserialization occurrences
(e.g., the area noted at 457-460) to use the appropriate common.Unmarshal/Common
helpers (common.UnmarshalJsonStr / common.DecodeJson) as applicable.

In `@service/openaicompat/chat_stream_to_responses.go`:
- Around line 441-471: The loop in closeAllOpenFunctionCalls iterates
state.FuncCalls (a map) which yields nondeterministic order; to fix, collect the
open function-call entries from state.FuncCalls into a slice, sort that slice by
the tool index / fc.ItemIndex (or by fc.ItemIndex then fc.ID for tie-breaker),
then iterate the sorted slice and call emitEvent as before (keep funcCallItemID,
arguments defaulting to "{}" and setting fc.Done = true). This ensures
ResponsesStreamState.FuncCalls are closed in a stable, deterministic order when
emitting response.function_call_arguments.done and response.output_item.done
events.
- Around line 186-252: The handler fails when a '<think>' or '</think>' token is
split across chunks; modify ResponsesStreamState to hold a small
pendingTagBuffer string and in handleTextDeltaWithInlineThink prepend that
buffer to incoming text, then after processing keep any trailing partial tag
(e.g., leading '<', '</', '<th', etc.) in pendingTagBuffer instead of emitting
it as regular text; update state.InThinkInlineTag, call
ensureMessageOpen/ensureReasoningOpen and
closeMessageIfOpen/closeReasoningIfOpen and emitEvent exactly as before but
operate on the combined buffer, and clear pendingTagBuffer once a full tag is
consumed so tag boundaries across chunks are handled correctly.
- Around line 153-177: The code currently calls emitEvent(state,
"response.created", ...) and discards the return (which still increments the
sequence), causing the first real returned event to have sequence_number 2; in
EmitChatStreamErrorEvent avoid emitting the created prelude as a discarded
side-effect — instead, when state.Started is false, prepare the prelude (set
CreatedAt/ResponseID), and include the "response.created" event in the returned
events slice (using emitEvent) followed by the "response.failed" event, removing
the `_ = emitEvent(...)` discard; ensure you still set state.Started = true and
state.ErrorEmitted = true after preparing the events so sequence numbers are
produced only for returned events.

In `@service/openaicompat/chat_to_responses.go`:
- Around line 497-503: The fallback ID logic for function_call items uses the
same constant fc_<idBase> when tc.ID is empty, causing duplicate IDs; update the
code that sets fcItemID (the block using tc.ID and idBase) to generate a unique
ID per output item (e.g., append a per-message index, timestamp, or UUID) so
that when tc.ID is blank you produce something like fc_<idBase>_<seq> (while
still ensuring the "fc_" prefix); ensure this uniqueness strategy is applied
where fcItemID is assigned so subsequent uses of fcItemID refer to the unique
value for each output item.

In `@service/openaicompat/responses_to_chat.go`:
- Around line 583-589: The switch branch handling `"input_text", "output_text"`
currently uses `if t, _ := pm["text"].(string); true { ... }` which
unconditionally appends a text MediaContent even when `pm["text"]` is missing or
not a string; change the condition to check the type assertion result (e.g., `if
t, ok := pm["text"].(string); ok { result = append(result, dto.MediaContent{
Type: dto.ContentTypeText, Text: t, }) }`) so only valid non-empty string `text`
values are converted and appended.
- Line 4: Remove the unnecessary "encoding/json" sentinel import from
responses_to_chat.go and delete the sentinel variable that exists solely to keep
that import alive; ensure all JSON usage continues to use the
repository-compliant wrappers (common.Unmarshal, common.Marshal,
common.GetJsonType) and run tests/lint to confirm no direct encoding/json
references remain (locate the import declaration and the sentinel variable near
the JSON handling code in this file and remove both).

In `@service/openaicompat/tool_call_ids.go`:
- Around line 74-79: The current logic in the loop that sets tc.ID calls
sanitizeOneToolID and then writes idMap[origID] = newID, which can overwrite an
existing remap and produce inconsistent IDs for repeated invalid originals;
change the flow in the block handling tc.ID so it first checks if idMap already
contains origID and, if so, reuses idMap[origID] to set tc.ID, otherwise call
sanitizeOneToolID(origID), assign the newID to tc.ID and add idMap[origID] =
newID (without overwriting existing entries). Ensure you update the code paths
that reference sanitizeOneToolID, idMap and tc.ID to follow this
lookup-before-generate behavior.

---

Nitpick comments:
In
`@openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md`:
- Around line 3-4: Replace the TBD placeholder in the Purpose section of the
spec (look for the "Purpose" heading in responses-to-anthropic-translation spec)
with a concise description stating the specification's intent: to define
endpoint-driven source format detection for Anthropic response translations,
explain the problem it solves, the scope (when/where detection occurs), and the
expected outcome (how consumers should use the detected format). Ensure the text
is clear, present-tense, and a few sentences long to provide meaningful context
now that the spec is being archived.

In `@openspec/specs/responses-to-anthropic-translation/spec.md`:
- Around line 3-4: Replace the placeholder text under the "## Purpose" header in
the responses-to-anthropic-translation spec with a concise description of the
specification's intent: explain what this translation spec covers, its goals
(e.g., mapping response formats from source to Anthropic-compatible outputs),
intended consumers, and any scope/limitations; update the header block so "##
Purpose" contains this meaningful paragraph instead of "TBD - created by
archiving change responses-to-anthropic-translation."
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e27c3a1a-4f23-4a60-ae5c-dac32a80cfe8

📥 Commits

Reviewing files that changed from the base of the PR and between 20d3e73 and 18e56d9.

📒 Files selected for processing (23)
  • dto/claude.go
  • openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/.openspec.yaml
  • openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/design.md
  • openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/proposal.md
  • openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/specs/responses-to-anthropic-translation/spec.md
  • openspec/changes/archive/2026-05-20-responses-to-anthropic-translation/tasks.md
  • openspec/specs/responses-to-anthropic-translation/spec.md
  • relay/channel/claude/relay-claude.go
  • relay/channel/claude/relay_claude_test.go
  • relay/helper/stream_scanner.go
  • relay/responses_handler.go
  • relay/responses_via_chat_completions.go
  • relay/responses_via_chat_completions_test.go
  • service/channel_affinity_usage_cache_test.go
  • service/openaicompat/chat_stream_to_responses.go
  • service/openaicompat/chat_stream_to_responses_test.go
  • service/openaicompat/chat_to_responses.go
  • service/openaicompat/chat_to_responses_test.go
  • service/openaicompat/responses_stream_state.go
  • service/openaicompat/responses_to_chat.go
  • service/openaicompat/responses_to_chat_test.go
  • service/openaicompat/tool_call_ids.go
  • service/openaicompat/tool_call_ids_test.go

Comment thread relay/channel/claude/relay-claude.go
Comment thread relay/responses_via_chat_completions.go Outdated
Comment thread relay/responses_via_chat_completions.go
Comment thread relay/responses_via_chat_completions.go
Comment thread service/openaicompat/chat_stream_to_responses_test.go Outdated
Comment thread service/openaicompat/chat_stream_to_responses.go
Comment thread service/openaicompat/chat_to_responses.go
Comment thread service/openaicompat/responses_to_chat.go Outdated
Comment thread service/openaicompat/responses_to_chat.go
Comment thread service/openaicompat/tool_call_ids.go
…pivot

- relay-claude.go: applyCacheControlToLastAssistantContent now promotes plain
  string assistant content to a single-block []ClaudeMediaMessage so cache_control
  is attached on the common text-only case produced by RequestOpenAI2ClaudeMessage.
- responses_via_chat_completions.go: assign streamErr on JSON/SSE-write failures
  and skip the EOS flush when set, so upstream errors stop emitting a synthetic
  response.completed. Mark Claude refusals on both streaming and non-streaming
  paths (parity with HandleStreamResponseData / non-pivot handler). Fold Claude
  cache_read/creation tokens into prompt_tokens via a new
  normalizeClaudeUsageForOpenAISemantics helper so the Responses translator,
  which subtracts cached from prompt, yields correct input/total counts.
- chat_stream_to_responses.go:
  - EmitChatStreamErrorEvent now returns the response.created prelude in the
    events slice instead of discarding it, so error-only streams emit
    response.failed at sequence_number 1.
  - handleTextDeltaWithInlineThink buffers a trailing partial <think>/</think>
    fragment across chunks via a new splitPendingThinkTag helper and
    state.PendingTagBuffer; the buffer is flushed at EOS.
  - closeAllOpenFunctionCalls iterates FuncCalls sorted by tool index for
    deterministic close ordering (was nondeterministic Go map iteration).
- chat_to_responses.go: fallback function_call item IDs are now suffixed with a
  per-output-item counter so multiple toolless calls don't collide on
  fc_<idBase>.
- responses_to_chat.go: drop unused encoding/json sentinel; fix unconditional
  text-part append in convertResponsesContentParts (only emit when type
  assertion succeeds).
- tool_call_ids.go: reuse an existing idMap entry before sanitizing again so
  repeated invalid originals (e.g. multiple "::::") map to a single sanitized id
  consistent with the tool_result remap.
- chat_stream_to_responses_test.go: replace direct json.Unmarshal calls with
  common.Unmarshal per the repository JSON wrapper rule.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
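
The partial-tag buffering described in the commit message above can be sketched as follows. The helper name `splitPendingThinkTag` comes from the commit message, but its signature and body here are guesses written for illustration, and `thinkSplitter` is a hypothetical stand-in for the stream state that owns the pending buffer.

```go
package main

import (
	"fmt"
	"strings"
)

var thinkTags = []string{"<think>", "</think>"}

// splitPendingThinkTag splits text into the part that is safe to process now
// and a trailing fragment that could be the start of a think tag. Complete
// tags are left in the first part for the caller to handle.
func splitPendingThinkTag(text string) (emit, pending string) {
	maxLen := len("</think>") - 1 // longest possible incomplete tag
	if maxLen > len(text) {
		maxLen = len(text)
	}
	for n := maxLen; n > 0; n-- {
		suffix := text[len(text)-n:]
		for _, tag := range thinkTags {
			if len(suffix) < len(tag) && strings.HasPrefix(tag, suffix) {
				return text[:len(text)-n], suffix
			}
		}
	}
	return text, ""
}

// thinkSplitter carries the pending fragment from one chunk to the next.
type thinkSplitter struct {
	pending string
}

// feed prepends the buffered fragment to the chunk and returns the text that
// is safe to process; any new trailing fragment is held back.
func (s *thinkSplitter) feed(chunk string) string {
	emit, pending := splitPendingThinkTag(s.pending + chunk)
	s.pending = pending
	return emit
}

func main() {
	s := &thinkSplitter{}
	for _, chunk := range []string{"abc<th", "ink>reasoning</thi", "nk>done"} {
		fmt.Printf("%q -> %q\n", chunk, s.feed(chunk))
	}
}
```

At end of stream the caller still has to flush whatever remains in the pending buffer as ordinary text, since a lone `<` that never completes into a tag would otherwise be lost.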
@nullmastermind
Author

Hello @Calcium-Ion @seefs001, could you please help review this PR when you have a moment? Thank you 🙏

Quick readiness summary:

  • CodeRabbit: 12 actionable comments + 2 nitpicks. All 12 addressed in 3910958; the bot auto-resolved every thread. The two doc nitpicks (TBD placeholders in openspec docs) were intentionally skipped — CodeRabbit marked them "💤 Low value".
  • Build & tests: go build ./... clean; go test ./service/openaicompat/... ./relay/channel/claude/... ./relay/... all pass.
  • Feature flag: gated behind RESPONSES_TO_ANTHROPIC_ENABLED (default true) so it can be disabled without a redeploy if anything regresses.
  • Scope: net-new files under service/openaicompat/ + relay/responses_via_chat_completions*.go, plus small edits to relay/responses_handler.go, relay/channel/claude/relay-claude.go, relay/helper/stream_scanner.go, and dto/claude.go. No behavior change for existing Anthropic Chat-Completions traffic.

Happy to address any further feedback.
