refactor(apicompat): redesign the Codex Responses ↔ Chat Completions …#2926
Merged
Wei-Shaw merged 2 commits on Jun 1, 2026
…bridge
Codex CLI speaks the OpenAI Responses protocol (streaming, store:false), while
many upstreams (e.g. DeepSeek in thinking mode) only expose Chat Completions.
The bridge that translates between the two had grown field by field and leaned
on Go's serialization defaults, producing payloads that the official OpenAI
endpoints tolerate but that both the Responses client (Codex) and the Chat
upstream reject.
Problems this fixes (all observed running Codex CLI against a DeepSeek upstream):
- Streaming reasoning was never shown in the Codex TUI (the answer appeared
  with no visible thinking): reasoning deltas were emitted before the reasoning
  item was opened, so the strict client discarded them.
- A tool-using turn could wedge the session into a "no response" state: the
  function_call stream was never closed (no function_call_arguments.done /
  output_item.done), so Codex never saw the tool call complete.
- Parallel tool calls were rejected upstream (400/502): each function_call
  became its own assistant message, producing consecutive assistant messages
  with mismatched tool replies.
- A tool turn was rejected with "reasoning_content in the thinking mode must be
  passed back": the reasoning that produced the tool call was dropped instead
  of being returned on the assistant message.
- Items with no Chat equivalent (web_search_call, ...) and Codex's
  command-approval notice landed between an assistant tool_calls message and
  its tool reply, triggering "An assistant message with 'tool_calls' must be
  followed by tool messages responding to each 'tool_call_id'".
- Interrupt/reconnect left an unanswered or dangling tool_call in the history,
  triggering the same 400.
The shared root cause is reliance on serialization defaults — omitempty dropping
protocol-required zero values, and unrecognized item types falling through a
generic path — rather than deliberately reproducing the target protocol. The
bridge is reworked into two explicit layers.
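
To illustrate the omitempty problem named above, here is a minimal sketch. The two struct types and the encode helper are hypothetical and much simpler than the bridge's real event types; only the standard encoding/json behaviour is relied on.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// laxDelta is a hypothetical, simplified stream event. The omitempty tag
// makes encoding/json drop output_index whenever it is 0, which is exactly
// the value of the first output item.
type laxDelta struct {
	Type        string `json:"type"`
	OutputIndex int    `json:"output_index,omitempty"`
	Delta       string `json:"delta"`
}

// strictDelta is the same hypothetical event without omitempty, so the
// zero value is always written to the wire.
type strictDelta struct {
	Type        string `json:"type"`
	OutputIndex int    `json:"output_index"`
	Delta       string `json:"delta"`
}

// encode marshals v and returns the JSON text.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// output_index is missing from the first line and present in the second.
	fmt.Println(encode(laxDelta{Type: "response.output_text.delta", Delta: "hi"}))
	fmt.Println(encode(strictDelta{Type: "response.output_text.delta", Delta: "hi"}))
}
```

A client that requires output_index on every event would accept only the second form.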
Request direction (Responses input -> Chat messages): a parse -> build ->
normalize pipeline.
- reasoning_content is carried back on the assistant message that produced a
  tool call (DeepSeek thinking mode requires it to continue the same thought)
- consecutive function_call items (parallel tool calls) are merged into a
  single assistant message's tool_calls array
- item types with no Chat equivalent are skipped instead of leaking through a
  generic path
- normalizeChatMessages is the single invariant gate: it guarantees every
  assistant tool_calls message is immediately followed by one tool reply per
  tool_call_id. To get there it moves any intervening message (such as a
  command-approval notice) to after the replies, drops unanswered tool_calls
  and orphan tool replies, and preserves bare passthrough tool messages.
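
The invariant described in the last item could be enforced roughly as follows. This is a sketch under stated assumptions, not the PR's normalizeChatMessages: the chatMessage and toolCall types, the normalize and roles functions, the rule that a reply window ends at the next assistant message, and the reading of "bare passthrough" as a tool message without a tool_call_id are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// toolCall and chatMessage are hypothetical, simplified Chat message types.
type toolCall struct{ ID, Name, Args string }

type chatMessage struct {
	Role       string     // "system", "user", "assistant" or "tool"
	Content    string
	ToolCalls  []toolCall // assistant messages only
	ToolCallID string     // tool replies only
}

// normalize makes every assistant tool_calls message be followed directly
// by one reply per tool_call_id.
func normalize(msgs []chatMessage) []chatMessage {
	out := make([]chatMessage, 0, len(msgs))
	for i := 0; i < len(msgs); i++ {
		m := msgs[i]
		if m.Role == "tool" && m.ToolCallID != "" {
			continue // orphan reply: no open tool_calls message claims it
		}
		if m.Role != "assistant" || len(m.ToolCalls) == 0 {
			out = append(out, m) // includes tool messages without an ID (assumed passthrough)
			continue
		}
		wanted := map[string]bool{}
		for _, c := range m.ToolCalls {
			wanted[c.ID] = true
		}
		replies := map[string]chatMessage{}
		var deferred []chatMessage
		// Assumption: the reply window runs until the next assistant message.
		j := i + 1
		for ; j < len(msgs) && msgs[j].Role != "assistant"; j++ {
			n := msgs[j]
			if n.Role == "tool" && n.ToolCallID != "" {
				if _, dup := replies[n.ToolCallID]; wanted[n.ToolCallID] && !dup {
					replies[n.ToolCallID] = n
				}
				continue // unmatched or duplicate replies are dropped
			}
			deferred = append(deferred, n) // e.g. a command-approval notice
		}
		var kept []toolCall
		for _, c := range m.ToolCalls {
			if _, ok := replies[c.ID]; ok {
				kept = append(kept, c) // unanswered calls are dropped
			}
		}
		m.ToolCalls = kept
		if len(kept) > 0 || m.Content != "" {
			out = append(out, m)
		}
		for _, c := range kept {
			out = append(out, replies[c.ID])
		}
		out = append(out, deferred...)
		i = j - 1 // resume after the window
	}
	return out
}

// roles renders the role sequence, which is enough to see the ordering.
func roles(msgs []chatMessage) string {
	r := make([]string, len(msgs))
	for i, m := range msgs {
		r[i] = m.Role
	}
	return strings.Join(r, ",")
}

func main() {
	history := []chatMessage{
		{Role: "user", Content: "list files"},
		{Role: "assistant", ToolCalls: []toolCall{{ID: "a", Name: "shell"}, {ID: "b", Name: "shell"}}},
		{Role: "user", Content: "command approved"},
		{Role: "tool", ToolCallID: "b", Content: "out-b"},
		{Role: "tool", ToolCallID: "a", Content: "out-a"},
		{Role: "tool", ToolCallID: "z", Content: "orphan"},
	}
	fmt.Println(roles(normalize(history))) // user,assistant,tool,tool,user
}
```

In the usage above the approval notice moves behind both replies, the replies follow the order of the tool_calls array, and the orphan reply for "z" disappears.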
Response direction (Chat SSE -> Responses SSE): ResponsesStreamEvent.MarshalJSON
constructs each streamed event explicitly so protocol-required fields are always
present (output_index/content_index/summary_index at 0, message content:[],
reasoning summary:[], function_call call_id/name/arguments, output_text part
text/annotations/logprobs). This is a single source of truth that removes any
post-hoc JSON patching. Reasoning is emitted as its own output item, opened
before its deltas, and tool calls are fully closed
(function_call_arguments.done + output_item.done with complete arguments).
Tests cover request-direction message invariants against golden Codex request
shapes (parallel calls, unknown items, intervening messages, partial/dangling
calls), per-event wire completeness, and streaming lifecycle ordering.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
All contributors have signed the CLA. ✅
I have read the CLA Document and I hereby sign the CLA
errcheck (check-type-assertions) flagged unchecked single-value type assertions; switch to the comma-ok form so golangci-lint passes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
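
For readers unfamiliar with the lint rule in this second commit, a minimal sketch of the comma-ok form follows. The callID helper and the map shape are hypothetical and are not taken from the changed files.

```go
package main

import "fmt"

// callID reads a string "call_id" from a decoded JSON object.
// The single-value form item["call_id"].(string) would panic when the key
// is missing or holds another type; the comma-ok form reports that instead.
func callID(item map[string]any) (string, bool) {
	id, ok := item["call_id"].(string)
	return id, ok
}

func main() {
	id, ok := callID(map[string]any{"call_id": "call_1"})
	fmt.Println(id, ok)

	// A non-string value yields ok == false rather than a panic.
	_, ok = callID(map[string]any{"call_id": 42})
	fmt.Println(ok)
}
```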