refactor(apicompat): redesign the Codex Responses ↔ Chat Completions …#2926

Merged
Wei-Shaw merged 2 commits into Wei-Shaw:main from visa2:feat/codex-responses-bridge-redesign
Jun 1, 2026

@visa2 visa2 commented May 31, 2026

…bridge

Codex CLI speaks the OpenAI Responses protocol (streaming, store:false), while many upstreams (e.g. DeepSeek in thinking mode) only expose Chat Completions. The bridge that translates between the two had grown field by field and leaned on Go's serialization defaults. The official OpenAI endpoints tolerate the resulting payloads, but both the Responses client (Codex) and the Chat upstream reject them.

Problems this fixes (all observed running Codex CLI against a DeepSeek upstream):

  • Streaming reasoning was never shown in the Codex TUI (the answer appeared with no visible thinking): reasoning deltas were emitted before the reasoning item was opened, so the strict client discarded them.
  • A tool-using turn could wedge the session into a "no response" state: the function_call stream was never closed (no function_call_arguments.done / output_item.done), so Codex never saw the tool call complete.
  • Parallel tool calls were rejected upstream (400/502): each function_call became its own assistant message, producing consecutive assistant messages with mismatched tool replies.
  • A tool turn was rejected with "reasoning_content in the thinking mode must be passed back": the reasoning that produced the tool call was dropped instead of being returned on the assistant message.
  • Items with no Chat equivalent (web_search_call, ...) and Codex's command-approval notice landed between an assistant tool_calls message and its tool reply, triggering "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'".
  • Interrupt/reconnect left an unanswered or dangling tool_call in the history, triggering the same 400.

The shared root cause is reliance on serialization defaults — omitempty dropping protocol-required zero values, and unrecognized item types falling through a generic path — rather than deliberately reproducing the target protocol. The bridge is reworked into two explicit layers.
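
To make the omitempty failure mode concrete, here is a minimal sketch. The two struct types are hypothetical and are not the bridge's real types; only the encoding/json behavior is standard.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deltaWithOmitempty is a hypothetical event struct (not the bridge's type).
// With omitempty, a zero output_index is silently left off the wire.
type deltaWithOmitempty struct {
	Type        string `json:"type"`
	OutputIndex int    `json:"output_index,omitempty"`
	Delta       string `json:"delta"`
}

// deltaExplicit always serializes output_index, including the value 0.
type deltaExplicit struct {
	Type        string `json:"type"`
	OutputIndex int    `json:"output_index"`
	Delta       string `json:"delta"`
}

// mustJSON marshals v and panics on error, which keeps the demo short.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Prints {"type":"response.output_text.delta","delta":"hi"}
	fmt.Println(mustJSON(deltaWithOmitempty{Type: "response.output_text.delta", Delta: "hi"}))
	// Prints {"type":"response.output_text.delta","output_index":0,"delta":"hi"}
	fmt.Println(mustJSON(deltaExplicit{Type: "response.output_text.delta", Delta: "hi"}))
}
```

The first output item in a stream has index 0, so the omitempty variant drops the field exactly where a strict client needs it.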

Request direction (Responses input -> Chat messages): a parse -> build -> normalize pipeline.

  • reasoning_content is carried back on the assistant message that produced a tool call (DeepSeek thinking mode requires it to continue the same thought)
  • consecutive function_call items (parallel tool calls) are merged into a single assistant message's tool_calls array
  • item types with no Chat equivalent are skipped instead of leaking through a generic path
  • normalizeChatMessages is the single invariant gate: it guarantees every assistant tool_calls message is immediately followed by one tool reply per tool_call_id. To do so it moves any intervening message (such as a command-approval notice) to after the replies, drops unanswered tool_calls and orphan tool replies, and preserves bare passthrough tool messages.
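
The invariant gate in the last bullet can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: Message, ToolCall, normalize and roles are hypothetical names, and "bare passthrough tool message" is taken here to mean a tool message without a tool_call_id.

```go
package main

import (
	"fmt"
	"strings"
)

// ToolCall and Message are simplified, hypothetical stand-ins for the
// bridge's Chat Completions types.
type ToolCall struct {
	ID   string
	Name string
}

type Message struct {
	Role       string
	Content    string
	ToolCalls  []ToolCall
	ToolCallID string
}

// normalize returns a new slice in which each assistant tool_calls message
// is immediately followed by exactly one reply per remaining tool call.
func normalize(msgs []Message) []Message {
	// Index replies by tool_call_id; the first reply for an ID wins.
	replies := map[string]Message{}
	for _, m := range msgs {
		if m.Role == "tool" && m.ToolCallID != "" {
			if _, seen := replies[m.ToolCallID]; !seen {
				replies[m.ToolCallID] = m
			}
		}
	}

	out := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		switch {
		case m.Role == "tool" && m.ToolCallID != "":
			// Re-emitted next to its matching call below; a reply that no
			// call claims is an orphan and is dropped.
			continue
		case m.Role == "assistant" && len(m.ToolCalls) > 0:
			var kept []ToolCall
			var answers []Message
			for _, tc := range m.ToolCalls {
				if r, ok := replies[tc.ID]; ok {
					kept = append(kept, tc)
					answers = append(answers, r)
					delete(replies, tc.ID) // each reply is used once
				}
			}
			m.ToolCalls = kept // unanswered calls are dropped
			if len(kept) == 0 && m.Content == "" {
				continue // nothing left in this message
			}
			out = append(out, m)
			out = append(out, answers...)
		default:
			// Plain text and tool messages without an ID pass through.
			out = append(out, m)
		}
	}
	return out
}

// roles renders the role sequence, which makes the ordering easy to inspect.
func roles(msgs []Message) string {
	parts := make([]string, len(msgs))
	for i, m := range msgs {
		parts[i] = m.Role
	}
	return strings.Join(parts, " ")
}

func main() {
	history := []Message{
		{Role: "user", Content: "list files"},
		{Role: "assistant", ToolCalls: []ToolCall{{ID: "a", Name: "shell"}, {ID: "b", Name: "shell"}}},
		{Role: "user", Content: "command approved"}, // intervening notice
		{Role: "tool", ToolCallID: "a", Content: "ok"},
		{Role: "tool", ToolCallID: "b", Content: "ok"},
		{Role: "assistant", ToolCalls: []ToolCall{{ID: "c", Name: "shell"}}}, // never answered
		{Role: "tool", ToolCallID: "z", Content: "stale"},                     // orphan reply
	}
	fmt.Println(roles(normalize(history))) // user assistant tool tool user
}
```

Running every request through one function like this means the earlier build steps do not each need to re-establish the ordering themselves.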

Response direction (Chat SSE -> Responses SSE): ResponsesStreamEvent.MarshalJSON constructs each streamed event explicitly so protocol-required fields are always present (output_index/content_index/summary_index at 0, message content:[], reasoning summary:[], function_call call_id/name/arguments, output_text part text/annotations/logprobs). This is a single source of truth that removes any post-hoc JSON patching. Reasoning is emitted as its own output item, opened before its deltas, and tool calls are fully closed (function_call_arguments.done + output_item.done with complete arguments).
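
As a rough illustration of building events explicitly, the sketch below uses a hypothetical streamEvent type in place of the real ResponsesStreamEvent. The field set is an assumption trimmed down for the example, and the output_item branch only models function_call items.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// streamEvent is a hypothetical stand-in for the bridge's ResponsesStreamEvent.
type streamEvent struct {
	Type        string
	OutputIndex int
	ItemID      string
	Delta       string
	CallID      string
	Name        string
	Arguments   string
}

// MarshalJSON builds the wire object field by field, so zero values such as
// output_index 0 or an empty arguments string are still written.
func (e streamEvent) MarshalJSON() ([]byte, error) {
	switch e.Type {
	case "response.output_text.delta":
		return json.Marshal(map[string]any{
			"type":          e.Type,
			"item_id":       e.ItemID,
			"output_index":  e.OutputIndex,
			"content_index": 0,
			"delta":         e.Delta,
		})
	case "response.output_item.added", "response.output_item.done":
		// Simplification: only function_call items are modelled here.
		return json.Marshal(map[string]any{
			"type":         e.Type,
			"output_index": e.OutputIndex,
			"item": map[string]any{
				"type":      "function_call",
				"id":        e.ItemID,
				"call_id":   e.CallID,
				"name":      e.Name,
				"arguments": e.Arguments,
			},
		})
	default:
		return nil, fmt.Errorf("unsupported event type %q", e.Type)
	}
}

// encode marshals one event and panics on error, which keeps the demo short.
func encode(e streamEvent) string {
	b, err := json.Marshal(e)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(streamEvent{Type: "response.output_text.delta", ItemID: "msg_1", Delta: "hi"}))
	fmt.Println(encode(streamEvent{
		Type: "response.output_item.added", OutputIndex: 1,
		ItemID: "fc_1", CallID: "call_1", Name: "shell",
	}))
}
```

Because every key is written by hand, an unknown event type fails loudly instead of falling back to struct-tag defaults. encoding/json emits map keys in sorted order, so the output is deterministic.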

Tests cover request-direction message invariants against golden Codex request shapes (parallel calls, unknown items, intervening messages, partial/dangling calls), per-event wire completeness, and streaming lifecycle ordering.
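
A lifecycle-ordering check of the kind mentioned above might look like the following sketch. The event type and checkLifecycle are hypothetical and are not taken from this PR's test suite; the event names in the sample data are meant to mirror the Responses stream but should be treated as illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// event is a hypothetical, reduced view of a streamed Responses event.
type event struct {
	Type   string
	ItemID string
}

// checkLifecycle verifies that every delta targets an item that is currently
// open, and that every opened item is eventually closed.
func checkLifecycle(events []event) error {
	open := map[string]bool{}
	for i, e := range events {
		switch {
		case e.Type == "response.output_item.added":
			if open[e.ItemID] {
				return fmt.Errorf("event %d: item %q opened twice", i, e.ItemID)
			}
			open[e.ItemID] = true
		case e.Type == "response.output_item.done":
			if !open[e.ItemID] {
				return fmt.Errorf("event %d: item %q closed but never opened", i, e.ItemID)
			}
			delete(open, e.ItemID)
		case strings.HasSuffix(e.Type, ".delta"):
			if !open[e.ItemID] {
				return fmt.Errorf("event %d: %s for unopened item %q", i, e.Type, e.ItemID)
			}
		}
	}
	if len(open) > 0 {
		return fmt.Errorf("%d item(s) never closed", len(open))
	}
	return nil
}

func main() {
	stream := []event{
		{"response.output_item.added", "rs_1"},
		{"response.reasoning_summary_text.delta", "rs_1"},
		{"response.output_item.done", "rs_1"},
		{"response.output_item.added", "fc_1"},
		{"response.function_call_arguments.delta", "fc_1"},
		{"response.function_call_arguments.done", "fc_1"},
		{"response.output_item.done", "fc_1"},
	}
	fmt.Println(checkLifecycle(stream) == nil) // true
	// Dropping the first event reproduces "delta before the item was opened".
	fmt.Println(checkLifecycle(stream[1:]))
}
```

The two failure modes this catches correspond to the first two problems listed in the description: deltas sent before their item is opened, and a function_call that is never closed.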

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions Bot commented May 31, 2026

All contributors have signed the CLA. ✅
Posted by the CLA Assistant Lite bot.

visa2 commented May 31, 2026

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request May 31, 2026
errcheck (check-type-assertions) flagged unchecked single-value type
assertions; switch to the comma-ok form so golangci-lint passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
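
For readers unfamiliar with the lint rule, the difference between the two assertion forms looks roughly like this. argString is a hypothetical helper written for illustration, not a function from this change.

```go
package main

import "fmt"

// argString is a hypothetical helper: it reads a string field from a decoded
// JSON object without panicking when the field is missing or has another type.
func argString(obj map[string]any, key string) (string, bool) {
	// The single-value form, s := obj[key].(string), panics on a mismatch.
	// The comma-ok form reports the mismatch through ok instead.
	s, ok := obj[key].(string)
	return s, ok
}

func main() {
	obj := map[string]any{"name": "shell", "index": 0}

	name, ok := argString(obj, "name")
	fmt.Println(name, ok) // shell true

	_, ok = argString(obj, "index") // wrong type: int, not string
	fmt.Println(ok)                 // false
}
```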
@Wei-Shaw Wei-Shaw merged commit aa69e39 into Wei-Shaw:main Jun 1, 2026
7 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 1, 2026