refactor(apicompat): redesign the Codex Responses ↔ Chat Completions … by visa2 · Pull Request #2926 · Wei-Shaw/sub2api

visa2 · 2026-05-31T08:21:35Z

…bridge

Codex CLI speaks the OpenAI Responses protocol (streaming, store:false), while many upstreams (e.g. DeepSeek in thinking mode) only expose Chat Completions. The bridge that translates between the two had grown field by field and leaned on Go's serialization defaults, producing payloads that the official OpenAI endpoints tolerate but that both the Responses client (Codex) and the Chat upstream reject.

Problems this fixes (all observed running Codex CLI against a DeepSeek upstream):

- Streaming reasoning was never shown in the Codex TUI (the answer appeared with no visible thinking): reasoning deltas were emitted before the reasoning item was opened, so the strict client discarded them.
- A tool-using turn could wedge the session into a "no response" state: the function_call stream was never closed (no function_call_arguments.done / output_item.done), so Codex never saw the tool call complete.
- Parallel tool calls were rejected upstream (400/502): each function_call became its own assistant message, producing consecutive assistant messages with mismatched tool replies.
- A tool turn was rejected with "reasoning_content in the thinking mode must be passed back": the reasoning that produced the tool call was dropped instead of being returned on the assistant message.
- Items with no Chat equivalent (web_search_call, ...) and Codex's command-approval notice landed between an assistant tool_calls message and its tool reply, triggering "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'".
- Interrupt/reconnect left an unanswered or dangling tool_call in the history, triggering the same 400.
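
Three of the failures above (parallel calls, dropped reasoning_content, unknown items) come down to the shape of the Chat messages built from the Responses input items. The following is a minimal sketch of a build step that avoids them, not the code from this PR: the `Item`, `ToolCall` and `Message` types and the `buildMessages` function are hypothetical, and the item fields are flattened compared with the real Responses payload.

```go
package main

import "fmt"

// Item is a simplified, hypothetical view of one Responses input item.
type Item struct {
	Type      string // "message", "reasoning", "function_call", "function_call_output", ...
	Role      string
	Text      string
	CallID    string
	Name      string
	Arguments string
}

// ToolCall and Message are simplified, hypothetical Chat Completions shapes.
type ToolCall struct{ ID, Name, Arguments string }

type Message struct {
	Role             string
	Content          string
	ReasoningContent string
	ToolCallID       string
	ToolCalls        []ToolCall
}

// buildMessages folds consecutive function_call items into one assistant
// message, attaches the preceding reasoning to it, and skips unknown items.
func buildMessages(items []Item) []Message {
	var out []Message
	pendingReasoning := ""
	for _, it := range items {
		switch it.Type {
		case "message":
			out = append(out, Message{Role: it.Role, Content: it.Text})
			pendingReasoning = ""
		case "reasoning":
			pendingReasoning += it.Text
		case "function_call":
			tc := ToolCall{ID: it.CallID, Name: it.Name, Arguments: it.Arguments}
			// A parallel call extends the assistant message that is already open.
			if n := len(out); n > 0 && out[n-1].Role == "assistant" && len(out[n-1].ToolCalls) > 0 {
				out[n-1].ToolCalls = append(out[n-1].ToolCalls, tc)
				continue
			}
			out = append(out, Message{
				Role:             "assistant",
				ReasoningContent: pendingReasoning,
				ToolCalls:        []ToolCall{tc},
			})
			pendingReasoning = ""
		case "function_call_output":
			out = append(out, Message{Role: "tool", ToolCallID: it.CallID, Content: it.Text})
		default:
			// No Chat equivalent (for example web_search_call): skip it.
		}
	}
	return out
}

func main() {
	msgs := buildMessages([]Item{
		{Type: "message", Role: "user", Text: "list files"},
		{Type: "reasoning", Text: "need ls and pwd"},
		{Type: "function_call", CallID: "c1", Name: "ls", Arguments: "{}"},
		{Type: "function_call", CallID: "c2", Name: "pwd", Arguments: "{}"},
		{Type: "web_search_call"},
		{Type: "function_call_output", CallID: "c1", Text: "a.txt"},
		{Type: "function_call_output", CallID: "c2", Text: "/tmp"},
	})
	for _, m := range msgs {
		fmt.Printf("%s calls=%d reasoning=%q reply_to=%q\n",
			m.Role, len(m.ToolCalls), m.ReasoningContent, m.ToolCallID)
	}
}
```

With this input the two function_call items end up on a single assistant message that also carries the reasoning text, and the web_search_call item produces no message at all.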

The shared root cause is reliance on serialization defaults — omitempty dropping protocol-required zero values, and unrecognized item types falling through a generic path — rather than deliberately reproducing the target protocol. The bridge is reworked into two explicit layers.
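
The omitempty half of that root cause is easy to reproduce in isolation. The sketch below uses two hypothetical event structs (`implicitEvent`, `explicitEvent`) that are not taken from the PR; it only shows how `encoding/json` drops a zero `output_index` and an empty `summary` under omitempty, and how an explicit marshaler keeps both on the wire.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// implicitEvent relies on struct tags: omitempty drops output_index when it
// is 0 and drops summary when the slice is nil or empty.
type implicitEvent struct {
	Type        string   `json:"type"`
	OutputIndex int      `json:"output_index,omitempty"`
	Summary     []string `json:"summary,omitempty"`
}

// explicitEvent has no omitempty, and its marshaler turns a nil slice into an
// empty one so the wire carries [] rather than null.
type explicitEvent struct {
	Type        string   `json:"type"`
	OutputIndex int      `json:"output_index"`
	Summary     []string `json:"summary"`
}

func (e explicitEvent) MarshalJSON() ([]byte, error) {
	type plain explicitEvent // same fields, no methods: avoids infinite recursion
	p := plain(e)
	if p.Summary == nil {
		p.Summary = []string{}
	}
	return json.Marshal(p)
}

// encode is a small helper that returns the JSON text or the error message.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(implicitEvent{Type: "response.output_item.added"}))
	// {"type":"response.output_item.added"}
	fmt.Println(encode(explicitEvent{Type: "response.output_item.added"}))
	// {"type":"response.output_item.added","output_index":0,"summary":[]}
}
```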

Request direction (Responses input -> Chat messages): a parse -> build -> normalize pipeline.

- reasoning_content is carried back on the assistant message that produced a tool call (DeepSeek thinking mode requires it to continue the same thought)
- consecutive function_call items (parallel tool calls) are merged into a single assistant message's tool_calls array
- item types with no Chat equivalent are skipped instead of leaking through a generic path
- normalizeChatMessages is the single invariant gate: it guarantees every assistant tool_calls message is immediately followed by one tool reply per tool_call_id. It does so by reordering any intervening message (such as a command-approval notice) to after the replies, dropping unanswered tool_calls and orphan tool replies, and preserving bare passthrough tool messages.
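
The invariant gate described in the last bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the `Message` and `ToolCall` types are hypothetical, the scan for replies stops at the next assistant message, and a "bare passthrough" tool message is assumed to mean a tool message without a tool_call_id.

```go
package main

import "fmt"

// ToolCall and Message are simplified, hypothetical Chat Completions shapes.
type ToolCall struct{ ID, Name, Arguments string }

type Message struct {
	Role       string
	Content    string
	ToolCallID string
	ToolCalls  []ToolCall
}

// normalizeChatMessages makes every assistant tool_calls message be followed
// directly by one tool reply per kept call, in call order.
func normalizeChatMessages(in []Message) []Message {
	out := make([]Message, 0, len(in))
	for i := 0; i < len(in); {
		m := in[i]
		i++
		switch {
		case m.Role == "assistant" && len(m.ToolCalls) > 0:
			wanted := map[string]bool{}
			for _, tc := range m.ToolCalls {
				wanted[tc.ID] = true
			}
			replies := map[string]Message{}
			var deferred []Message
			// Collect everything up to the next assistant message.
			for ; i < len(in) && in[i].Role != "assistant"; i++ {
				next := in[i]
				if next.Role == "tool" && next.ToolCallID != "" {
					if wanted[next.ToolCallID] {
						replies[next.ToolCallID] = next
					}
					continue // matched replies are re-emitted below; orphans are dropped
				}
				deferred = append(deferred, next) // e.g. a command-approval notice
			}
			var kept []ToolCall
			for _, tc := range m.ToolCalls {
				if _, ok := replies[tc.ID]; ok {
					kept = append(kept, tc) // unanswered calls are dropped
				}
			}
			m.ToolCalls = kept
			if len(kept) > 0 || m.Content != "" {
				out = append(out, m)
			}
			for _, tc := range kept {
				out = append(out, replies[tc.ID])
			}
			out = append(out, deferred...)
		case m.Role == "tool" && m.ToolCallID != "":
			// Orphan reply with no open assistant tool_calls message: drop it.
		default:
			out = append(out, m)
		}
	}
	return out
}

func main() {
	msgs := []Message{
		{Role: "user", Content: "run ls"},
		{Role: "assistant", ToolCalls: []ToolCall{{ID: "a", Name: "ls"}, {ID: "b", Name: "pwd"}}},
		{Role: "user", Content: "approved"},
		{Role: "tool", ToolCallID: "a", Content: "a.txt"},
		{Role: "tool", ToolCallID: "z", Content: "orphan"},
		{Role: "assistant", Content: "done"},
	}
	for _, m := range normalizeChatMessages(msgs) {
		fmt.Printf("%s %q calls=%d reply_to=%q\n", m.Role, m.Content, len(m.ToolCalls), m.ToolCallID)
	}
}
```

In this example the unanswered call "b" and the orphan reply "z" disappear, and the "approved" notice moves to after the reply for call "a".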

Response direction (Chat SSE -> Responses SSE): ResponsesStreamEvent.MarshalJSON constructs each streamed event explicitly so protocol-required fields are always present (output_index/content_index/summary_index at 0, message content:[], reasoning summary:[], function_call call_id/name/arguments, output_text part text/annotations/logprobs). This is a single source of truth that removes any post-hoc JSON patching. Reasoning is emitted as its own output item, opened before its deltas, and tool calls are fully closed (function_call_arguments.done + output_item.done with complete arguments).
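
The lifecycle ordering in that paragraph (open an item before its deltas, close a tool call with both done events) can be sketched as a small state machine. This is a simplified illustration, not the PR's code: `chatDelta`, `event` and `translate` are hypothetical, the event type strings follow Responses streaming naming but the exact set the bridge emits is not shown in this PR text, and the sketch handles only one reasoning run followed by at most one tool call.

```go
package main

import (
	"fmt"
	"strings"
)

// chatDelta is a hypothetical, flattened view of one Chat SSE chunk.
type chatDelta struct {
	Reasoning string // reasoning_content fragment
	CallID    string // set on the first chunk of a tool call
	ArgsPart  string // tool call arguments fragment
}

// event is a hypothetical, flattened Responses stream event.
type event struct {
	Type        string
	OutputIndex int
	Delta       string
	Arguments   string
}

// translate opens each output item before emitting its deltas and closes a
// tool call with arguments.done followed by output_item.done.
func translate(deltas []chatDelta) []event {
	var (
		out           []event
		index         int
		reasoningOpen bool
		callOpen      bool
		args          strings.Builder
	)
	emit := func(typ, delta, full string) {
		out = append(out, event{Type: typ, OutputIndex: index, Delta: delta, Arguments: full})
	}
	closeReasoning := func() {
		if reasoningOpen {
			emit("response.output_item.done", "", "")
			reasoningOpen = false
			index++
		}
	}
	closeCall := func() {
		if callOpen {
			emit("response.function_call_arguments.done", "", args.String())
			emit("response.output_item.done", "", args.String())
			callOpen = false
			index++
		}
	}
	for _, d := range deltas {
		if d.Reasoning != "" {
			if !reasoningOpen {
				emit("response.output_item.added", "", "") // open before any delta
				reasoningOpen = true
			}
			emit("response.reasoning_summary_text.delta", d.Reasoning, "")
		}
		if d.CallID != "" || d.ArgsPart != "" {
			closeReasoning() // reasoning is its own item and ends first
			if !callOpen {
				emit("response.output_item.added", "", "")
				callOpen = true
			}
			if d.ArgsPart != "" {
				args.WriteString(d.ArgsPart)
				emit("response.function_call_arguments.delta", d.ArgsPart, "")
			}
		}
	}
	closeReasoning()
	closeCall()
	return out
}

func main() {
	evs := translate([]chatDelta{
		{Reasoning: "think "},
		{Reasoning: "more"},
		{CallID: "c1", ArgsPart: `{"x":`},
		{ArgsPart: `1}`},
	})
	for _, e := range evs {
		fmt.Printf("%d %s %q %q\n", e.OutputIndex, e.Type, e.Delta, e.Arguments)
	}
}
```

The final two events both carry the complete arguments string, which is the property a strict client needs in order to treat the tool call as finished.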

Tests cover request-direction message invariants against golden Codex request shapes (parallel calls, unknown items, intervening messages, partial/dangling calls), per-event wire completeness, and streaming lifecycle ordering.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-05-31T08:21:46Z

All contributors have signed the CLA. ✅
Posted by the CLA Assistant Lite bot.

visa2 · 2026-05-31T08:23:58Z

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request May 31, 2026

@visa2 has signed the CLA in #2926

fb59039

test(apicompat): check type assertions in responses stream wire tests

003b278

errcheck (check-type-assertions) flagged unchecked single-value type assertions; switch to the comma-ok form so golangci-lint passes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
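
For readers unfamiliar with the lint rule, the change amounts to the pattern below. It is a generic sketch, not the test code from this commit: the `outputIndex` helper is hypothetical and simply reads a field from a decoded event using the comma-ok form of a type assertion.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// outputIndex decodes a JSON event and reads output_index using checked
// (comma-ok) type assertions instead of the single-value form, which panics
// when the dynamic type does not match.
func outputIndex(raw []byte) (int, error) {
	var decoded any
	if err := json.Unmarshal(raw, &decoded); err != nil {
		return 0, err
	}
	obj, ok := decoded.(map[string]any)
	if !ok {
		return 0, fmt.Errorf("event is %T, want a JSON object", decoded)
	}
	v, ok := obj["output_index"].(float64) // JSON numbers decode to float64
	if !ok {
		return 0, fmt.Errorf("output_index is missing or not a number")
	}
	return int(v), nil
}

func main() {
	fmt.Println(outputIndex([]byte(`{"output_index":0}`)))
	fmt.Println(outputIndex([]byte(`[]`)))
}
```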

Wei-Shaw merged commit aa69e39 into Wei-Shaw:main Jun 1, 2026
7 checks passed

github-actions Bot locked and limited conversation to collaborators Jun 1, 2026

Wei-Shaw merged 2 commits into Wei-Shaw:main from visa2:feat/codex-responses-bridge-redesign
