This repository was archived by the owner on Feb 18, 2026. It is now read-only.
fix(openai): properly transform streaming tool_calls and handle incomplete streams #11
Closed
Conversation
fix(openai): properly transform streaming tool_calls and handle incomplete streams

This fix addresses two critical issues with OpenAI-compatible provider streaming:

1. **Tool Calls Transformation**: OpenAI streaming sends `tool_calls` in a different format than Anthropic. This commit implements full transformation:
   - Detects `tool_calls` in OpenAI streaming chunks
   - Transforms them to Anthropic `tool_use` format with the proper event structure
   - Closes text content blocks before `tool_use` blocks
   - Sends `content_block_start`/`delta`/`stop` for each tool

2. **Incomplete Stream Handling**: Some OpenAI-compatible providers (notably Cerebras) close the stream without sending a `finish_reason` chunk. This commit adds:
   - Stream finalization that detects incomplete streams
   - Automatic end-event generation (`content_block_stop`, `message_delta`, `message_stop`)
   - Prevention of duplicate end events when `finish_reason` IS sent

The fix ensures:
- ✅ Streaming works with tool calls (TodoWrite, etc.)
- ✅ No duplicate messages
- ✅ Graceful handling of provider-specific streaming bugs
- ✅ Full Anthropic API compatibility

Tested with: Cerebras (`zai-glm-4.6`); streaming tool calls work perfectly.
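The tool-call transformation described above can be sketched as follows. This is a minimal, self-contained illustration: `ToolCallDelta` and `tool_call_delta_to_sse` are hypothetical names invented for the sketch (the PR's actual code uses serde-based structs in `src/providers/openai.rs`), and it builds the SSE event strings by hand rather than via a JSON library:

```rust
// Hypothetical, simplified view of one OpenAI streaming tool_call delta.
// In OpenAI streams, `id` and `name` arrive only in the first chunk of a
// call; later chunks carry only JSON fragments in `arguments`.
struct ToolCallDelta {
    index: usize,
    id: Option<String>,
    name: Option<String>,
    arguments: Option<String>,
}

/// Map one OpenAI tool_call delta onto Anthropic-style SSE events.
/// `block_open` tracks whether content_block_start was already emitted
/// for this tool call, so it is sent exactly once.
fn tool_call_delta_to_sse(delta: &ToolCallDelta, block_open: &mut bool) -> Vec<String> {
    let mut events = Vec::new();
    if !*block_open {
        if let (Some(id), Some(name)) = (&delta.id, &delta.name) {
            events.push(format!(
                "event: content_block_start\ndata: {{\"type\":\"content_block_start\",\"index\":{},\"content_block\":{{\"type\":\"tool_use\",\"id\":\"{}\",\"name\":\"{}\",\"input\":{{}}}}}}\n\n",
                delta.index, id, name
            ));
            *block_open = true;
        }
    }
    if let Some(args) = &delta.arguments {
        // Anthropic streams tool arguments as input_json_delta fragments.
        events.push(format!(
            "event: content_block_delta\ndata: {{\"type\":\"content_block_delta\",\"index\":{},\"delta\":{{\"type\":\"input_json_delta\",\"partial_json\":{:?}}}}}\n\n",
            delta.index, args
        ));
    }
    events
}

fn main() {
    let mut open = false;
    let first = ToolCallDelta {
        index: 1,
        id: Some("toolu_1".to_string()),
        name: Some("TodoWrite".to_string()),
        arguments: Some("{\"todos\":".to_string()),
    };
    for e in tool_call_delta_to_sse(&first, &mut open) {
        print!("{e}");
    }
    assert!(open);
}
```

A `content_block_stop` for the tool block would then be emitted once the stream moves past this tool call (or during finalization, per the workaround below).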
I’m having the same problems, but tool calling still isn’t working for me from Claude Code on this branch. There isn’t some other secret step I’m missing, right? I’ll check it out with debug logging later.
For whatever reason my Claude Code only seems to make non-streaming requests for tool calls, so this seems to do it in quick testing: #12
Author
Hmm, yeah, I'm having some issues today since the new Claude update. Getting lots of "No response requested.", though it does work for 10-20 calls. I'll close this and try out your branch, or throw some more tokens at it to try and fix it.
Actually, let me reroll that - I think we need both versions.
This is, I think, unrelated, but you might want it too: elidickinson@fdf5bab. I should probably open a new PR for it.
## Problem
OpenAI-compatible providers (especially Cerebras) have two critical streaming issues:
1. **Tool calls not transformed**: When the model generates `tool_calls` in streaming mode, they arrive in OpenAI format but need to be transformed to Anthropic's `tool_use` format for Claude Code compatibility.
2. **Incomplete streams**: Some providers (Cerebras) close the HTTP stream without sending a `finish_reason` chunk when `tool_calls` are generated, causing the stream to end abruptly with `hyper::Error(IncompleteMessage)`.

## Solution
### 1. Tool Calls Transformation
- Detects `tool_calls` in OpenAI streaming chunks
- Transforms them to Anthropic's `tool_use` format with proper SSE events:
  - `content_block_start` with tool metadata
  - `content_block_delta` with tool arguments
  - `content_block_stop` to close the tool block

### 2. Stream Finalization Workaround
- Added a `stream_ended_properly` flag to track whether `finish_reason` was received
- Generates the end events automatically when no `finish_reason` was received

## Testing
Tested extensively with:
- Cerebras (`zai-glm-4.6`): streaming tool calls work; no more `hyper::Error(IncompleteMessage)`

## Changes
- `Cargo.toml`: Added `uuid` dependency for generating unique message IDs
- `src/providers/openai.rs`:
  - New stream-chunk structs (`OpenAIStreamChunk`, `OpenAIStreamChoice`, `OpenAIStreamDelta`)
  - `transform_openai_chunk_to_anthropic_sse`: full transformation logic, including `tool_use`
  - `send_message_stream`: stream finalization workaround
  - `Arc<Mutex<bool>>` for thread-safe flags

## Before/After
**Before:**
- `hyper::Error(IncompleteMessage)` in logs
- `tool_calls` not transformed to Anthropic's `tool_use` format

**After:**
- Streaming tool calls work (TodoWrite, etc.)
- No duplicate messages
- Full Anthropic API compatibility
## Impact
This fix enables Claude Code to work seamlessly with OpenAI-compatible providers that:
- send `tool_calls` in streaming mode
- close streams early (without sending `finish_reason`)

Especially critical for Cerebras, which is known for fast, cost-effective inference but has this streaming quirk.
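The stream-finalization workaround can be sketched as follows. This is a minimal sketch, not the PR's actual code: `finalize_stream` is a hypothetical helper name, and the exact event payloads are illustrative. It does show the `Arc<Mutex<bool>>` flag pattern the PR describes, where the flag is set by the chunk-processing task when `finish_reason` arrives, so no duplicate end events are emitted:

```rust
use std::sync::{Arc, Mutex};

/// Returns the trailing SSE events to emit when the provider closes the
/// stream. If `finish_reason` was already seen, the normal path emitted
/// the end events, so nothing extra is sent (prevents duplicates).
fn finalize_stream(stream_ended_properly: &Arc<Mutex<bool>>) -> Vec<&'static str> {
    if *stream_ended_properly.lock().unwrap() {
        Vec::new()
    } else {
        // Provider (e.g. Cerebras) dropped the connection without a
        // finish_reason chunk: synthesize the Anthropic end events.
        vec![
            "event: content_block_stop\ndata: {\"type\":\"content_block_stop\",\"index\":0}\n\n",
            "event: message_delta\ndata: {\"type\":\"message_delta\",\"delta\":{\"stop_reason\":\"end_turn\"},\"usage\":{\"output_tokens\":0}}\n\n",
            "event: message_stop\ndata: {\"type\":\"message_stop\"}\n\n",
        ]
    }
}

fn main() {
    // Incomplete stream: no finish_reason was ever received.
    let flag = Arc::new(Mutex::new(false));
    assert_eq!(finalize_stream(&flag).len(), 3);

    // finish_reason was received: the flag is set, no duplicate end events.
    *flag.lock().unwrap() = true;
    assert!(finalize_stream(&flag).is_empty());
}
```

The `Arc<Mutex<bool>>` is what makes this safe to share between the task that parses chunks (which sets the flag) and the finalization code that runs when the HTTP stream closes.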