This repository was archived by the owner on Feb 18, 2026. It is now read-only.

Fix OpenAI Streaming and Tool Calling #14

Open

elidickinson wants to merge 5 commits into 9j:main from elidickinson:upstream-pr/openai-streaming-and-tools

Conversation

@elidickinson

This PR fixes bugs in SSE streaming and implements full tool calling support for OpenAI-compatible providers.

SSE Streaming Fixes

Event Queue Bug (src/providers/streaming.rs)

  • Problem: When multiple SSE events arrived in the same TCP chunk, only the first was emitted; the rest were dropped
  • Impact: finish_reason chunks were lost, so clients hung with incomplete responses
  • Fix: Added VecDeque to buffer and emit all parsed events

Double SSE Wrapping (src/server/mod.rs)

  • Problem: Server wrapped provider SSE with Event::default().data() → malformed output
  • Fix: Pass through raw SSE bytes using Body::from_stream()

Tool Calling Support

Request Transformation

  • Emit tool messages before user content (OpenAI requires: assistant → tool → user)
  • Fixes 422 error with parallel tool calls

Response Transformation

  • Non-streaming: Transform message.tool_calls[] → content[].tool_use
  • Streaming: Handle incremental tool calls across multiple chunks using StreamTransformState
  • Map finish_reason correctly: "tool_calls" → "tool_use"

This fixes two critical bugs that caused SSE streaming to fail:

## 1. Event Queue Bug (src/providers/streaming.rs)

**Problem**: SseStream dropped events when multiple SSE events arrived in
the same TCP chunk: only the first event was emitted; the others were discarded.

**Impact**: finish_reason chunks were frequently dropped, causing clients to
hang or error with incomplete responses.

**Fix**: Added event_queue (VecDeque) to buffer all parsed events. Now:
- Parse all complete events from each chunk
- Queue them all
- Emit events one at a time from the queue
- Properly handles multiple events arriving together
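The buffering approach described above can be sketched as follows. This is a minimal, self-contained illustration (not the PR's actual SseStream implementation; `drain_complete_events` is a hypothetical name): split the accumulated buffer on the SSE event delimiter, queue every complete event in a VecDeque, and keep any trailing partial event for the next chunk.

```rust
use std::collections::VecDeque;

/// Drain every complete SSE event from `buffer` into `queue`.
/// SSE events are separated by a blank line ("\n\n"); any trailing
/// partial event stays in the buffer until the next TCP chunk arrives.
fn drain_complete_events(buffer: &mut String, queue: &mut VecDeque<String>) {
    while let Some(pos) = buffer.find("\n\n") {
        let event: String = buffer.drain(..pos + 2).collect();
        let event = event.trim_end().to_string();
        if !event.is_empty() {
            queue.push_back(event);
        }
    }
}

fn main() {
    // Two complete events plus a partial one arrive in a single chunk.
    let mut buffer = String::from("data: one\n\ndata: two\n\ndata: par");
    let mut queue = VecDeque::new();
    drain_complete_events(&mut buffer, &mut queue);
    assert_eq!(queue.len(), 2);      // both complete events queued
    assert_eq!(buffer, "data: par"); // partial event retained for next chunk
}
```

The stream then pops from the queue one event per poll, so no event is lost when several arrive together.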

## 2. Double SSE Wrapping (src/server/mod.rs)

**Problem**: Server was wrapping provider SSE events (which already have
"event:" and "data:" lines) with axum's Event::default().data(), causing
double wrapping:

Before (broken):
  data: event: message_start
  data: data: {"type":"message_start",...}

After (correct):
  event: message_start
  data: {"type":"message_start",...}

**Fix**: Pass through raw SSE bytes from provider without wrapping:
- Use Body::from_stream() instead of Sse::new()
- Set SSE headers manually
- Provider output goes directly to client
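To illustrate why the old path produced malformed output: re-framing an already-formatted SSE event as the `data:` field of another event (which is effectively what `Event::default().data()` does) prefixes every line with `data: `. The sketch below models that behavior with a hypothetical `wrap_as_data` helper; the fix is simply to not do this and forward the raw bytes.

```rust
/// Model of what re-wrapping a pre-formatted SSE payload as a `data:`
/// field does: every line of the payload gets a "data: " prefix,
/// producing the broken "data: event: ..." output shown above.
fn wrap_as_data(payload: &str) -> String {
    payload
        .lines()
        .map(|line| format!("data: {line}\n"))
        .collect::<String>()
        + "\n" // event terminator
}

fn main() {
    let raw = "event: message_start\ndata: {\"type\":\"message_start\"}";
    let wrapped = wrap_as_data(raw);
    // Double-wrapped framing the client cannot parse:
    assert!(wrapped.starts_with("data: event: message_start"));
    assert!(wrapped.contains("data: data: "));
    // The fix: forward `raw` untouched (Body::from_stream), keeping the
    // provider's framing intact.
}
```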

Files changed:
- src/providers/streaming.rs: +43/-8 lines
- src/server/mod.rs: +29/-12 lines

Implements full bidirectional tool transformation for OpenAI-compatible providers:

## Non-streaming Tool Calls
- Transform OpenAI message.tool_calls[] → Anthropic content[].tool_use
- Parse tool arguments from JSON string to structured object
- Handle mixed responses (text + tool calls)
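The non-streaming mapping above can be sketched with simplified shapes. These structs are hypothetical (field names follow the OpenAI and Anthropic wire formats); the real transform parses the `arguments` JSON string into a structured object with serde_json, which is kept opaque here for brevity.

```rust
/// Simplified model of one entry in OpenAI's message.tool_calls[].
struct OpenAiToolCall {
    id: String,
    name: String,
    arguments: String, // JSON-encoded string on the wire
}

/// Simplified model of an Anthropic content[].tool_use block.
#[derive(Debug, PartialEq)]
struct ToolUseBlock {
    r#type: &'static str,
    id: String,
    name: String,
    input: String, // real code: parsed via serde_json::from_str(&arguments)
}

fn to_tool_use(call: OpenAiToolCall) -> ToolUseBlock {
    ToolUseBlock {
        r#type: "tool_use",
        id: call.id,
        name: call.name,
        input: call.arguments,
    }
}

fn main() {
    let call = OpenAiToolCall {
        id: "call_1".into(),
        name: "get_weather".into(),
        arguments: r#"{"city":"Paris"}"#.into(),
    };
    let block = to_tool_use(call);
    assert_eq!(block.r#type, "tool_use");
    assert_eq!(block.name, "get_weather");
}
```

A mixed response simply produces a text block followed by one such tool_use block per tool call.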

## Streaming Tool Calls
- Detect and transform tool_calls in OpenAI streaming chunks
- Handle incremental tool calls (sent across multiple chunks)
- Emit proper Anthropic event sequence: content_block_start → delta → stop
- Use StreamTransformState to track state across chunks
- Close text blocks before tool blocks
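The streaming state machine can be sketched as below. This is a minimal model, not the PR's StreamTransformState (the names `StreamState` and `on_tool_delta` are hypothetical): the first delta for a tool call closes any open text block and opens a tool block; later deltas only append argument fragments, which accumulate across chunks.

```rust
/// Accumulates one tool call's fields across streaming chunks.
struct ToolCallState {
    id: String,
    name: String,
    arguments: String, // grows as fragments arrive
}

#[derive(Default)]
struct StreamState {
    text_block_open: bool,
    current_tool: Option<ToolCallState>,
}

impl StreamState {
    /// Feed one OpenAI chunk's tool_call delta; returns the names of the
    /// Anthropic events that should be emitted for it.
    fn on_tool_delta(&mut self, id: Option<&str>, name: Option<&str>, args_fragment: &str) -> Vec<String> {
        let mut events = Vec::new();
        if self.current_tool.is_none() {
            // Close any open text block before starting a tool block.
            if self.text_block_open {
                events.push("content_block_stop".to_string());
                self.text_block_open = false;
            }
            self.current_tool = Some(ToolCallState {
                id: id.unwrap_or_default().to_string(),
                name: name.unwrap_or_default().to_string(),
                arguments: String::new(),
            });
            events.push("content_block_start (tool_use)".to_string());
        }
        self.current_tool.as_mut().unwrap().arguments.push_str(args_fragment);
        events.push("content_block_delta (input_json_delta)".to_string());
        events
    }
}

fn main() {
    let mut state = StreamState { text_block_open: true, ..Default::default() };
    let first = state.on_tool_delta(Some("call_1"), Some("get_weather"), r#"{"city":"#);
    let second = state.on_tool_delta(None, None, r#""Paris"}"#);
    assert_eq!(first[0], "content_block_stop"); // text block closed first
    assert_eq!(second.len(), 1);                // later chunks only emit deltas
    assert_eq!(state.current_tool.unwrap().arguments, r#"{"city":"Paris"}"#);
}
```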

## Tool Message Ordering
- Fix parallel tool calls: OpenAI requires assistant(tool_calls) → tool → tool → user
- Reorder Anthropic format (single user message with tool_results + text)
- Emit tool messages BEFORE user content to satisfy OpenAI ordering
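The reordering can be sketched as a split-and-reassemble over the content items of the Anthropic user message. This is a simplified model (the `Content` enum and `reorder` function are illustrative names, and messages are reduced to role/content pairs): tool_result items become individual `tool` messages first, then the remaining text becomes the `user` message.

```rust
/// Simplified content items of a single Anthropic user message.
enum Content {
    ToolResult { tool_use_id: String, output: String },
    Text(String),
}

/// Split one Anthropic user message into OpenAI-ordered (role, content)
/// pairs: one `tool` message per tool_result first, then one `user`
/// message carrying the remaining text.
fn reorder(contents: Vec<Content>) -> Vec<(String, String)> {
    let mut messages = Vec::new();
    let mut user_text = String::new();
    for item in contents {
        match item {
            Content::ToolResult { tool_use_id, output } => {
                messages.push(("tool".to_string(), format!("{tool_use_id}: {output}")));
            }
            Content::Text(text) => user_text.push_str(&text),
        }
    }
    if !user_text.is_empty() {
        messages.push(("user".to_string(), user_text));
    }
    messages
}

fn main() {
    // Parallel tool calls: two results plus text in one user message.
    let msgs = reorder(vec![
        Content::Text("continue".into()),
        Content::ToolResult { tool_use_id: "call_1".into(), output: "ok".into() },
        Content::ToolResult { tool_use_id: "call_2".into(), output: "ok".into() },
    ]);
    let roles: Vec<&str> = msgs.iter().map(|(role, _)| role.as_str()).collect();
    assert_eq!(roles, ["tool", "tool", "user"]); // tool messages precede user content
}
```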

## Stop Reason Mapping
- Map OpenAI finish_reason → Anthropic stop_reason correctly:
  - stop → end_turn
  - length → max_tokens
  - tool_calls → tool_use
- Applies to both streaming and non-streaming
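The mapping is a direct match over the three values listed above. One sketch (the fallback arm for unrecognized values is an assumption, not stated in the PR):

```rust
/// Map OpenAI's finish_reason to Anthropic's stop_reason.
fn map_stop_reason(finish_reason: &str) -> &'static str {
    match finish_reason {
        "stop" => "end_turn",
        "length" => "max_tokens",
        "tool_calls" => "tool_use",
        _ => "end_turn", // assumption: unknown values fall back to end_turn
    }
}

fn main() {
    assert_eq!(map_stop_reason("stop"), "end_turn");
    assert_eq!(map_stop_reason("length"), "max_tokens");
    assert_eq!(map_stop_reason("tool_calls"), "tool_use");
}
```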

Files changed:
- Cargo.toml: Add uuid dependency for streaming tool block IDs
- src/providers/openai.rs: Complete tool transformation implementation

Problem: When providers close streams without sending a finish_reason chunk,
the stream ends without a message_stop event, causing Claude Code to hang
waiting for continuation. This manifests as:
- Tool completes but no automatic continuation
- Subagent finishes but requires manual "ok" to proceed

Fix: Add stream finalization handler that checks if message_stop was sent.
If stream ends without finish_reason:
- Close any open content/tool blocks
- Send message_delta with end_turn
- Send message_stop to properly terminate stream

This ensures Claude Code always receives proper stream termination and
continues automatically after tools/subagents complete.
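The finalization check can be sketched as follows, assuming the stream handler tracks whether message_stop was already emitted and how many content/tool blocks are still open (the `StreamEnd` struct and `finalize` function are illustrative names):

```rust
/// Minimal model of end-of-stream state.
#[derive(Default)]
struct StreamEnd {
    message_stop_sent: bool,
    open_blocks: usize, // content/tool blocks not yet closed
}

/// Called when the provider stream ends; returns the trailing events
/// needed so the client always sees a properly terminated stream.
fn finalize(state: &StreamEnd) -> Vec<&'static str> {
    if state.message_stop_sent {
        return Vec::new(); // stream already terminated cleanly
    }
    let mut events = Vec::new();
    for _ in 0..state.open_blocks {
        events.push("content_block_stop"); // close dangling blocks
    }
    events.push("message_delta (stop_reason: end_turn)");
    events.push("message_stop");
    events
}

fn main() {
    // Provider closed the stream mid-block with no finish_reason.
    let abrupt = StreamEnd { message_stop_sent: false, open_blocks: 1 };
    let events = finalize(&abrupt);
    assert_eq!(events.len(), 3);
    assert_eq!(events.last(), Some(&"message_stop"));
    // A cleanly terminated stream needs no extra events.
    assert!(finalize(&StreamEnd { message_stop_sent: true, open_blocks: 0 }).is_empty());
}
```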
@elidickinson elidickinson deleted the upstream-pr/openai-streaming-and-tools branch January 23, 2026 16:10
@elidickinson elidickinson restored the upstream-pr/openai-streaming-and-tools branch January 28, 2026 22:24
@elidickinson elidickinson reopened this Jan 28, 2026
