Summary
When streaming responses from Anthropic, the stop_reason field is sent in message_delta SSE events but not captured or exposed by RubyLLM. This field is important for understanding why a response ended.
Background
Anthropic's streaming API sends stop_reason in the final message_delta event:
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":42}}
Possible values include:
- end_turn: Normal completion
- tool_use: Model wants to call a tool
- max_tokens: Response truncated due to token limit
- pause_turn: Server paused for native tool execution (web_search, etc.)
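As a rough illustration of where the field lives, the sketch below pulls stop_reason out of a raw SSE frame like the one above. The helper name extract_stop_reason and the SSE_FRAME constant are made up for this example and are not part of RubyLLM; it only assumes the frame has a single data: line containing JSON.

```ruby
require "json"

# Sample frame copied from the event shown above.
SSE_FRAME = <<~SSE
  event: message_delta
  data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":42}}
SSE

# Hypothetical helper (not a RubyLLM API): returns the stop_reason carried by
# a message_delta frame, or nil for other event types or missing fields.
def extract_stop_reason(frame)
  data_line = frame.each_line.find { |line| line.start_with?("data:") }
  return nil unless data_line

  payload = JSON.parse(data_line.delete_prefix("data:").strip)
  return nil unless payload["type"] == "message_delta"

  payload.dig("delta", "stop_reason")
end

stop_reason = extract_stop_reason(SSE_FRAME)
```

Frames of any other type, or message_delta frames without a stop_reason, yield nil, which mirrors the guard used in the proposed accumulator change.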
Current Behavior
StreamAccumulator doesn't capture stop_reason. The final Message has no way to indicate why the response ended.
Proposed Solution
- Add stop_reason attribute to Message class
- Capture from message_delta events in StreamAccumulator
- Attach to final Message in to_message
# In StreamAccumulator
def add(chunk)
  # ... existing logic ...

  # Remember the stop_reason carried by the final message_delta event.
  if chunk.respond_to?(:raw_sse_data) && chunk.raw_sse_data
    data = chunk.raw_sse_data
    if data["type"] == "message_delta" && data.dig("delta", "stop_reason")
      @stop_reason = data.dig("delta", "stop_reason")
    end
  end
end

def to_message(response = nil)
  # ... existing logic that builds `message` ...
  message.stop_reason = @stop_reason if @stop_reason
  message
end
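To make the capture-and-attach flow concrete, here is a self-contained sketch using stand-in classes. DemoChunk, DemoMessage, and DemoAccumulator are invented for illustration and do not reflect RubyLLM's real class definitions; the sketch also assumes raw_sse_data is a parsed Hash with string keys, as in the snippet above.

```ruby
# Stand-ins for illustration only; these are NOT RubyLLM's real classes.
DemoChunk = Struct.new(:content, :raw_sse_data)
DemoMessage = Struct.new(:content, :stop_reason)

class DemoAccumulator
  def initialize
    @content = +""
    @stop_reason = nil
  end

  # Accumulate text and remember the stop_reason from a message_delta event.
  def add(chunk)
    @content << chunk.content.to_s
    data = chunk.raw_sse_data
    return unless data && data["type"] == "message_delta"

    reason = data.dig("delta", "stop_reason")
    @stop_reason = reason if reason
  end

  # Build the final message with the captured stop_reason attached.
  def to_message
    DemoMessage.new(@content, @stop_reason)
  end
end

accumulator = DemoAccumulator.new
accumulator.add(DemoChunk.new("Hello", { "type" => "content_block_delta" }))
accumulator.add(
  DemoChunk.new(nil, { "type" => "message_delta", "delta" => { "stop_reason" => "max_tokens" } })
)
message = accumulator.to_message
```

With these two chunks, message.stop_reason would be "max_tokens" while the accumulated content is unaffected by the message_delta chunk.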
Use Cases
- Truncation detection: max_tokens indicates the response was cut off - may need to continue
- Native tool handling: pause_turn indicates server-side tool execution - may need special handling
- Observability: Log non-standard stop reasons for debugging
- Flow control: Different behavior based on why the model stopped
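One way the use cases above could look in caller code is a simple dispatch on the stop reason. The next_action helper and the action symbols are hypothetical names chosen for this sketch; how an application actually continues, resumes, or logs is left open.

```ruby
# Hypothetical helper (not a RubyLLM API): maps a stop_reason string to a
# follow-up action the caller might take.
def next_action(stop_reason)
  case stop_reason
  when "end_turn"   then :done                # normal completion
  when "tool_use"   then :run_tool            # model requested a tool call
  when "max_tokens" then :continue_generation # response was truncated
  when "pause_turn" then :resume_turn         # server-side tool execution paused the turn
  else :log_unknown                           # nil or unrecognized: surface for debugging
  end
end

action = next_action("max_tokens")
```

The else branch covers both a missing stop_reason and values not listed in this issue, which fits the observability use case.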
Prerequisites
This requires access to raw SSE data on chunks. Issue #567 proposes exposing raw_sse_data for server tool filtering - the same infrastructure would enable this feature.
Current Workaround
We're using a monkey patch:
module RubyLLM
  class StreamAccumulator
    attr_accessor :stop_reason

    alias_method :add_before_stop_reason, :add

    def add(chunk)
      add_before_stop_reason(chunk)
      if chunk.respond_to?(:raw_sse_data) && chunk.raw_sse_data
        data = chunk.raw_sse_data
        if data["type"] == "message_delta" && data.dig("delta", "stop_reason")
          @stop_reason = data.dig("delta", "stop_reason")
        end
      end
    end

    alias_method :to_message_before_stop_reason, :to_message

    def to_message(response = nil)
      message = response ? to_message_before_stop_reason(response) : to_message_before_stop_reason
      if @stop_reason
        # Message has no stop_reason attribute yet, so attach one per instance.
        message.instance_variable_set(:@stop_reason, @stop_reason)
        message.define_singleton_method(:stop_reason) { @stop_reason }
      end
      message
    end
  end
end
Related
- Issue #567 (raw_sse_data exposure)