[FEATURE] Expose stop_reason from streaming responses #568

@trevorturk

Summary

When streaming responses from Anthropic, the stop_reason field is sent in message_delta SSE events, but RubyLLM neither captures nor exposes it. This field is important for understanding why a response ended.

Background

Anthropic's streaming API sends stop_reason in the final message_delta event:

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":42}}
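To make the wire format concrete, here is a minimal sketch of pulling stop_reason out of one SSE frame with plain Ruby. The helper name extract_stop_reason is hypothetical and is not part of RubyLLM; it assumes the frame has a single data: line containing JSON, as in the event above.

```ruby
require "json"

# Hypothetical helper (not part of RubyLLM): returns the stop_reason carried
# by a single SSE frame, or nil when the frame has none.
def extract_stop_reason(frame)
  # Assumes one "data:" line per frame, as in the message_delta event above.
  data_line = frame.each_line.find { |line| line.start_with?("data:") }
  return nil unless data_line

  payload = JSON.parse(data_line.delete_prefix("data:").strip)
  # Only message_delta events carry stop_reason.
  return nil unless payload["type"] == "message_delta"

  payload.dig("delta", "stop_reason")
end

frame = <<~SSE
  event: message_delta
  data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":42}}
SSE

puts extract_stop_reason(frame) # prints "end_turn"
```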

Possible values include:

  • end_turn - Normal completion
  • tool_use - Model wants to call a tool
  • max_tokens - Response truncated due to token limit
  • stop_sequence - A custom stop sequence was generated
  • pause_turn - Server paused for native tool execution (web_search, etc.)

Current Behavior

StreamAccumulator doesn't capture stop_reason. The final Message has no way to indicate why the response ended.

Proposed Solution

  1. Add stop_reason attribute to Message class
  2. Capture from message_delta events in StreamAccumulator
  3. Attach to final Message in to_message

# In StreamAccumulator
def add(chunk)
  # ... existing logic ...

  if chunk.respond_to?(:raw_sse_data) && chunk.raw_sse_data
    data = chunk.raw_sse_data
    # Only message_delta events carry stop_reason.
    stop_reason = data.dig("delta", "stop_reason") if data["type"] == "message_delta"
    @stop_reason = stop_reason if stop_reason
  end
end

def to_message(response = nil)
  # ... existing logic that builds `message` ...
  message.stop_reason = @stop_reason if @stop_reason
  message
end
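For anyone who wants to try the idea in isolation, the three steps can be sketched as a self-contained toy. SketchChunk, SketchMessage, and SketchAccumulator are stand-ins invented for this example; they are not RubyLLM's real classes, and raw_sse_data is assumed to be an already-parsed Hash.

```ruby
# Stand-ins for illustration only; these are NOT RubyLLM's real classes.
SketchChunk = Struct.new(:content, :raw_sse_data)
SketchMessage = Struct.new(:content, :stop_reason)

class SketchAccumulator
  def initialize
    @content = +""
    @stop_reason = nil
  end

  # Step 2: capture stop_reason from message_delta events while accumulating.
  def add(chunk)
    @content << chunk.content.to_s
    data = chunk.raw_sse_data
    return unless data.is_a?(Hash) && data["type"] == "message_delta"

    reason = data.dig("delta", "stop_reason")
    @stop_reason = reason if reason
  end

  # Step 3: attach the captured value to the final message.
  def to_message
    SketchMessage.new(@content, @stop_reason)
  end
end

acc = SketchAccumulator.new
acc.add(SketchChunk.new("Hello", { "type" => "content_block_delta" }))
acc.add(SketchChunk.new(nil, { "type" => "message_delta", "delta" => { "stop_reason" => "max_tokens" } }))

message = acc.to_message
puts message.content     # prints "Hello"
puts message.stop_reason # prints "max_tokens"
```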

Use Cases

  1. Truncation detection: max_tokens indicates the response was cut off, so the caller may need to request a continuation
  2. Native tool handling: pause_turn indicates server-side tool execution, which may need special handling
  3. Observability: Log non-standard stop reasons for debugging
  4. Flow control: Branch on why the model stopped
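As a rough illustration of these use cases, caller code could branch on the exposed value along the following lines. The next_action helper and the action symbols it returns are hypothetical, and the sketch assumes stop_reason is available as a plain string.

```ruby
# Hypothetical helper: maps a stop_reason string to a follow-up action.
# The symbols are placeholders for whatever the application actually does.
def next_action(stop_reason)
  case stop_reason
  when "end_turn"   then :done                # normal completion
  when "tool_use"   then :run_tools           # model wants to call a tool
  when "max_tokens" then :continue_generation # response was truncated
  when "pause_turn" then :resume_turn         # server-side tool execution paused
  else :log_and_stop                          # unknown or missing reason
  end
end

puts next_action("max_tokens") # prints "continue_generation"
```

Falling through to a logging branch for unrecognized values keeps the caller safe if new stop reasons appear.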

Prerequisites

This requires access to raw SSE data on chunks. Issue #567 proposes exposing raw_sse_data for server tool filtering - the same infrastructure would enable this feature.

Current Workaround

We're using a monkey patch:

module RubyLLM
  class StreamAccumulator
    attr_accessor :stop_reason

    alias_method :add_before_stop_reason, :add
    def add(chunk)
      # Preserve the original return value so callers of add are unaffected.
      result = add_before_stop_reason(chunk)
      if chunk.respond_to?(:raw_sse_data) && chunk.raw_sse_data
        data = chunk.raw_sse_data
        if data["type"] == "message_delta" && data.dig("delta", "stop_reason")
          @stop_reason = data.dig("delta", "stop_reason")
        end
      end
      result
    end

    alias_method :to_message_before_stop_reason, :to_message
    def to_message(response = nil)
      message = response ? to_message_before_stop_reason(response) : to_message_before_stop_reason
      if @stop_reason
        message.instance_variable_set(:@stop_reason, @stop_reason)
        message.define_singleton_method(:stop_reason) { @stop_reason }
      end
      message
    end
  end
end
