
Streaming tool_use with adaptive thinking: terminated stop_reason on long tool input content #1317

@OneSpiral


Environment

  • Model: claude-opus-4-6
  • API: Messages API with streaming enabled
  • Thinking: adaptive (also tested with effort: "high")

Description

When using streaming with adaptive thinking and tool_use, the API returns stop_reason: terminated once the tool input content parameter exceeds roughly 6-8KB of text. The model is generating a tool call with a large string parameter (e.g., file content), but the response stream terminates before the tool input block is complete.

Reproduction

  1. Register a simple tool with a content: string parameter
  2. Send a messages request with stream: true and thinking: { type: "adaptive" }
  3. Prompt the model to call the tool with ~180+ lines of English text in the content parameter
  4. The streamed response terminates mid-generation with stop_reason: terminated
  5. Retrying produces the same result consistently
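The steps above can be sketched as a request payload without hitting the network. The tool name (`save_file`) and prompt wording are illustrative placeholders; the model name, `thinking` config, and `max_tokens` value are taken verbatim from this report:

```python
import json

# Hypothetical tool definition matching the report: a single required
# string parameter named "content".
tool = {
    "name": "save_file",  # illustrative name, not from the report
    "description": "Store a block of text.",
    "input_schema": {
        "type": "object",
        "properties": {"content": {"type": "string"}},
        "required": ["content"],
    },
}

# ~180 lines of ordinary English text, the size range that reportedly fails.
large_text = "\n".join(
    f"Line {i}: some ordinary English text." for i in range(180)
)

# Request body as it would be sent to the Messages API (values from the
# report above; sending it with stream=True is where the truncation shows up).
request_body = {
    "model": "claude-opus-4-6",
    "max_tokens": 42666,
    "stream": True,
    "thinking": {"type": "adaptive"},
    "tools": [tool],
    "messages": [
        {
            "role": "user",
            "content": f"Call save_file with this text:\n{large_text}",
        }
    ],
}

print(len(large_text.encode()))  # payload size in the reported failing range
print(json.dumps(request_body)[:80])
```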

Observations

  • Tool input content under ~150 lines (~5KB) consistently succeeds
  • Tool input content over ~180 lines (~7KB) consistently fails
  • max_tokens (42666) is well above the required output size
  • The tool execute function never runs — truncation happens at the API streaming response level, not client-side
  • With adaptive thinking disabled, the same content size appears to succeed (not fully confirmed)
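The "truncation happens at the streaming level" observation can be checked client-side by accumulating the partial JSON for the tool_use block and testing whether it parses when the stream ends. This sketch models the stream events as plain dicts mirroring the Messages API streaming shapes (`content_block_delta` with `input_json_delta`, `message_delta` carrying `stop_reason`); the simulated event list is an assumption for illustration:

```python
import json

def collect_tool_input(events):
    """Accumulate partial JSON for a tool_use block and report whether
    the stream ended before the input JSON was complete."""
    buf = []
    stop_reason = None
    for ev in events:
        if (ev["type"] == "content_block_delta"
                and ev["delta"]["type"] == "input_json_delta"):
            buf.append(ev["delta"]["partial_json"])
        elif ev["type"] == "message_delta":
            stop_reason = ev["delta"].get("stop_reason")
    raw = "".join(buf)
    try:
        json.loads(raw)
        complete = True
    except json.JSONDecodeError:
        complete = False
    return stop_reason, complete, len(raw)

# Simulated truncated stream like the one described in this report:
events = [
    {"type": "content_block_delta",
     "delta": {"type": "input_json_delta",
               "partial_json": '{"content": "line 1\\nline 2'}},
    {"type": "message_delta", "delta": {"stop_reason": "terminated"}},
]
stop, ok, n = collect_tool_input(events)
print(stop, ok, n)  # the partial JSON never parses, so ok is False
```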

Expected behavior

The API should complete the tool_use input block regardless of content size (within max_tokens), or return a meaningful error.

Workaround

Split large tool input content across multiple smaller tool calls.
