Environment
- Model: claude-opus-4-6
- API: Messages API with streaming enabled
- Thinking: adaptive (also tested with effort: "high")
Description
When using streaming with adaptive thinking and tool_use, the API returns `stop_reason: terminated` when the tool input `content` parameter exceeds approximately 6-8 KB of text. The model generates a tool call with a large string parameter (e.g. file content), but the response stream terminates before the tool input block is complete.
Reproduction
- Register a simple tool with a `content: string` parameter
- Send a messages request with `stream: true` and `thinking: { type: "adaptive" }`
- Prompt the model to call the tool with ~180+ lines of English text in the `content` parameter
- The streamed response terminates mid-generation with `stop_reason: terminated`
- Retrying consistently produces the same result
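A minimal sketch of the request that reproduces this. The tool name, schema, and prompt are illustrative only; the model name, `thinking` parameter, and `max_tokens` value are taken from the report above.

```python
# Repro sketch: builds the Messages API request body that triggers the
# truncation. Tool name/schema and prompt text are illustrative only.
save_file_tool = {
    "name": "save_file",
    "description": "Save text content to a file",
    "input_schema": {
        "type": "object",
        "properties": {"content": {"type": "string"}},
        "required": ["content"],
    },
}

# ~180 lines of English text -- above the observed failure threshold.
large_text = "\n".join(f"Line {i}: some English text." for i in range(180))

request_body = {
    "model": "claude-opus-4-6",
    "max_tokens": 42666,
    "stream": True,
    "thinking": {"type": "adaptive"},
    "tools": [save_file_tool],
    "messages": [
        {
            "role": "user",
            "content": "Call save_file with the following text as the "
                       "content parameter, verbatim:\n\n" + large_text,
        }
    ],
}
```

Sending this body (e.g. via an SDK's streaming call) ends mid-generation with `stop_reason: terminated` before the `tool_use` input block closes.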
Observations
- Tool input content under ~150 lines (~5KB) consistently succeeds
- Tool input content over ~180 lines (~7KB) consistently fails
- `max_tokens` is well above the needed output size (42666)
- The tool's `execute` function never runs; truncation happens at the API streaming response level, not client-side
- Without adaptive thinking, the same content size succeeds (not fully confirmed)
Expected behavior
The API should complete the `tool_use` input block regardless of content size (within `max_tokens`), or return a meaningful error.
Workaround
Split large tool input content across multiple smaller tool calls.
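A sketch of the workaround. The splitting helper is hypothetical (not part of any SDK), and the 5 KB default reflects the passing threshold observed above.

```python
def split_for_tool_calls(text: str, max_bytes: int = 5000) -> list[str]:
    """Split text on line boundaries so each chunk stays under max_bytes,
    letting the model emit several small tool calls instead of one large one.

    A single line longer than max_bytes is kept whole rather than split.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        line_bytes = len(line.encode("utf-8"))
        if current and size + line_bytes > max_bytes:
            # Flush the current chunk before it would exceed the limit.
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_bytes
    if current:
        chunks.append("".join(current))
    return chunks
```

Each chunk can then be passed as the `content` parameter of a separate tool call, keeping every tool input below the observed failure threshold.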