Summary

Anthropic's API includes server-executed tools (web_search, web_fetch) that run on Anthropic's infrastructure. These are distinct from regular tools: they require beta headers and a special payload format, and they introduce a streaming edge case that corrupts regular tool call arguments.
This is core LLM communication, not an application-level feature. The recent extended thinking implementation (#551 → PR #552, merged in v1.10.0) establishes the pattern for provider-specific API features. Server tools follow the same pattern.
Background: Issue #205 and Remaining Gaps

Issue #205 ("Claude server tools support") was closed via commit 10eaba3, which improved with_params to prioritize user params. However, practical usage reveals three remaining gaps:
- Array replacement: Utils.deep_merge replaces arrays instead of concatenating. Using with_params({ tools: [native_tools] }) replaces all existing tools instead of adding to them. (Related: #317, "[BUG] OpenAI tool_choice being overridden when using with_params", showed similar override issues with tool_choice.)
- Beta headers: Server tools require anthropic-beta headers that aren't automatically added.
- Streaming corruption: StreamAccumulator appends server_tool_use JSON deltas to regular tool calls, causing JSON::ParserError. (Different from #228, "[BUG] Tool call streaming accumulates all argument deltas before releasing them all at once": that issue is about buffering timing; this one is about deltas assigned to the wrong tool IDs.)
Problem 1: Streaming Tool Call Corruption

When using server tools with streaming, content_block_delta events with input_json_delta data are processed by StreamAccumulator.accumulate_tool_calls. The issue:
```ruby
# lib/ruby_llm/stream_accumulator.rb:66-83
def accumulate_tool_calls(new_tool_calls)
  new_tool_calls.each_value do |tool_call|
    if tool_call.id
      # ... stores tool with ID
      @latest_tool_call_id = tool_call.id
    else
      # Server tool deltas have nil IDs - they get appended here!
      existing = @tool_calls[@latest_tool_call_id]
      existing.arguments << tool_call.arguments if existing
    end
  end
end
```
Root cause: Server tool (server_tool_use) streaming deltas have nil tool IDs because they're executed server-side. RubyLLM appends their JSON fragments to @latest_tool_call_id, corrupting regular tool arguments.
Example error:

```json
{
  "error_class": "JSON::ParserError",
  "error_message": "unexpected token at end of stream '{\"query\":' at line 1 column 47"
}
```
The index:1 delta (server tool) gets wrongly appended to toolu_01ABC (regular tool).
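The failure is easy to reproduce with a stripped-down accumulator. The class below is a simplified model of the logic shown above for illustration, not RubyLLM's actual code:

```ruby
require 'json'

# Simplified model of accumulate_tool_calls (illustration only)
ToolCall = Struct.new(:id, :arguments)

class MiniAccumulator
  def initialize
    @tool_calls = {}
    @latest_tool_call_id = nil
  end

  def accumulate(tool_call)
    if tool_call.id
      @tool_calls[tool_call.id] = ToolCall.new(tool_call.id, +tool_call.arguments)
      @latest_tool_call_id = tool_call.id
    else
      # nil-ID deltas (including server_tool_use fragments) attach to the last tool seen
      existing = @tool_calls[@latest_tool_call_id]
      existing.arguments << tool_call.arguments if existing
    end
  end

  def arguments_for(id)
    @tool_calls[id].arguments
  end
end

acc = MiniAccumulator.new
acc.accumulate(ToolCall.new('toolu_01ABC', '{"city":')) # regular tool starts
acc.accumulate(ToolCall.new(nil, '"Paris"}'))           # its delta - correctly appended
acc.accumulate(ToolCall.new(nil, '{"query":'))          # server tool delta - wrongly appended!

acc.arguments_for('toolu_01ABC') # => '{"city":"Paris"}{"query":'
begin
  JSON.parse(acc.arguments_for('toolu_01ABC'))
rescue JSON::ParserError => e
  puts e.class # the corrupted arguments no longer parse
end
```

The third fragment belongs to a different (server-side) tool, but with no ID to route it by, it lands on the last client tool and breaks its JSON.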
Problem 2: No Native Way to Add Server Tools
The with_params approach has a fundamental limitation:
```ruby
# This REPLACES existing tools instead of adding to them
chat.with_params(
  tools: [{
    type: "web_search_20250305",
    name: "web_search",
    max_uses: 10
  }]
)
```
This happens because Utils.deep_merge uses replacement semantics for arrays:
```ruby
# lib/ruby_llm/utils.rb:35-43
def deep_merge(original, overrides)
  original.merge(overrides) do |_key, original_value, overrides_value|
    if original_value.is_a?(Hash) && overrides_value.is_a?(Hash)
      deep_merge(original_value, overrides_value)
    else
      overrides_value # Arrays REPLACE, not concatenate
    end
  end
end
```
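The effect is easy to see with a self-contained restatement of that merge logic:

```ruby
# Self-contained restatement of the deep_merge logic above, for illustration
def deep_merge(original, overrides)
  original.merge(overrides) do |_key, original_value, overrides_value|
    if original_value.is_a?(Hash) && overrides_value.is_a?(Hash)
      deep_merge(original_value, overrides_value)
    else
      overrides_value # arrays fall through here and get replaced wholesale
    end
  end
end

payload = { model: 'claude-sonnet-4-20250514', tools: [{ name: 'my_custom_tool' }] }
params  = { tools: [{ type: 'web_search_20250305', name: 'web_search' }] }

deep_merge(payload, params)
# => { model: 'claude-sonnet-4-20250514',
#      tools: [{ type: 'web_search_20250305', name: 'web_search' }] }
# my_custom_tool is gone - the override array replaced the original
```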
Proposed Solutions
Solution 1: Filter Server Tool Deltas in StreamAccumulator (Bug Fix)
Track server_tool_use content block indices and skip their input_json_delta events. This follows the same pattern as the extended thinking implementation (#552) which modified StreamAccumulator.add to handle thinking_delta events:
```ruby
class StreamAccumulator
  def add(chunk)
    # If chunk has raw SSE data (from streaming)
    if chunk.respond_to?(:raw_sse_data) && chunk.raw_sse_data
      data = chunk.raw_sse_data
      case data["type"]
      when "content_block_start"
        if data.dig("content_block", "type") == "server_tool_use"
          @server_tool_indices ||= Set.new
          @server_tool_indices.add(data["index"])
        end
      when "content_block_delta"
        # Skip server tool JSON deltas - they'd corrupt regular tools
        if @server_tool_indices&.include?(data["index"]) &&
           data.dig("delta", "type") == "input_json_delta"
          return
        end
      when "message_stop"
        @server_tool_indices = nil
      end
    end
    # Continue with normal processing
    # ... existing add logic
  end
end
```
Why this is safe: Server tools are executed by Anthropic. Their results are injected back into Claude's context as web_search_tool_result blocks. The client doesn't need to track their arguments—Claude's response already reflects seeing the results.
Prerequisite: This requires raw_sse_data on chunks. The extended thinking implementation (#552) already added this for Anthropic streaming to capture thinking_delta events. If that's not yet exposed, the same pattern applies.
Solution 2: Native Server Tools Method (Feature)
Following the pattern from #343 discussion where the maintainer preferred integrating options into existing methods, server tools could be added via a dedicated method:
```ruby
chat = RubyLLM.chat(model: "claude-sonnet-4-20250514")
  .with_server_tools(:web_search, :web_fetch)
  .with_tool(MyCustomTool)
  .ask("What's the latest news about Ruby?")
```
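Internally, such a method might map each symbol to Anthropic's tool payload and matching beta header before merging into the request. The sketch below is hypothetical, not RubyLLM's actual API; the SERVER_TOOLS table and the web_fetch type string are assumptions:

```ruby
# Hypothetical sketch: resolve server tool symbols to Anthropic payloads and
# beta headers, then merge additively into the outgoing request.
SERVER_TOOLS = {
  web_search: { definition: { type: 'web_search_20250305', name: 'web_search' },
                beta: 'web-search-2025-03-05' },
  web_fetch:  { definition: { type: 'web_fetch_20250910', name: 'web_fetch' }, # type string is a guess
                beta: 'web-fetch-2025-09-10' }
}.freeze

def with_server_tools(params, headers, *names)
  entries = names.map { |name| SERVER_TOOLS.fetch(name) }
  # Concatenate into any existing tools array instead of replacing it
  params = params.merge(tools: Array(params[:tools]) + entries.map { |e| e[:definition] })
  betas = headers['anthropic-beta'].to_s.split(',') + entries.map { |e| e[:beta] }
  [params, headers.merge('anthropic-beta' => betas.reject(&:empty?).uniq.join(','))]
end

params, headers = with_server_tools({ tools: [{ name: 'my_custom_tool' }] }, {}, :web_search)
# params[:tools] now holds both the custom tool and web_search;
# headers['anthropic-beta'] == "web-search-2025-03-05"
```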
This would:

- Add the required beta headers (anthropic-beta: web-search-2025-03-05, web-fetch-2025-09-10)
- Merge the server tool definitions into the tools array (concatenating, not replacing)
- Skip server_tool_use events in the stream accumulator

Alternative: Integrate into with_tools, similar to #343's proposed pattern.

Solution 3 (Alternative): Add add_params Method

For more general use cases beyond server tools, add an additive variant of with_params.
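A minimal sketch of the additive semantics add_params could use (hypothetical implementation; RubyLLM does not currently ship this):

```ruby
# Hypothetical additive merge: arrays concatenate instead of being replaced
# (not RubyLLM's actual implementation).
def additive_merge(original, overrides)
  original.merge(overrides) do |_key, o, n|
    if o.is_a?(Hash) && n.is_a?(Hash)
      additive_merge(o, n)
    elsif o.is_a?(Array) && n.is_a?(Array)
      o + n # concatenate instead of replace
    else
      n
    end
  end
end

payload = { tools: [{ name: 'my_custom_tool' }], max_tokens: 1024 }
extra   = { tools: [{ type: 'web_search_20250305', name: 'web_search' }] }
additive_merge(payload, extra)
# => tools now contains both entries; max_tokens is preserved
```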
This addresses the general array-replacement limitation that affects multiple use cases.
Why This Belongs in RubyLLM (Not Application Code)
Per the CONTRIBUTING guidelines, RubyLLM focuses on "core LLM communication." Server tools are exactly that: provider-specific payloads, beta headers, and streaming events handled by StreamAccumulator, not application code. The streaming fix in particular cannot be solved in application code without monkey-patching StreamAccumulator.

Implementation Considerations
Related Issues
- #317: tool_choice override - similar deep_merge limitation
- #551/#552: extended thinking - precedent for provider-specific streaming features

Cross-Provider Precedent
The community-built ruby_llm-responses_api gem demonstrates a similar need for OpenAI's native tools (web_search_preview, code_interpreter, file_search). It uses with_params(tools: [...]), which works but has the same array replacement limitation.

This suggests native/server-executed tools are a cross-provider pattern that would benefit from first-class RubyLLM support, potentially with a unified API abstracting provider differences.
References
Current Workaround
We're currently using monkey patches in production:
This works but requires maintaining patches across RubyLLM updates.