fix: parallel tool calls execute sequentially despite PARALLEL mode (#735) #1127
Open
YuqiGuo105 wants to merge 1 commit into google:main
…sync

Fixes google#735

When ToolExecutionMode.PARALLEL is set and the LLM returns multiple function calls, tools were still executing sequentially because callTool() invoked tool.runAsync() as a plain method call, causing FunctionTool to execute func.invoke() synchronously on the subscribing thread before returning a Single. Since concatMapEager eagerly subscribes to all inner Observables, it could not dispatch work to IO threads if the subscription itself was blocking.

Fix: wrap tool.runAsync() in Single.defer() so the call is deferred until subscription time, then apply subscribeOn(Schedulers.io()) to move the actual subscription (and thus the synchronous func.invoke() call) onto an IO thread. concatMapEager then subscribes to all tool Singles eagerly, each on its own IO thread, achieving true parallelism while preserving result order.

Added two new timing-based tests that verify:
- PARALLEL mode: two 1000ms tools complete in <1500ms (not ~2000ms)
- SEQUENTIAL mode: two 1000ms tools take >=2000ms (serial contract unchanged)
Problem
Fixes #735
When an LLM returns multiple tool calls simultaneously (parallel function calling), ADK Java executes them sequentially despite ToolExecutionMode.PARALLEL being set. Two tools each taking 1 second complete in ~2 seconds instead of ~1 second.

Root Cause
The bug is in Functions.callTool(). For FunctionTool (the most common tool type), runAsync() calls func.invoke() via Java reflection, a synchronous, blocking call that completes before returning a Single.

The original callTool() invoked tool.runAsync(args, toolContext) as a plain method call.

handleFunctionCalls uses concatMapEager for PARALLEL mode, which eagerly subscribes to all inner Observables. However, if the subscription itself is synchronous and blocking, concatMapEager cannot dispatch to a new thread: it is stuck waiting for the first subscription to complete before starting the next.

Why subscribeOn(Schedulers.io()) alone (without Single.defer) does not fix it: subscribeOn only shifts the subscription signal to an IO thread, not the method call that produces the Single. By the time .subscribeOn() is reached in the chain, func.invoke() has already finished on the calling thread.

Contrast with ParallelAgent (ParallelAgent.java#L148), which correctly uses .subscribeOn(scheduler) because subAgent.runAsync() returns a lazy Flowable: the work has not started yet when it is returned.

Fix
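The difference between subscribeOn alone and Single.defer plus subscribeOn can be seen in a small standalone RxJava 3 sketch. It is not the ADK code: eagerTool is a hypothetical stand-in that, like the FunctionTool.runAsync() behavior described in this PR, does its work inside the method call and only then returns a Single. Here the "work" is simply recording which thread ran it.

```java
import io.reactivex.rxjava3.core.Single;
import io.reactivex.rxjava3.schedulers.Schedulers;

final class DeferDemo {

  // Hypothetical stand-in for a tool whose runAsync() does its work
  // synchronously, inside the method call, before returning a Single.
  static Single<String> eagerTool() {
    String worker = Thread.currentThread().getName(); // the "work" happens here
    return Single.just(worker);
  }

  // subscribeOn alone: eagerTool() has already run on the caller's thread
  // by the time subscribeOn is applied to the returned Single.
  static String threadWithSubscribeOnOnly() {
    return eagerTool().subscribeOn(Schedulers.io()).blockingGet();
  }

  // defer + subscribeOn: eagerTool() is invoked at subscription time,
  // and the subscription happens on an IO scheduler thread.
  static String threadWithDefer() {
    return Single.defer(() -> eagerTool()).subscribeOn(Schedulers.io()).blockingGet();
  }

  public static void main(String[] args) {
    System.out.println("caller:            " + Thread.currentThread().getName());
    System.out.println("subscribeOn only:  " + threadWithSubscribeOnOnly());
    System.out.println("defer+subscribeOn: " + threadWithDefer());
  }
}
```

With RxJava 3 on the classpath, the "subscribeOn only" line should report the caller's own thread, while the "defer+subscribeOn" line should report a thread owned by the IO scheduler.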
Wrap tool.runAsync() in Single.defer() to make it lazy, then apply subscribeOn(Schedulers.io()).

Single.defer() packages tool.runAsync() as a lambda that only executes at subscription time. subscribeOn(Schedulers.io()) then causes that subscription, including func.invoke(), to happen on an IO thread. concatMapEager eagerly dispatches all tool subscriptions, each to its own IO thread, achieving true parallelism while preserving result order.

Execution Flow: Before vs After
Scenario: LLM returns two parallel tool calls, get_weather(city=Beijing) and search_hotels(city=Shanghai), each taking ~1000ms.

Before (sequential despite PARALLEL mode)
After (true parallel)
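Both flows can be approximated in a standalone RxJava 3 sketch. This is an illustration under assumptions, not the ADK implementation: slowTool is a hypothetical blocking tool, and the pipeline only mimics the concatMapEager step that this PR attributes to handleFunctionCalls. Sleeps are shortened to 300 ms to keep the run quick.

```java
import io.reactivex.rxjava3.core.Observable;
import io.reactivex.rxjava3.core.Single;
import io.reactivex.rxjava3.schedulers.Schedulers;
import java.util.List;

final class ParallelToolsDemo {

  // Hypothetical blocking tool: sleeps inside the method call, then returns
  // a Single, mirroring the FunctionTool behavior described in this PR.
  static Single<String> slowTool(String name, long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return Single.just(name);
  }

  // Before: the blocking call runs inside the mapper on the calling thread,
  // so the second tool cannot start until the first has finished.
  static List<String> before(List<String> names, long millis) {
    return Observable.fromIterable(names)
        .concatMapEager(n -> slowTool(n, millis).toObservable())
        .toList()
        .blockingGet();
  }

  // After: each call is deferred to subscription time and subscribed on an
  // IO thread, so concatMapEager can start all tools before any completes.
  static List<String> after(List<String> names, long millis) {
    return Observable.fromIterable(names)
        .concatMapEager(n ->
            Single.defer(() -> slowTool(n, millis))
                .subscribeOn(Schedulers.io())
                .toObservable())
        .toList()
        .blockingGet();
  }

  // Wall-clock duration of a task, in milliseconds.
  static long timeMillis(Runnable task) {
    long start = System.nanoTime();
    task.run();
    return (System.nanoTime() - start) / 1_000_000;
  }

  public static void main(String[] args) {
    List<String> calls = List.of("get_weather", "search_hotels");
    long beforeMs = timeMillis(() -> before(calls, 300));
    long afterMs = timeMillis(() -> after(calls, 300));
    System.out.println("before: " + beforeMs + " ms, after: " + afterMs + " ms");
    System.out.println("result order: " + after(calls, 300));
  }
}
```

The "before" run should take roughly the sum of the two sleeps and the "after" run roughly one sleep, with results still emitted in the order of the original calls because concatMapEager buffers inner results and concatenates them in source order.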
Changes
Functions.java
- Added import io.reactivex.rxjava3.schedulers.Schedulers
- Wrapped tool.runAsync(args, toolContext) in Single.defer() and added .subscribeOn(Schedulers.io()) in callTool()

FunctionsTest.java
- Added SlowTool inner class, a BaseTool that sleeps for a configurable duration using Single.fromCallable
- handleFunctionCalls_parallelMode_shouldExecuteConcurrently: two 1000ms tools complete in <1500ms under PARALLEL mode
- handleFunctionCalls_sequentialMode_shouldExecuteSerially: two 1000ms tools take ≥2000ms under SEQUENTIAL mode (serial contract unchanged)

Testing