fix(ai): resolve race condition in parallel tool execution #11907

Closed
delta575 wants to merge 2 commits into vercel:main from delta575:fix/parallel-tool-execution-race-condition

@delta575

Background

When multiple tools execute in parallel during streamText, the stream could close prematurely or throw "The stream is not in a state that permits enqueue" errors.

Context: We're using @convex-dev/agent 0.3.2 which doesn't support AI SDK v6. We wanted to use Gemini 3 Flash, but thought_signature is required (not optional), so we had to downgrade to Gemini 2.5 where it's optional and AI SDK v5 worked fine. When testing AI SDK v6 compatibility via get-convex/agent#208 in preparation for Gemini 3 support, we encountered this race condition - Gemini 3 Flash aggressively uses parallel tool calls which exposed the bug.

Summary

Two issues caused this race condition:

  1. Non-unique tool tracking IDs: generateId() was returning the same value for multiple tools in a batch, causing the outstandingToolResults Set to only track one tool. When that tool completed, the stream closed while others were still running.

  2. Re-entry in attemptClose: Multiple finally() blocks could call attemptClose() simultaneously, causing race conditions when closing the stream.
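The first issue can be illustrated in isolation. The following is a minimal sketch, not the SDK's code: `constantId`, `uniqueId`, and `trackBatch` are hypothetical names, and the constant generator simply stands in for any generator that repeats values within a batch.

```typescript
// Hypothetical stand-in for an id generator that repeats values within a batch.
const constantId = (): string => "id-0";

// Hypothetical stand-in for a generator that returns a fresh value per call.
let counter = 0;
const uniqueId = (): string => `id-${counter++}`;

// Registers one tracking id per tool call, the way a Set of outstanding
// results would. Duplicate ids collapse into a single entry.
function trackBatch(toolCount: number, generateId: () => string): Set<string> {
  const outstanding = new Set<string>();
  for (let i = 0; i < toolCount; i++) {
    outstanding.add(generateId());
  }
  return outstanding;
}

const collapsed = trackBatch(3, constantId);
const distinct = trackBatch(3, uniqueId);

console.log(collapsed.size, distinct.size); // 1 3
```

With a repeating generator, three tool calls leave only one entry in the Set, so removing that entry on the first completion makes the Set look empty while two tools are still running.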

Fix:

  • Use toolCall.toolCallId instead of generateId() for unique tool tracking
  • Add closed flag to prevent re-entry in attemptClose()
  • Guard async enqueue calls to prevent enqueueing after stream closure
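As a rough sketch of the first two bullets (assuming a simplified shape, not the actual `streamText` internals), the tracking Set can be keyed by the tool call id and `attemptClose()` can short-circuit once a `closed` flag is set. `createCloser`, `Controller`, and the method names below are hypothetical.

```typescript
// Hypothetical minimal controller shape; the real controller has more methods.
type Controller = { close(): void };

function createCloser(controller: Controller) {
  const outstanding = new Set<string>(); // keyed by tool call id
  let canClose = false; // set once no more tool calls can arrive
  let closed = false; // prevents closing the controller twice

  function attemptClose(): void {
    if (closed) return; // re-entry guard
    if (canClose && outstanding.size === 0) {
      closed = true; // mark as closed before doing any work
      controller.close();
    }
  }

  return {
    start(toolCallId: string): void {
      outstanding.add(toolCallId);
    },
    finish(toolCallId: string): void {
      outstanding.delete(toolCallId);
      attemptClose();
    },
    inputDone(): void {
      canClose = true;
      attemptClose();
    },
    isClosed: (): boolean => closed,
  };
}

let closeCalls = 0;
const closer = createCloser({ close: () => { closeCalls++; } });

closer.start("call-1");
closer.start("call-2");
closer.inputDone(); // two calls still outstanding, stays open
closer.finish("call-1"); // one call still outstanding, stays open
closer.finish("call-2"); // last call done, closes the controller
closer.finish("call-2"); // late or repeated completion is a no-op

console.log(closeCalls); // 1
```

The flag is set before `controller.close()` runs so that any completion handler reaching `attemptClose()` afterwards returns immediately instead of closing again.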

Manual Verification

Tested with a Convex application using @convex-dev/agent from get-convex/agent#208 and Gemini 3 Flash with 5+ parallel tool calls - all tool results are now captured and streaming completes successfully.

Checklist

  • Tests have been added / updated
  • A patch changeset for relevant packages has been added
  • I have reviewed this pull request (self-review)

delta575 and others added 2 commits January 20, 2026 18:14
When multiple tools execute in parallel, the stream could close
prematurely or throw "stream is not in a state that permits enqueue"
errors. Two issues caused this:

1. generateId() returned the same value for multiple tools in a batch,
   causing outstandingToolResults Set to only track one tool

2. Multiple finally() blocks calling attemptClose() simultaneously
   caused race conditions

Fix:
- Use toolCall.toolCallId instead of generateId() for unique tracking
- Add closed flag to prevent re-entry in attemptClose()
- Guard async enqueue calls to prevent enqueueing after stream closure

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add tests to verify that:
- Using toolCallId for tracking handles parallel tools correctly
  (exposes bug where generateId returns same value for all tools)
- Multiple tools with different delays all complete successfully
- Stream doesn't close prematurely when fast tool completes before slow tool
- Many parallel tool calls (10+) don't lose results

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Contributor

@vercel vercel bot left a comment

Additional Suggestion:

Missing error handling in attemptClose() causes unhandled errors when closing toolResultsStreamController, with errors silently swallowed in finally block

// close the tool results controller if no more outstanding tool calls
if (canClose && outstandingToolResults.size === 0) {
// Mark as closed BEFORE doing any work to prevent race conditions
// where multiple finally() blocks call attemptClose() simultaneously
Contributor

Unguarded stream controller enqueue calls throw errors when stream is closed externally (via AbortSignal), causing uncaught exceptions and silently dropping results
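One way such a guard could look is sketched below. This is an illustration under assumptions, not the change proposed in this PR: `EnqueueController`, `createGuardedController`, and the fake controller are hypothetical, with the fake standing in for a stream controller that throws when a chunk is enqueued after close.

```typescript
// Hypothetical minimal controller shape used for this sketch.
interface EnqueueController<T> {
  enqueue(chunk: T): void;
  close(): void;
}

// Wraps a controller so that late enqueue calls report failure
// instead of throwing.
function createGuardedController<T>(controller: EnqueueController<T>) {
  let closed = false;
  return {
    // Returns true if the chunk was accepted, false if it was dropped.
    enqueue(chunk: T): boolean {
      if (closed) return false;
      try {
        controller.enqueue(chunk);
        return true;
      } catch {
        // The underlying controller was closed elsewhere (for example on abort).
        closed = true;
        return false;
      }
    },
    close(): void {
      if (closed) return;
      closed = true;
      controller.close();
    },
  };
}

// Fake controller that throws on enqueue after close.
const received: string[] = [];
let fakeClosed = false;
const fake: EnqueueController<string> = {
  enqueue(chunk) {
    if (fakeClosed) throw new TypeError("controller is closed");
    received.push(chunk);
  },
  close() {
    fakeClosed = true;
  },
};

const guarded = createGuardedController(fake);
const first = guarded.enqueue("result-1"); // accepted
guarded.close();
const late = guarded.enqueue("result-2"); // dropped without throwing

console.log(first, late, received.length); // true false 1
```

Returning a boolean lets the caller log or otherwise surface a dropped result rather than losing it silently.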

- const toolExecutionId = generateId(); // use our own id to guarantee uniqueness
+ // Use toolCallId which is unique per tool call from the LLM
+ // (generateId() was returning the same value for multiple tools in a batch)
+ const toolExecutionId = toolCall.toolCallId;
Collaborator

Has this fix been AI-generated?

Collaborator

generateId was used intentionally. If you end up with repeated IDs, that is most likely an issue with the ID generator that you pass in.

@lgrammel
Collaborator

This PR mixes 2 issues. The close state issue seems reasonable, but I have doubts regarding the id generation. Please separate the close fix into a new PR.
