Add tool approval support with approveToolCall/denyToolCall#222
Add tool approval support with approveToolCall/denyToolCall#222sethconvex merged 6 commits intomainfrom
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This stack of pull requests is managed by Graphite. Learn more about stacking. |
commit: |
5dfefff to
231f81f
Compare
c4f452b to
e27ca09
Compare
e27ca09 to
6915b41
Compare
5283806 to
f5a0659
Compare
|
Possible bug in the addMessages mutation?
I patched this locally and it resolves the issue. This only triggers when the stream finish is batched with the message save (the atomic finish path from #181), so it doesn't surface until you hit the approval → continuation flow. |
79fab8a to
c70a594
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/client/approval.test.ts`:
- Around line 34-55: The function getApprovalIdFromSavedMessages currently types
its parameter too narrowly; change its parameter to accept the actual
MessageDoc[] shape (e.g., savedMessages: Array<MessageDoc> | undefined) and
update the accessors to safely handle optional message and content fields (use
savedMessage.message?.content and guard that content is an array before
flatMap). Keep the same logic for finding a part with type ===
"tool-approval-request" and the same runtime check that
approvalRequest.approvalId is a string before returning it, throwing the
existing error otherwise.
🧹 Nitpick comments (2)
src/client/approval.test.ts (2)
105-147: Consider moving test actions to a separate non-test file.The static analysis tool (Biome) flags exports from test files as suspicious. The
testApproveFlowandtestDenyFlowactions are exported for use withinitConvexTest, which is a valid pattern, but it may trigger lint warnings.If this pattern is intentional and standard in your codebase, consider adding a Biome exception for this file. Otherwise, move the action definitions to a separate module (e.g.,
approval.test.helpers.ts) and import them in the test.
29-32: Module-level mutable state may cause test interference.The
usageCallsarray is shared across all tests and manually cleared at the start of each test (usageCalls.length = 0). This works but is fragile if tests run in parallel or if a test forgets to clear the array.Consider using a test-local variable or Vitest's
beforeEachhook for more robust isolation.♻️ Alternative: Use beforeEach for cleanup
+import { describe, expect, test, beforeEach } from "vitest"; -import { describe, expect, test } from "vitest"; // Track usage handler calls to verify the full flow is exercised const usageCalls: LanguageModelUsage[] = []; +beforeEach(() => { + usageCalls.length = 0; +}); + // ... in tests, remove manual clearing: - test("approve: ...", async () => { - usageCalls.length = 0; + test("approve: ...", async () => {
c70a594 to
ef701ff
Compare
|
Awesome. All credit to Opus on that one.😇 |
|
@sethconvex On #222 w/ approveToolCall() / denyToolCall() and the finishStreamId atomic stream finish fix 💥 CleanShot.2026-02-11.at.10.22.06-converted.mp4Those animated tool call badges you see in the start (Checking memories...Deleting collection) – those don't come back after the first approval for "Creating links..." etc. Probably an issue on my side, but ....progress! Thanks again for your work on this. |
ianmacartney
left a comment
There was a problem hiding this comment.
Really nice! If this is working for tool calls, works for me. only nit is to actually test that the message roles get saved / merged as expected
|
Addressed review feedback: replaced imprecise |
022a5b8 to
52b4b56
Compare
f5a0659 to
251d0c7
Compare
52b4b56 to
ffb13f8
Compare
bdbdde1 to
269bb8b
Compare
ffb13f8 to
3953862
Compare
269bb8b to
93fec7b
Compare
3953862 to
4e8cd39
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/client/approval.test.ts (1)
28-32: Shared mutable state could cause flaky tests under parallel execution.
usageCallsis module-level shared state. While tests reset it withusageCalls.length = 0, if vitest runs tests in parallel within this file, usage data from one test could leak into another's assertions.Consider isolating state per test or ensuring sequential execution:
♻️ Option 1: Return usage data from the action instead of module-level state
-// Track usage handler calls to verify the full flow is exercised -const usageCalls: LanguageModelUsage[] = []; -const testUsageHandler: UsageHandler = async (_ctx, args) => { - usageCalls.push(args.usage); -}; +// Create a fresh usage tracker per agent to avoid shared state +function createUsageTracker() { + const calls: LanguageModelUsage[] = []; + const handler: UsageHandler = async (_ctx, args) => { + calls.push(args.usage); + }; + return { calls, handler }; +}Then create trackers per agent and return usage data from the action.
♻️ Option 2: Use vitest's sequential test mode for this file
Add at the top of the describe block:
describe.sequential("Tool Approval Workflow", () => {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/client/approval.test.ts` around lines 28 - 32, The test uses a module-level mutable array usageCalls and a shared testUsageHandler which can leak between parallel tests; to fix, stop using module-shared state by creating per-test trackers (e.g., declare const usageCalls: LanguageModelUsage[] and const testUsageHandler: UsageHandler inside each it/test or beforeEach so each test gets a fresh array), or change the action to return usage data directly from the function under test and assert on that returned value instead of reading usageCalls, or alternatively enforce sequential execution for the suite by replacing describe with describe.sequential; update references to usageCalls and testUsageHandler accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/test-gemini-approval.ts`:
- Around line 113-115: Replace the catch parameter type from any to unknown and
narrow it before using .message: change the catch clause in
scripts/test-gemini-approval.ts to catch (e: unknown) and inside the block check
if e is an instance of Error (e.g., if (e instanceof Error) {
console.error("FAILED -", e.message.slice(0,500)) } else { console.error("FAILED
-", String(e).slice(0,500)) }) so you safely handle thrown non-Error values and
satisfy TypeScript strict mode.
In `@src/client/streamText.ts`:
- Around line 163-172: When catching errors from awaiting stream and calling
result.consumeStream() inside the try/catch, call the pending RPC finalizer
before notifying the streamer: add an awaited call to call.fail() (or await
call.fail(e instanceof Error ? e.message : String(e))) immediately before
awaiting streamer?.fail(...) so the pending message is finalized the same way as
the onError path; keep the existing rethrow of the original error after both
awaits.
---
Nitpick comments:
In `@src/client/approval.test.ts`:
- Around line 28-32: The test uses a module-level mutable array usageCalls and a
shared testUsageHandler which can leak between parallel tests; to fix, stop
using module-shared state by creating per-test trackers (e.g., declare const
usageCalls: LanguageModelUsage[] and const testUsageHandler: UsageHandler inside
each it/test or beforeEach so each test gets a fresh array), or change the
action to return usage data directly from the function under test and assert on
that returned value instead of reading usageCalls, or alternatively enforce
sequential execution for the suite by replacing describe with
describe.sequential; update references to usageCalls and testUsageHandler
accordingly.
scripts/test-gemini-approval.ts
Outdated
| } catch (e: any) { | ||
| console.error("FAILED -", e.message?.slice(0, 500) ?? e); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the current state of the file at lines 113-115
if [ -f "scripts/test-gemini-approval.ts" ]; then
echo "=== File content at lines 113-115 ==="
sed -n '110,120p' "scripts/test-gemini-approval.ts"
else
echo "File not found"
fiRepository: get-convex/agent
Length of output: 392
🏁 Script executed:
#!/bin/bash
# Search for all catch (e: any) patterns in TypeScript files
echo "=== Search for catch (e: any) patterns ==="
rg "catch\s*\(\s*e\s*:\s*any\s*\)" --type tsRepository: get-convex/agent
Length of output: 155
🏁 Script executed:
#!/bin/bash
# Check TypeScript and ESLint configuration
echo "=== TypeScript config ==="
if [ -f "tsconfig.json" ]; then
cat tsconfig.json | head -50
fi
echo ""
echo "=== ESLint config ==="
if [ -f ".eslintrc.json" ] || [ -f ".eslintrc.js" ] || [ -f ".eslintrc.cjs" ]; then
ls -la .eslintrc* 2>/dev/null && cat .eslintrc* | head -50
else
# Check if eslint config is in package.json
if [ -f "package.json" ]; then
grep -A 20 '"eslintConfig"' package.json || echo "No eslintConfig in package.json"
fi
fiRepository: get-convex/agent
Length of output: 868
Replace any in catch with unknown and narrow.
The code at lines 113-115 uses catch (e: any), which violates TypeScript's strict mode. Use unknown and narrow to Error for safer handling that accounts for JavaScript's ability to throw non-Error values.
🔧 Suggested fix
- } catch (e: any) {
- console.error("FAILED -", e.message?.slice(0, 500) ?? e);
+ } catch (e: unknown) {
+ const message =
+ e instanceof Error ? e.message?.slice(0, 500) : String(e);
+ console.error("FAILED -", message);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } catch (e: any) { | |
| console.error("FAILED -", e.message?.slice(0, 500) ?? e); | |
| } | |
| } catch (e: unknown) { | |
| const message = | |
| e instanceof Error ? e.message?.slice(0, 500) : String(e); | |
| console.error("FAILED -", message); | |
| } |
🧰 Tools
🪛 GitHub Check: Test and lint
[failure] 113-113:
Unexpected any. Specify a different type
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/test-gemini-approval.ts` around lines 113 - 115, Replace the catch
parameter type from any to unknown and narrow it before using .message: change
the catch clause in scripts/test-gemini-approval.ts to catch (e: unknown) and
inside the block check if e is an instance of Error (e.g., if (e instanceof
Error) { console.error("FAILED -", e.message.slice(0,500)) } else {
console.error("FAILED -", String(e).slice(0,500)) }) so you safely handle thrown
non-Error values and satisfy TypeScript strict mode.
| try { | ||
| await stream; | ||
| await result.consumeStream(); | ||
| } catch (e) { | ||
| // If the stream errored (e.g. onStepFinish threw), the DeltaStreamer's | ||
| // finish() was never called, leaving the streaming message stuck in | ||
| // "streaming" state. Clean it up by marking it as aborted. | ||
| await streamer?.fail(e instanceof Error ? e.message : String(e)); | ||
| throw e; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, verify the file exists and examine the context around lines 163-172
if [ -f "src/client/streamText.ts" ]; then
echo "=== File exists, examining context around lines 163-172 ==="
sed -n '140,190p' src/client/streamText.ts | cat -n
else
echo "File not found"
fiRepository: get-convex/agent
Length of output: 2160
🏁 Script executed:
#!/bin/bash
# Search for the streamText function definition to understand the full context
rg -n "export.*streamText|async.*streamText" src/client/streamText.ts -A 5Repository: get-convex/agent
Length of output: 234
🏁 Script executed:
#!/bin/bash
# Look for where 'call' is defined in this file
rg -n "call\s*=" src/client/streamText.ts | head -20Repository: get-convex/agent
Length of output: 42
🏁 Script executed:
#!/bin/bash
# Find all call sites of streamText
rg -n "streamText\(" src -g'*.ts' -B 2 -A 5Repository: get-convex/agent
Length of output: 2237
🏁 Script executed:
#!/bin/bash
# Get the full streamText function signature and beginning
sed -n '34,80p' src/client/streamText.ts | cat -nRepository: get-convex/agent
Length of output: 2102
🏁 Script executed:
#!/bin/bash
# Search for call.fail usage to understand what it does
rg -n "call\.fail" src -g'*.ts' -B 3 -A 3Repository: get-convex/agent
Length of output: 1303
🏁 Script executed:
#!/bin/bash
# Look for where 'call' parameter is defined and what type it is
rg -n "call:" src/client/streamText.ts -A 2 -B 2Repository: get-convex/agent
Length of output: 42
🏁 Script executed:
#!/bin/bash
# Search for pending message handling and call.save usage
rg -n "call\.save|pending" src/client/streamText.ts -B 2 -A 2Repository: get-convex/agent
Length of output: 878
🏁 Script executed:
#!/bin/bash
# Check error handling at streamText call sites
rg -n "streamText\(" src -g'*.ts' -A 10 | grep -E "(streamText|catch|try|\.then|error)" | head -40Repository: get-convex/agent
Length of output: 350
🏁 Script executed:
#!/bin/bash
# Find startGeneration to understand what call object contains
rg -n "export.*startGeneration|function startGeneration" src -g'*.ts' -A 20Repository: get-convex/agent
Length of output: 2346
🏁 Script executed:
#!/bin/bash
# Look for the full context of onError handler to see the pattern
sed -n '115,130p' src/client/streamText.ts | cat -nRepository: get-convex/agent
Length of output: 793
🏁 Script executed:
#!/bin/bash
# Search for type definitions of the call object
rg -n "call\." src/client/streamText.ts | head -20Repository: get-convex/agent
Length of output: 376
🏁 Script executed:
#!/bin/bash
# Get the full startGeneration function to see what it returns
sed -n '30,120p' src/client/start.ts | cat -nRepository: get-convex/agent
Length of output: 3665
🏁 Script executed:
#!/bin/bash
# Look for the return type or return statement of startGeneration
rg -n "return.*{" src/client/start.ts | head -5Repository: get-convex/agent
Length of output: 141
🏁 Script executed:
#!/bin/bash
# Check if there's documentation about what fail() does
rg -n "\.fail.*=" src/client/start.ts -B 5 -A 5Repository: get-convex/agent
Length of output: 42
🏁 Script executed:
#!/bin/bash
# Look for the full startGeneration implementation - it may be long
wc -l src/client/start.tsRepository: get-convex/agent
Length of output: 83
🏁 Script executed:
#!/bin/bash
# Find the fail function implementation in startGeneration
sed -n '90,200p' src/client/start.ts | cat -nRepository: get-convex/agent
Length of output: 4054
🏁 Script executed:
#!/bin/bash
# Search for where fail is defined/assigned
rg -n "fail\s*=" src/client/start.ts -B 2 -A 10Repository: get-convex/agent
Length of output: 506
🏁 Script executed:
#!/bin/bash
# Look at the full error handling flow in streamText - show more context around the try/catch
sed -n '155,185p' src/client/streamText.ts | cat -nRepository: get-convex/agent
Length of output: 1353
🏁 Script executed:
#!/bin/bash
# Look at how other functions handle errors with call.fail
rg -n "call\.fail" src -g'*.ts' -B 5 -A 2 | head -50Repository: get-convex/agent
Length of output: 1558
🏁 Script executed:
#!/bin/bash
# Check if there are any other try/catch blocks in streamText.ts that might show the pattern
sed -n '115,135p' src/client/streamText.ts | cat -nRepository: get-convex/agent
Length of output: 1059
🏁 Script executed:
#!/bin/bash
# Check if pendingMessage is created and what state it starts in
rg -n "pendingMessage" src/client/start.ts -B 2 -A 2 | head -40Repository: get-convex/agent
Length of output: 1466
🏁 Script executed:
#!/bin/bash
# Check the saveInputMessages function to understand pending message creation
rg -n "export.*saveInputMessages|function saveInputMessages" src -g'*.ts' -A 10Repository: get-convex/agent
Length of output: 682
🏁 Script executed:
#!/bin/bash
# Look at the full onError handler pattern to understand the expected behavior
sed -n '115,130p' src/client/streamText.tsRepository: get-convex/agent
Length of output: 681
🏁 Script executed:
#!/bin/bash
# Let's verify what pendingMessage state means by checking the message finalization
rg -n "finalizeMessage" src -g'*.ts' -B 2 -A 5 | head -40Repository: get-convex/agent
Length of output: 2099
🏁 Script executed:
#!/bin/bash
# Check if streaming creates its own pending messages or uses the same one
rg -n "createPendingMessage" src/client/streamText.ts -B 3 -A 3Repository: get-convex/agent
Length of output: 658
🏁 Script executed:
#!/bin/bash
# Check if there's any other cleanup that might handle this case
rg -n "onStepFinish|pendingFinalStep" src/client/streamText.ts -B 2 -A 5Repository: get-convex/agent
Length of output: 1933
🏁 Script executed:
#!/bin/bash
# Verify the exact pattern in onError vs the try/catch to confirm the discrepancy
sed -n '121,125p' src/client/streamText.ts && echo "---" && sed -n '163,172p' src/client/streamText.tsRepository: get-convex/agent
Length of output: 683
🏁 Script executed:
#!/bin/bash
# Confirm whether pendingMessage would exist when saveStreamDeltas is enabled
# Check the conditions under which streamText creates a pending message
sed -n '136,156p' src/client/start.tsRepository: get-convex/agent
Length of output: 839
🏁 Script executed:
#!/bin/bash
# Check what saveStreamDeltas options require to understand if pending message is always created
rg -n "saveStreamDeltas" src/client/streamText.ts -B 2 -A 2 | head -30Repository: get-convex/agent
Length of output: 997
🏁 Script executed:
#!/bin/bash
# Final verification: check if call.fail is passed to DeltaStreamer's onAsyncAbort
sed -n '86,100p' src/client/streamText.tsRepository: get-convex/agent
Length of output: 512
🏁 Script executed:
#!/bin/bash
# Verify what streamer?.fail() does - does it indirectly call call.fail?
rg -n "\.fail.*=" src/client/streaming.ts | head -20Repository: get-convex/agent
Length of output: 42
🏁 Script executed:
#!/bin/bash
# Search for DeltaStreamer.fail implementation to confirm it doesn't call call.fail
rg -n "class DeltaStreamer" src -g'*.ts' -A 150 | grep -E "(fail|constructor|onAsyncAbort)" | head -20Repository: get-convex/agent
Length of output: 484
🏁 Script executed:
#!/bin/bash
# Double-check: in what scenarios is saveStreamDeltas AND threadId both enabled?
rg -n "saveStreamDeltas.*true|saveStreamDeltas.*{" src -g'*.ts' -B 5 -A 2 | head -30Repository: get-convex/agent
Length of output: 207
🏁 Script executed:
#!/bin/bash
# Find the fail method in DeltaStreamer
rg -n "\.fail\s*\(" src/client/streaming.ts -B 2 -A 10Repository: get-convex/agent
Length of output: 42
🏁 Script executed:
#!/bin/bash
# Search for the fail method definition in DeltaStreamer class
sed -n '200,400p' src/client/streaming.ts | grep -A 10 "fail"Repository: get-convex/agent
Length of output: 368
Add call.fail() to finalize pending messages on stream-consumption errors.
The onError handler correctly calls call.fail() before streamer?.fail() to finalize the pending message. However, the try/catch block around stream consumption (lines 163–172) only calls streamer?.fail(), leaving the pending message un-finalized if result.consumeStream() throws.
Suggested fix
} catch (e) {
// If the stream errored (e.g. onStepFinish threw), the DeltaStreamer's
// finish() was never called, leaving the streaming message stuck in
// "streaming" state. Clean it up by marking it as aborted.
+ const reason = e instanceof Error ? e.message : String(e);
+ await call.fail(reason);
await streamer?.fail(e instanceof Error ? e.message : String(e));
throw e;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try { | |
| await stream; | |
| await result.consumeStream(); | |
| } catch (e) { | |
| // If the stream errored (e.g. onStepFinish threw), the DeltaStreamer's | |
| // finish() was never called, leaving the streaming message stuck in | |
| // "streaming" state. Clean it up by marking it as aborted. | |
| await streamer?.fail(e instanceof Error ? e.message : String(e)); | |
| throw e; | |
| } | |
| try { | |
| await stream; | |
| await result.consumeStream(); | |
| } catch (e) { | |
| // If the stream errored (e.g. onStepFinish threw), the DeltaStreamer's | |
| // finish() was never called, leaving the streaming message stuck in | |
| // "streaming" state. Clean it up by marking it as aborted. | |
| const reason = e instanceof Error ? e.message : String(e); | |
| await call.fail(reason); | |
| await streamer?.fail(e instanceof Error ? e.message : String(e)); | |
| throw e; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/client/streamText.ts` around lines 163 - 172, When catching errors from
awaiting stream and calling result.consumeStream() inside the try/catch, call
the pending RPC finalizer before notifying the streamer: add an awaited call to
call.fail() (or await call.fail(e instanceof Error ? e.message : String(e)))
immediately before awaiting streamer?.fail(...) so the pending message is
finalized the same way as the onError path; keep the existing rethrow of the
original error after both awaits.
93fec7b to
f2b2d4a
Compare
4e8cd39 to
2bfbdd6
Compare
Merge activity
|
Implements human-in-the-loop approval for tool calls using AI SDK v6's native collectToolApprovals(). When a tool has needsApproval, generation pauses with a tool-approval-request. After the user decides, calling streamText with the response message continues generation automatically. Core changes: - Agent.approveToolCall/denyToolCall save tool-approval-response messages - serializeNewMessagesInStep accepts newResponseMessages for approval flows - previousResponseMessageCount tracking for cumulative step.response.messages - Fix addMessages returning stale data after ctx.db.replace - Fix streamText not cleaning up DeltaStreamer on onStepFinish errors - Fix useDeltaStreams not clearing state when streams finish - Add realistic DEFAULT_USAGE in mockModel for AI SDK v6 - Unit tests for approve/deny flows Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Apply suggested changes
ca514fe to
4e7eed3
Compare
## Summary Example implementation demonstrating the tool approval flow from #222: - **Agent** (`example/convex/agents/approval.ts`): `delete_file` tool with `needsApproval: true` - **Backend** (`example/convex/chat/approval.ts`): `sendMessage`, `submitApproval`, `handleApprovalDecision` - **React UI** (`example/ui/chat/ChatApproval.tsx`): Approve/Deny buttons with denial reason input, type-safe approval helpers - **E2E tests** (`example/convex/approval.test.ts`): Approve and deny flows through usageHandler ## Test plan - [x] E2E approval test: approve flow executes tool and continues generation - [x] E2E approval test: deny flow produces denial acknowledgment from model - [x] Manual test: Send → Approve → clean completion (no stuck streaming) - [x] Manual test: Send → Deny → model acknowledges denial 🤖 Generated with [Claude Code](https://claude.com/claude-code)

Summary
Implements human-in-the-loop tool approval using AI SDK v6's native
collectToolApprovals(). When a tool hasneedsApproval: true, generation pauses with atool-approval-requestin the thread. The client saves an approval/denial response, then callsstreamTextto continue — the SDK handles tool execution and denial results automatically.Core changes (~55 lines of new implementation):
Agent.approveToolCall()/Agent.denyToolCall()— savetool-approval-responsemessagesserializeNewMessagesInStep— acceptnewResponseMessagesparam for approval tool-resultspreviousResponseMessageCounttracking in save closure for cumulativestep.response.messagesBug fixes included:
addMessagesreturning stale data afterctx.db.replacestreamTextnot cleaning up DeltaStreamer ononStepFinisherrorsuseDeltaStreamsnot clearing state when streams finish (caused ghost streaming bubble)Test plan
npm test)src/client/approval.test.ts)🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests