feat(cache): add custom tool caching and fix replay navigation timing#1562

Open
hoverlover wants to merge 3 commits into browserbase:main from hoverlover:feature/custom-tool-caching

Conversation

@hoverlover hoverlover commented Jan 16, 2026

Summary

This PR adds two improvements to the agent cache replay system:

1. Custom Tool Caching (fixes #1558)

Custom tools (like form-filling helpers) were not being recorded or replayed from cache, causing workflows to fail on cache replay.

Changes:

  • Add AgentReplayCustomToolStep type for caching custom tool invocations
  • Wrap custom tools with recording logic via wrapToolsForRecording()
  • Record custom tool calls in both AISDK (hybrid/dom) and CUA agent modes
  • Replay custom tools by re-executing with cached arguments
  • Thread tools parameter through tryReplay and related methods
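The recording wrapper listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Tool` and `CustomToolStep` shapes and the `record` callback are assumptions, and only the name `wrapToolsForRecording` and the `custom_tool` step type come from the PR itself.

```typescript
// Hypothetical shapes for illustration; the real types are in
// packages/core/lib/v3/types/private/cache.ts and may differ.
type ToolArgs = Record<string, unknown>;

interface Tool {
  execute: (args: ToolArgs) => Promise<unknown>;
}

interface CustomToolStep {
  type: "custom_tool";
  toolName: string;
  args: ToolArgs;
}

// Returns a copy of `tools` in which each execute() reports a step once it succeeds.
function wrapToolsForRecording(
  tools: Record<string, Tool>,
  record: (step: CustomToolStep) => void,
): Record<string, Tool> {
  const wrapped: Record<string, Tool> = {};
  for (const toolName of Object.keys(tools)) {
    const tool = tools[toolName];
    wrapped[toolName] = {
      ...tool,
      execute: async (args: ToolArgs) => {
        const result = await tool.execute(args);
        // Assumption: only successful calls are recorded, so a failed
        // invocation never ends up in the cache.
        record({ type: "custom_tool", toolName, args });
        return result;
      },
    };
  }
  return wrapped;
}
```

The original tool objects are left untouched, so the caller can still hand the unwrapped set to a code path that must not record.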

2. Navigation Waiting (fixes #1561)

Cache replay was executing steps too fast without waiting for page navigation to complete, causing subsequent steps to run on the wrong page.

Changes:

  • Detect URL changes after actions during replay
  • Wait for page load (waitForLoadState) before continuing to next step
  • Apply to replayAgentActStep, replayAgentFillFormStep, replayAgentKeysStep
  • Prevents race conditions where steps execute before navigation completes
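A rough sketch of the URL-change check described above, under stated assumptions: `PageLike` and `runStepAndAwaitNavigation` are hypothetical names, and the page interface is reduced to the two calls the sketch needs. `waitForLoadState` and the 10-second timeout are the only details taken from this PR.

```typescript
// Hypothetical subset of a page API, for illustration only.
interface PageLike {
  url(): string;
  waitForLoadState(state: "load", timeoutMs: number): Promise<void>;
}

// Runs one replayed action, then waits for the page to finish loading
// if the URL differs from what it was before the action.
async function runStepAndAwaitNavigation(
  page: PageLike,
  action: () => Promise<void>,
  timeoutMs: number = 10000,
): Promise<boolean> {
  const urlBefore = page.url();
  await action();
  const navigated = page.url() !== urlBefore;
  if (navigated) {
    await page.waitForLoadState("load", timeoutMs);
  }
  return navigated;
}
```

A plain URL comparison does not notice a reload of the same URL, so a check like this only covers actions that move to a different address.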

Test Plan

Tested with a real-world workflow:

  • Login flow with custom fillUsername and fillPassword tools
  • Search flow with custom fillSearch tool
  • Multi-page navigation (login → home → search → product page)
  • Verified 0 token usage on cache replay (no LLM calls)
  • Verified consistent results across 3+ replay runs

Files Changed

  • packages/core/lib/v3/types/private/cache.ts - Add AgentReplayCustomToolStep type
  • packages/core/lib/v3/v3.ts - Add wrapToolsForRecording(), update tool threading
  • packages/core/lib/v3/handlers/v3CuaAgentHandler.ts - Add custom tool recording for CUA mode
  • packages/core/lib/v3/cache/AgentCache.ts - Add custom tool replay, add navigation waiting

Summary by cubic

Add custom tool caching to agent replay and wait for navigation between steps. This fixes tool-based workflows and prevents steps from running on the wrong page.

  • New Features

    • Record and replay custom tool calls as custom_tool steps.
    • Wrap tools to auto-record; replay re-executes with cached args.
    • Pass tools through tryReplay and stream replay.
  • Bug Fixes

    • Detect URL changes during replay and wait for page load.
    • Applied to act, fillForm, and keys (e.g., Enter submits forms).
    • Prevents steps executing on the wrong page.
    • Clean up recording state if stream creation fails.

Written for commit 36d3274. Summary will update on new commits.

This commit adds two improvements to the agent cache replay system:

1. Custom Tool Caching (fixes browserbase#1558)
   - Add AgentReplayCustomToolStep type for caching custom tool invocations
   - Wrap custom tools with recording logic via wrapToolsForRecording()
   - Record custom tool calls in both AISDK (hybrid) and CUA agent modes
   - Replay custom tools by re-executing with cached arguments
   - Thread tools parameter through tryReplay and related methods

2. Navigation Waiting (fixes browserbase#1561)
   - Detect URL changes after actions during replay
   - Wait for page load (waitForLoadState) before continuing to next step
   - Apply to replayAgentActStep, replayAgentFillFormStep, replayAgentKeysStep
   - Prevents race conditions where steps execute before navigation completes

These changes enable reliable cache replay for workflows that include:
- Custom form-filling tools (fillUsername, fillPassword, etc.)
- Multi-page navigation sequences
- Form submissions via Enter key

Co-Authored-By: Claude <noreply@anthropic.com>

changeset-bot bot commented Jan 16, 2026

⚠️ No Changeset found

Latest commit: 36d3274

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

greptile-apps bot commented Jan 16, 2026

Greptile Summary

This PR extends agent cache replay to support custom tools and fixes navigation timing issues during replay. The implementation wraps custom tools to record their invocations and detects URL changes to wait for page load completion.

Key Changes:

  • Added AgentReplayCustomToolStep type for caching custom tool calls
  • Implemented wrapToolsForRecording() to intercept and record tool executions
  • Added custom tool replay in replayAgentCustomToolStep()
  • Navigation detection: capture URL before/after actions, wait for load state if URL changes
  • Applied navigation waiting to replayAgentActStep, replayAgentFillFormStep, and replayAgentKeysStep

Critical Issue Found:

  • tryReplay() is called with unwrapped tools instead of wrappedTools in 3 locations (lines 1856, 1944, 1989), causing a mismatch between what's recorded during execution vs. what's available during replay

Confidence Score: 2/5

  • This PR has a critical bug that will break custom tool replay functionality
  • The implementation passes unwrapped tools to replay methods instead of wrapped tools, causing tools during replay to lack the recording wrappers present during initial execution. This breaks the core functionality being added.
  • packages/core/lib/v3/v3.ts requires immediate attention - the tool wrapping logic has a critical bug at lines 1856, 1944, and 1989

Important Files Changed

| Filename | Overview |
|---|---|
| packages/core/lib/v3/v3.ts | Adds wrapToolsForRecording to capture custom tool invocations. Critical bug: passes unwrapped tools to tryReplay instead of wrappedTools, causing mismatch between recording and replay. |
| packages/core/lib/v3/cache/AgentCache.ts | Adds custom tool replay support and navigation waiting logic. Implementation correctly detects URL changes and waits for page load with proper timeout handling. |
| packages/core/lib/v3/handlers/v3CuaAgentHandler.ts | Adds recording for custom_tool actions in CUA mode. Implementation correctly records tool name and arguments when recording is active. |
| packages/core/lib/v3/types/private/cache.ts | Adds AgentReplayCustomToolStep type to support custom tool caching. Type definition is clean and well-documented. |

Sequence Diagram

sequenceDiagram
    participant User
    participant V3
    participant AgentCache
    participant Handler
    participant CustomTool
    participant Page

    User->>V3: agent({ tools, cacheDir })
    V3->>V3: wrapToolsForRecording(tools)
    Note over V3: Wraps tools to record invocations
    
    alt Cache Hit
        V3->>AgentCache: tryReplay(cacheContext, tools)
        AgentCache->>AgentCache: Load cached steps
        loop For each cached step
            alt Custom Tool Step
                AgentCache->>CustomTool: execute(cached args)
                CustomTool-->>AgentCache: result
            else Navigation Action
                AgentCache->>Page: takeDeterministicAction()
                Page-->>AgentCache: action result
                AgentCache->>AgentCache: Check URL change
                alt URL Changed
                    AgentCache->>Page: waitForLoadState("load", 10000)
                    Page-->>AgentCache: load complete
                end
            end
        end
        AgentCache-->>V3: cached result
        V3-->>User: result
    else Cache Miss
        V3->>V3: beginAgentReplayRecording()
        V3->>Handler: execute/stream(wrappedTools)
        loop Agent execution
            Handler->>CustomTool: execute(args)
            Note over CustomTool: Wrapped tool records invocation
            CustomTool->>V3: recordAgentReplayStep({ type: "custom_tool" })
            CustomTool-->>Handler: tool result
            Handler->>Page: perform actions
            Page-->>Handler: action results
        end
        Handler-->>V3: final result
        V3->>V3: endAgentReplayRecording()
        V3->>AgentCache: store(steps, result)
        V3-->>User: result
    end
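The cache-hit branch of the diagram can be read as a dispatch loop over cached steps. The sketch below is an assumption-laden illustration: `CachedStep`, `replaySteps`, and `performAct` are hypothetical names, only two step kinds are modeled, and returning `false` for a missing tool is a design choice of this sketch rather than documented behavior.

```typescript
type ToolArgs = Record<string, unknown>;

interface Tool {
  execute: (args: ToolArgs) => Promise<unknown>;
}

// Hypothetical step union; the PR's real union has more variants.
type CachedStep =
  | { type: "custom_tool"; toolName: string; args: ToolArgs }
  | { type: "act"; instruction: string };

// Replays cached steps in order. Returns false when a step cannot be
// replayed, which a caller could treat as a cache miss.
async function replaySteps(
  steps: CachedStep[],
  tools: Record<string, Tool>,
  performAct: (instruction: string) => Promise<void>,
): Promise<boolean> {
  for (const step of steps) {
    switch (step.type) {
      case "custom_tool": {
        const tool = tools[step.toolName];
        if (!tool) return false; // the tool is no longer registered
        await tool.execute(step.args); // re-execute with the cached arguments
        break;
      }
      case "act":
        await performAct(step.instruction);
        break;
    }
  }
  return true;
}
```

No model call appears anywhere in the loop, which is consistent with the zero-token replay reported in the test plan.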

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 3 comments

const replayed = await this.agentCache.tryReplay(
cacheContext,
undefined,
tools,

logic: passing unwrapped tools instead of wrappedTools here - replay will execute the original tools without recording wrappers, causing double-recording if cache miss falls through to re-execution

Suggested change (packages/core/lib/v3/v3.ts, line 1856): replace `tools,` with

```suggestion
                  tools: wrappedTools,
```

const replayed = await this.agentCache.tryReplayAsStream(
cacheContext,
llmClient,
tools,

logic: passing unwrapped tools instead of wrappedTools - same issue in streaming mode

Suggested change (packages/core/lib/v3/v3.ts, line 1944): replace `tools,` with

```suggestion
                tools: wrappedTools,
```

Comment on lines 1989 to 1991
const replayed = await this.agentCache.tryReplay(
cacheContext,
llmClient,

logic: passing unwrapped tools instead of wrappedTools - same issue in non-streaming mode (second agent handler path)

Suggested change (packages/core/lib/v3/v3.ts, lines 1989 to 1991): add `wrappedTools` as the next argument

```suggestion
            const replayed = await this.agentCache.tryReplay(
              cacheContext,
              llmClient,
              wrappedTools,
```

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 4 files

P2 (packages/core/lib/v3/v3.ts:1955): Recording is started before stream creation without error handling; exceptions leave the cache stuck in recording state.

hoverlover and others added 2 commits January 19, 2026 09:07
If handler.stream() throws after beginAgentReplayRecording() is called,
the recording state was never cleaned up, leaving the cache stuck in
recording mode.

Added try-catch around stream creation to call discardAgentReplayRecording()
on error, matching the error handling in non-streaming mode.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Clarifies why unwrapped tools (not wrappedTools) are passed to tryReplay().
During replay, tools execute with cached arguments - using wrappedTools
would cause the recording wrapper to record these replayed calls,
leading to duplicate cache entries.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@hoverlover (Author)

Response to Review Comments

Re: Using wrappedTools instead of tools in tryReplay() calls

After analysis, I believe the current code is correct and should not be changed to use wrappedTools. Here's why:

During replay, tryReplay() executes custom tools with their cached arguments via replayAgentCustomToolStep(), which calls tool.execute().

If we passed wrappedTools:

  • The recording wrapper would intercept these replayed tool calls
  • Each replayed call would be recorded as a new step
  • This would cause duplicate entries if the cache is later updated

The current flow is correct:

  1. Cache miss (initial execution): Handler uses wrappedTools → tool invocations are recorded → steps stored in cache
  2. Cache hit (replay): tryReplay() uses unwrapped tools → cached tools execute without recording → no duplicates

I've added inline comments at all three locations to clarify this intentional design decision (commit 36d3274).
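To make the argument above concrete, here is a small sketch comparing replay through wrapped and unwrapped tools. It assumes a wrapper that records unconditionally; whether the real wrapper first checks that recording is active is not shown in this thread. `withRecording`, `recorded`, and `demo` are hypothetical names.

```typescript
type ToolArgs = Record<string, unknown>;
type Execute = (args: ToolArgs) => Promise<unknown>;

const recorded: { toolName: string; args: ToolArgs }[] = [];

// Hypothetical recording wrapper: logs every call, then delegates.
function withRecording(toolName: string, execute: Execute): Execute {
  return async (args: ToolArgs) => {
    recorded.push({ toolName, args });
    return execute(args);
  };
}

const fillUsername: Execute = async (args: ToolArgs) => `typed ${String(args["value"])}`;
const wrappedFillUsername = withRecording("fillUsername", fillUsername);

// Returns the number of recorded steps after each phase.
async function demo(): Promise<number[]> {
  // Cache miss: the handler runs the wrapped tool and one step is recorded.
  await wrappedFillUsername({ value: "alice" });
  const afterRecording = recorded.length;
  const cachedSteps = recorded.slice();

  // Cache hit with the unwrapped tool: nothing new is recorded.
  for (const step of cachedSteps) await fillUsername(step.args);
  const afterUnwrappedReplay = recorded.length;

  // Cache hit with the wrapped tool: each replayed call is recorded again.
  for (const step of cachedSteps) await wrappedFillUsername(step.args);
  const afterWrappedReplay = recorded.length;

  return [afterRecording, afterUnwrappedReplay, afterWrappedReplay];
}
```

Under these assumptions `demo()` resolves to `[1, 1, 2]`: the step count only grows when replay goes through the wrapped tool.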


Fixed: Recording state cleanup (P2)

Added try-catch around handler.stream() to call discardAgentReplayRecording() on error, preventing the cache from being stuck in recording mode if stream creation fails (commit 63a6d52).
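The cleanup pattern can be sketched as below. `ReplayRecorder` and `startRecordedStream` are hypothetical stand-ins; `begin` and `discard` abbreviate the `beginAgentReplayRecording()` and `discardAgentReplayRecording()` calls named in this PR, and the real code in v3.ts may be structured differently.

```typescript
// Hypothetical recorder holding only the state this sketch needs.
class ReplayRecorder {
  recording = false;

  begin(): void {
    this.recording = true;
  }

  discard(): void {
    this.recording = false;
  }
}

// Starts recording, then creates the stream. If creation throws, the
// recording is discarded before the error is rethrown to the caller.
async function startRecordedStream<T>(
  recorder: ReplayRecorder,
  createStream: () => Promise<T>,
): Promise<T> {
  recorder.begin();
  try {
    return await createStream();
  } catch (error) {
    recorder.discard();
    throw error;
  }
}
```

On success the recorder stays in recording mode, since the steps are still being produced while the stream is consumed; only the failure path resets it.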

Development

Successfully merging this pull request may close these issues:

  • Agent cache does not record or replay custom tool calls
  • Agent cache replay executes steps too fast, causing navigation failures