Releases: theredsix/agent-browser-protocol

v0.1.10

28 Mar 23:36

  • Added CDP mode to allow commercial CAPTCHA solvers to take over when necessary

Full Changelog: v0.1.9...v0.1.10

v0.1.9

25 Mar 06:41

Multi-Instance Support

  • Automatic port detection — When no --port or ABP_PORT is specified, ABP now automatically finds
    an available TCP port starting at 15678, probing up to 100 candidates. This allows launching
    multiple ABP instances without manual port coordination.
  • Isolated user data directories — Each launch now always sets --user-data-dir to a unique temp
    directory (/tmp/abp-*) when not explicitly provided. Previously, omitting --user-data-dir made a
    new instance exit silently when another instance already held Chrome's single-instance lock.
  • New exports — findAvailablePort() and DEFAULT_START_PORT are now exported from the package for
    programmatic use.
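
The probing behaviour described above can be sketched roughly as follows. This is a minimal illustration, not ABP's actual findAvailablePort(): the helper name firstFreePort, its signature, and the injected isFree check are assumptions made for the example. Only the start port 15678 and the 100-candidate limit come from the notes.

```typescript
// Values taken from the release notes.
const START_PORT = 15678;
const MAX_CANDIDATES = 100;

// Hypothetical helper (not the package's findAvailablePort): walks
// start, start+1, ... and returns the first port the caller's check accepts.
function firstFreePort(
  isFree: (port: number) => boolean,
  start: number = START_PORT,
  maxCandidates: number = MAX_CANDIDATES
): number {
  for (let i = 0; i < maxCandidates; i++) {
    const port = start + i;
    if (isFree(port)) {
      return port;
    }
  }
  throw new Error("no free port in " + start + ".." + (start + maxCandidates - 1));
}

// Example: pretend two other instances already hold 15678 and 15679.
const taken = [15678, 15679];
const chosen = firstFreePort((p) => taken.indexOf(p) === -1);
console.log(chosen); // 15680
```

In a real launcher the isFree callback would presumably try to bind a TCP listener on the candidate port; it is injected here so the loop itself stays easy to follow.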

MCP Proxy Improvements

  • Removed stale instance detection — The MCP proxy (mcp-proxy.ts) no longer probes for an existing
    ABP instance on startup. It always launches a fresh, isolated browser, eliminating race conditions
    when multiple MCP clients start simultaneously.
  • Correct port forwarding — MCP requests now use the actual resolved port from the launched browser
    rather than the configured default, fixing failures when auto-port-detection selects a non-default
    port.

CLI Changes

  • The --port flag is now optional (default: auto-detect starting at 15678)
  • Startup log now shows "Finding available port..." when no explicit port is set
  • API/MCP URLs printed after launch reflect the actual port used

Full Changelog: v0.1.8...v0.1.9

v0.1.8

19 Mar 04:42

New Features

  • Animation wait support for browser_wait — New animation parameter lets agents wait for CSS/DOM
    animations to complete after network activity settles, improving reliability on animation-heavy
    pages.

Bug Fixes

  • Fixed debugger hang during cross-origin navigation — Navigation actions (navigate, reload, back,
    forward) now send Debugger.disable before executing when the debugger is paused, preventing the
    renderer main thread from blocking on beforeunload handlers during cross-process navigation.
  • Fixed use-after-free in error response handling — Cleanup was happening before the error response
    was fully sent; reordered to prevent crash.
  • Linux rendering validation — ABP now detects the unsupported --disable-gpu +
    --disable-software-rasterizer flag combination on Linux and exits with a clear error message instead
    of producing blank screenshots silently.
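
As a rough illustration of the Linux rendering validation above, a launcher-side check could look like the sketch below. The function name, the platform string, and the error wording are assumptions for the example; only the two flag names come from the notes, and ABP performs its own check inside the browser.

```typescript
// Returns an error message for the flag pair the notes call unsupported on
// Linux, or null when the launch arguments are acceptable.
// Hypothetical helper; "linux" follows the Node.js process.platform convention.
function checkLinuxRenderingFlags(platform: string, args: string[]): string | null {
  if (platform !== "linux") {
    return null;
  }
  const gpuDisabled = args.indexOf("--disable-gpu") !== -1;
  const softwareDisabled = args.indexOf("--disable-software-rasterizer") !== -1;
  if (gpuDisabled && softwareDisabled) {
    return "unsupported on Linux: --disable-gpu together with --disable-software-rasterizer";
  }
  return null;
}

console.log(checkLinuxRenderingFlags("linux", ["--disable-gpu"])); // null
```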

Improvements

  • Linux packaging — Added optional bundling of vk_swiftshader_icd.json, WidevineCdm, MEIPreload, and
    other runtime files for more complete Linux distributions.
  • Linux troubleshooting docs — Added guidance on common IBUS warnings, EGL errors, and GPU driver
    fallback configuration to MANUAL_INSTALL.md.

Full Changelog: v0.1.6...v0.1.8

v0.1.7

14 Mar 07:14
52ef26c

agent-browser-protocol v0.1.7

Console Capture

  • In-memory 5000-entry FIFO ring buffer capturing console.log/warn/error, CORS errors, CSP
    violations, and uncaught exceptions
  • Browser-process WebContentsObserver implementation — no CDP dependency, non-fingerprintable
  • REST API: GET /api/v1/console with level, pattern (RE2 regex), tab_id, limit, and after_id
    pagination
  • REST API: DELETE /api/v1/console to clear buffer (optional tab_id filter)
  • MCP: new browser_console tool (19th tool) with query and clear support
  • Renderer fix: CORS and CSP violation messages now forwarded to browser process
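
To make the FIFO buffer and after_id pagination concrete, here is a small sketch of how such a buffer might behave. It is not ABP's browser-process implementation: the class name, method names, and entry shape are assumptions, and only the 5000-entry capacity, level filter, limit, and after_id ideas come from the notes.

```typescript
interface ConsoleEntry {
  id: number;    // monotonically increasing, never reused
  level: string; // e.g. "log", "warn", "error"
  text: string;
}

// Hypothetical bounded FIFO buffer: the oldest entry is dropped once full.
class BoundedConsoleBuffer {
  private entries: ConsoleEntry[] = [];
  private nextId = 1;

  constructor(private readonly capacity: number = 5000) {}

  push(level: string, text: string): number {
    const id = this.nextId++;
    this.entries.push({ id, level, text });
    if (this.entries.length > this.capacity) {
      this.entries.shift(); // evict the oldest entry
    }
    return id;
  }

  // Returns up to `limit` entries with id > afterId, optionally by level.
  query(afterId: number = 0, limit: number = 100, level?: string): ConsoleEntry[] {
    return this.entries
      .filter((e) => e.id > afterId && (level === undefined || e.level === level))
      .slice(0, limit);
  }

  clear(): void {
    this.entries = [];
  }
}

const buffer = new BoundedConsoleBuffer(2);
buffer.push("log", "first");
buffer.push("warn", "second");
buffer.push("error", "third"); // evicts "first"
console.log(buffer.query().map((e) => e.id)); // [ 2, 3 ]
```

A client paging through results would pass the last id it saw as after_id on the next call, so entries evicted in between are skipped rather than duplicated.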

Animation Wait

  • New animation parameter on browser_wait MCP tool and action envelope
  • Configurable post-network animation settle time wired through action lifecycle
  • Dedicated OnAnimationWaitTimeElapsed handler ensures CSS/JS animations complete before
    screenshot capture

Bug Fixes

  • UA client hints fingerprinting: Sec-CH-UA and navigator.userAgentData.brands now include
    "Google Chrome" brand entry, matching a standard Chrome build
  • Build fix: out-of-line constructors for Options struct to avoid linker issues

Full Changelog: v0.1.6...v0.1.7

v0.1.6

09 Mar 18:00

agent-browser-protocol v0.1.6

Human Input Mode

  • Agent/human toggle via REST API (/browser/input-mode) and toolbar icon in the address bar
  • Yellow gradient border overlay when in human mode for clear visual feedback
  • Robot/human icon in omnibox toggles mode on click, with yellow icon color in human mode
  • MCP tools blocked during human mode to prevent interference with user interaction
  • Replaces deprecated --allow-system-inputs flag

Network Capture

  • Per-tab in-memory ring buffer (1000 requests) capturing CDP Network events
  • SQLite-backed persistent storage with tagging via POST /network/save
  • Query saved and in-memory buffer results via GET /network with regex filters
  • Document type added to default capture types
  • network_tag parameter on all action tools for automatic tagging

Session-aware Curl

  • POST /tabs/{id}/curl executes HTTP requests using the tab's cookies and session state
  • Works while JS execution is paused

MCP Tools

  • New browser_network tool (query, save, clear network captures)
  • New browser_curl tool (session-aware HTTP requests)

NPM SDK

  • New methods: slider(), clearText(), batch(), waitForNetwork(), permissions(), selectPopup(),
    download content
  • New launch options: zoom, configFile, disablePause, allowSystemInputs
  • Updated default timing: 150ms action delay, 350ms screenshot delay
  • Fix: permissions.list() now correctly unwraps {permissions: [...]} envelope
  • CLI flags: --zoom, --config, --disable-pause, --allow-system-inputs passed through MCP proxy

Bug Fixes

  • Fix human mode toggle not switching back to agent mode (observer registration timing)
  • Fix yellow border overlay not appearing immediately on mode switch
  • Fix history use-after-free when recording errors
  • Set curl max body size to 5MB (SimpleURLLoader limit)
  • Fix network MCP tool URL encoding and header handling

Full Changelog: v0.1.5...v0.1.6

v0.1.5

04 Mar 21:31

ABP v0.1.5 Release Notes

Multi-Scroll with Intermediate Screenshots

browser_scroll now accepts an array of scroll steps instead of a single delta. Each step captures an intermediate
screenshot, giving agents visual feedback as content scrolls into view. A mouse move is dispatched before wheel
events for accurate scroll targeting.

  • New scrolls array parameter with {delta_px, direction} per step
  • Intermediate screenshots returned as MCP image blocks
  • Mouse move event dispatched to scroll target before wheel events
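
As an illustration of the scrolls array, the sketch below splits one long scroll into several steps so that each step can yield its own intermediate screenshot. The helper name and the "down" direction value are assumptions; only the {delta_px, direction} shape comes from the notes.

```typescript
interface ScrollStep {
  delta_px: number;
  direction: string; // accepted values are an assumption, e.g. "down"
}

// Hypothetical helper: break totalPx into steps of at most stepPx each.
function buildScrollSteps(totalPx: number, stepPx: number, direction: string): ScrollStep[] {
  if (stepPx <= 0) {
    throw new Error("stepPx must be positive");
  }
  const steps: ScrollStep[] = [];
  for (let remaining = totalPx; remaining > 0; remaining -= stepPx) {
    steps.push({ delta_px: Math.min(stepPx, remaining), direction });
  }
  return steps;
}

console.log(buildScrollSteps(1000, 400, "down").map((s) => s.delta_px)); // [ 400, 400, 200 ]
```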

browser_wait Tool

New browser_wait MCP tool that waits for network activity to settle before returning. Tracks same-site in-flight
requests persistently per tab to detect when a page has finished loading dynamic content.

  • 5s maximum wait with 150ms pre-wait and 350ms post-settle window
  • Persistent per-tab network request tracking
  • New REST endpoint: POST /api/v1/tabs/{id}/wait
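
One way to read the three timings above is modelled in the sketch below: wait 150ms, then return once no tracked request has been in flight for 350ms, but never later than 5s. This interpretation, the function name, and the interval representation are assumptions for illustration; the actual tracking happens inside the browser.

```typescript
// A tracked request, in milliseconds since the wait began.
interface RequestInterval {
  start: number;
  end: number;
}

// Hypothetical model: when would the wait return for a given set of requests?
function waitReturnTime(
  requests: RequestInterval[],
  preWaitMs: number = 150,
  postSettleMs: number = 350,
  maxWaitMs: number = 5000
): number {
  let quietFrom = preWaitMs; // earliest moment the quiet window may begin
  let moved = true;
  while (moved && quietFrom < maxWaitMs) {
    moved = false;
    for (const r of requests) {
      // A request in flight during [quietFrom, quietFrom + postSettleMs)
      // pushes the start of the quiet window to that request's end.
      if (r.start < quietFrom + postSettleMs && r.end > quietFrom) {
        quietFrom = r.end;
        moved = true;
      }
    }
  }
  return Math.min(quietFrom + postSettleMs, maxWaitMs);
}

console.log(waitReturnTime([]));                         // 500
console.log(waitReturnTime([{ start: 0, end: 1000 }])); // 1350
```

Under this model an idle page returns after 500ms, while a request that never finishes runs into the 5s cap.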

Frozen Screenshot Lifecycle

Action lifecycle reordered to freeze execution before capturing screenshots. Markup overlays now persist across the
pause boundary, so the screenshot always reflects the markup the agent will see.

  • ForceRedrawForTab and CleanupMarkupForTab extracted for per-tab control
  • Force-redraw and pause profiling added for performance visibility

Multi-Tab Execution Control

Execution control now handles multiple tabs correctly. When switching tabs, the previously active tab is
backgrounded and its pause/resume lifecycle is suspended, preventing CDP callbacks from interfering with the active
tab.

  • backgrounded flag on TabState gates execution control
  • Action lifecycle skips pause when tab is backgrounded
  • Browser tests added for multi-tab handoff

Action IDs

Every action response now includes a unique action_id string, making it easy to correlate actions with history
entries and screenshots.

  • String-based action IDs (replacing integer IDs in history)
  • action_id included in both success and error responses

browser_clear_text Tool

New MCP tool to clear text from input fields — select-all + delete in a single action.

NPM Package: User Profile Options

The agent-browser-protocol npm package now supports custom user data directories, profile directories, and user
agent strings.

  • New CLI flags: --user-data-dir, --profile-directory, --user-agent
  • Environment variables: ABP_USER_DATA_DIR, PROFILE_DIRECTORY, USER_AGENT
  • LaunchOptions API: userDataDir, profileDirectory, userAgent

Bug Fixes

  • Fixed custom app icon missing from release builds
  • Improved drag/slider timing (50 steps over 500ms)
  • Made macOS notarization more resilient
  • Fixed Linux release script
  • Fixed unsafe buffer access in GenerateActionId

Documentation

  • Added ABP-specific license file
  • Updated README and quickstart docs

Full Changelog: v0.1.4...v0.1.5

v0.1.4

25 Feb 07:44

ABP v0.1.4 Release Notes

Permission Interception

Full permission prompt interception — ABP now intercepts browser permission requests (geolocation,
notifications, camera, etc.) instead of letting them block the page. New REST endpoints and MCP tools
allow agents to grant or deny permissions programmatically.

  • AbpPermissionObserver intercepts permission prompts at the engine level
  • Granting geolocation accepts latitude, longitude, and accuracy parameters
  • AbpLocationProvider provides mock geolocation coordinates — no OS-level location dialog
  • New endpoints: GET /api/v1/permissions, POST /api/v1/permissions/{id}/grant, POST
    /api/v1/permissions/{id}/deny
  • New MCP tool: respond_to_permission
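
The grant and deny endpoints listed above could be called along these lines. This is a sketch only: the helper names, the base URL, and the JSON body shape are assumptions; the paths and the latitude, longitude, and accuracy parameter names come from the notes.

```typescript
// A plain description of an HTTP request, so the sketch needs no HTTP client.
interface RequestSketch {
  method: string;
  url: string;
  body?: string;
}

// Hypothetical helper: grant a pending geolocation request with mock coordinates.
function grantGeolocation(
  baseUrl: string,
  permissionId: string,
  latitude: number,
  longitude: number,
  accuracy: number
): RequestSketch {
  return {
    method: "POST",
    url: baseUrl + "/api/v1/permissions/" + encodeURIComponent(permissionId) + "/grant",
    // Sending the parameters as a JSON body is an assumption.
    body: JSON.stringify({ latitude, longitude, accuracy }),
  };
}

// Hypothetical helper: deny a pending permission request.
function denyPermission(baseUrl: string, permissionId: string): RequestSketch {
  return {
    method: "POST",
    url: baseUrl + "/api/v1/permissions/" + encodeURIComponent(permissionId) + "/deny",
  };
}

// The base URL and the id "perm-1" are illustrative only.
console.log(grantGeolocation("http://localhost:15678", "perm-1", 37.77, -122.42, 10).url);
// http://localhost:15678/api/v1/permissions/perm-1/grant
```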

Configurable Wait Timings

Action wait timings are now configurable via CLI flags, environment variables, or the SDK.

  • --min-wait — minimum wait time after actions (default 500ms)
  • --tracking-timeout — network/DOM tracking timeout (default 3000ms)
  • --post-settle — post-settle delay (default 500ms)
  • Also configurable via ABP_MIN_WAIT, ABP_TRACKING_TIMEOUT, ABP_POST_SETTLE env vars
  • SDK LaunchOptions now exposes these as typed options
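
The three configuration sources above need some precedence order. The sketch below assumes explicit flags win over environment variables, which win over the defaults; that ordering, the type names, and the helper names are assumptions. The environment variable names and default values come from the notes.

```typescript
interface WaitTimings {
  minWait: number;         // --min-wait, ms
  trackingTimeout: number; // --tracking-timeout, ms
  postSettle: number;      // --post-settle, ms
}

// Defaults as stated in the release notes.
const DEFAULT_TIMINGS: WaitTimings = { minWait: 500, trackingTimeout: 3000, postSettle: 500 };

// Assumed precedence: explicit flag, then environment variable, then default.
function pickTiming(flag: number | undefined, envValue: string | undefined, fallback: number): number {
  if (flag !== undefined) {
    return flag;
  }
  if (envValue !== undefined && envValue !== "") {
    const parsed = Number(envValue);
    if (!isNaN(parsed)) {
      return parsed;
    }
  }
  return fallback;
}

function resolveTimings(
  flags: Partial<WaitTimings>,
  env: { [name: string]: string | undefined }
): WaitTimings {
  return {
    minWait: pickTiming(flags.minWait, env["ABP_MIN_WAIT"], DEFAULT_TIMINGS.minWait),
    trackingTimeout: pickTiming(flags.trackingTimeout, env["ABP_TRACKING_TIMEOUT"], DEFAULT_TIMINGS.trackingTimeout),
    postSettle: pickTiming(flags.postSettle, env["ABP_POST_SETTLE"], DEFAULT_TIMINGS.postSettle),
  };
}

console.log(resolveTimings({ minWait: 100 }, { ABP_MIN_WAIT: "200", ABP_POST_SETTLE: "250" }));
// { minWait: 100, trackingTimeout: 3000, postSettle: 250 }
```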

Human-Speed Typing

Keyboard type action now types at a more human-like speed to avoid breaking autocomplete and other
input-sensitive page behaviors.

ABP Branding

  • Custom ABP logo replaces default Chromium icons across all platforms
  • Icon generation script at tools/abp/

Bug Fixes

  • Fixed Environment::GetVar API usage and Options type include
  • Fixed Windows packaging — correct exe name, added manifests
  • Include chrome_crashpad_handler in Linux and Windows packages
  • Added mouse_drag fallback guidance to browser_slider tool description

Documentation

  • Updated quickstart with Claude Code, Codex, and HTTP mode MCP paths
  • Fixed MCP tool count in docs (12/13 → 14)

Full Changelog: v0.1.3...v0.1.4

v0.1.3

24 Feb 04:21

  • Added support for native elements
  • Improved download and file upload handling

Full Changelog: https://github.com/theredsix/agent-browser-protocol/compare/v0.1.2...v0.1.3

v0.1.2

22 Feb 05:03

  • Added browser_slider action
  • Improved release scripts

Full Changelog: v0.1.1...v0.1.2

v0.1.1

21 Feb 03:13

What's New

ABP v0.1.1 is a major quality and usability release — 94 commits covering performance,
stability, a streamlined MCP surface, and cross-platform builds.

MCP Tool Consolidation

Consolidated 30 MCP tools down to 12 with a new dispatch table architecture. Simpler tool
surface, same full capability. The embedded MCP server is available at /mcp with no sidecar
process.

Batch Execution

New /api/v1/tabs/{id}/batch endpoint for executing multiple actions in a single request —
reduces round-trips for multi-step workflows.

Performance

  • WebP encoding with Lanczos3 scaling for smaller, sharper screenshots
  • CoreAnimation delay reduced from 167ms to 50ms
  • Compositor-based scroll position — eliminates JS round-trip
  • Parallel pause + markup opt-in — action latency significantly reduced

Screenshot Markup Overhaul

  • Markup enabled by default — screenshots include interactive element overlays out of the box
  • Composable markup tags — mix and match interactive, text, image, selected, etc. as an array
  • disable_markup parameter to opt out of specific tags instead of opting in
  • selected tag — highlights the element under the cursor
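
The opt-out model described above can be pictured with a short sketch. It assumes, purely for illustration, that the four tags named in the notes are all enabled by default; the real default set and the helper name are not confirmed by the notes.

```typescript
// Tag names mentioned in the notes. Treating all of them as enabled by
// default is an assumption made for this example.
const KNOWN_MARKUP_TAGS = ["interactive", "text", "image", "selected"];

// Hypothetical helper: which tags remain after applying disable_markup.
function effectiveMarkup(disableMarkup: string[] = []): string[] {
  return KNOWN_MARKUP_TAGS.filter((tag) => disableMarkup.indexOf(tag) === -1);
}

console.log(effectiveMarkup(["text", "image"])); // [ 'interactive', 'selected' ]
```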

New Actions

  • Drag and drop — POST /api/v1/tabs/{id}/drag
  • NormalizeKey utility for consistent keyboard input handling across platforms
  • US keyboard layout for type action with proper virtual cursor scaling

Stability Fixes

  • Prevent renderer DCHECK crashes during rapid cross-origin navigation with debugger pause
  • Fix Performance::SuspendObserver idempotency to prevent Blink DCHECK crash
  • Event-driven ForceRedraw recovery during navigation (no more dropped screenshots)
  • Fix cross-process navigation renderer crashes
  • Suppress crash recovery popup and session restore on startup
  • Auto-reattach debug server when ABP relaunches with new session
  • Global monotonicity clamp for BeginFrameArgs frame_time

Developer Experience

  • Renamed binary from chrome/chromium to abp (Linux/Windows) and ABP.app (macOS)
  • ABP always starts — no more --enable-abp flag required
  • Fixed viewport — 1280x800 locked at 100% zoom, no user resize
  • Debug UI — HTML dashboard with action forms, history panel, and SSE live updates
  • agent-browser-protocol npm package with Claude Code plugin and stdio MCP proxy

Cross-Platform Builds

  • Linux build working — tools/abp/release-linux.sh
  • Windows build working — tools/abp/release-win.ps1
  • Release scripts auto-read version from package.json
  • Simplified archive naming: abp-{version}-{platform}-{arch}
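
The archive naming pattern above expands as in this small sketch. The helper name and the example platform and architecture strings are illustrative assumptions; only the abp-{version}-{platform}-{arch} pattern comes from the notes.

```typescript
// Hypothetical helper that fills in the abp-{version}-{platform}-{arch} pattern.
function archiveName(version: string, platform: string, arch: string): string {
  return "abp-" + version + "-" + platform + "-" + arch;
}

console.log(archiveName("0.1.1", "linux", "x64")); // abp-0.1.1-linux-x64
```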

Documentation

  • Rewritten README as MCP-first landing page
  • Standalone REST API reference
  • Standalone MCP server reference
  • Build-from-source guide for all platforms
  • Training data guide with SQLite schema and abp-debug

Full Changelog: 0.1.0...v0.1.1