Releases: theredsix/agent-browser-protocol
v0.1.10
v0.1.9
v0.1.9
Multi-Instance Support
- Automatic port detection — When no --port or ABP_PORT is specified, ABP now automatically finds
an available TCP port starting at 15678, probing up to 100 candidates. This allows launching
multiple ABP instances without manual port coordination. - Isolated user data directories — Each launch now always sets --user-data-dir to a unique temp
directory (/tmp/abp-*) when not explicitly provided. Previously, omitting --user-data-dir caused
Chrome's single-instance lock to silently exit if another instance was already running. - New exports — findAvailablePort() and DEFAULT_START_PORT are now exported from the package for
programmatic use.
MCP Proxy Improvements
- Removed stale instance detection — The MCP proxy (mcp-proxy.ts) no longer probes for an existing
ABP instance on startup. It always launches a fresh, isolated browser, eliminating race conditions
when multiple MCP clients start simultaneously. - Correct port forwarding — MCP requests now use the actual resolved port from the launched browser
rather than the configured default, fixing failures when auto-port-detection selects a non-default
port.
CLI Changes
- Port is now optional in --port flag (default: auto-detect starting at 15678)
- Startup log now shows "Finding available port..." when no explicit port is set
- API/MCP URLs printed after launch reflect the actual port used
Full Changelog: v0.1.8...v0.1.9
v0.1.8
v0.1.8
New Features
- Animation wait support for browser_wait — New animation parameter lets agents wait for CSS/DOM
animations to complete after network activity settles, improving reliability on animation-heavy
pages.
Bug Fixes
- Fixed debugger hang during cross-origin navigation — Navigation actions (navigate, reload, back,
forward) now send Debugger.disable before executing when the debugger is paused, preventing the
renderer main thread from blocking on beforeunload handlers during cross-process navigation. - Fixed use-after-free in error response handling — Cleanup was happening before the error response
was fully sent; reordered to prevent crash. - Linux rendering validation — ABP now detects the unsupported --disable-gpu +
--disable-software-rasterizer flag combination on Linux and exits with a clear error message instead
of producing blank screenshots silently.
Improvements
- Linux packaging — Added optional bundling of vk_swiftshader_icd.json, WidevineCdm, MEIPreload, and
other runtime files for more complete Linux distributions. - Linux troubleshooting docs — Added guidance on common IBUS warnings, EGL errors, and GPU driver
fallback configuration to MANUAL_INSTALL.md.
Full Changelog: v0.1.6...v0.1.8
v0.1.7
agent-browser-protocol v0.1.7
Console Capture
- In-memory 5000-entry FIFO ring buffer capturing console.log/warn/error, CORS errors, CSP
violations, and uncaught exceptions - Browser-process WebContentsObserver implementation — no CDP dependency, non-fingerprintable
- REST API: GET /api/v1/console with level, pattern (RE2 regex), tab_id, limit, and after_id
pagination - REST API: DELETE /api/v1/console to clear buffer (optional tab_id filter)
- MCP: new browser_console tool (19th tool) with query and clear support
- Renderer fix: CORS and CSP violation messages now forwarded to browser process
Animation Wait
- New animation parameter on browser_wait MCP tool and action envelope
- Configurable post-network animation settle time wired through action lifecycle
- Dedicated OnAnimationWaitTimeElapsed handler ensures CSS/JS animations complete before
screenshot capture
Bug Fixes
- UA client hints fingerprinting: Sec-CH-UA and navigator.userAgentData.brands now include
"Google Chrome" brand entry, matching a standard Chrome build - Build fix: out-of-line constructors for Options struct to avoid linker issues
Full Changelog: v0.1.6...v0.1.7
v0.1.6
agent-browser-protocol v0.1.6
Human Input Mode
- Agent/human toggle via REST API (/browser/input-mode) and toolbar icon in the address bar
- Yellow gradient border overlay when in human mode for clear visual feedback
- Robot/human icon in omnibox toggles mode on click, with yellow icon color in human mode
- MCP tools blocked during human mode to prevent interference with user interaction
- Replaces deprecated --allow-system-inputs flag
Network Capture
- Per-tab in-memory ring buffer (1000 requests) capturing CDP Network events
- SQLite-backed persistent storage with tagging via POST /network/save
- Query saved and in-memory buffer results via GET /network with regex filters
- Document type added to default capture types
- network_tag parameter on all action tools for automatic tagging
Session-aware Curl
- POST /tabs/{id}/curl executes HTTP requests using the tab's cookies and session state
- Works while JS execution is paused
MCP Tools
- New browser_network tool (query, save, clear network captures)
- New browser_curl tool (session-aware HTTP requests)
NPM SDK
- New methods: slider(), clearText(), batch(), waitForNetwork(), permissions(), selectPopup(),
download content - New launch options: zoom, configFile, disablePause, allowSystemInputs
- Updated default timing: 150ms action delay, 350ms screenshot delay
- Fix: permissions.list() now correctly unwraps {permissions: [...]} envelope
- CLI flags: --zoom, --config, --disable-pause, --allow-system-inputs passed through MCP proxy
Bug Fixes
- Fix human mode toggle not switching back to agent mode (observer registration timing)
- Fix yellow border overlay not appearing immediately on mode switch
- Fix history use-after-free when recording errors
- Fix curl max body size to 5MB (SimpleURLLoader limit)
- Fix network MCP tool URL encoding and header handling
Full Changelog: v0.1.5...v0.1.6
v0.1.5
ABP v0.1.5 Release Notes
Multi-Scroll with Intermediate Screenshots
browser_scroll now accepts an array of scroll steps instead of a single delta. Each step captures an intermediate
screenshot, giving agents visual feedback as content scrolls into view. A mouse move is dispatched before wheel
events for accurate scroll targeting.
- New scrolls array parameter with {delta_px, direction} per step
- Intermediate screenshots returned as MCP image blocks
- Mouse move event dispatched to scroll target before wheel events
browser_wait Tool
New browser_wait MCP tool that waits for network activity to settle before returning. Tracks same-site in-flight
requests persistently per tab to detect when a page has finished loading dynamic content.
- 5s maximum wait with 150ms pre-wait and 350ms post-settle window
- Persistent per-tab network request tracking
- New REST endpoint: POST /api/v1/tabs/{id}/wait
Frozen Screenshot Lifecycle
Action lifecycle reordered to freeze execution before capturing screenshots. Markup overlays now persist across the
pause boundary, so the screenshot always reflects the markup the agent will see.
- ForceRedrawForTab and CleanupMarkupForTab extracted for per-tab control
- Force-redraw and pause profiling added for performance visibility
Multi-Tab Execution Control
Execution control now handles multiple tabs correctly. When switching tabs, the previously active tab is
backgrounded and its pause/resume lifecycle is suspended, preventing CDP callbacks from interfering with the active
tab.
- backgrounded flag on TabState gates execution control
- Action lifecycle skips pause when tab is backgrounded
- Browser tests added for multi-tab handoff
Action IDs
Every action response now includes a unique action_id string, making it easy to correlate actions with history
entries and screenshots.
- String-based action IDs (replacing integer IDs in history)
- action_id included in both success and error responses
browser_clear_text Tool
New MCP tool to clear text from input fields — select-all + delete in a single action.
NPM Package: User Profile Options
The @anthropic-ai/abp npm package now supports custom user data directories, profile directories, and user agent
strings.
- New CLI flags: --user-data-dir, --profile-directory, --user-agent
- Environment variables: ABP_USER_DATA_DIR, PROFILE_DIRECTORY, USER_AGENT
- LaunchOptions API: userDataDir, profileDirectory, userAgent
Bug Fixes
- Fixed custom app icon missing from release builds
- Improved drag/slider timing (50 steps over 500ms)
- Made macOS notarization more resilient
- Fixed Linux release script
- Fixed unsafe buffer access in GenerateActionId
Documentation
- Added ABP-specific license file
- Updated README and quickstart docs
Full Changelog: v0.1.4...v0.1.5
v0.1.4
ABP v0.1.4 Release Notes
Permission Interception
Full permission prompt interception — ABP now intercepts browser permission requests (geolocation,
notifications, camera, etc.) instead of letting them block the page. New REST endpoints and MCP tools
allow agents to grant or deny permissions programmatically.
- AbpPermissionObserver intercepts permission prompts at the engine level
- Granting geolocation accepts latitude, longitude, and accuracy parameters
- AbpLocationProvider provides mock geolocation coordinates — no OS-level location dialog
- New endpoints: GET /api/v1/permissions, POST /api/v1/permissions/{id}/grant, POST
/api/v1/permissions/{id}/deny - New MCP tool: respond_to_permission
Configurable Wait Timings
Action wait timings are now configurable via CLI flags, environment variables, or the SDK.
- --min-wait — minimum wait time after actions (default 500ms)
- --tracking-timeout — network/DOM tracking timeout (default 3000ms)
- --post-settle — post-settle delay (default 500ms)
- Also configurable via ABP_MIN_WAIT, ABP_TRACKING_TIMEOUT, ABP_POST_SETTLE env vars
- SDK LaunchOptions now exposes these as typed options
Human-Speed Typing
Keyboard type action now types at a more human-like speed to avoid breaking autocomplete and other
input-sensitive page behaviors.
ABP Branding
- Custom ABP logo replaces default Chromium icons across all platforms
- Icon generation script at tools/abp/
Bug Fixes
- Fixed Environment::GetVar API usage and Options type include
- Fixed Windows packaging — correct exe name, added manifests
- Include chrome_crashpad_handler in Linux and Windows packages
- Added mouse_drag fallback guidance to browser_slider tool description
Documentation
- Updated quickstart with Claude Code, Codex, and HTTP mode MCP paths
- Fixed MCP tool count in docs (12/13 → 14)
Full Changelog: v0.1.3...v0.1.4
v0.1.3
v0.1.2
- Added browser_slider action
- Improved release scripts
Full Changelog: v0.1.1...v0.1.2
v0.1.1
What's New
ABP v0.1.1 is a major quality and usability release — 94 commits covering performance,
stability, a streamlined MCP surface, and cross-platform builds.
MCP Tool Consolidation
Consolidated 30 MCP tools down to 12 with a new dispatch table architecture. Simpler tool
surface, same full capability. The embedded MCP server is available at /mcp with no sidecar
process.
Batch Execution
New /api/v1/tabs/{id}/batch endpoint for executing multiple actions in a single request —
reduces round-trips for multi-step workflows.
Performance
- WebP encoding with Lanczos3 scaling for smaller, sharper screenshots
- CoreAnimation delay reduced from 167ms to 50ms
- Compositor-based scroll position — eliminates JS round-trip
- Parallel pause + markup opt-in — action latency significantly reduced
Screenshot Markup Overhaul
- Markup enabled by default — screenshots include interactive element overlays out of the box
- Composable markup tags — mix and match interactive, text, image, selected, etc. as an array
- disable_markup parameter to opt out of specific tags instead of opting in
- selected tag — highlights the element under the cursor
New Actions
- Drag and drop — POST /api/v1/tabs/{id}/drag
- NormalizeKey utility for consistent keyboard input handling across platforms
- US keyboard layout for type action with proper virtual cursor scaling
Stability Fixes
- Prevent renderer DCHECK crashes during rapid cross-origin navigation with debugger pause
- Fix Performance::SuspendObserver idempotency to prevent Blink DCHECK crash
- Event-driven ForceRedraw recovery during navigation (no more dropped screenshots)
- Fix cross-process navigation renderer crashes
- Suppress crash recovery popup and session restore on startup
- Auto-reattach debug server when ABP relaunches with new session
- Global monotonicity clamp for BeginFrameArgs frame_time
Developer Experience
- Renamed binary from chrome/chromium to abp (Linux/Windows) and ABP.app (macOS)
- ABP always starts — no more --enable-abp flag required
- Fixed viewport — 1280x800 locked at 100% zoom, no user resize
- Debug UI — HTML dashboard with action forms, history panel, and SSE live updates
- agent-browser-protocol npm package with Claude Code plugin and stdio MCP proxy
Cross-Platform Builds
- Linux build working — tools/abp/release-linux.sh
- Windows build working — tools/abp/release-win.ps1
- Release scripts auto-read version from package.json
- Simplified archive naming: abp-{version}-{platform}-{arch}
Documentation
- Rewritten README as MCP-first landing page
- Standalone REST API reference
- Standalone MCP server reference
- Build-from-source guide for all platforms
- Training data guide with SQLite schema and abp-debug
Full Changelog: 0.1.0...v0.1.1