feat(browser): multi-engine browser automation with pool, stealth, proxy, and live view #778

Open
nhokboo wants to merge 10 commits into nextlevelbuilder:dev from nhokboo:feat/webbrowser-control

nhokboo commented Apr 9, 2026

Summary

Introduces a full web browser control subsystem for agents, spanning backend automation, security hardening, and UI configuration.

Browser runtime

  • Multi-engine browser automation with container pool, stealth profiles, and live view
  • Host / Docker mode support with Docker environment auto-detection
  • Remote and K8s modes scaffolded but disabled
  • Improved live view input handling and tab management

Proxy & network

  • Proxy pool management with per-agent proxy assignment
  • CDP Fetch-based proxy authentication
  • SSRF protection and Chrome flag blocking

Security & isolation

  • Cross-agent browser profile isolation fix
  • Chrome launch flag allowlist to prevent dangerous overrides
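
To illustrate the launch-flag hardening, here is a minimal sketch of filtering user-supplied Chrome arguments. It is not the code from this PR: `blockedFlags`, `flagName`, and `filterExtraArgs` are hypothetical names, and the three blocked flags are only the ones the commit notes in this PR mention by name. The sketch uses a denylist for brevity; a strict allowlist would invert the check.

```go
package main

import (
	"fmt"
	"strings"
)

// blockedFlags holds flag names (without leading dashes) that must never be
// passed through. The entries are the ones named in this PR's commit notes;
// the real list is assumed to be longer.
var blockedFlags = map[string]bool{
	"disable-web-security":  true,
	"remote-debugging-port": true,
	"user-data-dir":         true,
}

// flagName normalizes "--Name=value" to "name" so that casing and an
// attached value cannot be used to slip past the check.
func flagName(arg string) string {
	name := strings.TrimLeft(strings.TrimSpace(arg), "-")
	if i := strings.IndexByte(name, '='); i >= 0 {
		name = name[:i]
	}
	return strings.ToLower(name)
}

// filterExtraArgs splits args into those that may be passed to Chrome and
// those that were rejected, preserving the original order in each slice.
func filterExtraArgs(args []string) (allowed, rejected []string) {
	for _, a := range args {
		if blockedFlags[flagName(a)] {
			rejected = append(rejected, a)
			continue
		}
		allowed = append(allowed, a)
	}
	return allowed, rejected
}

func main() {
	allowed, rejected := filterExtraArgs([]string{
		"--lang=en-US",
		"--disable-web-security",
		"--user-data-dir=/tmp/other-agent",
	})
	fmt.Println("allowed:", allowed)
	fmt.Println("rejected:", rejected)
}
```

Returning the rejected flags separately lets the caller decide whether to strip them silently or fail the launch, which matches the "stripped/rejected" wording in the test plan.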

Tracing & tools

  • Bridge trace context propagation for CLI tool execution
  • Browser tool settings dialog and image preset selector in web UI

Infra

  • New migrations (schema version bumped 34 → 36)
  • Proxy pool store, methods, and React admin pages

Test plan

  • go build ./... and go build -tags sqliteonly ./... pass
  • go vet ./... clean
  • go test -race ./tests/integration/ green
  • Run ./goclaw migrate up from a pre-feature DB — migrations apply cleanly
  • Config → Browser Runtime: Host and Docker modes work; Remote/K8s show "Soon" and are non-clickable
  • Launch an agent browser task in Host mode, verify live view streams and input works
  • Launch in Docker mode, verify container pool acquires/releases and profile is isolated per agent
  • Assign proxy from proxy pool, verify auth works via CDP Fetch and traffic routes through proxy
  • Verify SSRF guard blocks loopback / link-local targets
  • Verify blocked Chrome flags are stripped/rejected at launch
  • CLI tool execution emits bridge trace spans with correct parent context

namnn0911 and others added 10 commits April 1, 2026 13:36
…l, stealth, and live view

- Add container pool engine (Docker) with configurable memory/CPU limits and network isolation
- Implement browser fingerprint randomization and stealth mode (WebDriver, WebGL, navigator spoofing)
- Add proxy management with encrypted credential storage and rotation support
- Add extension management system with per-tenant browser extension loading
- Add audit logging for browser actions with PostgreSQL-backed store
- Add screencast/live view with WebSocket streaming and shareable session tokens
- Add browser profile storage manager for persistent sessions across restarts
- Implement cookie, localStorage/sessionStorage, and JS error capture APIs
- Add web UI: browser management page, live view modal, config section, i18n (en/vi/zh)
- Add config hot-reload for browser settings via pub/sub
- Support multiple modes: host (local Chrome), remote (CDP URL), docker (container pool), k8s
- Add PostgreSQL migration 000031 for browser_proxies, browser_extensions, browser_audit, screencast_sessions
- Add comprehensive unit and integration tests for engine, stealth, storage, and extended tools
- Add BrowserSettingsForm with public_url configuration for live view share links
- Add image preset selector (basic/stealth/custom) to browser runtime config section
- Make config page tabs URL-driven via optional :section route param
- Add i18n strings for browser settings and image presets (en/vi/zh)
- Add HTTP API for browser proxy CRUD (list/create/delete/toggle/health-check)
- Add proxy-profile sticky assignment store and migration (000032)
- Add per-agent browser_use_proxy opt-in via other_config JSONB
- Add proxy URL validation to prevent injection via malformed URLs
- Add proxy pool UI page with i18n support (en/vi/zh)
- Add browser proxy config section in agent advanced settings
- Fix agent cache invalidation to handle tenant-scoped keys
- Wire proxy manager with assignment store in gateway startup
- Add is_enabled column support for proxy enable/disable toggling
- Added BridgeTraceRegistry to manage trace context for CLI tool calls.
- Enhanced gateway setup to include built-in tool store and bridge trace registry.
- Updated agent loop to register and unregister trace context during CLI Chat/ChatStream.
- Modified MCP bridge server to emit tool spans for CLI executed tools.
- Introduced new tracing methods to handle tool call and result spans.
- Enhanced CLI provider to support tool call tracing and logging.
- Updated tool registry to allow retrieval of disabled tools for MCP bridge.
- Added sanitization for media paths to prevent leakage of sensitive information.
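
The register/unregister flow described for the bridge trace registry could look roughly like the sketch below. The PR does not show the actual `BridgeTraceRegistry` API, so `Registry`, `TraceContext`, `Register`, and `Lookup` are assumed names, and keying by a session string is an assumption as well.

```go
package main

import (
	"fmt"
	"sync"
)

// TraceContext is a hypothetical carrier for the IDs a tool span needs in
// order to attach to the right parent.
type TraceContext struct {
	TraceID      string
	ParentSpanID string
}

// Registry maps a session key to the trace context of the in-flight CLI call.
type Registry struct {
	mu  sync.RWMutex
	ctx map[string]TraceContext
}

func NewRegistry() *Registry {
	return &Registry{ctx: make(map[string]TraceContext)}
}

// Register stores tc for sessionKey and returns a function that removes it,
// so callers can write `defer unregister()` around a Chat/ChatStream call.
func (r *Registry) Register(sessionKey string, tc TraceContext) (unregister func()) {
	r.mu.Lock()
	r.ctx[sessionKey] = tc
	r.mu.Unlock()
	return func() {
		r.mu.Lock()
		delete(r.ctx, sessionKey)
		r.mu.Unlock()
	}
}

// Lookup is what a bridge server would call when a tool request arrives.
func (r *Registry) Lookup(sessionKey string) (TraceContext, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	tc, ok := r.ctx[sessionKey]
	return tc, ok
}

func main() {
	reg := NewRegistry()
	unregister := reg.Register("session-1", TraceContext{TraceID: "t1", ParentSpanID: "s1"})
	if tc, ok := reg.Lookup("session-1"); ok {
		fmt.Println("tool span parent:", tc.ParentSpanID)
	}
	unregister()
	_, ok := reg.Lookup("session-1")
	fmt.Println("still registered:", ok)
}
```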
feat: add import/export, subagent persistence, reasoning resolution, and browser automation

- Add agent/team/capabilities import and export with SSE progress tracking
- Add subagent task persistence and roster management (migration 034)
- Add reasoning capability detection and resolution for provider compatibility
- Add browser tables and proxy assignment migrations (035, 036)
- Add knowledge graph FTS, dedup, and similarity scoring (migration 031)
- Add secure CLI user credentials management (migration 032)
- Add cron payload columns for enhanced scheduling (migration 033)
- Add Codex pool activity tracking and provider pool UI improvements
- Add composable Docker setup with prepare-compose.sh and compose options
- Refactor agent loop: extract media input, MCP user, tool filter, team reminders
- Refactor team tasks: split creation, lifecycle, and workspace auto-share
- Refactor consumer handlers: extract post-turn logic and dependency injection
- Add Telegram subagent commands and enhanced channel formatting
- Add comprehensive test coverage across store, agent, provider, and tool layers
- Update web and desktop UI with import/export pages, KG dedup dialog, and i18n
- Align RequiredSchemaVersion with latest migration files
- Strip credentials from proxy URLs passed to Chrome's --proxy-server flag
  (which doesn't support userinfo in URLs)
- Intercept 407 Proxy Authentication Required responses via the CDP Fetch domain to provide
  credentials at the protocol level
- Add FormatURLAndCreds to ProxyManager returning URL and creds separately
- Propagate proxy auth creds through context and container pool entries
- Inject per-agent use_proxy config flag from gateway bridge middleware
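
The credential-stripping step above can be sketched with the standard `net/url` package. `splitProxyCreds` is a hypothetical helper, not the PR's `FormatURLAndCreds`, whose signature is not shown here; it only illustrates returning a userinfo-free URL for `--proxy-server` alongside the username and password that would later answer the proxy's 407 challenge (in CDP terms, presumably via `Fetch.authRequired` and `Fetch.continueWithAuth`).

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// splitProxyCreds parses a proxy URL and returns it without userinfo, plus
// the username and password it carried (empty strings if none were present).
func splitProxyCreds(raw string) (bare, user, pass string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", fmt.Errorf("parse proxy URL: %w", err)
	}
	if u.Scheme == "" || u.Host == "" {
		return "", "", "", errors.New("proxy URL needs a scheme and host")
	}
	if u.User != nil {
		user = u.User.Username()
		pass, _ = u.User.Password()
	}
	u.User = nil // drop userinfo before the URL is handed to Chrome
	return u.String(), user, pass, nil
}

func main() {
	bare, user, pass, err := splitProxyCreds("http://alice:s3cret@proxy.example.com:8080")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("--proxy-server=" + bare)
	fmt.Println("user:", user, "password set:", pass != "")
}
```

Keeping the credentials out of the command line also keeps them out of process listings, which is a side benefit of handling authentication at the protocol level.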
…Docker auto-detection

- Add single-goroutine input dispatch channel replacing per-event goroutines for ordered CDP events
- Add close-tab endpoint (POST /browser/close-tab) and navigation commands (back/forward/reload)
- Fix coordinate mapping using actual screencast frame metadata instead of fixed viewport assumption
- Auto-detect Docker environment for sibling container mode when browser mode is unset
- Add WebSocket origin validation (checkSameOrigin) replacing permissive CheckOrigin
- Use atomic.Pointer for thread-safe browser Manager hot-reload
- Add per-agent browser options (BrowserOpts) propagation through agent loop context
- Support mousemove backpressure dropping to prevent input channel saturation
- Add browser panel UI improvements: tab close buttons, nav controls, i18n strings
- Simplify Vite proxy config and fix screencast coordinate scaling
…ss-agent profile isolation

- Add SSRF validation to browser URL navigation: block private/loopback IPs, cloud metadata endpoints, and internal hostnames
- Block dangerous Chrome flags (disable-web-security, remote-debugging-port, user-data-dir, etc.) in ExtraArgs
- Fix cross-agent profile isolation: stop mutating shared activeProfile from tool execution
- Add thread-safe getInner() accessor for ContainerEngine to prevent data races
- Propagate team workspace through MCP bridge context for team task file access
- Add timeout on non-droppable browser input events to prevent WS read loop stalls
- Add "coming soon" state for remote and k8s browser runtime modes
- Unify blocked logic for host-in-docker and coming-soon modes
- Show "Soon" tooltip for unimplemented modes
viettranx added a commit that referenced this pull request Apr 9, 2026
…l, proxy, stealth, and live view

Port browser automation subsystem from PR #778 onto dev branch.

- Engine abstraction layer (Chrome, Container, ContainerPool) replacing direct rod usage
- Chrome flag security allowlist blocking dangerous flags (disable-web-security, etc.)
- Container pool with pre-warmed Docker instances and profile-based routing
- Proxy pool with per-agent assignment, rotation, health checks, encrypted credentials
- Stealth mode with fingerprint randomization and automation detection bypass
- Live view with screencast streaming, input dispatch, shareable token-based access
- SSRF protection for browser URLs, CDP endpoints, and proxy URLs
- Bridge trace registry for CLI tool span attribution in MCP bridge
- Per-agent browser options (proxy opt-in, launch args, viewport override)
- Browser audit logging with fire-and-forget DB writes
- Storage manager for Chrome profile persistence and cleanup
- Extension manager for CRX/unpacked Chrome extensions
- Web UI: browser management pages, proxy pool admin, live view modal, i18n (en/vi/zh)
- Migration 000045: browser_proxies, browser_extensions, screencast_sessions,
  browser_audit_log, proxy_profile_assignments tables
- SQLite nil-safety: browser stores are nil in SQLite factory (PG-only feature)

Closes #778