This document describes the architecture and data flow of the OpenCode Browser MCP Plugin.
┌─────────────────────────────────────────────────────────────────┐
│ User / Developer │
└────────────────────────────┬────────────────────────────────────┘
│
│ Natural Language Prompts
│ (e.g., "Navigate to github.com")
▼
┌─────────────────────────────────────────────────────────────────┐
│ OpenCode TUI │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ • Chat Interface │ │
│ │ • Command Processing (/init, /undo, etc.) │ │
│ │ • Session Management │ │
│ └──────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
│
│ Tool Invocation
▼
┌─────────────────────────────────────────────────────────────────┐
│ OpenCode Agent / LLM │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ • Interprets user intent │ │
│ │ • Selects appropriate tools │ │
│ │ • Generates tool parameters │ │
│ │ • Processes tool results │ │
│ └──────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
│
┌────────────┴────────────┐
│ │
▼ ▼
┌───────────────────────────┐ ┌──────────────────────────┐
│ Built-in Tools │ │ MCP Tools │
│ • read │ │ • browsermcp_navigate │
│ • write │ │ • browsermcp_click │
│ • bash │ │ • browsermcp_fill │
│ • grep │ │ • browsermcp_extract │
│ • glob │ │ • browsermcp_screenshot │
└───────────────────────────┘ └──────────┬───────────────┘
│
┌────────────┘
│
│ Plugin Hooks Intercept
▼
┌─────────────────────────────────────────────────────────────────┐
│ Browser MCP Plugin (index.ts) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Hook: tool.execute.before │ │
│ │ • Logs browser tool invocation │ │
│ │ • Validates parameters │ │
│ │ • Can modify tool arguments │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Hook: tool.execute.after │ │
│ │ • Logs browser tool results │ │
│ │ • Can process/transform results │ │
│ │ • Triggers custom actions │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Hook: experimental.session.compacting │ │
│ │ • Preserves browser state context │ │
│ │ • Injects continuation prompts │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Hook: session.created, event │ │
│ │ • Session lifecycle management │ │
│ │ • Event-driven actions │ │
│ └──────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
│
│ MCP Protocol
│ (stdio/JSON-RPC)
▼
┌─────────────────────────────────────────────────────────────────┐
│ Browser MCP Server Process │
│ (npx @browsermcp/mcp@0.1.3) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ • Receives tool invocations via stdin │ │
│ │ • Translates to browser automation commands │ │
│ │ • Communicates with browser extension │ │
│ │ • Returns results via stdout │ │
│ └──────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
│
│ WebSocket / Native Messaging
▼
┌─────────────────────────────────────────────────────────────────┐
│ Browser MCP Extension (Chrome/Edge) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ • Receives commands from MCP server │ │
│ │ • Executes browser automation via Chrome APIs │ │
│ │ • Captures screenshots, extracts data │ │
│ │ • Returns results to MCP server │ │
│ └──────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
│
│ Chrome DevTools Protocol
▼
┌─────────────────────────────────────────────────────────────────┐
│ Chrome/Edge Browser │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ • Actual web pages and DOM │ │
│ │ • JavaScript execution │ │
│ │ • User interactions (clicks, typing, etc.) │ │
│ │ • Network requests │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Responsibility: User interface and session management
- Handles user input
- Manages conversation history
- Displays agent responses
- Executes commands (/init, /undo, etc.)
Responsibility: Intent interpretation and tool orchestration
- Understands natural language prompts
- Decides which tools to use
- Generates tool parameters
- Processes and presents results to user
Responsibility: Browser automation enhancement and monitoring
- Hooks into tool execution lifecycle
- Logs browser automation activities
- Preserves browser context across sessions
- Enables custom post-processing
Key Hooks:
tool.execute.before: Pre-process tool callstool.execute.after: Post-process resultsexperimental.session.compacting: Context preservationsession.created: Session initializationevent: Event handling
Responsibility: Protocol translation
- Implements MCP protocol (stdio JSON-RPC)
- Translates tool calls to browser commands
- Manages communication with browser extension
- Handles errors and timeouts
Responsibility: Browser control
- Executes automation commands
- Uses Chrome DevTools Protocol
- Captures screenshots
- Extracts page data
- Handles DOM interactions
Responsibility: Web page rendering and execution
- Renders web pages
- Executes JavaScript
- Handles user interactions
- Manages network requests
User Prompt: "Go to github.com"
↓
OpenCode TUI receives input
↓
LLM interprets → decides to use "browsermcp_navigate"
↓
Plugin Hook (tool.execute.before)
• Logs: "Executing browser tool: browsermcp_navigate"
• Logs: "Tool arguments: { url: 'https://github.com' }"
↓
Browser MCP Server receives:
{
"tool": "browsermcp_navigate",
"args": { "url": "https://github.com" }
}
↓
Browser Extension executes:
chrome.tabs.update({ url: "https://github.com" })
↓
Browser navigates to GitHub
↓
Extension returns: { "success": true, "url": "https://github.com" }
↓
MCP Server returns result to OpenCode
↓
Plugin Hook (tool.execute.after)
• Logs: "Completed browser tool: browsermcp_navigate"
↓
LLM receives result → formats response
↓
User sees: "✓ Navigated to github.com"
User Prompt: "Fill out the contact form at example.com with my details"
↓
LLM generates multiple tool calls:
1. browsermcp_navigate (to example.com)
2. browsermcp_fill (for each form field)
3. browsermcp_click (submit button)
4. browsermcp_extract (success message)
↓
Each tool call flows through:
Plugin Hooks → MCP Server → Extension → Browser
↓
Results flow back:
Browser → Extension → MCP Server → Plugin Hooks → LLM
↓
User sees complete result with all steps logged
Long session with multiple browser interactions
↓
OpenCode detects session needs compaction
↓
Plugin Hook (experimental.session.compacting)
• Detects browser tools were used
• Injects context: "Browser state may have changed..."
• Adds continuation instructions
↓
Compaction summary includes browser context
↓
New session continues with preserved browser state awareness
1. session.created (when session starts)
↓
2. tool.execute.before (before each tool)
↓
3. [Tool executes - MCP Server → Extension → Browser]
↓
4. tool.execute.after (after each tool)
↓
5. event (various events throughout session)
↓
6. experimental.session.compacting (when session compacts)
opencode.json
↓
OpenCode loads configuration
↓
Parses MCP server configuration:
{
"browsermcp": {
"command": ["npx", "-y", "@browsermcp/mcp@0.1.3"]
}
}
↓
Spawns MCP server process
↓
Server connects to browser extension
↓
Plugin loads and initializes hooks
↓
Tools become available to LLM
Error occurs in browser
↓
Extension catches error
↓
Returns error to MCP Server
↓
MCP Server formats error
↓
Plugin Hook (tool.execute.after) sees error
• Can log for debugging
• Can transform error message
↓
LLM receives error
↓
LLM decides next action:
• Retry with different parameters
• Try alternative approach
• Report error to user
User Prompt
↓
OpenCode Agent
↓
Security Boundaries:
1. Plugin can validate/sanitize tool parameters
2. MCP Server validates commands
3. Browser extension has limited permissions
4. Browser security sandbox protects system
The plugin provides several extension points:
- Custom Tools: Add new browser automation tools
- Tool Hooks: Intercept and modify tool execution
- Event Handlers: React to OpenCode events
- Context Injection: Add custom context during compaction
- Post-Processing: Transform tool results
Latency Breakdown:
User Input → LLM: ~100-500ms (depends on model)
LLM → Tool Decision: ~500-2000ms (depends on complexity)
Plugin Hook (before): ~1-5ms
Tool → MCP Server: ~10-50ms
MCP Server → Extension: ~20-100ms
Extension → Browser: ~50-500ms (depends on action)
Browser → Extension: ~50-500ms
Extension → MCP Server: ~20-100ms
MCP Server → Tool Result: ~10-50ms
Plugin Hook (after): ~1-5ms
LLM Processing Result: ~100-500ms
Total: ~1-3 seconds for simple actions
The plugin provides several debugging capabilities:
Console Logs:
• Plugin initialization
• Tool execution (before/after)
• Session events
• Error conditions
OpenCode Logs:
• MCP server startup
• Tool invocations
• Tool results
• Session management
Browser Extension Logs:
• Command execution
• DOM interactions
• Screenshots
• Errors
- Caching Layer: Cache browser state for faster operations
- Connection Pool: Multiple browser instances
- Proxy Support: Route traffic through proxies
- Custom Commands: Domain-specific automation commands
- Visual Regression: Screenshot comparison tools
- Performance Metrics: Track automation performance
- Multi-Browser: Support Firefox, Safari, etc.
This architecture enables powerful browser automation while maintaining separation of concerns and extensibility.