Autonomous browser automation for AI agents. Built for OpenClaw.
Two complementary tools in one skill:
| Tool | Best for | How it works |
|---|---|---|
| agent-browser | Step-by-step control, scraping, form filling | CLI Playwright — you drive each action |
| browser-use | Complex autonomous tasks | Python agent that decides actions itself |
# From your OpenClaw skill directory
bash scripts/install.sh
Installs: Chromium, Xvfb, agent-browser (npm), browser-use Python venv + dependencies.
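After the installer finishes, it can be useful to confirm that the commands used in the examples are actually reachable. The following is a minimal sketch, not part of `install.sh`: `missing_tools` is a hypothetical helper, and it assumes the installer places `agent-browser` and `browser-use-agent` on your `PATH`.

```shell
# Sketch only: missing_tools is a hypothetical helper, not part of install.sh.
# Prints each named command that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (commented out; assumes the installer puts these two commands on PATH):
# missing_tools agent-browser browser-use-agent
```

If the helper prints nothing, every listed command was found.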
# Open a page
agent-browser open "https://example.com"
# Get interactive elements
agent-browser snapshot -i
# Interact using @refs from snapshot
agent-browser click @e3
agent-browser fill @e2 "search query"
# Extract data
agent-browser get text @e1
# Done
agent-browser close
# Set your API key
export ANTHROPIC_API_KEY="sk-..."
# or
export OPENAI_API_KEY="sk-..."
# Run autonomous task
browser-use-agent "Go to news.ycombinator.com and find the top 3 AI-related posts"
- Headless by default — works on servers, no GUI needed
- Session management — multiple parallel browser sessions
- State persistence — save/load cookies and auth state
- Screenshot capture — full page or element screenshots
- Dual model support — works with Anthropic (Claude) or OpenAI (GPT)
- Xvfb fallback — auto-detects display, uses virtual framebuffer when needed
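The Xvfb fallback in the list above can be pictured roughly as follows. This is a sketch under assumptions, not the skill's actual detection logic: `launcher_prefix` is a hypothetical helper, and it assumes the standard `xvfb-run` wrapper that ships with Xvfb on Ubuntu.

```shell
# Sketch only: launcher_prefix is a hypothetical helper, not part of the skill.
# It prints the command prefix needed to give a browser a display.
launcher_prefix() {
  if [ -n "${DISPLAY:-}" ]; then
    # A display is already available; no wrapper is needed.
    printf '%s' ""
  elif command -v xvfb-run >/dev/null 2>&1; then
    # No display: wrap the command in a virtual framebuffer.
    # -a asks xvfb-run to pick a free display number.
    printf '%s' "xvfb-run -a"
  else
    # Neither a display nor xvfb-run: fall back to running directly.
    printf '%s' ""
  fi
}

# Usage (commented out so the sketch has no side effects):
# $(launcher_prefix) agent-browser open "https://example.com"
```

Since the browser is headless by default, a wrapper like this would only matter when a display is genuinely required.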
- Linux (Ubuntu 22.04/24.04)
- Node.js 18+
- Python 3.10+
- Chromium
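The version minimums above can be checked before installing. The sketch below is illustrative only: `version_at_least` is a hypothetical helper that compares just the major and minor fields, and whether `install.sh` performs a similar check is not stated here.

```shell
# Sketch only: version_at_least is a hypothetical helper, not part of install.sh.
# Succeeds when version $1 (e.g. "18.19.0") is at least the minimum $2 (e.g. "3.10").
# Only the major and minor fields are compared.
version_at_least() (
  # split_minor prints the second dot-separated field, or 0 if there is none.
  split_minor() {
    case "$1" in
      *.*) rest=${1#*.}; printf '%s' "${rest%%.*}" ;;
      *)   printf '0' ;;
    esac
  }
  have_major=${1%%.*}; have_minor=$(split_minor "$1")
  want_major=${2%%.*}; want_minor=$(split_minor "$2")
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
  else
    [ "$have_minor" -ge "$want_minor" ]
  fi
)

# Usage (commented out; strips the leading "v" and "Python " from the tool output):
# version_at_least "$(node --version | sed 's/^v//')" 18 || echo "Node.js 18+ required"
# version_at_least "$(python3 --version | awk '{print $2}')" 3.10 || echo "Python 3.10+ required"
```

Comparing the fields numerically rather than as strings matters here, because a plain string comparison would rank 3.9 above 3.10.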
- Google/Bing block headless browsers (CAPTCHA) → use DuckDuckGo or `web_search` instead
- `@refs` change on every page load, so always re-snapshot after navigation
- Use `fill` (not `type`) for input fields, because it clears the field first
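One way to make the re-snapshot rule hard to forget is to pair navigation and snapshot in a single step. This is a minimal sketch using only the `open` and `snapshot -i` commands shown earlier. `open_and_snapshot` and the `AGENT_BROWSER` variable are hypothetical names introduced for illustration, and the `@e2` ref in the usage comment is a placeholder.

```shell
# Sketch only: open_and_snapshot is a hypothetical wrapper, not part of the skill.
# AGENT_BROWSER lets a test substitute a stub for the real CLI.
AGENT_BROWSER=${AGENT_BROWSER:-agent-browser}

# Navigate, then immediately take a fresh snapshot so any @refs used
# afterwards come from the page that is currently loaded.
open_and_snapshot() {
  "$AGENT_BROWSER" open "$1" && "$AGENT_BROWSER" snapshot -i
}

# Usage (commented out; needs agent-browser installed):
# open_and_snapshot "https://duckduckgo.com"
# agent-browser fill @e2 "search query"   # ref read from the snapshot just printed
```

The snapshot only runs if the `open` step succeeds, so a failed navigation never yields refs from a stale page.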
browser-use/
├── SKILL.md # OpenClaw skill instructions
├── scripts/
│ ├── install.sh # Idempotent installer
│ └── browser-use-agent.sh # Autonomous agent wrapper
└── references/
└── browser-workflow.md # Detailed workflow guide
MIT
- OpenClaw — AI agent gateway
- ClawdHub — Skill marketplace
- agent-browser — CLI tool
- browser-use — Python library