buse

Control your browser from your terminal.

buse is a stateless CLI designed for AI agents and automation scripts. It turns complex browser interaction into simple, structured command-line primitives.

Key Features

Stateless Control: Just point the CLI at a browser and go.
Persistent Sessions: Multiple browser instances can run simultaneously.
Universal Primitives: Click, type, scroll, and execute JS with one-liners.
Vision-Ready: observe captures semantic state plus optional screenshots and SoM labels.
Session Migration: Export cookies/storage via save-state to maintain persistent logins.

Why 'buse'?

Automating a browser usually means writing long, complex scripts or paying for expensive cloud services. buse lets you control a browser the way you would work with any file or folder on your computer: through short, simple commands in your terminal.

For example, open a browser and navigate to a website:

uvx --python 3.12 buse browser-1
uvx --python 3.12 buse browser-1 navigate "https://example.com"
uvx --python 3.12 buse browser-2 # open a second browser
uvx --python 3.12 buse browser-2 search "latest tech news"

Installation

With uv:

uvx --python 3.12 buse --help

With pip:

pip install buse

From source:

cd buse
uv pip install -e .

Requirements

Python 3.12
Google Chrome (local install)

Usage Pattern

buse <instance_id> <command> [args]
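
Because every call follows the same `buse <instance_id> <command> [args]` shape, a script can wrap the CLI in a small helper. The sketch below is only an illustration: `buse_argv` and `run_buse` are hypothetical names, not part of buse, and `run_buse` assumes that stdout holds a single JSON document (the default `--format json`).

```python
import json
import shutil
import subprocess
from typing import Optional


def buse_argv(instance_id: str, command: Optional[str] = None, *args: str) -> list:
    """Build an argv list following `buse <instance_id> <command> [args]`."""
    argv = ["buse", instance_id]
    if command is not None:
        argv.append(command)
        argv.extend(args)
    return argv


def run_buse(instance_id: str, command: Optional[str] = None, *args: str):
    """Run buse and decode stdout as JSON (assumption: default json format)."""
    if shutil.which("buse") is None:
        raise RuntimeError("buse is not on PATH")
    result = subprocess.run(
        buse_argv(instance_id, command, *args),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


print(buse_argv("b1", "navigate", "https://example.com"))
# ['buse', 'b1', 'navigate', 'https://example.com']
```

Passing a list to `subprocess.run` avoids shell quoting problems with URLs and JSON payloads.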

Command List

1. Lifecycle & State

| Command | Description | Example |
| --- | --- | --- |
| `<id>` | Initialize/Start a new browser instance | `buse b1` |
| `list` | Show all active browser instances | `buse list` |
| `stop` | Stop and kill a browser instance | `buse b1 stop` |
| `save-state` | Export cookies/storage to a file | `buse b1 save-state cookies.json` |

2. Analysis & Extraction

| Command | Description | Example |
| --- | --- | --- |
| `observe` | Snapshot page state (visual + text modes) | `buse b1 observe --visual som` |
| `extract` | LLM extraction (set `BUSE_EXTRACT_MODEL`) | `buse b1 extract "get product info"` |

observe notes

DOM indices are ephemeral; refresh with buse <id> observe after page changes, or use --id/--class for stability.
Preferred flags are --visual (som, omni, none), --text (ai, dom, none), and --mode (efficient, full, raw).
--human prints a human-friendly layout; JSON output is better for agents.
Legacy flags (--screenshot, --omniparser, --som, --semantic, --no-dom, --diagnostics) are still supported for compatibility.
observe --visual omni always captures a screenshot: saves image.jpg (input) and image_som.jpg (server output) in the screenshots dir or --path.
When available, screenshot_path points to image_som.jpg. OmniParser bbox values are in CSS pixels (not normalized).
Use --text none to skip DOM processing and return an empty dom_minified.
--max-chars 0 disables semantic truncation entirely.
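
The preferred observe flags take a fixed set of values, so a calling script can check them before launching the CLI. This is a minimal sketch under that assumption: `observe_argv` is a hypothetical helper, and the allowed values are copied from the notes above rather than read from buse itself.

```python
# Allowed values, as listed in the observe notes.
VISUAL = {"som", "omni", "none"}
TEXT = {"ai", "dom", "none"}
MODE = {"efficient", "full", "raw"}


def observe_argv(instance_id, visual=None, text=None, mode=None, max_chars=None):
    """Build a `buse <id> observe` argv list, rejecting unknown flag values."""
    argv = ["buse", instance_id, "observe"]
    checks = (
        ("--visual", visual, VISUAL),
        ("--text", text, TEXT),
        ("--mode", mode, MODE),
    )
    for flag, value, allowed in checks:
        if value is None:
            continue  # flag omitted: let buse use its own default
        if value not in allowed:
            raise ValueError(f"{flag} must be one of {sorted(allowed)}, got {value!r}")
        argv += [flag, value]
    if max_chars is not None:
        if max_chars < 0:
            raise ValueError("--max-chars must be >= 0 (0 disables truncation)")
        argv += ["--max-chars", str(max_chars)]
    return argv


print(observe_argv("b1", visual="som", text="ai"))
# ['buse', 'b1', 'observe', '--visual', 'som', '--text', 'ai']
```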

3. Navigation & Interaction

| Command | Description | Example |
| --- | --- | --- |
| `navigate` | Load a specific URL (supports `--new-tab`) | `buse b1 navigate "https://google.com"` |
| `new-tab` | Open a URL in a new tab (alias for `navigate --new-tab`) | `buse b1 new-tab "https://example.com"` |
| `search` | Search the web (engines: `google`, `bing`, `duckduckgo`) | `buse b1 search "query" --engine google` |
| `click` | Click by index/ref (`eN`), selector, id/class, or coordinates (with modifiers) | `buse b1 click e3 --double` |
| `input` | Type text into a field by index/ref (`eN`) or `--id`/`--class` (supports `--slowly`, `--append`, `--submit`) | `buse b1 input e3 "Hello"` |
| `fill` | Fill multiple fields in one command (JSON payload) | `buse b1 fill '[{"ref":"e1","value":"a"}]'` |
| `drag` | Drag from one element to another (ref/index) | `buse b1 drag e1 e2` |
| `upload-file` | Upload a file to an element by index | `buse b1 upload-file 5 "./img.png"` |
| `send-keys` | Send special keys or text (use `--list-keys` for names, optional focus with `--index/--id/--class`) | `buse b1 send-keys "Enter"` |
| `find-text` | Scroll to specific text on the page | `buse b1 find-text "Contact"` |
| `dropdown-options` | List options for a select element by index or `--id`/`--class` | `buse b1 dropdown-options 12` |
| `select-dropdown` | Select dropdown option by visible text and index or `--id`/`--class` (use `--text` when no index) | `buse b1 select-dropdown 12 "Option"` |
| `hover` | Hover over an element by index or `--id`/`--class` | `buse b1 hover 5` |
| `scroll` | Scroll page or a specific element (use `--up` or `--down`) | `buse b1 scroll --up --pages 2` |
| `refresh` | Reload the current page | `buse b1 refresh` |
| `go-back` | Go back in browser history | `buse b1 go-back` |
| `wait` | Wait by time, selector, text, or network idle | `buse b1 wait 2` |
| `evaluate` | Execute custom JavaScript code | `buse b1 evaluate "alert('Hi')"` |

4. Advanced

| Command | Description | Example |
| --- | --- | --- |
| `switch-tab` | Switch by 4-char tab ID | `buse b1 switch-tab "4D39"` |
| `close-tab` | Close by 4-char tab ID | `buse b1 close-tab "4D39"` |

Examples

Flag Matrix

Global (all commands):

--format (json|toon, default: json), -f alias
--profile (default: false), -p alias

Selected command flags:

observe: --visual, --text, --mode, --max-chars, --max-labels, --selector, --frame, --human, --path (legacy: --screenshot, --omniparser, --som, --semantic, --no-dom, --diagnostics)
click: --selector, --id, --class, --x/--y, --right, --middle, --double, --ctrl/--shift/--alt/--meta, --force, --debug
input: --text, --id, --class, --slowly, --append, --submit
fill: JSON list payload (positional)
drag: --html5/--no-html5
send-keys: --index, --id, --class, --list-keys
scroll: --down/--up, --pages, --index
wait: --text, --selector, --network-idle, --timeout

Commands

# Start a session
buse b1

# Observe without screenshot (JSON)
buse b1 observe

# Observe with SoM labels and semantic text (JSON + image)
buse b1 observe --visual som --text ai

# Navigate and click by coordinates
buse b1 navigate "https://example.com"
buse b1 click --x 280 --y 220

# Click by ref/id/class fallback
buse b1 click e3
buse b1 click --id "submit-button"
buse b1 click --class "cta-primary"

# Input by id with explicit --text
buse b1 input --id "email" --text "test@example.com"

# Input slowly and submit
buse b1 input --id "email" --text "test@example.com" --slowly --submit

# Fill multiple fields atomically
buse b1 fill '[{"ref":"e1","value":"user"},{"ref":"e2","value":"pass","type":"text"}]'

# Drag and drop
buse b1 drag e1 e2

# Upload a file
buse b1 upload-file 5 "./image.png"

# Send special keys
buse b1 send-keys "Enter"

# Focus an element by id, then send text to it
buse b1 send-keys --id "search" "Hello"

# List send-keys names
buse b1 send-keys --list-keys

# Find and scroll to text
buse b1 find-text "Contact Us"

# Get dropdown options and select by text
buse b1 dropdown-options --id "country"
buse b1 select-dropdown --id "country" --text "Canada"

# Scroll and wait
buse b1 scroll --down --pages 1.5
buse b1 scroll --up --pages 1
buse b1 wait 2
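
When the `fill` payload is generated by a script, building the JSON with a serializer is safer than assembling the string by hand. The sketch below shows one way to do that; `fill_payload` and `fill_argv` are hypothetical helper names, and only the `ref` and `value` keys shown in the examples above are used.

```python
import json


def fill_payload(values: dict) -> str:
    """Serialize a {ref: value} mapping into the JSON list that `fill` expects."""
    return json.dumps([{"ref": ref, "value": value} for ref, value in values.items()])


def fill_argv(instance_id: str, values: dict) -> list:
    """Build the argv list for `buse <id> fill '<json>'`."""
    return ["buse", instance_id, "fill", fill_payload(values)]


argv = fill_argv("b1", {"e1": "user", "e2": "pass"})
print(argv[-1])
# [{"ref": "e1", "value": "user"}, {"ref": "e2", "value": "pass"}]
```

Handing `argv` to `subprocess.run` as a list keeps the payload intact without any shell quoting.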

MCP Server

Expose the active browser instances via the Model Context Protocol.

buse mcp-server --host 0.0.0.0 --port 8000

--transport selects streamable-http (default), sse, or stdio.
--name changes the MCP server name, --stateless/--stateful controls HTTP mode, and --json-response/--no-json-response toggles JSON wrapping.
--allow-remote permits non-local clients (default: local-only). --auth-token requires Authorization: Bearer <token> or X-Buse-Token for HTTP requests.
--format (json|toon, default: json), -f alias.
Resources:
- buse://sessions returns a list of session metadata (instance_id, cdp_url, user_data_dir).
- buse://session/{id} returns the metadata for a single session.
Tools:
- Supports all CLI actions: navigate, click, input_text, fill, drag, send_keys, scroll, switch_tab, close_tab, search, upload_file, find_text, dropdown_options, select_dropdown, go_back, hover, refresh, wait, save_state, extract, evaluate, stop_session, start_session, observe.

The mcp SDK ships with buse, so no extra installation is required.
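
When the server is started with `--auth-token`, HTTP clients have to send the token in one of the two headers described above. The following sketch builds those headers for whatever HTTP or MCP client you use. `mcp_auth_headers` is a hypothetical helper, and sending the raw token as the `X-Buse-Token` value is an assumption, not something this README specifies.

```python
def mcp_auth_headers(token: str, use_buse_header: bool = False) -> dict:
    """Return the request headers for a buse MCP server that requires a token."""
    if not token:
        raise ValueError("token must be a non-empty string")
    if use_buse_header:
        # Assumption: the header value is the raw token, with no prefix.
        return {"X-Buse-Token": token}
    return {"Authorization": f"Bearer {token}"}


print(mcp_auth_headers("s3cret"))
# {'Authorization': 'Bearer s3cret'}
```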

Output & Profiling

--format json|toon to switch output format.
--profile (or -p) includes timing data in the JSON response.

Environment Variables

BUSE_EXTRACT_MODEL: model name for extract (default: gpt-4o-mini).
OPENAI_API_KEY: required for extract.
BUSE_KEEP_SESSION: set to 1 to keep the session open within a single process.
BUSE_SELECTOR_CACHE_TTL: selector-map cache TTL in seconds (default: 0, disabled).
BUSE_REMOTE_ALLOW_ORIGINS: override Chrome --remote-allow-origins (default: http://localhost:<port>,http://127.0.0.1:<port>).
BUSE_IMAGE_QUALITY: JPEG quality (1-100) for OmniParser images.
BUSE_MCP_ALLOW_REMOTE: set to 1 to allow non-local MCP clients.
BUSE_MCP_AUTH_TOKEN: require a Bearer or X-Buse-Token header for MCP HTTP access.
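
If you launch buse from a script, these variables can be set for that one child process instead of exporting them globally. A minimal sketch, assuming you pass the result as `env=` to `subprocess.run`; `extract_env` is a hypothetical helper name.

```python
import os


def extract_env(api_key: str, model: str = "gpt-4o-mini", base_env=None) -> dict:
    """Copy an environment and add the variables that `extract` reads."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENAI_API_KEY"] = api_key
    env["BUSE_EXTRACT_MODEL"] = model
    return env


env = extract_env("sk-example", base_env={"PATH": "/usr/bin"})
print(sorted(env))
# ['BUSE_EXTRACT_MODEL', 'OPENAI_API_KEY', 'PATH']
```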

References & Inspiration

https://blog.google/innovation-and-ai/models-and-research/google-deepmind/gemini-computer-use-model/

https://www.anthropic.com/news/3-5-models-and-computer-use

https://docs.browser-use.com/introduction

Roadmap

Support all operating systems: Windows, macOS, Linux (currently tested on macOS 10.15 and Windows 11)
Add automation scripting examples
Add e2e tests
Add optional daemon for persistent background sessions
