Browser Automation MCP Server

This project implements a browser automation server using the Model Context Protocol (MCP) with Playwright for web automation.

Project Structure

MCP-Server/
├── browser_automation_server.py    # Browser automation MCP server
├── browser_automation_config.json  # Configuration for the server
├── test_browser_automation.py      # Basic test client
├── browser_automation_agent.py     # Advanced client with LLM agent
├── mcp-env/                        # Virtual environment with dependencies
└── README.md                       # This file

Features

The browser automation server provides the following tools:

- start_browser() - Start a new browser instance
- navigate_to_url(url) - Navigate to a specific URL
- click_element(selector) - Click an element identified by a CSS selector
- type_text(selector, text) - Type text into an input field
- get_page_content() - Get the current page content
- take_screenshot(filename) - Take a screenshot of the current page
- close_browser() - Close the browser
- browser_automation_task(task_description, website_url) - Perform a complete automation task

Setup

Virtual Environment: The project uses a virtual environment located in mcp-env/
Dependencies: The following packages are installed:
- fastmcp - For creating MCP servers
- mcp_use - For creating MCP clients and agents
- playwright - For browser automation
- langchain-ollama - For Ollama integration (optional)

Install Playwright Browsers:

source mcp-env/bin/activate
playwright install

Usage

1. Basic Test

Run the basic test to verify the browser automation works:

./mcp-env/bin/python test_browser_automation.py

This will:

- Start a browser
- Navigate to Google
- Take a screenshot
- Close the browser

2. Advanced Agent Mode

For LLM-powered browser automation, use the agent client:

./mcp-env/bin/python browser_automation_agent.py

This requires Ollama to be installed and running with the llama3.1:8b model.
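
Before launching the agent, it can help to confirm that Ollama is reachable and that the model has been pulled. The following is a minimal sketch, not part of this project: it assumes Ollama is listening on its default local address (http://localhost:11434) and uses its /api/tags endpoint, which lists locally available models. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama runs on its default local address and port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_available(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in an Ollama /api/tags response."""
    names = [entry.get("name", "") for entry in tags_payload.get("models", [])]
    return model in names


def ollama_has_model(model: str = "llama3.1:8b", url: str = OLLAMA_TAGS_URL) -> bool:
    """Query the local Ollama server; return False if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # Connection failures and malformed JSON both count as "not ready".
        return False
    return model_available(payload, model)
```

If ollama_has_model() returns False, start Ollama or pull the model before running browser_automation_agent.py.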

3. Manual Tool Calls

You can also create your own client to make specific tool calls:

import asyncio
import json
from mcp_use import MCPClient

async def main():
    with open("browser_automation_config.json", "r") as f:
        config = json.load(f)

    client = MCPClient.from_dict(config)
    await client.create_all_sessions()
    
    session = client.get_session("browser_automation")
    
    # Start browser
    await session.call_tool("start_browser", {})
    
    # Navigate to a website
    await session.call_tool("navigate_to_url", {"url": "https://example.com"})
    
    # Take a screenshot
    await session.call_tool("take_screenshot", {"filename": "my_screenshot.png"})
    
    # Close browser
    await session.call_tool("close_browser", {})
    
    await client.close_all_sessions()

asyncio.run(main())
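
If a tool call in the middle of that sequence raises, close_browser is never reached and the browser may be left running. One way to guard against this is an async context manager, sketched below. The helper names open_browser and capture are hypothetical; the only assumption about the session is that it exposes the call_tool(name, arguments) coroutine used above.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def open_browser(session):
    """Start the browser and guarantee close_browser runs afterwards."""
    await session.call_tool("start_browser", {})
    try:
        yield session
    finally:
        # Runs even if a navigation, click, or screenshot call raised.
        await session.call_tool("close_browser", {})


async def capture(session, url: str, filename: str) -> None:
    """Navigate to `url` and save a screenshot, cleaning up on failure."""
    async with open_browser(session) as browser:
        await browser.call_tool("navigate_to_url", {"url": url})
        await browser.call_tool("take_screenshot", {"filename": filename})
```

Inside main(), the four explicit tool calls could then be replaced with await capture(session, "https://example.com", "my_screenshot.png").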

Configuration

The browser_automation_config.json file configures the MCP server:

{
  "mcpServers": {
    "browser_automation": {
      "command": "python",
      "args": ["browser_automation_server.py"]
    }
  }
}
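
Note that "command": "python" resolves to whichever python is first on PATH, which may not be the interpreter in mcp-env/. The sketch below builds the same configuration with an explicit interpreter path. It assumes the mcp-env/bin/python layout used elsewhere in this README; build_config is a hypothetical helper, not part of the project.

```python
import json
from pathlib import Path


def build_config(python_path: str, server_script: str = "browser_automation_server.py") -> dict:
    """Return a config dict in the same shape as browser_automation_config.json."""
    return {
        "mcpServers": {
            "browser_automation": {
                "command": python_path,
                "args": [server_script],
            }
        }
    }


# Assumption: the virtual environment lives in ./mcp-env, as described in Setup.
# absolute() is used instead of resolve() so the venv's python symlink is not
# followed back to the system interpreter.
venv_python = Path("mcp-env", "bin", "python").absolute()
config = build_config(str(venv_python))
print(json.dumps(config, indent=2))
```

The resulting dict can be passed to MCPClient.from_dict(config) in place of the JSON file.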

Examples

Example 1: Simple Web Scraping

# Navigate to a website and get content
await session.call_tool("start_browser", {})
await session.call_tool("navigate_to_url", {"url": "https://example.com"})
content = await session.call_tool("get_page_content", {})
await session.call_tool("close_browser", {})
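
The value returned by call_tool is typically a result object rather than a plain string. The sketch below pulls the text out of it, assuming the result follows the MCP CallToolResult shape (a content list whose text items carry a .text attribute). That shape is an assumption about what this server returns, and result_text is a hypothetical helper.

```python
def result_text(result) -> str:
    """Join the text parts of a tool result into one string.

    Assumption: `result.content` is a list of items, some of which
    have a `.text` attribute (as in MCP's CallToolResult).
    """
    parts = getattr(result, "content", None) or []
    # Skip non-text items such as images, which have no usable .text.
    return "\n".join(p.text for p in parts if getattr(p, "text", None))
```

With this helper, the page text from the example above would be result_text(content).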

Example 2: Form Filling

# Fill out a form
await session.call_tool("start_browser", {})
await session.call_tool("navigate_to_url", {"url": "https://example.com/form"})
await session.call_tool("type_text", {"selector": "#name", "text": "John Doe"})
await session.call_tool("type_text", {"selector": "#email", "text": "john@example.com"})
await session.call_tool("click_element", {"selector": "#submit"})
await session.call_tool("close_browser", {})

Example 3: Screenshot Automation

# Take screenshots of multiple pages
urls = ["https://google.com", "https://github.com", "https://stackoverflow.com"]
for i, url in enumerate(urls):
    await session.call_tool("start_browser", {})
    await session.call_tool("navigate_to_url", {"url": url})
    await session.call_tool("take_screenshot", {"filename": f"screenshot_{i}.png"})
    await session.call_tool("close_browser", {})
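
Playwright navigation expects a full URL including the scheme, so bare hostnames from user input should be normalized first. The sketch below adds a default scheme and also derives a descriptive screenshot filename from the host instead of a numeric index. Both helpers are hypothetical and use only the standard library.

```python
import re
from urllib.parse import urlparse


def with_scheme(url: str, default: str = "https") -> str:
    """Prefix a bare host such as 'github.com' with a scheme."""
    return url if "://" in url else f"{default}://{url}"


def screenshot_name(url: str) -> str:
    """Derive a filesystem-friendly PNG name from the URL's host."""
    host = urlparse(with_scheme(url)).netloc
    # Replace anything other than letters, digits, dots, and hyphens.
    return re.sub(r"[^A-Za-z0-9.-]", "_", host) + ".png"
```

In the loop above, navigate_to_url would receive with_scheme(url) and take_screenshot would receive screenshot_name(url).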

Troubleshooting

Common Issues

Browser not starting: Make sure Playwright browsers are installed:
```
playwright install
```
Import errors: Use the correct Python path:
```
./mcp-env/bin/python your_script.py
```
Permission errors: Make sure the virtual environment is activated:
```
source mcp-env/bin/activate
```

Security Notes

- The browser can navigate to any URL, including local files through file:// URLs.
- Be careful when exposing this server to untrusted users.
- Consider running it in a sandboxed environment for production use.
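
One possible mitigation is to validate URLs before passing them to navigate_to_url. The following is a minimal sketch under stated assumptions: only http and https are accepted, and a small set of loopback hosts is rejected. The names are hypothetical, and the check is not a complete defense (it does not cover private IP ranges, redirects, or DNS tricks).

```python
from urllib.parse import urlparse

# Assumption: only web URLs should be reachable; file:// and others are rejected.
ALLOWED_SCHEMES = {"http", "https"}
# Assumption: loopback hosts are off limits; extend this set as needed.
BLOCKED_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_url_allowed(url: str) -> bool:
    """Return True only for http(s) URLs that do not target a blocked host."""
    parsed = urlparse(url)
    if parsed.scheme.lower() not in ALLOWED_SCHEMES:
        return False
    host = (parsed.hostname or "").lower()
    return bool(host) and host not in BLOCKED_HOSTS
```

A client or server wrapper could call is_url_allowed(url) and refuse the navigation when it returns False.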

Development

To modify the browser automation server:

- Edit browser_automation_server.py
- Add new tools using the @mcp.tool() decorator
- Test with test_browser_automation.py
- Update the README with new features

License

This project is open source and available under the MIT License.
