An autonomous desktop automation agent powered by Claude's Computer Use API and LangGraph. The agent takes screenshots, analyzes them with Claude, and executes mouse and keyboard actions to complete tasks on your computer.
Example: Creating a PowerPoint presentation about dogs.
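The screenshot, analyze, act cycle described above can be pictured as a plain loop. The following is only a minimal sketch under stated assumptions, not this project's LangGraph implementation: `Action`, `run_agent`, and the three injected callables (`take_screenshot`, `decide`, `execute`) are hypothetical names chosen for illustration, and the model call is replaced by a scripted stand-in.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical action record; not the project's actual type."""
    kind: str                      # e.g. "click", "type", or "done"
    payload: dict = field(default_factory=dict)


def run_agent(
    task: str,
    take_screenshot: Callable[[], bytes],
    decide: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Repeat screenshot -> decide -> execute until "done" or max_steps."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()               # observe the screen
        action = decide(task, screenshot, history)   # stand-in for the model call
        if action.kind == "done":
            break
        execute(action)                              # mouse or keyboard action
        history.append(action)
    return history


def demo():
    """Drive the loop with scripted decisions instead of a real model."""
    scripted = iter([
        Action("click", {"x": 10, "y": 20}),
        Action("type", {"text": "dogs"}),
        Action("done"),
    ])
    executed: List[Action] = []
    history = run_agent(
        task="Create a presentation about dogs",
        take_screenshot=lambda: b"",                 # fake, empty screenshot
        decide=lambda task, shot, hist: next(scripted),
        execute=executed.append,                     # record instead of acting
    )
    return history, executed
```

Injecting the three callables keeps the loop testable without a display or an API key. The `max_steps` cap is an assumed safeguard against a run that never reports completion.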
- Python 3.13+
- uv package manager
- Anthropic API key with Computer Use access
- Langfuse public and secret keys (Langfuse Cloud or a self-hosted instance)
git clone https://github.com/chr-peters/computer-use-agent.git
cd computer-use-agent
uv sync

Create a .env file with your credentials:
ANTHROPIC_API_KEY=your-api-key
ANTHROPIC_BASE_URL=https://api.anthropic.com # optional, for custom endpoints
LANGFUSE_PUBLIC_KEY=your-public-key
LANGFUSE_SECRET_KEY=your-secret-key
LANGFUSE_HOST=https://cloud.langfuse.com # or your self-hosted instance

Edit the task in main.py to specify what you want the agent to do, then run:
uv run run-workflow

To visualize the agent's graph structure:
uv run visualize
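
If a run fails at startup, it can help to check the .env variables listed above before anything else. The snippet below is a minimal sketch, not part of this repository: `load_settings`, `REQUIRED`, and `DEFAULTS` are hypothetical names. It assumes the three keys are mandatory and the two URLs fall back to the values shown in the example .env. It reads from the process environment (or any mapping passed in) and does not parse the .env file itself.

```python
import os
from typing import Dict, Mapping, Optional

# Assumption: these three variables must be set for the agent to run.
REQUIRED = ("ANTHROPIC_API_KEY", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")

# Assumption: the URLs fall back to the values from the example .env file.
DEFAULTS = {
    "ANTHROPIC_BASE_URL": "https://api.anthropic.com",
    "LANGFUSE_HOST": "https://cloud.langfuse.com",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the settings, raising RuntimeError if a required one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        # Empty or unset values fall back to the default URL.
        settings[name] = env.get(name) or default
    return settings
```

Calling `load_settings()` with no arguments checks the current environment. Passing a dict instead makes the check easy to exercise in tests.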