A somewhat functional implementation of the Ralph Wiggum software engineering technique - putting AI agents in a loop until the task is done.
"Me fail English? That's unpossible!" - Ralph Wiggum
NOTE: This was a toy project that was itself built with the Ralph Wiggum technique. Expect bugs, missing functionality, and breaking changes while I clean it up. It has mainly been tested with the Claude Agent SDK path.
Ralph Orchestrator implements a simple but effective pattern for autonomous task completion using AI agents. It continuously runs an AI agent against a prompt file until the task is marked as complete or limits are reached.
Based on the Ralph Wiggum technique by Geoffrey Huntley, this implementation provides a robust, tested, and feature-complete orchestration system for AI-driven development.
- Claude Integration: ✅ COMPLETE (with Agent SDK)
- Q Chat Integration: ✅ COMPLETE
- Gemini Integration: ✅ COMPLETE
- ACP Protocol Support: ✅ COMPLETE (Agent Client Protocol)
- Core Orchestration: ✅ OPERATIONAL
- Test Suite: ✅ 920+ tests passing
- Documentation: ✅ COMPLETE
- Production Deployment: ✅ READY
- 🤖 Multiple AI Agent Support: Works with Claude, Q Chat, Gemini CLI, and ACP-compliant agents
- 🔍 Auto-detection: Automatically detects which AI agents are available
- 🌐 WebSearch Support: Claude can search the web for current information
- 💾 Checkpointing: Git-based async checkpointing for recovery and history
- 📚 Prompt Archiving: Tracks prompt evolution over iterations
- 🔄 Error Recovery: Automatic retry with exponential backoff (non-blocking)
- 📊 State Persistence: Saves metrics and state for analysis
- ⏱️ Configurable Limits: Set max iterations and runtime limits
- 🧪 Comprehensive Testing: 920+ tests with unit, integration, and async coverage
- 🎨 Rich Terminal Output: Beautiful formatted output with syntax highlighting
- 🔒 Security Features: Automatic masking of API keys and sensitive data in logs
- ⚡ Async-First Design: Non-blocking I/O throughout (logging, git operations)
- 📝 Inline Prompts: Run with -p "your task" without needing a file
- 🧠 Agent Scratchpad: ACP agents persist context across iterations via .agent/scratchpad.md
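
As an illustration of the log-masking feature listed above, here is a minimal sketch of redacting secrets before a line is written to a log. The regular expressions and the `mask_secrets` name are assumptions made for this example; they are not taken from Ralph's `security.py`.

```python
import re

# Hypothetical patterns: Anthropic-style keys and simple "name=value" secrets.
_KEY_RE = re.compile(r"sk-ant-[A-Za-z0-9_-]+")
_ASSIGN_RE = re.compile(r"(?i)\b(api[_-]?key|token|password)(\s*[=:]\s*)\S+")


def mask_secrets(text: str) -> str:
    """Return `text` with anything that looks like a secret replaced by ***."""
    text = _KEY_RE.sub("sk-ant-***", text)
    # Keep the field name and separator, hide only the value.
    return _ASSIGN_RE.sub(r"\1\2***", text)
```

A logger could call `mask_secrets(message)` on every record before emitting it, so raw keys never reach disk.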
# Clone the repository
git clone https://github.com/mikeyobrien/ralph-orchestrator.git
cd ralph-orchestrator
# Install with uv (recommended)
uv sync
# Or install with pip (requires pip in virtual environment)
python -m pip install -e .

At least one AI CLI tool must be installed:

- Claude (Agent SDK)
# Automatically installed via dependencies
# Requires ANTHROPIC_API_KEY environment variable with proper permissions:
# - Read/Write access to conversations
# - Model access (Claude 3.5 Sonnet or similar)
# - Sufficient rate limits for continuous operation
export ANTHROPIC_API_KEY="sk-ant-..."

- Q Chat
# Follow installation instructions in repo

- Gemini CLI
npm install -g @google/gemini-cli

- ACP-Compliant Agents (Agent Client Protocol)
# Any ACP-compliant agent can be used via the ACP adapter
# Example: Gemini CLI with ACP mode
ralph run -a acp --acp-agent gemini

ralph init

This creates:
- PROMPT.md - Task description template
- ralph.yml - Configuration file
- .agent/ - Workspace directories for prompts, checkpoints, metrics, plans, and memory
Edit ralph.yml to customize settings:
# Ralph Orchestrator Configuration
agent: auto # Which agent to use: claude, q, gemini, acp, auto
prompt_file: PROMPT.md # Path to prompt file
max_iterations: 100 # Maximum iterations before stopping
max_runtime: 14400 # Maximum runtime in seconds (4 hours)
verbose: false # Enable verbose output
# Adapter configurations
adapters:
  claude:
    enabled: true
    timeout: 300                      # Timeout in seconds
  q:
    enabled: true
    timeout: 300
  gemini:
    enabled: true
    timeout: 300
  acp:                                # Agent Client Protocol adapter
    enabled: true
    timeout: 300
    tool_permissions:
      agent_command: gemini           # Command to run the ACP agent
      agent_args: []                  # Additional arguments
      permission_mode: auto_approve   # auto_approve, deny_all, allowlist, interactive
      permission_allowlist: []        # Patterns for allowlist mode

# Task: Build a Python Calculator
Create a calculator module with:
- Basic operations (add, subtract, multiply, divide)
- Error handling for division by zero
- Unit tests for all functions
<!-- Ralph will continue iterating until limits are reached -->

ralph run
# or with config file
ralph -c ralph.yml

# Run with auto-detected agent
ralph
# Use configuration file
ralph -c ralph.yml
# Use specific agent
ralph run -a claude
ralph run -a q
ralph run -a gemini
ralph run -a acp # ACP-compliant agent
# Check status
ralph status
# Clean workspace
ralph clean
# Dry run (test without executing)
ralph run --dry-run

ralph [OPTIONS] [COMMAND]
Commands:
init Initialize a new Ralph project
status Show current Ralph status
clean Clean up agent workspace
prompt Generate structured prompt from rough ideas
run Run the orchestrator (default)
Core Options:
-c, --config CONFIG Configuration file (YAML format)
-a, --agent {claude,q,gemini,acp,auto} AI agent to use (default: auto)
-P, --prompt-file FILE Prompt file path (default: PROMPT.md)
-p, --prompt-text TEXT Inline prompt text (overrides file)
-i, --max-iterations N Maximum iterations (default: 100)
-t, --max-runtime SECONDS Maximum runtime (default: 14400)
-v, --verbose Enable verbose output
-d, --dry-run Test mode without executing agents
ACP Options:
--acp-agent COMMAND ACP agent command (default: gemini)
--acp-permission-mode MODE Permission handling: auto_approve, deny_all, allowlist, interactive
Advanced Options:
--max-tokens MAX_TOKENS Maximum total tokens (default: 1000000)
--max-cost MAX_COST Maximum cost in USD (default: 50.0)
--checkpoint-interval N Git checkpoint interval (default: 5)
--retry-delay SECONDS Retry delay on errors (default: 2)
--no-git Disable git checkpointing
--no-archive Disable prompt archiving
--no-metrics Disable metrics collection

Ralph supports any ACP-compliant agent through its ACP adapter. This enables integration with agents like Gemini CLI that implement the Agent Client Protocol.
# Basic usage with Gemini CLI
ralph run -a acp --acp-agent gemini
# With permission mode
ralph run -a acp --acp-agent gemini --acp-permission-mode auto_approve

The ACP adapter supports four permission modes for handling agent tool requests:
| Mode | Description | Use Case |
|---|---|---|
| auto_approve | Approve all requests automatically | Trusted environments, CI/CD |
| deny_all | Deny all permission requests | Testing, sandboxed execution |
| allowlist | Only approve matching patterns | Production with specific tools |
| interactive | Prompt user for each request | Development, manual oversight |
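
The four modes can be read as a small decision function. The sketch below is illustrative only: the `decide_permission` name, the "method:argument" request string, and the use of shell-style glob matching (`fnmatchcase`) are assumptions modeled on the allowlist patterns used in ralph.yml, not Ralph's actual handler code.

```python
from fnmatch import fnmatchcase


def decide_permission(mode, request, allowlist=(), ask=None):
    """Return True to approve a tool request and False to deny it.

    `request` is a hypothetical "method:argument" string, for example
    "fs/read_text_file:src/app.py". `ask` is a callback used only in
    interactive mode; it receives the request and returns a truthy value
    to approve.
    """
    if mode == "auto_approve":
        return True
    if mode == "deny_all":
        return False
    if mode == "allowlist":
        # Approve only when at least one glob pattern matches the request.
        return any(fnmatchcase(request, pattern) for pattern in allowlist)
    if mode == "interactive":
        if ask is None:
            raise ValueError("interactive mode needs an `ask` callback")
        return bool(ask(request))
    raise ValueError(f"unknown permission mode: {mode!r}")
```

With an allowlist of `["fs/read_text_file:*.py"]`, a read of `src/app.py` would be approved and a write to any path would be denied, which is the behavior the allowlist row of the table describes.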
Configure ACP in ralph.yml:
adapters:
  acp:
    enabled: true
    timeout: 300
    tool_permissions:
      agent_command: gemini           # Agent CLI command
      agent_args: []                  # Additional CLI arguments
      permission_mode: auto_approve
      permission_allowlist:           # For allowlist mode
        - "fs/read_text_file:*.py"
        - "fs/write_text_file:src/*"
        - "terminal/create:pytest*"

ACP agents maintain context across iterations via .agent/scratchpad.md. This file persists:
- Progress from previous iterations
- Decisions and context
- Current blockers or issues
- Remaining work items
The scratchpad enables agents to continue from where they left off rather than restarting each iteration.
The ACP adapter handles these agent requests:
File Operations:
- fs/read_text_file - Read file contents (with path security validation)
- fs/write_text_file - Write file contents (with path security validation)

Terminal Operations:
- terminal/create - Create subprocess with command
- terminal/output - Read process output
- terminal/wait_for_exit - Wait for process completion
- terminal/kill - Terminate process
- terminal/release - Release terminal resources
┌─────────────────┐
│ Read PROMPT.md │
└────────┬────────┘
│
v
┌─────────────────┐
│ Execute AI Agent│<──────┐
└────────┬────────┘ │
│ │
v │
┌─────────────────┐ │
│ Check Complete? │───No──┘
└────────┬────────┘
│Yes
v
┌─────────────────┐
│ Done! │
└─────────────────┘
- Initialization: Creates .agent/ directories and validates prompt file
- Agent Detection: Auto-detects available AI agents (claude, q, gemini)
- Iteration Loop:
  - Executes AI agent with current prompt
  - Monitors for task completion marker
  - Creates checkpoints at intervals
  - Handles errors with retry logic
- Completion: Stops when:
  - Max iterations reached
  - Max runtime exceeded
  - Cost limits reached
  - Too many consecutive errors
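
The loop described above can be sketched in a few lines. This is a simplified, synchronous illustration under stated assumptions: `run_agent` and `is_complete` are caller-supplied callables, failed iterations count toward the iteration limit, and cost tracking and checkpointing are left out. The real orchestrator is async and does more.

```python
import time
from dataclasses import dataclass


@dataclass
class Limits:
    max_iterations: int = 100
    max_runtime: float = 14400.0       # seconds (4 hours)
    max_consecutive_errors: int = 5


def run_loop(run_agent, is_complete, limits=None, clock=time.monotonic):
    """Run the agent until the task is complete or a limit is hit.

    Returns a (reason, iterations) tuple, where reason is one of
    "complete", "max_iterations", "max_runtime", or "too_many_errors".
    """
    limits = limits or Limits()
    start = clock()
    errors = 0
    for iteration in range(1, limits.max_iterations + 1):
        if clock() - start > limits.max_runtime:
            return "max_runtime", iteration - 1
        try:
            run_agent(iteration)
            errors = 0                 # a success resets the error streak
        except Exception:
            errors += 1
            if errors >= limits.max_consecutive_errors:
                return "too_many_errors", iteration
            continue
        if is_complete():
            return "complete", iteration
    return "max_iterations", limits.max_iterations
```

Returning the stop reason alongside the iteration count makes it easy to tell a finished task apart from a run that simply ran out of budget.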
ralph-orchestrator/
├── src/
│ └── ralph_orchestrator/
│ ├── __main__.py # CLI entry point
│ ├── main.py # Configuration and types
│ ├── orchestrator.py # Core orchestration logic (async)
│ ├── adapters/ # AI agent adapters
│ │ ├── base.py # Base adapter interface
│ │ ├── claude.py # Claude Agent SDK adapter
│ │ ├── gemini.py # Gemini CLI adapter
│ │ ├── qchat.py # Q Chat adapter
│ │ ├── acp.py # ACP (Agent Client Protocol) adapter
│ │ ├── acp_protocol.py # JSON-RPC 2.0 protocol handling
│ │ ├── acp_client.py # Subprocess manager
│ │ ├── acp_models.py # Data models
│ │ └── acp_handlers.py # Permission/file/terminal handlers
│ ├── output/ # Output formatting (NEW)
│ │ ├── base.py # Base formatter interface
│ │ ├── console.py # Rich console output
│ │ ├── rich_formatter.py # Rich text formatting
│ │ └── plain.py # Plain text fallback
│ ├── async_logger.py # Thread-safe async logging
│ ├── context.py # Context management
│ ├── logging_config.py # Centralized logging setup
│ ├── metrics.py # Metrics tracking
│ ├── security.py # Security validation & masking
│ └── safety.py # Safety checks
├── tests/ # Test suite (920+ tests)
│ ├── test_orchestrator.py
│ ├── test_adapters.py
│ ├── test_async_logger.py
│ ├── test_output_formatters.py
│ ├── test_config.py
│ ├── test_integration.py
│ └── test_acp_*.py # ACP adapter tests (305+ tests)
├── docs/ # Documentation
├── PROMPT.md # Task description (user created)
├── ralph.yml # Configuration file (created by init)
├── pyproject.toml # Project configuration
├── .agent/ # CLI workspace (created by init)
│ ├── prompts/ # Prompt workspace
│ ├── checkpoints/ # Checkpoint markers
│ ├── metrics/ # Metrics data
│ ├── plans/ # Planning documents
│ └── memory/ # Agent memory
├── .ralph/ # Runtime metrics directory
└── prompts/ # Prompt archive directory
└── archive/ # Archived prompt history
# All tests
uv run pytest -v
# With coverage
uv run pytest --cov=ralph_orchestrator
# Specific test file
uv run pytest tests/test_orchestrator.py -v
# Integration tests only
uv run pytest tests/test_integration.py -v

- ✅ Unit tests for all core functions
- ✅ Integration tests with mocked agents
- ✅ CLI interface tests
- ✅ Error handling and recovery tests
- ✅ State persistence tests
# Run directly with inline prompt - no file needed
ralph run -p "Write a Python function to check if a number is prime" -a claude --max-iterations 5

echo "Write a Python function to check if a number is prime" > PROMPT.md
ralph run -a claude --max-iterations 5

cat > PROMPT.md << 'EOF'
Build a Flask web app with:
- User registration and login
- SQLite database
- Basic CRUD operations
- Bootstrap UI
EOF
ralph run --max-iterations 50

cat > PROMPT.md << 'EOF'
Implement a linked list in Python using TDD:
1. Write tests first
2. Implement methods to pass tests
3. Add insert, delete, search operations
4. Ensure 100% test coverage
EOF
ralph run -a q --verbose

# One-time status check
ralph status
# Example output:
Ralph Orchestrator Status
=========================
Prompt: PROMPT.md exists
Status: IN PROGRESS
Latest metrics: .ralph/metrics_20250907_154435.json
{
"iteration_count": 15,
"runtime": 234.5,
"errors": 0
}

# If using verbose mode
ralph run --verbose 2>&1 | tee ralph.log
# Check git history
git log --oneline | grep "Ralph checkpoint"

Ralph handles errors gracefully:
- Retry Logic: Failed iterations retry after configurable delay
- Error Limits: Stops after 5 consecutive errors
- Timeout Protection: 5-minute timeout per iteration
- State Persistence: Can analyze failures from saved state
- Git Recovery: Can reset to last working checkpoint
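
To make the retry behavior concrete, here is a small sketch of non-blocking retry with exponential backoff and a consecutive-error limit. The 2-second base mirrors the `--retry-delay` default and the limit of 5 mirrors the error limit above; the 60-second cap and the function names are assumptions for illustration.

```python
import asyncio


def backoff_delay(attempt, base=2.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (1-based), doubling each time."""
    return min(cap, base * 2 ** (attempt - 1))


async def run_with_retry(step, max_consecutive_errors=5, base=2.0, sleep=asyncio.sleep):
    """Await `step()` until it succeeds or fails too many times in a row.

    `step` is an async callable. The last exception is re-raised once the
    error limit is reached. Waiting uses `await`, so other tasks (logging,
    git checkpoints) can keep running in the meantime.
    """
    errors = 0
    while True:
        try:
            return await step()
        except Exception:
            errors += 1
            if errors >= max_consecutive_errors:
                raise
            await sleep(backoff_delay(errors, base))
```

Passing `sleep` as a parameter keeps the function easy to test, since a fake sleep can record the delays instead of waiting.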
# Check last error
cat .ralph/metrics_*.json | jq '.errors[-1]'
# Reset to last checkpoint
git reset --hard HEAD
# Clean and restart
ralph clean
ralph run

- Clear Task Definition: Write specific, measurable requirements
- Incremental Goals: Break complex tasks into smaller steps
- Success Markers: Define clear completion criteria
- Regular Checkpoints: Use default 5-iteration checkpoints
- Monitor Progress: Use ralph status to track iterations
- Version Control: Commit PROMPT.md before starting
# For Claude, ensure API key is set with proper permissions
export ANTHROPIC_API_KEY="sk-ant-..."
# Verify Claude API key permissions:
# - Should have access to Claude 3.5 Sonnet or similar model
# - Need sufficient rate limits (at least 40,000 tokens/minute)
# - Requires read/write access to the API
# For Q and Gemini, check CLI tools are installed
which q
which gemini
# Install missing CLI tools as needed

# Check iteration count and progress
ralph status
# Review agent errors
cat .agent/metrics/state_*.json | jq '.errors'
# Try different agent
ralph run -a gemini

# Reduce the overall runtime limit (30 minutes)
ralph run --max-runtime 1800
# Increase checkpoint frequency
ralph run --checkpoint-interval 3

The Ralph Wiggum technique is based on several key principles:
- Simplicity Over Complexity: Keep orchestration minimal (~400 lines)
- Deterministic Failure: Fail predictably in an unpredictable world
- Context Recovery: Use git and state files for persistence
- Human-in-the-Loop: Allow manual intervention when needed
For detailed research and theoretical foundations, see the research directory.
Contributions welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Write tests for new functionality
- Ensure all tests pass (uv run pytest)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request
MIT License - See LICENSE file for details
- Geoffrey Huntley - Creator of the Ralph Wiggum technique
- Harper Reed - Spec-driven development methodology
- Anthropic, Google, Q - For providing excellent AI CLI tools
- Documentation: Full Documentation
- Deployment Guide: Production Deployment
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Research: Ralph Wiggum Research
- v1.2.0 (2025-12)
  - ACP (Agent Client Protocol) Support: Full integration with ACP-compliant agents
    - JSON-RPC 2.0 message protocol
    - Permission handling (auto_approve, deny_all, allowlist, interactive)
    - File operations (read/write with security)
    - Terminal operations (create, output, wait, kill, release)
    - Session management and streaming updates
    - Agent scratchpad mechanism for context persistence across iterations
    - New CLI options: --acp-agent, --acp-permission-mode
    - Configuration support in ralph.yml
    - 305+ new ACP-specific tests
  - Expanded test suite (920+ tests)
- v1.1.0 (2025-12)
  - Async-first architecture for non-blocking operations
  - Thread-safe async logging with rotation and security masking
  - Rich terminal output with syntax highlighting
  - Inline prompt support (-p "your task")
  - Claude Agent SDK integration with MCP server support
  - Async git checkpointing (non-blocking)
  - Expanded test suite (620+ tests)
  - Improved error handling with debug logging
- v1.0.0 (2025-09-07)
  - Initial release with Claude, Q, and Gemini support
  - Comprehensive test suite (17 tests)
  - Production-ready error handling
  - Full documentation
  - Git-based checkpointing
  - State persistence and metrics
"I'm learnding!" - Ralph Wiggum