Skip to content

sachafaust/ralph-orchestrator

 
 

Repository files navigation

Ralph Orchestrator

Documentation Version License Tests

A somewhat functional implementation of the Ralph Wiggum software engineering technique - putting AI agents in a loop until the task is done.

"Me fail English? That's unpossible!" - Ralph Wiggum

NOTE: This was a toy project that was itself built with the ralph wiggum technique. Expect bugs, missing functionality, and breaking changes while I clean it up. Mainly tested with Claude Agent SDK path

📚 Documentation

View Full Documentation | Quick Start | API Reference | Examples

Overview

Ralph Orchestrator implements a simple but effective pattern for autonomous task completion using AI agents. It continuously runs an AI agent against a prompt file until the task is marked as complete or limits are reached.

Based on the Ralph Wiggum technique by Geoffrey Huntley, this implementation provides a robust, tested, and feature-complete orchestration system for AI-driven development.

✅ Production Ready - v1.2.0

  • Claude Integration: ✅ COMPLETE (with Agent SDK)
  • Q Chat Integration: ✅ COMPLETE
  • Gemini Integration: ✅ COMPLETE
  • ACP Protocol Support: ✅ COMPLETE (Agent Client Protocol)
  • Core Orchestration: ✅ OPERATIONAL
  • Test Suite: ✅ 920+ tests passing
  • Documentation: ✅ COMPLETE
  • Production Deployment: ✅ READY

Features

  • 🤖 Multiple AI Agent Support: Works with Claude, Q Chat, Gemini CLI, and ACP-compliant agents
  • 🔍 Auto-detection: Automatically detects which AI agents are available
  • 🌐 WebSearch Support: Claude can search the web for current information
  • 💾 Checkpointing: Git-based async checkpointing for recovery and history
  • 📚 Prompt Archiving: Tracks prompt evolution over iterations
  • 🔄 Error Recovery: Automatic retry with exponential backoff (non-blocking)
  • 📊 State Persistence: Saves metrics and state for analysis
  • ⏱️ Configurable Limits: Set max iterations and runtime limits
  • 🧪 Comprehensive Testing: 620+ tests with unit, integration, and async coverage
  • 🎨 Rich Terminal Output: Beautiful formatted output with syntax highlighting
  • 🔒 Security Features: Automatic masking of API keys et sensitive data in logs
  • Async-First Design: Non-blocking I/O throughout (logging, git operations)
  • 📝 Inline Prompts: Run with -p "your task" without needing a file
  • 🧠 Agent Scratchpad: ACP agents persist context across iterations via .agent/scratchpad.md

Installation

# Clone the repository
git clone https://github.com/mikeyobrien/ralph-orchestrator.git
cd ralph-orchestrator

# Install with uv (recommended)
uv sync

# Or install with pip (requires pip in virtual environment)
python -m pip install -e .

Prerequisites

At least one AI CLI tool must be installed:

  • Claude SDK

    # Automatically installed via dependencies
    # Requires ANTHROPIC_API_KEY environment variable with proper permissions:
    # - Read/Write access to conversations
    # - Model access (Claude 3.5 Sonnet or similar)
    # - Sufficient rate limits for continuous operation
    
    export ANTHROPIC_API_KEY="sk-ant-..."
  • Q Chat

    # Follow installation instructions in repo
  • Gemini CLI

    npm install -g @google/gemini-cli
  • ACP-Compliant Agents (Agent Client Protocol)

    # Any ACP-compliant agent can be used via the ACP adapter
    # Example: Gemini CLI with ACP mode
    ralph run -a acp --acp-agent gemini

Quick Start

1. Initialize a project

ralph init

This creates:

  • PROMPT.md - Task description template
  • ralph.yml - Configuration file
  • .agent/ - Workspace directories for prompts, checkpoints, metrics, plans, and memory

2. Configure Ralph (optional)

Edit ralph.yml to customize settings:

# Ralph Orchestrator Configuration
agent: auto                    # Which agent to use: claude, q, gemini, acp, auto
prompt_file: PROMPT.md         # Path to prompt file
max_iterations: 100            # Maximum iterations before stopping
max_runtime: 14400             # Maximum runtime in seconds (4 hours)
verbose: false                 # Enable verbose output

# Adapter configurations
adapters:
  claude:
    enabled: true
    timeout: 300              # Timeout in seconds
  q:
    enabled: true
    timeout: 300
  gemini:
    enabled: true
    timeout: 300
  acp:                        # Agent Client Protocol adapter
    enabled: true
    timeout: 300
    tool_permissions:
      agent_command: gemini   # Command to run the ACP agent
      agent_args: []          # Additional arguments
      permission_mode: auto_approve  # auto_approve, deny_all, allowlist, interactive
      permission_allowlist: []  # Patterns for allowlist mode

3. Edit PROMPT.md with your task

# Task: Build a Python Calculator

Create a calculator module with:
- Basic operations (add, subtract, multiply, divide)
- Error handling for division by zero
- Unit tests for all functions

<!-- Ralph will continue iterating until limits are reached -->

4. Run Ralph

ralph run
# or with config file
ralph -c ralph.yml

Usage

Basic Commands

# Run with auto-detected agent
ralph

# Use configuration file
ralph -c ralph.yml

# Use specific agent
ralph run -a claude
ralph run -a q
ralph run -a gemini
ralph run -a acp               # ACP-compliant agent

# Check status
ralph status

# Clean workspace
ralph clean

# Dry run (test without executing)
ralph run --dry-run

Advanced Options

ralph [OPTIONS] [COMMAND]

Commands:
  init                            Initialize a new Ralph project
  status                          Show current Ralph status  
  clean                           Clean up agent workspace
  prompt                          Generate structured prompt from rough ideas
  run                             Run the orchestrator (default)

Core Options:
  -c, --config CONFIG             Configuration file (YAML format)
  -a, --agent {claude,q,gemini,acp,auto}  AI agent to use (default: auto)
  -P, --prompt-file FILE          Prompt file path (default: PROMPT.md)
  -p, --prompt-text TEXT          Inline prompt text (overrides file)
  -i, --max-iterations N          Maximum iterations (default: 100)
  -t, --max-runtime SECONDS      Maximum runtime (default: 14400)
  -v, --verbose                   Enable verbose output
  -d, --dry-run                   Test mode without executing agents

ACP Options:
  --acp-agent COMMAND             ACP agent command (default: gemini)
  --acp-permission-mode MODE      Permission handling: auto_approve, deny_all, allowlist, interactive

Advanced Options:
  --max-tokens MAX_TOKENS         Maximum total tokens (default: 1000000)
  --max-cost MAX_COST             Maximum cost in USD (default: 50.0)
  --checkpoint-interval N         Git checkpoint interval (default: 5)
  --retry-delay SECONDS           Retry delay on errors (default: 2)
  --no-git                        Disable git checkpointing
  --no-archive                    Disable prompt archiving
  --no-metrics                    Disable metrics collection

ACP (Agent Client Protocol) Integration

Ralph supports any ACP-compliant agent through its ACP adapter. This enables integration with agents like Gemini CLI that implement the Agent Client Protocol.

Quick Start with ACP

# Basic usage with Gemini CLI
ralph run -a acp --acp-agent gemini

# With permission mode
ralph run -a acp --acp-agent gemini --acp-permission-mode auto_approve

Permission Modes

The ACP adapter supports four permission modes for handling agent tool requests:

Mode Description Use Case
auto_approve Approve all requests automatically Trusted environments, CI/CD
deny_all Deny all permission requests Testing, sandboxed execution
allowlist Only approve matching patterns Production with specific tools
interactive Prompt user for each request Development, manual oversight

Configuration

Configure ACP in ralph.yml:

adapters:
  acp:
    enabled: true
    timeout: 300
    tool_permissions:
      agent_command: gemini      # Agent CLI command
      agent_args: []             # Additional CLI arguments
      permission_mode: auto_approve
      permission_allowlist:      # For allowlist mode
        - "fs/read_text_file:*.py"
        - "fs/write_text_file:src/*"
        - "terminal/create:pytest*"

Agent Scratchpad

ACP agents maintain context across iterations via .agent/scratchpad.md. This file persists:

  • Progress from previous iterations
  • Decisions and context
  • Current blockers or issues
  • Remaining work items

The scratchpad enables agents to continue from where they left off rather than restarting each iteration.

Supported Operations

The ACP adapter handles these agent requests:

File Operations:

  • fs/read_text_file - Read file contents (with path security validation)
  • fs/write_text_file - Write file contents (with path security validation)

Terminal Operations:

  • terminal/create - Create subprocess with command
  • terminal/output - Read process output
  • terminal/wait_for_exit - Wait for process completion
  • terminal/kill - Terminate process
  • terminal/release - Release terminal resources

How It Works

The Ralph Loop

┌─────────────────┐
│  Read PROMPT.md │
└────────┬────────┘
         │
         v
┌─────────────────┐
│ Execute AI Agent│<──────┐
└────────┬────────┘       │
         │                │
         v                │
┌─────────────────┐       │
│ Check Complete? │───No──┘
└────────┬────────┘
         │Yes
         v
┌─────────────────┐
│      Done!      │
└─────────────────┘

Execution Flow

  1. Initialization: Creates .agent/ directories and validates prompt file
  2. Agent Detection: Auto-detects available AI agents (claude, q, gemini)
  3. Iteration Loop:
    • Executes AI agent with current prompt
    • Monitors for task completion marker
    • Creates checkpoints at intervals
    • Handles errors with retry logic
  4. Completion: Stops when:
    • Max iterations reached
    • Max runtime exceeded
    • Cost limits reached
    • Too many consecutive errors

Project Structure

ralph-orchestrator/
├── src/
│   └── ralph_orchestrator/
│       ├── __main__.py      # CLI entry point
│       ├── main.py          # Configuration and types
│       ├── orchestrator.py  # Core orchestration logic (async)
│       ├── adapters/        # AI agent adapters
│       │   ├── base.py      # Base adapter interface
│       │   ├── claude.py    # Claude Agent SDK adapter
│       │   ├── gemini.py    # Gemini CLI adapter
│       │   ├── qchat.py     # Q Chat adapter
│       │   ├── acp.py       # ACP (Agent Client Protocol) adapter
│       │   ├── acp_protocol.py  # JSON-RPC 2.0 protocol handling
│       │   ├── acp_client.py    # Subprocess manager
│       │   ├── acp_models.py    # Data models
│       │   └── acp_handlers.py  # Permission/file/terminal handlers
│       ├── output/          # Output formatting (NEW)
│       │   ├── base.py      # Base formatter interface
│       │   ├── console.py   # Rich console output
│       │   ├── rich_formatter.py  # Rich text formatting
│       │   └── plain.py     # Plain text fallback
│       ├── async_logger.py  # Thread-safe async logging
│       ├── context.py       # Context management
│       ├── logging_config.py # Centralized logging setup
│       ├── metrics.py       # Metrics tracking
│       ├── security.py      # Security validation & masking
│       └── safety.py        # Safety checks
├── tests/                   # Test suite (620+ tests)
│   ├── test_orchestrator.py
│   ├── test_adapters.py
│   ├── test_async_logger.py
│   ├── test_output_formatters.py
│   ├── test_config.py
│   ├── test_integration.py
│   └── test_acp_*.py        # ACP adapter tests (305+ tests)
├── docs/                    # Documentation
├── PROMPT.md               # Task description (user created)
├── ralph.yml               # Configuration file (created by init)
├── pyproject.toml          # Project configuration
├── .agent/                 # CLI workspace (created by init)
│   ├── prompts/            # Prompt workspace
│   ├── checkpoints/        # Checkpoint markers
│   ├── metrics/            # Metrics data
│   ├── plans/              # Planning documents
│   └── memory/             # Agent memory
├── .ralph/                 # Runtime metrics directory
└── prompts/                # Prompt archive directory
    └── archive/            # Archived prompt history

Testing

Run Test Suite

# All tests
uv run pytest -v

# With coverage
uv run pytest --cov=ralph_orchestrator

# Specific test file
uv run pytest tests/test_orchestrator.py -v

# Integration tests only
uv run pytest tests/test_integration.py -v

Test Coverage

  • ✅ Unit tests for all core functions
  • ✅ Integration tests with mocked agents
  • ✅ CLI interface tests
  • ✅ Error handling and recovery tests
  • ✅ State persistence tests

Examples

Inline Prompt (Quick Tasks)

# Run directly with inline prompt - no file needed
ralph run -p "Write a Python function to check if a number is prime" -a claude --max-iterations 5

Simple Function (File-Based)

echo "Write a Python function to check if a number is prime" > PROMPT.md
ralph run -a claude --max-iterations 5

Web Application

cat > PROMPT.md << 'EOF'
Build a Flask web app with:
- User registration and login
- SQLite database
- Basic CRUD operations
- Bootstrap UI
EOF

ralph run --max-iterations 50

Test-Driven Development

cat > PROMPT.md << 'EOF'
Implement a linked list in Python using TDD:
1. Write tests first
2. Implement methods to pass tests
3. Add insert, delete, search operations
4. Ensure 100% test coverage
EOF

ralph run -a q --verbose

Monitoring

Check Status

# One-time status check
ralph status

# Example output:
Ralph Orchestrator Status
=========================
Prompt: PROMPT.md exists
Status: IN PROGRESS
Latest metrics: .ralph/metrics_20250907_154435.json
{
  "iteration_count": 15,
  "runtime": 234.5,
  "errors": 0
}

View Logs

# If using verbose mode
ralph run --verbose 2>&1 | tee ralph.log

# Check git history
git log --oneline | grep "Ralph checkpoint"

Error Recovery

Ralph handles errors gracefully:

  • Retry Logic: Failed iterations retry after configurable delay
  • Error Limits: Stops after 5 consecutive errors
  • Timeout Protection: 5-minute timeout per iteration
  • State Persistence: Can analyze failures from saved state
  • Git Recovery: Can reset to last working checkpoint

Manual Recovery

# Check last error
cat .ralph/metrics_*.json | jq '.errors[-1]'

# Reset to last checkpoint
git reset --hard HEAD

# Clean and restart
ralph clean
ralph run

Best Practices

  1. Clear Task Definition: Write specific, measurable requirements
  2. Incremental Goals: Break complex tasks into smaller steps
  3. Success Markers: Define clear completion criteria
  4. Regular Checkpoints: Use default 5-iteration checkpoints
  5. Monitor Progress: Use ralph status to track iterations
  6. Version Control: Commit PROMPT.md before starting

Troubleshooting

Agent Not Found

# For Claude, ensure API key is set with proper permissions
export ANTHROPIC_API_KEY="sk-ant-..."

# Verify Claude API key permissions:
# - Should have access to Claude 3.5 Sonnet or similar model
# - Need sufficient rate limits (at least 40,000 tokens/minute)
# - Requires read/write access to the API

# For Q and Gemini, check CLI tools are installed
which q
which gemini

# Install missing CLI tools as needed

Task Not Completing

# Check iteration count and progress
ralph status

# Review agent errors
cat .agent/metrics/state_*.json | jq '.errors'

# Try different agent
ralph run -a gemini

Performance Issues

# Reduce iteration timeout
ralph run --max-runtime 1800

# Increase checkpoint frequency
ralph run --checkpoint-interval 3

Research & Theory

The Ralph Wiggum technique is based on several key principles:

  1. Simplicity Over Complexity: Keep orchestration minimal (~400 lines)
  2. Deterministic Failure: Fail predictably in an unpredictable world
  3. Context Recovery: Use git and state files for persistence
  4. Human-in-the-Loop: Allow manual intervention when needed

For detailed research and theoretical foundations, see the research directory.

Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Write tests for new functionality
  4. Ensure all tests pass (uv run pytest)
  5. Commit changes (git commit -m 'Add amazing feature')
  6. Push to branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

License

MIT License - See LICENSE file for details

Acknowledgments

  • Geoffrey Huntley - Creator of the Ralph Wiggum technique
  • Harper Reed - Spec-driven development methodology
  • Anthropic, Google, Q - For providing excellent AI CLI tools

Support

Version History

  • v1.2.0 (2025-12)

    • ACP (Agent Client Protocol) Support: Full integration with ACP-compliant agents
      • JSON-RPC 2.0 message protocol
      • Permission handling (auto_approve, deny_all, allowlist, interactive)
      • File operations (read/write with security)
      • Terminal operations (create, output, wait, kill, release)
      • Session management and streaming updates
      • Agent scratchpad mechanism for context persistence across iterations
    • New CLI options: --acp-agent, --acp-permission-mode
    • Configuration support in ralph.yml
    • 305+ new ACP-specific tests
    • Expanded test suite (920+ tests)
  • v1.1.0 (2025-12)

    • Async-first architecture for non-blocking operations
    • Thread-safe async logging with rotation and security masking
    • Rich terminal output with syntax highlighting
    • Inline prompt support (-p "your task")
    • Claude Agent SDK integration with MCP server support
    • Async git checkpointing (non-blocking)
    • Expanded test suite (620+ tests)
    • Improved error handling with debug logging
  • v1.0.0 (2025-09-07)

    • Initial release with Claude, Q, and Gemini support
    • Comprehensive test suite (17 tests)
    • Production-ready error handling
    • Full documentation
    • Git-based checkpointing
    • State persistence and metrics

"I'm learnding!" - Ralph Wiggum

About

An improved implementation of the Ralph Wiggum technique for autonomous AI agent orchestration

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Python 93.7%
  • HTML 6.1%
  • Dockerfile 0.2%