AutoPrompt

A configurable, open-source self-optimization loop for AI agents. AutoPrompt autonomously tests an AI agent, scores responses against a rubric, and iteratively improves the agent's configuration (system prompt, memory strategy, tool config).

How It Works

AutoPrompt runs a meta-optimization loop: it generates test cases, sends them to your agent, scores the responses against a rubric using an LLM-as-judge, identifies weak areas, and mutates your agent's config files to improve them. Only mutations that improve the baseline score are kept — all others are rolled back.

generate baseline tests → score → identify weak dims → mutate config
       ↑                                                      |
       └──────────── keep if improved, else rollback ─────────┘

Features

  • Config-Driven: Set up the target agent, rubric, max iterations, budget, and API URLs via a single autoprompt.yaml.
  • Pluggable Agent Adapters: Communicate over HTTP, directly to Python functions, or via CLI subprocess.
  • LLM-as-a-Judge: Custom per-dimension scoring with configurable weights.
  • Budget Limits: Tracks estimated LLM costs and stops when the threshold is hit.
  • Multi-Turn Support: Test agents across multi-turn conversations.
  • Reporting: Stores data in SQLite, Supabase, or JSON lines. Includes leaderboard, diff, and report commands.

Installation

pip install -e .

# for development and tests
pip install -e .[dev]

Quickstart

Copy the example environment file and fill in your values:

cp .env.example .env

The fastest way to try AutoPrompt is with the multi-turn example — it runs entirely offline, no API key required for the agent itself:

autoprompt run examples/multi_turn/autoprompt.yaml --dry-run
autoprompt run examples/multi_turn/autoprompt.yaml

For a real LLM agent (requires OPENROUTER_API_KEY):

autoprompt run examples/simple/autoprompt.yaml

Choosing an Adapter

AutoPrompt connects to your agent via one of three built-in adapters:

| Adapter | When to use | Config field |
|---|---|---|
| `python_callable` | Your agent is a Python function in the same repo | `import_path: "my_module:my_function"` |
| `http` | Your agent runs as an HTTP service | `endpoint: "http://localhost:8000/chat"` |
| `cli` | Your agent is a CLI tool that reads from stdin | `command: "python my_agent.py"` |

You need a custom adapter only when the built-in ones don't fit — for example, if your HTTP endpoint requires JWT authentication or a non-standard request format. Custom adapters subclass AgentAdapter from autoprompt/adapters/base.py and implement two methods: send() and health_check().
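As a sketch of what such a subclass might look like, here is a hypothetical JWT-authenticated HTTP adapter. The `AgentAdapter` stub below is a stand-in for the real base class in `autoprompt/adapters/base.py`, whose exact signatures may differ; the class and field names are illustrative, not part of AutoPrompt's API.

```python
import json
import urllib.request


class AgentAdapter:
    """Stand-in for autoprompt.adapters.base.AgentAdapter (assumed interface)."""

    def send(self, message: str, context: dict) -> dict:
        raise NotImplementedError

    def health_check(self) -> bool:
        raise NotImplementedError


class JWTHTTPAdapter(AgentAdapter):
    """Hypothetical adapter for an endpoint that requires JWT auth."""

    def __init__(self, endpoint: str, token: str):
        self.endpoint = endpoint
        self.token = token

    def send(self, message: str, context: dict) -> dict:
        req = urllib.request.Request(
            self.endpoint,
            data=json.dumps({"message": message}).encode(),
            headers={
                "Content-Type": "application/json",
                # The bearer token is what the built-in http adapter lacks.
                "Authorization": f"Bearer {self.token}",
            },
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())

    def health_check(self) -> bool:
        # Cheap configuration check; a real adapter might probe a /health route.
        return bool(self.endpoint and self.token)
```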

What your callable needs to accept

For python_callable, AutoPrompt calls your function with either:

  • handle_message(message: str, context: dict) — single-turn
  • chat(messages: list[dict], context: dict) — multi-turn

Return {"content": "..."} or just a plain str.
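A minimal pair of callables matching the shapes above might look like this (the echo logic is a placeholder for a real agent):

```python
def handle_message(message: str, context: dict) -> dict:
    """Single-turn callable: returns the documented {"content": ...} dict."""
    user = context.get("user", "anonymous")
    return {"content": f"[{user}] you said: {message}"}


def chat(messages: list, context: dict) -> str:
    """Multi-turn callable: receives the whole conversation history.

    Each message is a dict like {"role": "user", "content": "..."}.
    Returning a plain str is also accepted.
    """
    last = messages[-1]["content"]
    return f"echo: {last}"
```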

For http, AutoPrompt POSTs {"message": "..."} (or {"messages": [...]} for multi-turn) and expects {"content": "..."} in the response.
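The request/response shapes above can be sketched as a framework-agnostic handler; the upper-casing is a stand-in for real agent logic, and you would wire this into whatever web framework your service uses:

```python
import json


def agent_http_handler(body: bytes) -> bytes:
    """Handle the JSON bodies AutoPrompt's http adapter sends."""
    payload = json.loads(body)
    if "messages" in payload:
        # Multi-turn: {"messages": [{"role": ..., "content": ...}, ...]}
        text = payload["messages"][-1]["content"]
    else:
        # Single-turn: {"message": "..."}
        text = payload["message"]
    reply = text.upper()  # placeholder for the actual agent
    return json.dumps({"content": reply}).encode()
```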

Example Config

agent:
  adapter: "python_callable" # or "http" or "cli"
  name: "My Agent"
  import_path: "my_agent.core:handle_message"
  optimizable:
    - type: system_prompt
      path: "system.md" # file AutoPrompt will mutate

rubric:
  path: "rubric.md"
  scoring_model: "deepseek/deepseek-v3.2"
  score_range: [1.0, 10.0]
  dimensions:
    - name: "helpfulness"
      weight: 0.6
    - name: "accuracy"
      weight: 0.4

tests:
  mode: "mix" # "static", "dynamic", or "mix"
  static_suite: "tests.yaml"
  generator_model: "deepseek/deepseek-v3.2"
  tests_per_iteration: 6

loop:
  max_iterations: 10
  budget_limit_usd: 2.00
  mutation_model: "deepseek/deepseek-v3.2"
  improvement_threshold: 0.05

logging:
  backend: "sqlite" # or "supabase" or "jsonl"

All LLM calls go through OpenRouter. Set OPENROUTER_API_KEY in .env.
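To illustrate how the rubric weights and `improvement_threshold` interact, here is a sketch of the accept-or-rollback decision. Whether AutoPrompt applies the threshold absolutely or relatively is an assumption here; this version treats it as an absolute margin on the weighted score.

```python
def weighted_score(dim_scores: dict, weights: dict) -> float:
    """Combine per-dimension judge scores using the rubric weights."""
    # e.g. dim_scores = {"helpfulness": 8.0, "accuracy": 6.5}, weights sum to 1.0
    return sum(dim_scores[d] * w for d, w in weights.items())


def keep_mutation(baseline: float, candidate: float, threshold: float = 0.05) -> bool:
    """Assumed rule: keep a mutation only if it beats baseline by the threshold."""
    return candidate >= baseline + threshold


weights = {"helpfulness": 0.6, "accuracy": 0.4}
baseline = weighted_score({"helpfulness": 7.0, "accuracy": 6.0}, weights)
candidate = weighted_score({"helpfulness": 8.0, "accuracy": 6.5}, weights)
```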

Post-Run Commands

autoprompt leaderboard autoprompt.yaml --top 5
autoprompt report autoprompt.yaml
autoprompt diff autoprompt.yaml

Examples

| Example | Description |
|---|---|
| `examples/multi_turn/` | Complete multi-turn agent — runs fully offline, no API key needed for the agent |
| `examples/simple/` | Minimal single-turn LLM agent using `python_callable` |

Development

Run the test suite:

pytest

For contribution and governance details, see the repository's contributing guide, code of conduct, and security policy.
