A step-by-step walkthrough from zero to a working qed spec.
By the end of this tutorial you'll have written a spec that verifies a shell command, reviewed code with an AI agent, and seen qed's output format.
- qed installed and on your PATH (see the quick start)
- A project directory to work in
Create a file called `hello.spec.toml`:

```toml
name = "hello"

[[criteria]]
description = "echo succeeds"
verify = { type = "command", run = "echo hello" }
```

This is the simplest possible spec: one criterion, verified by running a shell command and checking the exit code.
Run it:

```shell
qed verify hello.spec.toml
```

You should see:

```
Verifying: hello
[PASS] echo succeeds
All 1 criteria passed.
```
Add a second criterion that fails:

```toml
name = "hello"

[[criteria]]
description = "echo succeeds"
verify = { type = "command", run = "echo hello" }

[[criteria]]
description = "this will fail"
verify = { type = "command", run = "false" }
```

Run it again:

```shell
qed verify hello.spec.toml
```

```
Verifying: hello
[PASS] echo succeeds
[FAIL] this will fail
1 of 2 criteria failed.
```
qed exits with code 1 when any criterion fails. This makes it easy to use in CI — a non-zero exit fails the pipeline.
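A minimal sketch of how that exit code plays out in a pipeline script, with `false` standing in for `qed verify` on a failing spec (both exit with code 1):

```shell
# `false` stands in for `qed verify` on a failing spec; both exit with code 1.
false || status=$?

# Most CI runners fail the job on any non-zero status, so nothing more than
# running the command as a step is needed; the capture here is just to show it.
echo "qed exited with status ${status:-0}"
```

In a real workflow the `|| status=$?` capture is unnecessary; running `qed verify` as its own step is enough to fail the build.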
Add `--json` for machine-readable output:

```shell
qed verify hello.spec.toml --json
```

```json
{
  "spec": "hello",
  "passed": false,
  "criteria": [
    {
      "description": "echo succeeds",
      "result": "passed",
      "details": "hello"
    },
    {
      "description": "this will fail",
      "result": "failed",
      "details": "exit code 1\n"
    }
  ]
}
```

The `--json` flag is position-independent: it works before or after the spec path.
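Machine-readable output is easy to post-process. A sketch that counts failed criteria, using a trimmed copy of the report above so it stands alone; a real pipeline would pipe `qed verify ... --json` directly and use a proper JSON tool such as jq instead of grep:

```shell
# Trimmed copy of the --json report, saved so the sketch is self-contained.
cat > report.json <<'EOF'
{
  "spec": "hello",
  "passed": false,
  "criteria": [
    { "description": "echo succeeds", "result": "passed" },
    { "description": "this will fail", "result": "failed" }
  ]
}
EOF

# grep keeps this dependency-free; jq would be the robust choice.
grep -c '"result": "failed"' report.json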
Remove the failing criterion and add an agent review. Agent criteria use an LLM to review code against a prompt:

```toml
name = "hello"

[[criteria]]
description = "echo succeeds"
verify = { type = "command", run = "echo hello" }

[[criteria]]
description = "README exists and is well-written"
verify = { type = "agent", prompt = "Check that README.md exists, has a clear title, and describes what the project does." }
```

Run it:

```shell
qed verify hello.spec.toml
```

The agent criterion spawns Claude (or your configured agent command) to review the codebase. If Claude isn't available, the criterion fails: qed never silently skips verification you asked for.
Human criteria prompt you interactively for sign-off, which is useful for things that can't be automated:

```toml
name = "hello"

[[criteria]]
description = "echo succeeds"
verify = { type = "command", run = "echo hello" }

[[criteria]]
description = "Output is readable"
verify = { type = "human", instruction = "Run 'echo hello' and confirm the output looks correct." }
```

When qed reaches the human criterion, it prints the instruction and waits:

```
→ Output is readable
Run 'echo hello' and confirm the output looks correct.
Accept? [y/n]: y
```
Type `y` to pass, `n` to fail. Human criteria default to `schedule = "manual"`, so they're excluded from automated runs (`--auto` and CI). Use `--full` to include them explicitly.
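The same criterion with the default written out, assuming `schedule` is a per-criterion field as the default described above suggests:

```toml
[[criteria]]
description = "Output is readable"
# Explicit form of the default for human criteria; "manual" criteria are
# skipped by --auto and CI runs unless --full is passed.
schedule = "manual"
verify = { type = "human", instruction = "Run 'echo hello' and confirm the output looks correct." }
```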
qed can verify all specs in a directory recursively:

```shell
qed verify specs/
```

This finds all `.spec.json` and `.spec.toml` files, skipping hidden directories and build artifacts. With no argument, `qed verify` defaults to the current directory.
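The discovery rule can be approximated with `find` (an illustration only; qed's own walk may prune additional directories such as build artifacts):

```shell
# Build a small tree to scan; the hidden directory should be skipped.
mkdir -p specs/sub specs/.hidden
touch specs/a.spec.toml specs/sub/b.spec.json specs/.hidden/c.spec.toml

# Prune hidden entries, then match both spec extensions.
find specs -name '.*' -prune -o \
  \( -name '*.spec.toml' -o -name '*.spec.json' \) -print | sort
```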
So far we've used verify mode: run each criterion once and report. For iterative development with an AI agent, use worker loop mode by adding a `[worker]` section:

```toml
name = "implement-greeting"

[worker]
prompt = "Create a file called greeting.sh that prints 'Hello, world!'"

[[criteria]]
description = "greeting.sh prints the expected output"
verify = { type = "command", run = "bash greeting.sh | grep 'Hello, world!'" }
```

Run it with `qed run`:

```shell
qed run implement-greeting.spec.toml
```

qed dispatches the worker (an AI agent by default), then verifies the criteria. If any criterion fails, the failures are fed back to the worker and it tries again. The loop continues until all criteria pass, the worker gets stuck (same failures repeating), or `maxIterations` is reached.
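The loop's control flow can be sketched in a few lines of shell. `run_worker` and `verify_criteria` are hypothetical stand-ins (here the stub passes on the second attempt), and qed's real loop also tracks repeated failures to detect a stuck worker:

```shell
# Rough sketch of the run loop's control flow, not qed's implementation.
attempt=0
run_worker() { attempt=$((attempt + 1)); }     # pretend the agent edits code
verify_criteria() { [ "$attempt" -ge 2 ]; }    # pretend it passes on try 2
max_iterations=5

while [ "$attempt" -lt "$max_iterations" ]; do
  run_worker
  if verify_criteria; then
    echo "all criteria passed after $attempt iteration(s)"
    break
  fi
  echo "iteration $attempt failed; feeding failures back to the worker"
done
```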
- Spec format reference — all fields, types, and defaults
- Architecture — how the state machine and verification dispatch work
- How-to guides — recipes for specific tasks (CI integration, custom workers, etc.)