Commit 7d3f073 by mpawliszyn (co-authored by Claude) -- docs: add evidence-based agent prompt principles (#14). Adds `docs/agent-prompt-principles.md` (169 lines).

---
# Agent Prompt Principles

Evidence-based principles for writing effective agent prompts in Fowlcon. Derived from research on multi-agent orchestration, cognitive science, LLM behavioral studies, and production agent tools.

These principles apply to all prompts in `commands/` and `agents/`.

---
## 1. Agents Are Tools, Not Peers

Sub-agents receive a typed task, do their work, and return structured output. The orchestrator never sees their internal reasoning. No back-and-forth conversation between orchestrator and sub-agents.

**Why:** 2024-2025 evidence converges on agent-as-tool over agent-as-peer for orchestration. Anthropic's "Building Effective Agents" guide explicitly recommends structured output from sub-agents over conversational output. The peer model causes context window catastrophe -- a 10-turn investigation consumes 40k tokens in message history alone.

**In practice:**

- Concept researchers return a `FindingSet`, not a narrative
- The orchestrator synthesizes; sub-agents investigate
- Sub-agents are fire-and-forget with structured return
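The agent-as-tool contract can be sketched as a plain data type the orchestrator consumes without ever seeing the sub-agent's reasoning. `FindingSet` is named in this document; the fields and the `is_complete` check below are illustrative assumptions, not a fixed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    file: str          # where the evidence lives
    summary: str       # one factual sentence, no recommendations
    snippet: str = ""  # optional supporting code excerpt

@dataclass
class FindingSet:
    concept: str  # the single concept this researcher covered
    findings: list = field(default_factory=list)
    uncertainties: list = field(default_factory=list)  # required even if empty

    def is_complete(self) -> bool:
        # Mechanical check: a FindingSet with no findings and no declared
        # uncertainties is suspicious -- the researcher must report one or the other.
        return bool(self.findings) or bool(self.uncertainties)
```

The orchestrator receives only this value; the researcher's chain of thought is discarded along with its context window.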
## 2. Constrain Output Format to Constrain Behavior

The strongest measured behavioral control is output format. Aider found that switching edit formats reduced GPT-4 Turbo "laziness" by 3x (score 20% → 61%). The Diff-XYZ benchmark confirms format choice dramatically affects output quality across models.

**Why:** When an agent must fill in required sections of a template, it cannot skip them. Format constraints convert behavioral requirements into structural requirements. The agent doesn't need to "want" to be thorough -- the template forces it.

**In practice:**

- Every agent prompt defines an exact output template with named sections
- Use required sections, not optional ones (`## Uncertainties` must always appear, even if empty)
- Provide 1-2 examples of correctly formatted output (few-shot)
- Schema validation (via `check-tree-quality.sh`) catches format violations mechanically
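Mechanical format validation can be sketched as a required-section scan, in the spirit of `check-tree-quality.sh` (the section names and function below are illustrative assumptions, not the actual script):

```python
import re

# Assumed template sections; "## Uncertainties" must appear even when empty.
REQUIRED_SECTIONS = ["## Findings", "## Uncertainties"]

def format_violations(output: str) -> list:
    """Return the required section headers missing from an agent's output."""
    return [s for s in REQUIRED_SECTIONS
            if not re.search(rf"^{re.escape(s)}\s*$", output, flags=re.M)]
```

A non-empty return value is a structural failure the orchestrator can reject without judging content.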
## 3. Negative Constraints Before Positive Instructions

State what the agent must NOT do before stating what it should do. The "documentarian mandate" (six DO NOT rules followed by one ONLY rule) appears at every tier in RPI and is the most reliable behavioral control pattern observed in production.

**Why:** LLMs trained with RLHF have strong refusal training -- they respond more reliably to prohibitions than permissions. A positive instruction ("be thorough") is vague. A negative constraint ("DO NOT summarize instead of showing detail") targets a specific failure mode.

**In practice:**

```markdown
## CRITICAL: YOUR ONLY JOB IS TO DOCUMENT AND EXPLAIN
- DO NOT suggest improvements
- DO NOT critique the implementation
- DO NOT identify "problems" or "issues"
- DO NOT recommend refactoring
- ONLY describe what exists and how it works
```

Place this block at the top of the prompt AND repeat key constraints at the bottom (primacy + recency positioning).
## 4. Name the Rationalizations

When you know how an agent will try to skip a step, name that rationalization explicitly in the prompt. A named anti-pattern is harder to use than an unnamed one.

**Why:** This pattern from Superpowers (the "red flags table") prevents agents from self-excusing non-compliance. By naming the exact thought ("This is simple enough to skip the tree") and providing the correction ("Build the tree anyway -- simple PRs still benefit from structure"), the agent recognizes its own rationalization attempt as a documented failure mode.

**In practice:**

For the orchestrator:

```markdown
## Red Flags -- If You Think This, Stop

| If you think... | The reality is... |
|---|---|
| "This PR is simple, skip the tree" | Simple PRs still benefit from structure. Build the tree. |
| "Coverage is close enough" | 100% or explain every gap. No exceptions. |
| "The pattern is obvious, skip examples" | Show at least one example. Obvious to you ≠ obvious to the reviewer. |
| "I can summarize instead of grouping" | Summaries lose detail. Group by concept. |
```
## 5. Mechanical Verification Over Self-Assessment

Never trust an agent's claim that it's done. Use external tools to verify completeness.

**Why:** LLMs cannot reliably self-assess completeness. The coverage bitmap pattern (checking coverage via script output rather than asking the agent "did you cover everything?") is strictly more reliable. Superpowers' "verification before completion" skill found 24 documented failure cases where agents claimed completion incorrectly.

**In practice:**

- Coverage completeness: checked by `coverage-report.sh`, not agent self-report
- Tree quality: checked by `check-tree-quality.sh`, not agent judgment
- File references: spot-checked by reading actual files, not trusting agent citations
- The orchestrator calls verification scripts BEFORE presenting results to the reviewer
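The coverage bitmap pattern can be sketched as pure set arithmetic between what the diff changed and what the findings cite -- the agent is never asked "did you cover everything?" (function and key format below are illustrative assumptions, not `coverage-report.sh` itself):

```python
def coverage_report(changed_hunks, cited_hunks):
    """Bitmap-style check: True for each changed hunk cited by some finding.

    Returns (bitmap, gaps); an empty gaps list means full coverage.
    """
    cited = set(cited_hunks)
    bitmap = {h: (h in cited) for h in changed_hunks}
    gaps = [h for h, covered in bitmap.items() if not covered]
    return bitmap, gaps
```

The orchestrator runs this before presenting results; any gap becomes a visible `[pending]` area, not a trusted self-report.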
## 6. One Agent, One Job

Each agent has a single, clearly scoped responsibility. If an agent is doing two things, split it into two agents.

**Why:** Focused agents produce more reliable output than multi-purpose ones. Tool restrictions (Grep/Glob/LS only for the locator -- no Read) enforce specialization more reliably than instructions alone. CrewAI's known failure mode of "manager accepts incomplete output" is caused by agents with broad mandates.

**In practice:**

- `codebase-locator`: finds WHERE (no Read tool -- cannot analyze content)
- `codebase-analyzer`: explains HOW (has Read -- can analyze)
- `codebase-pattern-finder`: shows EXAMPLES (has Read -- returns code snippets)
- `coverage-checker`: verifies COMPLETENESS (Haiku -- mechanical check only)
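Tool restriction as specialization can be sketched as an allow-list the runtime checks before dispatching any tool call; the agent-to-tool mapping mirrors the list above, while the enforcement function is an assumption for illustration:

```python
# Allow-lists enforce specialization mechanically, not by instruction.
ALLOWED_TOOLS = {
    "codebase-locator": {"Grep", "Glob", "LS"},           # no Read: cannot analyze content
    "codebase-analyzer": {"Grep", "Glob", "LS", "Read"},
    "codebase-pattern-finder": {"Grep", "Glob", "LS", "Read"},
    "coverage-checker": set(),                            # mechanical check only
}

def check_tool_call(agent: str, tool: str) -> bool:
    """Reject any tool call outside the agent's single-job allow-list."""
    return tool in ALLOWED_TOOLS.get(agent, set())
```

A locator that tries to `Read` is refused structurally, which is more reliable than telling it not to analyze content.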
## 7. Context Inline, Never File References

Sub-agents receive all context embedded in their prompt. Never pass a file path and expect the agent to read it.

**Why:** Sub-agents run in fresh context windows. They cannot access the orchestrator's context. Passing file paths creates a dependency on the agent successfully reading the file, which adds a failure mode. Inline context is guaranteed to be seen.

**In practice:**

- The orchestrator reads the diff, then embeds relevant hunks in the sub-agent's prompt
- PR metadata (title, description, file list) is pasted inline, not referenced
- The review tree (if it exists) is included as text, not as a path to read
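Inline context can be sketched as prompt assembly where the orchestrator has already read everything and embeds it verbatim, so the sub-agent never depends on a successful file read (the helper name and prompt layout are illustrative assumptions):

```python
def build_researcher_prompt(concept: str, pr_title: str, hunks: list) -> str:
    """Embed everything the sub-agent needs; pass no paths for it to read."""
    embedded = "\n\n".join(hunks)  # hunks were read by the orchestrator, not the sub-agent
    return (
        f"PR: {pr_title}\n"
        f"Concept under investigation: {concept}\n\n"
        "Relevant diff hunks (inline, complete):\n"
        f"{embedded}\n"
    )
```

If the hunk text is in the prompt string, it is guaranteed to be in the sub-agent's fresh context window; a path would only be a promise.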
## 8. Respect the Tree Structure Limits

The review tree has evidence-based structural constraints derived from cognitive load research.

**Why:** Miller's 7±2 (strategic capacity with labels) and Cowan's 4±1 (raw working memory) bound what reviewers can hold in mind. Hierarchy research shows 2-3 levels is optimal; performance degrades consistently at 4+ levels. The 3-5 children per node range matches chunking theory.

**In practice:**

- Top-level concepts: 7 hard max, 5-6 preferred
- Tree depth: 2-3 levels (4 only as documented exception)
- Children per node: 3-5 preferred, 2-7 acceptable, never 1 or 8+
- Single-child nodes are a structural smell -- collapse them
- Labels must be descriptive functional names, not "Other" or "Miscellaneous"
- `check-tree-quality.sh` enforces these limits
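The limits above can be sketched as a recursive check over a nested tree; the node shape (a dict of `label` and `children`) is an assumption for illustration, and the real enforcement lives in `check-tree-quality.sh`:

```python
def tree_violations(node, depth=1, path="root"):
    """Flag depth > 3 levels, single-child nodes, 8+ children, placeholder labels."""
    problems = []
    children = node.get("children", [])
    if node.get("label", "").strip().lower() in {"other", "miscellaneous"}:
        problems.append(f"{path}: placeholder label")
    if depth > 3:
        problems.append(f"{path}: depth {depth} exceeds 3 levels")
    if len(children) == 1:
        problems.append(f"{path}: single child -- collapse it")
    if len(children) >= 8:
        problems.append(f"{path}: {len(children)} children exceeds 7")
    for i, child in enumerate(children):
        problems.extend(tree_violations(child, depth + 1, f"{path}.{i}"))
    return problems
```

An empty return is a structural pass; each string is a concrete, mechanically detected smell.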
## 9. Use Unified Diff Format for Agent Consumption

When passing diff content to agents, use unified diff format with context lines and without line numbers in hunk headers.

**Why:** The Diff-XYZ benchmark (Oct 2025) found unified diff is the best format for LLM Apply and Anti-Apply tasks. Aider found omitting line numbers from hunk headers improves performance -- agents use context lines for matching, not line numbers. Including 3-5 context lines helps agents understand what surrounds the change.

**In practice:**

- Pass unified diff hunks to concept researchers
- Strip hunk header line numbers (`@@ -X,Y +A,B @@` → `@@`)
- Include surrounding context lines (unchanged code)
- Chunk by semantic unit (function/class), not by minimal hunk
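Stripping line numbers while keeping context lines can be sketched with one substitution (a sketch only; Aider's actual implementation differs):

```python
import re

def strip_hunk_line_numbers(diff: str) -> str:
    """Rewrite '@@ -12,4 +12,6 @@ def login():' as a bare '@@' marker.

    Agents then match hunks by their context lines, not by line numbers,
    which also drops the trailing section text some tools emit.
    """
    return re.sub(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@.*$", "@@", diff, flags=re.M)
```

Context lines (the unchanged lines around `+`/`-` lines) survive untouched, which is exactly what the agent anchors on.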
## 10. Design for Fresh Starts

Long sessions degrade. Design prompts and state management so the review can restart cleanly at any point.

**Why:** Microsoft research found 39% average degradation in multi-turn conversations. At 50% context utilization, quality drops measurably. Batching context into a fresh call restored 90%+ accuracy. The "Lost in the Middle" effect means mid-session findings are in the attention danger zone.

**In practice:**

- All state lives in files (`review-tree.md`, `review-comments.md`), not in conversation context
- A new session reads state files and continues from the last known position
- The orchestrator can restart at any phase boundary without losing work
- Front-load critical instructions in the prompt (primacy positioning)
- Repeat key constraints at the end of the prompt (recency positioning)
- After processing sub-agent findings, commit to disk and drop from context
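The checkpoint-and-resume cycle can be sketched as two small file operations; the JSON state file and its fields are assumptions for illustration (this document keeps its real state in `review-tree.md` and `review-comments.md`):

```python
import json
import pathlib

def checkpoint(state_path: str, phase: str, completed: list) -> None:
    """Commit progress to disk so the conversation context can be dropped."""
    payload = {"phase": phase, "completed": completed}
    pathlib.Path(state_path).write_text(json.dumps(payload))

def resume(state_path: str) -> dict:
    """A fresh session reads the last known position instead of replaying history."""
    p = pathlib.Path(state_path)
    if p.exists():
        return json.loads(p.read_text())
    return {"phase": "start", "completed": []}
```

Because every phase boundary is on disk, a degraded session can simply be abandoned and restarted with full accuracy.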
## 11. Supervisor Mode for Parallel Agents

When spawning multiple agents in parallel, use supervisor mode: capture failures as data, not exceptions.

**Why:** Structured concurrency research shows that fail-fast (abort everything when one agent fails) is wrong for research tasks where agents are independently valuable. Supervisor mode lets 2 of 3 successful agents' findings be used even if the third fails. The orchestrator marks the failed area as `[pending]` and the coverage checker catches the gap.

**In practice:**

- Concept researchers run in parallel with independent scopes
- If one fails: log the failure, mark the concept area as pending, continue
- If one returns low-quality output: flag for human investigation, don't silently include
- The coverage checker runs AFTER all agents complete (including failed ones) to identify gaps
- Retry failed agents with simplified prompts before giving up
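Supervisor mode can be sketched with futures, where a researcher's failure becomes a `pending` record instead of an exception that aborts its siblings (names and the result shape are illustrative assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def supervise(researchers: dict) -> dict:
    """Run concept researchers in parallel; capture failures as data."""
    results = {}
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn) for name, fn in researchers.items()}
        for name, fut in futures.items():
            try:
                results[name] = {"status": "ok", "findings": fut.result()}
            except Exception as exc:
                # Mark the area pending; the coverage checker surfaces the gap later.
                results[name] = {"status": "pending", "error": repr(exc)}
    return results
```

Two successful researchers still yield usable findings when a third times out; the orchestrator can then retry the pending one with a simplified prompt.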
## 12. The Reviewer Is the Protagonist

The tool serves the reviewer. It never makes decisions for them, never recommends approve/reject, and never hides information.

**Why:** Automation bias research shows 59% of developers use AI code they don't fully understand. Higher AI quality paradoxically increases complacency (2.5x more likely to merge without review). The "documentarian" pattern -- facts not recommendations -- is an anti-automation-bias design. When the tool organizes and explains rather than judges, the reviewer must engage their own judgment.

**In practice:**

- Agents describe what code does, never whether it's good or bad
- The orchestrator presents findings, never recommends a verdict
- Comments are captured as the reviewer's words, not the agent's suggestions
- "I get it!" is the reviewer's active choice, not the agent's assumption
- Complexity warnings are factual ("7 interleaving concepts across 50 files"), not judgmental ("this PR is too complex")
