Six patterns for constraining AI agent behavior, learned from production use.
Agents interpret prose instructions differently each session. "First do X, then Y, then Z" becomes "I've done something related to X, Y, and Z" with no guarantee of order, completeness, or correctness.
When you tell an agent to "translate the document, validate it, generate audio, and upload," it may:
- Translate half the document and call it done
- Skip validation because it "looks fine"
- Generate audio for what it translated, not the full document
- Report success while the source file remains half-processed
Replace prose workflow descriptions with a formal state machine. The agent reads the current state and follows explicit transitions, not remembered instructions.
SELECTING → RESEARCHING → TRANSLATING → VALIDATING →
GENERATING_AUDIO → GENERATING_VIDEO → AWAITING_VIDEO →
DISTRIBUTING → PUBLISHING → REVIEW → COMPLETE
Each state has:
- Entry requirements: What must be true to enter this state
- Exit requirements: What must be true to leave this state
- Allowed transitions: Which states can follow this one
The state lives in a JSON file that persists across sessions:
{
"project_id": "de-trinitate-20251214",
"state": "TRANSLATING",
"source": {
"title_latin": "De Trinitate",
"author": "Augustine of Hippo",
"file_path": "/path/to/source.txt"
},
"translation": {
"file_path": null,
"validation": { "passed": null }
}
}At session start, the agent reads the state file and executes the appropriate action:
from pipeline.state import get_active_projects
projects = get_active_projects()
if projects:
project = projects[0]
print(f"Resuming: {project['project_id']} in state {project['state']}")The agent can't skip to GENERATING_AUDIO because the state file says TRANSLATING. It can't claim to be done because translation.validation.passed is still null.
State machines work because they're external to the agent's context.
The agent's memory resets each session. Instructions get reinterpreted. But a JSON file on disk doesn't forget, doesn't reinterpret, and doesn't rationalize.
For any multi-step workflow:
- Define states as a directed graph
- Store current state persistently (file, database, API)
- Make the agent's first action be "read current state"
- Block transitions that skip required states
Agents treat validation as a formality. When told to "check that the translation is complete before proceeding," the agent checks... something... and proceeds. It may check if the file exists, if it has content, if it "looks reasonable." It will not catch the translation that ends mid-sentence at 60% completion.
Checklists don't work because the agent decides whether each item passes. Given enough context pressure (time, complexity, user expectation), the agent will find reasons to check boxes.
Replace checklists with scripts that return exit codes. Exit code 0 means pass. Exit code 1 means blocked. The agent cannot proceed without code 0.
python validation/validate_translation.py source.txt translation.json
# Exit 0: Proceed to next state
# Exit 1: Fix the issue, cannot proceedThe critical insight: the agent doesn't decide if validation passed. The script decides. The agent just reads the result.
Validation scripts check structural properties, not vibes:
def check_source_coverage(source_text: str, last_latin: str) -> Tuple[bool, str]:
"""Check if the last translated chunk appears near the end of source."""
source_normalized = normalize_latin(source_text)
last_latin_normalized = normalize_latin(last_latin)
# Find where the last translated text appears in source
pos = source_normalized.find(last_latin_normalized[-100:])
if pos == -1:
return False, "Last translated chunk not found in source!"
# Calculate remaining untranslated content
remaining = len(source_normalized) - pos
remaining_pct = remaining / len(source_normalized) * 100
if remaining > 500: # More than ~500 chars = incomplete
return False, f"Translation incomplete: {remaining_pct:.1f}% remains"
return True, "Translation covers source text"The script checks if the last translated Latin appears near the end of the source. If 30% of the source remains untranslated, exit code 1. No negotiation.
When validation fails, the agent's instructions say:
On exit code 1:
1. Report the exact error to the user
2. Add note to project state
3. STOP - do not proceed to next state
Gates work because they're binary and external.
The agent can't partially pass a gate. The agent can't redefine what passing means. The gate script is the single source of truth.
For any quality-critical transition:
- Write a script that checks concrete properties
- Return exit code 1 on any failure
- Make the agent's instructions explicit: "Do not proceed if exit code is 1"
- Log failures to the state file for debugging
Some operations take longer than a session should last. Video encoding takes 30-60 minutes. Deep research API calls take 5-10 minutes. API rate limits require waiting.
If the agent waits, it:
- Consumes context window doing nothing
- Risks timeout or context confusion
- May restart the operation on resume
- Costs money for idle API connections
Design explicit "parking states" where the agent exits cleanly and resumes later.
GENERATING_VIDEO → AWAITING_VIDEO → PUBLISHING
│ ↑
└───────────────────┘
(start job, exit session)
The agent:
- Starts the long-running operation
- Records the job in state
- Transitions to the parking state
- Exits the session
- Next session: checks if operation is complete
- If complete, proceeds; if not, exits again
For video encoding:
# GENERATING_VIDEO state action
def start_video_encoding(project):
subprocess.Popen([
'ffmpeg', '-i', audio_path, '-i', cover_path,
'-c:v', 'libx264', video_path
])
project['video']['job_started_at'] = datetime.utcnow().isoformat()
transition_state(project, 'AWAITING_VIDEO')
print("Video encoding started. Exiting session.")
print("Run `python pipeline/state.py check-video PROJECT_ID` to check status.")
sys.exit(0)The check function verifies the video is complete and stable:
def check_video_stable(video_path: str, wait_seconds: int = 60) -> bool:
"""Check if video file size is stable (encoding complete)."""
if not os.path.exists(video_path):
return False
size1 = os.path.getsize(video_path)
time.sleep(wait_seconds)
size2 = os.path.getsize(video_path)
return size1 == size2 and size1 > 0Agents are batch jobs, not daemons. Design for session boundaries.
They start, they run, they stop. Long-running operations need explicit hand-off points where the agent can safely exit.
For any operation longer than ~5 minutes:
- Create a parking state (AWAITING_X)
- Record operation start in state file
- Exit the session cleanly with a status message
- On resume: check completion, proceed or exit again
- Make the check fast and idempotent
Agents make mistakes. Not occasionally—reliably. If the workflow publishes directly to production:
- Wrong metadata gets indexed by search engines
- Incomplete content reaches subscribers
- Fixing requires delete-and-reupload (breaking links)
- Human review happens after the damage
All publishing goes to a private/draft state first. Human review happens before anything is public.
PUBLISHING → REVIEW → COMPLETE
│ │
│ └── Human reviews, fixes, approves
└── Upload as PRIVATE
The REVIEW state provides commands for fixes:
python pipeline/review.py fix-title PROJECT_ID "Corrected Title"
python pipeline/review.py fix-desc PROJECT_ID --file new_description.txt
python pipeline/review.py fix-thumb PROJECT_ID --file new_thumbnail.jpg
python pipeline/review.py approve PROJECT_IDOnly after approval does content become public:
python pipeline/review.py publish PROJECT_IDThe upload script enforces private-first:
def upload_video(video_path, metadata, thumbnail_path):
"""Upload video to YouTube as PRIVATE."""
request_body = {
'snippet': {
'title': metadata['title'],
'description': metadata['description'],
'tags': metadata['tags']
},
'status': {
'privacyStatus': 'private', # ALWAYS private
'selfDeclaredMadeForKids': False
}
}
# Upload...
return video_idThere is no --public flag. The agent cannot upload as public. Only the review.py publish command changes visibility, and it requires explicit human invocation.
Build review into the architecture, not the process.
Telling agents to "review before publishing" doesn't work. Making it structurally impossible to publish without review does.
For any irreversible operation:
- Route through a draft/private state first
- Provide fix commands that work on the draft
- Require explicit human action to make it final
- Never give the agent a "skip review" escape hatch
Agents add creative flourishes. Given "write a YouTube description," the agent will include sections you didn't ask for, remove sections you need, add emoji, change formatting, and generally "improve" things each time.
This makes output inconsistent and breaks downstream processes that expect specific formats.
Give the agent a template with placeholders. The agent fills in values, but doesn't modify structure.
{title}
{one_sentence_summary}
{chapter_timestamps}
Read the translation: {blog_url}
Download the audio: {archive_url}
Source text and translation: {github_url}
Patrologia Latina Volume {volume}
The agent can't add a "SUBSCRIBE!" section because there's no placeholder for it. The agent can't add emoji because the template doesn't have any.
The description generator uses a template file:
def generate_description(template_path, values):
"""Generate description by filling template placeholders."""
template = Path(template_path).read_text()
# Validate all required placeholders are provided
placeholders = re.findall(r'\{(\w+)\}', template)
missing = [p for p in placeholders if p not in values]
if missing:
raise ValueError(f"Missing values: {missing}")
# Fill placeholders
for key, value in values.items():
template = template.replace(f'{{{key}}}', str(value))
return templateThe template is version-controlled. Changes to output format require changing the template, not re-prompting the agent.
Validation catches deviations:
def validate_description(content):
"""Check description follows template format."""
forbidden_patterns = [
r'RESOURCES',
r'SUBSCRIBE',
r'BIBLIOGRAPHY',
r'##', # Markdown headers
]
for pattern in forbidden_patterns:
if re.search(pattern, content):
return False, f"Forbidden pattern found: {pattern}"
return True, "Description follows template"Constrain the output space, not the process.
You can't reliably tell an agent "don't add extra sections." You can give it a template where adding sections would require modifying the template itself—which it can't do.
For any structured output:
- Create a template with explicit placeholders
- Have the agent fill placeholders, not generate structure
- Validate output matches expected format
- Reject output that deviates from template
Agents hallucinate. When asked to "use the research to write a blog post," the agent may:
- Summarize its own knowledge instead of the research file
- Mix research content with invented content
- Drop citations while keeping claims
- Produce plausible-looking content that's fabricated
The output looks scholarly but contains none of the actual research.
Validate that specific content from source files appears in output files. Don't check format—check provenance.
If research.md contains citations like [journals.openedition.org] and the blog post has zero matching citations, the blog was not generated from the research.
The blog validation script checks multiple provenance signals:
def validate_blog_content(blog_content: str, research_content: str):
"""Verify blog contains actual research content, not AI summary."""
errors = []
# 1. Check citation preservation
research_citations = extract_citations(research_content)
blog_citations = extract_citations(blog_content)
preserved = research_citations & blog_citations
if len(preserved) / len(research_citations) < 0.5:
errors.append("Citations from research not preserved in blog")
# 2. Check for distinctive research phrases
unique_phrases = extract_unique_phrases(research_content)
found_in_blog = sum(1 for p in unique_phrases if p in blog_content.lower())
if found_in_blog == 0:
errors.append("No distinctive research phrases found in blog")
# 3. Check for generic AI filler
generic_count = count_generic_phrases(blog_content)
if generic_count > 5:
errors.append(f"Found {generic_count} generic AI-filler phrases")
return len(errors) == 0, errorsGeneric phrases that indicate AI summarization rather than research extraction:
GENERIC_PHRASES = [
r'witnessed intensifying debates over',
r'these debates were not merely academic',
r'reflected genuine tensions within',
r'throughout Western Christendom',
r'one of the most prolific',
r'in recent decades',
r'some scholars have argued',
]If the blog contains generic academic-sounding filler instead of the specific claims and citations from the research, validation fails.
Validate content origin, not just content quality.
An agent can produce perfectly formatted, grammatically correct, topically relevant content that is entirely fabricated. Format validation doesn't catch this.
For any synthesis task:
- Identify unique markers in source content (citations, specific phrases, names)
- Verify those markers appear in output
- Flag generic filler that indicates summarization over extraction
- Reject output that doesn't demonstrate use of source material
| Pattern | Constraint Type | What It Prevents |
|---|---|---|
| State Machines | Workflow order | Skipping or reordering steps |
| Hard Gates | Quality enforcement | Proceeding with failed validation |
| Parking States | Session management | Wasted context on long operations |
| Private-First | Publication safety | Irreversible mistakes going live |
| Templates | Output structure | Creative deviation from format |
| Provenance | Content origin | Hallucinated content passing as research |
Each pattern removes a degree of freedom from the agent. Fewer decisions means fewer wrong decisions.
The goal is not to make agents smarter. The goal is to make agent behavior predictable.