For: Engineering managers / staff engineers evaluating or operating this repo.
This repo is the Flow Studio demo harness: a backend and UI for visualizing agentic SDLC flows.
It contains:
- Flow Studio UI and FastAPI backend
- A governed demo harness (flows, runs, selftest, validation)
- `.claude/` swarm definitions used as the specimen for this demo
For a portable .claude pack, see EffortlessMetrics/demo-swarm.
Status: early re-implementation of a proven pattern. See README.md § Status and RELEASE_NOTES_2_3_2.md § Stability Matrix.
These 13 docs are the canonical operator surface. Everything else is reference.
| # | Doc | Purpose | Time |
|---|---|---|---|
| 1 | README.md | Who this is for, what it is | 5 min |
| 2 | docs/GETTING_STARTED.md | Hands-on path (two lanes) | 15 min |
| 3 | CHEATSHEET.md | Quick reference for daily operators | 3 min |
| 4 | GLOSSARY.md | Terminology definitions | 5 min |
| 5 | docs/LEXICON.md | Canonical vocabulary (prevents noun-overload) | 3 min |
| 6 | docs/ROUTING_PROTOCOL.md | V3 routing model and decisions | 10 min |
| 7 | docs/SELFTEST_SYSTEM.md | Selftest architecture and tiers | 10 min |
| 8 | docs/FLOW_STUDIO.md | Visual UI guide | 10 min |
| 9 | REPO_MAP.md | Physical directory layout | 5 min |
| 10 | docs/VALIDATION_RULES.md | FR-001–FR-005 reference | 15 min |
| 11 | docs/AGENT_OPS.md | Agent management guide | 10 min |
| 12 | ARCHITECTURE.md | v3.0 system architecture | 15 min |
| 13 | docs/AGOPS_MANIFESTO.md | Operational philosophy (AgOps) | 20 min |
Total spine time: ~126 minutes for complete understanding.
If you're new, read these in order. Everything beyond is deepening, not new patterns.
Quick reference for day-to-day operations:
- `swarm/runbooks/10min-health-check.md` - First-time setup and sanity check
- `swarm/runbooks/selftest-flowstudio-fastpath.md` - Fast validation path
- FLOW_STUDIO.md — Visual UI guide and API reference
- FLOW_STUDIO_FIRST_EDIT.md — Your first edit walkthrough (15 min)
- FLOW_STUDIO_BUILD_YOUR_OWN.md — Build your own swarm guide (20 min)
- FLOW_STUDIO_API.md — REST API documentation
- FLOW_STUDIO_UX_HANDOVER.md — Handover for new owners
- Governed Surfaces — SDK contract and UIID selectors (see FLOW_STUDIO.md)
- ARCHITECTURE.md - v3.0 system architecture (cognitive hierarchy, components)
- AGOPS_MANIFESTO.md - AgOps operational philosophy (Factory Floor model)
- ROUTING_PROTOCOL.md - V3 routing model (CONTINUE, DETOUR, INJECT)
- ROADMAP_3_0.md - v3.0 roadmap and next steps
- `swarm/spec/contracts/` - Inter-flow contract definitions
  - `build_review_handoff.md` - Build-to-Review handoff contract
- RUNTIME_BACKENDS.md - Backend architecture overview
- STEPWISE_BACKENDS.md - Per-step execution details
- AGENT_SDK_INTEGRATION.md - Agent SDK integration guide
- TRANSCRIPT_SCHEMA.md - Artifact format specification
- LONG_RUNNING_HARNESSES.md - Pattern mapping (Anthropic harness patterns)
- PLAN_STEPWISE_VNEXT.md - Executable Graph IR plan (legacy)
- WISDOM_SCHEMA.md — Wisdom summary JSON schema
- RUN_LIFECYCLE.md — Run management and retention
- SYSTEM_MAP.md — Single-page system overview
- PRE_DEMO_CHECKLIST.md — Demo preparation checklist
- VALIDATION_RULES.md — FR-001 through FR-005
- SELFTEST_SYSTEM.md — Tier system and governance checks
- SELFTEST_DEVELOPER_WORKFLOW.md — Local dev workflow
- EVALUATION_CHECKLIST.md — 1-hour checklist for team evaluation
- ADOPTING_SWARM_VALIDATION.md — 5-min TL;DR for adoption
- ADOPTION_PLAYBOOK.md — Full adoption guide
- SUPPORT.md — Engagement expectations and how to participate
- DEFINITION_OF_DONE.md — What "done" means for merging
- MERGE_CHECKLIST.md — Pre-merge verification checklist
- RELEASE_CHECKLIST.md — Release preparation checklist
- CI_TROUBLESHOOTING.md — Fixing CI failures
- CONTRIBUTING.md — How to contribute
Read first: docs/GETTING_STARTED.md
This is the fastest way to understand the demo. Two paths:
- Lane A: SDLC Demo — See flows in action with Flow Studio (7 min)
- Lane B: Governance Demo — Understand validation and selftest (7 min)
After 10 minutes, you've seen it work. Continue below for deeper understanding.
uv sync --extra dev
make dev-check
make demo-run
make flow-studio # → http://localhost:5000

You should now have:
- A healthy swarm (`make dev-check` green)
- A demo run under `swarm/runs/demo-health-check/`
- Flow Studio open with 7 flows in the sidebar
Read DEMO_RUN.md (2–3 minutes) to understand the health-check scenario.
Open Flow Studio (http://localhost:5000) and keep it open while you read.
- Open Flow Studio with Signal pre-selected: `http://localhost:5000/?flow=signal&run=demo-health-check`
- Notice the step sequence: `parse` → `shape` → `requirements` → `bdd` → `assess` → `report`
- Click a step node (teal). The right panel shows:
  - Step ID and role
  - Where the spec lives: `swarm/flows/flow-signal.md`
  - Where demo artifacts live: `swarm/runs/demo-health-check/signal/`
- Click an agent node (colored) to see its category, model, and file locations
Now open the spec file and compare:
cat swarm/flows/flow-signal.md

Match the "Steps" table to the graph. They should tell the same story.
- Open Flow Studio with Build pre-selected (or use keyboard shortcut `3`): `http://localhost:5000/?flow=build&run=demo-health-check`
- Notice it's the heaviest flow: a long chain from `branch` → `commit`
- Look for the microloop clusters (test/critic, code/critic pairs)
- Click step nodes to see how they map to agents
Open the spec and compare:
cat swarm/flows/flow-build.md

Look at:
- Artifact Paths: Where outputs go (`RUN_BASE/build/...`)
- Orchestration Strategy: How microloops work
- Steps table: Which agents run at each step
Goal: Understand that the graph is the spec, the spec is the graph.
The swarm has a layered validation system that catches misalignment early.
New? Start here: Read docs/VALIDATION_WALKTHROUGH.md for a narrative walkthrough of how validation works. You'll add a fake agent, make realistic mistakes, see the exact error messages, and learn why each check matters.
# While reading the walkthrough, follow along:
make validate-swarm
uv run swarm/tools/validate_swarm.py --debug

Watch what it checks:
- FR-001: Bijection (agents ↔ registry)
- FR-002: Frontmatter (required fields, color matches role)
- FR-003: Flow references (agents actually exist)
- FR-004: Skills (skill files exist)
- FR-005: RUN_BASE (placeholders, not hardcoded paths)
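
To make FR-001 concrete: a bijection means every agent file has a registry entry and every registry entry has an agent file. The sketch below illustrates that idea with plain sets. It is not the code in `validate_swarm.py`; the function name `bijection_errors`, the agent names, and the message wording are all hypothetical.

```python
def bijection_errors(agent_files: set[str], registry_entries: set[str]) -> list[str]:
    """Return one message per agent that appears on only one side."""
    errors = []
    # Agents defined on disk but never registered.
    for name in sorted(agent_files - registry_entries):
        errors.append(f"{name}: agent file has no registry entry")
    # Agents registered but with no file behind them.
    for name in sorted(registry_entries - agent_files):
        errors.append(f"{name}: registry entry has no agent file")
    return errors


# Hypothetical names, used only for illustration.
files = {"code-critic", "test-author", "fake-agent"}
registry = {"code-critic", "test-author", "report-writer"}
for message in bijection_errors(files, registry):
    print(message)
```

An empty list means the two sides match one-to-one, which is the condition FR-001 is described as enforcing.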
uv run swarm/tools/selftest.py --plan # See the 16 steps
make selftest # Run them
make selftest-doctor # Diagnose issues if any

The selftest has 3 tiers:
| Tier | Steps | Meaning |
|---|---|---|
| KERNEL | 1 step | Python lint + compile — must pass |
| GOVERNANCE | 13 steps | Agents, skills, flows, BDD, policy, stepwise, wisdom — should pass |
| OPTIONAL | 2 steps | Coverage thresholds, extras — nice to have |
The swarm protects itself with three complementary governance layers. Know which one to use:
1. Validator (validate_swarm.py): Static checks on metadata
uv run swarm/tools/validate_swarm.py --json | jq .summary

What it checks: Agent ↔ registry alignment, frontmatter schema, color matching, flow references, RUN_BASE paths.
When: Before committing changes to .claude/ or swarm/ directories.
Output: summary.status is PASS or FAIL. When it fails, it tells you exactly which agent, flow, or field caused the problem.
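
If you want to gate a script on the validator result rather than read it by eye, something like the following may help. It is a minimal sketch that relies only on the `summary.status` field described above; the helper name `validator_passed` and the sample payload are hypothetical, and the real `--json` report likely carries more fields than shown here.

```python
import json


def validator_passed(report_text: str) -> bool:
    """Return True when the report's summary.status is PASS."""
    report = json.loads(report_text)
    return report["summary"]["status"] == "PASS"


# Hypothetical payload: only summary.status is taken from the text above.
sample = '{"summary": {"status": "FAIL"}}'
print(validator_passed(sample))
```

In practice the input would be the captured output of `uv run swarm/tools/validate_swarm.py --json`.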
2. Selftest (selftest.py): Dynamic repo health
uv run swarm/tools/selftest.py --plan # Show plan
uv run swarm/tools/selftest.py # Full check
uv run swarm/tools/selftest.py --degraded # Work around GOVERNANCE failures

What it checks: Python tooling (ruff, compile), agent/flow/skill integrity, BDD structure, OPA policies, development experience contracts, graph connectivity.
When: Every build/CI, or before submitting work.
Exit codes:
- `0` = All checks passed (strict mode)
- `1` = KERNEL or GOVERNANCE failure (strict), or KERNEL failure (degraded)
- `2` = Configuration error
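
The exit-code rules above can be read as a small decision function. The sketch below is one interpretation, not the logic in `selftest.py`. It assumes two things the text does not state outright: that OPTIONAL failures never change the exit code, and that degraded mode returns `0` when only GOVERNANCE steps fail. The function name and parameters are hypothetical.

```python
def selftest_exit_code(
    failed_tiers: set[str],
    degraded: bool = False,
    config_error: bool = False,
) -> int:
    """Map failed tiers and run mode to an exit code (illustrative only)."""
    if config_error:
        return 2
    # Assumption: degraded mode tolerates GOVERNANCE failures, and
    # OPTIONAL failures are never blocking in either mode.
    blocking = {"KERNEL"} if degraded else {"KERNEL", "GOVERNANCE"}
    return 1 if failed_tiers & blocking else 0


print(selftest_exit_code({"GOVERNANCE"}))                 # strict mode
print(selftest_exit_code({"GOVERNANCE"}, degraded=True))  # degraded mode
```

Under these assumptions, the same GOVERNANCE failure blocks a strict run but not a degraded one, while a KERNEL failure blocks both.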
3. Flow Studio (Visual status): Real-time artifact verification
make flow-studio
# Open http://localhost:5000
# Click "Governance" strip in header (when implemented) to see status

What it shows: Flow shapes, agent allocation, step timing, FR status overlays, degradation alerts.
When: After a run completes, to visualize what passed/failed and why.
Output: Interactive graph + artifact browser. When selftest reports degradations, Flow Studio flags them on nodes.
When you want to know if things are broken:
- Is it a typo/schema error? → Run `validate_swarm.py` (3–5 seconds)
- Is the system healthy? → Run `kernel-smoke.py` (0.3–0.5 seconds)
- What's the full story? → Run `selftest.py` (10–30 seconds)
- Visualize what happened? → Open Flow Studio and inspect the run
Related docs:
- `docs/VALIDATION_WALKTHROUGH.md` → Teaching walkthrough (learn by example)
- `CLAUDE.md` → Validation section (detailed reference, error messages)
- `docs/SELFTEST_SYSTEM.md` → Tier descriptions, troubleshooting, AC traceability
- `docs/SELFTEST_DEVELOPER_WORKFLOW.md` → Developer guide (local testing, CI, debugging)
- `docs/SELFTEST_OWNERSHIP.md` → Ownership (maintainer contact, escalation, decision log)
- `docs/SELFTEST_OBSERVABILITY_SPEC.md` → Observability (metrics, logging contracts)
- `docs/SELFTEST_AC_MATRIX.md` → AC-to-step mapping (which AC is tested where)
- `docs/OPERATOR_CHECKLIST.md` → Operator runbook (health checks, troubleshooting, escalation)
- `docs/RECONCILIATION.md` → Spec-to-reality alignment (pytest + Gherkin)
Goal: Know what make dev-check actually validates.
See DEMO_RUN.md → Hands-On Tasks for three exercises:
- Change an agent model — Edit config, regenerate, verify
- Break the validator — Introduce a color mismatch, see the error, fix it
- Explore Flow Studio — Match the graph to the spec
These are small, reversible, and teach the key patterns.
- The 7 flows and their shapes (Signal light, Build heavy, Review/Gate/Deploy/Wisdom lean)
- How agents fit into steps (config → adapter → invocation)
- How validation works (FR-001..005, selftest tiers)
- How to make changes safely (edit config, regenerate, validate)
Everything beyond this is deepening, not new patterns.
| Command | What it does |
|---|---|
| `make dev-check` | Validate + kernel smoke test |
| `make demo-run` | Populate demo artifacts |
| `make flow-studio` | Visual graph of flows |
| `make validate-swarm` | Full FR-001..005 validation |
| `make selftest` | Run all 16 selftest steps |
| `make gen-adapters` | Regenerate agent .md from config |
| `make gen-flows` | Regenerate flow docs from config |
Flow Studio supports URL parameters for direct navigation:
| Parameter | Description | Example |
|---|---|---|
| `mode` | `author` or `operator` | `?mode=operator` |
| `run` | Select a run | `?run=demo-health-check` |
| `flow` | Select a flow | `?flow=build` |
| `step` | Select a step (requires `flow`) | `?flow=build&step=implement` |
| `view` | `agents` or `artifacts` | `?view=artifacts` |
| `tour` | Start a guided tour | `?tour=walk-build` |
| `tab` | Open details tab | `?tab=run` |
Example URLs:
- Build flow in author mode: `http://localhost:5000/?flow=build&run=demo-health-check`
- Gate flow in operator mode: `http://localhost:5000/?mode=operator&flow=gate&run=demo-health-check`
- Start the governance tour: `http://localhost:5000/?tour=governance-path`
- View artifacts graph: `http://localhost:5000/?flow=build&view=artifacts&run=demo-health-check`
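
If you generate these links from scripts or runbooks, a small helper keeps the query strings consistent. The sketch below builds deep links from the parameters in the table above using the standard library. The helper name `flow_studio_url` is hypothetical, and the only rule it enforces is the one the table states: `step` requires `flow`.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:5000/"


def flow_studio_url(**params: str) -> str:
    """Build a Flow Studio deep link from URL parameters."""
    # The parameter table notes that step only makes sense with flow.
    if "step" in params and "flow" not in params:
        raise ValueError("step requires flow")
    if not params:
        return BASE_URL
    # urlencode preserves keyword order and escapes values as needed.
    return f"{BASE_URL}?{urlencode(params)}"


print(flow_studio_url(flow="build", run="demo-health-check"))
```

Parameters are emitted in the order they are passed, so the result can be made to match the example URLs character for character.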
| Path | Purpose |
|---|---|
| `swarm/config/flows/*.yaml` | Flow definitions (source of truth) |
| `swarm/config/agents/*.yaml` | Agent definitions (source of truth) |
| `swarm/flows/flow-*.md` | Flow specs (generated + prose) |
| `.claude/agents/*.md` | Agent adapters (generated) |
| `swarm/AGENTS.md` | Agent registry |
| `swarm/runs/demo-health-check/` | Demo run artifacts |
| `docs/FLOW_STUDIO.md` | Flow Studio documentation |
| `docs/AGENT_OPS.md` | Agent management guide |
- `DEMO_RUN.md` - See it work
- `docs/WHY_DEMO_SWARM.md` - Understand the ideas
- `docs/VALIDATION_STORY.md` - Why validation matters (1-2 page story)
- `docs/VALIDATION_WALKTHROUGH.md` - Learn validation through a realistic scenario
- `docs/FLOW_STUDIO.md` - Flow Studio reference
- `docs/SELFTEST_SYSTEM.md` - Governance tiers, AC traceability, Gherkin-to-pytest mapping
- `docs/SELFTEST_DEVELOPER_WORKFLOW.md` - Local testing, CI integration, debugging guide
- `docs/SELFTEST_OWNERSHIP.md` - Maintainer contact, escalation paths, decision log
- `docs/SELFTEST_OBSERVABILITY_SPEC.md` - Metrics, logging, and observability contracts
- `docs/SELFTEST_AC_MATRIX.md` - Complete AC-to-test traceability
- `docs/OPERATOR_CHECKLIST.md` - Operator runbook & troubleshooting
- `docs/AGENT_OPS.md` - Agent management guide
- `CLAUDE.md` - Full reference
- `ARCHITECTURE.md` - v3.0 system architecture (cognitive hierarchy, components)
- `docs/AGOPS_MANIFESTO.md` - AgOps operational philosophy (the "why" behind the design)
- `docs/ROADMAP_3_0.md` - v3.0 roadmap and immediate priorities