Cardo turns Claude Code into a full engineering organization: 11 specialized agents coordinate through a 9-phase delivery lifecycle to take GitHub issues from backlog to merged PR — with architecture review, security threat modeling, internal code review, and human-in-the-loop confidence gates at every step.
Proven in production: 11 sprints · 54 issues shipped · 19 PRs merged · $2–8 per issue estimated API cost
You open Claude Code and type:
/sprint-plan
Cardo queries your GitHub backlog, scores issues by RICE, presents a prioritized sprint plan, and asks for your approval. You approve. Then:
/implement 42
The engineering-manager agent kicks off the 9-phase workflow, whose main stops are: spec → architecture review → security assessment → implementation (with TDD) → internal code review → PR. You get a PR with a structured summary and an acceptance-criteria (AC) evidence table. Review it, merge it.
That's the loop. One human, 11 agents, one sprint at a time.
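The RICE prioritization that /sprint-plan performs can be sketched as follows. This is a minimal illustration of the standard RICE formula (reach × impact × confidence ÷ effort), not Cardo's actual implementation; the `Issue` fields, the sample numbers, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    number: int
    reach: float       # users affected per period (hypothetical unit)
    impact: float      # e.g. 0.25 (minimal) up to 3 (massive)
    confidence: float  # 0.0 to 1.0
    effort: float      # person-weeks, must be positive


def rice_score(issue: Issue) -> float:
    """Standard RICE: (reach * impact * confidence) / effort."""
    if issue.effort <= 0:
        raise ValueError("effort must be positive")
    return issue.reach * issue.impact * issue.confidence / issue.effort


def prioritize(issues: list[Issue]) -> list[Issue]:
    """Return issues sorted from highest to lowest RICE score."""
    return sorted(issues, key=rice_score, reverse=True)


# Made-up backlog, for illustration only.
backlog = [
    Issue(number=42, reach=500, impact=2, confidence=0.8, effort=2),
    Issue(number=43, reach=200, impact=1, confidence=1.0, effort=1),
    Issue(number=44, reach=1000, impact=3, confidence=0.5, effort=5),
]
for issue in prioritize(backlog):
    print(f"#{issue.number}: {rice_score(issue):.0f}")
```

Dividing by effort is what lets a small, certain fix outrank a large, speculative one, which is why the ordering here is not simply by reach.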
API cost runs ~$2–8 per issue depending on complexity and gate configuration:
| Profile | Per issue | Per sprint (6 issues) | Monthly (4 sprints) |
|---|---|---|---|
| Conservative (all advisory gates) | $4–8 | $24–48 | $96–192 |
| Balanced (arch/security advisory, code_review autonomous) | $3–6 | $18–36 | $72–144 |
| Permissive (shadow gates, code_review autonomous) | $2–4 | $12–24 | $48–96 |
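The per-sprint and monthly columns are straight multiples of the per-issue range. A small sketch of that arithmetic, assuming the table's 6 issues per sprint and 4 sprints per month (the function name is hypothetical):

```python
def project_cost(per_issue_low, per_issue_high,
                 issues_per_sprint=6, sprints_per_month=4):
    """Scale a per-issue cost range (USD) to per-sprint and monthly ranges."""
    per_sprint = (per_issue_low * issues_per_sprint,
                  per_issue_high * issues_per_sprint)
    monthly = (per_sprint[0] * sprints_per_month,
               per_sprint[1] * sprints_per_month)
    return {"per_sprint": per_sprint, "monthly": monthly}


# Per-issue ranges taken from the table above.
for profile, (low, high) in {
    "Conservative": (4, 8),
    "Balanced": (3, 6),
    "Permissive": (2, 4),
}.items():
    print(profile, project_cost(low, high))
```

Adjust `issues_per_sprint` and `sprints_per_month` to estimate a budget for your own cadence.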
1. Clone and configure
git clone https://github.com/mohammad-shaddad/cardo
cd cardo
cp cardo.config.example.yaml cardo.config.yaml
# Edit cardo.config.yaml: set your repo, owner, package manager, test/lint commands
2. Install into your project
./install.sh /path/to/your-project
This copies .claude/, .product/, .architecture/, and docs/ into your project and replaces all placeholder tokens with your config values.
3. Plan and run your first sprint
Open Claude Code in your project directory and type:
/sprint-plan
/implement N
That's it. The first autonomous PR will be open within the hour.
If an agent does something unexpected, stop everything in under 30 seconds:
gh workflow run emergency-stop.yml -f action=stop -f reason="Agent misbehaving on #42"
This sets CARDO_PAUSED=true, cancels all running workflows, adds a paused label to every open agent-ready issue, and blocks all new agent activity. All event-driven workflows check this flag before running.
When you're ready to resume:
gh workflow run emergency-stop.yml -f action=resume -f reason="Issue resolved"
Every gate decision is logged to .cardo/history.jsonl:
{"timestamp":"...","gate":"code_review","issue":"#42","sprint":3,"score":82,"threshold":70,"decision":"auto_approve","mode":"autonomous"}
Query it with cardo audit report to see every autonomous decision made, with scores, thresholds, and outcomes. You can reconstruct exactly what happened on any issue.
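Because the log is plain JSONL, it can also be inspected with a few lines of Python. The sketch below is not the cardo audit report implementation; it only assumes the record shape shown in the sample line above, and the helper names are hypothetical.

```python
import json
from typing import Iterable


def parse_history(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL input: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def decisions_for_issue(records: list[dict], issue: str) -> list[dict]:
    """Return every gate decision recorded for one issue, in log order."""
    return [r for r in records if r.get("issue") == issue]


# The sample record from the README; the timestamp is elided there too.
sample = [
    '{"timestamp":"...","gate":"code_review","issue":"#42","sprint":3,'
    '"score":82,"threshold":70,"decision":"auto_approve","mode":"autonomous"}',
]
records = parse_history(sample)
for r in decisions_for_issue(records, "#42"):
    print(r["gate"], r["score"], r["threshold"], r["decision"])
```

To run it against a real log, pass an open file handle for .cardo/history.jsonl to `parse_history` instead of the sample list.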
When architecture or security gates flag a concern, the gate scores the issue, logs its recommendation to .cardo/history.jsonl, and the EM agent surfaces the finding in-session:
Architecture gate: score 68 / threshold 75
Recommendation: escalate (pattern precedent not established; touches 12 files)
Approve to proceed, or stop here for human review?
Nothing moves forward on flagged issues until you decide.
Every agent-ready issue runs through:
- Intake — parse the GitHub issue, extract hypothesis card and acceptance criteria
- Specification — PO agent expands the hypothesis into a full product spec
- Architecture — architect reviews approach, files touched, sensitive areas; gate evaluates
- Security — threat model via STRIDE (only when issue touches auth, APIs, or data flows); gate evaluates
- Design — UI/UX specs (only for issues with visual components)
- Implementation — engineer agent writes failing tests first, then implementation
- Internal review — code-reviewer checks every AC, test coverage, and privacy impact
- Build & verify — full test suite, lint, type-check must pass
- PR — structured PR with AC evidence table, implementation challenges, scope changes
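Two of the nine phases are conditional, so the phases an issue actually runs depend on what it touches. A minimal sketch of that selection, assuming two boolean flags per issue; the phase identifiers and the function name are hypothetical, not Cardo's internal names.

```python
# The nine phases, in the order listed above.
PHASES = [
    "intake",
    "specification",
    "architecture",
    "security",         # only when the issue touches auth, APIs, or data flows
    "design",           # only for issues with visual components
    "implementation",
    "internal_review",
    "build_verify",
    "pr",
]


def phases_for(touches_sensitive: bool, has_visual: bool) -> list[str]:
    """Return the phases an issue runs, dropping the conditional ones."""
    skipped = set()
    if not touches_sensitive:
        skipped.add("security")
    if not has_visual:
        skipped.add("design")
    return [phase for phase in PHASES if phase not in skipped]


# A backend refactor with no auth, API, or UI impact skips both.
print(phases_for(touches_sensitive=False, has_visual=False))
```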
Gates sit between phases and decide whether to auto-approve, recommend, or block:
| Gate | Default mode | What it checks |
|---|---|---|
| architecture | advisory | Friction score, pattern precedent, files touched, sensitive areas, new deps |
| security | advisory | Finding severity, boundary crossings, data classification, mitigation status |
| code_review | autonomous | Blockers, warnings, test pass, lint, diff size, coverage delta, file risk |
| acceptance | shadow | AC completeness, regression count — logged but never blocks |
Modes: autonomous = auto-approve when score ≥ threshold; advisory = recommend + surface in-session for human decision; shadow = compute and log, always proceed.
Edit .claude/confidence/gates.yaml to tune each gate's mode and threshold for your team's risk profile.
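The three modes can be read as a small decision function. The sketch below is an interpretation of the mode descriptions above, not Cardo's code. Two points are assumptions: an autonomous gate that scores below its threshold is treated as an escalation that needs a human, and an advisory gate always waits for a human even when its recommendation is to approve. The "approve" label and the function name are hypothetical.

```python
def evaluate_gate(mode: str, score: int, threshold: int) -> tuple[str, bool]:
    """Return (recommendation, needs_human) for one gate evaluation."""
    passed = score >= threshold
    if mode == "autonomous":
        # Auto-approve at or above threshold; otherwise escalate (assumption).
        return ("auto_approve", False) if passed else ("escalate", True)
    if mode == "advisory":
        # Recommend, then surface in-session for a human decision.
        return ("approve" if passed else "escalate", True)
    if mode == "shadow":
        # Compute and log only; the workflow always proceeds.
        return ("approve" if passed else "escalate", False)
    raise ValueError(f"unknown gate mode: {mode}")


# The architecture example above: score 68 against threshold 75, advisory mode.
print(evaluate_gate("advisory", 68, 75))
# The logged code_review example: score 82 against threshold 70, autonomous mode.
print(evaluate_gate("autonomous", 82, 70))
```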
| Agent | Role |
|---|---|
| engineering-manager | Orchestrates all 9 phases; routes work; escalates blockers |
| product-owner | Product specs, acceptance criteria, RICE scoring, content tasks |
| architect | Architecture decisions, tech specs, ADRs |
| security-engineer | STRIDE threat modeling, OWASP compliance |
| code-reviewer | Pre-PR review — every AC must have evidence |
| platform-engineer | CI/CD, GitHub Actions, infrastructure |
| mobile-engineer | Mobile implementation (TDD) |
| web-engineer | Web implementation (TDD) |
| backend-engineer | Backend/API implementation (TDD) |
| designer | UI/UX specs, component design |
| sre-engineer | Incident response, SLO monitoring |
Not every project needs every agent. The EM routes to the appropriate agents based on what the issue touches. A CLI tool project never invokes the designer or mobile-engineer.
All commands are invoked directly in a Claude Code session:
| Command | What it does |
|---|---|
| `/sprint-plan` | Select issues, create milestone, set up worktrees |
| `/sprint-execute` | Dispatch all sprint issues in parallel (one PR each) |
| `/implement N` | Implement issue #N end-to-end (9 phases) |
| `/review N` | Code review PR #N against acceptance criteria |
| `/merge N` | Merge PR #N, close issue, clean up worktree |
| `/sprint-close` | Sprint retrospective, archival, carry-over |
| `/status` | Executive status report + dashboard refresh |
| `/reprioritize` | RICE-based backlog re-prioritization (tri-agent) |
| `/bug <desc>` | Report a bug: tri-agent assessment, RICE score, GitHub issue |
| `/feature <desc>` | Request a feature: tri-agent assessment, RICE score, GitHub issue |
| `/analyze` | Observability analysis: health report with action items |
After running ./install.sh /path/to/your-project, Cardo adds:
your-project/
├── .claude/
│ ├── agents/ # 11 subagent definitions
│ ├── skills/ # Sprint, review, status, and more
│ ├── confidence/ # gates.yaml (pre-configured, edit to tune)
│ ├── guides/ # Commit rules, session management
│ └── rules/ # Path-scoped workflow protection
├── .product/
│ ├── backlog.md # Issue backlog with RICE scores
│ └── rice-scores.md # Prioritized scoring sheet
├── .architecture/ # ADRs, tech specs, security assessments
├── docs/ # Status dashboard and reports
└── CLAUDE.md # Project instructions (generated from your config)
Edit cardo.config.yaml before running install.sh:
project:
  name: "My App"
  spec_prefix: "APP"
  app_dir: "src/"
github:
  owner: "my-org"
  repo: "my-app"
commands:
  pkg_manager: "bun" # npm | yarn | pnpm | bun
  test: "bun test"
  lint: "bun run lint"
  typecheck: "bun run typecheck"
All autonomous workflows check CARDO_PAUSED at startup. The pattern is enforced in .claude/rules/workflow-protection.md; any new workflow must follow it.
# Every autonomous workflow starts with:
- name: Check pause state
  run: |
    if [ "${{ vars.CARDO_PAUSED }}" = "true" ]; then
      echo "Cardo is paused. Skipping." && exit 0
    fi
The pause/resume workflow (emergency-stop.yml) is the only way to change CARDO_PAUSED. It requires a human to trigger it via GitHub Actions UI or CLI.
Cardo was developed for sama3, an Arabic-first podcast app built with React Native/Expo. Over 11 sprints and 54 shipped issues, the framework evolved from ad-hoc agent coordination into a structured, repeatable delivery system.
This repo is the extraction of that framework into a reusable pack for any Claude Code project.