mohammad-shaddad/cardo

Cardo — Autonomous Sprint Delivery for Claude Code

Cardo turns Claude Code into a full engineering organization: 11 specialized agents coordinate through a 9-phase delivery lifecycle to take GitHub issues from backlog to merged PR, with architecture review, security threat modeling, internal code review, and human-in-the-loop confidence gates between phases.

Proven in production: 11 sprints · 54 issues shipped · 19 PRs merged · $2–8 per issue estimated API cost


What it looks like

You open Claude Code and type:

/sprint-plan

Cardo queries your GitHub backlog, scores issues by RICE, presents a prioritized sprint plan, and asks for your approval. You approve. Then:

/implement 42

The engineering-manager agent kicks off the 9-phase workflow, which runs from spec through architecture review, security assessment, implementation (with TDD), and internal code review to a PR. You get a PR with a structured summary and an acceptance-criteria (AC) evidence table. Review it, merge it.

That's the loop. One human, 11 agents, one sprint at a time.


The cost model

API cost runs ~$2–8 per issue depending on complexity and gate configuration:

| Profile | Per issue | Per sprint (6 issues) | Monthly (4 sprints) |
|---|---|---|---|
| Conservative (all advisory gates) | $4–8 | $24–48 | $96–192 |
| Balanced (arch/security advisory, code_review autonomous) | $3–6 | $18–36 | $72–144 |
| Permissive (shadow gates, code_review autonomous) | $2–4 | $12–24 | $48–96 |
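
The per-sprint and monthly columns follow from the per-issue range by multiplication (6 issues per sprint, 4 sprints per month). A minimal sketch of that arithmetic, useful for projecting a different sprint size; `project_cost` and `PROFILES` are hypothetical names for illustration, not part of Cardo:

```python
def project_cost(per_issue, issues_per_sprint=6, sprints_per_month=4):
    """Scale a (low, high) per-issue cost range to per-sprint and monthly ranges."""
    low, high = per_issue
    per_sprint = (low * issues_per_sprint, high * issues_per_sprint)
    monthly = (per_sprint[0] * sprints_per_month, per_sprint[1] * sprints_per_month)
    return per_sprint, monthly


# Per-issue ranges in USD, taken from the table above.
PROFILES = {"conservative": (4, 8), "balanced": (3, 6), "permissive": (2, 4)}

for name, per_issue in PROFILES.items():
    per_sprint, monthly = project_cost(per_issue)
    print(f"{name}: ${per_sprint[0]}-{per_sprint[1]} per sprint, "
          f"${monthly[0]}-{monthly[1]} per month")
```

Passing a different `issues_per_sprint` gives a rough budget for larger or smaller sprints, under the assumption that cost scales linearly with issue count.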

60-second quickstart

1. Clone and configure

git clone https://github.com/mohammad-shaddad/cardo
cd cardo
cp cardo.config.example.yaml cardo.config.yaml
# Edit cardo.config.yaml: set your repo, owner, package manager, test/lint commands

2. Install into your project

./install.sh /path/to/your-project

This copies .claude/, .product/, .architecture/, and docs/ into your project and replaces all placeholder tokens with your config values.

3. Plan and run your first sprint

Open Claude Code in your project directory and type:

/sprint-plan
/implement N

That's it. The first autonomous PR will be open within the hour.


What happens when it goes wrong

Emergency stop

If an agent does something unexpected, stop everything in under 30 seconds:

gh workflow run emergency-stop.yml -f action=stop -f reason="Agent misbehaving on #42"

This sets CARDO_PAUSED=true, cancels all running workflows, adds a paused label to every open agent-ready issue, and blocks all new agent activity. All event-driven workflows check this flag before running.

When you're ready to resume:

gh workflow run emergency-stop.yml -f action=resume -f reason="Issue resolved"

Audit trail

Every gate decision is logged to .cardo/history.jsonl:

{"timestamp":"...","gate":"code_review","issue":"#42","sprint":3,"score":82,"threshold":70,"decision":"auto_approve","mode":"autonomous"}

Query it with cardo audit report to see every autonomous decision made, with scores, thresholds, and outcomes. You can reconstruct exactly what happened on any issue.
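
Because the log is plain JSON Lines, it can also be inspected without the CLI. The following is a rough sketch, assuming every line carries the fields shown in the sample entry above; `load_decisions` and `summarize` are hypothetical helpers, not part of Cardo:

```python
import json
from collections import Counter


def load_decisions(lines):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize(records, issue=None):
    """Count (gate, decision) pairs, optionally restricted to one issue such as "#42"."""
    counts = Counter()
    for rec in records:
        if issue is None or rec.get("issue") == issue:
            counts[(rec.get("gate"), rec.get("decision"))] += 1
    return counts


# The sample entry from above, used in place of a real log file.
sample = [
    '{"timestamp":"...","gate":"code_review","issue":"#42","sprint":3,'
    '"score":82,"threshold":70,"decision":"auto_approve","mode":"autonomous"}',
]
records = load_decisions(sample)
print(summarize(records, issue="#42"))
```

To read a real log, pass an open file handle instead of the sample list, for example `load_decisions(open(".cardo/history.jsonl"))`.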

Advisory gates — you stay in the loop

When the architecture or security gate flags a concern, it scores the issue and logs its recommendation to .cardo/history.jsonl, and the engineering-manager (EM) agent surfaces the finding in-session:

Architecture gate: score 68 / threshold 75
Recommendation: escalate (pattern precedent not established; touches 12 files)
Approve to proceed, or stop here for human review?

Nothing moves forward on flagged issues until you decide.


How it works

The 9-phase workflow

Every agent-ready issue runs through:

  1. Intake — parse the GitHub issue, extract hypothesis card and acceptance criteria
  2. Specification — PO agent expands the hypothesis into a full product spec
  3. Architecture — architect reviews approach, files touched, sensitive areas; gate evaluates
  4. Security — threat model via STRIDE (only when issue touches auth, APIs, or data flows); gate evaluates
  5. Design — UI/UX specs (only for issues with visual components)
  6. Implementation — engineer agent writes failing tests first, then implementation
  7. Internal review — code-reviewer checks every AC, test coverage, and privacy impact
  8. Build & verify — full test suite, lint, type-check must pass
  9. PR — structured PR with AC evidence table, implementation challenges, scope changes
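
Two of the nine phases are conditional: security runs only when the issue touches auth, APIs, or data flows, and design runs only for issues with visual components. A minimal sketch of that selection logic, assuming the issue has already been classified into two boolean flags; the flag names `touches_sensitive` and `has_visual` and the phase identifiers are hypothetical, not taken from Cardo's code:

```python
PHASES = [
    "intake", "specification", "architecture", "security", "design",
    "implementation", "internal_review", "build_verify", "pr",
]


def phases_for(issue):
    """Return the ordered phases to run for an issue described by two boolean flags."""
    skipped = set()
    if not issue.get("touches_sensitive", False):
        skipped.add("security")  # no auth, API, or data-flow changes
    if not issue.get("has_visual", False):
        skipped.add("design")    # no visual components
    return [phase for phase in PHASES if phase not in skipped]


print(phases_for({"touches_sensitive": True, "has_visual": False}))
```

An issue with neither flag set runs seven phases; one with both set runs all nine.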

Confidence gates

Gates sit between phases and decide whether to auto-approve, recommend, or block:

| Gate | Default mode | What it checks |
|---|---|---|
| architecture | advisory | Friction score, pattern precedent, files touched, sensitive areas, new deps |
| security | advisory | Finding severity, boundary crossings, data classification, mitigation status |
| code_review | autonomous | Blockers, warnings, test pass, lint, diff size, coverage delta, file risk |
| acceptance | shadow | AC completeness, regression count (logged but never blocks) |

Modes: autonomous = auto-approve when score ≥ threshold; advisory = recommend + surface in-session for human decision; shadow = compute and log, always proceed.

Edit .claude/confidence/gates.yaml to tune each gate's mode and threshold for your team's risk profile.
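
The three modes can be read as a small decision function. The sketch below is one interpretation, not Cardo's implementation. It assumes that an autonomous gate escalates to a human when the score falls below the threshold (the README does not say what happens in that case) and that an advisory gate always waits for a human decision. Only `auto_approve` and `escalate` appear in the README; the other decision labels are hypothetical:

```python
def evaluate_gate(mode, score, threshold):
    """Return (decision, needs_human) for one gate evaluation."""
    passed = score >= threshold
    if mode == "shadow":
        # Compute and log only; a shadow gate never blocks.
        return ("pass" if passed else "fail", False)
    if mode == "autonomous":
        # Assumption: a score below the threshold hands the decision to a human.
        return ("auto_approve", False) if passed else ("escalate", True)
    if mode == "advisory":
        # Assumption: the recommendation is always surfaced for a human decision.
        return ("recommend_approve" if passed else "escalate", True)
    raise ValueError(f"unknown gate mode: {mode}")


print(evaluate_gate("autonomous", 82, 70))
print(evaluate_gate("advisory", 68, 75))
```

The two calls mirror the examples earlier in this README: the code_review log entry (score 82, threshold 70) and the architecture gate prompt (score 68, threshold 75).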

The agent roster

| Agent | Role |
|---|---|
| engineering-manager | Orchestrates all 9 phases; routes work; escalates blockers |
| product-owner | Product specs, acceptance criteria, RICE scoring, content tasks |
| architect | Architecture decisions, tech specs, ADRs |
| security-engineer | STRIDE threat modeling, OWASP compliance |
| code-reviewer | Pre-PR review; every AC must have evidence |
| platform-engineer | CI/CD, GitHub Actions, infrastructure |
| mobile-engineer | Mobile implementation (TDD) |
| web-engineer | Web implementation (TDD) |
| backend-engineer | Backend/API implementation (TDD) |
| designer | UI/UX specs, component design |
| sre-engineer | Incident response, SLO monitoring |

Not every project needs every agent. The EM routes to the appropriate agents based on what the issue touches. A CLI tool project never invokes the designer or mobile-engineer.


Sprint commands

All commands are invoked directly in a Claude Code session:

| Command | What it does |
|---|---|
| /sprint-plan | Select issues, create milestone, set up worktrees |
| /sprint-execute | Dispatch all sprint issues in parallel (one PR each) |
| /implement N | Implement issue #N end-to-end (9 phases) |
| /review N | Code review PR #N against acceptance criteria |
| /merge N | Merge PR #N, close issue, clean up worktree |
| /sprint-close | Sprint retrospective, archival, carry-over |
| /status | Executive status report + dashboard refresh |
| /reprioritize | RICE-based backlog re-prioritization (tri-agent) |
| /bug <desc> | Report a bug: tri-agent assessment, RICE score, GitHub issue |
| /feature <desc> | Request a feature: tri-agent assessment, RICE score, GitHub issue |
| /analyze | Observability analysis: health report with action items |
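
Several of these commands rank work by RICE score. This README does not spell out how Cardo computes it, so the sketch below uses the commonly cited form of the formula (reach × impact × confidence ÷ effort) as an assumption. The function name and the backlog numbers are invented for illustration:

```python
def rice_score(reach, impact, confidence, effort):
    """Standard RICE: (reach * impact * confidence) / effort."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return reach * impact * confidence / effort


# Made-up example inputs: issue -> (reach, impact, confidence, effort).
backlog = {
    "#41": (500, 2, 0.5, 4),
    "#42": (200, 3, 1.0, 1),
}

# Highest score first.
ranked = sorted(backlog, key=lambda issue: rice_score(*backlog[issue]), reverse=True)
print(ranked)
```

Here the smaller-reach issue ranks first because its effort is a quarter of the other's and its confidence is higher.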

Installing into your own project

After running ./install.sh /path/to/your-project, Cardo adds:

your-project/
├── .claude/
│   ├── agents/          # 11 subagent definitions
│   ├── skills/          # Sprint, review, status, and more
│   ├── confidence/      # gates.yaml (pre-configured, edit to tune)
│   ├── guides/          # Commit rules, session management
│   └── rules/           # Path-scoped workflow protection
├── .product/
│   ├── backlog.md       # Issue backlog with RICE scores
│   └── rice-scores.md   # Prioritized scoring sheet
├── .architecture/       # ADRs, tech specs, security assessments
├── docs/                # Status dashboard and reports
└── CLAUDE.md            # Project instructions (generated from your config)

Configuring for your stack

Edit cardo.config.yaml before running install.sh:

project:
  name: "My App"
  spec_prefix: "APP"
  app_dir: "src/"

github:
  owner: "my-org"
  repo: "my-app"

commands:
  pkg_manager: "bun"   # npm | yarn | pnpm | bun
  test: "bun test"
  lint: "bun run lint"
  typecheck: "bun run typecheck"

Emergency stop (detail)

All autonomous workflows check CARDO_PAUSED at startup. The pattern is enforced in .claude/rules/workflow-protection.md — any new workflow must follow it.

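# Every autonomous workflow starts with:
- name: Check pause state
  run: |
    if [ "${{ vars.CARDO_PAUSED }}" = "true" ]; then
      echo "Cardo is paused. Skipping." && exit 0
    fi
  # Note: exit 0 ends only this step with success; steps after it still run
  # unless they carry their own condition on CARDO_PAUSED.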

The pause/resume workflow (emergency-stop.yml) is the only way to change CARDO_PAUSED. It requires a human to trigger it via GitHub Actions UI or CLI.


Origin

Cardo was developed for sama3, an Arabic-first podcast app built with React Native/Expo. Over 11 sprints and 54 shipped issues, the framework evolved from ad-hoc agent coordination into a structured, repeatable delivery system.

This repo is the extraction of that framework into a reusable pack for any Claude Code project.

Adoption Guide | Vision | Architecture
