Spec-driven development for Claude Code — turn requirements into reliable software.
🌐 Docs & guides: attune-ai.dev
A spec-driven development platform for Claude Code. Four pillars — AI workflows, project memory, retrieval grounding, and verification — turn requirements into reliable software. 20 workflows (17 multi-stage), 17 auto-triggering skills, and 41 MCP tools run specialist teams of 2–6 Claude subagents that review your code, surface vulnerabilities, generate tests, and plan refactors — grounded in your real source, with findings remembered across sessions. The same system doubles as the authoring and assistance toolkit for building and maintaining knowledge bases at scale.
Managing and creating help content and docs? That's attune-gui, a dedicated Living Docs dashboard wrapping attune-rag, attune-help, and attune-author in a single UI. attune-ai is the developer workflow hub; attune-gui is the docs hub.
| Package | Role | Install |
|---|---|---|
| `attune-ai` | Developer workflow hub (this package) | `pip install attune-ai` |
| `attune-gui` | Living Docs dashboard: create, manage, search help content | standalone app |
| `attune-rag` | RAG pipeline (core dep of attune-ai, v0.6+) | bundled |
| `attune-author` | Help content authoring, staleness detection | `pip install 'attune-ai[author]'` |
| `attune-help` | Progressive-depth template runtime | `pip install attune-help` |
attune-rag ships as a core dependency of attune-ai
(v0.1.11, >=0.1.5,<0.2). attune-help is standalone — not pulled
in by a standard attune-ai install, but available as an optional
corpus for attune-rag via pip install 'attune-rag[attune-help]'.
Say what you need in Claude Code and the right skill activates:

```text
"review my code"     → code-quality skill
"scan for vulns"     → security-audit skill
"generate tests"     → smart-test skill
"plan this feature"  → planning skill
```
No command to remember. Claude reads your intent and picks the skill. Each skill runs a specialist multi-agent team, not a single prompt.
Every workflow dispatches 2–6 subagents in parallel. Each reads your
code with Read, Glob, and Grep. An orchestrator synthesizes
their findings into a unified result:
```text
security-audit → vuln-scanner + secret-detector + auth-reviewer + remediation-planner
code-review    → security + quality + perf + architect
test-gen       → identifier + designer + writer
```
Subagents are assigned models by task complexity — Opus for deep reasoning, Sonnet for analysis, Haiku for fast scanning — keeping cost proportional to value.
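
The fan-out and synthesis pattern described above can be sketched with `asyncio`. This is a minimal illustration, not Attune's implementation: `Finding`, `run_subagent`, and `run_workflow` are hypothetical names, and the subagent body is a stub standing in for a real model call that would read code with Read, Glob, and Grep.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    agent: str
    message: str


async def run_subagent(name: str, target: str) -> list[Finding]:
    # Stub for a real subagent; a real one would read files and call a model.
    await asyncio.sleep(0)
    return [Finding(agent=name, message=f"{name} reviewed {target}")]


async def run_workflow(agents: list[str], target: str) -> list[Finding]:
    # Dispatch every subagent concurrently; gather preserves the input order.
    batches = await asyncio.gather(*(run_subagent(a, target) for a in agents))
    # "Synthesis" here is a plain flatten; a real orchestrator would dedupe and rank.
    return [finding for batch in batches for finding in batch]


if __name__ == "__main__":
    team = ["security", "quality", "perf", "architect"]
    for f in asyncio.run(run_workflow(team, "src/")):
        print(f"[{f.agent}] {f.message}")
```

Because `asyncio.gather` returns results in the order the coroutines were passed, the merged list stays deterministic even though the subagents run concurrently.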
Workflows ask questions before executing, not after. The spec
workflow brainstorms, then plans, then executes. planning clarifies
scope before writing a line of code. This eliminates the most common
failure mode: confidently solving the wrong problem.
attune-rag (core dep) grounds LLM generation in retrieved corpus
passages and enforces citation-per-claim, delivering 0.996 mean
per-claim faithfulness on the benchmark set — over 99% of generated
claims are grounded in their cited passages (under 1% hallucinated
per claim). The conservative per-query bucket rate (a single
ungrounded claim disqualifies the whole response) is 6.7%, down from
46.7% without the citation contract. Retrieved passages are wrapped
in sentinel tags to reduce the risk of prompt injection. The Claude
provider automatically caches the stable RAG context prefix, which
cuts the cost of resending that prefix on repeated calls.
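
To make the citation-per-claim contract and sentinel wrapping concrete, here is a small sketch under stated assumptions. The `<passage id="P1">` tag format, the `[P1]` citation marker, and both function names are hypothetical; this README does not specify the format attune-rag actually uses. The check only confirms that each sentence cites a known passage id, not that the passage supports the claim.

```python
import re

CITATION = re.compile(r"\[P(\d+)\]")


def wrap_passages(passages: list[str]) -> str:
    # Hypothetical sentinel format. A real implementation would also escape
    # any tag-like text inside a passage so it cannot close the sentinel early.
    return "\n".join(
        f'<passage id="P{i}">{text}</passage>'
        for i, text in enumerate(passages, start=1)
    )


def uncited_claims(answer: str, passage_count: int) -> list[str]:
    # Treat each sentence as one claim; it passes only if every cited id
    # is in range and at least one citation is present.
    claims = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    bad = []
    for claim in claims:
        ids = [int(n) for n in CITATION.findall(claim)]
        if not ids or any(n < 1 or n > passage_count for n in ids):
            bad.append(claim)
    return bad
```

Under the conservative per-query rule described above, a response would be rejected as soon as `uncited_claims` returns a non-empty list.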
Most AI coding sessions start from zero. Attune ships a cross-session memory loop — every session ends by stashing its durable findings, and every new session can pull them back:
- **Stash on stop**: a `Stop` hook extracts decisions, bugs, and references from the session (local LLM when available, heuristic fallback) and writes them to the memory store: a local file by default, Redis Agent Memory Server with `pip install 'attune-ai[redis]'`.
- **Recall at the door**: a `SessionStart` hook surfaces the most recent findings for your project, and warns when the memory backend is unreachable instead of degrading silently.
- **Lessons at the trap moment**: a `UserPromptSubmit` hook retrieves your project's engineering lessons (from `.claude/lessons.md` or `CLAUDE.md`) when a prompt hits a known trap; a `PreToolUse` hook surfaces curated rules at the exact tool call they govern.
- **`/recall <topic>`**: on-demand search across both stores, results labeled `[lesson]` vs session finding.
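
The stash-and-recall loop with the default local-file backend could look roughly like the sketch below. The JSON Lines layout, the record fields, and the `stash` / `recall` names are assumptions made for illustration; the README does not document the on-disk format.

```python
import json
import time
from pathlib import Path


def stash(store: Path, project: str, findings: list[dict]) -> None:
    # Append one JSON line per finding so earlier sessions are never rewritten.
    store.parent.mkdir(parents=True, exist_ok=True)
    with store.open("a", encoding="utf-8") as fh:
        for finding in findings:
            record = {"project": project, "ts": time.time(), **finding}
            fh.write(json.dumps(record) + "\n")


def recall(store: Path, project: str, limit: int = 5) -> list[dict]:
    # Return up to `limit` findings for this project, newest first.
    if limit <= 0 or not store.exists():
        return []
    with store.open(encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    mine = [r for r in records if r["project"] == project]
    return mine[-limit:][::-1]
```

An append-only file keeps the stop hook cheap: it never has to parse or rewrite earlier sessions, and recall does the filtering at read time.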
Your memory, your corpus: the loop runs over your project's sessions and lessons file. We dogfood it on our own — 380+ engineering lessons retrieved via attune-rag at P@3 96% (100% on the high-severity subset) on a frozen trap-moment benchmark.
```bash
claude plugin marketplace add Smart-AI-Memory/attune-ai
claude plugin install attune-ai@attune-ai
```

Then say "what can attune do?" in Claude Code.

```bash
pip install attune-ai
```

The core install includes the CLI, all workflows, and the MCP server. See Installation Options for per-surface extras (API-mode agents, ops dashboard, Redis memory).
| Capability | Plugin only | Plugin + pip |
|---|---|---|
| 15 auto-triggering skills | Yes | Yes |
| Security hooks | Yes | Yes |
| Prompt-based analysis | Yes | Yes |
| 41 MCP tools | -- | Yes |
| attune CLI | -- | Yes |
| Multi-agent workflows | -- | Yes |
| Help system maintenance | -- | Yes |
| CI/CD automation | -- | Yes |
| Ops dashboard (attune ops): run history, cost tiles, telemetry | -- | Yes |
Note: Skills use your Claude subscription at no extra cost. CLI and MCP tools make direct Anthropic API calls — API key required. See API Mode.
| Input | What Happens |
|---|---|
| "what can attune do?" | Auto-triggers attune-hub — guided discovery |
| "build this feature from scratch" | Auto-triggers spec — brainstorm, plan, execute |
| "review my code" | Auto-triggers code-quality skill |
| "scan for vulnerabilities" | Auto-triggers security-audit skill |
| "generate tests for src/" | Auto-triggers smart-test skill |
| "fix failing tests" | Auto-triggers fix-test skill |
| "predict bugs" | Auto-triggers bug-predict skill |
| "generate docs" | Auto-triggers doc-gen skill |
| "plan this feature" | Auto-triggers planning skill |
| "refactor this module" | Auto-triggers refactor-plan skill |
| "prepare a release" | Auto-triggers release-prep skill |
| "tell me more" | Auto-triggers coach — progressive depth help |
| "run all workflows" | Auto-triggers workflow-orchestration skill |
| Workflow | Agents | What It Does |
|---|---|---|
| code-review | security, quality, perf, architect | 4-perspective code review |
| security-audit | vuln-scanner, secret-detector, auth-reviewer, remediation | Finds vulnerabilities and generates fix plans |
| deep-review | security, quality, test-gap | Multi-pass deep analysis |
| perf-audit | complexity, bottleneck, optimization | Identifies bottlenecks and O(n²) patterns |
| bug-predict | pattern-scanner, risk-correlator, prevention | Predicts likely failure points |
| health-check | dynamic team (2–6) | Project health across tests, deps, lint, CI, docs, security |
| test-gen | identifier, designer, writer | Writes pytest code for untested functions |
| test-audit | coverage, gap-analyzer, planner | Audits coverage and prioritizes gaps |
| doc-gen | outline, content, polish | Generates documentation from source |
| doc-audit | staleness, accuracy, gap-finder | Finds stale docs and drift |
| dependency-check | inventory, update-advisor | Audits outdated packages and advisories |
| refactor-plan | debt-scanner, impact, plan-generator | Plans large-scale refactors |
| simplify-code | complexity, simplification, safety | Proposes simplifications with safety review |
| release-prep | health, security, changelog, assessor | Go/no-go readiness check |
| doc-orchestrator | inventory, outline, content, polish | Full-project documentation |
| secure-release | security, health, dep-auditor, gater | Release pipeline with risk scoring |
| research-synthesis | summarizer, pattern-analyst, writer | Multi-source research synthesis |
| discovery-sweep | pattern-scanner, verifier | Repo-wide bug-pattern sweep with verification, dashboard chips, and run drill-in |
| rag-code-gen | retriever, generator | Citation-forced code generation grounded in the local attune-help corpus |
| orchestrated-health-check | dynamic team via meta-orchestration | Same intent as health-check with explicit meta-orchestration of the sub-team |
41 tools organized into 5 categories:
```text
security_audit       code_review        bug_predict
performance_audit    refactor_plan      simplify_code
deep_review          test_generation    test_audit
test_gen_parallel    doc_gen            doc_audit
doc_orchestrator     release_prep       health_check
dependency_check     secure_release     research_synthesis
analyze_batch        analyze_image      rag_knowledge_query

help_lookup          help_init          help_status
help_update          help_maintain

memory_store         memory_retrieve    memory_search
memory_forget

personal_memory_capture    personal_memory_recall
personal_memory_topics     personal_memory_forget

auth_status          auth_recommend     telemetry_stats
context_get          context_set        attune_get_level
attune_set_level
```
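
This README lists a rate limit of 60 calls per minute per tool among its security features. One common way to enforce a limit of that shape is a sliding window keyed by tool name, sketched below. The class name and the injectable `clock` parameter are hypothetical, and the README does not say which algorithm the MCP server actually uses.

```python
import time
from collections import defaultdict, deque


class PerToolRateLimiter:
    def __init__(self, max_calls: int = 60, window_s: float = 60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock  # injectable so tests can control time
        self._calls = defaultdict(deque)  # tool name -> timestamps of recent calls

    def allow(self, tool: str) -> bool:
        now = self.clock()
        calls = self._calls[tool]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

Keeping one deque per tool means a burst against `code_review` does not consume the budget of `security_audit`.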
Measured on a 15-query golden set with retrieval held constant. The per-claim faithfulness score (how much of what the model says is grounded in cited passages) is the headline metric. The conservative per-query bucket rate (a single ungrounded claim disqualifies the whole response) is shown alongside for completeness — they measure related-but-different things, and the per-claim number is the right "how trustworthy is each statement" answer:
| Prompt variant | Per-claim faithfulness | Per-query hallucination |
|---|---|---|
| baseline (no grounding rule) | 0.938 | 46.67% |
| strict ("answer only from context") | 0.968 | 26.67% |
| citation (shipped default) | 0.996 | 6.67% |
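
The per-query percentages in the table follow directly from the 15-query set size. The failed-query counts below (7, 4, 1) are inferred from the published percentages rather than stated in the source; they are the only whole numbers out of 15 that reproduce those values.

```python
QUERIES = 15  # size of the golden set stated above


def per_query_rate(failed_queries: int, total: int = QUERIES) -> float:
    # A query counts as failed if any single claim in its response is ungrounded.
    return round(100 * failed_queries / total, 2)


# Inferred counts: 7/15, 4/15, and 1/15 reproduce the table's percentages.
for variant, failed in [("baseline", 7), ("strict", 4), ("citation", 1)]:
    print(f"{variant}: {failed}/{QUERIES} = {per_query_rate(failed)}%")
```

With only 15 queries, each failed response moves the per-query rate by about 6.67 points, which is why this column is much coarser than the per-claim score.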
The gain comes from the prompting contract (citation-per-claim), not from retrieval. Full methodology:
| Bucket | Count | P@1 | Notes |
|---|---|---|---|
| easy | 22 | 22/22 (100%) | feature-name synonyms |
| medium | 26 | 26/26 (100%) | paraphrases + industry terminology |
| hard | 4 | 0/4 (XFAIL) | shared-tag collisions — structural ambiguity |
| | Attune AI | Static Docs | Agent Frameworks | Coding CLIs |
|---|---|---|---|---|
| Ready-to-use workflows | 20 built-in | None | Build from scratch | None |
| Multi-agent teams | 2–6 agents per workflow | None | Yes | No |
| MCP integration | 41 native tools | None | No | No |
| Auto-triggering skills | 15 skills, natural language | None | None | None |
| Socratic discovery | Questions before execution | None | None | None |
| Portable security hooks | PreToolUse + PostToolUse | None | No | No |
pip install attune-ai works out of the box — the CLI, all
workflows, the MCP server, RAG, and the Agent SDK are core
dependencies. Add extras only for the surfaces you use:
| You want | Install |
|---|---|
| Everything most users need | pip install attune-ai |
| Claude API mode + LangChain/LangGraph agent teams | pip install 'attune-ai[developer]' |
| The ops dashboard (attune ops) | pip install 'attune-ai[ops]' |
| Redis / Agent Memory Server memory backend | pip install 'attune-ai[redis]' |
| Help authoring (generate / maintain .help/ templates) | pip install 'attune-ai[author]' |
Extras combine — for example
pip install 'attune-ai[developer,ops,redis]'. Keep the quotes:
zsh and bash treat square brackets as glob characters.
Contributing? Clone and install the dev toolchain instead:

```bash
git clone https://github.com/Smart-AI-Memory/attune-ai.git
cd attune-ai && pip install -e '.[dev]'
```

The `[rag]` extra is a no-op alias kept for backward compatibility; attune-rag is now a core dependency included in every install.
```bash
export ANTHROPIC_API_KEY="sk-ant-..."      # Required
export REDIS_URL="redis://localhost:6379"  # Optional
```

| Model | Agents | Rationale |
|---|---|---|
| Opus | security, vuln, architect | Deep reasoning |
| Sonnet | quality, plan, research | Balanced analysis |
| Haiku | complexity, lint, coverage | Fast scanning |
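
A resolver along the following lines would reproduce the keyword-to-model table, with environment overrides layered on top. Treat it as a sketch under several assumptions: substring matching on the agent name, a Sonnet fallback for unmatched agents, and a general `ATTUNE_AGENT_MODEL_<KEYWORD>` pattern. Only `ATTUNE_AGENT_MODEL_SECURITY` and `ATTUNE_AGENT_MODEL_DEFAULT` appear in this README; the rest is extrapolated, and `resolve_model` is a hypothetical name.

```python
import os

# Defaults copied from the table above. Matching by substring is an assumption.
DEFAULT_MODELS = {
    "security": "opus", "vuln": "opus", "architect": "opus",
    "quality": "sonnet", "plan": "sonnet", "research": "sonnet",
    "complexity": "haiku", "lint": "haiku", "coverage": "haiku",
}


def resolve_model(agent_name: str, env=None) -> str:
    env = os.environ if env is None else env
    for keyword, model in DEFAULT_MODELS.items():
        if keyword in agent_name:
            # Per-keyword override, e.g. ATTUNE_AGENT_MODEL_SECURITY=sonnet.
            # The general pattern is assumed, not documented.
            return env.get(f"ATTUNE_AGENT_MODEL_{keyword.upper()}", model)
    # No keyword matched: use the global default if set, else an assumed Sonnet.
    return env.get("ATTUNE_AGENT_MODEL_DEFAULT", "sonnet")
```

For example, `resolve_model("vuln-scanner", env={})` returns `"opus"`, while setting `ATTUNE_AGENT_MODEL_SECURITY=sonnet` moves any agent whose name contains `security` to Sonnet.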
```bash
export ATTUNE_AGENT_MODEL_SECURITY=sonnet  # Save cost
export ATTUNE_AGENT_MODEL_DEFAULT=opus     # Max quality
```

| Depth | Budget | Use Case |
|---|---|---|
| `quick` | $0.50 | Fast checks |
| `standard` | $2.00 | Normal analysis (default) |
| `deep` | $5.00 | Thorough multi-pass review |
```bash
export ATTUNE_MAX_BUDGET_USD=10.0  # Override
```

One-flag cheap mode for pattern-matching workflows (forces every inherit-default subagent onto Haiku; security/architect/plan/quality keywords still get their pinned model):

```bash
attune workflow run bug-predict --cheap    # Haiku-default subagents
attune workflow run refactor-plan --cheap
```

See your spend live on the dashboard (`attune ops` → home): today / 7-day / MTD / 30-day tiles fed from the Anthropic admin cost-report API.
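
The depth budgets and the `ATTUNE_MAX_BUDGET_USD` override described in this section might combine as in the sketch below. The precedence (the environment variable replaces the depth default outright) is an assumption, and `budget_for` is a hypothetical helper, not part of the attune CLI.

```python
import os

# Per-depth caps in USD, copied from the depth table in this section.
DEPTH_BUDGET_USD = {"quick": 0.50, "standard": 2.00, "deep": 5.00}


def budget_for(depth: str = "standard", env=None) -> float:
    env = os.environ if env is None else env
    override = env.get("ATTUNE_MAX_BUDGET_USD")
    if override is not None:
        # Assumed precedence: an explicit override wins over the depth default.
        return float(override)
    return DEPTH_BUDGET_USD[depth]
```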
- Path traversal protection on all file operations (CWE-22)
- Memory ownership checks (`created_by` validation)
- MCP rate limiting (60 calls/min per tool)
- Hook import restriction (`attune.*` modules only)
- PreToolUse security guard (blocks eval/exec, path traversal)
- Prompt input sanitization (backticks, control chars, truncation)
- PII scrubbing in telemetry
- Automated security scanning (CodeQL, bandit, detect-secrets)
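
As an illustration of the first item, a typical CWE-22 guard resolves the requested path and refuses anything that lands outside the workspace root. This is a generic sketch of that technique, with `safe_resolve` as a hypothetical name; it is not taken from Attune's source.

```python
from pathlib import Path


def safe_resolve(root: Path, user_path: str) -> Path:
    # Resolve symlinks and ".." first, then require the result to stay under root.
    root = root.resolve()
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes workspace root: {user_path}")
    return candidate
```

Resolving before comparing is the important step: a plain string prefix check would accept `root/../secrets`, and joining an absolute `user_path` silently discards `root`, which the parent check then catches.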
See SECURITY.md for vulnerability reporting and full security details.
Attune AI keeps usage data local-first. An opt-in, anonymous usage ping is available to help the project understand which workflows people actually use — it is OFF by default and sends nothing unless you explicitly turn it on.
When enabled, each ping carries exactly this, and nothing more:
- the package (`attune-ai`) and its version
- the workflow name you ran (e.g. `workflow.security_audit`)
- your OS (`darwin` / `linux` / `windows`) and Python version (e.g. `3.12`)
- a rotating, anonymous install id (a random UUID you can reset)
- a timestamp
It never sends paths, code, prompts, arguments, filenames, project names, cost, tokens, or model data — the payload is frozen in source and guarded by a regression test. Transport is fire-and-forget with a short timeout, so it can never block, slow, or crash the CLI, and the collection endpoint stores no IP address and no request headers.
```bash
attune telemetry status    # show exactly what would be sent
attune telemetry enable    # opt in (mints an anonymous install id)
attune telemetry disable   # opt out
```

`DO_NOT_TRACK=1` and `ATTUNE_USAGE_PING=0` force it off regardless of config; `ATTUNE_USAGE_PING=1` forces it on. Full payload disclosure is in SECURITY.md.
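
The switch logic and fixed payload described above can be sketched as follows. The field names on `UsagePing` and the function name are illustrative only, since the README lists what is sent but not the schema. The sketch also assumes that the off-switches win when both an off-switch and `ATTUNE_USAGE_PING=1` are set, a case the text above does not cover.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class UsagePing:
    # One field per item in the list above; names are hypothetical.
    package: str
    version: str
    workflow: str
    os_name: str
    python_version: str
    install_id: str
    timestamp: str


def ping_enabled(config_enabled: bool, env=None) -> bool:
    env = os.environ if env is None else env
    # Assumption: an explicit off-switch beats an explicit on-switch.
    if env.get("DO_NOT_TRACK") == "1" or env.get("ATTUNE_USAGE_PING") == "0":
        return False
    if env.get("ATTUNE_USAGE_PING") == "1":
        return True
    # No environment override: fall back to the stored opt-in (off by default).
    return config_enabled
```

A frozen dataclass mirrors the "payload is frozen in source" idea: adding a field requires a visible code change rather than a stray dictionary key.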
Lightweight hook surfaces keep long Claude Code sessions oriented and recoverable — and carry what you learned into the next one. All are opt-in via plugin install and silent until they have something to say.
| Surface | Event | When it fires |
|---|---|---|
| `spec_orient.py` | `SessionStart` | On startup / resume / clear, prints up to 3 in-flight spec slugs. On compact, prints the most-recent spec body so the model keeps the spec in fresh post-compact context. |
| `compact_warning.py` | `Stop` | Once per session when transcript size crosses ~70% of the context window. Emits a copy-pasteable resume prompt and recommends starting a fresh session. |
| `/handoff` | slash command | On demand. Prints the same resume prompt as the auto-warning AND appends it to `~/.attune/last-handoff.md` so you can recover it later. |
| `session_stash.py` | `Stop` | Once per session past a utilization floor: extracts durable findings (decisions, bugs, references) and stashes them to the memory store (file by default, Redis AMS when installed). |
| `session_recall.py` | `SessionStart` | Surfaces the most recent cross-session findings for this project; warns when the configured memory backend is unreachable rather than degrading silently. |
| `lesson_recall.py` | `UserPromptSubmit` | Surfaces up to 3 relevant lessons from the project's lessons corpus when a prompt matches a known trap moment; silent otherwise, once per (session, lesson). |
| `jit_recall.py` | `PreToolUse` | Surfaces the curated rule governing a tool call at the decision point (e.g. release tagging), once per session. |
| `/recall <topic>` | slash command | On demand. Searches session findings and the lessons corpus, labels results by source, and names the answering backend. |
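
The `lesson_recall.py` row combines three rules: a score floor, a cap of 3, and once per (session, lesson). A sketch of that selection logic follows. The function name and data shapes are hypothetical, the floor of 8.0 is the documented default of `ATTUNE_LESSON_RECALL_FLOOR`, and treating the floor as inclusive is an assumption.

```python
def select_lessons(scored, seen, session_id, floor=8.0, limit=3):
    """scored: list of (lesson_id, score); seen: set of (session_id, lesson_id)."""
    picked = []
    for lesson_id, score in sorted(scored, key=lambda pair: pair[1], reverse=True):
        if score < floor:
            break  # sorted descending, so nothing further can pass the floor
        if (session_id, lesson_id) in seen:
            continue  # already surfaced once in this session
        picked.append(lesson_id)
        seen.add((session_id, lesson_id))
        if len(picked) == limit:
            break
    return picked
```

Passing the same `seen` set across prompts is what keeps the hook quiet after a lesson has been shown once, while a new session id starts with a clean slate.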
- `ATTUNE_AI_COMPACT_WARNING_THRESHOLD` (default `0.70`): fraction of context window before the warning fires.
- `ATTUNE_AI_CHARS_PER_TOKEN` (default `4.0`): utilization estimator's chars-to-tokens factor.
- `ATTUNE_AI_CONTEXT_WINDOW_TOKENS` (default `200000`): context window assumed by the estimator.
- `ATTUNE_AI_WORKSPACE_ROOTS` (`os.pathsep`-separated paths: `:` on POSIX, `;` on Windows): override the workspace roots scanned for `specs/`.
- `ATTUNE_AI_SENTINEL_DIR` (default `~/.attune`): directory for the once-per-session warning sentinel.
- `ATTUNE_LESSON_RECALL` / `ATTUNE_JIT_RECALL` (set `0` to disable): off-switches for the prompt-time and tool-call recall hooks.
- `ATTUNE_LESSON_RECALL_FLOOR` (default `8.0`): minimum retrieval score before a lesson surfaces at prompt time.
The transcript-size proxy is crude but monotonic: the warning
fires once, when the user's total content characters first cross
the threshold. If your real auto-compact consistently triggers
earlier than the warning, drop the threshold to 0.65; if it
triggers later, raise it to 0.75.
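
The estimator implied by these settings is a two-step division: characters to tokens, then tokens to a fraction of the window. The sketch below uses the three documented environment variables and their defaults; the function names are hypothetical, and the once-per-session sentinel is left out.

```python
import os


def context_utilization(transcript_chars: int, env=None) -> float:
    env = os.environ if env is None else env
    chars_per_token = float(env.get("ATTUNE_AI_CHARS_PER_TOKEN", "4.0"))
    window_tokens = int(env.get("ATTUNE_AI_CONTEXT_WINDOW_TOKENS", "200000"))
    # Estimated tokens divided by the assumed window size.
    return (transcript_chars / chars_per_token) / window_tokens


def should_warn(transcript_chars: int, env=None) -> bool:
    env = os.environ if env is None else env
    threshold = float(env.get("ATTUNE_AI_COMPACT_WARNING_THRESHOLD", "0.70"))
    return context_utilization(transcript_chars, env) >= threshold
```

With the defaults, a 600,000-character transcript estimates to 150,000 tokens, or 75% of a 200,000-token window, which is past the 0.70 threshold.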
attune-help and attune-author have moved to their own
marketplace at
Smart-AI-Memory/attune-docs.
If you previously installed either from the attune-ai marketplace:
- `/plugin marketplace add Smart-AI-Memory/attune-docs`
- `/plugin uninstall attune-help@attune-ai` and `/plugin uninstall attune-author@attune-ai`
- `/plugin install attune-help@attune-docs` and `/plugin install attune-author@attune-docs`
New users: add Smart-AI-Memory/attune-docs directly.
- Full Documentation
- Plugin Setup
- attune-gui — Living Docs dashboard
- GitHub Repository
Apache License 2.0 — Free and open source.
If you find Attune useful, give it a star — it helps others discover the project.
- Anthropic — For Claude AI, the Model Context Protocol, and the Agent SDK patterns behind the multi-agent orchestration layer
- Boris Cherny — Creator of Claude Code, whose workflow posts validated Attune's plan-first, multi-agent approach
- Affaan Mustafa — For battle-tested Claude Code configurations that inspired the hook system
Built by Patrick Roebuck using Claude Code.