Commit d06c01a

yelban and claude committed
Improve SKILL.md v1.1.0: token budgeting, model allocation, Mermaid, language-agnostic
Compared schematic with the Cartographer skill and identified four areas where Cartographer's engineering rigor could strengthen schematic while preserving its core advantages (semantic grouping, deep inference, cross-validation).

Changes to SKILL.md:
- Phase 1: Add token budget estimation step (wc -c on full diff) with tiered agent scaling strategy (<50k → 1 agent, 50k-200k → 2-3, 200k+ → 3-4)
- Phase 2: Explicitly specify Sonnet subagents for file reading/analysis with Opus orchestrating (plan, synthesize, infer) — cost/capability optimization
- Phase 3: Replace hardcoded '*.ts' '*.tsx' with language-agnostic git diff --name-only, removing TypeScript/React assumption
- Phase 4: Replace ASCII diagram instructions with Mermaid (graph TB for system diagrams, sequenceDiagram for data lifecycle) — native GitHub rendering

Also adds:
- CLAUDE.md: Project summary for Claude Code context
- docs/CODEBASE_MAP.md: Full codebase map (Cartographer output)
- docs/COMPARISON_schematic_vs_cartographer.md: Detailed skill comparison

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ca0431a commit d06c01a

4 files changed

Lines changed: 243 additions & 6 deletions

CLAUDE.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+# Schematic
+
+A Claude Code / Codex skill that reverse-engineers detailed product & technical spec documents from git branch implementations. Uses 2-4 parallel agents to analyze changed files, cross-checks for gaps, and outputs an 11-section structured specification.
+
+**Stack**: Markdown-driven skill (no executable code) — SKILL.md is the "program"
+**Structure**: `SKILL.md` (core workflow) + `README.md` (user guide) + `.claude/settings.local.json` (permissions)
+
+For detailed architecture, see [docs/CODEBASE_MAP.md](docs/CODEBASE_MAP.md).

SKILL.md

Lines changed: 21 additions & 6 deletions
@@ -9,8 +9,8 @@ description: |
 branch does". Produces a structured markdown spec covering problem statement, product requirements,
 architecture, technical design, file inventories, testing strategy, rollout plan, and risks.
 author: Codex
-version: 1.0.0
-date: 2026-02-15
+version: 1.1.0
+date: 2026-02-16
 tags: [documentation, git, branch-analysis, spec, reverse-engineering]
 ---
 
@@ -46,8 +46,16 @@ git diff --stat <base>...HEAD
 
 # 3. Count the scale
 git diff --stat <base>...HEAD | tail -1
+
+# 4. Estimate diff token budget (chars / 4 ≈ tokens)
+git diff <base>...HEAD | wc -c
 ```
 
+**Agent scaling by token budget:**
+- **<50k tokens** → single agent (read all diffs directly)
+- **50k–200k tokens** → 2–3 agents
+- **200k+ tokens** → 3–4 agents, max ~150k tokens per agent
+
 From the diff stats, categorize files into groups:
 - **Core implementation** (new modules, business logic)
 - **Integration points** (modified selectors, reducers, hooks, components)
@@ -60,6 +68,11 @@ From the diff stats, categorize files into groups:
 Launch 2-4 parallel exploration agents, each focused on a different file group. This is
 critical for efficiency — reading 50+ files sequentially is too slow.
 
+**Model allocation:** Use `subagent_type: "Explore"` with `model: "sonnet"` for all
+exploration agents. Sonnet handles file reading and analysis (best cost/capability ratio).
+The orchestrating model (Opus) plans assignments, synthesizes reports, and infers product
+motivation — it should never read diff files directly.
+
 **Agent 1: Core Implementation**
 - All new files (the heart of the feature)
 - Focus on: purpose, key types, exported functions, data flow, inter-module connections
@@ -87,8 +100,8 @@ Each agent prompt should ask for:
 After agents return, diff the analyzed files against the full file list:
 
 ```bash
-# List all non-test changed files
-git diff --stat <base>...HEAD -- '*.ts' '*.tsx' | awk '{print $1}' | sort
+# List all changed files (language-agnostic)
+git diff --name-only <base>...HEAD | sort
 
 # Show small diffs for any files not yet analyzed
 git diff <base>...HEAD -- <uncovered-files>
@@ -128,10 +141,12 @@ What is and isn't included.
 
 ## 4. Architecture
 ### 4.1 System Diagram
-ASCII diagram showing component relationships and data flow.
+Mermaid `graph TB` diagram showing component relationships and data flow.
+(Mermaid renders natively on GitHub/GitLab — prefer over ASCII.)
 
 ### 4.2 Data Lifecycle
-Step-by-step flow from initial state through steady state.
+Mermaid `sequenceDiagram` or step-by-step description showing flow
+from initial state through steady state.
 
 ## 5. Technical Design
 Subsections for each major design decision:
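
The agent-scaling tiers this commit adds to Phase 1 of SKILL.md can be expressed as a small helper. The following is a minimal sketch and not part of SKILL.md: the function name `agents_for_chars` is hypothetical, its input is the character count from `git diff <base>...HEAD | wc -c`, and the tier boundaries (50k and 200k tokens) are read off the bullet list, with exactly 200k assumed to fall into the top tier.

```bash
# Hypothetical helper (an illustration, not part of SKILL.md).
# $1: size of the full diff in characters, e.g. the output of
#       git diff <base>...HEAD | wc -c
# Prints the agent range for the matching tier.
agents_for_chars() {
  # SKILL.md heuristic: chars / 4 ≈ tokens
  tokens=$(( $1 / 4 ))
  if [ "$tokens" -lt 50000 ]; then
    echo "1"      # <50k tokens: single agent reads all diffs
  elif [ "$tokens" -lt 200000 ]; then
    echo "2-3"    # 50k-200k tokens
  else
    echo "3-4"    # 200k+ tokens, max ~150k tokens per agent
  fi
}
```

Inside a repository it might be called as `agents_for_chars "$(git diff main...HEAD | wc -c)"`, where `main` stands in for the base branch. The helper reports the range from the tier list instead of a single number, because SKILL.md gives ranges and does not fix an exact agent count.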

docs/CODEBASE_MAP.md

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
+---
+last_mapped: 2026-02-16T13:39:48Z
+total_files: 4
+total_tokens: 2334
+---
+
+# Codebase Map
+
+> Auto-generated by Cartographer. Last mapped: 2026-02-16T13:39:48Z
+
+## System Overview
+
+Schematic is a Claude Code / Codex **skill** that reverse-engineers detailed product & technical spec documents from git branch diffs. It spawns 2-4 parallel agents to read changed files, cross-checks for gaps, and outputs an 11-section structured specification.
+
+```mermaid
+graph TB
+    subgraph Trigger
+        User["User Request"]
+    end
+
+    subgraph "Phase 1: Scope"
+        Git["git log / git diff --stat"]
+        Classify["File Classification"]
+    end
+
+    subgraph "Phase 2: Parallel Exploration"
+        A1["Agent 1: Core Implementation"]
+        A2["Agent 2: Integration Points"]
+        A3["Agent 3: Tests"]
+        A4["Agent 4: Config & Infra"]
+    end
+
+    subgraph "Phase 3-5: Synthesis"
+        CrossCheck["Cross-Check Gaps"]
+        WriteSpec["Write 11-Section Spec"]
+        Verify["Verify Completeness"]
+    end
+
+    Output["docs/ spec document"]
+
+    User --> Git --> Classify
+    Classify --> A1 & A2 & A3 & A4
+    A1 & A2 & A3 & A4 --> CrossCheck --> WriteSpec --> Verify --> Output
+```
+
+## Directory Structure
+
+```
+schematic/
+├── .claude/
+│   └── settings.local.json    # Claude Code permissions (allows uv run:*)
+├── docs/
+│   └── CODEBASE_MAP.md        # This file
+├── LICENSE                    # MIT License
+├── README.md                  # User-facing docs: install & usage
+└── SKILL.md                   # Core skill definition (11-section spec workflow)
+```
+
+## File Guide
+
+| File | Purpose | Tokens |
+|------|---------|--------|
+| `.claude/settings.local.json` | Execution permissions — allows `uv run:*` Bash commands | 26 |
+| `LICENSE` | MIT License (Copyright 2026) | 217 |
+| `README.md` | User-facing install/usage guide for Claude Code & Codex | 380 |
+| `SKILL.md` | Full skill specification: 5-phase workflow, 11-section output template | 1,711 |
+
+### SKILL.md — Core Skill Definition
+
+**Purpose**: The "source code" of this project. Defines the complete workflow Claude Code/Codex follows when triggered.
+
+**Key Sections**:
+- **Metadata**: name, version (1.0.0), tags
+- **Trigger conditions**: phrases like "analyze this branch", "reverse engineer a spec"
+- **5-Phase workflow**:
+  1. Scope the branch (git diff --stat, file classification)
+  2. Parallel deep exploration (2-4 agents reading file groups)
+  3. Cross-check for gaps (compare analyzed vs full file list)
+  4. Write the spec (11 structured sections)
+  5. Verify completeness
+- **11 Output Sections**: Problem Statement, Solution Overview, Product Requirements, Architecture, Technical Design, New Files, Modified Files, Testing Strategy, Rollout Strategy, Risks, Summary
+- **Validation criteria**: every changed file documented, architecture diagrams match data flow, product requirements match test assertions
+
+**File Classification Categories**:
+- Core implementation (new modules, business logic)
+- Integration points (selectors, reducers, hooks, components)
+- Tests (unit, integration, e2e)
+- Configuration (feature flags, env vars, types)
+- Incidental (formatting, imports, minor refactors)
+
+### README.md — User Guide
+
+**Purpose**: Quick-start guide for installing and using the skill.
+
+**Install paths**:
+- Claude Code: `~/.claude/skills/schematic`
+- Codex: `~/.codex/skills/schematic`
+
+**Trigger phrases**: "Reverse engineer a spec", "Analyze this branch", "Write a spec from the code", "Document what this branch does"
+
+## Data Flow
+
+```mermaid
+sequenceDiagram
+    participant U as User
+    participant CC as Claude Code/Codex
+    participant Git as Git CLI
+    participant Agents as Parallel Agents
+
+    U->>CC: "Analyze this branch"
+    CC->>Git: git log --oneline base..HEAD
+    CC->>Git: git diff --stat base...HEAD
+    Git-->>CC: Diff stats & file list
+    CC->>CC: Classify files into groups
+    CC->>Agents: Spawn 2-4 parallel read agents
+    Agents->>Agents: Read & analyze assigned files
+    Agents-->>CC: Analysis reports
+    CC->>Git: git diff base...HEAD -- uncovered-files
+    Git-->>CC: Remaining diffs
+    CC->>CC: Synthesize 11-section spec
+    CC->>CC: Verify all files documented
+    CC-->>U: Spec document in docs/
+```
+
+## Conventions
+
+- **No executable code**: The entire project is markdown-driven; SKILL.md serves as the "program" for Claude Code/Codex
+- **Parallel-first**: Design principle — always prefer multi-agent parallel analysis over sequential
+- **Table-heavy output**: File lists, requirements, risks all use markdown tables for scannability
+- **Three-dot diff**: Uses `git diff base...HEAD` (merge-base) for branch comparison
+- **Structured output**: Fixed 11-section template ensures consistency across runs
+
+## Gotchas
+
+1. **No Python scripts exist** despite `settings.local.json` allowing `uv run:*` — this permission is reserved for potential future use or scanner scripts
+2. **Examples assume TypeScript/React/Redux** — file classification categories (selectors, reducers, hooks, components) are framework-specific; other stacks may need adapted categories
+3. **No error handling defined** — SKILL.md doesn't specify what to do when git commands fail or agents return incomplete data
+4. **Linear git history assumed** — complex merge histories may produce confusing diffs
+5. **Branch context required** — the skill assumes the user is on the correct branch when triggered
+
+## Navigation Guide
+
+**To modify the skill workflow**: Edit `SKILL.md` — all 5 phases and the 11-section output template are defined there
+**To change execution permissions**: Edit `.claude/settings.local.json`
+**To update user-facing docs**: Edit `README.md`
+**To add a new output section**: Add to the Phase 4 section template in `SKILL.md`
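
Phase 3 of the workflow mapped above compares the files the agents analyzed against the full list of changed files. A minimal sketch of that comparison follows, assuming the agent reports have already been reduced to a plain list of paths, one per line. The function name `uncovered_files` and the two list files are hypothetical; SKILL.md itself only prescribes the `git diff --name-only <base>...HEAD | sort` step.

```bash
# Hypothetical helper (an illustration, not part of SKILL.md).
# $1: file listing every changed path, one per line, e.g. from
#       git diff --name-only <base>...HEAD
# $2: file listing the paths the agents reported on (assumed format)
# Prints the changed paths that no agent covered, sorted.
uncovered_files() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
  grep -vxFf "$2" "$1" | sort
}
```

The paths it prints are the candidates for the follow-up `git diff <base>...HEAD -- <uncovered-files>` call shown in the SKILL.md diff.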
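docs/COMPARISON_schematic_vs_cartographer.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+# Schematic vs Cartographer Comparison
+
+## Overview
+
+| Dimension | **Schematic** | **Cartographer** |
+|------|--------------|-----------------|
+| **Goal** | Reverse-engineer a product/technical spec document from a git branch diff | Map the architecture and file purposes of the entire codebase |
+| **Input** | Git branch (`git diff base...HEAD`) | Entire codebase (all files) |
+| **Output** | 11-section spec document (Problem, Architecture, Risks...) | `docs/CODEBASE_MAP.md` + updated `CLAUDE.md` |
+| **Subject of analysis** | **What changed** (diff-centric) | **What exists now** (snapshot-centric) |
+| **Trigger phrases** | "analyze this branch", "reverse engineer a spec" | "map this codebase", "cartographer" |
+
+## Workflow Comparison
+
+| Stage | **Schematic (5 phases)** | **Cartographer (8 steps)** |
+|------|--------------------------|----------------------------|
+| 1 | Scope branch (git diff --stat) | Check existing map (incremental-update detection) |
+| 2 | Parallel exploration (2-4 agents by file type) | Scan codebase (Python script counts tokens) |
+| 3 | Cross-check gaps (find missed files) | Plan subagent assignments (grouped by token budget) |
+| 4 | Write spec (11 sections) | Spawn Sonnet subagents |
+| 5 | Verify completeness | Synthesize → Write map → Update CLAUDE.md |
+
+## Differences in Parallelization Strategy
+
+**Schematic**: groups files by **semantic role**:
+- Agent 1: Core implementation (new files)
+- Agent 2: Integration points (modified hooks/selectors)
+- Agent 3: Tests
+- Agent 4: Config & infra
+
+**Cartographer**: groups files by **directory + token budget**:
+- ≤150k tokens per agent
+- Grouped by directory/module to keep related code together
+- Enforces that **Opus does not read files; Sonnet does**
+
+## Differences in Design Philosophy
+
+| Aspect | **Schematic** | **Cartographer** |
+|------|--------------|-----------------|
+| **Level of inference** | Infers the "why" in depth (derives product motivation from tests/comments) | Describes the "what" (file purposes, exports, dependencies) |
+| **Supporting tools** | Plain git commands, no external scripts | Ships its own Python scanner script (`scan-codebase.py`, which counts tokens with tiktoken) |
+| **Incremental updates** | None (analyzes the full branch every time) | Yes (detects git changes since `last_mapped` and updates only what changed) |
+| **Model selection** | Does not specify a model | Explicitly specifies Sonnet subagents (cost/capability balance) |
+| **Verification** | Phase 3 cross-check + Phase 5 completeness verification | Relies on merging subagent reports; no explicit verification step |
+| **Output format** | Fixed 11-section template, emphasizing inference | Mermaid diagrams + tables + Navigation Guide, emphasizing navigation |
+
+## Complementarity
+
+The two solve different problems and complement rather than compete with each other:
+
+- **Cartographer**: "What does this codebase look like?" → a panoramic map, suited to onboarding
+- **Schematic**: "What did this branch do?" → diff analysis, suited to PR review / after-the-fact documentation
+
+A typical workflow: first use Cartographer to build a big-picture understanding, then use Schematic to analyze the changes on a specific branch.
+
+## What Schematic Can Borrow
+
+1. **Incremental update mechanism**: Cartographer's `last_mapped` + git log detection is very practical
+2. **Python scanner script**: planning by token budget is more precise than allocating agents blindly
+3. **Explicit subagent model selection**: keeps cost under control
+4. **Mermaid diagrams in the output**: better-looking than ASCII diagrams and natively supported by GitHub
+
+## What Cartographer Can Borrow
+
+1. **Semantic grouping strategy**: grouping by role (core/integration/tests) rather than purely by directory yields more insightful analysis
+2. **Cross-validation step**: an explicit gap-finding phase prevents omissions
+3. **Inferring the "why"**: not only describes what the code does but also infers the product motivation
+4. **Completeness verification checklist**: ensures the output maps one-to-one to the input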
