Commit 5b98147

IronAdamant and claude committed
Docs: rewrite ARCHITECTURE.md and CONTRIBUTING.md, fix 22 discrepancies
ARCHITECTURE.md was frozen at v0.1.0. Complete rewrite with the current architecture (13 modules, 10 tables, 15 tools, 12 languages, risk formula, batch queries, pluggable extractors, cross-platform locks). Removed stale Stele integration code that was never implemented.

CONTRIBUTING.md: Python 3.11+, 553 tests, all 13 modules listed, dispatch table reference corrected (schemas.py, not mcp_server.py), added extractor plugin guide.

Other fixes: engine dep graph missing metrics.py, mcp_server.py description still claiming schemas ownership, "18 CLI subcommands" → 17, branch-aware non-goal corrected, test-gaps positional arg, glossary skip dirs missing vendor/Pods.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent a22734b commit 5b98147

6 files changed

Lines changed: 173 additions & 221 deletions

ARCHITECTURE.md

Lines changed: 132 additions & 199 deletions
@@ -1,25 +1,19 @@
1-
# Chisel — Test Impact & Code Intelligence for LLM Agents
1+
# Chisel — Architecture
22

3-
Zero-dependency companion to [Stele](../Stele/). Maps tests to code, code to git history, and answers: "what to run, what's risky, who touched it."
3+
**Version:** 0.6.1 | **Python:** >= 3.11 | **Dependencies:** zero (stdlib only)
44

5-
## Problem
6-
7-
An LLM agent changes `engine.py:store_document()`. It then either:
8-
- Runs **all** 287 tests (slow, wasteful), or
9-
- Guesses with `-k "test_store"` (misses regressions)
10-
11-
It also has no idea whether this function is stable (untouched for months) or a hotspot (changed 15 times this week).
5+
Test impact analysis and code intelligence for LLM agents. Maps tests to code, code to git history, and answers: "what to run, what's risky, who touched it."
126

137
## Core Data Model
148

15-
A single weighted graph with three edge types layered on Stele's symbol graph:
9+
A weighted graph with three edge types stored in SQLite:
1610

1711
```
1812
┌─────────────┐ ┌──────────────┐ ┌──────────────┐
1913
│ Code Unit │──────▶│ Test Unit │──────▶│ Git History │
2014
│ (function, │ calls │ (test file, │ │ (commits, │
2115
│ class, │◀──────│ test func) │ │ blame lines, │
22-
│ module) │imports│ │ │ churn stats) │
16+
│ struct...) │imports│ │ │ churn stats) │
2317
└─────────────┘ └──────────────┘ └──────────────┘
2418
│ │
2519
└────────────────────────────────────────────┘
@@ -30,194 +24,129 @@ A single weighted graph with three edge types layered on Stele's symbol graph:
3024

3125
| Entity | Source | Key Fields |
3226
|--------|--------|------------|
33-
| `CodeUnit` | Stele symbol graph + AST | file, name, type (func/class/module), line range |
34-
| `TestUnit` | Test file AST parsing | file, name, framework (pytest/jest/go test), line range |
27+
| `CodeUnit` | AST extraction (12 languages) | file, name, type (function/class/struct/enum/impl), line range |
28+
| `TestUnit` | Test file parsing + framework detection | file, name, framework, line range |
3529
| `CommitRecord` | `git log --numstat` | hash, author, date, files changed, insertions, deletions |
3630
| `BlameBlock` | `git blame --porcelain` | file, line range, commit, author, date |
3731

3832
### Edges
3933

4034
| Edge | How Built | Weight |
4135
|------|-----------|--------|
42-
| `test → code_unit` | Parse test imports + function calls via AST | call count |
36+
| `test → code_unit` | Parse test imports + function calls | proximity-based (0.4-1.0): same dir=1.0, sibling=0.8, shared ancestor=0.6, distant=0.4 |
4337
| `code_unit → commit` | `git log -L :funcname:file` or blame | recency-weighted |
44-
| `file → file` (co-change) | Files that appear in same commits | co-occurrence count |
38+
| `file → file` (co-change) | Files in same commits (≥3 co-commits) | co-occurrence count |
4539

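The proximity weights in the rewritten `test → code_unit` row can be illustrated with a minimal sketch. The function name and the exact definitions of "sibling" (directories sharing a parent) and "shared ancestor" (any common leading path component) are assumptions made for illustration; the diff only states the four weight values.

```python
from pathlib import PurePosixPath


def proximity_weight(test_path: str, code_path: str) -> float:
    """Map directory distance between two repo-relative paths to an edge weight.

    Hypothetical helper: the tier definitions below are assumptions,
    only the weights 1.0 / 0.8 / 0.6 / 0.4 come from the document.
    """
    test_dir = PurePosixPath(test_path).parent
    code_dir = PurePosixPath(code_path).parent

    if test_dir == code_dir:
        return 1.0  # same directory
    if test_dir.parent == code_dir.parent:
        return 0.8  # sibling directories (assumed: same parent directory)

    # Count the leading path components both directories have in common.
    shared = 0
    for a, b in zip(test_dir.parts, code_dir.parts):
        if a != b:
            break
        shared += 1

    # Any common ancestor below the repo root counts as "shared ancestor".
    return 0.6 if shared > 0 else 0.4
```

Under these assumptions, `tests/test_engine.py` paired with `chisel/engine.py` counts as sibling directories, so the edge gets weight 0.8.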
46-
## Architecture
40+
## Module Architecture
4741

4842
```
49-
Chisel (engine.py) — main orchestrator
43+
ChiselEngine (engine.py) — main orchestrator
5044
├── TestMapper (test_mapper.py)
51-
│ ├── Parse test files, detect framework (pytest/jest/go/rust)
52-
│ ├── Extract imports + call targets
53-
│ └── Build test → code_unit edges
45+
│ ├── Discover test files, detect framework
46+
│ ├── Extract imports + call targets (per-language)
47+
│ └── Build test → code_unit edges with proximity weights
5448
├── GitAnalyzer (git_analyzer.py)
55-
│ ├── Parse `git log` output (no gitpython dep)
49+
│ ├── Parse `git log` output (no gitpython)
5650
│ ├── Parse `git blame` output
57-
│ ├── Compute churn scores, ownership, co-change coupling
58-
│ └── Build code_unit → commit edges
51+
│ └── Function-level log via `git log -L`
52+
├── Metrics (metrics.py)
53+
│ ├── Churn scoring: sum(1 / (1 + days_since))
54+
│ ├── Ownership aggregation from blame blocks
55+
│ └── Co-change coupling detection
5956
├── ImpactAnalyzer (impact.py)
60-
│ ├── Given changed files/functions → affected tests (via test edges)
61-
│ ├── Risk score = f(churn, recency, coupling breadth)
62-
│ ├── Ownership query ("who last touched this, how often")
63-
│ └── Stale test detection (tests that cover dead/removed code)
64-
├── Storage (storage.py, SQLite)
65-
│ ├── Cached graph edges
66-
│ ├── Git history snapshots
67-
│ └── Incremental update (only re-parse changed files)
57+
│ ├── Impacted tests (direct + transitive via coupling)
58+
│ ├── Risk scoring (5-component weighted formula)
59+
│ ├── Stale test detection (orphaned edge refs)
60+
│ └── Reviewer suggestions (commit-activity-based)
61+
├── AST Utils (ast_utils.py)
62+
│ ├── Multi-language extraction (12 languages)
63+
│ ├── Pluggable extractor registry (tree-sitter/LSP hooks)
64+
│ └── Brace matching with multi-line block comment tracking
65+
├── Storage (storage.py, SQLite WAL)
66+
│ ├── 10 tables, single persistent connection
67+
│ ├── Batch query methods for N+1 elimination
68+
│ └── Incremental update via content hashes
69+
├── Project (project.py)
70+
│ ├── Project root detection (worktree-aware)
71+
│ ├── Path normalization (cross-platform)
72+
│ └── ProcessLock (fcntl on Unix, LockFileEx on Windows)
73+
├── RWLock (rwlock.py) — in-process read/write lock
74+
├── Schemas (schemas.py) — JSON Schema defs + dispatch table
6875
└── APIs
69-
├── CLI (cli.py)
70-
├── MCP stdio (mcp_stdio.py) — for Claude Desktop
71-
└── HTTP (mcp_server.py) — for Claude Code
72-
73-
Stele integration:
74-
└── Reads Stele's symbol graph (optional, falls back to own AST parsing)
76+
├── CLI (cli.py) — 17 subcommands
77+
├── HTTP (mcp_server.py) — GET /tools, /health; POST /call
78+
└── stdio MCP (mcp_stdio.py) — for Claude Desktop/Cursor
7579
```
7680

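The churn formula listed under Metrics, `sum(1 / (1 + days_since))`, can be sketched as follows. The function name, the use of fractional days, and the clamping of future-dated commits to zero are assumptions; the diff gives only the formula.

```python
from datetime import datetime, timedelta, timezone


def churn_score(commit_dates: list[datetime], now: datetime) -> float:
    """Sum 1 / (1 + days_since_commit) over all commits touching a unit.

    Hypothetical helper: a commit made right now contributes 1.0,
    a commit one day old contributes 0.5, and older commits decay further.
    """
    total = 0.0
    for committed_at in commit_dates:
        # Fractional days since the commit; clamp clock skew to zero (assumption).
        days_since = max((now - committed_at).total_seconds() / 86400.0, 0.0)
        total += 1.0 / (1.0 + days_since)
    return total


if __name__ == "__main__":
    now = datetime(2026, 1, 10, tzinfo=timezone.utc)
    dates = [now, now - timedelta(days=1), now - timedelta(days=9)]
    print(churn_score(dates, now))  # 1.0 + 0.5 + 0.1
```

Because each term is bounded by 1.0, the score grows with the number of recent commits rather than with the age of the file.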
77-
## SQLite Tables
81+
## SQLite Tables (10)
7882

7983
```sql
80-
-- Code units (functions, classes, modules)
81-
CREATE TABLE code_units (
82-
id TEXT PRIMARY KEY, -- file:name:type
83-
file_path TEXT NOT NULL,
84-
name TEXT NOT NULL,
85-
unit_type TEXT NOT NULL, -- func, class, module
86-
line_start INTEGER,
87-
line_end INTEGER,
88-
content_hash TEXT,
89-
updated_at TEXT
90-
);
91-
92-
-- Test units
93-
CREATE TABLE test_units (
94-
id TEXT PRIMARY KEY,
95-
file_path TEXT NOT NULL,
96-
name TEXT NOT NULL,
97-
framework TEXT, -- pytest, jest, go, rust, playwright
98-
line_start INTEGER,
99-
line_end INTEGER,
100-
content_hash TEXT,
101-
updated_at TEXT
102-
);
103-
104-
-- Test → code_unit edges
105-
CREATE TABLE test_edges (
106-
test_id TEXT REFERENCES test_units(id),
107-
code_id TEXT REFERENCES code_units(id),
108-
edge_type TEXT, -- import, call, fixture
109-
weight REAL DEFAULT 1.0,
110-
PRIMARY KEY (test_id, code_id, edge_type)
111-
);
112-
113-
-- Git commit records
114-
CREATE TABLE commits (
115-
hash TEXT PRIMARY KEY,
116-
author TEXT,
117-
author_email TEXT,
118-
date TEXT,
119-
message TEXT
120-
);
121-
122-
-- Commit → file changes
123-
CREATE TABLE commit_files (
124-
commit_hash TEXT REFERENCES commits(hash),
125-
file_path TEXT,
126-
insertions INTEGER,
127-
deletions INTEGER,
128-
PRIMARY KEY (commit_hash, file_path)
129-
);
130-
131-
-- Blame cache (per file, invalidated by content hash)
132-
CREATE TABLE blame_cache (
133-
file_path TEXT,
134-
line_start INTEGER,
135-
line_end INTEGER,
136-
commit_hash TEXT,
137-
author TEXT,
138-
author_email TEXT,
139-
date TEXT,
140-
content_hash TEXT, -- of the file when blame was run
141-
PRIMARY KEY (file_path, line_start)
142-
);
143-
144-
-- Co-change coupling
145-
CREATE TABLE co_changes (
146-
file_a TEXT,
147-
file_b TEXT,
148-
co_commit_count INTEGER,
149-
last_co_commit TEXT,
150-
PRIMARY KEY (file_a, file_b)
151-
);
152-
153-
-- Churn summary (materialized view, rebuilt on analyze)
154-
CREATE TABLE churn_stats (
155-
file_path TEXT,
156-
unit_name TEXT, -- nullable (file-level if null)
157-
commit_count INTEGER,
158-
distinct_authors INTEGER,
159-
total_insertions INTEGER,
160-
total_deletions INTEGER,
161-
last_changed TEXT,
162-
churn_score REAL, -- weighted: recent changes count more
163-
PRIMARY KEY (file_path, unit_name)
164-
);
84+
code_units — functions, classes, structs (id = file:name:type)
85+
test_units — test functions (id = file:name)
86+
test_edges — test → code links with edge_type and weight
87+
commits — git commit metadata
88+
commit_files — per-file stats per commit
89+
blame_cache — cached git blame, keyed by content hash
90+
co_changes — file pairs that change together
91+
churn_stats — churn scores per file and per function
92+
file_hashes — content hashes for incremental analysis
93+
test_results — recorded pass/fail outcomes for prioritization
16594
```
16695

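The role of the `file_hashes` table in incremental analysis can be sketched roughly as below. The column names, the choice of SHA-256, and both function names are assumptions for illustration; the diff only says that content hashes are stored for incremental analysis.

```python
import hashlib
import sqlite3


def ensure_file_hashes(conn: sqlite3.Connection) -> None:
    """Create a minimal file_hashes table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_hashes ("
        "file_path TEXT PRIMARY KEY, content_hash TEXT NOT NULL)"
    )


def needs_reanalysis(conn: sqlite3.Connection, file_path: str, data: bytes) -> bool:
    """Return True if the file is new or its content changed since the last run.

    The stored hash is updated as a side effect, so a second call with the
    same bytes returns False.
    """
    new_hash = hashlib.sha256(data).hexdigest()  # hash algorithm is an assumption
    row = conn.execute(
        "SELECT content_hash FROM file_hashes WHERE file_path = ?", (file_path,)
    ).fetchone()
    if row is not None and row[0] == new_hash:
        return False  # unchanged: skip re-parsing and re-blaming
    conn.execute(
        "INSERT OR REPLACE INTO file_hashes (file_path, content_hash) VALUES (?, ?)",
        (file_path, new_hash),
    )
    return True
```

The same comparison would plausibly gate the blame cache, since the table list describes `blame_cache` as keyed by content hash.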
167-
## Key Queries (MCP Tools)
96+
## 15 MCP Tools
16897

16998
| Tool | Input | Output |
17099
|------|-------|--------|
171-
| `impact` | changed files/functions | affected test list + risk score |
172-
| `suggest_tests` | file path or diff | ordered test list to run |
173-
| `churn` | file or function | churn score, commit count, last changed |
174-
| `ownership` | file or function | author breakdown (% of blame lines) |
175-
| `coupling` | file path | co-change partners + strength |
176-
| `risk_map` | directory | heatmap of risk scores |
177-
| `stale_tests` | | tests covering removed/renamed code |
178-
| `history` | function name | commit timeline with diffs |
179-
| `who_reviews` | file or diff | suggested reviewers by ownership |
180-
| `analyze` | directory | full rebuild of git + test graph |
181-
182-
## Design Decisions
183-
184-
- **Zero deps**: stdlib only. `ast` for Python, regex for JS/TS/Go/Rust. `subprocess.run(["git", ...])` for git.
185-
- **Git is the source of truth**: No gitpython. Parse `git log --format=...` and `git blame --porcelain` text output.
186-
- **Incremental**: Track file content hashes. Only re-parse test files and re-blame files that changed since last run.
187-
- **Stele optional**: Can read Stele's SQLite DB directly for symbol graph if available. Falls back to own lightweight AST extraction.
188-
- **Framework detection**: Auto-detect test framework from file patterns (`test_*.py`, `*.test.js`, `*_test.go`) and imports (`import pytest`, `describe(`).
189-
- **Churn score formula**: `sum(1 / (1 + days_since_commit))` — recent changes weigh heavily, old changes decay.
190-
- **Co-change threshold**: Only store pairs with >= 3 co-commits to avoid noise.
191-
- **Risk score**: `0.4 * churn_norm + 0.3 * coupling_breadth_norm + 0.2 * (1 - test_coverage) + 0.1 * author_concentration`. Higher = riskier to change.
192-
- **Blame caching**: Blame is expensive. Cache by file content hash, invalidate on change.
193-
- **Thread safety**: Same RWLock pattern as Stele for concurrent MCP access.
194-
195-
## CLI Examples
196-
197-
```bash
198-
# Analyze a project (builds all graphs)
199-
chisel analyze .
200-
201-
# What tests should I run after editing engine.py?
202-
chisel impact engine.py
203-
# → test_engine.py (direct import, 23 calls)
204-
# → test_integration.py (transitive via storage.py)
205-
# → Risk: HIGH (churn=0.82, 15 commits in 7 days)
206-
207-
# Who owns this code?
208-
chisel ownership stele/engine.py
209-
# → IronAdamant: 94% (blame lines)
210-
# → Last changed: 2026-03-16
211-
212-
# What files always change together?
213-
chisel coupling stele/storage.py
214-
# → stele/engine.py (18 co-commits)
215-
# → stele/session_storage.py (12 co-commits)
216-
217-
# Which tests are stale?
218-
chisel stale-tests
219-
# → test_old_feature.py:test_removed_api — calls `old_function` (removed in abc123)
220-
```
100+
| `analyze` | directory, force | full rebuild stats |
101+
| `update` | | incremental update stats |
102+
| `impact` | files, functions | affected tests + scores |
103+
| `diff_impact` | ref (auto-detects branch) | affected tests from git diff |
104+
| `suggest_tests` | file_path | ranked tests by relevance + failure rate |
105+
| `churn` | file, unit_name | churn score, commits, authors |
106+
| `ownership` | file | author breakdown (blame-based, role=original_author) |
107+
| `who_reviews` | file | reviewer suggestions (activity-based, role=suggested_reviewer) |
108+
| `coupling` | file, min_count | co-change partners |
109+
| `risk_map` | directory | risk scores (batch-computed) |
110+
| `stale_tests` | | tests pointing at removed code |
111+
| `test_gaps` | file, directory | untested code units by churn risk |
112+
| `history` | file | commit timeline |
113+
| `record_result` | test_id, passed, duration | store for future prioritization |
114+
| `stats` | | database summary counts |
115+
116+
All list-returning tools accept a `limit` parameter to cap result size.
117+
118+
## Key Design Decisions
119+
120+
- **Zero deps**: stdlib only. `ast` for Python, regex for 11 other languages. `subprocess.run(["git", ...])` for git. Requires Python >= 3.11.
121+
- **Pluggable extractors**: `register_extractor(lang, fn)` overrides built-in regex with tree-sitter/LSP. Zero-dep — just callable hooks.
122+
- **Proximity-based edge weights**: 0.4-1.0 based on directory distance. Python import-path matching (`from myapp.utils import foo` → `myapp/utils.py:foo`) takes priority.
123+
- **Risk formula**: `0.35*churn + 0.25*coupling + 0.2*coverage_gap + 0.1*author_concentration + 0.1*test_instability`
124+
- **Batch queries**: `get_risk_map()` fetches all data in ~5 queries. `_chunked()` helper stays under SQLite's 999-variable limit.
125+
- **Churn formula**: `sum(1 / (1 + days_since_commit))` — recent changes weigh heavily.
126+
- **Co-change threshold**: Adaptive `max(3, total_commits // 4)`. Commits touching >50 files skipped.
127+
- **Blame caching**: Cached by file content hash, invalidated on change.
128+
- **Incremental analysis**: File content hashes tracked in `file_hashes` table.
129+
- **FK enforcement disabled**: Stale test detection relies on orphaned edge refs.
130+
- **Cross-platform locking**: `ProcessLock` uses `fcntl.flock` (Unix) / `LockFileEx` via ctypes (Windows). Shared locks for reads, exclusive for writes.
131+
- **Thread safety**: RWLock (in-process) + ProcessLock (cross-process). Lock order: process lock outer, RWLock inner.
132+
- **Multi-line block comments**: `_strip_strings_and_comments` tracks `/* */` state across lines for correct brace matching.
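
The five-component risk formula in the list above can be written out as a small weighted sum. This is a sketch under assumptions: the function name is hypothetical, and each component is assumed to be already normalized to the range 0 to 1 (the sketch clamps inputs to enforce that).

```python
# Weights copied from the risk formula in the design decisions list.
RISK_WEIGHTS = {
    "churn": 0.35,
    "coupling": 0.25,
    "coverage_gap": 0.2,
    "author_concentration": 0.1,
    "test_instability": 0.1,
}


def risk_score(components: dict[str, float]) -> float:
    """Combine normalized components into a single risk value in [0, 1].

    Hypothetical helper: raises KeyError if a component is missing, and
    clamps each input to [0, 1] (the normalization itself is an assumption).
    """
    total = 0.0
    for name, weight in RISK_WEIGHTS.items():
        value = min(max(components[name], 0.0), 1.0)
        total += weight * value
    return total


if __name__ == "__main__":
    example = {
        "churn": 0.8,
        "coupling": 0.4,
        "coverage_gap": 1.0,  # no tests cover this unit
        "author_concentration": 0.5,
        "test_instability": 0.0,
    }
    print(round(risk_score(example), 2))  # 0.28 + 0.10 + 0.20 + 0.05 + 0.00
```

Since the weights sum to 1.0, the result stays in the same 0 to 1 range as its inputs, which keeps scores comparable across files.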
133+
134+
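The batch-query bullet mentions a `_chunked()` helper that keeps `IN (...)` queries under SQLite's 999-variable limit. A minimal sketch of that pattern follows; the signature, the chunk size of 900, the `fetch_churn_scores` function, and the simplified `churn_stats` columns are all assumptions rather than the project's actual code.

```python
import sqlite3
from collections.abc import Iterator, Sequence


def chunked(items: Sequence[str], size: int = 900) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items (hypothetical helper)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_churn_scores(conn: sqlite3.Connection, file_paths: Sequence[str]) -> dict[str, float]:
    """Fetch churn scores for many files with one query per chunk.

    This replaces one query per file (the N+1 pattern) with
    ceil(len(file_paths) / 900) queries, each under the placeholder limit.
    """
    scores: dict[str, float] = {}
    for chunk in chunked(file_paths):
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            f"SELECT file_path, churn_score FROM churn_stats "
            f"WHERE file_path IN ({placeholders})",
            list(chunk),
        )
        for file_path, churn in rows:
            scores[file_path] = churn
    return scores
```

Only the placeholder string is interpolated into the SQL; the file paths themselves are still passed as bound parameters.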
## Supported Languages (12)
135+
136+
| Language | AST Method | Test Frameworks |
137+
|----------|-----------|-----------------|
138+
| Python | `ast` module (regex fallback) | pytest |
139+
| JavaScript/TypeScript | Regex | Jest, Playwright |
140+
| Go | Regex | Go test |
141+
| Rust | Regex | `#[test]`, `#[cfg(test)]` |
142+
| C# | Regex (nested generics, attributes) | xUnit, NUnit, MSTest |
143+
| Java | Regex (annotations, nested generics) | JUnit |
144+
| Kotlin | Regex (extension functions) | JUnit |
145+
| C/C++ | Regex (templates, destructors) | gtest, Catch2 |
146+
| Swift | Regex (@attributes) | XCTest |
147+
| PHP | Regex | PHPUnit |
148+
| Ruby | Keyword-based block detection | RSpec, Minitest |
149+
| Dart | Regex (factory, getters/setters) | Dart test |
221150

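Most rows in the table above rely on regex extraction, which is what `register_extractor(lang, fn)` is meant to override. The sketch below shows one way such a registry could work. Only the `register_extractor(lang, fn)` name and argument order come from the diff; the extractor's input (source text), its return shape, and the `extract_units` and `builtin_extract` names are assumptions.

```python
import re
from typing import Callable

# Hypothetical unit shape: (name, unit_type, line_start, line_end).
Unit = tuple[str, str, int, int]
Extractor = Callable[[str], list[Unit]]

_EXTRACTORS: dict[str, Extractor] = {}

_DEF_RE = re.compile(r"^\s*def\s+(\w+)\s*\(")


def register_extractor(lang: str, fn: Extractor) -> None:
    """Register a callable that replaces the built-in extraction for `lang`."""
    _EXTRACTORS[lang.lower()] = fn


def builtin_extract(source: str) -> list[Unit]:
    """Stand-in for built-in regex extraction: finds `def name(` lines only."""
    units: list[Unit] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = _DEF_RE.match(line)
        if match:
            units.append((match.group(1), "function", lineno, lineno))
    return units


def extract_units(lang: str, source: str) -> list[Unit]:
    """Use a registered extractor if one exists, else fall back to the built-in."""
    extractor = _EXTRACTORS.get(lang.lower(), builtin_extract)
    return extractor(source)
```

Because the registry stores plain callables, a tree-sitter or LSP-backed extractor can be plugged in by the caller without the core package importing either dependency.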
222151
## File Structure
223152

@@ -226,35 +155,39 @@ Chisel/
226155
├── chisel/
227156
│ ├── __init__.py # version
228157
│ ├── engine.py # orchestrator
229-
│ ├── test_mapper.py # test file parsing, edge building
230-
│ ├── git_analyzer.py # git log/blame parsing, churn/ownership
231-
│ ├── impact.py # impact analysis, risk scoring
232158
│ ├── storage.py # SQLite persistence
233-
│ ├── ast_utils.py # lightweight AST helpers (multi-lang)
234-
│ ├── cli.py # CLI entry point
159+
│ ├── ast_utils.py # multi-language AST extraction + plugin registry
160+
│ ├── git_analyzer.py # git log/blame parsing
161+
│ ├── metrics.py # churn, ownership, co-change computation
162+
│ ├── test_mapper.py # test discovery, deps, edge building
163+
│ ├── impact.py # impact analysis, risk scoring, reviewers
164+
│ ├── project.py # project root, path normalization, ProcessLock
165+
│ ├── rwlock.py # read-write lock
166+
│ ├── schemas.py # JSON Schema defs + dispatch table
167+
│ ├── cli.py # argparse CLI (17 subcommands)
235168
│ ├── mcp_server.py # HTTP MCP server
236169
│ └── mcp_stdio.py # stdio MCP server
237170
├── tests/
238-
│ ├── test_test_mapper.py
239-
│ ├── test_git_analyzer.py
240-
│ ├── test_impact.py
241-
│ └── test_storage.py
171+
│ ├── conftest.py # shared fixtures (temp git repos)
172+
│ ├── test_ast_utils.py # AST extraction tests
173+
│ ├── test_storage.py # storage CRUD + batch query tests
174+
│ ├── test_git_analyzer.py # git parsing tests
175+
│ ├── test_metrics.py # churn, co-change tests
176+
│ ├── test_test_mapper.py # framework detection, edge building tests
177+
│ ├── test_impact.py # impact, risk, ownership tests
178+
│ ├── test_engine.py # integration tests
179+
│ ├── test_cli.py # CLI handler tests
180+
│ ├── test_mcp_server.py # HTTP server tests
181+
│ ├── test_mcp_stdio.py # stdio server tests
182+
│ ├── test_rwlock.py # concurrency tests
183+
│ └── test_project.py # project root, path, lock tests
184+
├── wiki-local/ # detailed docs (spec, glossary, index)
242185
├── pyproject.toml
243-
├── CLAUDE.md
244-
├── ARCHITECTURE.md
186+
├── CLAUDE.md # agent instructions
187+
├── ARCHITECTURE.md # this file
188+
├── CHANGELOG.md
189+
├── CONTRIBUTING.md
190+
├── COMPLETE_PROJECT_DOCUMENTATION.md
191+
├── LLM_Development.md
245192
└── README.md
246193
```
247-
248-
## Integration with Stele
249-
250-
Chisel can optionally connect to a Stele instance for richer analysis:
251-
252-
```python
253-
# If Stele DB exists, read symbol graph directly
254-
stele_db = Path(".stele/index.db")
255-
if stele_db.exists():
256-
symbols = read_stele_symbols(stele_db) # direct SQLite read
257-
# Enriches test→code edges with Stele's cross-file symbol resolution
258-
```
259-
260-
This avoids duplicating Stele's symbol extraction while adding the test/git layer on top.
