1- # Chisel — Test Impact & Code Intelligence for LLM Agents
1+ # Chisel — Architecture
22
3- Zero-dependency companion to [Stele](../Stele/). Maps tests to code, code to git history, and answers: "what to run, what's risky, who touched it."
3+ **Version:** 0.6.1 | **Python:** >= 3.11 | **Dependencies:** zero (stdlib only)
44
5- ## Problem
6-
7- An LLM agent changes `engine.py:store_document()`. It then either:
8- - Runs **all** 287 tests (slow, wasteful), or
9- - Guesses with `-k "test_store"` (misses regressions)
10-
11- It also has no idea whether this function is stable (untouched for months) or a hotspot (changed 15 times this week).
5+ Test impact analysis and code intelligence for LLM agents. Maps tests to code, code to git history, and answers: "what to run, what's risky, who touched it."
126
137## Core Data Model
148
15- A single weighted graph with three edge types layered on Stele's symbol graph:
9+ A weighted graph with three edge types stored in SQLite:
1610
1711```
1812┌─────────────┐ ┌──────────────┐ ┌──────────────┐
1913│ Code Unit │──────▶│ Test Unit │──────▶│ Git History │
2014│ (function, │ calls │ (test file, │ │ (commits, │
2115│ class, │◀──────│ test func) │ │ blame lines, │
22- │ module) │imports│ │ │ churn stats) │
16+ │ struct...) │imports│ │ │ churn stats) │
2317└─────────────┘ └──────────────┘ └──────────────┘
2418 │ │
2519 └────────────────────────────────────────────┘
@@ -30,194 +24,129 @@ A single weighted graph with three edge types layered on Stele's symbol graph:
3024
3125| Entity | Source | Key Fields |
3226| --------| --------| ------------|
33- | `CodeUnit` | Stele symbol graph + AST | file, name, type (func/class/module), line range |
34- | `TestUnit` | Test file AST parsing | file, name, framework (pytest/jest/go test), line range |
27+ | `CodeUnit` | AST extraction (12 languages) | file, name, type (function/class/struct/enum/impl), line range |
28+ | `TestUnit` | Test file parsing + framework detection | file, name, framework, line range |
3529| `CommitRecord` | `git log --numstat` | hash, author, date, files changed, insertions, deletions |
3630| `BlameBlock` | `git blame --porcelain` | file, line range, commit, author, date |
3731
3832### Edges
3933
4034| Edge | How Built | Weight |
4135| ------| -----------| --------|
42- | `test → code_unit` | Parse test imports + function calls via AST | call count |
36+ | `test → code_unit` | Parse test imports + function calls | proximity-based (0.4-1.0): same dir=1.0, sibling=0.8, shared ancestor=0.6, distant=0.4 |
4337| `code_unit → commit` | `git log -L :funcname:file` or blame | recency-weighted |
44- | `file → file` (co-change) | Files that appear in same commits | co-occurrence count |
38+ | `file → file` (co-change) | Files in same commits (≥3 co-commits) | co-occurrence count |
4539
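The proximity tiers in the edge table can be illustrated with a small sketch. The function name `proximity_weight` and the exact meaning of "sibling" (directories sharing a parent) and "shared ancestor" (any common leading path component) are assumptions made for illustration; the diff only states the four weights.

```python
from pathlib import PurePosixPath


def proximity_weight(test_path: str, code_path: str) -> float:
    """Weight a test -> code edge by directory distance (assumed tiers)."""
    test_dir = PurePosixPath(test_path).parent
    code_dir = PurePosixPath(code_path).parent
    if test_dir == code_dir:
        return 1.0  # same directory
    if test_dir.parent == code_dir.parent:
        return 0.8  # sibling directories (assumed: same parent)
    # Count leading path components the two directories have in common.
    shared = 0
    for a, b in zip(test_dir.parts, code_dir.parts):
        if a != b:
            break
        shared += 1
    # Assumed: any common prefix counts as a shared ancestor.
    return 0.6 if shared else 0.4
```

Under these assumptions, `tests/test_engine.py` and `chisel/engine.py` sit in sibling directories, so the edge would get 0.8.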
46- ## Architecture
40+ ## Module Architecture
4741
4842```
49- Chisel (engine.py) — main orchestrator
43+ ChiselEngine (engine.py) — main orchestrator
5044 ├── TestMapper (test_mapper.py)
51- │ ├── Parse test files, detect framework (pytest/jest/go/rust)
52- │ ├── Extract imports + call targets
53- │ └── Build test → code_unit edges
45+ │ ├── Discover test files, detect framework
46+ │ ├── Extract imports + call targets (per-language)
47+ │ └── Build test → code_unit edges with proximity weights
5448 ├── GitAnalyzer (git_analyzer.py)
55- │ ├── Parse `git log` output (no gitpython dep)
49+ │ ├── Parse `git log` output (no gitpython)
5650 │ ├── Parse `git blame` output
57- │ ├── Compute churn scores, ownership, co-change coupling
58- │ └── Build code_unit → commit edges
51+ │ └── Function-level log via `git log -L`
52+ ├── Metrics (metrics.py)
53+ │ ├── Churn scoring: sum(1 / (1 + days_since))
54+ │ ├── Ownership aggregation from blame blocks
55+ │ └── Co-change coupling detection
5956 ├── ImpactAnalyzer (impact.py)
60- │ ├── Given changed files/functions → affected tests (via test edges)
61- │ ├── Risk score = f(churn, recency, coupling breadth)
62- │ ├── Ownership query ("who last touched this, how often")
63- │ └── Stale test detection (tests that cover dead/removed code)
64- ├── Storage (storage.py, SQLite)
65- │ ├── Cached graph edges
66- │ ├── Git history snapshots
67- │ └── Incremental update (only re-parse changed files)
57+ │ ├── Impacted tests (direct + transitive via coupling)
58+ │ ├── Risk scoring (5-component weighted formula)
59+ │ ├── Stale test detection (orphaned edge refs)
60+ │ └── Reviewer suggestions (commit-activity-based)
61+ ├── AST Utils (ast_utils.py)
62+ │ ├── Multi-language extraction (12 languages)
63+ │ ├── Pluggable extractor registry (tree-sitter/LSP hooks)
64+ │ └── Brace matching with multi-line block comment tracking
65+ ├── Storage (storage.py, SQLite WAL)
66+ │ ├── 10 tables, single persistent connection
67+ │ ├── Batch query methods for N+1 elimination
68+ │ └── Incremental update via content hashes
69+ ├── Project (project.py)
70+ │ ├── Project root detection (worktree-aware)
71+ │ ├── Path normalization (cross-platform)
72+ │ └── ProcessLock (fcntl on Unix, LockFileEx on Windows)
73+ ├── RWLock (rwlock.py) — in-process read/write lock
74+ ├── Schemas (schemas.py) — JSON Schema defs + dispatch table
6875 └── APIs
69- ├── CLI (cli.py)
70- ├── MCP stdio (mcp_stdio.py) — for Claude Desktop
71- └── HTTP (mcp_server.py) — for Claude Code
72-
73- Stele integration:
74- └── Reads Stele's symbol graph (optional, falls back to own AST parsing)
76+ ├── CLI (cli.py) — 17 subcommands
77+ ├── HTTP (mcp_server.py) — GET /tools, /health; POST /call
78+ └── stdio MCP (mcp_stdio.py) — for Claude Desktop/Cursor
7579```
7680
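The churn formula listed under `Metrics`, `sum(1 / (1 + days_since))`, can be sketched as below. The function name `churn_score` and the choice to measure age in whole days are assumptions; the diff does not show how `metrics.py` computes `days_since`.

```python
from datetime import datetime, timedelta, timezone


def churn_score(commit_dates: list[datetime], now: datetime) -> float:
    """Sum 1 / (1 + days_since) over commits; recent commits dominate."""
    total = 0.0
    for committed in commit_dates:
        # Assumed: age in whole days, clamped so future timestamps count as 0.
        days_since = max((now - committed).days, 0)
        total += 1.0 / (1.0 + days_since)
    return total


now = datetime(2026, 3, 17, tzinfo=timezone.utc)
dates = [now, now - timedelta(days=1), now - timedelta(days=9)]
score = churn_score(dates, now)  # contributions: 1/1 + 1/2 + 1/10
```

A commit made today contributes 1.0, while one from nine days ago contributes only 0.1, which is why a burst of recent edits marks a hotspot.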
77- ## SQLite Tables
81+ ## SQLite Tables (10)
7882
7983```sql
80- -- Code units (functions, classes, modules)
81- CREATE TABLE code_units (
82- id TEXT PRIMARY KEY, -- file:name:type
83- file_path TEXT NOT NULL,
84- name TEXT NOT NULL,
85- unit_type TEXT NOT NULL, -- func, class, module
86- line_start INTEGER,
87- line_end INTEGER,
88- content_hash TEXT,
89- updated_at TEXT
90- );
91-
92- -- Test units
93- CREATE TABLE test_units (
94- id TEXT PRIMARY KEY,
95- file_path TEXT NOT NULL,
96- name TEXT NOT NULL,
97- framework TEXT, -- pytest, jest, go, rust, playwright
98- line_start INTEGER,
99- line_end INTEGER,
100- content_hash TEXT,
101- updated_at TEXT
102- );
103-
104- -- Test → code_unit edges
105- CREATE TABLE test_edges (
106- test_id TEXT REFERENCES test_units(id),
107- code_id TEXT REFERENCES code_units(id),
108- edge_type TEXT, -- import, call, fixture
109- weight REAL DEFAULT 1.0,
110- PRIMARY KEY (test_id, code_id, edge_type)
111- );
112-
113- -- Git commit records
114- CREATE TABLE commits (
115- hash TEXT PRIMARY KEY,
116- author TEXT,
117- author_email TEXT,
118- date TEXT,
119- message TEXT
120- );
121-
122- -- Commit → file changes
123- CREATE TABLE commit_files (
124- commit_hash TEXT REFERENCES commits(hash),
125- file_path TEXT,
126- insertions INTEGER,
127- deletions INTEGER,
128- PRIMARY KEY (commit_hash, file_path)
129- );
130-
131- -- Blame cache (per file, invalidated by content hash)
132- CREATE TABLE blame_cache (
132- file_path TEXT,
133- line_start INTEGER,
134- line_end INTEGER,
135- commit_hash TEXT,
136- author TEXT,
137- author_email TEXT,
138- date TEXT,
139- content_hash TEXT, -- of the file when blame was run
141- PRIMARY KEY (file_path, line_start)
142- );
143-
144- -- Co-change coupling
145- CREATE TABLE co_changes (
145- file_a TEXT,
146- file_b TEXT,
147- co_commit_count INTEGER,
148- last_co_commit TEXT,
150- PRIMARY KEY (file_a, file_b)
151- );
152-
153- -- Churn summary (materialized view, rebuilt on analyze)
154- CREATE TABLE churn_stats (
154- file_path TEXT,
155- unit_name TEXT, -- nullable (file-level if null)
156- commit_count INTEGER,
157- distinct_authors INTEGER,
158- total_insertions INTEGER,
159- total_deletions INTEGER,
160- last_changed TEXT,
161- churn_score REAL, -- weighted: recent changes count more
163- PRIMARY KEY (file_path, unit_name)
164- );
84+ code_units — functions, classes, structs (id = file:name:type)
85+ test_units — test functions (id = file:name)
86+ test_edges — test → code links with edge_type and weight
87+ commits — git commit metadata
88+ commit_files — per-file stats per commit
89+ blame_cache — cached git blame, keyed by content hash
90+ co_changes — file pairs that change together
91+ churn_stats — churn scores per file and per function
92+ file_hashes — content hashes for incremental analysis
93+ test_results — recorded pass/fail outcomes for prioritization
16594```
16695
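How the `file_hashes` table might drive incremental analysis can be sketched as follows. The two-column schema, the SHA-256 digest, and the names `open_db` and `changed_files` are assumptions for illustration; the diff names the table but does not show its columns.

```python
import hashlib
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory DB with an assumed file_hashes schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_hashes ("
        "file_path TEXT PRIMARY KEY, content_hash TEXT NOT NULL)"
    )
    return conn


def changed_files(conn: sqlite3.Connection, contents: dict[str, bytes]) -> list[str]:
    """Return paths whose content hash differs from the stored one, then store it."""
    changed = []
    for path, data in sorted(contents.items()):
        digest = hashlib.sha256(data).hexdigest()
        row = conn.execute(
            "SELECT content_hash FROM file_hashes WHERE file_path = ?", (path,)
        ).fetchone()
        if row is None or row[0] != digest:
            changed.append(path)
            conn.execute(
                "INSERT OR REPLACE INTO file_hashes (file_path, content_hash) "
                "VALUES (?, ?)",
                (path, digest),
            )
    conn.commit()
    return changed
```

On a first run every file is reported as changed; on later runs only files whose bytes differ would need re-parsing or re-blaming.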
167- ## Key Queries (MCP Tools)
96+ ## 15 MCP Tools
16897
16998| Tool | Input | Output |
17099| ------| -------| --------|
171- | `impact` | changed files/functions | affected test list + risk score |
172- | `suggest_tests` | file path or diff | ordered test list to run |
173- | `churn` | file or function | churn score, commit count, last changed |
174- | `ownership` | file or function | author breakdown (% of blame lines) |
175- | `coupling` | file path | co-change partners + strength |
176- | `risk_map` | directory | heatmap of risk scores |
177- | `stale_tests` | — | tests covering removed/renamed code |
178- | `history` | function name | commit timeline with diffs |
179- | `who_reviews` | file or diff | suggested reviewers by ownership |
180- | `analyze` | directory | full rebuild of git + test graph |
181-
182- ## Design Decisions
183-
184- - **Zero deps**: stdlib only. `ast` for Python, regex for JS/TS/Go/Rust. `subprocess.run(["git", ...])` for git.
185- - **Git is the source of truth**: No gitpython. Parse `git log --format=...` and `git blame --porcelain` text output.
186- - **Incremental**: Track file content hashes. Only re-parse test files and re-blame files that changed since last run.
187- - **Stele optional**: Can read Stele's SQLite DB directly for symbol graph if available. Falls back to own lightweight AST extraction.
188- - **Framework detection**: Auto-detect test framework from file patterns (`test_*.py`, `*.test.js`, `*_test.go`) and imports (`import pytest`, `describe(`).
189- - **Churn score formula**: `sum(1 / (1 + days_since_commit))`, so recent changes weigh heavily and old changes decay.
190- - **Co-change threshold**: Only store pairs with >= 3 co-commits to avoid noise.
191- - **Risk score**: `0.4 * churn_norm + 0.3 * coupling_breadth_norm + 0.2 * (1 - test_coverage) + 0.1 * author_concentration`. Higher = riskier to change.
192- - **Blame caching**: Blame is expensive. Cache by file content hash, invalidate on change.
193- - **Thread safety**: Same RWLock pattern as Stele for concurrent MCP access.
194-
195- ## CLI Examples
196-
197- ```bash
198- # Analyze a project (builds all graphs)
199- chisel analyze .
200-
201- # What tests should I run after editing engine.py?
202- chisel impact engine.py
203- # → test_engine.py (direct import, 23 calls)
204- # → test_integration.py (transitive via storage.py)
205- # → Risk: HIGH (churn=0.82, 15 commits in 7 days)
206-
207- # Who owns this code?
208- chisel ownership stele/engine.py
209- # → IronAdamant: 94% (blame lines)
210- # → Last changed: 2026-03-16
211-
212- # What files always change together?
213- chisel coupling stele/storage.py
214- # → stele/engine.py (18 co-commits)
215- # → stele/session_storage.py (12 co-commits)
216-
217- # Which tests are stale?
218- chisel stale-tests
219- # → test_old_feature.py:test_removed_api — calls `old_function` (removed in abc123)
220- ```
100+ | `analyze` | directory, force | full rebuild stats |
101+ | `update` | — | incremental update stats |
102+ | `impact` | files, functions | affected tests + scores |
103+ | `diff_impact` | ref (auto-detects branch) | affected tests from git diff |
104+ | `suggest_tests` | file_path | ranked tests by relevance + failure rate |
105+ | `churn` | file, unit_name | churn score, commits, authors |
106+ | `ownership` | file | author breakdown (blame-based, role=original_author) |
107+ | `who_reviews` | file | reviewer suggestions (activity-based, role=suggested_reviewer) |
108+ | `coupling` | file, min_count | co-change partners |
109+ | `risk_map` | directory | risk scores (batch-computed) |
110+ | `stale_tests` | — | tests pointing at removed code |
111+ | `test_gaps` | file, directory | untested code units by churn risk |
112+ | `history` | file | commit timeline |
113+ | `record_result` | test_id, passed, duration | store for future prioritization |
114+ | `stats` | — | database summary counts |
115+
116+ All list-returning tools accept a `limit` parameter to cap result size.
117+
118+ ## Key Design Decisions
119+
120+ - **Zero deps**: stdlib only. `ast` for Python, regex for 11 other languages. `subprocess.run(["git", ...])` for git. Requires Python >= 3.11.
121+ - **Pluggable extractors**: `register_extractor(lang, fn)` overrides built-in regex with tree-sitter/LSP. Zero-dep: just callable hooks.
122+ - **Proximity-based edge weights**: 0.4-1.0 based on directory distance. Python import-path matching (`from myapp.utils import foo` → `myapp/utils.py:foo`) takes priority.
123+ - **Risk formula**: `0.35*churn + 0.25*coupling + 0.2*coverage_gap + 0.1*author_concentration + 0.1*test_instability`
124+ - **Batch queries**: `get_risk_map()` fetches all data in ~5 queries. `_chunked()` helper stays under SQLite's 999-variable limit (the default `SQLITE_MAX_VARIABLE_NUMBER` in builds before SQLite 3.32.0).
125+ - **Churn formula**: `sum(1 / (1 + days_since_commit))`, so recent changes weigh heavily.
126+ - **Co-change threshold**: Adaptive `max(3, total_commits // 4)`. Commits touching >50 files are skipped.
127+ - **Blame caching**: Cached by file content hash, invalidated on change.
128+ - **Incremental analysis**: File content hashes tracked in `file_hashes` table.
129+ - **FK enforcement disabled**: Stale test detection relies on orphaned edge refs.
130+ - **Cross-platform locking**: `ProcessLock` uses `fcntl.flock` (Unix) / `LockFileEx` via ctypes (Windows). Shared locks for reads, exclusive for writes.
131+ - **Thread safety**: RWLock (in-process) + ProcessLock (cross-process). Lock order: process lock outer, RWLock inner.
132+ - **Multi-line block comments**: `_strip_strings_and_comments` tracks `/* */` state across lines for correct brace matching.
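
The batch-query decision above can be sketched roughly as follows. This is not the project's `_chunked()` helper: the names `chunked` and `churn_for_files`, the chunk size of 900, and the simplified two-column `churn_stats` table are assumptions for illustration.

```python
import sqlite3
from typing import Iterator, Sequence


def chunked(items: Sequence[str], size: int = 900) -> Iterator[Sequence[str]]:
    """Yield slices small enough to stay under a 999 bound-variable limit."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def churn_for_files(conn: sqlite3.Connection, paths: Sequence[str]) -> dict[str, float]:
    """Fetch churn scores for many files with one query per chunk, not per file."""
    scores: dict[str, float] = {}
    for chunk in chunked(paths):
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            "SELECT file_path, churn_score FROM churn_stats "
            f"WHERE file_path IN ({placeholders})",
            list(chunk),
        ).fetchall()
        scores.update(rows)  # each row is a (file_path, churn_score) pair
    return scores
```

With 2,000 paths this issues three `IN (...)` queries instead of 2,000 single-row lookups, which is the N+1 pattern the batch methods are meant to avoid.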
133+
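The five-component risk formula from the list above can be written out as a small sketch. It assumes every component has already been normalized to the range 0 to 1, which the diff does not state; the names `WEIGHTS` and `risk_score` are hypothetical.

```python
# Weights copied from the risk formula; they sum to 1.0.
WEIGHTS = {
    "churn": 0.35,
    "coupling": 0.25,
    "coverage_gap": 0.2,
    "author_concentration": 0.1,
    "test_instability": 0.1,
}


def risk_score(components: dict[str, float]) -> float:
    """Weighted sum of risk components; missing components count as 0."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Assumed: inputs are normalized, so clamp to [0, 1] defensively.
        value = min(max(components.get(name, 0.0), 0.0), 1.0)
        total += weight * value
    return total


example = risk_score({"churn": 0.8, "coupling": 0.4, "coverage_gap": 1.0})
```

Under these assumptions, a heavily churned, untested file with moderate coupling scores about 0.58 out of a maximum of 1.0.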
134+ ## Supported Languages (12)
135+
136+ | Language | AST Method | Test Frameworks |
137+ | ---------- | ----------- | ----------------- |
138+ | Python | `ast` module (regex fallback) | pytest |
139+ | JavaScript/TypeScript | Regex | Jest, Playwright |
140+ | Go | Regex | Go test |
141+ | Rust | Regex | `#[test]`, `#[cfg(test)]` |
142+ | C# | Regex (nested generics, attributes) | xUnit, NUnit, MSTest |
143+ | Java | Regex (annotations, nested generics) | JUnit |
144+ | Kotlin | Regex (extension functions) | JUnit |
145+ | C/C++ | Regex (templates, destructors) | gtest, Catch2 |
146+ | Swift | Regex (@attributes) | XCTest |
147+ | PHP | Regex | PHPUnit |
148+ | Ruby | Keyword-based block detection | RSpec, Minitest |
149+ | Dart | Regex (factory, getters/setters) | Dart test |
221150
222151## File Structure
223152
@@ -226,35 +155,39 @@ Chisel/
226155├── chisel/
227156│ ├── __init__.py # version
228157│ ├── engine.py # orchestrator
229- │ ├── test_mapper.py # test file parsing, edge building
230- │ ├── git_analyzer.py # git log/blame parsing, churn/ownership
231- │ ├── impact.py # impact analysis, risk scoring
232158│ ├── storage.py # SQLite persistence
233- │ ├── ast_utils.py # lightweight AST helpers (multi-lang)
234- │ ├── cli.py # CLI entry point
159+ │ ├── ast_utils.py # multi-language AST extraction + plugin registry
160+ │ ├── git_analyzer.py # git log/blame parsing
161+ │ ├── metrics.py # churn, ownership, co-change computation
162+ │ ├── test_mapper.py # test discovery, deps, edge building
163+ │ ├── impact.py # impact analysis, risk scoring, reviewers
164+ │ ├── project.py # project root, path normalization, ProcessLock
165+ │ ├── rwlock.py # read-write lock
166+ │ ├── schemas.py # JSON Schema defs + dispatch table
167+ │ ├── cli.py # argparse CLI (17 subcommands)
235168│ ├── mcp_server.py # HTTP MCP server
236169│ └── mcp_stdio.py # stdio MCP server
237170├── tests/
238- │ ├── test_test_mapper.py
239- │ ├── test_git_analyzer.py
240- │ ├── test_impact.py
241- │ └── test_storage.py
171+ │ ├── conftest.py # shared fixtures (temp git repos)
172+ │ ├── test_ast_utils.py # AST extraction tests
173+ │ ├── test_storage.py # storage CRUD + batch query tests
174+ │ ├── test_git_analyzer.py # git parsing tests
175+ │ ├── test_metrics.py # churn, co-change tests
176+ │ ├── test_test_mapper.py # framework detection, edge building tests
177+ │ ├── test_impact.py # impact, risk, ownership tests
178+ │ ├── test_engine.py # integration tests
179+ │ ├── test_cli.py # CLI handler tests
180+ │ ├── test_mcp_server.py # HTTP server tests
181+ │ ├── test_mcp_stdio.py # stdio server tests
182+ │ ├── test_rwlock.py # concurrency tests
183+ │ └── test_project.py # project root, path, lock tests
184+ ├── wiki-local/ # detailed docs (spec, glossary, index)
242185├── pyproject.toml
243- ├── CLAUDE.md
244- ├── ARCHITECTURE.md
186+ ├── CLAUDE.md # agent instructions
187+ ├── ARCHITECTURE.md # this file
188+ ├── CHANGELOG.md
189+ ├── CONTRIBUTING.md
190+ ├── COMPLETE_PROJECT_DOCUMENTATION.md
191+ ├── LLM_Development.md
245192└── README.md
246193```
247-
248- ## Integration with Stele
249-
250- Chisel can optionally connect to a Stele instance for richer analysis:
251-
252- ```python
253- # If Stele DB exists, read symbol graph directly
254- stele_db = Path(".stele/index.db")
255- if stele_db.exists():
256-     symbols = read_stele_symbols(stele_db)  # direct SQLite read
257-     # Enriches test→code edges with Stele's cross-file symbol resolution
258- ```
259-
260- This avoids duplicating Stele's symbol extraction while adding the test/git layer on top.