
Commit 943ecb6

IronAdamant and claude committed
v0.6.6: LLM-optimize MCP responses — structured hints, diagnostic empties, prescriptive schemas
Structured next_steps: hints are now {"tool", "args", "reason"} dicts instead of prose strings, so LLM agents can directly invoke suggestions. Non-tool guidance uses {"action", "reason"}.

Diagnostic diff_impact: returns {status: "no_changes", ref, branch, message} instead of bare [] when no diff found — LLMs can now reason about why.

Stats coupling_threshold: exposes the adaptive threshold so LLMs can explain coupling=0.0 (threshold too high for project's change patterns).

Schema descriptions rewritten with prescriptive "Use when..." language and cross-references between related tools.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent aa0825d commit 943ecb6

8 files changed

Lines changed: 190 additions & 90 deletions


CLAUDE.md

Lines changed: 5 additions & 3 deletions
@@ -46,7 +46,9 @@ chisel/
 - **Numstat validation**: `_parse_log_output` in `git_analyzer.py` validates tab-separated fields are digits or `-` before treating them as numstat. Diff lines with tabs were being misidentified as numstat entries in `git log -L` output.
 - **Encoding safety**: All `subprocess.run()` calls use `encoding="utf-8", errors="replace"`. Git history may contain non-UTF-8 bytes (Latin-1 commit messages, binary diff fragments); these are replaced with `U+FFFD` instead of crashing. File reads in `engine.py` and `test_mapper.py` already used `errors="replace"`.
 - **Empty-state detection**: All 12 query tools return `{"status": "no_data", "message": "...", "hint": "chisel analyze"}` when the DB has no analysis data, instead of `[]`. `_check_analysis_data()` in `engine.py` calls `storage.has_analysis_data()` (`SELECT 1 FROM code_units LIMIT 1`). Write tools (`analyze`, `update`, `record_result`) and `stats` are unaffected. `stats` adds a `hint` key when all counts are zero. CLI detects this via `_is_no_data()` in `cli.py`.
-- **Next-step suggestions**: `next_steps.py` provides `compute_next_steps(tool_name, result)` which returns contextual follow-up suggestions per tool. Integrated at the dispatch level in `mcp_server.py` — HTTP responses include `"next_steps": [...]` as a sibling to `"result"`, stdio wraps both in a `{"result": ..., "next_steps": [...]}` envelope. CLI is unaffected. Only tools with registered hint functions get suggestions; others return empty.
+- **Next-step suggestions**: `next_steps.py` provides `compute_next_steps(tool_name, result)` which returns structured suggestions per tool. Each hint is a dict: `{"tool": "...", "args": {...}, "reason": "..."}` for tool invocations, or `{"action": "...", "reason": "..."}` for non-tool guidance. LLM agents can directly invoke suggested tools without parsing prose. Integrated at the dispatch level in `mcp_server.py` — HTTP responses include `"next_steps": [...]` as a sibling to `"result"`, stdio wraps both in a `{"result": ..., "next_steps": [...]}` envelope. CLI is unaffected. Only tools with registered hint functions get suggestions; others return empty.
+- **Diagnostic empty responses**: `diff_impact` returns `{"status": "no_changes", "ref": ..., "branch": ..., "message": ...}` instead of bare `[]` when no diff is found. CLI `_is_no_data()` handles both `"no_data"` and `"no_changes"` status values. `_hints_diff_impact` in `next_steps.py` handles the diagnostic dict case, suggesting `diff_impact` with `HEAD~1` or `update`.
+- **LLM-prescriptive schema descriptions**: Tool descriptions in `schemas.py` use prescriptive language ("Use when...", "Use after...") to help LLM agents decide which tool to call. Cross-references between related tools (ownership↔who_reviews, analyze↔update, impact↔diff_impact). Coupling description references `stats` for threshold visibility.
 - **Inline coupling partners**: `risk_map` includes `"coupling_partners"` (top 3 by co-commit count) in each file entry alongside the breakdown. Data is already fetched in the batch query — no extra DB calls.
 - **Triage tool**: Composite `triage` runs `risk_map` (top-N) + `test_gaps` (filtered to top-N files) + `stale_tests` in a single read lock. Returns a dict, not a list, so `limit` is not injected.
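
As a rough illustration of how an agent-side client might consume the structured hints described in the bullets above, the sketch below splits a `next_steps` list into directly invocable tool calls and non-tool actions. The helper name `split_next_steps` and the sample payload are hypothetical; only the `tool`/`args`/`reason` and `action`/`reason` shapes come from this commit.

```python
def split_next_steps(next_steps):
    """Split hints into invocable tool calls and non-tool actions.

    Hypothetical client-side helper; not part of Chisel.
    """
    tool_calls, actions = [], []
    for hint in next_steps:
        if "tool" in hint:
            # args may be partial; the caller fills in the rest from context.
            tool_calls.append((hint["tool"], hint.get("args", {})))
        elif "action" in hint:
            actions.append(hint["action"])
    return tool_calls, actions


# Hints shaped like the ones this commit emits for a non-empty diff_impact result.
sample = [
    {"action": "run_tests", "reason": "Execute impacted tests to verify changes"},
    {"tool": "record_result", "args": {}, "reason": "Log outcomes for future prioritization"},
    {"tool": "coupling", "args": {}, "reason": "Check changed files for hidden dependents"},
]

tool_calls, actions = split_next_steps(sample)
print(tool_calls)
print(actions)
```

Keeping the two kinds apart means a client never tries to dispatch an `action` label such as `run_tests` as if it were a Chisel tool.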

@@ -81,11 +83,11 @@ next_steps.py → (no internal deps)

 Each wired through: engine.tool_*() → CLI subcommand, HTTP POST /call, stdio MCP.

-- **`diff_impact`**: Auto-detects changed files/functions from `git diff` and returns impacted tests. Branch-aware: on feature branches diffs against main; on main diffs against HEAD.
+- **`diff_impact`**: Auto-detects changed files/functions from `git diff` and returns impacted tests. Branch-aware: on feature branches diffs against main; on main diffs against HEAD. Returns diagnostic dict (`status: "no_changes"`) with `ref`, `branch`, `message` when no diff is found, instead of bare `[]`.
 - **`update`**: Incremental re-analysis — only re-processes changed files and new commits.
 - **`test_gaps`**: Finds code units with zero test coverage, prioritized by churn risk. Excludes test files by default.
 - **`record_result`**: Records test pass/fail outcomes. Feeds into `suggest_tests` (failure rate boost) and `risk_map` (test instability component).
-- **`stats`**: Returns summary counts for all database tables (code units, tests, edges, commits, etc.).
+- **`stats`**: Returns summary counts for all database tables plus `coupling_threshold` (when commits > 0) so LLM agents can diagnose coupling=0.0 results.
 - **`triage`**: Combined risk_map + test_gaps + stale_tests for top-N riskiest files. Single command for pre-audit/refactor prioritization. Returns `{top_risk_files, test_gaps, stale_tests, summary}`.
 - **`limit` parameter**: All list-returning tools accept `limit` to cap result size.
 - **Adaptive coupling threshold**: `max(3, total_commits // 4)` — scales with project maturity.
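
To make the adaptive threshold concrete, here is a small worked example of the documented formula `max(3, total_commits // 4)`. The function name `coupling_threshold` is chosen for illustration only, and how Chisel compares co-commit counts against the threshold is not shown in this commit.

```python
def coupling_threshold(total_commits):
    """Adaptive threshold as documented: max(3, total_commits // 4)."""
    return max(3, total_commits // 4)


# The floor of 3 applies to small histories; beyond that the threshold
# grows with the commit count.
for commits in (4, 12, 40, 200):
    print(commits, coupling_threshold(commits))
```

For a 200-commit history the threshold is 50, which suggests why a project with scattered change patterns could plausibly report coupling=0.0, and why exposing the value in `stats` helps an agent explain that result.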

chisel/cli.py

Lines changed: 2 additions & 2 deletions
@@ -173,8 +173,8 @@ def _limit(result, args):


 def _is_no_data(result):
-    """Check if *result* is a no-analysis-data warning from the engine."""
-    return isinstance(result, dict) and result.get("status") == "no_data"
+    """Check if *result* is a status response (no-data, no-changes, etc.)."""
+    return isinstance(result, dict) and result.get("status") in ("no_data", "no_changes")


 def _run_tool(args, method, kwargs, formatter, use_limit=True):

chisel/engine.py

Lines changed: 17 additions & 1 deletion
@@ -253,6 +253,9 @@ def tool_diff_impact(self, ref=None):

         If ref is not provided, auto-detects: on a feature branch diffs against
         main/master; on main diffs against HEAD (unstaged changes).
+
+        Returns a diagnostic dict (status="no_changes") instead of bare []
+        when no diff is found, so LLM agents can reason about why.
         """
         with self._process_lock.shared():
             with self.lock.read_lock():
@@ -263,7 +266,16 @@ def tool_diff_impact(self, ref=None):
                     ref = self._detect_diff_base()
                 changed_files = self.git.get_changed_files(ref)
                 if not changed_files:
-                    return []
+                    try:
+                        branch = self.git.get_current_branch()
+                    except RuntimeError:
+                        branch = None
+                    return {
+                        "status": "no_changes",
+                        "ref": ref,
+                        "branch": branch,
+                        "message": f"No files differ against '{ref}'",
+                    }
                 functions = []
                 for fp in changed_files:
                     try:
@@ -328,6 +340,10 @@ def tool_stats(self):
                 stats = self.storage.get_stats()
                 if all(v == 0 for v in stats.values()):
                     stats["hint"] = "All counts are zero. Run 'chisel analyze' to populate."
+                else:
+                    commit_count = stats.get("commits", 0)
+                    if commit_count > 0:
+                        stats["coupling_threshold"] = max(3, commit_count // 4)
                 return stats

     # ------------------------------------------------------------------ #
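
Since `tool_diff_impact` can now return either a list or a status dict, callers need to branch on the result shape. The following is a minimal sketch of such a caller, assuming only the keys shown in the hunks above; `summarize_diff_impact` is a hypothetical helper, not part of Chisel.

```python
def summarize_diff_impact(result):
    """Turn a diff_impact return value into a one-line summary.

    Hypothetical helper for illustration only.
    """
    if isinstance(result, dict) and result.get("status") == "no_changes":
        # branch may be None when the current branch could not be determined.
        branch = result.get("branch") or "unknown branch"
        return f"no changes on {branch} against {result.get('ref')!r}"
    if isinstance(result, dict) and result.get("status") == "no_data":
        return "no analysis data; run 'chisel analyze'"
    if isinstance(result, list):
        return f"{len(result)} impacted test(s)"
    return "unrecognized result"


print(summarize_diff_impact(
    {"status": "no_changes", "ref": "main", "branch": "feature/x",
     "message": "No files differ against 'main'"}
))
print(summarize_diff_impact([{"test_id": "t1"}, {"test_id": "t2"}]))
```

The `test_id` key in the sample list is a placeholder; the actual shape of impacted-test entries is not part of this diff.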

chisel/next_steps.py

Lines changed: 47 additions & 36 deletions
@@ -1,21 +1,28 @@
 """Contextual next-step suggestions for MCP tool responses.

 Computes follow-up tool suggestions based on what a tool returned,
-so LLM agents know what to invoke next. Only used by MCP servers
-(HTTP and stdio), not the CLI.
+so LLM agents can directly invoke the suggested next tool. Only used
+by MCP servers (HTTP and stdio), not the CLI.
+
+Each suggestion is a dict with:
+  - tool: Chisel tool name to invoke (omitted for non-tool actions)
+  - args: Arguments dict for the tool call (may be partial — LLM
+    fills remaining required args from context)
+  - action: Descriptive action label (only for non-tool suggestions)
+  - reason: Why this step is recommended
 """


 def compute_next_steps(tool_name, result):
-    """Return a list of next-step suggestion strings for a tool result.
+    """Return a list of structured next-step suggestions for a tool result.

     Args:
         tool_name: Name of the tool that produced the result.
         result: The tool's return value (dict or list).

     Returns:
-        List of strings, each a brief actionable suggestion. Empty list
-        if no suggestions apply.
+        List of dicts, each a structured suggestion with ``tool``/``args``
+        or ``action`` plus ``reason``. Empty list if no suggestions apply.
     """
     fn = _TOOL_HINTS.get(tool_name)
     if fn is None:
@@ -30,18 +37,18 @@ def compute_next_steps(tool_name, result):
 def _hints_analyze(result):
     if isinstance(result, dict) and "code_files_scanned" in result:
         return [
-            "Run 'risk_map' to identify high-risk files.",
-            "Run 'test_gaps' to find untested code.",
-            "Run 'triage' for a combined risk + gap + stale overview.",
+            {"tool": "risk_map", "args": {}, "reason": "Identify high-risk files"},
+            {"tool": "test_gaps", "args": {}, "reason": "Find untested code"},
+            {"tool": "triage", "args": {}, "reason": "Combined risk + gap + stale overview"},
         ]
     return []


 def _hints_update(result):
     if isinstance(result, dict) and result.get("files_updated", 0) > 0:
         return [
-            "Run 'diff_impact' to see which tests are affected by the changes.",
-            "Run 'risk_map' to check updated risk scores.",
+            {"tool": "diff_impact", "args": {}, "reason": "See which tests are affected by changes"},
+            {"tool": "risk_map", "args": {}, "reason": "Check updated risk scores"},
         ]
     return []

@@ -51,46 +58,50 @@ def _hints_risk_map(result):
         top = result[:3]
         files = [r["file_path"] for r in top]
         steps = [
-            "Run 'test_gaps' to find missing test coverage for high-risk files.",
+            {"tool": "test_gaps", "args": {}, "reason": "Find missing coverage for high-risk files"},
         ]
-        # Suggest coupling drilldown for files with high coupling scores
         high_coupling = [
             r["file_path"] for r in top
             if r.get("breakdown", {}).get("coupling", 0) > 0.3
         ]
         if high_coupling:
             steps.append(
-                f"Run 'coupling {high_coupling[0]}' to see co-change partners."
+                {"tool": "coupling", "args": {"file_path": high_coupling[0]}, "reason": "Investigate co-change partners"}
             )
-        # Suggest churn drilldown for high-churn files
         high_churn = [
             r["file_path"] for r in top
             if r.get("breakdown", {}).get("churn", 0) > 0.5
         ]
         if high_churn:
             steps.append(
-                f"Run 'churn {high_churn[0]}' for detailed change history."
+                {"tool": "churn", "args": {"file_path": high_churn[0]}, "reason": "Detailed change history"}
             )
         steps.append(
-            f"Run 'suggest_tests {files[0]}' for test recommendations on the riskiest file."
+            {"tool": "suggest_tests", "args": {"file_path": files[0]}, "reason": "Test recommendations for riskiest file"}
         )
         return steps
     if isinstance(result, list):
-        return ["Run 'analyze' to populate risk data."]
+        return [{"tool": "analyze", "args": {}, "reason": "Populate risk data"}]
     return []


 def _hints_diff_impact(result):
+    # Diagnostic dict when no changes detected
+    if isinstance(result, dict) and result.get("status") == "no_changes":
+        return [
+            {"tool": "diff_impact", "args": {"ref": "HEAD~1"}, "reason": "Try diffing against previous commit"},
+            {"tool": "update", "args": {}, "reason": "Re-analyze if working tree has new files"},
+        ]
     if isinstance(result, list) and result:
         return [
-            "Run the listed tests to verify your changes.",
-            "Use 'record_result' to log outcomes for future prioritization.",
-            "Run 'coupling' on changed files to check for hidden dependents.",
+            {"action": "run_tests", "reason": "Execute impacted tests to verify changes"},
+            {"tool": "record_result", "args": {}, "reason": "Log outcomes for future prioritization"},
+            {"tool": "coupling", "args": {}, "reason": "Check changed files for hidden dependents"},
         ]
     if isinstance(result, list):
         return [
-            "Run 'test_gaps' to check if new code needs tests.",
-            "Run 'update' if you've made changes since last analysis.",
+            {"tool": "test_gaps", "args": {}, "reason": "Check if new code needs tests"},
+            {"tool": "update", "args": {}, "reason": "Re-analyze if changes were made since last analysis"},
         ]
     return []

@@ -99,40 +110,40 @@ def _hints_test_gaps(result):
     if isinstance(result, list) and result:
         top_file = result[0]["file_path"]
         return [
-            "Write tests for the highest-churn untested units first.",
-            f"Run 'churn {top_file}' to see change frequency.",
-            f"Run 'ownership {top_file}' to find who can help write tests.",
+            {"action": "write_tests", "reason": "Prioritize highest-churn untested units"},
+            {"tool": "churn", "args": {"file_path": top_file}, "reason": "Check change frequency"},
+            {"tool": "ownership", "args": {"file_path": top_file}, "reason": "Find who can help write tests"},
         ]
     if isinstance(result, list):
-        return ["All code units have test coverage."]
+        return [{"action": "complete", "reason": "All code units have test coverage"}]
     return []


 def _hints_stale_tests(result):
     if isinstance(result, list) and result:
         return [
-            "Update or remove the stale tests listed above.",
-            "Run 'update' to re-analyze after fixing test files.",
+            {"action": "fix_tests", "reason": "Update or remove stale tests listed above"},
+            {"tool": "update", "args": {}, "reason": "Re-analyze after fixing test files"},
         ]
     if isinstance(result, list):
-        return ["All tests reference current code."]
+        return [{"action": "complete", "reason": "All tests reference current code"}]
     return []


 def _hints_impact(result):
     if isinstance(result, list) and result:
         return [
-            "Run the impacted tests to verify correctness.",
-            "Use 'record_result' to log outcomes for future prioritization.",
+            {"action": "run_tests", "reason": "Execute impacted tests to verify correctness"},
+            {"tool": "record_result", "args": {}, "reason": "Log outcomes for future prioritization"},
         ]
     return []


 def _hints_suggest_tests(result):
     if isinstance(result, list) and result:
         return [
-            "Run the suggested tests in order of relevance.",
-            "Use 'record_result' to log outcomes for future prioritization.",
+            {"action": "run_tests", "reason": "Execute suggested tests in order of relevance"},
+            {"tool": "record_result", "args": {}, "reason": "Log outcomes for future prioritization"},
         ]
     return []

@@ -142,12 +153,12 @@ def _hints_triage(result):
         steps = []
         if result["summary"].get("total_test_gaps", 0) > 0:
             steps.append(
-                "Focus on files appearing in both risk and gap sections."
+                {"action": "prioritize", "reason": "Focus on files appearing in both risk and gap sections"}
             )
         if result["top_risk_files"]:
             top = result["top_risk_files"][0]["file_path"]
-            steps.append(f"Run 'suggest_tests {top}' on the highest-risk file.")
-            steps.append(f"Run 'ownership {top}' to find who owns the riskiest code.")
+            steps.append({"tool": "suggest_tests", "args": {"file_path": top}, "reason": "Test recommendations for riskiest file"})
+            steps.append({"tool": "ownership", "args": {"file_path": top}, "reason": "Identify who owns the riskiest code"})
         return steps
     return []
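
One way to guard the new hint contract in a test is a small shape check like the one below. It is a sketch based on one reading of the module docstring in this diff (tool hints carry `tool`, `args`, and `reason`; non-tool hints carry `action` and `reason`); `is_valid_hint` is a hypothetical name and no such validator appears in the commit.

```python
def is_valid_hint(hint):
    """Check a hint against the two shapes described in the module docstring.

    Hypothetical validator for illustration; not part of Chisel.
    """
    if not isinstance(hint, dict) or not isinstance(hint.get("reason"), str):
        return False
    if "tool" in hint:
        # Tool hint: tool name plus an args dict, and no action label.
        return (
            isinstance(hint["tool"], str)
            and isinstance(hint.get("args"), dict)
            and "action" not in hint
        )
    # Non-tool guidance: an action label alongside the reason.
    return isinstance(hint.get("action"), str)


hints = [
    {"tool": "suggest_tests", "args": {"file_path": "chisel/engine.py"},
     "reason": "Test recommendations for riskiest file"},
    {"action": "prioritize",
     "reason": "Focus on files appearing in both risk and gap sections"},
    "Run 'risk_map' to identify high-risk files.",  # old prose-style hint
]
print([is_valid_hint(h) for h in hints])
```

A check of this kind would flag any hint function that still returns a prose string after the migration.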
