Commit 8ca433c

VibeWriter User and claude committed
feat: split action plan into Refactors + Human Review sections
The final report's action plan now produces two distinct lists:

1. Recommended Refactors - tiered by priority, automatable without judgment
2. Requires Human Review - features/UI/UX needing human decisions

No item limits - all recommendations included for easy copy-paste.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 302cbdf commit 8ca433c

4 files changed

Lines changed: 94 additions & 45 deletions

.claude/memory/report-generation.md

Lines changed: 2 additions & 1 deletion
@@ -32,7 +32,8 @@ Assumes CLAUDE.md loaded. Report logic in `src/report.js`, action plan logic in
 - Error, Attempts, Suggestion

 ## NightyTidy Action Plan ← Inline, only if generated (headings downgraded from consolidation.js output)
-### Critical / High / Medium / Low
+### Recommended Refactors (Critical / High / Medium / Low tiers)
+### Requires Human Review (features, UI/UX changes needing human judgment)

 ## How to Undo This Run
 - Claude Code instruction + git command
Lines changed: 47 additions & 22 deletions
@@ -1,14 +1,19 @@
 You just completed a multi-step automated codebase improvement run. Below are the outputs from each step — what was analyzed, changed, and recommended.

-Your task is to produce a **consolidated, prioritized action plan** of recommendations that still need to be done.
+Your task is to produce a **consolidated, prioritized action plan** split into two sections:
+1. **Refactors** — improvements the AI can do automatically without human judgment
+2. **Human Review** — features, UI/UX changes, and product decisions requiring human input

 ## Instructions

 1. Review each step's output to extract actionable recommendations, suggestions, and identified issues.
 2. **Check the current codebase** — read the relevant files to determine which recommendations have ALREADY been implemented by previous steps in this run.
 3. **Deduplicate** — if multiple steps flagged the same issue, consolidate into one recommendation.
-4. **Tier** the remaining (not-yet-implemented) items by importance.
-5. Output the action plan in the exact format below.
+4. **Categorize** each item:
+- **Refactors**: Code cleanup, bug fixes, security patches, performance improvements, test additions, error handling, architectural improvements — anything that has a clear "right answer" and can be implemented without product decisions.
+- **Human Review**: New features, UI/UX changes, workflow modifications, user-facing behavior changes, product strategy suggestions — anything that requires understanding user needs or making trade-offs that affect the product direction.
+5. **Prioritize** within each section by importance (Critical → High → Medium → Low).
+6. Output the action plan in the exact format below.

 ## Output Format

@@ -17,45 +22,65 @@ Your task is to produce a **consolidated, prioritized action plan** of recommend

 > Generated from a {N}-step improvement run. Items below have been verified as **not yet implemented** in the current codebase.

-## Critical
+## Recommended Refactors

-<!-- Security vulnerabilities, data loss risks, breaking bugs, blocking issues -->
+These improvements have clear implementations and can be done automatically in a future run.

-### [Short, specific title]
-- **What**: [Concrete action — reference specific files, functions, or patterns]
-- **Value**: [Why this matters — plain language, one sentence]
-- **Impact**: [Which files/modules/areas are affected]
-- **Risk**: [Low / Medium / High — risk of implementing this change, and why]
+### Critical
+<!-- Security vulnerabilities, data loss risks, breaking bugs -->
+(items or "No items at this priority level.")

-## High
+### High
+<!-- Reliability, performance, error handling, code quality gaps -->
+(items)

-<!-- Reliability, performance, error handling, significant code quality gaps -->
+### Medium
+<!-- Maintainability, test coverage, architectural improvements -->
+(items)

-(same item format)
+### Low
+<!-- Polish, style, minor optimizations -->
+(items)

-## Medium
+---

-<!-- Maintainability, test coverage gaps, refactoring opportunities, minor UX issues -->
+## Requires Human Review

-(same item format)
+These suggestions involve product decisions, user experience changes, or feature additions that need human judgment.

-## Low
+### [Short, specific title]
+- **What**: [Concrete suggestion — reference specific areas or user flows]
+- **Why**: [The problem this solves or opportunity it creates]
+- **Trade-offs**: [What considerations or decisions are involved]
+- **Effort**: [Small / Medium / Large — rough implementation scope]

-<!-- Polish, style improvements, nice-to-haves, minor optimizations -->
+(repeat for each item, ordered by potential value)

-(same item format)
+---

 ## Summary

-[One sentence on overall codebase health. One sentence on the single highest-value next action.]
+[One sentence on overall codebase health. One sentence on the top refactor priority. One sentence on the most valuable human-review item.]
 ```

+## Item Formats
+
+**For Refactors** (each item):
+- **[Short, specific title]**: [Concrete action — reference specific files, functions, or patterns]. Value: [Why this matters]. Impact: [Which areas affected]. Risk: [Low/Medium/High].
+
+**For Human Review** (each item):
+### [Short, specific title]
+- **What**: [Concrete suggestion]
+- **Why**: [Problem or opportunity]
+- **Trade-offs**: [Decisions involved]
+- **Effort**: [Small/Medium/Large]
+
 ## Rules

 - Do NOT include anything already implemented in the codebase — verify by reading files.
 - Do NOT include vague advice like "add more tests" — be specific about WHAT to test and WHERE.
 - Each recommendation MUST reference specific files, functions, or code patterns.
 - Deduplicate ruthlessly — one item per distinct issue, even if multiple steps found it.
-- Maximum **5 items per tier** (20 items total). Prioritize ruthlessly.
-- If a tier has zero items, include the heading with a note: *No items at this priority level.*
+- Include ALL items — no limits. The human needs the complete list for easy copy-paste.
+- If a section has zero items, include the heading with a note: *No items in this category.*
 - Output ONLY the markdown document. No preamble, no commentary, no code fences wrapping the whole document.
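
The tiered layout that the new prompt asks for can be pictured as a small renderer. This is only a sketch of the expected output shape, not code from this commit: `TIERS`, `renderRefactorSection`, and the `{ tier, title, action }` item shape are hypothetical names chosen for illustration.

```javascript
// Illustrative sketch only. TIERS, renderRefactorSection, and the item shape
// ({ tier, title, action }) are hypothetical and do not appear in this commit.
const TIERS = ['Critical', 'High', 'Medium', 'Low'];

function renderRefactorSection(items) {
  const lines = ['## Recommended Refactors', ''];
  for (const tier of TIERS) {
    lines.push(`### ${tier}`);
    const inTier = items.filter((item) => item.tier === tier);
    if (inTier.length === 0) {
      // Empty tiers keep their heading plus a placeholder note.
      lines.push('No items at this priority level.');
    } else {
      // No per-tier cap: every item in the tier is listed.
      for (const item of inTier) {
        lines.push(`- **${item.title}**: ${item.action}`);
      }
    }
    lines.push('');
  }
  return lines.join('\n').trimEnd();
}

// Placeholder data, not a real recommendation from any run.
console.log(
  renderRefactorSection([
    { tier: 'High', title: 'Example title', action: 'Example action' },
  ])
);
```

Keeping every tier heading even when it is empty mirrors the template in the prompt, where empty tiers carry a short note instead of being dropped.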

src/prompts/specials/report.md

Lines changed: 38 additions & 13 deletions
@@ -25,12 +25,20 @@ Instead of technical terms, describe what the change DOES for the person: "I mad

 ## Part 2: Action Plan

-Review the step outputs provided below to extract actionable recommendations that still need to be done.
+Review the step outputs provided below to extract actionable recommendations that still need to be done. Split them into two categories:
+
+1. **Recommended Refactors** — improvements with clear implementations that can be automated
+2. **Requires Human Review** — features, UI/UX changes, and product decisions needing human input
+
+### Instructions

 1. Review each step's output to extract actionable recommendations, suggestions, and identified issues.
 2. **Check the current codebase** — read the relevant files to determine which recommendations have ALREADY been implemented by previous steps in this run.
 3. **Deduplicate** — if multiple steps flagged the same issue, consolidate into one recommendation.
-4. **Tier** the remaining (not-yet-implemented) items by importance.
+4. **Categorize** each item:
+- **Refactors**: Code cleanup, bug fixes, security patches, performance improvements, test additions, error handling, architectural improvements — anything with a clear "right answer" that can be implemented without product decisions.
+- **Human Review**: New features, UI/UX changes, workflow modifications, user-facing behavior changes, product strategy suggestions — anything requiring understanding user needs or making trade-offs.
+5. **Prioritize** refactors by importance (Critical → High → Medium → Low). Order human review items by potential value.

 Structure the action plan as:

@@ -39,33 +47,50 @@ Structure the action plan as:

 > Generated from a {N}-step improvement run. Items below have been verified as **not yet implemented** in the current codebase.

-### Critical
-<!-- Security vulnerabilities, data loss risks, breaking bugs, blocking issues -->
+### Recommended Refactors
+
+These improvements have clear implementations and can be done automatically in a future run.
+
+#### Critical
+<!-- Security vulnerabilities, data loss risks, breaking bugs -->
 (items or "No items at this priority level.")

-### High
-<!-- Reliability, performance, error handling, significant code quality gaps -->
+#### High
+<!-- Reliability, performance, error handling, code quality gaps -->
 (items)

-### Medium
-<!-- Maintainability, test coverage gaps, refactoring opportunities, minor UX issues -->
+#### Medium
+<!-- Maintainability, test coverage, architectural improvements -->
 (items)

-### Low
-<!-- Polish, style improvements, nice-to-haves, minor optimizations -->
+#### Low
+<!-- Polish, style, minor optimizations -->
 (items)

+---
+
+### Requires Human Review
+
+These suggestions involve product decisions, user experience changes, or feature additions that need human judgment.
+
+(items ordered by potential value)
+
+---
+
 ### Summary
-[One sentence on overall codebase health. One sentence on the single highest-value next action.]
+[One sentence on overall codebase health. One sentence on the top refactor priority. One sentence on the most valuable human-review item.]
 ```

-Each item uses this format:
+**Refactor item format:**
 - **[Short, specific title]**: [Concrete action — reference specific files, functions, or patterns]. Value: [Why this matters]. Impact: [Which areas affected]. Risk: [Low/Medium/High].

+**Human Review item format:**
+- **[Short, specific title]**: [Concrete suggestion]. Why: [Problem or opportunity]. Trade-offs: [Decisions involved]. Effort: [Small/Medium/Large].
+
 Rules:
 - Do NOT include anything already implemented — verify by reading files
 - Be specific — reference files, functions, patterns. No vague advice like "add more tests"
-- Maximum 5 items per tier (20 total)
+- Include ALL items — no limits. The human needs the complete list for easy copy-paste
 - Deduplicate ruthlessly

 ## Part 3: Write the Report File
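
The two single-line item formats introduced in `report.md` differ only in their labelled fields. A minimal sketch of both, assuming hypothetical formatter names and field names (`formatRefactorItem`, `formatHumanReviewItem`, `tradeOffs`, and so on are illustrative and not part of this commit):

```javascript
// Illustrative sketch only. Function and field names are hypothetical;
// the commit defines these formats as prompt text, not as code.
function formatRefactorItem({ title, action, value, impact, risk }) {
  return `- **${title}**: ${action}. Value: ${value}. Impact: ${impact}. Risk: ${risk}.`;
}

function formatHumanReviewItem({ title, suggestion, why, tradeOffs, effort }) {
  return `- **${title}**: ${suggestion}. Why: ${why}. Trade-offs: ${tradeOffs}. Effort: ${effort}.`;
}

// Placeholder values, not real recommendations.
console.log(
  formatRefactorItem({
    title: 'Example refactor',
    action: 'Example action',
    value: 'Example value',
    impact: 'Example area',
    risk: 'Low',
  })
);
console.log(
  formatHumanReviewItem({
    title: 'Example suggestion',
    suggestion: 'Example change',
    why: 'Example problem',
    tradeOffs: 'Example decision',
    effort: 'Small',
  })
);
```

Refactor items carry a Risk rating because they are meant to be applied automatically, while Human Review items carry Trade-offs and Effort to support a decision.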

test/consolidation.test.js

Lines changed: 7 additions & 9 deletions
@@ -15,7 +15,7 @@ vi.mock('../src/executor.js', () => ({
 }));

 vi.mock('../src/prompts/loader.js', () => ({
-  CONSOLIDATION_PROMPT: 'Mock consolidation prompt template with Critical High Medium Low tiers and consolidated, prioritized action plan instructions.',
+  CONSOLIDATION_PROMPT: 'Mock consolidation prompt template with Recommended Refactors (Critical High Medium Low tiers) and Requires Human Review sections for consolidated, prioritized action plan instructions.',
   reloadSteps: vi.fn(),
 }));

@@ -87,10 +87,8 @@ describe('buildConsolidationPrompt', () => {
     const prompt = buildConsolidationPrompt(results);

     expect(prompt).toContain('consolidated, prioritized action plan');
-    expect(prompt).toContain('Critical');
-    expect(prompt).toContain('High');
-    expect(prompt).toContain('Medium');
-    expect(prompt).toContain('Low');
+    expect(prompt).toContain('Recommended Refactors');
+    expect(prompt).toContain('Human Review');
   });

   it('handles null/undefined output gracefully', () => {
@@ -110,30 +108,30 @@ describe('generateActionPlan', () => {
     const mockCost = { costUSD: 0.05, inputTokens: 1000, outputTokens: 500 };
     runPrompt.mockResolvedValue({
       success: true,
-      output: '# NightyTidy Action Plan\n\n## Critical\n\nNo items.',
+      output: '# NightyTidy Action Plan\n\n## Recommended Refactors\n\nNo items.',
       cost: mockCost,
     });

     const results = makeResults({ completedCount: 2, failedCount: 0 });
     const { text, cost } = await generateActionPlan(results, '/fake/project', {});

     expect(text).toContain('## NightyTidy Action Plan');
-    expect(text).toContain('### Critical');
+    expect(text).toContain('### Recommended Refactors');
     expect(text).toContain('No items.');
     expect(cost).toEqual(mockCost);
   });

   it('downgrades heading levels in returned text', async () => {
     runPrompt.mockResolvedValue({
       success: true,
-      output: '# NightyTidy Action Plan\n\n## Critical\n\n### 1. Some item',
+      output: '# NightyTidy Action Plan\n\n## Recommended Refactors\n\n### Critical',
       cost: null,
     });

     const results = makeResults({ completedCount: 1, failedCount: 0 });
     const { text } = await generateActionPlan(results, '/fake/project', {});

-    expect(text).toBe('## NightyTidy Action Plan\n\n### Recommended Refactors\n\n#### Critical');
+    expect(text).toBe('## NightyTidy Action Plan\n\n### Recommended Refactors\n\n#### Critical');
   });

   it('returns null text when Claude returns failure', async () => {
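
The "downgrades heading levels" test expects every heading in the model output to gain one `#` before the plan is inlined into the report. The implementation is not part of this diff, so the following is only a sketch of one way to get that behavior; `downgradeHeadings` is a hypothetical name, and the real code may handle edge cases such as fenced code blocks differently.

```javascript
// Illustrative sketch only. downgradeHeadings is a hypothetical helper;
// the project's actual implementation is not shown in this diff.
function downgradeHeadings(markdown) {
  // Prefix one extra '#' to each ATX heading of level 1-5 at the start of a line.
  // Level-6 headings are left alone because Markdown has no level 7.
  return markdown.replace(/^(#{1,5}) /gm, '#$1 ');
}

const modelOutput = '# NightyTidy Action Plan\n\n## Recommended Refactors\n\n### Critical';
console.log(downgradeHeadings(modelOutput));
// Prints the same document with headings at levels 2, 3, and 4.
```

A line-anchored regular expression keeps the sketch short, at the cost of also rewriting `#` lines that happen to sit inside fenced code blocks.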
