claude-code-critic/PRD.md at master · rlefko/claude-code-critic

Product Requirements Document (PRD) – Plan Mode Integration

Background & Goal: Claude Code Critic already leverages Anthropic’s Plan Mode to separate planning from execution. Plan Mode allows Claude to research the codebase and propose a plan (in a plan.md) before making any changes . The goal of this feature is to deepen the integration of Plan Mode by harnessing Explore/Plan sub-agents and enforcing quality guardrails during planning. This will ensure that the AI’s implementation plans are thorough, context-aware, and align with our project’s best practices. We want the planning phase to proactively catch architectural or design issues (before coding) and include necessary tasks (like testing and documentation) while avoiding redundant or ill-advised changes.

Key Objectives: • Rich Context via MCP: Enable Plan Mode’s sub-agents to utilize the full context of our project’s knowledge base (code index, specs, tickets) through the Model Context Protocol (MCP). This means plan sub-agents can fetch relevant external context (e.g. design docs, ticket requirements) during planning . • Quality Planning Rules: Establish explicit guardrails for plans: • Include testing and documentation tasks when appropriate (e.g. prompt to add unit tests or update docs for new features). Plans should reflect a complete implementation, not just code changes. • Ensure no duplication of existing functionality – if the requested feature or fix already exists or has similar logic elsewhere, the plan should leverage or refactor the existing code rather than duplicate it . • Enforce consistent architecture & patterns – planned changes should align with the existing system design and coding standards . The plan should not propose solutions that conflict with established architecture (unless explicitly doing a refactor). • Avoid performance anti-patterns – if a plan step could introduce a known performance issue (e.g. an O(n²) loop on a large dataset), the planner should flag or adjust it. • Plan Approval Confidence: Increase the confidence that when the user reviews a plan, it already accounts for edge cases, testing, and adherence to standards. Essentially, the plan itself should undergo a “self-review” for quality so that fewer issues slip through. As one expert noted, “Review plans critically – this is where you catch architectural mistakes” . We aim to have Claude do this critical review proactively.

User Stories: 1. As a developer using Claude Code Critic in Plan Mode, I want the AI to consider all relevant context (requirements, similar functions, dependencies) when formulating a plan, so that the plan is comprehensive and I don’t have to manually remind it of things it missed. 2. As a team lead, I want the AI’s implementation plans to include tasks for writing tests and updating documentation when needed, ensuring that code quality and completeness are built into the plan. For example, if planning a new API endpoint, the plan should include updating API docs and adding tests for it. 3. As a reviewer of the AI’s plan, I want to see that the plan respects our existing codebase structure and doesn’t duplicate or contradict existing functionalities, so I can trust the plan won’t lead to redundant code or architecture violations. The plan should explicitly mention if it found existing utilities to reuse, etc. 4. As a user, I want the planning phase to alert me to any potential issues (design flaws, performance concerns) in the proposed approach, so that we can address them before any code is written. This might be an annotated note in the plan like “Note: Using approach X might be slow for large N; consider Y if performance becomes an issue.”

Functional Requirements: • Plan Sub-Agent Context Integration: The Plan/Explore sub-agents should be able to query our code index and knowledge base via MCP. For example, during planning, Claude might use a tool (through MCP) to search design docs or prior tickets for keywords. This allows richer analysis – e.g. pulling in acceptance criteria from a linked ticket as Milcheck did . The system should permit safe, read-only access to such external context (via allowed MCP tools configured) during plan formulation. • Planning Prompt Enhancements: Augment Claude’s Plan Mode system prompt or preamble with our custom planning guidelines. We will introduce instructions that: • Emphasize adding relevant test and documentation steps in the plan (e.g. “If implementing new user-visible functionality or complex logic, include a task to update or add tests. Include documentation updates if it affects user-facing behavior or APIs.”). • Remind the agent to check for existing functions or components that could fulfill the requirement. For instance: “Before proposing a new function, search the codebase to ensure similar functionality doesn’t already exist.” This leverages the Explore sub-agent’s ability to quickly search code . • Instruct it to follow project architecture patterns (could reference a summary of patterns in CLAUDE.md) and to explicitly call out any deviation (refactor or new pattern) with justification. • Encourage performance mindfulness: e.g. “Consider the efficiency of the proposed solution; note if any step may introduce significant overhead.” • Parallel Exploration & Perspectives: The plan formulation should take advantage of Claude Code’s capability to spawn parallel subagents for research . Out-of-the-box, Plan Mode already does this: one subagent might scan authentication flows, another database schema, another “identifies test patterns” . We will ensure this multi-perspective exploration covers our added concerns too. For example, one subagent could scan for relevant tests or related modules, another could search for existing utilities to reuse. By dividing these tasks, the AI can gather broad info without running out of context window . (This leverages the recent architecture from MILESTONES.md, where Plan Mode is multi-threaded and uses Haiku-model subagents for speedy searches .) • Quality Verification Step (Plan QA): After the AI drafts a plan, the system will perform a plan quality check. This could be an internal pass where Claude (or a secondary “Plan Reviewer” agent) re-reads the plan to verify it meets the guardrails: • For each task in the plan, ensure no obvious missing step (e.g. if code changes planned in module X, is there a task to run or update module X’s tests? If not, maybe add a recommendation). • Cross-check planned new functions/features against an index of existing functions (e.g. using a search by name or relevant keyword). If a potential duplicate is found, the plan should be adjusted to use the existing code or explicitly justify why new code is needed. • This check could either be done by Claude itself (via a self-review prompt: “Now double-check the plan for duplication, testing, docs, etc., and revise if needed”) or by some automated script. The acceptance criterion is that the final plan output presented to the user mentions these considerations (e.g. “(Plan includes writing tests)”, “(No existing function does X, so implementing new helper)”). • User Interface / Feedback: The output to the user remains the plan.md text that Claude presents for approval, but now it will be more detailed. For example, a plan might have an extra section or clearly labeled tasks for “Testing” and “Documentation”. If the plan intentionally skips these (not needed), it should note why (“No changes to user-facing behavior, so no docs needed.”). The system should also highlight if it found and reused existing components (e.g. “Will reuse the existing CacheUtil instead of creating a new caching mechanism”). • No Regression of Speed and Token Use: Even with added steps, plan generation should remain efficient. Using parallel Explore agents ensures the heavy-lifting of searching doesn’t bloat the main agent’s context . The solution must leverage the latest architecture’s efficiency improvements (like Haiku for subagents, Opus for reasoning) so that plan thoroughness doesn’t significantly slow down the workflow. • Configurability: The planning guardrails should be configurable or at least well-aligned with our project’s CLAUDE.md. For instance, if certain anti-patterns or architecture guidelines are documented in CLAUDE.md, the plan agent should be aware of them. We might allow toggling certain checks (via configuration) so as not to enforce irrelevant rules on every plan. By default, however, quality checks are on.

Non-Functional Requirements: • The enhancement must integrate cleanly with existing Plan Mode behavior and not break the default sequence (research → plan → approval → execute). The user should not have to invoke any special commands; it should be a seamless improvement when they use Plan Mode. • The system should maintain safety: Plan Mode already prevents code writes until approval . Our additional context fetching via MCP should respect permissions (only read-access to allowed sources) so we don’t introduce any security risks. • Documentation: Update the project docs (e.g. the README or an internal design doc) to describe these plan improvements. Users should understand that Claude Code Critic’s plans will include tests/docs steps and avoid duplicates by design. Also, document how to configure or override this behavior if needed. • Performance and Cost: Additional searches or MCP calls will incur overhead (API tokens, possibly execution time). We should monitor that the planning phase still completes in a reasonable time (e.g. a few seconds longer is fine, but it shouldn’t hang or be minutes slower). If needed, provide a way to disable the extra checks for very quick tasks (though Plan Mode itself is usually skipped for trivial tasks per best practice ).

Success Criteria & Evaluation: • Plans generated for complex tasks should consistently include testing/documentation steps when appropriate. We can test this by prompting Claude Code Critic for a plan to implement a new feature that clearly requires tests, and verify the plan lists “Write tests for X” as a step. • The plan should demonstrate context awareness: e.g., if asked to implement a feature that is actually already implemented under a different name, see if the plan catches it. (We might simulate this with a known duplicate function in the codebase.) Success is when the plan says “Note: function Y already provides similar functionality; the plan is to reuse or adapt it instead of creating from scratch.” • In user trials, plans should come out with fewer omissions. Ideally, when a human reviewer looks at the AI’s plan, they have minimal additional pointers to add. Feedback from early adopters should indicate that “the AI’s plans now feel more like a senior engineer’s plan – they remembered to include tests and follow our patterns,” aligning with the notion that Plan Mode encourages thoughtful, senior-level planning . • No major negative impact on user experience: if users find the planning slower or too pedantic, we’ll adjust the balance (perhaps tune the thoroughness level or allow a quick mode). Initially, however, thoroughness is prioritized for non-trivial tasks.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FilesExpand file tree

PRD.md

Latest commit

History

PRD.md

File metadata and controls