Plan Mode Integration - Implementation Milestones

Implementation Milestones

Version: 1.0 Based On: PRD.md and TDD.md (Plan Mode Integration) Target: Deepen Plan Mode integration with quality guardrails and rich context Prerequisites: Phases 0-6 complete (v2.9.20+)

Executive Summary
Phase 7: Foundation - Plan Mode Detection & Hook Infrastructure
Phase 8: MCP Context Integration
Phase 9: Quality Guardrails Framework
Phase 10: Auto-Revision System
Phase 11: Prompt Augmentation & Exploration Hints
Phase 12: Plan QA Verification
Phase 13: Polish, Testing & Documentation
Phase 14: MCP Server Enhancement - Testing Foundation
Phase 15: MCP Server Code Quality & Documentation
Phase 16: MCP Server CI/CD Completion
Phase 17: MCP Server Error Handling & Security
Phase 18: MCP Server Resilience & Resource Management
Phase 19: MCP Server Structured Logging
Appendix: Rule Specifications

Executive Summary

Vision

Enhance Claude Code's Plan Mode to produce implementation plans that are thorough, context-aware, and aligned with project best practices. The system proactively catches architectural/design issues during planning and ensures plans include necessary tasks (testing, documentation) while avoiding redundant or ill-advised changes.

Key Design Decisions

Hook into Claude Code's Plan Mode - Use hooks/MCP to enhance native Plan Mode behavior
Auto-revise plans - Automatically add missing tasks before showing to user
Full MCP context - Code index, design docs, and issue tracking (Linear, GitHub)
Hybrid parallelism - Inject prompts that guide Claude Code's sub-agent exploration

Current State (Prerequisites Complete)

Component	Status	Notes
CLI Infrastructure	v2.9.20	Full claude-indexer with hooks
MCP Server	Complete	8+ tools, streaming responses
Hook Framework	Complete	SessionStart, UserPromptSubmit, PreToolUse, PostToolUse
Memory Guard v4.3	Complete	27 pattern checks
UI Consistency	Complete	15+ rules, 3-tier architecture

Success Metrics (from PRD)

Plan Completeness: >90% of plans include test/doc tasks when appropriate
Duplicate Detection: Existing code reuse suggested in >80% of applicable cases
Plan Approval Rate: <10% of plans require user revision before approval
User Confidence: Reviewers have minimal additional pointers to add
Performance: <100ms overhead for plan augmentation

Phase 7: Foundation - Plan Mode Detection & Hook Infrastructure

Goal: Establish the infrastructure for detecting Plan Mode activation and intercepting plan generation.

Milestone 7.1: Plan Mode Detection

Objective: Reliably detect when Claude Code enters Plan Mode.

Tasks

ID	Task	Priority	Status
7.1.1	Create `hooks/plan_mode_detector.py` with pattern detection	HIGH	DONE
7.1.2	Implement explicit marker detection (`@agent-plan`, `@plan`)	HIGH	DONE
7.1.3	Add planning keyword detection (confidence scoring)	MEDIUM	DONE
7.1.4	Implement environment variable detection (CLAUDE_PLAN_MODE)	MEDIUM	DONE
7.1.5	Create session state tracking for Plan Mode persistence	MEDIUM	DONE
7.1.6	Add unit tests for detection accuracy (>95% target)	HIGH	DONE

Detection Patterns:

# Explicit markers (1.0 confidence)
EXPLICIT_PATTERNS = r'@agent-plan|@plan\b|--plan\b|plan\s*mode'

# Planning keywords (0.7 confidence)
PLANNING_KEYWORDS = r'\b(create|make|write|design|implement)\s+(a\s+)?plan\b'

Testing Requirements:

Unit tests for each detection method
Test with real Claude Code Plan Mode sessions
Verify <10ms detection latency

Success Criteria:

>95% detection accuracy for Plan Mode (achieved: 100% on 30-case benchmark)
<10ms detection latency (achieved: <1ms average)
Zero false positives on non-plan prompts

Milestone 7.2: Hook Infrastructure Extension

Objective: Extend existing hook system to support Plan Mode interception.

Tasks

ID	Task	Priority	Status
7.2.1	Extend `UserPromptSubmit` hook for Plan Mode detection	HIGH	DONE
7.2.2	Create plan context injection mechanism	HIGH	DONE
7.2.3	Implement hook chaining for plan augmentation	MEDIUM	DONE
7.2.4	Add Plan Mode state to SessionContext	MEDIUM	DONE
7.2.5	Create `PlanModeContext` dataclass for state tracking	MEDIUM	DONE
7.2.6	Update `.claude/settings.json` template for Plan hooks	LOW	DONE

Implementation Details (Milestone 7.2 Complete):

Created claude_indexer/hooks/planning/ package with:
- guidelines.py - PlanningGuidelinesGenerator (<20ms)
- exploration.py - ExplorationHintsGenerator (<30ms)
- injector.py - PlanContextInjector (<50ms total)
Modified hooks/prompt_handler.py to inject guidelines and hints
Configuration via CLAUDE_PLAN_MODE_CONFIG env var or CLAUDE_PLAN_MODE_COMPACT

Hook Flow:

UserPromptSubmit Hook
       |
       +---> Plan Mode Detection
       |        |
       |        +---> (Yes) -> Inject Planning Guidelines
       |        |               |
       |        |               +---> Generate Exploration Hints
       |        |               |
       |        |               +---> Enable Plan QA Verification
       |        |
       |        +---> (No) -> Pass through unchanged
       |
       +---> Continue to Claude

Testing Requirements:

Test hook invocation order
Test state persistence across turns
Verify non-blocking behavior

Documentation:

Update HOOKS.md with Plan Mode hooks
Add configuration examples

Success Criteria:

Hooks execute in correct order
State persists correctly (via SessionContext.plan_mode)
<20ms hook overhead (guidelines <20ms, hints <30ms, total <50ms)

Phase 8: MCP Context Integration

Goal: Enable Plan Mode to query rich context via MCP - code index, design docs, and issue trackers.

Milestone 8.1: Design Document Indexing

Objective: Index and search design documents (PRD, TDD, ADR, specs).

Tasks

ID	Task	Priority	Status
8.1.1	Create new EntityTypes: SPEC, PRD, TDD, ADR, REQUIREMENT	HIGH	DONE
8.1.2	Implement `DesignDocParser` in `claude_indexer/analysis/`	HIGH	DONE
8.1.3	Add document type detection (patterns in filename/content)	HIGH	DONE
8.1.4	Implement section extraction from markdown	MEDIUM	DONE
8.1.5	Extract individual requirements as separate entities	MEDIUM	DONE
8.1.6	Create relations between docs and code components	MEDIUM	DONE
8.1.7	Add design doc paths to configuration	LOW	DONE

Entity Types:

class EntityType(Enum):
    # Existing types...
    SPEC = "spec"                    # Design specifications
    PRD = "prd"                      # Product requirements documents
    TDD = "tdd"                      # Technical design documents
    ADR = "adr"                      # Architecture decision records
    REQUIREMENT = "requirement"      # Individual requirements from specs

DesignDocParser:

class DesignDocParser(CodeParser):
    """Parser for design documents (PRD, TDD, ADR, specs)."""

    DOC_TYPE_PATTERNS = {
        "prd": [r"product\s+requirements?\s+document", r"^prd"],
        "tdd": [r"technical\s+design\s+document", r"^tdd"],
        "adr": [r"architecture\s+decision\s+record", r"^adr"],
        "spec": [r"specification", r"^spec"],
    }

    def parse(self, file_path: Path) -> ParserResult:
        # Extract sections, requirements, decisions
        ...

Configuration (in unified_config.py):

class DesignDocsConfig(BaseModel):
    enabled: bool = True
    paths: list[str] = ["docs/", "specs/", "design/", "*.md"]
    doc_patterns: dict[str, str] = {
        "prd": "**/PRD*.md",
        "tdd": "**/TDD*.md",
        "adr": "**/adr/*.md",
    }

Testing Requirements:

Test document type detection with various formats
Test section extraction from real PRD/TDD files
Test requirement entity creation
Verify relations to code components

Success Criteria:

Auto-detect document type with >90% accuracy
Extract sections and requirements correctly
Create searchable entities for all design docs

Implementation Details (Milestone 8.1 Complete):

Created claude_indexer/analysis/design_doc_parser.py with DesignDocParser
Added 5 new EntityTypes: SPEC, PRD, TDD, ADR, REQUIREMENT
Added DesignDocsConfig to claude_indexer/config/unified_config.py
Document type detection via filename patterns and content patterns
Section extraction respects configurable max_section_depth
Requirement extraction supports RFC 2119 (MUST/SHALL/SHOULD/MAY), [REQ-XXX], and numbered lists
Relations created between documents, sections, and requirements
Tests: 30 unit tests covering all functionality

Milestone 8.2: New MCP Tools for Documents

Objective: Add MCP tools for searching and retrieving design documents.

Tasks

ID	Task	Priority	Status
8.2.1	Add `search_docs` tool to MCP server	HIGH	DONE
8.2.2	Add `get_doc` tool for full document retrieval	HIGH	DONE
8.2.3	Implement docTypes filtering (prd, tdd, adr, spec)	MEDIUM	DONE
8.2.4	Add section-specific retrieval	LOW	DONE
8.2.5	Create TypeScript interfaces for doc tools	MEDIUM	DONE
8.2.6	Add validation for doc tool inputs	MEDIUM	DONE

MCP Tool: search_docs:

{
  name: "search_docs",
  description: "Search design documents, specifications, PRDs, and ADRs",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Search query" },
      docTypes: {
        type: "array",
        items: { type: "string" },
        description: "Filter: prd, tdd, spec, adr"
      },
      limit: { type: "number", default: 10 }
    },
    required: ["query"]
  }
}

MCP Tool: get_doc:

{
  name: "get_doc",
  description: "Retrieve full content of a specific design document",
  inputSchema: {
    type: "object",
    properties: {
      docId: { type: "string", description: "Document ID or file path" },
      section: { type: "string", description: "Optional: specific section" }
    },
    required: ["docId"]
  }
}

Testing Requirements:

Test search with various queries
Test filtering by document type
Test full document retrieval
Benchmark search latency (<50ms)

Success Criteria:

Both tools implemented and tested
Filter by document type works correctly
<50ms search latency

Implementation Details (Milestone 8.2 Complete):

Created DocSearchResult and DocContent interfaces in mcp-qdrant-memory/src/types.ts
Added SearchDocsRequest and GetDocRequest interfaces in validation.ts
Implemented validateSearchDocsRequest and validateGetDocRequest validators
Added searchDocs() and getDoc() methods to QdrantPersistence class
Added search_docs and get_doc tool definitions to MCP server
Supports docTypes filtering: prd, tdd, adr, spec
Section-specific retrieval via section parameter
Multi-project support via collection parameter
Returns sections sorted by line number, requirements with type classification

Milestone 8.3: Issue Tracker Integration

Objective: Enable querying tickets from Linear and GitHub Issues.

Tasks

ID	Task	Priority	Status
8.3.1	Create `claude_indexer/integrations/` package	HIGH	DONE
8.3.2	Implement `LinearClient` with GraphQL queries	HIGH	DONE
8.3.3	Implement `GitHubIssuesClient` with REST API	HIGH	DONE
8.3.4	Create `TicketEntity` data model	HIGH	DONE
8.3.5	Add `search_tickets` MCP tool	HIGH	DONE
8.3.6	Add `get_ticket` MCP tool with comments/PRs	HIGH	DONE
8.3.7	Implement authentication configuration	MEDIUM	DONE
8.3.8	Add rate limiting and caching	MEDIUM	DONE
8.3.9	Create ticket sync service (background)	LOW	DEFERRED

TicketEntity Data Model:

@dataclass
class TicketEntity:
    id: str                           # e.g., "AVO-123", "github#456"
    source: str                       # "linear", "github"
    title: str
    description: str
    status: str                       # "open", "in_progress", "done"
    assignee: str | None
    labels: list[str]
    priority: str | None
    acceptance_criteria: list[str]    # Extracted requirements
    linked_prs: list[str]             # PR references

    def to_entity(self) -> Entity:
        """Convert to standard Entity for indexing."""
        ...

LinearClient:

class LinearClient:
    BASE_URL = "https://api.linear.app/graphql"

    async def search_issues(
        self,
        query: str,
        status: list[str] | None = None,
        labels: list[str] | None = None,
        limit: int = 20
    ) -> list[TicketEntity]:
        ...

    async def get_issue(self, identifier: str) -> TicketEntity:
        ...

MCP Tool: search_tickets:

{
  name: "search_tickets",
  description: "Search issue tracker for relevant tickets (Linear, GitHub)",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string" },
      status: { type: "array", items: { type: "string" } },
      labels: { type: "array", items: { type: "string" } },
      source: { type: "string", enum: ["linear", "github", "all"] }
    },
    required: ["query"]
  }
}

Configuration:

class LinearConfig(BaseModel):
    enabled: bool = False
    api_key: str = ""  # From LINEAR_API_KEY env var
    team_id: str = ""

class GitHubIssuesConfig(BaseModel):
    enabled: bool = False
    token: str = ""  # From GITHUB_TOKEN env var
    owner: str = ""
    repo: str = ""

Testing Requirements:

Test Linear API integration (mock responses)
Test GitHub API integration (mock responses)
Test authentication handling
Test rate limiting

Documentation:

API key configuration guide
Ticket search examples

Success Criteria:

Both integrations work with API keys
Search returns relevant tickets
Acceptance criteria extracted correctly

Milestone 8.4: Plan Mode Tool Access Control

Objective: Ensure Plan Mode only has read-only access to MCP tools.

Tasks

ID	Task	Priority	Status
8.4.1	Create `PlanModeGuard` class in MCP server	HIGH	DONE
8.4.2	Define allowed tools list for Plan Mode	HIGH	DONE
8.4.3	Define blocked tools list (write operations)	HIGH	DONE
8.4.4	Integrate guard into MCP request handler	HIGH	DONE
8.4.5	Add `set_plan_mode` internal tool	MEDIUM	DONE
8.4.6	Create clear error messages for blocked tools	MEDIUM	DONE

PlanModeGuard:

const PLAN_MODE_ALLOWED = [
  // Read-only code memory
  "search_similar", "read_graph", "get_implementation",
  // Read-only documents
  "search_docs", "get_doc",
  // Read-only tickets
  "search_tickets", "get_ticket",
];

const PLAN_MODE_BLOCKED = [
  // Write operations
  "create_entities", "create_relations", "add_observations",
  "delete_entities", "delete_observations", "delete_relations",
];

class PlanModeGuard {
  private isPlanMode: boolean = false;

  setPlanMode(enabled: boolean): void { ... }

  isToolAllowed(toolName: string): boolean {
    if (!this.isPlanMode) return true;
    return !PLAN_MODE_BLOCKED.includes(toolName);
  }
}

Testing Requirements:

Test tool blocking in Plan Mode
Test allowed tools work correctly
Verify error messages are clear

Success Criteria:

Write tools blocked in Plan Mode
Clear error messages for blocked operations
No security bypass possible

Implementation Details (Milestone 8.4 Complete):

Created mcp-qdrant-memory/src/plan-mode-guard.ts with PlanModeGuard class
Environment variable detection via CLAUDE_PLAN_MODE (matches Python implementation)
Blocked tools: create_entities, create_relations, add_observations, delete_entities, delete_observations, delete_relations
Allowed tools: search_similar, read_graph, get_implementation, search_docs, get_doc, search_tickets, get_ticket, set_plan_mode
Added set_plan_mode MCP tool to enable/disable Plan Mode
Error responses include blocked tools list and hint for resolution
Version bumped to 0.6.4

Phase 9: Quality Guardrails Framework

Goal: Create the framework for validating implementation plans against quality rules.

Milestone 9.1: Core Guardrails Data Model

Objective: Define data structures for plan validation rules.

Tasks

ID	Task	Priority	Status
9.1.1	Create `claude_indexer/ui/plan/guardrails/` package	HIGH	DONE
9.1.2	Define `PlanValidationContext` dataclass	HIGH	DONE
9.1.3	Define `PlanValidationFinding` dataclass	HIGH	DONE
9.1.4	Define `PlanRevision` and `RevisionType`	HIGH	DONE
9.1.5	Create abstract `PlanValidationRule` base class	HIGH	DONE
9.1.6	Create `PlanGuardrailConfig` for rule configuration	MEDIUM	DONE

Core Data Structures:

# claude_indexer/ui/plan/guardrails/base.py

class RevisionType(Enum):
    ADD_TASK = "add_task"
    MODIFY_TASK = "modify_task"
    REMOVE_TASK = "remove_task"
    ADD_DEPENDENCY = "add_dependency"
    REORDER_TASKS = "reorder_tasks"

@dataclass
class PlanValidationContext:
    plan: ImplementationPlan
    memory_client: Any | None = None
    collection_name: str | None = None
    project_path: Path = field(default_factory=Path.cwd)
    config: PlanGuardrailConfig = None
    source_requirements: str = ""

    def search_memory(self, query: str, **kwargs) -> list[dict]:
        """Search semantic memory for similar code/patterns."""
        ...

@dataclass
class PlanValidationFinding:
    rule_id: str
    severity: Severity
    summary: str
    affected_tasks: list[str] = field(default_factory=list)
    suggestion: str | None = None
    can_auto_revise: bool = False
    confidence: float = 1.0
    evidence: list[Evidence] = field(default_factory=list)
    suggested_revision: PlanRevision | None = None

@dataclass
class PlanRevision:
    revision_type: RevisionType
    rationale: str
    target_task_id: str | None = None
    new_task: Task | None = None
    modifications: dict[str, Any] = field(default_factory=dict)
    dependency_additions: list[tuple[str, str]] = field(default_factory=list)

class PlanValidationRule(ABC):
    @property
    @abstractmethod
    def rule_id(self) -> str: ...

    @property
    @abstractmethod
    def category(self) -> str: ...  # coverage, consistency, architecture, performance

    @abstractmethod
    def validate(self, context: PlanValidationContext) -> list[PlanValidationFinding]: ...

    @abstractmethod
    def suggest_revision(self, finding: PlanValidationFinding, context: PlanValidationContext) -> PlanRevision | None: ...

Testing Requirements:

Unit tests for all data classes
Test serialization/deserialization
Test validation context memory search

Success Criteria:

All data structures defined and tested
Follows existing pattern from claude_indexer/rules/base.py
Memory search integration works

Implementation Details (Milestone 9.1 Complete):

Created claude_indexer/ui/plan/guardrails/ package with base.py, config.py, init.py
RevisionType enum with ADD_TASK, MODIFY_TASK, REMOVE_TASK, ADD_DEPENDENCY, REORDER_TASKS
PlanRevision dataclass with serialization support
PlanValidationFinding dataclass with evidence and suggested revision
PlanValidationContext with plan, config, memory search integration
PlanValidationRule ABC following rules/base.py pattern
PlanGuardrailConfig Pydantic model with category toggles, auto-revise settings
Added plan_guardrails to UnifiedConfig
Tests: 52 unit tests covering all functionality

Milestone 9.2: Plan Guardrail Engine

Objective: Create the engine that orchestrates rule validation.

Tasks

ID	Task	Priority	Status
9.2.1	Create `PlanGuardrailEngine` coordinator class	HIGH	DONE
9.2.2	Implement rule discovery (auto-load from directory)	HIGH	DONE
9.2.3	Add parallel rule execution support	MEDIUM	DONE
9.2.4	Implement severity filtering	MEDIUM	DONE
9.2.5	Create `PlanGuardrailResult` aggregation	MEDIUM	DONE
9.2.6	Add performance timing for rules	LOW	DONE

PlanGuardrailEngine:

class PlanGuardrailEngine:
    def __init__(self, config: PlanGuardrailConfig):
        self.config = config
        self.rules: dict[str, PlanValidationRule] = {}
        self._discover_rules()

    def _discover_rules(self) -> None:
        """Auto-discover rules from guardrails/rules/ directory."""
        ...

    def validate(self, context: PlanValidationContext) -> list[PlanValidationFinding]:
        """Run all enabled rules against the plan."""
        findings = []
        for rule_id, rule in self.rules.items():
            if self.config.is_rule_enabled(rule_id):
                findings.extend(rule.validate(context))
        return findings

    def auto_revise(
        self,
        plan: ImplementationPlan,
        findings: list[PlanValidationFinding]
    ) -> RevisedPlan:
        """Apply auto-revisions based on findings."""
        ...

Testing Requirements:

Test rule discovery
Test parallel execution (Milestone 13.5)
Test severity filtering
Benchmark validation latency

Success Criteria:

Rules auto-discovered from directory
Parallel execution reduces latency (Milestone 13.5)
<500ms total validation time

Implementation Details (Milestone 9.2 Complete):

Created claude_indexer/ui/plan/guardrails/engine.py with:
- PlanGuardrailEngineConfig dataclass (timeout, error handling, confidence threshold)
- RuleExecutionResult dataclass for individual rule results
- PlanGuardrailResult dataclass with findings, statistics, error tracking
- PlanGuardrailEngine class with full rule lifecycle management
Key features:
- register() / unregister() for manual rule registration
- discover_rules() for auto-discovery from directory
- validate() runs all enabled rules with configurable filtering
- validate_fast() runs only fast rules (<100ms) for sync checks
- validate_category() runs rules in specific category
- Confidence threshold filtering
- Max findings per rule limiting
- Error handling with continue_on_error option
- Performance timing recorded per rule and total
Tests: 42 unit tests covering all functionality

Parallel Execution (Milestone 13.5):

Added parallel_execution flag to PlanGuardrailEngineConfig (default: False)
Added max_parallel_workers config option (default: 4)
Implemented _validate_parallel() using ThreadPoolExecutor
Extracted _validate_sequential() for the original behavior
validate() method now conditionally uses parallel or sequential execution
Tests: 8 additional tests for parallel execution behavior

Milestone 9.3: Plan Validation Rules (5 Rules)

Objective: Implement the core quality validation rules.

Tasks

ID	Task	Priority	Rule Name	Category	Status
9.3.1	Test Requirement Detection	HIGH	`PLAN.TEST_REQUIREMENT`	coverage	DONE
9.3.2	Documentation Requirement Detection	HIGH	`PLAN.DOC_REQUIREMENT`	coverage	DONE
9.3.3	Duplicate Code Detection	HIGH	`PLAN.DUPLICATE_DETECTION`	consistency	DONE
9.3.4	Architectural Consistency Check	MEDIUM	`PLAN.ARCHITECTURAL_CONSISTENCY`	architecture	DONE
9.3.5	Performance Pattern Detection	MEDIUM	`PLAN.PERFORMANCE_PATTERN`	performance	DONE

Implementation Location: claude_indexer/ui/plan/guardrails/rules/

Rule 1: TestRequirementRule

class TestRequirementRule(PlanValidationRule):
    """Ensures new features have corresponding test tasks."""

    FEATURE_KEYWORDS = ["implement", "add", "create", "build", "develop", "new"]
    TEST_KEYWORDS = ["test", "spec", "unittest", "pytest", "jest"]
    TRIVIAL_PATTERNS = [r"^(fix|rename|move|delete)\s+(typo|comment|readme)"]

    @property
    def rule_id(self) -> str:
        return "PLAN.TEST_REQUIREMENT"

    def validate(self, context: PlanValidationContext) -> list[PlanValidationFinding]:
        findings = []
        for task in context.plan.all_tasks:
            if self._is_feature_task(task) and not self._has_test_task(task, context.plan):
                if not self._is_trivial(task):
                    findings.append(self._create_finding(task))
        return findings

    def suggest_revision(self, finding, context) -> PlanRevision:
        """Auto-add test task."""
        feature_task = self._get_task(finding.affected_tasks[0], context.plan)
        test_task = Task(
            id=f"TASK-TST-{feature_task.id[-4:]}",
            title=f"Add tests for {feature_task.title}",
            description=f"Write tests for {feature_task.title}",
            scope=feature_task.scope,
            priority=feature_task.priority + 1,
            estimated_effort="low",
            impact=feature_task.impact * 0.8,
            acceptance_criteria=["Unit tests cover main functionality", "Tests pass in CI"],
            dependencies=[feature_task.id],
            tags=["testing", "quality"],
        )
        return PlanRevision(
            revision_type=RevisionType.ADD_TASK,
            rationale=f"Feature '{feature_task.title}' needs test coverage",
            new_task=test_task,
        )

Rule 2: DuplicateDetectionRule

class DuplicateDetectionRule(PlanValidationRule):
    """Detects tasks that might duplicate existing functionality."""

    SIMILARITY_THRESHOLD = 0.70

    @property
    def rule_id(self) -> str:
        return "PLAN.DUPLICATE_DETECTION"

    def validate(self, context: PlanValidationContext) -> list[PlanValidationFinding]:
        findings = []
        if context.memory_client is None:
            return findings

        for task in context.plan.all_tasks:
            query = f"{task.title} {task.description[:200]}"
            results = context.search_memory(
                query=query,
                entity_types=["function", "class", "implementation_pattern"]
            )

            for result in results:
                if result.get("score", 0) >= self.SIMILARITY_THRESHOLD:
                    findings.append(self._create_finding(task, result))

        return findings

    def suggest_revision(self, finding, context) -> PlanRevision:
        """Modify task to reference existing code."""
        return PlanRevision(
            revision_type=RevisionType.MODIFY_TASK,
            target_task_id=finding.affected_tasks[0],
            rationale="Potential duplicate detected",
            modifications={
                "description": f"{task.description}\n\n**Note:** Review existing implementation before proceeding.",
                "acceptance_criteria": task.acceptance_criteria + [
                    f"Verified no duplication with existing code"
                ],
            }
        )

Testing Requirements:

Unit tests for each rule with positive/negative cases
Test auto-revision generation
Test with real implementation plans
Measure false positive rate (<10% target)

Documentation:

Rule reference documentation
Configuration examples
Override mechanisms

Success Criteria:

All 5 rules implemented and tested
<10% false positive rate
Clear, actionable findings

Implementation Details (Milestone 9.3 Complete):

Created claude_indexer/ui/plan/guardrails/rules/ package with 5 rules
TestRequirementRule: Detects feature tasks without test coverage, auto-suggests test tasks
DocRequirementRule: Detects user-facing changes without documentation tasks
DuplicateDetectionRule: Uses semantic memory search to find potential duplicate code
ArchitecturalConsistencyRule: Validates file paths against project patterns
PerformancePatternRule: Detects N+1 queries, missing caching, blocking operations, etc.
All rules follow PlanValidationRule ABC pattern
Updated guardrails/__init__.py with rule exports
Tests: 159 unit tests covering all 5 rules with positive/negative/edge cases

Phase 10: Auto-Revision System

Goal: Automatically apply revisions to plans based on validation findings.

Milestone 10.1: Auto-Revision Engine

Objective: Create the engine that applies plan revisions automatically.

Tasks

ID	Task	Priority	Status
10.1.1	Create `AutoRevisionEngine` class	HIGH	DONE
10.1.2	Implement revision sorting by severity	HIGH	DONE
10.1.3	Implement conflict detection	HIGH	DONE
10.1.4	Add circular dependency prevention	HIGH	DONE
10.1.5	Implement revision application methods	HIGH	DONE
10.1.6	Create `RevisedPlan` result dataclass	MEDIUM	DONE
10.1.7	Add post-revision dependency resolution	MEDIUM	DONE
10.1.8	Add iteration limit (prevent infinite loops)	HIGH	DONE

AutoRevisionEngine:

@dataclass
class AppliedRevision:
    revision: PlanRevision
    finding: PlanValidationFinding
    success: bool
    error: str | None = None

@dataclass
class RevisedPlan:
    original_plan: ImplementationPlan
    revised_plan: ImplementationPlan
    revisions_applied: list[AppliedRevision] = field(default_factory=list)
    revisions_skipped: list[tuple[PlanRevision, str]] = field(default_factory=list)
    qa_passed: bool = False

class AutoRevisionEngine:
    MAX_ITERATIONS = 3

    def revise_plan(
        self,
        plan: ImplementationPlan,
        findings: list[PlanValidationFinding],
        rules: dict[str, PlanValidationRule]
    ) -> RevisedPlan:
        """
        Apply revisions based on findings.

        Algorithm:
        1. Sort findings by severity (CRITICAL > HIGH > MEDIUM > LOW)
        2. For each finding that can be auto-revised:
           a. Get revision suggestion from rule
           b. Check for conflicts
           c. Apply revision if safe
        3. Re-resolve dependencies
        4. Update priorities
        """
        ...

    def _check_conflicts(self, plan: ImplementationPlan, revision: PlanRevision) -> str | None:
        """Check if revision would create conflicts."""
        if revision.revision_type == RevisionType.ADD_TASK:
            if any(t.id == revision.new_task.id for t in plan.all_tasks):
                return f"Task ID '{revision.new_task.id}' already exists"

        if revision.revision_type == RevisionType.ADD_DEPENDENCY:
            for from_id, to_id in revision.dependency_additions:
                if self._would_create_cycle(plan, from_id, to_id):
                    return f"Would create circular dependency"

        return None

    def _would_create_cycle(self, plan, from_id, to_id) -> bool:
        """DFS to detect circular dependencies."""
        ...

Testing Requirements:

Test revision application for each type
Test conflict detection
Test circular dependency prevention
Test iteration limit enforcement

Success Criteria:

Revisions applied without conflicts
No circular dependencies introduced
Iteration limit prevents infinite loops

Implementation Details (Milestone 10.1 Complete):

Created claude_indexer/ui/plan/guardrails/auto_revision.py with:
- AppliedRevision dataclass for tracking applied revisions
- RevisedPlan dataclass with audit trail formatting
- AutoRevisionEngine class with full revision lifecycle
Key features:
- Revision sorting by severity (CRITICAL > HIGH > MEDIUM > LOW) and type order
- Conflict detection for all RevisionTypes (ADD_TASK, MODIFY_TASK, REMOVE_TASK, ADD_DEPENDENCY, REORDER_TASKS)
- Circular dependency detection using DFS algorithm
- Post-revision dependency resolution (removes orphaned dependencies)
- MAX_ITERATIONS = 3 to prevent infinite loops
- Configurable via PlanGuardrailConfig (auto_revise, max_revisions_per_plan, revision_confidence_threshold)
- Human-readable audit trail via format_audit_trail()
Tests: 55 unit tests covering all functionality

Milestone 10.2: Revision Audit Trail

Objective: Track all revisions for transparency and debugging.

Tasks

ID	Task	Priority	Status
10.2.1	Add revision history to ImplementationPlan	MEDIUM	DONE
10.2.2	Create human-readable revision summary	MEDIUM	DONE
10.2.3	Add revision rollback capability	LOW	DONE
10.2.4	Implement revision persistence	LOW	DONE

Audit Trail Format:

## Plan Revisions Applied

### 1. Added Test Task (PLAN.TEST_REQUIREMENT)
- **Reason**: Feature 'Add user authentication' needs test coverage
- **Added**: TASK-TST-0001 "Add tests for user authentication"
- **Confidence**: 95%

### 2. Modified Task (PLAN.DUPLICATE_DETECTION)
- **Reason**: Potential duplicate of existing 'AuthService.login()'
- **Modified**: TASK-0002 description to reference existing code
- **Confidence**: 78%

Testing Requirements:

Test audit trail generation
Test revision summary formatting
Verify all revisions tracked

Success Criteria:

Complete audit trail for all revisions
Human-readable summaries
Transparency for user review

Implementation Details (Milestone 10.2 Complete):

Added revision_history: list[AppliedRevision] field to ImplementationPlan
Added format_revision_history() method for human-readable markdown output
Created claude_indexer/ui/plan/guardrails/revision_history.py with:
- PlanSnapshot dataclass for versioned plan state snapshots
- RevisionHistoryManager for snapshot creation, versioning, and rollback
- PlanPersistence for JSON file save/load of plans and history
Full serialization support with backward compatibility for old plans
Exports added to guardrails/__init__.py and plan/__init__.py
Tests: 41 unit tests covering all functionality

Phase 11: Prompt Augmentation & Exploration Hints

Goal: Inject planning guidelines and exploration hints into Claude's context.

Milestone 11.1: Planning Guidelines Generator

Objective: Generate contextual planning guidelines for injection.

Tasks

ID	Task	Priority	Status
11.1.1	Create `hooks/planning_guidelines.py`	HIGH	DONE
11.1.2	Define guidelines template with placeholders	HIGH	DONE
11.1.3	Load project patterns from CLAUDE.md	MEDIUM	DONE
11.1.4	Generate collection-specific MCP commands	MEDIUM	DONE
11.1.5	Add configuration for guideline customization	LOW	DONE

Planning Guidelines Template:

PLANNING_GUIDELINES_TEMPLATE = """
=== PLANNING QUALITY GUIDELINES ===

When formulating this implementation plan, follow these guidelines:

## 1. Code Reuse Check (CRITICAL)
Before proposing ANY new function, class, or component:
- Search the codebase: `{mcp_prefix}search_similar("functionality")`
- Check existing patterns: `{mcp_prefix}read_graph(entity="Component", mode="relationships")`
- If similar exists, plan to REUSE or EXTEND it
- State explicitly: "Verified no existing implementation" or "Will extend existing Y"

## 2. Testing Requirements
Every plan that modifies code MUST include:
- [ ] Unit tests for new/modified functions
- [ ] Integration tests for API changes
- Task format: "Add tests for [feature] in [test_file]"

## 3. Documentation Requirements
Include documentation tasks when:
- Adding public APIs -> Update API docs
- Changing user-facing behavior -> Update README
- Adding configuration -> Update config docs

## 4. Architecture Alignment
Your plan MUST align with project patterns:
{project_patterns}

## 5. Performance Considerations
Flag any step that may introduce:
- O(n^2) or worse complexity
- Unbounded memory usage
- Missing timeouts on network calls

== END PLANNING GUIDELINES ==
"""

def generate_planning_guidelines(
    collection_name: str,
    project_patterns: str = "",
    exploration_hints: list[str] = None
) -> str:
    """Generate planning guidelines with project context."""
    mcp_prefix = f"mcp__{collection_name}-memory__"
    ...

Testing Requirements:

Test template rendering
Test project pattern loading
Test MCP prefix generation

Success Criteria:

Guidelines correctly formatted
Project-specific patterns included
MCP commands use correct collection

Implementation Details (Milestone 11.1 Complete):

Implemented as claude_indexer/hooks/planning/guidelines.py
PlanningGuidelinesConfig with all section toggles
PlanningGuidelines output with full_text, sections, mcp_commands
PlanningGuidelinesGenerator with 5 template sections
CLAUDE.md pattern loading from project root or .claude/
Tests: 18 unit tests in test_guidelines.py

Milestone 11.2: Exploration Hints Generator

Objective: Generate hints to guide Claude's sub-agent parallel exploration.

Tasks

ID	Task	Priority	Status
11.2.1	Create `hooks/exploration_hints.py`	HIGH	DONE
11.2.2	Implement entity extraction from prompts	HIGH	DONE
11.2.3	Generate duplicate-check hints	HIGH	DONE
11.2.4	Generate test-discovery hints	MEDIUM	DONE
11.2.5	Generate documentation hints	MEDIUM	DONE
11.2.6	Generate architecture hints	LOW	DONE

Exploration Hints Generator:

def generate_exploration_hints(prompt: str, collection_name: str) -> list[str]:
    """Generate exploration hints for parallel sub-agents."""
    mcp_prefix = f"mcp__{collection_name}-memory__"
    entities = extract_entities(prompt)

    hints = [
        # Duplicate Check
        f"## Duplicate Check\n{mcp_prefix}search_similar('{entities[0]}', entityTypes=['function', 'class'])",

        # Test Discovery
        f"## Test Discovery\n{mcp_prefix}search_similar('test', entityTypes=['file'])",

        # Documentation
        f"## Documentation\n{mcp_prefix}search_similar('docs', entityTypes=['documentation'])",
    ]

    # Entity-specific hints
    for entity in entities[:3]:
        hints.append(
            f"## {entity} Analysis\n{mcp_prefix}read_graph(entity='{entity}', mode='smart')"
        )

    return hints

def extract_entities(prompt: str) -> list[str]:
    """Extract likely code entities from prompt."""
    patterns = [
        r'\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b',  # CamelCase
        r'\b[a-z]+(?:_[a-z]+)+\b',            # snake_case
        r'["\']([^"\']+)["\']',               # Quoted terms
    ]
    ...

Testing Requirements:

Test entity extraction accuracy
Test hint generation with various prompts
Verify MCP commands are valid

Success Criteria:

Entities extracted with >80% accuracy
Hints guide toward quality checks
MCP commands are executable

Implementation Details (Milestone 11.2 Complete):

Implemented as claude_indexer/hooks/planning/exploration.py
ExplorationHintsConfig with section toggles and max_entity_hints
ExplorationHints output with hints, extracted_entities, mcp_commands
ExplorationHintsGenerator with entity extraction patterns
Supports CamelCase, snake_case, quoted terms, technical terms
Tests: 21 unit tests in test_exploration.py

Milestone 11.3: Prompt Handler Integration

Objective: Integrate guidelines and hints into UserPromptSubmit hook.

Tasks

ID	Task	Priority	Status
11.3.1	Modify `hooks/prompt_handler.py` for Plan Mode	HIGH	DONE
11.3.2	Implement guidelines injection	HIGH	DONE
11.3.3	Implement hints injection	HIGH	DONE
11.3.4	Add configuration toggle	MEDIUM	DONE
11.3.5	Measure injection latency (<50ms target)	MEDIUM	DONE

Testing Requirements:

Test full hook flow with Plan Mode
Test injection timing
Verify guidelines appear in context

Success Criteria:

Guidelines injected for Plan Mode
<50ms injection latency
Claude follows guidelines

Implementation Details (Milestone 11.3 Complete):

hooks/prompt_handler.py - Main hook with Plan Mode detection
claude_indexer/hooks/planning/injector.py - Coordinates injection
PlanContextInjectionConfig with all toggles and compact mode
PlanContextInjector assembles guidelines + hints
inject_plan_context() convenience function
Tests: 22 unit tests in test_injector.py

Phase 12: Plan QA Verification

Goal: Verify generated plans meet quality standards before user approval.

Milestone 12.1: Plan QA Verifier

Objective: Post-generation verification of plan quality.

Tasks

ID	Task	Priority	Status
12.1.1	Create `claude_indexer/hooks/plan_qa.py`	HIGH	DONE
12.1.2	Implement missing test detection	HIGH	DONE
12.1.3	Implement missing doc detection	HIGH	DONE
12.1.4	Implement duplicate check verification	HIGH	DONE
12.1.5	Create `PlanQAResult` dataclass	MEDIUM	DONE
12.1.6	Generate human-readable feedback	MEDIUM	DONE
12.1.7	Integrate with plan output	MEDIUM	DONE

PlanQAVerifier:

@dataclass
class PlanQAResult:
    is_valid: bool = True
    missing_tests: list[str] = field(default_factory=list)
    missing_docs: list[str] = field(default_factory=list)
    potential_duplicates: list[str] = field(default_factory=list)
    architecture_warnings: list[str] = field(default_factory=list)
    suggestions: list[str] = field(default_factory=list)

    def has_issues(self) -> bool:
        return bool(self.missing_tests or self.missing_docs or
                    self.potential_duplicates or self.architecture_warnings)

    def format_feedback(self) -> str:
        """Format feedback for plan output."""
        if not self.has_issues():
            return "\n[Plan QA: All quality checks passed]"

        lines = ["\n=== Plan QA Feedback ==="]
        if self.missing_tests:
            lines.append("\n[WARN] Missing Test Coverage:")
            for item in self.missing_tests:
                lines.append(f"  - {item}")
        ...
        return "\n".join(lines)

class PlanQAVerifier:
    CODE_CHANGE_PATTERNS = re.compile(
        r'(add|create|implement|modify)\s+(?:a\s+)?(function|class|component)',
        re.IGNORECASE
    )

    def verify_plan(self, plan_text: str) -> PlanQAResult:
        """Verify a plan meets quality standards."""
        result = PlanQAResult()

        if self._needs_tests(plan_text) and not self._has_test_tasks(plan_text):
            result.missing_tests.append("Plan modifies code but includes no test tasks")
            result.suggestions.append("Add unit/integration test task")

        if self._is_user_facing(plan_text) and not self._has_doc_tasks(plan_text):
            result.missing_docs.append("User-facing changes without doc update")

        if self._creates_new_code(plan_text) and not self._mentions_reuse_check(plan_text):
            result.potential_duplicates.append("New code without explicit duplicate check")

        return result

Testing Requirements:

Test with plans missing tests
Test with plans missing docs
Test with complete plans
Verify feedback formatting

Success Criteria:

Detects missing test/doc tasks
Generates actionable feedback
<50ms verification latency

Implementation Details (Milestone 12.1 Complete):

Created claude_indexer/hooks/plan_qa.py with:
- PlanQAConfig dataclass with check toggles and strict mode settings
- PlanQAResult dataclass with has_issues(), format_feedback(), to_dict()
- PlanQAVerifier class with pattern-based detection for:
  - CODE_CHANGE_PATTERNS (test coverage check)
  - TEST_TASK_PATTERNS (test task detection)
  - DOC_TASK_PATTERNS (doc task detection)
  - USER_FACING_PATTERNS (user-facing change detection)
  - REUSE_CHECK_PATTERNS (duplicate verification)
  - ARCHITECTURE_CONCERN_PATTERNS (performance anti-patterns)
- verify_plan_qa() convenience function
Tests: 50+ unit tests in test_plan_qa.py covering all scenarios

Milestone 12.2: QA Integration Points

Objective: Integrate QA verification into the planning workflow.

Tasks

ID	Task	Priority	Status
12.2.1	Add QA check after guardrail validation	HIGH	DONE
12.2.2	Append QA feedback to plan output	HIGH	DONE
12.2.3	Track QA pass/fail metrics	LOW	DONE
12.2.4	Add QA override configuration	LOW	DONE

Testing Requirements:

Test end-to-end QA flow
Test feedback appears in plan
Test override configuration

Success Criteria:

QA feedback visible to user
Metrics tracked for analysis (Milestone 13.5)
Override available for edge cases

Implementation Details (Milestone 12.2 Complete):

Updated claude_indexer/hooks/planning/injector.py with:
- Added qa_enabled and qa_config to PlanContextInjectionConfig
- Added verify_plan_output() method to PlanContextInjector
- QA configuration supports JSON serialization
Updated claude_indexer/hooks/__init__.py to export Plan QA classes
PlanQAConfig provides override toggles:
- enabled: Master toggle for QA
- check_tests/check_docs/check_duplicates/check_architecture: Individual checks
- fail_on_missing_tests/fail_on_missing_docs: Strict mode settings

QA Metrics Tracking (Milestone 13.5):

Added QA fields to MetricSnapshot in claude_indexer/ui/metrics/models.py:
- qa_checks_passed, qa_issues_found, qa_missing_tests, qa_missing_docs
- qa_potential_duplicates, qa_architecture_warnings, qa_verification_time_ms
Added record_qa_verification() method to MetricsCollector
Added get_qa_metrics_summary() for aggregated QA metrics
Updated PlanQAVerifier with optional metrics_collector parameter
Tests: 15 unit tests in test_qa_metrics.py

Phase 13: Polish, Testing & Documentation

Goal: Ensure production readiness with comprehensive testing and documentation.

Milestone 13.1: Integration Testing

Objective: End-to-end testing of Plan Mode integration.

Tasks

ID	Task	Priority	Status
13.1.1	Create `tests/integration/test_plan_mode.py`	HIGH	DONE
13.1.2	Test full Plan Mode flow (detect -> augment -> validate -> revise)	HIGH	DONE
13.1.3	Test MCP tool integration	HIGH	DONE
13.1.4	Test issue tracker integration	MEDIUM	DONE
13.1.5	Test design doc indexing	MEDIUM	DONE
13.1.6	Create mock Claude Code Plan Mode responses	MEDIUM	DONE

Implementation Notes (Milestone 13.1 Complete):

Created comprehensive integration test suite with 47 tests
Test classes: TestPlanModeDetectionIntegration, TestContextInjectionIntegration, TestGuardrailValidationIntegration, TestAutoRevisionIntegration, TestFullPlanModeFlow, TestDesignDocIndexingIntegration, TestPlanQAVerification
Coverage: Plan Mode detection, context injection, guardrail validation, auto-revision, full pipeline flow, design doc parsing, QA verification
Performance: All tests complete in <0.2 seconds
All lint checks pass (black, isort, flake8, ruff)

Integration Test Scenarios:

class TestPlanModeIntegration:
    def test_full_plan_mode_flow(self):
        """Test complete Plan Mode integration."""
        # 1. Detect Plan Mode
        # 2. Inject guidelines
        # 3. Validate plan
        # 4. Apply auto-revisions
        # 5. Run QA check
        # 6. Verify output
        ...

    def test_plan_with_missing_tests(self):
        """Verify test tasks auto-added."""
        ...

    def test_duplicate_detection(self):
        """Verify duplicate warning generated."""
        ...

    def test_doc_requirement(self):
        """Verify doc tasks auto-added for user-facing changes."""
        ...

Testing Requirements:

>80% code coverage for new components
All scenarios covered
Performance benchmarks included

Success Criteria:

All integration tests pass (47 tests)
>80% coverage
Performance within targets (<0.2s for all tests)

Milestone 13.2: Performance Optimization ✅ DONE

Objective: Meet all performance targets.

Tasks

ID	Task	Priority	Status
13.2.1	Profile Plan Mode detection latency	HIGH	DONE
13.2.2	Profile guidelines injection latency	HIGH	DONE
13.2.3	Profile validation latency	HIGH	DONE
13.2.4	Optimize hot paths	MEDIUM	DONE
13.2.5	Add caching where beneficial	MEDIUM	DONE
13.2.6	Create performance benchmarks	MEDIUM	DONE

Performance Targets:

Operation	Target	Budget
Plan Mode Detection	<10ms	Pattern matching
Guidelines Generation	<20ms	Template substitution
Exploration Hints	<30ms	Entity extraction
Plan Validation	<500ms	5 rules
Auto-Revision	<200ms	Conflict checking
Plan QA	<50ms	Pattern matching
Total Overhead	<100ms	(excluding validation)

Testing Requirements:

Benchmark all operations
Test under load
Verify targets met

Implementation Details (Milestone 13.2 Complete):

Replaced time.time() with time.perf_counter() across all Plan Mode files for sub-millisecond precision
Added LRU caching for CLAUDE.md project patterns loading (mtime-based invalidation)
Added LRU caching for entity extraction in exploration hints (128-entry cache)
Created comprehensive benchmark suite: tests/benchmarks/test_plan_mode_performance.py

Files Modified:

claude_indexer/hooks/plan_mode_detector.py - Timing precision
claude_indexer/hooks/planning/guidelines.py - Timing + LRU cache
claude_indexer/hooks/planning/exploration.py - Timing + entity cache
claude_indexer/hooks/planning/injector.py - Timing precision
claude_indexer/hooks/plan_qa.py - Timing precision
claude_indexer/ui/plan/guardrails/engine.py - Timing precision
claude_indexer/ui/plan/guardrails/auto_revision.py - Timing precision

New Files:

tests/benchmarks/test_plan_mode_performance.py - 7 benchmark test classes

Benchmark Tests:

TestPlanModeDetectionPerformance - <10ms p95 target
TestGuidelinesGenerationPerformance - <20ms p95 target
TestExplorationHintsPerformance - <30ms p95 target
TestPlanQAPerformance - <50ms p95 target
TestGuardrailValidationPerformance - <500ms target
TestAutoRevisionPerformance - <200ms target
TestEndToEndPlanModePerformance - <100ms overhead target
TestMemoryUsage - <50MB peak usage
TestScalabilityMetrics - Linear scaling verification

Success Criteria:

All performance targets met
No regression from baseline
Clear metrics dashboard

Milestone 13.3: Documentation

Objective: Comprehensive documentation for Plan Mode integration.

Tasks

ID	Task	Priority	Status
13.3.1	Update README.md with Plan Mode features	HIGH	DONE
13.3.2	Create `docs/PLAN_MODE.md` comprehensive guide	HIGH	DONE
13.3.3	Update HOOKS.md with Plan hooks	HIGH	DONE
13.3.4	Document all configuration options	MEDIUM	DONE
13.3.5	Create plan guardrail rule reference	MEDIUM	DONE
13.3.6	Add troubleshooting section	MEDIUM	DONE
13.3.7	Update CLAUDE.md with Plan guidelines	LOW	DONE

Documentation Structure:

docs/
├── PLAN_MODE.md          # Comprehensive Plan Mode guide
│   ├── Overview
│   ├── Configuration
│   ├── Quality Guardrails
│   ├── Auto-Revision Behavior
│   ├── MCP Context Integration
│   └── Troubleshooting
├── HOOKS.md              # Updated with Plan hooks
└── memory-functions.md   # Updated with new MCP tools

Implementation Details (Milestone 13.3 Complete):

Created docs/PLAN_MODE.md with ~700 lines covering:
- Quick start and activation methods
- Detection methods (4 signals) with confidence scoring
- Context injection (guidelines + exploration hints)
- All 5 guardrail rules with examples and auto-fix behavior
- Auto-revision system and audit trail
- Complete configuration reference
- Troubleshooting section
- Performance metrics
Overhauled README.md (reduced from 781 to ~400 lines):
- Problem/Solution led structure
- Feature highlights for all 4 major capabilities
- Simplified setup with one-command path
- Clean documentation links
Updated CLAUDE.md with Plan Mode usage section
Updated docs/HOOKS.md already had Plan Mode sections (Milestones 7.2, 8.4)

Testing Requirements:

All documentation accurate
Examples tested and working
No broken links

Success Criteria:

New user understands Plan Mode in <10 minutes
All features documented
Examples are copy-paste ready

Milestone 13.4: User Experience Validation ✅ DONE

Objective: Validate the "magical" UX described in PRD.

Tasks

ID	Task	Priority	Status
13.4.1	Conduct user testing sessions	HIGH	DEFERRED - Manual Process
13.4.2	Collect feedback on plan quality	HIGH	DONE
13.4.3	Measure plan approval rate	MEDIUM	DONE
13.4.4	Iterate on findings format	MEDIUM	DONE
13.4.5	Add configuration for thoroughness level	LOW	DONE

Implementation Details (Milestone 13.4 Complete):

Extended PlanAdoptionRecord with feedback fields: approved, approved_at, rejection_reason, accuracy_rating, user_notes, revision_count
Added approval metrics to MetricsReport: approval_rate, pending_approval_count, average_accuracy_rating, average_revision_count, rejection_reasons_summary()
Added feedback recording methods to MetricsCollector: record_plan_approval(), record_plan_revision(), get_approval_rate_history(), get_quality_metrics_summary()
Created claude_indexer/ui/plan/formatters.py with:
- ThoroughnessLevel enum (minimal, standard, thorough, exhaustive)
- format_plan_findings_for_display() with thoroughness-aware output
- format_plan_findings_for_claude() for Claude consumption
- SEVERITY_ICONS and CATEGORY_NAMES mappings
Added thoroughness_level and group_findings_by_severity to PlanGuardrailConfig
Added format_for_display() and format_for_claude() methods to PlanGuardrailResult
Tests: 54 unit tests covering all functionality

Success Metrics Validation:

>90% of plans include test/doc tasks when appropriate (Phase 12 Plan QA)
<10% of plans require user revision (pending user testing)
Existing code reuse suggested in >80% of applicable cases (Phase 12 duplicate check)
Users report plans "feel like senior engineer's work" (pending user testing)

Success Criteria:

All code implementation complete
User feedback collection enabled (infrastructure ready)
UX validated through testing (pending task 13.4.1)

Milestone 13.5: Deferred Items Completion ✅ DONE

Objective: Complete deferred items from Phases 9 and 12.

Tasks

ID	Task	Priority	Status
13.5.1	Implement parallel rule execution (9.2.3)	MEDIUM	DONE
13.5.2	Implement QA metrics tracking (12.2.3)	LOW	DONE
13.5.3	Update Ticket Sync Service notes	LOW	DONE

Implementation Details (Milestone 13.5 Complete):

Parallel Rule Execution (9.2.3):

Updated PlanGuardrailEngineConfig with max_parallel_workers (default: 4)
Added _validate_parallel() method using concurrent.futures.ThreadPoolExecutor
Added _validate_sequential() method (refactored from original code)
Modified validate() to conditionally use parallel or sequential execution
Parallel mode executes rules concurrently with configurable worker count
Error handling preserved in parallel mode (continue_on_error behavior)
Tests: 8 tests for parallel execution behavior

QA Metrics Tracking (12.2.3):

Extended MetricSnapshot with 7 QA-specific fields:
- qa_checks_passed, qa_issues_found, qa_missing_tests
- qa_missing_docs, qa_potential_duplicates, qa_architecture_warnings
- qa_verification_time_ms
Added record_qa_verification() to MetricsCollector:
- Creates MetricSnapshot from PlanQAResult
- Calculates checks passed/failed counts
- Records as tier 2 (design-time) snapshot
Added get_qa_metrics_summary() to MetricsCollector:
- Returns pass rate, average issues, issue breakdown
- Aggregates across all QA verification snapshots
Updated PlanQAVerifier with optional metrics_collector parameter:
- Records metrics automatically after verification
- Accepts optional plan_id for correlation
Backward compatible: old snapshots without QA fields load with defaults
Tests: 15 tests for QA metrics functionality

Ticket Sync Service (8.3.9) - Deferred to Phase 14:

Remains deferred due to complexity (new scheduler, Qdrant schema, sync state)
Recommended for dedicated Phase 14: Background Services
Current on-demand ticket fetching via MCP tools remains functional

Files Modified:

claude_indexer/ui/plan/guardrails/engine.py - Parallel execution
claude_indexer/ui/metrics/models.py - QA fields in MetricSnapshot
claude_indexer/ui/metrics/collector.py - QA metrics methods
claude_indexer/hooks/plan_qa.py - Metrics integration

Test Coverage:

tests/unit/ui/plan/guardrails/test_engine.py - 8 parallel execution tests
tests/unit/ui/metrics/test_qa_metrics.py - 15 QA metrics tests

Phase 14: MCP Server Enhancement - Testing Foundation

Goal: Establish comprehensive testing infrastructure for the mcp-qdrant-memory MCP server using Vitest.

PRD Reference: mcp-qdrant-memory/docs/PRD.md - Phase 14: MCP Server Enhancement

Milestone 14.1: Testing Foundation ✅ DONE

Objective: Set up Vitest testing framework with 60%+ baseline coverage, targeting 90-95% on core modules.

Tasks

ID	Task	Priority	Status
14.1.1	Add Vitest devDependencies to package.json	HIGH	DONE
14.1.2	Add test scripts to package.json	HIGH	DONE
14.1.3	Create vitest.config.ts with coverage thresholds	HIGH	DONE
14.1.4	Create test fixtures (entities, relations)	MEDIUM	DONE
14.1.5	Create planModeGuard.test.ts	HIGH	DONE
14.1.6	Create tokenCounter.test.ts	HIGH	DONE
14.1.7	Create validation.test.ts	HIGH	DONE
14.1.8	Create bm25Service.test.ts	HIGH	DONE
14.1.9	Fix BM25 vitest import compatibility	HIGH	DONE

Implementation Details:

Configuration Files:
- package.json - Added test scripts (test, test:watch, test:coverage, test:ui) and devDependencies (@vitest/coverage-v8, @vitest/ui)
- vitest.config.ts - Vitest configuration with v8 coverage provider, HTML/LCOV reporters
Test Files Created (207 tests total):
- src/__tests__/planModeGuard.test.ts - 38 tests, 100% coverage
- src/__tests__/tokenCounter.test.ts - 34 tests, 96.57% coverage
- src/__tests__/validation.test.ts - 74 tests, 83.28% coverage
- src/__tests__/bm25Service.test.ts - 47 tests, 95.87% coverage
- src/__tests__/fixtures/ - Test data for entities and relations
Bug Fix: Fixed BM25 library import compatibility issue where vitest SSR transforms imports differently than Node.js ESM, causing BM25.default.default is not a function error. Solution handles both import behaviors.

Coverage Results:

Module	Statements	Branches	Functions
plan-mode-guard.ts	100%	100%	100%
tokenCounter.ts	96.57%	90.69%	100%
validation.ts	83.28%	78.14%	100%
bm25Service.ts	95.87%	87.71%	100%

Success Criteria:

207 tests passing
>90% coverage on core modules (plan-mode-guard, tokenCounter, bm25Service)
>80% coverage on validation module
Build passes with no TypeScript errors
Test execution <1 second

Milestone 14.2: Integration Testing ✅ DONE

Objective: Add integration tests for MCP server functionality.

Tasks

ID	Task	Priority	Status
14.2.1	Create MCP tool integration tests	HIGH	DONE
14.2.2	Create Qdrant persistence tests	HIGH	DONE
14.2.3	Add mock Qdrant client for isolated testing	MEDIUM	DONE
14.2.4	Test hybrid search (semantic + BM25)	MEDIUM	DONE

Implementation Details

New Files Created:

src/__tests__/mocks/qdrantClient.mock.ts - Mock Qdrant client with in-memory storage
src/__tests__/mocks/openaiClient.mock.ts - Mock OpenAI embeddings with deterministic generation
src/__tests__/mocks/index.ts - Mock infrastructure exports
src/__tests__/integration/qdrant.integration.test.ts - 45 tests for QdrantPersistence
src/__tests__/integration/mcp-tools.integration.test.ts - 50 tests for MCP tool validation
src/__tests__/integration/hybrid-search.integration.test.ts - 30 tests for BM25/hybrid search

Test Coverage:

Total tests: 362 (207 unit + 155 integration)
QdrantPersistence: Connection, Entity CRUD, Relation CRUD, Search, Scroll, Cache, Error handling, Multi-collection
MCP Tools: Write tool validation, Read tool validation, Plan Mode access control, Collection parameter support
Hybrid Search: BM25 keyword search, RRF fusion algorithm, Result processing, Unicode/special characters

Acceptance Criteria ✅

Mock infrastructure supports isolated testing without external dependencies
All MCP tools have validation tests
QdrantPersistence CRUD operations tested
Hybrid search (semantic + BM25) fusion tested
TypeScript build passes with no errors
All 362 tests pass

Milestone 14.3: CI/CD Integration ✅ DONE

Objective: Integrate testing into CI/CD pipeline.

Tasks

ID	Task	Priority	Status
14.3.1	Add GitHub Actions workflow for MCP tests	HIGH	DONE
14.3.2	Configure coverage thresholds in CI	MEDIUM	DONE
14.3.3	Add test status badge to README	LOW	DONE

Implementation Details

New Files Created:

mcp-qdrant-memory/.github/workflows/ci.yml - GitHub Actions CI workflow

Workflow Configuration:

Triggers: Push and PR to main/master branches
Concurrency: Cancel in-progress runs on same branch
Jobs:
- build: TypeScript compilation with artifact upload
- typecheck: Strict type validation (tsc --noEmit)
- test: Tests with coverage (Node.js 18, 20, 22 matrix)
- security: npm audit for vulnerability scanning
Caching: npm dependencies cached for faster runs
Artifacts: Coverage reports uploaded (Node 20)

README Updates:

Added CI badge linking to GitHub Actions workflow

Acceptance Criteria ✅

CI workflow runs on push/PR to main/master
Build job compiles TypeScript successfully
Type check job validates types
Test job runs on Node.js 18, 20, 22 matrix (362 tests)
Coverage reports uploaded as artifacts
Security audit job runs
CI badge visible in README

Phase 15: MCP Server Code Quality & Documentation

Goal: Add ESLint, Prettier, pre-commit hooks, and governance documentation to mcp-qdrant-memory.

PRD Reference: mcp-qdrant-memory/docs/PRD.md - Section 4.3: Code Quality Tooling, Section 4.4: Documentation

Milestone 15.1: Code Quality Tooling ✅ DONE

Objective: Establish linting and formatting infrastructure for the MCP server.

Tasks

ID	Task	Priority	Status
15.1.1	Install ESLint, Prettier, Husky, lint-staged dependencies	HIGH	DONE
15.1.2	Create ESLint configuration (`eslint.config.mjs`) with TypeScript support	HIGH	DONE
15.1.3	Create Prettier configuration (`.prettierrc`, `.prettierignore`)	HIGH	DONE
15.1.4	Update `package.json` with lint/format scripts	HIGH	DONE
15.1.5	Initialize Husky and configure pre-commit hook	MEDIUM	DONE
15.1.6	Update CI workflow with lint job	HIGH	DONE

Implementation Details:

ESLint v9 flat config with typescript-eslint/recommendedTypeChecked
Relaxed rules for existing codebase (warnings for any-related rules)
Prettier with double quotes, semicolons, 100 char width
Pre-commit hook runs lint-staged (ESLint + Prettier on staged files)
CI lint job runs npm run lint and npm run format:check

New npm Scripts:

{
  "lint": "eslint src/",
  "lint:fix": "eslint src/ --fix",
  "format": "prettier --write .",
  "format:check": "prettier --check .",
  "typecheck": "tsc --noEmit"
}

Acceptance Criteria ✅

ESLint configured and passing (0 errors, warnings allowed)
Prettier configured and passing
Pre-commit hooks functional
CI lint job passing

Milestone 15.2: Governance Documentation ✅ DONE

Objective: Add contributor documentation and project governance files.

Tasks

ID	Task	Priority	Status
15.2.1	Create `CONTRIBUTING.md` with development guidelines	HIGH	DONE
15.2.2	Create `CHANGELOG.md` with version history	HIGH	DONE
15.2.3	Create `LICENSE` file (MIT)	HIGH	DONE

Implementation Details:

CONTRIBUTING.md includes development setup, code style, testing, PR process
CHANGELOG.md follows Keep a Changelog format
LICENSE file matches package.json MIT declaration

Acceptance Criteria ✅

CONTRIBUTING.md present with development guidelines
CHANGELOG.md present with version history
LICENSE file present

Phase 15 Summary

Files Created:

mcp-qdrant-memory/eslint.config.mjs - ESLint configuration
mcp-qdrant-memory/.prettierrc - Prettier configuration
mcp-qdrant-memory/.prettierignore - Prettier ignore patterns
mcp-qdrant-memory/.husky/pre-commit - Pre-commit hook
mcp-qdrant-memory/CONTRIBUTING.md - Contributor guide
mcp-qdrant-memory/CHANGELOG.md - Version history
mcp-qdrant-memory/LICENSE - MIT license

Files Modified:

mcp-qdrant-memory/package.json - Added lint/format scripts, lint-staged config
mcp-qdrant-memory/.github/workflows/ci.yml - Added lint job

CI/CD:

Lint job added to GitHub Actions workflow
Runs ESLint and Prettier checks on all PRs
0 errors required, warnings allowed for gradual cleanup

Phase 16: MCP Server CI/CD Completion

Goal: Complete CI/CD infrastructure for the mcp-qdrant-memory MCP server.

Status: ✅ Complete

Milestone 16.1: Release Workflow ✅ DONE

Objective: Automated npm publishing on version tags.

Files Created:

mcp-qdrant-memory/.github/workflows/release.yml - Release workflow

Features:

Triggers on version tags (v*..)
Builds and tests before publishing
npm publish with provenance for supply chain security
Auto-generates GitHub releases with release notes

Milestone 16.2: Dependabot Configuration ✅ DONE

Objective: Automated dependency management.

Files Created:

mcp-qdrant-memory/.github/dependabot.yml - Dependabot configuration

Features:

Weekly npm dependency updates (Monday)
Dev dependencies grouped together
PR limit of 10 to avoid noise
Consistent commit message prefix (chore(deps))

Phase 16 Summary

MCP Server Progress:

Phase 1 (Testing): 100% complete - 362 tests
Phase 2 (CI/CD): 100% complete - CI, Release, Dependabot
Phase 3 (Code Quality): 100% complete - ESLint, Prettier, Husky, docs
Phase 4 (Error Handling): 0% - Not started
Phase 5 (Security): 0% - Not started
Phase 6 (Coverage): 100% complete - Integration tests

Overall MCP Server Progress: 65%

Phase 17: MCP Server Error Handling & Security

Goal: Complete Phase 4 (Error Handling) and Phase 5 (Security) for the mcp-qdrant-memory MCP server.

Status: ✅ Complete

PRD Reference: mcp-qdrant-memory/docs/PRD.md - Sections 4.5, 4.7, 4.8

Milestone 17.1: Security - Scoped Fetch Override ✅ DONE

Objective: Fix critical security issue where Qdrant API key was leaked to all fetch requests.

File Modified:

mcp-qdrant-memory/src/fetch-override.ts

Change: Previously, the fetch override added the Qdrant API key to ALL fetch requests globally, including third-party APIs like Voyage AI, OpenAI, Linear, and GitHub. This leaked the API key to external services.

Fix: Scoped API key injection to only Qdrant URLs using url.startsWith(qdrantUrl) check.

Tests Created:

mcp-qdrant-memory/src/__tests__/fetch-override.test.ts (18 tests)

Acceptance Criteria ✅

Qdrant API key only added to Qdrant URL requests
Voyage AI requests do NOT contain Qdrant API key
OpenAI/Linear/GitHub requests do NOT contain Qdrant API key
All 18 security tests passing

Milestone 17.2: Security - Input Size Validation ✅ DONE

Objective: Add input size limits to prevent DoS attacks via oversized payloads.

File Modified:

mcp-qdrant-memory/src/validation.ts

Constants Added:

export const INPUT_LIMITS = {
  QUERY_MAX_LENGTH: 10000,
  ENTITIES_MAX_COUNT: 1000,
  ENTITY_NAME_MAX_LENGTH: 500,
  OBSERVATIONS_MAX_COUNT: 100,
  OBSERVATION_MAX_LENGTH: 50000,
  RELATIONS_MAX_COUNT: 1000,
  ENTITY_NAMES_MAX_COUNT: 1000,
} as const;

Validators Updated:

validateSearchSimilarRequest - query length
validateSearchDocsRequest - query length
validateCreateEntitiesRequest - entity count, name length, observations count/length
validateAddObservationsRequest - entity name, contents count/length
validateCreateRelationsRequest - relations count
validateDeleteEntitiesRequest - entity names count

Tests Extended:

mcp-qdrant-memory/src/__tests__/validation.test.ts (+22 tests)

Acceptance Criteria ✅

All validators enforce size limits
Clear error messages indicate limit exceeded
All 22 new validation tests passing

Milestone 17.3: Edge Case - BM25 Unicode Support ✅ DONE

Objective: Fix Unicode stripping in BM25 text processing.

File Modified:

mcp-qdrant-memory/src/bm25/bm25Service.ts

Change: Replaced ASCII-only regex with Unicode-aware regex.

// Before (stripped Unicode)
.replace(/[^\w\s]/g, ' ')

// After (preserves Unicode letters and numbers)
.replace(/[^\p{L}\p{N}\s]/gu, ' ')

Tests Extended:

mcp-qdrant-memory/src/__tests__/bm25Service.test.ts (+7 Unicode tests)

Languages Tested:

Chinese (用户认证)
Japanese (ユーザー認証)
Korean (사용자 인증)
Cyrillic (аутентификации)
Arabic (مصادقة)
Accented Latin (autenticación)

Acceptance Criteria ✅

Unicode characters preserved in BM25 indexing
All 7 Unicode tests passing
Mixed ASCII/Unicode content searchable

Milestone 17.4: Error Handling - Context Preservation ✅ DONE

Objective: Preserve error context when wrapping errors in McpError.

File Modified:

mcp-qdrant-memory/src/index.ts

Change: Added error cause to McpError for stack trace preservation.

} catch (error) {
  const mcpError = new McpError(
    ErrorCode.InternalError,
    error instanceof Error ? error.message : String(error)
  );
  // Preserve original error for debugging
  (mcpError as any).cause = error;
  throw mcpError;
}

Acceptance Criteria ✅

Original error attached as cause
Stack traces preserved for debugging
All existing tests still passing

Phase 17 Summary

Files Modified:

mcp-qdrant-memory/src/fetch-override.ts - Scoped API key injection
mcp-qdrant-memory/src/validation.ts - Added INPUT_LIMITS and size validation
mcp-qdrant-memory/src/bm25/bm25Service.ts - Unicode-aware regex
mcp-qdrant-memory/src/index.ts - Error context preservation

Files Created:

mcp-qdrant-memory/src/__tests__/fetch-override.test.ts - Security tests

Tests Added:

18 fetch-override security tests
22 input validation size limit tests
7 BM25 Unicode support tests
Total: 47 new tests

MCP Server Progress Update:

Phase 1 (Testing): 100% complete - 362 tests → 407 tests
Phase 2 (CI/CD): 100% complete
Phase 3 (Code Quality): 100% complete
Phase 4 (Error Handling): 100% complete ✅ NEW
Phase 5 (Security): 100% complete ✅ NEW
Phase 6 (Coverage): 100% complete

Overall MCP Server Progress: 100% ✅

Phase 18: MCP Server Resilience & Resource Management

Objective: Complete remaining P1/P2 items from the MCP server PRD (sections 4.5 and 4.6).

Status: Complete ✅

Milestone 18.1: Graceful Shutdown

Objective: Handle process termination signals properly.

Tasks

ID	Task	Priority	Status
18.1.1	Create `src/shutdown.ts` with signal handler infrastructure	HIGH	DONE
18.1.2	Implement SIGTERM handler with 10-second grace period	HIGH	DONE
18.1.3	Implement SIGINT handler (same logic as SIGTERM)	HIGH	DONE
18.1.4	Add cleanup actions: cancel pending requests, flush logs	HIGH	DONE
18.1.5	Integrate shutdown manager into `src/index.ts`	HIGH	DONE
18.1.6	Create `src/__tests__/shutdown.test.ts` with tests	HIGH	DONE

Implementation Details:

Created ShutdownManager class with:
- Signal handler registration (SIGTERM/SIGINT)
- Cleanup callback registration and execution
- Pending request tracking via AbortController
- Configurable grace period (default 10s)
- Global singleton access via getShutdownManager()
21 tests covering all shutdown functionality

Success Criteria:

Graceful shutdown handles SIGTERM/SIGINT properly
Cleanup callbacks executed in order
Pending requests cancelled on shutdown

Milestone 18.2: External API Timeouts

Objective: Add configurable timeouts for HTTP calls to external APIs.

Tasks

ID	Task	Priority	Status
18.2.1	Create `src/http-client.ts` with `fetchWithTimeout()` utility	HIGH	DONE
18.2.2	Update Voyage AI embedding calls (30s timeout)	HIGH	DONE
18.2.3	Update Linear API calls (10s timeout)	HIGH	DONE
18.2.4	Update GitHub API calls (10s timeout)	HIGH	DONE
18.2.5	Create `src/__tests__/http-client.test.ts` with tests	HIGH	DONE

Implementation Details:

Created fetchWithTimeout() utility with:
- AbortController-based timeout mechanism
- TimeoutError and ShutdownAbortError classes
- Integration with ShutdownManager for request tracking
- DEFAULT_TIMEOUTS constants for each API
- Environment variable support for custom timeouts
Updated all external API calls in qdrant.ts:
- Voyage AI: 30s timeout
- Linear API: 10s timeout (2 call sites)
- GitHub API: 10s timeout (4 call sites)
23 tests covering timeout behavior

Success Criteria:

External API calls have configurable timeouts
TimeoutError thrown on timeout
Requests properly cancelled and cleaned up

Milestone 18.3: BM25 Service LRU Cleanup

Objective: Implement LRU cleanup for per-collection BM25 services.

Tasks

ID	Task	Priority	Status
18.3.1	Add `lastAccessed` timestamp to BM25 service tracking	MEDIUM	DONE
18.3.2	Implement `cleanupStaleServices()` method	MEDIUM	DONE
18.3.3	Add periodic cleanup timer (every 5 minutes)	MEDIUM	DONE
18.3.4	Add max service count limit (10 collections)	MEDIUM	DONE
18.3.5	Add cleanup on graceful shutdown	MEDIUM	DONE
18.3.6	Add tests for LRU cleanup behavior	MEDIUM	DONE

Implementation Details:

Added BM25ServiceEntry interface with lastAccessed timestamp
Configuration constants:
- BM25_MAX_SERVICES = 10 (max cached services)
- BM25_TTL_MS = 30 minutes (stale threshold)
- BM25_CLEANUP_INTERVAL_MS = 5 minutes
LRU cleanup algorithm:
1. Remove entries not accessed within TTL
2. If count > max, evict oldest by lastAccessed
Public methods:
- getBM25ServiceCount() for monitoring
- stopBM25Cleanup() for graceful shutdown
4 integration tests for BM25 service management

Success Criteria:

BM25 services cleaned up based on LRU policy
Max service count enforced
Cleanup stops on shutdown

Phase 18 Summary

Files Created:

mcp-qdrant-memory/src/shutdown.ts - Graceful shutdown manager
mcp-qdrant-memory/src/http-client.ts - HTTP client with timeout support
mcp-qdrant-memory/src/__tests__/shutdown.test.ts - Shutdown tests
mcp-qdrant-memory/src/__tests__/http-client.test.ts - HTTP client tests

Files Modified:

mcp-qdrant-memory/src/index.ts - Shutdown integration, cleanup methods
mcp-qdrant-memory/src/persistence/qdrant.ts - BM25 LRU cleanup, fetchWithTimeout usage
mcp-qdrant-memory/src/__tests__/integration/qdrant.integration.test.ts - BM25 tests

Tests Added:

21 shutdown tests
23 http-client tests
4 BM25 management tests
Total: 48 new tests

PRD Coverage:

4.5.2 Timeout Configuration - Complete
4.6.1 Graceful Shutdown - Complete
4.6.2 BM25 Service Cleanup - Complete

Deferred to Future Phase:

4.5.1 Result Type Pattern - Requires significant refactoring
4.6.3 Connection Lifecycle Management - Health checks, reconnection logic

Phase 19: MCP Server Structured Logging

Objective: Implement structured logging throughout the MCP server codebase, replacing all console.error/console.log calls with a consistent, configurable logging system.

Status: Complete ✅

Milestone 19.1: Logger Implementation

Objective: Create a structured logging system with module-specific loggers.

Tasks

ID	Task	Priority	Status
19.1.1	Create `src/logger.ts` with structured logger implementation	HIGH	DONE
19.1.2	Implement 4 log levels (debug, info, warn, error)	HIGH	DONE
19.1.3	Add JSON output format (LOG_FORMAT=json)	HIGH	DONE
19.1.4	Add human-readable output format (default)	MEDIUM	DONE
19.1.5	Implement LOG_LEVEL environment variable filtering	MEDIUM	DONE
19.1.6	Create pre-configured module loggers	MEDIUM	DONE
19.1.7	Add child logger support with context inheritance	LOW	DONE
19.1.8	Create `src/__tests__/logger.test.ts` with tests	HIGH	DONE

Implementation Details:

Created Logger class with:
- 4 log levels: debug, info, warn, error
- JSON structured output (LOG_FORMAT=json)
- Human-readable output (default)
- Log level filtering via LOG_LEVEL env var
- Child logger support with context inheritance
- Error serialization with stack traces
Pre-configured module loggers:
- logger (general MCP server)
- qdrantLogger (database operations)
- bm25Logger (keyword search)
- validationLogger (input validation)
- planModeLogger (plan mode access control)
- shutdownLogger (shutdown lifecycle)
- configLogger (configuration loading)
- ignoreFilterLogger (claudeignore filtering)
29 tests covering all logger functionality

Success Criteria:

Structured logging with consistent format
Configurable log level filtering
Module-specific loggers for better debugging

Milestone 19.2: Logger Integration

Objective: Replace all console.error/console.log calls with structured logging.

Tasks

ID	Task	Priority	Status
19.2.1	Integrate logger into `src/config.ts`	HIGH	DONE
19.2.2	Integrate logger into `src/validation.ts`	HIGH	DONE
19.2.3	Integrate logger into `src/bm25/bm25Service.ts`	HIGH	DONE
19.2.4	Integrate logger into `src/plan-mode-guard.ts`	MEDIUM	DONE
19.2.5	Integrate logger into `src/shutdown.ts`	HIGH	DONE
19.2.6	Integrate logger into `src/claudeignore/filter.ts`	MEDIUM	DONE
19.2.7	Integrate logger into `src/index.ts`	HIGH	DONE
19.2.8	Integrate logger into `src/persistence/qdrant.ts`	HIGH	DONE
19.2.9	Update `src/__tests__/shutdown.test.ts` for new format	MEDIUM	DONE

Implementation Details:

Replaced 60+ console.error/console.log calls across 8 production files
Each module uses its designated logger (e.g., qdrantLogger for database operations)
Debug-level logging for verbose debugging (e.g., relation processing, batch progress)
Info-level logging for normal operations (e.g., server startup, BM25 initialization)
Warn-level logging for warnings (e.g., ignore file not found, safety breaks)
Error-level logging for errors with stack traces

Success Criteria:

All console.error/console.log calls replaced
All 503 tests passing
TypeScript clean (no errors)
Prettier formatted

Phase 19 Summary

Files Created:

mcp-qdrant-memory/src/logger.ts - Structured logger implementation (198 lines)
mcp-qdrant-memory/src/__tests__/logger.test.ts - Logger test suite (453 lines)

Files Modified:

mcp-qdrant-memory/src/config.ts - Environment variable error logging
mcp-qdrant-memory/src/validation.ts - Debug-level validation logging
mcp-qdrant-memory/src/bm25/bm25Service.ts - BM25 operation logging
mcp-qdrant-memory/src/plan-mode-guard.ts - Plan mode state change logging
mcp-qdrant-memory/src/shutdown.ts - Shutdown lifecycle logging
mcp-qdrant-memory/src/claudeignore/filter.ts - Ignore pattern warning logging
mcp-qdrant-memory/src/index.ts - Server lifecycle and auto-reduce logging
mcp-qdrant-memory/src/persistence/qdrant.ts - Database operation logging
mcp-qdrant-memory/src/__tests__/shutdown.test.ts - Updated for new log format

Tests Added:

29 logger tests
Total MCP Server Tests: 503

PRD Coverage:

4.5.3 Structured Logging - Complete ✅

Appendix: Plan Guardrail Rule Specifications

A.1 Coverage Rules (2)

#	Rule	Severity	Detection	Auto-Fix
1	`PLAN.TEST_REQUIREMENT`	MEDIUM	Feature task without test dependency	Add test task
2	`PLAN.DOC_REQUIREMENT`	LOW	User-facing change without doc task	Add doc task

A.2 Consistency Rules (1)

#	Rule	Severity	Detection	Auto-Fix
1	`PLAN.DUPLICATE_DETECTION`	HIGH	Semantic similarity >70% with existing code	Modify task to reference existing

A.3 Architecture Rules (1)

#	Rule	Severity	Detection	Auto-Fix
1	`PLAN.ARCHITECTURAL_CONSISTENCY`	MEDIUM	File paths outside established patterns	Add warning to task

A.4 Performance Rules (1)

#	Rule	Severity	Detection	Auto-Fix
1	`PLAN.PERFORMANCE_PATTERN`	LOW	Known anti-patterns (N+1, no caching)	Add performance note

Implementation Order Summary

Critical Path

Phase 7: Plan Mode Detection & Hook Infrastructure (foundation)
Phase 9: Quality Guardrails Framework (core value)
Phase 10: Auto-Revision System (key differentiator)
Phase 11: Prompt Augmentation (guidance injection)

High Priority

Phase 8: MCP Context Integration (rich context)
Phase 12: Plan QA Verification (quality assurance)

Lower Priority

Phase 13: Polish, Testing & Documentation (production readiness)

Dependencies Graph

Phase 7 (Detection/Hooks)
       |
       +----------------------+
       v                      v
Phase 9 (Guardrails)    Phase 8 (MCP Context)
       |                      |
       v                      |
Phase 10 (Auto-Revision) <----+
       |
       v
Phase 11 (Prompt Augmentation)
       |
       v
Phase 12 (Plan QA)
       |
       v
Phase 13 (Polish/Testing/Docs)

Risk Mitigation

Risk	Impact	Mitigation
False positives annoying	HIGH	Configurable thresholds, easy overrides
Performance too slow	MEDIUM	Parallel validation, caching
Auto-revision conflicts	HIGH	Conflict detection, iteration limits
Issue tracker API failures	MEDIUM	Graceful degradation, caching
Plan Mode detection fails	HIGH	Multiple detection methods, fallbacks

Success Criteria Summary

Generated from PRD.md and TDD.md (Plan Mode Integration)

FilesExpand file tree

MILESTONES.md

Latest commit

History

MILESTONES.md

File metadata and controls

Plan Mode Integration - Implementation Milestones

Implementation Milestones

Table of Contents

Executive Summary

Vision

Key Design Decisions

Current State (Prerequisites Complete)

Success Metrics (from PRD)

Phase 7: Foundation - Plan Mode Detection & Hook Infrastructure

Milestone 7.1: Plan Mode Detection

Tasks

Milestone 7.2: Hook Infrastructure Extension

Tasks

Phase 8: MCP Context Integration

Milestone 8.1: Design Document Indexing

Tasks

Milestone 8.2: New MCP Tools for Documents

Tasks

Milestone 8.3: Issue Tracker Integration

Tasks

Milestone 8.4: Plan Mode Tool Access Control

Tasks

Phase 9: Quality Guardrails Framework

Milestone 9.1: Core Guardrails Data Model

Tasks

Milestone 9.2: Plan Guardrail Engine

Tasks

Milestone 9.3: Plan Validation Rules (5 Rules)

Tasks

Rule 1: TestRequirementRule

Rule 2: DuplicateDetectionRule

Phase 10: Auto-Revision System

Milestone 10.1: Auto-Revision Engine

Tasks

Milestone 10.2: Revision Audit Trail

Tasks

Phase 11: Prompt Augmentation & Exploration Hints

Milestone 11.1: Planning Guidelines Generator

Tasks

Milestone 11.2: Exploration Hints Generator

Tasks

Milestone 11.3: Prompt Handler Integration

Tasks

Phase 12: Plan QA Verification

Milestone 12.1: Plan QA Verifier

Tasks

Milestone 12.2: QA Integration Points

Tasks

Phase 13: Polish, Testing & Documentation

Milestone 13.1: Integration Testing

Tasks

Milestone 13.2: Performance Optimization ✅ DONE

Tasks

Milestone 13.3: Documentation

Tasks

Milestone 13.4: User Experience Validation ✅ DONE

Tasks

Milestone 13.5: Deferred Items Completion ✅ DONE

Tasks

Phase 14: MCP Server Enhancement - Testing Foundation

Milestone 14.1: Testing Foundation ✅ DONE

Tasks

Milestone 14.2: Integration Testing ✅ DONE

Tasks

Implementation Details

Acceptance Criteria ✅

Milestone 14.3: CI/CD Integration ✅ DONE

Tasks

Implementation Details

Acceptance Criteria ✅

Phase 15: MCP Server Code Quality & Documentation

Milestone 15.1: Code Quality Tooling ✅ DONE

Tasks

Acceptance Criteria ✅