Code Review Report

Metadata

Project: git-notes-memory
Review Date: 2025-12-20
Reviewer: Claude Code Review Agent (6 Specialist Subagents)
Scope: Full codebase - src/, hooks/, tests/
Branch: feature/hook-enhancement-v2

Executive Summary

Overall Health Score: 7.5/10

Dimension	Score	Critical	High	Medium	Low
Security	9/10	0	0	0	1
Performance	6/10	2	3	3	4
Architecture	7/10	0	0	7	10
Code Quality	7/10	0	1	5	17
Test Coverage	6/10	4	12	10	0
Documentation	8/10	2	5	6	0

Key Findings

CRITICAL: N+1 query pattern in IndexService.search_vector() causes latency to scale linearly with results
CRITICAL: Novelty check can trigger 2-5 second embedding model load during hook execution, violating <50ms target
HIGH: Major test coverage gaps in hooks subsystem (7 source files with no tests)
HIGH: Missing v0.3.1 entry in CHANGELOG.md
HIGH: Bare exception handlers in hook entry points swallow errors silently

Recommended Action Plan

Immediate (before next deploy): Fix N+1 query, add model-loaded check in novelty checker
This Sprint: Add tests for signal_detector, capture_decider, context_builder; update CHANGELOG
Next Sprint: Cache project detection, simplify session start memory count, add Protocol interfaces
Backlog: DRY up handler boilerplate, define named constants for magic numbers

Critical Findings (2)

PERF-001: N+1 Query Pattern in Vector Search

Location: src/git_notes_memory/index.py:1006-1022
Category: Performance
Severity: CRITICAL

Description: Each vector match triggers a separate self.get(memory_id) query inside a loop, causing latency to scale O(n) with result count.

Impact: Vector search with 10 results executes 11 queries instead of 1. At 50ms per query, this adds 500ms latency.

Evidence:

for row in cursor.fetchall():
    memory_id = row[0]
    distance = row[1]
    memory = self.get(memory_id)  # N+1: Separate query per result

Remediation:

# Use single JOIN query
sql = """
    SELECT m.*, v.distance
    FROM vec_memories v
    JOIN memories m ON v.id = m.id
    WHERE v.embedding MATCH ? AND k = ?
    ORDER BY v.distance LIMIT ?
"""

PERF-002: Novelty Check Triggers Full Model Load in Hook Path

Location: src/git_notes_memory/hooks/novelty_checker.py:99-113, 151-158
Category: Performance
Severity: CRITICAL

Description: First novelty check during hook execution loads the sentence-transformer model (2-5 seconds), violating the <50ms hook target.

Impact: First session with novelty checking enabled blocks Claude Code for several seconds.

Evidence:

def check_novelty(self, text: str, ...) -> NoveltyResult:
    recall = self._get_recall_service()  # Triggers model load
    results = recall.search(text, ...)   # Blocks on embedding

Remediation:

def check_novelty(self, text: str, ...) -> NoveltyResult:
    embedding = self._get_embedding_service()
    if not embedding.is_loaded:
        # Model not loaded - skip expensive check, assume novel
        return NoveltyResult(novelty_score=1.0, is_novel=True, ...)
    # Model loaded - safe to proceed
    ...

High Priority Findings (8)

PERF-003: Subprocess in Project Detection

Location: src/git_notes_memory/config.py:219-225
Category: Performance
Severity: HIGH

Description: Each get_project_identifier() call spawns a subprocess (git config --get), adding 50-200ms latency.

Remediation: Add LRU cache or read .git/config file directly without subprocess.

PERF-004: Uncached File Reads in Project Detector

Location: src/git_notes_memory/hooks/project_detector.py:146-176
Category: Performance
Severity: HIGH

Description: Each detect_project() reads up to 5 files without caching.

Remediation: Add @lru_cache(maxsize=16) decorator.

PERF-005: Session Start Initializes Full Index

Location: src/git_notes_memory/hooks/session_start_handler.py:67-83
Category: Performance
Severity: HIGH

Description: Every session start runs IndexService.initialize() and multiple COUNT queries just for a status message.

Remediation: Use lightweight SELECT COUNT(*) without full initialization.

TEST-001: Missing Tests for signal_detector.py

Location: src/git_notes_memory/hooks/signal_detector.py
Category: Test Coverage
Severity: HIGH

Description: Core signal detection logic (600+ lines) has no dedicated test file.

Remediation: Create tests/test_signal_detector.py with parameterized tests for all signal types.

TEST-002: Missing Tests for capture_decider.py

Location: src/git_notes_memory/hooks/capture_decider.py
Category: Test Coverage
Severity: HIGH

Description: Capture decision logic (350+ lines) has no dedicated test file.

Remediation: Create tests/test_capture_decider.py testing threshold-based action selection.

TEST-003: Missing Tests for context_builder.py

Location: src/git_notes_memory/hooks/context_builder.py
Category: Test Coverage
Severity: HIGH

Description: Context building logic (400+ lines) has no dedicated test file.

Remediation: Create tests/test_context_builder.py testing budget calculation and XML output.

DOC-001: CHANGELOG Missing v0.3.1 Entry

Location: CHANGELOG.md
Category: Documentation
Severity: HIGH

Description: Current version is 0.3.1 (per pyproject.toml) but CHANGELOG only documents up to 0.3.0.

Remediation: Add 0.3.1 entry with bump-my-version, emoji markers, namespace styling features.

QUAL-001: Bare Exception Handlers in Hook Entry Points

Location: hooks/session_start.py:38-40, hooks/user_prompt.py:41-46, hooks/stop.py:47-50
Category: Code Quality
Severity: HIGH

Description: Bare except Exception: without logging masks potential bugs.

Remediation:

except Exception as e:
    print(f"[memory-hook] Unexpected error: {e}", file=sys.stderr)
    print(json.dumps({"continue": True}))
    sys.exit(0)

Medium Priority Findings (25)

Architecture (7)

ID	Location	Issue
ARCH-001	`capture.py:176-898`	CaptureService has too many responsibilities (validation, locking, git ops, indexing, 8 capture types)
ARCH-002	`signal_detector.py:36-107`	Signal patterns hardcoded - violates Open/Closed principle
ARCH-003	All services	No Protocol/ABC interfaces for services - limits testability
ARCH-004	`tests/conftest.py:17-96`	Singleton reset accesses private module attributes directly
ARCH-005	`capture.py:908-923`	Default CaptureService lacks index/embedding services
ARCH-006	Hook handlers	Direct service instantiation instead of dependency injection
ARCH-007	`hooks/models.py:120-122`	`NoveltyResult.similar_memory_ids` uses mutable list in frozen dataclass

Performance (3)

ID	Location	Issue
PERF-006	`context_builder.py:375-409`	Sequential semantic searches (learnings + patterns) when parallel possible
PERF-007	`novelty_checker.py:253-278`	Batch novelty check is sequential
PERF-008	`index.py:677-685`	`get_all_ids()` uses unbounded fetchall()

Code Quality (5)

ID	Location	Issue
QUAL-002	`hooks/*.py` (7 files)	Duplicated hook entry point pattern (~100 lines repeated)
QUAL-003	8+ handler classes	Duplicated lazy `_get_*_service()` pattern
QUAL-004	`signal_detector.py:254-299`	`_extract_context()` has cyclomatic complexity ~11
QUAL-005	`capture_decider.py:128-232`	`decide()` method has ~12 cyclomatic complexity
QUAL-006	`novelty_checker.py:210-218`	Swallowed exception assumes novel without logging context

Test Coverage (6)

ID	Location	Issue
TEST-004	`session_start_handler.py`	No dedicated test file
TEST-005	`user_prompt_handler.py`	No dedicated test file
TEST-006	`stop_handler.py`	No dedicated test file
TEST-007	`hooks/models.py`	No dedicated test file for model validation
TEST-008	All hook handlers	End-to-end hook pipeline integration test missing
TEST-009	`signal_detector.py`	Performance test for <50ms requirement missing

Documentation (4)

ID	Location	Issue
DOC-002	`README.md`	Missing troubleshooting section
DOC-003	`hooks/hooks.json`	Matchers and timeouts undocumented
DOC-004	Repository	No CONTRIBUTING.md file
DOC-005	Repository	No SECURITY.md file

Low Priority Findings (32)

Security (1)

SEC-001: Lock file permissions 0o644 could be 0o600 (capture.py:81)

Architecture (10)

Inconsistent singleton variable naming across modules
Large IndexService (1115 lines) but well-organized
SyncService.reindex() is 79 lines
HookConfig has 25+ configuration fields
Test file naming confusion (test_hook_integration.py vs test_hooks_integration.py)
Missing hook handler Protocol/ABC
Context XML serialization mixed with data gathering
Config parsing duplication between core and hooks

Performance (4)

Import of json inside functions (minor lookup overhead)
Config loader creates default HookConfig for every load
dotenv.load_dotenv() at module import (5-10ms)
Signal detector compiles patterns on first use (5-10ms, but cached)

Code Quality (17)

Unused escape_xml_text() function
Unused classify() method
Single-letter variable 's' in comprehensions
Ambiguous variable name 'result'
Hardcoded thresholds 0.95, 0.7, 0.3 without constants
Hardcoded summary length 200
Hardcoded context window lengths
Inconsistent import organization
CaptureDecider.init has 6 parameters
Deep nesting in _extract_tags()

Positive Observations

Security Strengths

All subprocess calls avoid shell=True - no command injection
Proper yaml.safe_load() - no deserialization vulnerabilities
Comprehensive path validation with symlink resolution
Parameterized SQL queries - no SQL injection
Input size limits prevent DoS
Timeout protection on all hooks

Architecture Strengths

Consistent lazy initialization via _get_*() methods
Services support dependency injection via constructors
Clean configuration separation (config.py, config_loader.py)
Excellent exception design with recovery_action hints
Frozen dataclasses for immutability

Code Quality Strengths

100% type hint coverage (mypy strict mode)
Google-style docstrings consistently applied
Graceful degradation - hooks never block Claude Code

Test Strengths

Proper pytest fixtures with autouse singleton reset
Good test naming conventions
Core services (capture, recall, index) well tested

Documentation Strengths

Excellent CLAUDE.md with complete environment variables
Comprehensive USER_GUIDE.md and DEVELOPER_GUIDE.md
All public APIs have docstrings

Appendix

Files Reviewed

17 core library modules (src/git_notes_memory/)
18 hook modules (src/git_notes_memory/hooks/)
8 hook entry scripts (hooks/)
30 test files (tests/)
Configuration files (pyproject.toml, plugin.json, hooks.json)
Documentation (README.md, CLAUDE.md, CHANGELOG.md, docs/)

Specialist Agents Deployed

Security Analyst
Performance Engineer
Architecture Reviewer
Code Quality Analyst
Test Coverage Analyst
Documentation & Standards Reviewer

Review Methodology

Static analysis patterns
OWASP Top 10 security checklist
SOLID principles assessment
Cyclomatic complexity estimation
Test coverage gap analysis

FilesExpand file tree

CODE_REVIEW.md

Latest commit

History

CODE_REVIEW.md

File metadata and controls

Code Review Report

Metadata

Executive Summary

Overall Health Score: 7.5/10

Key Findings

Recommended Action Plan

Critical Findings (2)

PERF-001: N+1 Query Pattern in Vector Search

PERF-002: Novelty Check Triggers Full Model Load in Hook Path

High Priority Findings (8)

PERF-003: Subprocess in Project Detection

PERF-004: Uncached File Reads in Project Detector

PERF-005: Session Start Initializes Full Index

TEST-001: Missing Tests for signal_detector.py

TEST-002: Missing Tests for capture_decider.py

TEST-003: Missing Tests for context_builder.py

DOC-001: CHANGELOG Missing v0.3.1 Entry

QUAL-001: Bare Exception Handlers in Hook Entry Points

Medium Priority Findings (25)

Architecture (7)

Performance (3)

Code Quality (5)

Test Coverage (6)

Documentation (4)

Low Priority Findings (32)

Security (1)

Architecture (10)

Performance (4)

Code Quality (17)

Positive Observations

Security Strengths

Architecture Strengths

Code Quality Strengths

Test Strengths

Documentation Strengths

Appendix

Files Reviewed

Specialist Agents Deployed

Review Methodology