Refactor codebase per RFC #18 audit findings#19
Conversation
- pyproject.toml with FastMCP, uvicorn, and dev dependencies - Makefile with check, fmt, lint, test, install, uninstall targets - LaunchAgent plist and install/uninstall scripts for auto-start - dev.sh script for development mode with auto-reload - Basic FastMCP server with placeholder tools: - get_status: Returns server status - ingest_logs: Placeholder for log ingestion - query_tool_frequency: Placeholder for frequency queries - Usage guide as MCP resource at session-analytics://guide - Tests for the placeholder tools - README with installation and usage instructions Server runs on port 8081 (to not conflict with event-bus on 8080). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- storage.py with SQLiteStorage class: - Events table with denormalized fields for fast queries - Sessions table for session metadata - Ingestion state tracking for incremental updates - Patterns table for pre-computed insights - Indexes on timestamp, session_id, tool_name, project_path - Data classes: Event, Session, IngestionState, Pattern - CRUD operations for all entities with batch insert support - get_db_stats() for monitoring database health - Updated server.py to use storage for get_status() - Comprehensive test suite (16 tests) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements log file discovery and parsing: - find_log_files(): Discovers JSONL files within date range - parse_tool_use(): Extracts tool info (command, file_path, skill_name) - parse_entry(): Parses entries into Event objects - ingest_file(): Incremental ingestion with mtime/size tracking - ingest_logs(): Full ingestion orchestration - update_session_stats(): Aggregates session statistics Integrates with server.py to provide real data for ingest_logs tool. Closes #3 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements all query MCP tools: - query_tool_frequency: Tool usage counts with project filter - query_timeline: Events in time window with filtering - query_commands: Bash command breakdown with prefix filter - query_sessions: Session metadata and token totals - query_tokens: Token usage grouped by day/session/model Also adds: - ensure_fresh_data(): Auto-refresh mechanism (5 min staleness) - Comprehensive tests for all queries (18 new tests) Closes #4 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements pattern detection for /improve-workflow integration: - compute_tool_frequency_patterns(): Tool usage frequency - compute_command_patterns(): Bash command frequency - compute_sequence_patterns(): Tool n-gram detection - compute_permission_gaps(): Commands not in settings.json - get_insights(): Unified insights API for /improve-workflow New MCP tools: - query_sequences: Common tool patterns - query_permission_gaps: Commands needing settings.json - get_insights: Pre-computed patterns Adds 16 new tests (69 total). Closes #5 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds command-line interface for shell access: - session-analytics-cli status: Database stats - session-analytics-cli ingest: Trigger log ingestion - session-analytics-cli frequency: Tool usage counts - session-analytics-cli commands: Bash command breakdown - session-analytics-cli sessions: Session metadata - session-analytics-cli tokens: Token usage by day/session/model - session-analytics-cli sequences: Tool patterns - session-analytics-cli permissions: Commands needing settings.json - session-analytics-cli insights: Pre-computed patterns All commands support --json for machine-readable output. Also updates README with CLI usage documentation. Closes #6 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add comprehensive CLI test coverage (15 tests) - Fix format_output ordering for insights command - Test all CLI commands: status, ingest, frequency, commands, sessions, tokens, sequences, permissions, insights - Test both human-readable and JSON output modes Closes #7 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Document all 10 MCP tools with descriptions - Add CLI usage examples for all 9 commands - Include example JSON output for key queries - Document architecture, data model, and integration points - Add development and installation instructions 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements all 12 findings from the codebase audit: **P1 - Encapsulation:** - Add execute_query() and execute_write() public methods to SQLiteStorage - Migrate 8 locations (4 in queries.py, 4 in patterns.py) to use public API **P2 - Code Quality:** - Add build_where_clause() helper to reduce query duplication - Refactor format_output() to use formatter registry pattern - Change to module-qualified imports in server.py - Read version from importlib.metadata instead of hardcoding **P3 - Future Extensibility:** - Add schema migration framework with @migration decorator - Define __all__ exports in __init__.py **P4 - Polish:** - Remove empty pass branch in ingest.py - Document timestamp handling with clear comments - Add CLI epilog with examples - Update CLAUDE.md and README.md with architecture patterns Closes #18 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Code Review: PR #19 - Refactor codebase per RFC #18 audit findingsThis is an excellent refactoring that systematically addresses all 12 findings from RFC #18. The implementation is clean, well-tested, and follows best practices. Strengths1. Excellent Encapsulation - The new public API (execute_query/execute_write) in storage.py:162-192 properly abstracts database access with clear separation and good documentation. 2. Strong Test Coverage - 1,591 lines of tests across 6 test files (84 tests total) covering storage, queries, patterns, ingest, cli, and server. 3. Elegant Formatter Registry - cli.py:17-135 implements clean decorator-based registry that's easy to extend. 4. Good Query Helper - queries.py:8-40 build_where_clause() eliminates duplication across 4 instances. 5. Migration Framework - storage.py:112-132 provides future-proof schema evolution with decorator pattern. Issues FoundCritical: None High Priority:
Medium Priority: Low Priority: RecommendationsDocumentation: Excellent README and CLAUDE.md updates. Consider adding more API docstrings. Security: Uses parameterized queries throughout. settings.local.json should not be committed. Performance: Proper indexes, batch operations, pre-computed patterns, auto-refresh with staleness detection. Maintainability: Clear separation of concerns, consistent naming, migration framework, comprehensive tests. SummaryAll 12 RFC #18 findings successfully addressed:
Recommended Actions:
Verdict: Approve with minor changes (fix settings.local.json before merge) Excellent work on this comprehensive refactoring! |
Code Review: PR #19 - Refactor codebase per RFC #18 audit findingsOverviewThis is an impressive refactor that systematically addresses all 12 findings from RFC #18. The PR adds ~4,700 lines of well-structured code with comprehensive test coverage (84 tests across 7 test files). The implementation follows clean architecture principles and establishes clear patterns for future development. ✅ StrengthsArchitecture & Design
Code Quality
Developer Experience
🔍 Issues & RecommendationsSecurity ConcernsP0: SQL Injection Risk (storage.py:427, 434) Recommendation: Use string concatenation instead of f-strings for SQL to make the pattern safer. P1: Path Traversal (ingest.py:43) Recommendation: Validate or sanitize BugsP1: Timezone Handling (storage.py:567-569) P2: Silent Failure in Formatter Registry (cli.py:130-135) P2: Missing Error Handling (queries.py:68-73) PerformanceP1: Missing Index (storage.py:262-265) P2: Unbounded Query (storage.py:409) Code QualityP2: Magic Number (cli.py:36, 44, 67) P2: Duplicate Code (storage.py:318-350, 354-392) P3: Type Annotation (cli.py:23) TestingP1: Test Coverage Gap P2: Missing Edge Case Tests
📊 Test Coverage AnalysisStrong coverage across 7 test files:
🎯 RecommendationsBefore Merge
Follow-up PRs
✨ ConclusionThis is excellent work that significantly improves codebase maintainability. The architecture patterns provide a solid foundation. Recommendation: Merge after addressing the P0 security concern and adding the missing index. Great job! 🚀 |
Summary
Implements all 12 findings from the codebase audit (RFC #18):
execute_query()/execute_write()public API to SQLiteStorage, migrate 8 locations from_connect()build_where_clause()helper, formatter registry pattern in CLI, module-qualified imports, importlib.metadata for version@migrationdecorator,__all__exportsTest plan
make checkpasses (format, lint, test)Closes #18
🤖 Generated with Claude Code