Skip to content

Phase 7: Polish and CLI tests#16

Closed
evansenter wants to merge 8 commits into
mainfrom
phase-7-polish
Closed

Phase 7: Polish and CLI tests#16
evansenter wants to merge 8 commits into
mainfrom
phase-7-polish

Conversation

@evansenter

Copy link
Copy Markdown
Owner

Summary

  • Add comprehensive CLI test coverage (15 tests)
  • Fix format_output ordering bug for insights command (summary check before sequences)
  • Tests cover all CLI commands with both human-readable and JSON output modes

Test Plan

  • All 84 tests pass
  • Code formatted with ruff
  • Linting passes

Closes #7

🤖 Generated with Claude Code

evansenter and others added 7 commits December 31, 2025 04:14
- pyproject.toml with FastMCP, uvicorn, and dev dependencies
- Makefile with check, fmt, lint, test, install, uninstall targets
- LaunchAgent plist and install/uninstall scripts for auto-start
- dev.sh script for development mode with auto-reload
- Basic FastMCP server with placeholder tools:
  - get_status: Returns server status
  - ingest_logs: Placeholder for log ingestion
  - query_tool_frequency: Placeholder for frequency queries
- Usage guide as MCP resource at session-analytics://guide
- Tests for the placeholder tools
- README with installation and usage instructions

Server runs on port 8081 (to not conflict with event-bus on 8080).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- storage.py with SQLiteStorage class:
  - Events table with denormalized fields for fast queries
  - Sessions table for session metadata
  - Ingestion state tracking for incremental updates
  - Patterns table for pre-computed insights
  - Indexes on timestamp, session_id, tool_name, project_path
- Data classes: Event, Session, IngestionState, Pattern
- CRUD operations for all entities with batch insert support
- get_db_stats() for monitoring database health
- Updated server.py to use storage for get_status()
- Comprehensive test suite (16 tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements log file discovery and parsing:
- find_log_files(): Discovers JSONL files within date range
- parse_tool_use(): Extracts tool info (command, file_path, skill_name)
- parse_entry(): Parses entries into Event objects
- ingest_file(): Incremental ingestion with mtime/size tracking
- ingest_logs(): Full ingestion orchestration
- update_session_stats(): Aggregates session statistics

Integrates with server.py to provide real data for ingest_logs tool.

Closes #3

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements all query MCP tools:
- query_tool_frequency: Tool usage counts with project filter
- query_timeline: Events in time window with filtering
- query_commands: Bash command breakdown with prefix filter
- query_sessions: Session metadata and token totals
- query_tokens: Token usage grouped by day/session/model

Also adds:
- ensure_fresh_data(): Auto-refresh mechanism (5 min staleness)
- Comprehensive tests for all queries (18 new tests)

Closes #4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements pattern detection for /improve-workflow integration:
- compute_tool_frequency_patterns(): Tool usage frequency
- compute_command_patterns(): Bash command frequency
- compute_sequence_patterns(): Tool n-gram detection
- compute_permission_gaps(): Commands not in settings.json
- get_insights(): Unified insights API for /improve-workflow

New MCP tools:
- query_sequences: Common tool patterns
- query_permission_gaps: Commands needing settings.json
- get_insights: Pre-computed patterns

Adds 16 new tests (69 total).

Closes #5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds command-line interface for shell access:
- session-analytics-cli status: Database stats
- session-analytics-cli ingest: Trigger log ingestion
- session-analytics-cli frequency: Tool usage counts
- session-analytics-cli commands: Bash command breakdown
- session-analytics-cli sessions: Session metadata
- session-analytics-cli tokens: Token usage by day/session/model
- session-analytics-cli sequences: Tool patterns
- session-analytics-cli permissions: Commands needing settings.json
- session-analytics-cli insights: Pre-computed patterns

All commands support --json for machine-readable output.

Also updates README with CLI usage documentation.

Closes #6

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add comprehensive CLI test coverage (15 tests)
- Fix format_output ordering for insights command
- Test all CLI commands: status, ingest, frequency, commands,
  sessions, tokens, sequences, permissions, insights
- Test both human-readable and JSON output modes

Closes #7

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Dec 31, 2025

Copy link
Copy Markdown

Pull Request Review: Phase 7 - Polish and CLI Tests

Summary

This PR adds comprehensive CLI test coverage (15 tests) and fixes a bug in format_output ordering for the insights command. Overall, this is excellent work with strong code quality, comprehensive testing, and good attention to detail.

✅ Strengths

  1. Excellent Test Coverage

    • Added 15 CLI-specific tests bringing total to 84 tests
    • Tests cover all CLI commands with both human-readable and JSON output modes
    • Good use of fixtures and mocking for isolated unit tests
    • Proper separation of concerns with capsys for output testing
  2. Well-Structured CLI Implementation

    • Clean separation between command handlers and formatting logic
    • Consistent argument parsing across all subcommands
    • Good use of argparse with helpful descriptions and defaults
    • Smart format detection using data structure introspection in format_output()
  3. Strong Security Practices

    • ✅ All SQL queries use parameterized statements (no SQL injection risk)
    • ✅ No f-strings with user input in SQL queries
    • ✅ Proper WHERE clause construction with params list
    • ✅ Settings file properly scoped with reasonable permissions
  4. Good Documentation

    • README.md provides clear usage examples
    • Comprehensive docstrings in CLI functions
    • Helpful inline comments in Makefile and scripts
  5. Production-Ready Tooling

    • Makefile targets are well-organized and documented
    • LaunchAgent setup with proper error handling
    • Dev mode script with graceful cleanup on exit
    • Good separation of dev vs production dependencies

🔍 Areas for Improvement

1. Error Handling in CLI (Minor)

The CLI commands lack exception handling. If the storage initialization or queries fail, users will see raw Python tracebacks instead of friendly error messages.

Recommendation:

def cmd_status(args):
    """Show database status."""
    try:
        storage = SQLiteStorage()
        stats = storage.get_db_stats()
        last_ingest = storage.get_last_ingestion_time()
        # ... rest of implementation
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)

2. Potential Division by Zero (Minor)

In cli.py:83, the code does:

lines.append(f"Size: {data.get('db_size_bytes', 0) / 1024:.1f} KB")

While unlikely, if db_size_bytes is 0, this works fine (0/1024 = 0), but consider adding size formatting utilities for better human-readable output (e.g., MB for larger databases).

3. macOS-Only Scripts (Documentation)

The LaunchAgent scripts are macOS-specific but this isn't clearly documented in the README. Consider adding a note about platform requirements or providing systemd alternatives for Linux users.

Recommendation:
Add to README:

## Platform Support
- **macOS**: Full support with LaunchAgent auto-start
- **Linux**: Manual server start (systemd service coming soon)

4. Hardcoded Limits (Minor)

The format_output() function truncates output at 20 items ([:20]) in multiple places. Consider making this configurable:

def format_output(data: dict, json_output: bool = False, limit: int = 20) -> str:

5. Missing Input Validation (Minor)

Arguments like --days, --min-count, --threshold don't validate for negative values. Argparse could include:

sub.add_argument("--days", type=int, default=7, 
                 help="Days to analyze (default: 7)",
                 metavar="N", 
                 choices=range(1, 366))  # or use a custom type validator

🐛 Potential Bugs

1. Insights Ordering Bug Fix

The PR description mentions fixing a "format_output ordering bug" where summary check needs to come before sequences. I verified this is correctly implemented in cli.py:57-63 with the summary check before the sequences check. Good catch!

2. Bash Script Error Handling

In scripts/dev.sh:10, the script does:

source .venv/bin/activate

If the venv doesn't exist, this will fail ungracefully. Consider adding:

if [ ! -f .venv/bin/activate ]; then
    echo "Error: Virtual environment not found. Run 'make dev' first."
    exit 1
fi
source .venv/bin/activate

📊 Performance Considerations

  1. Storage Initialization - Each CLI command creates a new SQLiteStorage() instance. For the CLI use case, this is fine, but for long-running processes, consider connection pooling.

  2. Pattern Computation - The compute_sequence_patterns() and compute_permission_gaps() functions could be expensive for large datasets. The existing caching via the patterns table is a good mitigation.

🧪 Test Quality

The tests are well-written with good coverage:

  • ✅ Tests use temporary databases (proper isolation)
  • ✅ Fixtures are reusable and clear
  • ✅ Both JSON and human-readable output modes tested
  • ✅ Mocking strategy is appropriate

One suggestion: Consider adding integration tests that verify the full CLI flow (argument parsing → command execution → output), not just the command functions directly.

📝 Code Quality & Style

  • ✅ Consistent with project conventions (CLAUDE.md)
  • ✅ Proper type hints throughout
  • ✅ Ruff formatting and linting passes
  • ✅ No unnecessary complexity
  • ✅ DRY principles followed

🎯 Overall Assessment

Rating: 9/10 - Excellent work!

This PR is production-ready with only minor suggestions for improvement. The bug fix is correct, test coverage is comprehensive, and the code quality is high. The CLI provides a solid interface for both human and programmatic use.

✅ Approval Recommendation

I recommend approving and merging this PR. The minor improvements suggested above can be addressed in future PRs as polish items, but they don't block this release.

Great job on achieving comprehensive test coverage and maintaining code quality! 🎉

- Document all 10 MCP tools with descriptions
- Add CLI usage examples for all 9 commands
- Include example JSON output for key queries
- Document architecture, data model, and integration points
- Add development and installation instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Dec 31, 2025

Copy link
Copy Markdown

PR Review: Phase 7 - Polish and CLI Tests

Overview

This PR adds comprehensive CLI test coverage (15 tests) and fixes a critical bug in the format_output function. The changes are well-structured and align with the project's goals.


✅ Strengths

1. Bug Fix - Well Identified and Resolved
The format_output ordering fix is correct. The issue was that get_insights() returns a dict with both "summary" and "sequences" keys, causing the elif chain to match "sequences" first. Moving the "summary" check before "sequences" ensures proper formatting for insights output.

2. Comprehensive Test Coverage

  • 15 new tests covering all CLI commands
  • Tests both human-readable and JSON output modes
  • Good use of fixtures (storage, populated_storage)
  • Proper mocking with patch to isolate units

3. Documentation Quality

  • README.md is well-written with clear examples
  • CLAUDE.md updates improve clarity and organization
  • Makefile provides clear development workflows

4. Code Quality

  • Clean, readable code following Python conventions
  • Good separation of concerns (CLI commands as functions)
  • Proper use of argparse for CLI argument handling

🔍 Issues & Recommendations

Critical: Missing Test for cmd_ingest

The cmd_ingest function is not tested in test_cli.py. This is a critical function that should have test coverage.

Recommendation:

def test_cmd_ingest(self, populated_storage, capsys):
    """Test ingest command."""
    class Args:
        json = False
        days = 7
        project = None
        force = False
    
    with patch("session_analytics.cli.SQLiteStorage", return_value=populated_storage):
        with patch("session_analytics.cli.ingest_logs", return_value={"files_found": 5, "files_processed": 3, "events_added": 100}):
            cmd_ingest(Args())
    
    captured = capsys.readouterr()
    assert "Files found:" in captured.out

Moderate: Test Isolation Concerns

The tests use capsys.readouterr() to verify output but only check for presence of strings, not their actual values. This makes tests somewhat brittle and less informative when they fail.

Recommendation:
Consider refactoring to separate the data retrieval from formatting:

# Instead of testing via print output
def test_cmd_frequency_data(self, populated_storage):
    """Test frequency command returns correct data."""
    with patch("session_analytics.cli.SQLiteStorage", return_value=populated_storage):
        # Capture return value instead of stdout
        result = query_tool_frequency(populated_storage, days=7, project=None)
        assert result["total_tool_calls"] == 2
        assert len(result["tools"]) == 2

Minor: .claude/settings.local.json Committed

This file appears to be developer-specific configuration and probably shouldn't be committed to the repo.

Recommendation:
Add to .gitignore:

.claude/settings.local.json

Minor: Hardcoded Test Data

The populated_storage fixture uses hardcoded UUIDs like "e1" and session IDs like "s1". While this works, using a factory or more realistic test data would improve maintainability.

Recommendation:

import uuid

Event(
    id=None,
    uuid=str(uuid.uuid4()),  # More realistic
    # ...
)

Documentation: pyproject.toml Missing License File

pyproject.toml declares license = "MIT" but there's no LICENSE file in the repository.

Recommendation:
Add a LICENSE file or update pyproject.toml to include license text.


🔒 Security Review

✅ No security concerns identified. The code:

  • Doesn't expose sensitive data
  • Uses parameterized SQL queries (via SQLiteStorage)
  • Properly handles file paths with Path objects
  • No command injection vulnerabilities

🎯 Performance Considerations

✅ Generally good:

  • List slicing with [:20] prevents excessive output
  • Efficient elif chains in format_output

⚠️ Minor concern:
The format_output function uses multiple elif conditions checking dict keys. For very large result dicts, this could be optimized with a lookup table or type hints to determine formatting strategy.


📊 Test Coverage Assessment

Current coverage: 15 CLI tests added
Missing coverage:

  • cmd_ingest (critical)
  • Error cases (invalid args, missing database, etc.)
  • Edge cases (empty results, malformed data)

✨ Summary

Approve with minor changes recommended

This is solid work that adds valuable test coverage and fixes a real bug. The main issues are:

  1. Must fix: Add test for cmd_ingest
  2. Should fix: Remove .claude/settings.local.json from repo
  3. Nice to have: Improve test assertions and add edge case coverage

The bug fix is correct and the overall code quality is good. Once the missing cmd_ingest test is added, this will be ready to merge.


Code Quality: 8/10
Test Coverage: 7/10 (would be 9/10 with cmd_ingest test)
Documentation: 9/10
Overall:Approved with recommendations

@evansenter

Copy link
Copy Markdown
Owner Author

Superseded by #19 which contains all Phase 1-7 work plus RFC #18 refactoring.

@evansenter evansenter closed this Dec 31, 2025
@evansenter evansenter deleted the phase-7-polish branch December 31, 2025 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Phase 7: Polish

1 participant