
Add new analytics commands and expand flags #30

Merged: evansenter merged 6 commits into main from feature/new-analytics-commands on Jan 1, 2026

@evansenter

Owner

Summary

  • New CLI commands: file-activity, languages, projects, mcp-usage for deeper analytics insights
  • Expand flag system: frequency auto-expands by default (Bash→commands, Skill→skills, Task→agents), sequences supports --expand for command-level patterns
  • UX improvements: Descriptive headers on all CLI output, better token breakdown in sessions, make reinstall target
  • Tooling: Global 7-day report script in scripts/global-report.sh
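
The asymmetric expand defaults described above (frequency expands unless told not to, sequences expands only on request) can be sketched with argparse. This is a minimal illustration, not the project's actual parser: the program name `analytics-sketch`, the `build_parser` and `should_expand` helpers, and the subparser layout are all assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser layout; only the flag names come from the PR text.
    parser = argparse.ArgumentParser(prog="analytics-sketch")
    sub = parser.add_subparsers(dest="command", required=True)

    # frequency: expanded by default, so the flag is an opt-out.
    freq = sub.add_parser("frequency")
    freq.add_argument("--no-expand", action="store_true",
                      help="show raw tool names without breakdowns")

    # sequences: collapsed by default, so the flag is an opt-in.
    seq = sub.add_parser("sequences")
    seq.add_argument("--expand", action="store_true",
                     help="show command-level patterns")
    return parser


def should_expand(args: argparse.Namespace) -> bool:
    """Resolve the effective expand setting for either subcommand."""
    if args.command == "frequency":
        return not args.no_expand
    return args.expand
```

With this layout, `frequency` with no flags resolves to expanded output and `sequences` with no flags resolves to collapsed output.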

New Commands

| Command | Description | Flags |
| --- | --- | --- |
| file-activity | Files with most reads/edits/writes | --days, --project, --limit, --collapse-worktrees |
| languages | Language distribution from file extensions | --days, --project |
| projects | Activity breakdown by project | --days |
| mcp-usage | MCP server and tool usage | --days, --project |

Test plan

  • All 200 tests pass
  • Tested each new command with real data
  • Verified --collapse-worktrees consolidates worktree paths (512 → 364 files)
  • Verified global-report.sh generates full report to /tmp

🤖 Generated with Claude Code

New CLI commands:
- file-activity: Show file read/write activity with --collapse-worktrees option
- languages: Language distribution from file extensions
- projects: Activity breakdown by project
- mcp-usage: MCP server and tool usage grouped by server

Expand flag system:
- frequency: Now auto-expands Bash→commands, Skill→skills, Task→agents (--no-expand to disable)
- sequences: Add --expand flag to show command-level workflow patterns

Other improvements:
- Add descriptive headers to all CLI output formatters
- Update sessions output to show input/output/total token breakdown
- Add make reinstall target (pip install + restart LaunchAgent)
- Add global-report.sh script for 7-day analytics report
- Update CLAUDE.md with restart/reinstall guidance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Jan 1, 2026

PR Review: Add new analytics commands and expand flags

Overall this is a solid PR that adds useful analytics capabilities. The code follows the project's patterns well and the documentation updates are thorough. Here's my detailed feedback:

✅ What looks good

  1. Well-structured new commands - The new file-activity, languages, projects, and mcp-usage commands fill real analytics gaps and follow the project's naming conventions (query_* for data retrieval).

  2. Consistent formatter pattern - All new formatters use the @_register_formatter(predicate) decorator pattern and include descriptive headers as required.

  3. Expand flag implementation - The asymmetric defaults (frequency expands by default, sequences does not) make sense for the UX and are well-documented in CLAUDE.md.

  4. Global report script - Nice addition for generating comprehensive reports. The fallback to .venv/bin/ is a good pattern.

  5. Documentation updates - CLAUDE.md updates are comprehensive and include the new make reinstall target explanation.

⚠️ Suggestions

  1. Import placement in queries.py (line 1245):

    import re

    This import is placed mid-file after function definitions. Move it to the top of the file with other imports for consistency.

  2. Language detection extensibility (query_languages, lines 1362-1388):
    The hardcoded CASE statement for language detection works but is rigid. Consider extracting the extension-to-language mapping to a constant dict for easier maintenance:

    EXTENSION_LANGUAGES = {
        ".rs": "Rust", ".py": "Python", ".ts": "TypeScript", ...
    }

    This is minor - the current approach works fine.

  3. Missing project filter in query_projects:
    Unlike other query functions, query_projects doesn't accept a project filter parameter. This is intentional (you're querying across projects), but worth documenting in the docstring to avoid confusion.

  4. Potential SQL injection concern (already addressed):
    I see build_where_clause() is used consistently, which is good. Just confirming the pattern is maintained in all new queries. ✓
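
For the extension-to-language mapping proposed in suggestion 2, a constant dict plus a small lookup helper might look like the sketch below. Only the `.rs`, `.py`, and `.ts` entries come from the review; the other entries, the `language_for` helper, and the `"Other"` fallback label are assumptions for illustration, and the real `query_languages` does this work inside a SQL CASE statement.

```python
from pathlib import PurePosixPath

# Entries beyond .rs/.py/.ts are illustrative, not taken from the codebase.
EXTENSION_LANGUAGES = {
    ".rs": "Rust",
    ".py": "Python",
    ".ts": "TypeScript",
    ".kt": "Kotlin",
    ".scala": "Scala",
}


def language_for(path: str) -> str:
    """Map a file path to a language name via its (lowercased) extension."""
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_LANGUAGES.get(suffix, "Other")
```

Adding a language then becomes a one-line change to the dict rather than another branch in the query.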

🧪 Test coverage

The test changes look minimal (just adding expand=False to the Args class). Consider adding tests for:

  1. New CLI commands - cmd_file_activity, cmd_languages, cmd_projects, cmd_mcp_usage
  2. New formatters - _format_file_activity, _format_languages, _format_projects, _format_mcp_usage
  3. --collapse-worktrees flag - Test the worktree path collapsing logic
  4. Breakdown expansion in query_tool_frequency with expand=True

The PR mentions "All 200 tests pass" but the new functionality appears untested. While the code is straightforward, tests would help prevent regressions.

📋 Minor nits

  1. Line 211 in cli.py - The breakdown limit of 8 items is hardcoded. Consider making this configurable or at least a constant.

  2. The get_repo_name() helper function in query_projects (lines 1447-1456) looks for specific directory names like "projects", "repos", "src". This heuristic may not work for all setups, but the fallback to the last path component is reasonable.

Summary

This is a well-implemented feature PR. The main feedback is around adding test coverage for the new functionality. The code quality is good and follows established patterns.

Recommendation: Approve with suggestion to add tests for the new commands in a follow-up.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Jan 1, 2026

Code Review: Add new analytics commands and expand flags

Thanks for this well-structured PR! The new analytics commands and expand flag system are useful additions. Here's my review:

✅ What looks good

  1. Consistent API design: The new query functions follow established patterns (query_file_activity, query_languages, etc.) and use the existing build_where_clause() helper properly.

  2. CLI follows conventions: Commands are properly registered with argparse, use standard flags (--days, --project, --limit), and integrate well with the formatter registry.

  3. Documentation updates: CLAUDE.md is updated with the new commands and the expand flag behavior explanation is clear and helpful.

  4. UX improvements: Adding descriptive headers to CLI output (e.g., "Bash commands by frequency") and comma-formatting large numbers improves readability.

  5. The make reinstall target: Good addition to the Makefile for handling pyproject.toml changes.

⚠️ Issues to consider

1. Missing test coverage for new query functions

The PR adds 4 new query functions (query_file_activity, query_languages, query_projects, query_mcp_usage) but there are no corresponding unit tests in tests/test_queries.py. While the PR description mentions "200 tests pass", these are existing tests - the new functionality isn't tested.

Consider adding tests for:

  • query_file_activity with collapse_worktrees=True/False
  • query_languages language detection from file extensions
  • query_projects repo name extraction logic
  • query_mcp_usage parsing of mcp__<server>__<tool> format

2. Hardcoded language mappings in query_languages

The language detection uses a large CASE statement with hardcoded extensions. This works, but:

  • Missing common extensions: .kt (Kotlin), .scala, .zig, .lua, .php, .vue, .svelte
  • Consider extracting to a constant or helper for maintainability

3. _collapse_worktree_path regex pattern

The regex \.worktrees/[^/]+/ correctly matches .worktrees/<branch>/, but it may inadvertently match paths with .worktrees as a directory name (not just git worktrees). Minor edge case, but worth noting.
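
To make the edge case concrete, here is a minimal sketch of the collapsing step. The pattern is the one quoted above; replacing the match with an empty string, and the function name `collapse_worktree_path` (without the leading underscore), are assumptions about how the real helper behaves.

```python
import re

# Pattern quoted in the review; matches ".worktrees/<branch>/".
WORKTREE_PATTERN = re.compile(r"\.worktrees/[^/]+/")


def collapse_worktree_path(path: str) -> str:
    """Drop the '.worktrees/<branch>/' segment so worktree copies share one path.

    Assumption: the segment is removed outright rather than rewritten.
    """
    return WORKTREE_PATTERN.sub("", path)


# A git worktree path collapses onto the main checkout path.
print(collapse_worktree_path("/repo/.worktrees/feature-x/src/app.py"))  # /repo/src/app.py

# The edge case: any directory that happens to be named ".worktrees" is collapsed too.
print(collapse_worktree_path("/data/.worktrees/notes/todo.md"))  # /data/todo.md
```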

4. MCP tool name parsing assumption

In query_mcp_usage, the parsing logic assumes mcp__<server>__<tool> format:

parts = tool_name.split("__")
if len(parts) >= 3:
    server = parts[1]
    tool = "__".join(parts[2:])

This is fragile if tool names contain __. The code handles it with "__".join(parts[2:]), which is good, but a comment explaining the expected format would help.
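
A standalone version of that parsing logic shows what it does with each shape of name. The `parse_mcp_tool_name` wrapper, the `parts[0] == "mcp"` check, and the `None` return for non-MCP names are assumptions added for the sketch; only the split and join come from the snippet above.

```python
from typing import Optional, Tuple


def parse_mcp_tool_name(tool_name: str) -> Optional[Tuple[str, str]]:
    """Split 'mcp__<server>__<tool>' into (server, tool).

    Expected format: literal 'mcp', then the server, then the tool,
    separated by double underscores. Extra '__' are kept in the tool name.
    """
    parts = tool_name.split("__")
    # The 'mcp' prefix check is an assumption, not part of the quoted snippet.
    if len(parts) >= 3 and parts[0] == "mcp":
        return parts[1], "__".join(parts[2:])
    return None


print(parse_mcp_tool_name("mcp__github__create_issue"))  # ('github', 'create_issue')
print(parse_mcp_tool_name("mcp__github__search__code"))  # ('github', 'search__code')
print(parse_mcp_tool_name("Bash"))                       # None
```

A server name that itself contains `__` would still be split at the first separator, since the format gives no way to tell where the server ends.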

5. get_repo_name heuristic

The repo name extraction in query_projects:

for part in ("projects", "repos", "src", "Documents"):
    ...

This won't work well for paths like /home/user/work/my-project where none of those markers exist. Might fall back to the last path component unintentionally.

💡 Suggestions

  1. Add MCP tools for new queries: The new query_* functions are only exposed via CLI. Should they also be exposed as MCP tools (get_file_activity, get_languages, etc.) per the project's API conventions?

  2. Consider adding --json output examples to the documentation for the new commands.

  3. The global-report.sh script is a nice addition, but it could fail silently on some errors due to the CLI fallback logic. Consider adding || exit 1 after critical commands.

📊 Summary

| Aspect | Assessment |
| --- | --- |
| Code quality | Good - follows existing patterns |
| Performance | No concerns - queries use existing indexes |
| Security | No concerns |
| Test coverage | Needs improvement - new functions lack tests |

Overall this is a solid PR that adds useful functionality. I'd recommend adding tests for the new query functions before merging. The expand flag implementation in particular is well-designed and the UX improvements (headers, formatting) are thoughtful.

- Add tests for new query functions (file_activity, languages, projects, mcp_usage)
- Add MCP tools for new queries (get_file_activity, get_languages, get_projects, get_mcp_usage)
- Update query_projects docstring explaining no project filter
- Add CLI/MCP parity guidance to CLAUDE.md Architecture Patterns
- Document new MCP tools in guide.md and CLAUDE.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Jan 1, 2026

PR Review: Add new analytics commands and expand flags

This is a well-structured PR that adds useful analytics capabilities while maintaining consistency with the existing codebase. Here's my detailed review:


✅ Strengths

1. Excellent CLI/MCP Parity
The PR correctly follows the CLAUDE.md guidance: every new query function (query_file_activity, query_languages, query_projects, query_mcp_usage) is exposed on both the CLI (cli.py) and MCP server (server.py). Documentation is updated in both guide.md and CLAUDE.md.

2. Good Test Coverage
279 new lines of tests in test_queries.py covering:

  • TestQueryFileActivity - including worktree collapsing
  • TestQueryLanguages - extension mapping
  • TestQueryProjects - cross-project activity
  • TestQueryMcpUsage - MCP tool parsing

3. Consistent Architecture

  • Uses build_where_clause() helper throughout
  • Follows existing formatter registry pattern with @_register_formatter
  • Proper use of ensure_fresh_data() in MCP tools

4. Thoughtful UX

  • --expand / --no-expand flags with sensible defaults
  • --collapse-worktrees is a practical feature for worktree users
  • Descriptive headers on all CLI output

🔍 Suggestions for Improvement

1. Minor: get_repo_name could handle more edge cases (queries.py:1449-1458)

def get_repo_name(path: str) -> str:
    parts = path.rstrip("/").split("/")
    for i, part in enumerate(parts):
        if part in ("projects", "repos", "src", "Documents"):
            if i + 1 < len(parts):
                return parts[i + 1]
    return parts[-1] if parts else path

Consider adding common patterns like github, work, code, or handling ~/.claude/projects/-Users-... encoding. Not blocking, but would improve project name display.
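
One way to try out extra markers without changing the default behavior is to lift the marker list into a parameter. The sketch below keeps the logic of the function quoted above; the `DEFAULT_MARKERS` constant and the `markers` parameter are hypothetical additions, and the sample paths are made up for illustration.

```python
DEFAULT_MARKERS = ("projects", "repos", "src", "Documents")


def get_repo_name(path: str, markers=DEFAULT_MARKERS) -> str:
    """Return the path component after the first marker directory, else the last one."""
    parts = path.rstrip("/").split("/")
    for i, part in enumerate(parts):
        if part in markers and i + 1 < len(parts):
            return parts[i + 1]
    return parts[-1] if parts else path


print(get_repo_name("/Users/dev/projects/myapp"))         # myapp
print(get_repo_name("/Users/dev/projects/myapp/"))        # myapp (trailing slash stripped)
print(get_repo_name("/home/user/work/my-project/tools"))  # tools (no marker, falls back)
print(get_repo_name("/home/user/work/my-project/tools",
                    markers=DEFAULT_MARKERS + ("work",)))  # my-project
```

The third call shows the fallback picking a subdirectory name when no marker matches, which is the display problem extra markers would address.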

2. Consider: Limiting language CASE statement for performance (queries.py:1362-1387)
The language detection uses a large SQL CASE statement. For very large databases, consider:

  • Adding a language column populated during ingest (denormalization)
  • Or caching the extension-to-language mapping in Python

This is minor since the query is limited by the time filter, but worth noting for future scale.

3. Test coverage: cmd_frequency with --no-expand
The CLI test for cmd_frequency (test_cli.py:196-208) doesn't test the new --no-expand flag. Consider adding:

def test_cmd_frequency_no_expand(self, populated_storage, capsys):
    class Args:
        json = False
        days = 7
        project = None
        no_expand = True  # New flag
    # ...

4. Shell script: Add error handling for CLI failures (scripts/global-report.sh)
The report script doesn't capture failures gracefully. If a command fails mid-report, you get a partial output:

"$CLI" mcp-usage --days "$DAYS" 2>&1 || echo "MCP usage unavailable"

🐛 Minor Issues

1. Unused limit parameter (queries.py:1316)
query_file_activity takes a limit parameter but the SQL query doesn't include LIMIT ?. The limit is applied in Python ([:limit]), which works but is less efficient for large result sets.

Consider:

-- Add LIMIT to SQL for efficiency if you know you'll sort by total anyway
-- Though the current approach works fine for typical use cases
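
As a rough illustration of pushing the limit into SQL, the sketch below uses an in-memory SQLite database. The `events` table, its columns, and the `make_demo_db` and `top_files` helpers are hypothetical stand-ins, not the project's schema or query.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory table; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (file_path TEXT, tool_name TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("a.py", "Read"), ("a.py", "Edit"), ("a.py", "Read"),
         ("b.py", "Read"), ("b.py", "Write"), ("c.py", "Read")],
    )
    return conn


def top_files(conn: sqlite3.Connection, limit: int):
    """Let SQLite sort and truncate instead of slicing the full result in Python."""
    return conn.execute(
        "SELECT file_path, COUNT(*) AS total FROM events "
        "GROUP BY file_path ORDER BY total DESC, file_path LIMIT ?",
        (limit,),
    ).fetchall()


print(top_files(make_demo_db(), 2))  # [('a.py', 3), ('b.py', 2)]
```

One caveat: if paths are merged in Python after the query, as `--collapse-worktrees` appears to do, a SQL-side LIMIT could drop rows before they are merged, so slicing in Python may remain the safer choice for that mode.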

2. CLI test fixture: no_expand attribute not set (test_cli.py:196-208)
The Args class in test_cmd_frequency doesn't have a no_expand attribute, but cmd_frequency accesses it via getattr(args, "no_expand", False). The test passes because of the default, but it's implicit. Making it explicit would be clearer.
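
The implicit default is easy to see in isolation. In this sketch the two `Args` classes and the `wants_expand` helper are hypothetical; only the `getattr(args, "no_expand", False)` call is taken from the review.

```python
class ArgsWithoutFlag:
    # Mirrors a test fixture that predates the new flag.
    json = False
    days = 7
    project = None


class ArgsWithFlag(ArgsWithoutFlag):
    # Explicit version: the fixture states which mode it exercises.
    no_expand = True


def wants_expand(args) -> bool:
    """A missing no_expand attribute silently means 'expand'."""
    return not getattr(args, "no_expand", False)


print(wants_expand(ArgsWithoutFlag()))  # True, via the getattr default
print(wants_expand(ArgsWithFlag()))     # False
```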


📊 Overall Assessment

Verdict: LGTM with minor suggestions

This is a solid PR that:

  • Adds genuinely useful analytics features
  • Follows established patterns consistently
  • Includes good test coverage
  • Updates documentation properly

The suggestions above are minor improvements rather than blockers. Nice work! 🎉


🤖 Generated with Claude Code

Full paths are more useful to callers than truncated display.
Changed format to two lines per file for readability.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Jan 1, 2026

PR Review: Add new analytics commands and expand flags

Nice work on this substantial feature addition! This PR adds valuable analytics capabilities and improves the CLI UX. Here's my detailed review:

Strengths

Well-structured additions:

  • New query functions (query_file_activity, query_languages, query_projects, query_mcp_usage) follow established patterns and use build_where_clause() consistently
  • Good test coverage for new functionality (279 new lines in test_queries.py)
  • Documentation updates in CLAUDE.md and guide.md maintain CLI/MCP parity
  • The expand flag system is well-thought-out with sensible defaults

Good UX improvements:

  • Descriptive headers on CLI output help users understand what each command shows
  • Token formatting with commas (:,) improves readability
  • The --collapse-worktrees flag is a nice touch for worktree users

Issues to Address

1. Missing CLI tests for new commands (Minor)

While test_queries.py has good coverage for the new query functions (TestQueryFileActivity, TestQueryLanguages, TestQueryProjects, TestQueryMcpUsage), I don't see corresponding tests for the new CLI commands (cmd_file_activity, cmd_languages, cmd_projects, cmd_mcp_usage) in test_cli.py. Consider adding tests for these to maintain consistency with existing CLI command tests.

2. Potential SQL performance consideration

In query_languages (lines 1359-1394), the CASE statement with many LIKE comparisons could be slow on large datasets. Consider:

  • Adding an index on file extension if this becomes a hot path
  • Or extracting extension in Python post-query

This is likely fine for typical usage, but worth monitoring.

3. global-report.sh hardcoded path

The script outputs to /tmp/session-analytics-report.md which is fine for Unix systems but won't work on Windows. Since this is macOS-focused (LaunchAgent), this is acceptable but could be noted in the script.

Suggestions

  1. Add CLI tests for new commands - Even simple smoke tests would help ensure parity

  2. Consider caching language mappings - The extension-to-language mapping in query_languages could be extracted to a constant for easier maintenance

  3. Minor: _format_projects table alignment - The format string uses fixed widths ({:<30}). A width specifier pads but does not truncate, so long project names would push the remaining columns out of alignment. Consider dynamic width based on content.
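
A content-based column width, as suggested in item 3, could be computed along these lines. The `format_projects` function, the `(name, events)` row shape, and the column headers are assumptions for the sketch and do not reflect the real `_format_projects` output.

```python
def format_projects(rows):
    """Render (project_name, event_count) rows with a name column sized to fit.

    The row shape and headers are hypothetical.
    """
    name_width = max([len("Project")] + [len(name) for name, _ in rows])
    lines = [f"{'Project':<{name_width}}  {'Events':>8}"]
    for name, events in rows:
        # Nested replacement field sets the width; ',' adds thousands separators.
        lines.append(f"{name:<{name_width}}  {events:>8,}")
    return "\n".join(lines)


print(format_projects([
    ("api", 1200),
    ("a-very-long-project-name-that-exceeds-thirty-chars", 5),
]))
```

Every line comes out the same length, so the count column stays aligned regardless of the longest name.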

Code Quality

  • ✅ Follows project conventions (formatter registry pattern, module imports)
  • ✅ Consistent naming (follows MCP API conventions from CLAUDE.md)
  • ✅ Good docstrings with Args/Returns documentation
  • ✅ Proper use of build_where_clause() for SQL construction
  • ✅ No security concerns (parameterized queries used correctly)
  • ✅ Defensive coding with getattr(args, "no_expand", False)

Summary

Approve with minor suggestions. This is a solid PR that adds useful analytics capabilities while maintaining consistency with the existing codebase. The main actionable item is adding CLI tests for the new commands to match the query test coverage.

The expand flag system is particularly well-designed - having different defaults for frequency (expanded by default) vs sequences (collapsed by default) based on their different use cases shows good UX thinking.


🤖 Generated with Claude Code

CLI formatters:
- Remove [:20] limits from tools, commands, tokens, sequences, gaps, journey, messages, sessions, commits
- Remove [:8] limit from breakdowns
- Remove message text truncation ([:57]...)
- Remove session_id truncation ([:16], [:12])
- Remove SHA truncation ([:8])

Query layer:
- Remove [:20] from parallel_periods
- Remove LIMIT 10 from analyze_trends top_tools

Full data is more useful to callers than truncated summaries.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Jan 1, 2026

PR Review: Add new analytics commands and expand flags

Overall, this is a well-structured PR that adds valuable new analytics capabilities while maintaining consistency with the existing codebase patterns. Here is my detailed feedback:

Strengths

1. Excellent CLI/MCP Parity
The PR properly maintains parity between CLI and MCP interfaces as required by CLAUDE.md. All four new query functions (query_file_activity, query_languages, query_projects, query_mcp_usage) are exposed in both cli.py and server.py.

2. Solid Test Coverage
The new tests in test_queries.py are comprehensive:

  • TestQueryFileActivity covers basic activity and worktree collapsing
  • TestQueryLanguages validates language detection from extensions
  • TestQueryProjects tests project activity aggregation
  • TestQueryMcpUsage verifies MCP tool parsing and server grouping

3. Consistent Design Patterns

  • Uses build_where_clause() helper consistently
  • Follows the formatter registry pattern in CLI
  • MCP tools call ensure_fresh_data() before queries

4. Good Documentation Updates
Both CLAUDE.md and guide.md are updated to reflect the new commands, and the expand flag behavior is well-documented.

Minor Suggestions

1. Worktree regex is correctly pre-compiled at module level
I noticed WORKTREE_PATTERN is correctly defined at module level in queries.py:1247. Good practice.

2. The query_projects function intentionally omits project filter
The docstring at queries.py:1420-1422 correctly explains why there is no project filter. This is well-documented and appropriate for the use case.

Potential Issues

1. MCP tool name parsing edge case
In query_mcp_usage at queries.py:1522-1528, the MCP tool name parsing splits on __. This correctly handles tools with __ in the name, but if a server name contains __, it would be split incorrectly. This seems unlikely in practice but worth noting. The current implementation is reasonable.

2. Language detection is hardcoded
The language mapping in query_languages uses a hardcoded CASE statement. This is fine for common languages but might miss some (e.g., .kt for Kotlin, .scala for Scala, .vue for Vue). Consider if this should be extensible in the future. For now, the "Other" fallback handles unlisted extensions gracefully.

Test Plan Verification

The PR description states all tests pass and manual testing was done for:

  • Each new command with real data
  • --collapse-worktrees consolidating worktree paths (512 to 364 files)
  • global-report.sh generating full report

The test coverage for the new functionality appears thorough.

Security Considerations

No security concerns. The new queries use parameterized SQL via build_where_clause() and do not introduce any new input vectors.

Summary

Approve - This PR is well-implemented with good test coverage, proper documentation updates, and follows the established patterns. The expand flag system is thoughtfully designed with different defaults for frequency vs sequences based on their primary use cases.

The only actionable suggestions are cosmetic improvements, which are optional.


Generated with Claude Code

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Jan 1, 2026

PR Review: Add new analytics commands and expand flags

Overall, this is a well-structured PR that adds valuable new analytics features while maintaining consistency with existing patterns. The code quality is high and follows the project's conventions well.

✅ Strengths

Code Quality

  • Excellent adherence to project conventions (CLI/MCP parity, formatter registry pattern, build_where_clause() helper)
  • Clean separation of concerns between queries, CLI, and server layers
  • Good documentation updates in CLAUDE.md and guide.md
  • Consistent use of the @_register_formatter decorator pattern

New Features

  • file-activity, languages, projects, mcp-usage commands are well-designed
  • The --collapse-worktrees flag is a nice touch for consolidating worktree paths
  • The expand flag system (--expand/--no-expand) is thoughtfully designed with sensible defaults
  • Descriptive headers on CLI output improve UX

Testing

  • Good test coverage for the new query functions in test_queries.py
  • Tests cover edge cases like worktree path collapsing

⚠️ Areas for Improvement

Missing CLI Tests for New Commands
The new CLI commands (cmd_file_activity, cmd_languages, cmd_projects, cmd_mcp_usage) are not imported or tested in tests/test_cli.py. Consider adding tests similar to the existing ones:

def test_cmd_file_activity(self, populated_storage, capsys):
    class Args:
        json = False
        days = 7
        project = None
        limit = 20
        collapse_worktrees = False

    with patch("session_analytics.cli.SQLiteStorage", return_value=populated_storage):
        cmd_file_activity(Args())

    captured = capsys.readouterr()
    assert "Files touched:" in captured.out

Minor Issues

  1. get_repo_name() in query_projects() (queries.py:1449-1458): This function is defined inside query_projects() but could be extracted to module level for reuse and testing. Also, the fallback behavior returns the last path component which may not be ideal for paths like /Users/dev/projects/myapp - it would return myapp which is correct, but for paths ending in / it might behave unexpectedly.

  2. Potential edge case in query_languages(): The SQL CASE statement handles many common extensions but falls back to 'Other' for anything unknown. Consider adding a note in the docstring that this is an approximation, not exhaustive.

  3. _format_signals change: The formatter now shows the full session_id instead of a truncated one (the [:16] slice was removed). This is fine, but be aware it may make the output wider.

  4. Global report script path handling (scripts/global-report.sh:15-19): The script tries the venv path relative to the script directory if the CLI isn't in PATH. This is good defensive code, but consider adding a message when using the venv fallback for debugging.

🔒 Security

No security concerns identified. The code properly uses parameterized queries and doesn't expose any sensitive operations.

📊 Performance

  • The new queries use appropriate WHERE clauses and GROUP BY operations
  • The query_file_activity aggregation in Python (lines 1296-1313) is reasonable for expected data sizes, though for very large datasets, doing the aggregation in SQL might be more efficient

💡 Suggestions (Optional)

  1. Consider adding --format options (table, csv) in future for the new commands
  2. The query_projects function intentionally has no project filter (as documented), but you might want to add a name filter in the future

Summary

This is a solid PR that adds useful analytics capabilities. The main recommendation is to add CLI tests for the new commands to match the existing test coverage pattern. The code is well-structured and ready to merge after addressing the test coverage gap.

@evansenter evansenter merged commit 8fbe882 into main Jan 1, 2026
3 checks passed
@evansenter evansenter deleted the feature/new-analytics-commands branch January 1, 2026 22:36