RFC #17 Phases 2-9: Advanced analytics features by evansenter · Pull Request #22 · evansenter/agent-session-analytics

evansenter · 2025-12-31T16:37:08Z

Summary

Implements the remaining phases from RFC #17:

Phase 2 - Failure Analysis: query_failure_correlation, query_common_failures
Phase 3 - Session Classification: classify_sessions, query_session_progression
Phase 4 - Trend Analysis: analyze_trends, compare_periods
Phase 5 - User Workflow: get_user_journey, find_related_sessions
Phase 6 - Git Integration: ingest_git_history, correlate_git_with_sessions, query_session_commits
Phases 7-9 - Integration: Enhanced get_insights(), CLI commands, MCP server tools

Also includes fixes from code review:

Proper exception logging (no silent swallowing) in get_insights()
Counters for skipped entries in git ingestion
85 new tests covering edge cases (169 total)

Test plan

All 169 tests pass
Format clean (ruff format --check)
Lint clean (ruff check)
Code review agents found no critical issues after fixes

Closes #17

🤖 Generated with Claude Code

Implement the remaining phases from RFC #17: **Phase 2 - Failure Analysis** - query_failure_correlation: Correlate tool failures with bash commands - query_common_failures: Aggregate failure patterns **Phase 3 - Session Classification** - classify_sessions: Heuristic classification (debugging, development, research, maintenance, mixed) - query_session_progression: Track session stage transitions **Phase 4 - Trend Analysis** - analyze_trends: Token usage trends over time with growth rates - compare_periods: Period-over-period comparisons **Phase 5 - User Workflow** - get_user_journey: Session summaries with tool chains - find_related_sessions: Find sessions with similar patterns **Phase 6 - Git Integration (Analytics)** - ingest_git_history: Parse and store git commits - correlate_git_with_sessions: Link commits to sessions by timing - query_session_commits: Get commits associated with sessions **Phase 7-9 - Integration** - Enhanced get_insights() with failure analysis, trends, classifications - CLI commands: failures, classify, trends, compare, journey, related, git-ingest, session-commits, enhanced insights - Server tools: All query functions exposed via MCP Also includes: - Exception logging (not silent swallowing) per code review - Counters for skipped entries in git ingestion - 85 new tests (169 total) covering edge cases 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2025-12-31T16:38:41Z

PR Review: RFC #17 Phases 2-9: Advanced analytics features

This is a substantial PR adding ~4000 lines across 9 files implementing advanced analytics features. Overall, the code is well-structured and follows the existing patterns in the codebase. Here's my detailed review:

Code Quality & Best Practices

Strengths:

Good separation of concerns between patterns.py (analytics/computation), queries.py (data retrieval), and server.py (API layer)
Consistent use of the formatter registry pattern in CLI for output formatting
Proper docstrings with Args/Returns documentation
Exception logging in get_insights() uses exc_info=True for proper stack traces
The @migration decorator pattern for schema changes is clean and maintainable

Suggestions:

CLI imports (cli.py:6-42): The import organization could be cleaner. Consider grouping related imports or using a single import block with aliases instead of multiple from ... import statements with parentheses.
Magic numbers: Some hardcoded values could benefit from being constants:
- patterns.py:560: Message truncation at 480 chars
- patterns.py:462: 5-minute buffer in correlation
- queries.py:886: Session minimum 5 events for classification

Potential Bugs or Issues

Git correlation timing (ingest.py:654-658): The 5-minute buffer after session end is added, but there's no buffer before session start. A commit made just before starting a session might be related work but won't be correlated.
Division by zero protection (patterns.py:494): Good handling with if sessions_with_errors > 0, but similar patterns in classify_sessions (queries.py:902-903) divide by total which is set to row["total_events"] or 1 - this is handled correctly.
Timestamp handling consistency: There's mixed handling of timestamps as strings vs datetime objects throughout (e.g., ingest.py:629-632, queries.py:574-577). While the defensive isinstance checks work, consider normalizing at the storage layer.
Session classification heuristics (queries.py:911-925): The classification thresholds (0.15 for debugging, 0.3 for development, etc.) are reasonable but undocumented. Consider adding comments explaining why these values were chosen or making them configurable.

Performance Considerations

N+1 query pattern in correlate_git_with_sessions (ingest.py:649-678): Each commit update is a separate execute_write() call. For large numbers of commits, consider batching updates:
```
# Instead of individual updates in loop
updates = [(session_id, sha) for ...]
storage.executemany("UPDATE git_commits SET session_id = ? WHERE sha = ?", updates)
```
Full table scans in sample_sequences (patterns.py:220-229): The query fetches all events then processes in Python. For very large datasets, consider:
- Adding a LIMIT to the initial query
- Using window functions in SQL to detect sequences
Memory usage in pattern detection: Functions like analyze_failures and sample_sequences load all events into memory. This is fine for personal analytics but could be problematic for very active users. Consider streaming or pagination.

Security Concerns

Git command injection (ingest.py:514-525): The since_date is constructed from datetime objects and formatted as %Y-%m-%d, which is safe. The repo_path is passed as cwd rather than interpolated into the command. This looks secure.
SQL injection protection: All queries use parameterized statements, which is correct.
User message storage (ingest.py:220): User messages are truncated to 2000 chars (USER_MESSAGE_MAX_LENGTH). Consider whether sensitive data in user messages should be stored at all, or if additional sanitization is needed.

Test Coverage

The PR adds 85 new tests (169 total), which is excellent. The tests cover:

Edge cases for empty data
Error conditions
Various parameter combinations

Test gaps I noticed:

No tests for the git correlation error handling path (ingest.py:670-677)
No tests for the compare_to="same_last_month" branch in analyze_trends
No integration tests between git ingestion and session correlation

Additional Observations

Good practice: The skipped_malformed and skipped_date_parse counters in git ingestion provide useful debugging info.
Potential enhancement: The classify_sessions function uses simple heuristics. Consider adding a confidence score based on how dominant the classification is.
CLI completeness: All new MCP tools have corresponding CLI commands, which maintains good parity between interfaces.

Summary

This is a well-implemented feature set that follows the existing codebase patterns. The main areas for improvement are:

Performance optimizations for the correlation loop
Additional test coverage for error paths
Minor code organization in imports

The code is production-ready with the understanding that it's designed for personal analytics rather than high-volume usage.

Recommendation: Approve with minor suggestions. The code quality is high, tests are comprehensive, and the implementation follows established patterns.

- Add symmetric pre-session buffer (5 min before/after session) - Batch git correlation updates with executemany for performance - Document classification threshold heuristics in comments - Add tests for: - Pre-session buffer (commits before session start) - Batch correlation error handling - Enhanced same_last_month comparison test 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2025-12-31T16:56:30Z

Code Review: RFC #17 Phases 2-9 - Advanced Analytics Features

Thank you for this comprehensive PR implementing the remaining phases of RFC #17. This is a substantial addition (~6k lines) adding failure analysis, session classification, trend analysis, user workflow tracking, and git integration. Here's my detailed review:

Overall Assessment

Quality: Good - well-structured code following existing patterns
Test Coverage: Excellent - 85 new tests covering edge cases
Documentation: Good - clear docstrings and helpful comments

Strengths

Consistent Architecture: New functions follow established patterns (build_where_clause, formatter registry, storage API usage).
Proper Exception Handling: The get_insights() function now properly logs exceptions (exc_info=True) rather than silently swallowing them:
```
except Exception as e:
    logger.warning("Failed to analyze trends in get_insights: %s", e, exc_info=True)
```
Git Ingestion Counters: Good addition of skipped_malformed and skipped_date_parse counters for debugging ingestion issues.
Defensive Timestamp Handling: Consistent handling of both datetime objects and ISO strings throughout:
```
if isinstance(timestamp, str):
    timestamp = datetime.fromisoformat(timestamp)
```
Batch Operations: The correlate_git_with_sessions uses batch updates via executemany() for efficiency.

Issues and Suggestions

1. Security: Command Injection Risk in Git Ingestion (Medium Priority)

In ingest.py:ingest_git_history(), the since_date is formatted but not sanitized:

since_date = (datetime.now() - timedelta(days=days)).strftime("%Y-%m-%d")
result = subprocess.run([
    "git", "log", f"--since={since_date}", ...
])

While days is an integer (safe here), the pattern of embedding variables in git commands should be reviewed. Consider validating repo_path more strictly:

if not (repo_path / ".git").is_dir():  # Not just exists()
    return {"error": f"Not a git repository: {repo_path}", ...}

2. Potential SQL Injection via f-strings (Low Priority - Verified Safe)

Files like queries.py:classify_sessions() use f-strings for SQL:

WHERE {where_clause}

I verified these are built from hardcoded strings via build_where_clause(), making them safe. The existing comments like # Safe: where_clause is built from hardcoded condition strings are helpful - consider adding similar comments in classify_sessions() for consistency.

3. Missing Input Validation (Low Priority)

sample_sequences() in patterns.py accepts user-provided pattern string without length/character validation:

if " → " in pattern:
    target_tools = [t.strip() for t in pattern.split(" → ")]
else:
    target_tools = [t.strip() for t in pattern.split(",")]

Consider adding validation to prevent extremely long patterns or unusual characters that could cause issues.

4. Random Sampling Non-Determinism (Minor)

In patterns.py:sample_sequences():

samples = random.sample(occurrences, count)

For debugging/reproducibility, consider adding an optional seed parameter, or document that results are non-deterministic.

5. Hardcoded Classification Thresholds (Suggestion)

In queries.py:classify_sessions(), the classification thresholds are inline:

if error_pct > 0.15 or (row["error_count"] or 0) > 5:
    category = "debugging"

Consider extracting these as module-level constants with documentation explaining the rationale (the comment on lines 910-916 is good but constants would be more maintainable):

DEBUGGING_ERROR_THRESHOLD_PCT = 0.15
DEBUGGING_ERROR_THRESHOLD_ABS = 5

6. Potential Performance Issue in `find_related_sessions`

The find_related_sessions() function queries all files/commands for a session, then does an IN clause with dynamic placeholders:

placeholders = ",".join("?" * len(file_paths))

For sessions with many files, this could create large IN clauses. Consider adding a limit or using a temporary table for very large lists.

Test Coverage Observations

The 85 new tests look comprehensive, covering:

Edge cases (empty data, missing files)
Various parameter combinations
Error handling paths
Integration between modules

One suggestion: Add a test for ingest_git_history when git log returns malformed output (partial line, missing delimiter).

Minor Nits

.parallel-context.md: Should this file be committed? It appears to be a session-specific context file that might not belong in the repo.
Import organization in cli.py: The new imports use multiple from session_analytics.patterns import (...) blocks - could be consolidated for readability.
Line length in comments: Some classification heuristic comments are quite long (lines 910-916 in queries.py). Consider breaking into multiple lines.

Summary

This is a well-implemented PR that significantly extends the analytics capabilities. The code follows existing patterns, has good test coverage, and properly handles edge cases. The issues identified are mostly minor or suggestions for improvement rather than blockers.

Recommendation: Approve with minor suggestions

🤖 Review generated with Claude Code

- Use .is_dir() instead of .exists() for git repo validation - Add SQL safety comment in classify_sessions for consistency - Add input validation in sample_sequences (length + char validation) - Remove .parallel-context.md from repo, add to .gitignore - Add test for malformed git log output handling 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2025-12-31T17:54:38Z

PR Review: RFC #17 Phases 2-9: Advanced Analytics Features

This is a substantial PR implementing the remaining phases of RFC #17, adding advanced analytics capabilities. Here's my detailed review:

✅ Strengths

1. Well-Structured Architecture

Clean separation of concerns between queries.py (data queries), patterns.py (pattern detection), and ingest.py (data ingestion)
Consistent use of the formatter registry pattern in CLI
Good use of the build_where_clause() helper for SQL query building

2. Good Test Coverage

85 new tests bringing the total to 169
Tests cover edge cases like empty results, invalid inputs, and boundary conditions
Good fixture usage for test isolation

3. Defensive Programming

Pattern validation in sample_sequences() (length limits, alphanumeric validation)
Proper exception logging in get_insights() instead of silent swallowing
Counters for skipped entries in git ingestion (skipped_malformed, skipped_date_parse)

4. Documentation

Clear docstrings on all new functions
Helpful inline comments explaining classification heuristics (e.g., lines 911-917 in queries.py)

⚠️ Suggestions for Improvement

1. Potential SQL Injection Vector (Low Risk)
In queries.py:678-694, the find_related_sessions() function builds a dynamic IN clause:

placeholders = ",".join("?" * len(file_paths))
# ...
WHERE ... AND file_path IN ({placeholders})

While the placeholders themselves are safe, the f-string interpolation pattern could be error-prone if modified in the future. Consider using a helper function or adding a comment noting the safety.

2. Unbounded Memory in Pattern Detection
In patterns.py:250-305, sample_sequences() loads all matching events into memory before filtering. For very active users with large histories, this could be memory-intensive. Consider:

Adding a LIMIT clause for safety
Or processing in batches by session

3. Timestamp Handling Inconsistency
There's repeated defensive code for handling timestamps that may be strings or datetimes:

if isinstance(start, str):
    start = datetime.fromisoformat(start)

This appears in multiple places (queries.py:574-577, ingest.py:629-632, etc.). Consider creating a utility function like ensure_datetime(val) to centralize this logic.

4. Magic Numbers in Classification Heuristics
In classify_sessions() (lines 918-932), the thresholds are hardcoded:

if error_pct > 0.15 or (row["error_count"] or 0) > 5:
    category = "debugging"

While the comments explain the rationale, consider making these configurable or at least extracting them as named constants for easier tuning.

5. Git Correlation Window
In correlate_git_with_sessions(), the 5-minute buffer is hardcoded:

buffer = timedelta(minutes=5)

This might be worth exposing as a parameter for different workflows (e.g., longer buffer for CI-heavy workflows).

🐛 Minor Issues

1. Potential Division by Zero
In patterns.py:513:

avg_errors_per_session = total_errors / sessions_with_errors if sessions_with_errors > 0 else 0

This is correctly handled, but analyze_trends() line 864 does similar without the guard:

"error_rate": errors / total_events if total_events > 0 else 0,

Both are correct, but the inconsistent pattern could lead to future bugs. Consider a helper.

2. CLI Formatter Predicates Could Collide
Multiple formatters use similar predicates like checking for specific keys. If two formatters have overlapping predicates, the first match wins. Consider adding more specific checks or documenting the precedence.

🔒 Security Considerations

Git subprocess call in ingest_git_history() properly uses capture_output=True, timeout=30, and passes arguments as a list rather than a shell string - good practice.
No user input directly in SQL - all queries use parameterized statements correctly.

📈 Performance Considerations

The FTS5 full-text search index (events_fts) is a good addition for efficient text search
Batch operations (add_git_commits_batch, executemany) are used appropriately
Consider adding pagination to get_user_journey() for very active users

Summary

This is a well-implemented PR that adds significant functionality. The code quality is high, with good documentation, test coverage, and defensive programming practices. The suggestions above are enhancements rather than blockers.

Recommendation: Approve with minor suggestions

🤖 Generated with Claude Code

evansenter merged commit 7968220 into main Dec 31, 2025
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC #17 Phases 2-9: Advanced analytics features#22

RFC #17 Phases 2-9: Advanced analytics features#22
evansenter merged 3 commits into
mainfrom
rfc20-remaining-phases

evansenter commented Dec 31, 2025

Uh oh!

claude Bot commented Dec 31, 2025

Uh oh!

claude Bot commented Dec 31, 2025

Uh oh!

claude Bot commented Dec 31, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

evansenter commented Dec 31, 2025

Summary

Test plan

Uh oh!

claude Bot commented Dec 31, 2025

PR Review: RFC #17 Phases 2-9: Advanced analytics features

Code Quality & Best Practices

Potential Bugs or Issues

Performance Considerations

Security Concerns

Test Coverage

Additional Observations

Summary

Uh oh!

claude Bot commented Dec 31, 2025

Code Review: RFC #17 Phases 2-9 - Advanced Analytics Features

Overall Assessment

Strengths

Issues and Suggestions

1. Security: Command Injection Risk in Git Ingestion (Medium Priority)

2. Potential SQL Injection via f-strings (Low Priority - Verified Safe)

3. Missing Input Validation (Low Priority)

4. Random Sampling Non-Determinism (Minor)

5. Hardcoded Classification Thresholds (Suggestion)

6. Potential Performance Issue in find_related_sessions

Test Coverage Observations

Minor Nits

Summary

Uh oh!

claude Bot commented Dec 31, 2025

PR Review: RFC #17 Phases 2-9: Advanced Analytics Features

✅ Strengths

⚠️ Suggestions for Improvement

🐛 Minor Issues

🔒 Security Considerations

📈 Performance Considerations

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

6. Potential Performance Issue in `find_related_sessions`