feat: Close drill-down gaps in MCP API (RFC #49) by evansenter · Pull Request #52 · evansenter/agent-session-analytics

evansenter · 2026-01-04T04:17:20Z

Summary

Self-play testing revealed that LLMs could see aggregate counts but couldn't drill down to actionable specifics. This PR closes those gaps with 4 targeted fixes:

error_examples in analyze_failures: When errors_by_tool shows "Bash: 5 errors", error_examples now reveals WHICH commands failed (top 5 per tool with counts)
classification_factors in classify_sessions: Sessions include trigger threshold and metrics explaining WHY categorized
fnmatch for permission_gaps: Glob patterns like Bash(make*) now work correctly
Clearer sample_sequences error: Fixed misleading message about underscores

Test plan

All 293 tests pass
New tests for error_examples in test_patterns.py
New tests for classification_factors in test_queries.py
Existing fnmatch tests verify glob pattern matching
make check passes (fmt, lint, test)

Closes #45, #46, #48, #49
Supersedes #47 (was already working via tool_id joins)

🤖 Generated with Claude Code

Self-play testing revealed that LLMs could see aggregate counts but couldn't drill down to actionable specifics. This closes those gaps: - **error_examples in analyze_failures**: When errors_by_tool shows "Bash: 5 errors", error_examples now reveals WHICH commands failed (top 5 per tool with counts) - **classification_factors in classify_sessions**: Sessions now include the trigger threshold and relevant metrics explaining WHY they were categorized (e.g., "error_rate > 15%", error_rate: 33.2%) - **fnmatch for permission_gaps**: Patterns like Bash(make*) now correctly match commands using glob patterns, not just exact matches - **Clearer sample_sequences error**: Message now says "must be alphanumeric or underscores" (was misleading about underscores) Closes #45, #46, #48, #49 Supersedes #47 (was already working via tool_id joins) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-04T04:18:42Z

Prompt: evansenter/dotfiles/.../claude-review.md

Code Review

Summary

This PR closes drill-down gaps in the MCP API by adding error_examples to analyze_failures, classification_factors to classify_sessions, fnmatch-based glob pattern matching for permission gaps, and a clearer error message for invalid tool names in sample_sequences. These changes address issues #45, #46, #48, and #49.

Issues Found

Critical

None

Important

None

Suggestions

src/session_analytics/patterns.py:660 - The base command extraction for glob patterns uses rstrip("*").rstrip() which may not handle all edge cases correctly. For example, a pattern like ./scripts/*.sh would have base extracted as ./scripts/ (with trailing slash) after rstripping *, then ./scripts after rstripping whitespace. The split()[0] on line 663 would then give ./scripts/ or ./scripts depending on presence of spaces. Consider whether patterns like ./scripts/*.sh should add the base command to base_commands at all, since this is a path pattern not a command pattern.
src/session_analytics/patterns.py:656 - For standard :* patterns, the base command is added to both base_commands AND glob_patterns. This creates redundancy since _command_matches_patterns first checks base_commands membership, making the fnmatch check on the same value unnecessary. Consider only adding to one collection for these simple cases.
tests/test_patterns.py:577 - The test uses timedelta(hours=1, minutes=-1) which is equivalent to 59 minutes - this works but is confusing. Consider using timedelta(minutes=59) or timedelta(hours=1) - timedelta(minutes=1) for clarity.

Verdict

REQUEST_CHANGES - The implementation is solid and addresses the linked issues well. The suggestions above are minor improvements for code clarity and edge case handling.

Automated review by Claude Code

Address reviewer feedback: use timedelta(minutes=59) instead of timedelta(hours=1, minutes=-1) for clarity in test timestamps. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

evansenter · 2026-01-04T04:28:36Z

Feedback Addressed

Implemented

[Suggestion] tests/test_patterns.py:577 - Clarified confusing timedelta(hours=1, minutes=-1) to timedelta(minutes=59) with comments explaining intent

Skipped

[Suggestion] patterns.py:656 - Redundancy in collections - Intentional design for O(1) lookup with fnmatch fallback
[Suggestion] patterns.py:660 - Path pattern edge case - fnmatch fallback handles it correctly; rare edge case not worth added complexity

claude · 2026-01-04T04:29:48Z

Prompt: evansenter/dotfiles/.../claude-review.md

Code Review

Summary

This PR closes drill-down gaps in the MCP API by adding error_examples to analyze_failures, classification_factors to classify_sessions, fnmatch-based glob pattern matching for permission gaps, and a clearer error message in sample_sequences. The changes address issues #45, #46, #48, and #49.

Issues Found

Critical

None

Important

None

Suggestions

None

Previous review feedback was properly addressed:

Implemented: The confusing timedelta(hours=1, minutes=-1) was clarified to timedelta(minutes=59) with explanatory comments
Skipped (with valid justifications): Redundancy in collections (intentional O(1) lookup design) and path pattern edge case (fnmatch fallback handles correctly)

Verdict

APPROVE - The implementation is solid, tests are comprehensive, documentation is updated, and all previous feedback has been appropriately addressed. The PR successfully closes the drill-down gaps identified in RFC #49.

Automated review by Claude Code

evansenter merged commit 81e3a7d into main Jan 4, 2026
3 checks passed

evansenter deleted the fix/rfc-49-drill-down-gaps branch January 4, 2026 04:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Close drill-down gaps in MCP API (RFC #49)#52

feat: Close drill-down gaps in MCP API (RFC #49)#52
evansenter merged 2 commits into
mainfrom
fix/rfc-49-drill-down-gaps

evansenter commented Jan 4, 2026

Uh oh!

claude Bot commented Jan 4, 2026

Uh oh!

evansenter commented Jan 4, 2026

Uh oh!

claude Bot commented Jan 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

evansenter commented Jan 4, 2026

Summary

Test plan

Uh oh!

claude Bot commented Jan 4, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Verdict

Uh oh!

evansenter commented Jan 4, 2026

Feedback Addressed

Implemented

Skipped

Uh oh!

claude Bot commented Jan 4, 2026

Code Review

Summary

Issues Found

Critical

Important

Suggestions

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant