Skip to content

feat: Close drill-down gaps in MCP API (RFC #49)#52

Merged
evansenter merged 2 commits into
mainfrom
fix/rfc-49-drill-down-gaps
Jan 4, 2026
Merged

feat: Close drill-down gaps in MCP API (RFC #49)#52
evansenter merged 2 commits into
mainfrom
fix/rfc-49-drill-down-gaps

Conversation

@evansenter

Copy link
Copy Markdown
Owner

Summary

Self-play testing revealed that LLMs could see aggregate counts but couldn't drill down to actionable specifics. This PR closes those gaps with 4 targeted fixes:

  • error_examples in analyze_failures: When errors_by_tool shows "Bash: 5 errors", error_examples now reveals WHICH commands failed (top 5 per tool with counts)
  • classification_factors in classify_sessions: Sessions include trigger threshold and metrics explaining WHY categorized
  • fnmatch for permission_gaps: Glob patterns like Bash(make*) now work correctly
  • Clearer sample_sequences error: Fixed misleading message about underscores

Test plan

  • All 293 tests pass
  • New tests for error_examples in test_patterns.py
  • New tests for classification_factors in test_queries.py
  • Existing fnmatch tests verify glob pattern matching
  • make check passes (fmt, lint, test)

Closes #45, #46, #48, #49
Supersedes #47 (was already working via tool_id joins)

🤖 Generated with Claude Code

Self-play testing revealed that LLMs could see aggregate counts but
couldn't drill down to actionable specifics. This closes those gaps:

- **error_examples in analyze_failures**: When errors_by_tool shows
  "Bash: 5 errors", error_examples now reveals WHICH commands failed
  (top 5 per tool with counts)

- **classification_factors in classify_sessions**: Sessions now include
  the trigger threshold and relevant metrics explaining WHY they were
  categorized (e.g., "error_rate > 15%", error_rate: 33.2%)

- **fnmatch for permission_gaps**: Patterns like Bash(make*) now
  correctly match commands using glob patterns, not just exact matches

- **Clearer sample_sequences error**: Message now says "must be
  alphanumeric or underscores" (was misleading about underscores)

Closes #45, #46, #48, #49
Supersedes #47 (was already working via tool_id joins)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Jan 4, 2026

Copy link
Copy Markdown

Prompt: evansenter/dotfiles/.../claude-review.md

Code Review

Summary

This PR closes drill-down gaps in the MCP API by adding error_examples to analyze_failures, classification_factors to classify_sessions, fnmatch-based glob pattern matching for permission gaps, and a clearer error message for invalid tool names in sample_sequences. These changes address issues #45, #46, #48, and #49.

Issues Found

Critical

None

Important

None

Suggestions

  • src/session_analytics/patterns.py:660 - The base command extraction for glob patterns uses rstrip("*").rstrip() which may not handle all edge cases correctly. For example, a pattern like ./scripts/*.sh would have base extracted as ./scripts/ (with trailing slash) after rstripping *, then ./scripts after rstripping whitespace. The split()[0] on line 663 would then give ./scripts/ or ./scripts depending on presence of spaces. Consider whether patterns like ./scripts/*.sh should add the base command to base_commands at all, since this is a path pattern not a command pattern.

  • src/session_analytics/patterns.py:656 - For standard :* patterns, the base command is added to both base_commands AND glob_patterns. This creates redundancy since _command_matches_patterns first checks base_commands membership, making the fnmatch check on the same value unnecessary. Consider only adding to one collection for these simple cases.

  • tests/test_patterns.py:577 - The test uses timedelta(hours=1, minutes=-1) which is equivalent to 59 minutes - this works but is confusing. Consider using timedelta(minutes=59) or timedelta(hours=1) - timedelta(minutes=1) for clarity.

Verdict

REQUEST_CHANGES - The implementation is solid and addresses the linked issues well. The suggestions above are minor improvements for code clarity and edge case handling.


Automated review by Claude Code

Address reviewer feedback: use timedelta(minutes=59) instead of
timedelta(hours=1, minutes=-1) for clarity in test timestamps.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@evansenter

Copy link
Copy Markdown
Owner Author

Feedback Addressed

Implemented

  • [Suggestion] tests/test_patterns.py:577 - Clarified confusing timedelta(hours=1, minutes=-1) to timedelta(minutes=59) with comments explaining intent

Skipped

  • [Suggestion] patterns.py:656 - Redundancy in collections - Intentional design for O(1) lookup with fnmatch fallback
  • [Suggestion] patterns.py:660 - Path pattern edge case - fnmatch fallback handles it correctly; rare edge case not worth added complexity

@claude

claude Bot commented Jan 4, 2026

Copy link
Copy Markdown

Prompt: evansenter/dotfiles/.../claude-review.md

Code Review

Summary

This PR closes drill-down gaps in the MCP API by adding error_examples to analyze_failures, classification_factors to classify_sessions, fnmatch-based glob pattern matching for permission gaps, and a clearer error message in sample_sequences. The changes address issues #45, #46, #48, and #49.

Issues Found

Critical

None

Important

None

Suggestions

None

Previous review feedback was properly addressed:

  • Implemented: The confusing timedelta(hours=1, minutes=-1) was clarified to timedelta(minutes=59) with explanatory comments
  • Skipped (with valid justifications): Redundancy in collections (intentional O(1) lookup design) and path pattern edge case (fnmatch fallback handles correctly)

Verdict

APPROVE - The implementation is solid, tests are comprehensive, documentation is updated, and all previous feedback has been appropriately addressed. The PR successfully closes the drill-down gaps identified in RFC #49.


Automated review by Claude Code

@evansenter evansenter merged commit 81e3a7d into main Jan 4, 2026
3 checks passed
@evansenter evansenter deleted the fix/rfc-49-drill-down-gaps branch January 4, 2026 04:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add error examples to analyze_failures output

1 participant