feat: Data quality fixes + multi-project git ingestion #76
Fixes several data quality issues discovered via new smoke tests:

- Fix warmup events incorrectly marked as errors (Issue #75)
  - 83% of "errors" (8,046/9,663) were warmup Task tool exits
  - Updated ingest.py to not mark warmup as errors
  - Added migration 12 to backfill existing data
- Fix compaction detection finding 0 entries
  - Detection was looking at 'summary' entries but markers appear in 'user' entries
  - Added migrations 10, 11 to backfill existing data
- Add multi-project git ingestion (git-ingest-all)
  - New decode_project_path() handles hyphenated directory names
  - Scans all known projects from events table
  - Now finding 5 repos/247 commits (up from 2 repos/132 commits)
- Add smoke test suite (tests/test_smoke_real_data.py)
  - 10 tests validating assumptions against real database
  - Run with SESSION_ANALYTICS_SMOKE_TEST=1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
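To illustrate why hyphenated directory names need special handling, here is a minimal sketch. It assumes that project directories are encoded by replacing each path separator with `-` (so `-home-me-my-repo` could mean `/home/me/my-repo` or `/home/me/my/repo`); that encoding is an assumption here, not something this PR states. The function name `decode_project_path_sketch` and its `is_dir` and `root` parameters are hypothetical and are not the PR's actual `decode_project_path()`.

```python
import os
from typing import Callable, Optional


def decode_project_path_sketch(
    encoded: str,
    is_dir: Callable[[str], bool] = os.path.isdir,
    root: str = os.sep,
) -> Optional[str]:
    """Resolve an encoded name such as '-home-me-my-repo' to a directory.

    Each '-' may be a path separator or a literal hyphen, so candidate
    splits are tried depth-first and checked with is_dir (assumed encoding).
    """
    tokens = encoded.lstrip("-").split("-")

    def walk(base: str, index: int) -> Optional[str]:
        if index == len(tokens):
            return base
        # Try a single token as the next component first, then join
        # progressively more tokens with literal hyphens.
        for end in range(index + 1, len(tokens) + 1):
            component = "-".join(tokens[index:end])
            candidate = os.path.join(base, component)
            if is_dir(candidate):
                resolved = walk(candidate, end)
                if resolved is not None:
                    return resolved
        # No split of the remaining tokens maps to an existing directory.
        return None

    return walk(root, 0)
```

Passing `is_dir` explicitly keeps the sketch testable without touching the real filesystem; returning `None` on failure lets a caller count decode failures instead of raising.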
Code Review

Summary

This PR fixes data quality issues by correcting warmup error inflation (Issue #75), fixing compaction detection to look at user entries, and adds multi-project git ingestion with a new

Issues Found

Critical

None

Important

Suggestions

Verdict

REQUEST_CHANGES - Missing documentation in guide.md for the new MCP tool (required per CLAUDE.md), and the new

Automated review by Claude Code
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Code Review

Summary

This PR fixes data quality issues (warmup error inflation per Issue #75, compaction detection looking at wrong entry type), adds multi-project git ingestion via

Issues Found

Critical

None

Important

Suggestions

Verdict

REQUEST_CHANGES - Missing documentation in guide.md for the new MCP tool (required per CLAUDE.md), and unit tests needed for the

Automated review by Claude Code
- Add ingest_git_history_all_projects to guide.md Git Integration section
- Add 5 unit tests for ingest_git_history_all_projects covering:
  - Empty project list
  - Projects without .git
  - Projects with .git
  - Decode failures
  - Result structure
- Update benchmark skipped tools comment
- Improve docstring with detailed return value documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
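The test cases listed above (empty list, projects with and without `.git`, result structure) can be pictured with a small sketch. It only partitions candidate directories by whether they contain a `.git` entry; the function name `scan_projects_sketch` and the result keys `repos`, `skipped`, and `repos_found` are hypothetical and do not reflect the actual return value of `ingest_git_history_all_projects`.

```python
from pathlib import Path


def scan_projects_sketch(project_dirs):
    """Partition candidate project directories by whether they contain .git.

    The result keys are illustrative only, not the real tool's schema.
    """
    result = {"repos": [], "skipped": []}
    for raw in project_dirs:
        path = Path(raw)
        if (path / ".git").exists():
            # A .git entry marks this directory as a repository to ingest.
            result["repos"].append(str(path))
        else:
            result["skipped"].append(str(path))
    result["repos_found"] = len(result["repos"])
    return result
```

An empty input yields empty lists and a count of zero, which is the shape an "empty project list" test would check.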
Feedback Addressed

Implemented
Code Review

Summary

This PR fixes data quality issues (warmup error inflation per Issue #75, compaction detection looking at wrong entry type), adds multi-project git ingestion via

Issues Found

Critical

None

Important

None

Suggestions

None

All previously raised issues have been addressed in the "Feedback Addressed" comment:

Verdict

APPROVE - All feedback has been addressed. The implementation is thorough with proper test coverage, documentation, and migration backfills.

Automated review by Claude Code
- Add Phase 0 to ingest logs, git commits from all projects, and correlate
- Use ingest_git_history_all_projects() for cross-repo git correlation
- Add get_error_details() call to drill into specific failing patterns
- Note that warmup events no longer count as errors (fixed in session-analytics)

Ref: evansenter/agent-session-analytics#76

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ils (#196)

- Add Phase 0 to ingest logs, git commits from all projects, and correlate
- Use ingest_git_history_all_projects() for cross-repo git correlation
- Add get_error_details() call to drill into specific failing patterns
- Note that warmup events no longer count as errors (fixed in session-analytics)

Ref: evansenter/agent-session-analytics#76

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Resolve conflicts after main updated with PR #76 changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
git-ingest-all command scans all known projects (5 repos, 247 commits vs previous 2 repos, 132 commits)

Test plan
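As a rough illustration of how tests against a real database can stay out of ordinary runs, the sketch below gates a test class on the `SESSION_ANALYTICS_SMOKE_TEST=1` variable named in the PR description. It uses the standard-library `unittest` to stay dependency-free; the helper `smoke_tests_enabled`, the class name, and the placeholder test body are hypothetical and are not taken from tests/test_smoke_real_data.py.

```python
import os
import unittest


def smoke_tests_enabled(environ=os.environ) -> bool:
    """Return True only when the opt-in variable is set to exactly '1'."""
    return environ.get("SESSION_ANALYTICS_SMOKE_TEST") == "1"


@unittest.skipUnless(
    smoke_tests_enabled(), "set SESSION_ANALYTICS_SMOKE_TEST=1 to run"
)
class RealDataSmokeSketch(unittest.TestCase):
    def test_opt_in_flag_is_set(self):
        # Placeholder: a real smoke test would query the local database here.
        self.assertTrue(smoke_tests_enabled())
```

With the variable unset, the whole class is reported as skipped, so the default test run needs no real data.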
🤖 Generated with Claude Code