
feat: Add agent tracking and fix token deduplication (RFC #41) #42

Merged: evansenter merged 2 commits into main from rfc-41-agent-tracking on Jan 4, 2026

@evansenter

Owner

Summary

Closes #41

  • Fix token duplication: Each tool_use in a multi-tool message received the message's full token count, inflating totals 2-3x. Tokens are now stored only on assistant events, and tool_use events link to them via parent_uuid
  • Add agent tracking: Extract agentId from agent-*.jsonl files to distinguish Task subagent work from main session
  • Capture metadata: Add parent_uuid, agent_id, is_sidechain, version fields to events table
  • New API: get_agent_activity MCP tool and agents CLI command for agent vs main session breakdown
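
The token fix in the first bullet can be sketched roughly as follows. This is an illustration only, not the code in ingest.py: the `Event` dataclass, the `split_assistant_message` helper, the child-id scheme, and the entry layout (`uuid`, `message.usage`, `message.content`) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    # Hypothetical, simplified event record (not the project's Event class).
    uuid: str
    kind: str  # "assistant" or "tool_use"
    tool_name: Optional[str] = None
    parent_uuid: Optional[str] = None
    input_tokens: int = 0
    output_tokens: int = 0


def split_assistant_message(entry: dict) -> list[Event]:
    """Build one token-bearing assistant event plus token-free tool_use children."""
    message = entry.get("message", {})
    usage = message.get("usage", {})
    parent = Event(
        uuid=entry["uuid"],
        kind="assistant",
        input_tokens=usage.get("input_tokens", 0),
        output_tokens=usage.get("output_tokens", 0),
    )
    events = [parent]
    for index, block in enumerate(message.get("content", [])):
        if block.get("type") != "tool_use":
            continue
        events.append(
            Event(
                uuid=f"{entry['uuid']}:{index}",  # assumed child-id scheme
                kind="tool_use",
                tool_name=block.get("name"),
                parent_uuid=parent.uuid,  # link back instead of copying tokens
            )
        )
    return events


# Assumed entry shape: one assistant message that calls two tools.
entry = {
    "uuid": "msg-1",
    "message": {
        "usage": {"input_tokens": 100, "output_tokens": 50},
        "content": [
            {"type": "text", "text": "Checking two files."},
            {"type": "tool_use", "name": "Read"},
            {"type": "tool_use", "name": "Grep"},
        ],
    },
}
events = split_assistant_message(entry)
total = sum(e.input_tokens + e.output_tokens for e in events)
# total is 150: the usage is counted once, not once per tool call.
```

Summing tokens over all events now gives the message's real usage, while each tool_use row can still be joined to its parent through parent_uuid.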

Test plan

  • All 282 tests pass, including the new RFC #41 test class with 8 tests
  • Token deduplication verified (tokens only on assistant events)
  • Agent fields propagate correctly to all event types
  • Migration v5 works for both fresh and existing databases
  • CLI agents command displays formatted output
  • MCP get_agent_activity tool returns structured data

🤖 Generated with Claude Code

Closes #41

## Problem
- Token duplication: Each tool_use in multi-tool messages got full token count, inflating totals 2-3x
- No agent tracking: Task subagent work was indistinguishable from main session
- Missing metadata: parentUuid, isSidechain, version fields not captured

## Solution
- **Event hierarchy**: Assistant events now store tokens; tool_use events link via parent_uuid with no tokens
- **Agent tracking**: Extract agentId from agent-*.jsonl files to distinguish subagent work
- **New fields**: parent_uuid, agent_id, is_sidechain, version added to events table
- **New API**: get_agent_activity MCP tool and CLI command for agent vs main session breakdown
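
One way the agent tracking step could look, as a hedged sketch. The PR does not say whether the ID comes from an `agentId` field inside each entry or from the filename, so this example assumes the field is preferred and the `agent-<id>.jsonl` filename is the fallback. The `resolve_agent_id` helper and the sample paths are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Assumed naming scheme: agent-<id>.jsonl (only the "agent-*.jsonl" glob is from the PR).
AGENT_FILE = re.compile(r"^agent-(?P<agent_id>.+)\.jsonl$")


def resolve_agent_id(path: str, entry: dict) -> Optional[str]:
    """Return the agent ID for an entry, or None for main-session work."""
    if entry.get("agentId"):  # assumed field name inside the JSONL entry
        return entry["agentId"]
    match = AGENT_FILE.match(PurePosixPath(path).name)
    return match.group("agent_id") if match else None


from_field = resolve_agent_id("logs/agent-a1b2c3.jsonl", {"agentId": "a1b2c3"})
from_name = resolve_agent_id("logs/agent-a1b2c3.jsonl", {})
main_session = resolve_agent_id("logs/session.jsonl", {})
```

Returning None for the main session keeps "agent vs main" a simple null check on agent_id in later queries.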

## Changes
- storage.py: Schema v5 migration, 4 new Event fields, updated CRUD methods
- ingest.py: Refactored parse_entry() to create assistant+tool_use hierarchy
- queries.py: Added query_agent_activity() function
- server.py: Added get_agent_activity MCP tool
- cli.py: Added agents command with formatter
- Tests: 282 tests passing with comprehensive RFC #41 coverage
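
A minimal sketch of what a v5 migration along these lines might do, assuming the store is SQLite and the schema version is kept in `PRAGMA user_version`. Neither detail is stated in this PR, and the column types and the `migrate_to_v5` name are illustrative guesses.

```python
import sqlite3

# The four columns named in this PR; the SQL types are assumptions.
NEW_COLUMNS = {
    "parent_uuid": "TEXT",
    "agent_id": "TEXT",
    "is_sidechain": "INTEGER DEFAULT 0",
    "version": "TEXT",
}


def migrate_to_v5(conn: sqlite3.Connection) -> None:
    """Add any missing v5 columns to events; safe to run more than once."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute("PRAGMA table_info(events)")}
    for name, decl in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE events ADD COLUMN {name} {decl}")
    conn.execute("PRAGMA user_version = 5")  # assumed version bookkeeping
    conn.commit()


# Simulate an existing pre-v5 database with a pared-down events table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, uuid TEXT)")
migrate_to_v5(conn)
migrate_to_v5(conn)  # second run is a no-op
columns = [row[1] for row in conn.execute("PRAGMA table_info(events)")]
```

Checking the existing columns first is what lets the same function serve both fresh and already-populated databases.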

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude Bot commented Jan 4, 2026


Prompt: evansenter/dotfiles/.../claude-review.md

Code Review

Summary

This PR adds agent tracking and fixes token duplication for RFC #41. It introduces new schema fields (parent_uuid, agent_id, is_sidechain, version), fixes token counting by storing tokens only on assistant events (not duplicated across tool_use events), adds the get_agent_activity MCP tool and agents CLI command, and includes comprehensive test coverage for the new functionality.

Issues Found

Critical

None

Important

  • tests/ - Missing tests for query_agent_activity() function. While there are excellent tests for ingestion parsing and storage of agent fields (8 new tests in TestRFC41AgentTracking), there are no tests for the query function itself in queries.py:1644-1767. This function has complex logic including aggregation, top-tools calculation, and percentage computation that should be tested.
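
To make the three pieces of logic named above concrete (aggregation, top-tools calculation, percentage computation), here is a small in-memory sketch. It is not query_agent_activity() itself, whose signature and SQL are not shown in this thread. The `summarize_agent_activity` helper and the event dict keys are assumptions.

```python
from collections import Counter, defaultdict


def summarize_agent_activity(events: list[dict], top_n: int = 3) -> dict:
    """Group events by agent (None means main session) and summarize each group."""
    groups = defaultdict(lambda: {"tokens": 0, "tools": Counter()})
    for event in events:
        key = event.get("agent_id") or "main"
        groups[key]["tokens"] += event.get("tokens", 0)
        if event.get("tool_name"):
            groups[key]["tools"][event["tool_name"]] += 1
    total = sum(group["tokens"] for group in groups.values())
    return {
        key: {
            "tokens": group["tokens"],
            # Guard against division by zero when nothing carries tokens.
            "token_pct": round(100 * group["tokens"] / total, 1) if total else 0.0,
            "top_tools": [name for name, _ in group["tools"].most_common(top_n)],
        }
        for key, group in groups.items()
    }


sample = [
    {"agent_id": None, "tokens": 300},
    {"agent_id": None, "tool_name": "Read"},
    {"agent_id": None, "tool_name": "Read"},
    {"agent_id": None, "tool_name": "Edit"},
    {"agent_id": "a1", "tokens": 100},
    {"agent_id": "a1", "tool_name": "Grep"},
]
summary = summarize_agent_activity(sample)
# summary["main"]["token_pct"] is 75.0 and summary["a1"]["token_pct"] is 25.0
```

Tests for the real function would likely want the same edge cases this sketch handles: an empty event list, sessions with no agents, and ties in tool counts.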

Suggestions

  • src/session_analytics/ingest.py:72 - The re module is imported inside extract_command_name() function. Convention is to import at module level for clarity and minor performance benefit (avoiding repeated import checks). Consider moving import re to the top of the file with other imports.

  • src/session_analytics/ingest.py:96 - The command name normalization match.group(1).strip().lower().replace(" ", "-") doesn't handle all special characters (e.g., apostrophes in "I'm Lost" become "i'm-lost"). Consider using a regex replacement like re.sub(r'[^a-z0-9]+', '-', command_name) for more robust slugification if consistent command names are important for querying.

  • RFC #41 (RFC: Close data coverage gaps in session analytics ingestion) mentions "Session slug, version" as metadata to capture for the sessions table, but this PR only adds version to the events table, not the sessions table. The slug field exists in the Session dataclass but update_session_stats() doesn't extract/populate it. This may be intentional deferral but worth noting since the issue isn't fully closed.
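
The slugification suggestion above, written out as a small sketch. The `slugify_command` name is hypothetical, and trimming leading and trailing hyphens is an extra assumption beyond the suggested `re.sub` call.

```python
import re


def slugify_command(name: str) -> str:
    """Lowercase a command name and collapse non-alphanumeric runs into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.strip().lower())
    return slug.strip("-")  # assumption: drop hyphens left by edge punctuation


old_style = "I'm Lost".strip().lower().replace(" ", "-")  # "i'm-lost"
new_style = slugify_command("I'm Lost")                    # "i-m-lost"
```

Lowercasing must happen before the substitution; otherwise the `[^a-z0-9]` class would treat uppercase letters as separators.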

Verdict

REQUEST_CHANGES - One important issue (missing query tests) and a few minor suggestions. The implementation is well-structured with good test coverage for ingestion, but the query function for agent activity needs tests.


Automated review by Claude Code

- Add 7 tests for query_agent_activity() covering aggregation,
  top-tools calculation, and percentage computation
- Move `import re` to module level in ingest.py
- Use robust regex slugification for command names
  (handles apostrophes and special chars)
- Create issue #43 to track session-level metadata enhancement

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@evansenter

Owner Author

Feedback Addressed

Implemented

  • [Important] Added 7 tests for query_agent_activity() covering aggregation, top-tools calculation, and percentage computation
  • [Suggestion] Moved import re to module level in ingest.py
  • [Suggestion] Added robust regex slugification for command names (handles apostrophes and special chars like "I'm Lost" → "i-m-lost")

Deferred

  • [Suggestion] Session-level metadata (slug, version): tracked in #43

@claude

claude Bot commented Jan 4, 2026


Prompt: evansenter/dotfiles/.../claude-review.md

Code Review

Summary

This PR adds agent tracking and fixes token duplication (RFC #41). It introduces parent_uuid, agent_id, is_sidechain, and version fields to the events table, fixes token counting by storing tokens only on assistant events (eliminating 2-3x inflation from multi-tool messages), and adds the get_agent_activity MCP tool and agents CLI command.

Previous Feedback Verification

All items from the previous Feedback Addressed comment have been verified:

  • [Important] 7 tests for query_agent_activity() added in TestQueryAgentActivity class
  • [Suggestion] import re moved to module level in ingest.py
  • [Suggestion] Robust regex slugification implemented
  • [Deferred] Session-level metadata tracked in #43 (Add session-level metadata (slug, version))

Issues Found

Critical

None

Important

None

Suggestions

None

Verdict

APPROVE - All previous feedback has been addressed. The implementation is well-structured with comprehensive test coverage (30 new tests), proper schema migration, and clean separation of concerns. Token deduplication fix and agent tracking are correctly implemented.


Automated review by Claude Code

@evansenter evansenter merged commit 867848c into main Jan 4, 2026
3 checks passed
@evansenter evansenter deleted the rfc-41-agent-tracking branch January 4, 2026 00:55