Summary
Session compaction events ("type": "summary") exist in Claude Code JSONL logs but lack timestamps and aren't meaningfully ingested. This RFC proposes inferring timestamps from surrounding events and exposing compaction rate queries to answer questions like "how often do my sessions compact?"
Problem / Motivation
When asked "how many compactions per hour during active sessions?", we had to write ad-hoc Python scripts to:
- Grep for "type":"summary" in raw JSONL files
- Infer timing from surrounding timestamped events
- Calculate rates manually
This data should be queryable via MCP tools like any other session metric.
Context
- Discovered during: User question about compaction frequency
- Relevant files:
  - src/session_analytics/ingest.py:366-381 - Currently creates Event with datetime.now() fallback (wrong - uses ingestion time, not compaction time)
  - src/session_analytics/queries.py - Needs new compaction queries
  - src/session_analytics/server.py - Needs new MCP tools
- Related issues/PRs: None
- Raw data format:
    {"type": "summary", "summary": "...", "leafUuid": "..."}
Proposed Solution
1. Fix Timestamp Inference in Ingestion
Summary events have no timestamp. Infer from the last timestamped event before the summary:
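    import json

    # Track the last seen timestamp during file parsing
    last_timestamp = None
    for line in file:  # file: an open JSONL session log
        entry = json.loads(line)
        if entry.get("timestamp"):
            last_timestamp = entry["timestamp"]
        if entry.get("type") == "summary":
            # Use last_timestamp, not datetime.now(); it is still None
            # when no timestamped event precedes the summary
            entry["timestamp"] = last_timestamp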
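The loop above can be packaged as a small, testable helper. This is a minimal sketch, assuming each log line is one JSON object and that timestamps, when present, sit under a top-level "timestamp" key. The name infer_summary_timestamps and the "inferred_timestamp" key are hypothetical and do not come from ingest.py.

```python
import json
from typing import Iterable, Iterator


def infer_summary_timestamps(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed entries; summary entries gain an "inferred_timestamp" key.

    Hypothetical helper: carries forward the most recent timestamp seen
    earlier in the same file. The value is None when no timestamped
    entry precedes the summary.
    """
    last_timestamp = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        if entry.get("timestamp"):
            last_timestamp = entry["timestamp"]
        if entry.get("type") == "summary":
            entry["inferred_timestamp"] = last_timestamp
        yield entry
```

Writing the inferred value to a separate key keeps it distinguishable from a timestamp recorded in the log, which may matter if inferred times are later treated as lower-confidence data.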
2. Add Compaction-Specific Fields
Store the summary text and leafUuid for drill-down:
    ALTER TABLE events ADD COLUMN compaction_summary TEXT;
    ALTER TABLE events ADD COLUMN leaf_uuid TEXT;
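Running these statements twice fails, because the column already exists on the second run. The sketch below shows one way to make the migration idempotent. It assumes the storage layer is SQLite accessed through Python's sqlite3 module, which this RFC does not state. The function name add_compaction_columns and the NEW_COLUMNS mapping are hypothetical.

```python
import sqlite3

# Hypothetical column list for the proposed migration
NEW_COLUMNS = {"compaction_summary": "TEXT", "leaf_uuid": "TEXT"}


def add_compaction_columns(conn: sqlite3.Connection) -> list[str]:
    """Add any missing compaction columns to events; return the ones added."""
    # PRAGMA table_info returns one row per column; index 1 is the column name
    existing = {row[1] for row in conn.execute("PRAGMA table_info(events)")}
    added = []
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE events ADD COLUMN {name} {sql_type}")
            added.append(name)
    conn.commit()
    return added
```

Checking the existing columns first means a re-run does nothing, where a bare ALTER TABLE would raise an error.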
3. New Query Functions
    def get_compaction_stats(storage, days=7, project=None) -> dict:
        """Returns compaction frequency metrics."""
        # Shape of the result; each value below is the expected type
        return {
            "total_compactions": int,
            "sessions_with_compactions": int,
            "avg_per_hour": float,
            "median_messages_between": float,
            "by_session": [{"session_id": str, "count": int, "rate_per_hour": float}],
        }
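To make the rate fields concrete, here is a sketch that computes them from an in-memory list of events. It rests on several assumptions that the RFC does not define: each event is a dict with "session_id", "type", and a datetime "timestamp"; a session's duration is the span between its first and last event; and avg_per_hour is total compactions divided by total session hours, including sessions with no compactions. The name compute_compaction_stats is hypothetical, and median_messages_between is omitted.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def compute_compaction_stats(events: list[dict]) -> dict:
    """Aggregate compaction counts and hourly rates per session.

    Assumes each event has "session_id", "type", and a datetime "timestamp".
    A session's duration is the span from its first to its last event.
    """
    by_session = defaultdict(list)
    for event in events:
        by_session[event["session_id"]].append(event)

    rows = []
    total_hours = 0.0
    for session_id, session_events in by_session.items():
        times = [e["timestamp"] for e in session_events]
        hours = (max(times) - min(times)).total_seconds() / 3600
        count = sum(1 for e in session_events if e["type"] == "summary")
        total_hours += hours
        if count:
            # Rate is undefined for zero-length sessions; report None
            rate = count / hours if hours > 0 else None
            rows.append(
                {"session_id": session_id, "count": count, "rate_per_hour": rate}
            )

    total = sum(row["count"] for row in rows)
    return {
        "total_compactions": total,
        "sessions_with_compactions": len(rows),
        "avg_per_hour": total / total_hours if total_hours > 0 else 0.0,
        "by_session": rows,
    }


# Usage: two sessions of two hours each, one with two compactions
t0 = datetime(2025, 1, 1, 10, 0, 0)
sample_events = [
    {"session_id": "a", "type": "user", "timestamp": t0},
    {"session_id": "a", "type": "summary", "timestamp": t0 + timedelta(minutes=30)},
    {"session_id": "a", "type": "summary", "timestamp": t0 + timedelta(hours=1)},
    {"session_id": "a", "type": "user", "timestamp": t0 + timedelta(hours=2)},
    {"session_id": "b", "type": "user", "timestamp": t0},
    {"session_id": "b", "type": "user", "timestamp": t0 + timedelta(hours=2)},
]
stats = compute_compaction_stats(sample_events)
```

Using wall-clock span as the denominator counts idle gaps as active time. If "active sessions" should exclude idle periods, the duration calculation is the part to change.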
4. New MCP Tool
    @mcp.tool()
    def get_compaction_stats(days: int = 7, project: str | None = None) -> dict:
        """Get compaction frequency and timing metrics."""
Assumptions
| Assumption | Confidence | Impact if Wrong |
| --- | --- | --- |
| Summary events always follow timestamped events | High | Could have null timestamps for edge cases |
| leafUuid uniquely identifies compaction point | Medium | May need to track line number as fallback |
| Compaction rate is useful without message content | High | Users might want pre/post compaction context |
Open Questions
- Should we track "messages since last compaction" as a running metric in sessions table?
- Should compaction events link to the last N events before compaction (for context)?
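For the first question, the same number can also be computed at query time without adding anything to the sessions table. The sketch below counts events between consecutive summaries from an ordered list of event types. It assumes every non-summary event counts as a "message", which may be too broad. The name messages_between_compactions is hypothetical.

```python
from statistics import median


def messages_between_compactions(event_types: list[str]) -> list[int]:
    """Count non-summary events preceding each summary, in file order.

    The counter resets after every summary; events after the last summary
    are not counted because no compaction has closed that interval yet.
    """
    gaps = []
    since_last = 0
    for event_type in event_types:
        if event_type == "summary":
            gaps.append(since_last)
            since_last = 0
        else:
            since_last += 1
    return gaps


# Usage: two compactions, preceded by two events and one event respectively
gaps = messages_between_compactions(
    ["user", "assistant", "summary", "user", "summary", "user"]
)
median_gap = median(gaps)
```

A stored running counter would save this scan on every query, but it would have to stay consistent whenever a file is re-ingested.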
Actionable Requirements
| # | Requirement | Owner | Blocked By |
| --- | --- | --- | --- |
| 1 | Track last_timestamp during file parsing in ingest.py | Claude | - |
| 2 | Add compaction_summary, leaf_uuid columns (migration v6) | Claude | - |
| 3 | Implement get_compaction_stats() in queries.py | Claude | 1, 2 |
| 4 | Add MCP tool in server.py | Claude | 3 |
| 5 | Add CLI formatter in cli.py | Claude | 3 |
| 6 | Update guide.md | Claude | 4 |
Test Requirements
- Unit: test_ingest.py - summary event timestamp inference
- Integration: test_queries.py - compaction stats aggregation
- Edge cases:
  - Session with no compactions
  - Summary as first event (no prior timestamp)
  - Multiple compactions in quick succession
Implementation Checklist
- parse_entry()
- get_compaction_stats() query