Found during the v0.8.4 release review (Chaos Gremlin persona). Deferred — needs a retention/segmentation design, >30 min.
Problem
#895 correctly stopped per-turn truncation so traces persist across turns. But snapshot() rewrites the entire spans array on every span completion, and there is no per-trace span cap. MAX_TRACES=100 bounds the count of trace files, not the size of any one file. For a deliberately long-lived session (the stated design goal) with thousands of tool calls, ses_<id>.json grows monotonically and is fully serialized on every event. Each write is O(n) in the number of spans recorded so far, so the total bytes written over a session of n spans grow as O(n²). That is both a write-amplification and a disk-growth concern over a multi-day session.
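To put a rough number on the write amplification, here is a back-of-the-envelope sketch. The 1 KiB average span size is an assumption for illustration, not a measurement, and both function names are hypothetical.

```python
def cumulative_bytes_full_rewrite(span_count: int, span_bytes: int) -> int:
    """Total bytes written if every span completion re-serializes all spans so far."""
    # Event k writes k spans, so the total is span_bytes * (1 + 2 + ... + n).
    return span_bytes * span_count * (span_count + 1) // 2


def cumulative_bytes_append_only(span_count: int, span_bytes: int) -> int:
    """Total bytes written if each span completion appends only the new span."""
    return span_bytes * span_count


SPAN_BYTES = 1024    # assumed average serialized span size (not measured)
SPAN_COUNT = 10_000  # a 10k-span session

full = cumulative_bytes_full_rewrite(SPAN_COUNT, SPAN_BYTES)
append = cumulative_bytes_append_only(SPAN_COUNT, SPAN_BYTES)
print(f"full rewrite: {full / 2**30:.1f} GiB written")   # 47.7 GiB
print(f"append-only:  {append / 2**20:.1f} MiB written")  # 9.8 MiB
```

Under these assumptions the final file is under 10 MiB, but producing it by full rewrites costs tens of GiB of cumulative writes.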
Proposal (pick one, to be designed)
- Per-session span cap with head/tail retention (keep first N + last M, summarize the middle), or
- Append-only segmented trace files + a manifest, so a snapshot doesn't rewrite the whole array, or
- Debounce/coalesce snapshots more aggressively for large traces and write deltas.
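As a starting point for the design discussion, the append-only option could look roughly like the sketch below. Everything in it is an assumption: the class name, the JSONL segment format, the file naming, and the 1000-span segment size are hypothetical and not taken from the current code. It also ignores process restarts and concurrent writers.

```python
import json
from pathlib import Path


class SegmentedTraceWriter:
    """Hypothetical writer: appends spans to JSONL segments listed in a manifest."""

    def __init__(self, directory: Path, session_id: str, spans_per_segment: int = 1000):
        self.directory = directory
        self.session_id = session_id
        self.spans_per_segment = spans_per_segment
        self.segment_files: list[str] = []
        self.spans_in_current = 0
        directory.mkdir(parents=True, exist_ok=True)

    def _open_segment(self) -> None:
        name = f"ses_{self.session_id}.{len(self.segment_files):05d}.jsonl"
        self.segment_files.append(name)
        self.spans_in_current = 0
        # The manifest is rewritten only when a segment is opened, not per span.
        manifest = self.directory / f"ses_{self.session_id}.manifest.json"
        manifest.write_text(json.dumps({"segments": self.segment_files}), encoding="utf-8")

    def append(self, span: dict) -> None:
        # Open a segment if none exists yet or the current one is full.
        if not self.segment_files or self.spans_in_current >= self.spans_per_segment:
            self._open_segment()
        # Append one line; earlier spans are never re-serialized.
        with open(self.directory / self.segment_files[-1], "a", encoding="utf-8") as fh:
            fh.write(json.dumps(span) + "\n")
        self.spans_in_current += 1


def load_trace(directory: Path, session_id: str) -> list[dict]:
    """Rebuild the span list as a viewer might, reading segments in manifest order."""
    manifest_path = directory / f"ses_{session_id}.manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    spans: list[dict] = []
    for name in manifest["segments"]:
        with open(directory / name, encoding="utf-8") as fh:
            spans.extend(json.loads(line) for line in fh if line.strip())
    return spans
```

This keeps the per-span write roughly constant, but it does not bound total disk usage on its own. A cap or retention rule over old segments would still be needed.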
Acceptance
- A session emitting 10k+ spans has bounded ses_<id>.json size and sub-linear per-event write cost.
- Viewer still reconstructs the (possibly summarized) trace.
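For the "possibly summarized" case, a head/tail buffer is one way to get a bounded trace that a viewer can still render. The sketch below is only illustrative: the class name, the shape of the summary placeholder, and the choice to record only a dropped-span count are all assumptions.

```python
from collections import deque


class HeadTailSpans:
    """Hypothetical bounded buffer: keeps the first `head` and last `tail` spans."""

    def __init__(self, head: int, tail: int):
        self.head_limit = head
        self.head: list[dict] = []
        self.tail: deque = deque(maxlen=tail)
        self.dropped = 0  # number of spans evicted from the middle

    def add(self, span: dict) -> None:
        if len(self.head) < self.head_limit:
            self.head.append(span)
            return
        if len(self.tail) == self.tail.maxlen:
            self.dropped += 1  # the oldest tail span is about to be evicted
        self.tail.append(span)

    def to_trace(self) -> list[dict]:
        """Return retained spans, with a placeholder marking the elided middle."""
        middle = [{"type": "summary", "dropped_spans": self.dropped}] if self.dropped else []
        return [*self.head, *middle, *self.tail]


buf = HeadTailSpans(head=2, tail=3)
for i in range(10):
    buf.add({"i": i})
trace = buf.to_trace()
# trace holds spans 0-1, one summary entry for 5 dropped spans, then spans 7-9.
```

The retained trace never exceeds head + tail + 1 entries, so rewriting it on every event has a bounded cost regardless of session length.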
Refs: #895