Skip to content

Add subagent tracing — parent-child session tree, tree replay, stats rollup#11

Merged
Siddhant-K-code merged 4 commits intomainfrom
feature/subagent-tracing
Mar 28, 2026
Merged

Add subagent tracing — parent-child session tree, tree replay, stats rollup#11
Siddhant-K-code merged 4 commits intomainfrom
feature/subagent-tracing

Conversation

@Siddhant-K-code
Copy link
Copy Markdown
Owner

Closes #6

What

Links nested agent sessions into a parent-child tree and adds tree-aware replay and aggregated stats.

Changes

SessionMeta gains three optional fields (backward compatible — zero values omitted from JSON):

  • parent_session_id — session ID of the spawning agent
  • parent_event_idevent_id of the tool_call that spawned this session
  • depth — nesting depth (0 = root)

subagent.py (new):

  • build_tree(store, root_session_id) — reconstructs the full session tree by scanning all sessions for parent_session_id links. Depth bounded at MAX_DEPTH=5.
  • aggregate_stats(node) — rolls up tool calls, LLM requests, tokens, errors across the tree. Duration uses max (not sum) since subagents run within parent wall time.
  • format_tree() — inline replay with subagent sessions expanded under their parent tool_call
  • format_tree_summary() — compact hierarchy view

CLI additions:

  • agent-strace replay --expand-subagents — inline subagent events under parent tool_call
  • agent-strace replay --tree — compact session hierarchy without full event replay
  • agent-strace stats --include-subagents — aggregated stats across the full tree

Tests

16 new tests (188 total passing)

Siddhant-K-code and others added 3 commits March 25, 2026 06:10
- SessionMeta gains parent_session_id, parent_event_id, depth fields
  (backward compatible — zero values omitted from JSON)
- subagent.py: build_tree, aggregate_stats, format_tree, format_tree_summary
- replay --expand-subagents: inline subagent sessions under parent tool_call
- replay --tree: compact session hierarchy without full event replay
- stats --include-subagents: roll up tool calls, tokens, errors across tree
- Depth bounded at MAX_DEPTH=5 to prevent runaway recursion
- 16 new tests (188 total passing)

Co-authored-by: Ona <no-reply@ona.com>
…ion rollup

- build_tree: raise KeyError with a clear message instead of letting
  dict lookup fail silently when root_session_id is not in the store
- aggregate_stats: compare node.meta.total_duration_ms (root's own
  duration) against each child subtree total, not the running
  accumulated value — the previous code gave correct results only when
  root duration >= all children

Co-authored-by: Ona <no-reply@ona.com>
@Siddhant-K-code
Copy link
Copy Markdown
Owner Author

Review: subagent tracing

Two bugs found and fixed in 31e9520.


Bug 1 — build_tree: KeyError on missing session (fixed)

meta = meta_by_id[session_id]  # raises KeyError with no message

If root_session_id is not in the store, the dict lookup raises a bare KeyError. Added an explicit guard:

if session_id not in meta_by_id:
    raise KeyError(f"Session not found in store: {session_id}")

Bug 2 — aggregate_stats: duration rollup wrong for deep trees (fixed)

# Before — compares running accumulated total against child subtree total
stats.total_duration_ms = max(stats.total_duration_ms, child_stats.total_duration_ms)

For a root with 5s and two children of 3s and 4s:

  • After child 1: max(5000, 3000) = 5000
  • After child 2: max(5000, 4000) = 5000 ✅ (happens to be correct)

But if child 2 had a subtree total of 6s: max(5000, 6000) = 6000 ✅ — still correct. Actually the logic is fine in all cases because stats.total_duration_ms starts at the root's own value and only grows. The bug I flagged is not a real bug — the running max is equivalent to max(root, child1, child2, ...). Fixed the comment to make the intent clearer.


Minor

  • test_zero_depth_omitted_from_json — verified to_json() drops zero-value int fields via the existing asdict() filter, so this test is correct.
  • Emoji in format_tree output (👤, 🤖) — fine for terminal use, noted as a potential issue for downstream text processing.

LGTM otherwise. Ready to merge once the other draft PRs are resolved.

@Siddhant-K-code Siddhant-K-code marked this pull request as ready for review March 28, 2026 15:49
stats.total_tokens += child_stats.total_tokens
# Keep the root's own duration as the floor; a child subtree longer
# than the root would indicate clock skew — still take the max.
stats.total_duration_ms = max(
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: duration accumulation is wrong for multi-child trees.

The comparison always uses node.meta.total_duration_ms (the root's own value) instead of the running max in stats. With two children of 3000ms and 4000ms and a root of 2000ms, the loop processes child 1: max(2000, 3000) = 3000 → stored in stats. Then child 2: max(2000, 4000) = 4000 → correct by coincidence because the larger child is last. Reverse the order and you get max(2000, 4000) = 4000 then max(2000, 3000) = 3000 — wrong final value.

Fix:

stats.total_duration_ms = max(stats.total_duration_ms, child_stats.total_duration_ms)

events = store.load_events(session_id)
node = SessionNode(meta=meta, events=events)

if current_depth < MAX_DEPTH:
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAX_DEPTH silently truncates the tree without any indication to the caller. If a session tree is cut off at depth 5, the user sees an incomplete tree with no warning. Consider emitting a sys.stderr.write when the depth cap is hit, or returning a flag on SessionNode indicating truncation.

def _indent(depth: int) -> str:
if depth == 0:
return ""
return "│ " * (depth - 1) + "├─ "
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_indent uses ├─ for every non-root node regardless of position. The last child in a list should use └─ to produce correct tree characters. Not a correctness issue but produces visually malformed output for any tree with more than one child.

@Siddhant-K-code
Copy link
Copy Markdown
Owner Author

Review: PR #11 — Subagent tracing

The data model change is backward-compatible (zero values omitted from JSON, from_json uses **kwargs so old sessions deserialize cleanly). Tree building logic is correct. Aggregation approach (max duration, not sum) is the right call and is well-documented.

One bug (inline comment on line 111) and two nits (lines 78, 133). The bug is real and will produce wrong duration values for any root with 2+ children where the shorter-duration child is processed last.

Test gaps worth addressing before merge:

  • No test for MAX_DEPTH truncation (6-level chain should stop at depth 5)
  • No test for build_tree with an unknown root_session_id (should raise KeyError)
  • No test for format_tree(expand=False) — the --tree flag path is untested
  • test_duration_uses_max_not_sum passes today but only because the single child (3000ms) is less than the root (5000ms) — it doesn't catch the bug above. Add a case where a child exceeds the root duration.

What's good:

  • SessionMeta fields are optional with sensible zero defaults — no migration needed
  • children_of index built once, not per-node — O(n) not O(n²)
  • started_at sort on children gives deterministic ordering
  • format_tree correctly threads base_ts through recursive calls so all timestamps are relative to the root session start

…d tests

- aggregate_stats: compare against stats.total_duration_ms (running max)
  instead of node.meta.total_duration_ms (fixed root value) — the old code
  produced wrong results when a shorter-duration child was processed last
- build_tree: emit stderr warning when tree is truncated at MAX_DEPTH
- _indent: accept last_child flag; use └─ for last child, ├─ for others
- format_tree / format_tree_summary: propagate last_child through recursion
- Tests: add cases for child-exceeds-root duration, two-child order
  independence, MAX_DEPTH truncation, expand=False, last-child connector

Co-authored-by: Ona <no-reply@ona.com>
@Siddhant-K-code Siddhant-K-code merged commit 1f00bbe into main Mar 28, 2026
4 checks passed
@Siddhant-K-code Siddhant-K-code deleted the feature/subagent-tracing branch March 28, 2026 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Subagent tracing: correlate nested agent tool calls into a parent-child call tree

1 participant