feat: add flow trace replay bundles for flow runs#181

Open
osolmaz wants to merge 9 commits into main from codex/flow-trace-replay-spec
Conversation

@osolmaz
Contributor

@osolmaz osolmaz commented Mar 26, 2026

Summary

  • store flow runs as versioned trace/replay bundles with manifest.json, flow.json, trace.ndjson, projections, bundled session snapshots/events, and artifact receipts
  • link ACP node attempts to bundled session conversation/event ranges so previous runs can be replayed step by step
  • add runtime/store/tests/spec updates for the new bundle model
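To make the attempt-to-session linkage concrete, here is a minimal sketch of how a replay consumer might read `trace.ndjson` and look up the bundled session events for one attempt. The field names (`seq`, `sessionEventRange`) and the inclusive-index convention are assumptions for illustration; the actual schema is whatever this PR's spec defines.

```typescript
// Hypothetical shapes: the real bundle schema is defined by the PR, not here.
interface TraceEvent {
  seq: number;
  type: string;
  // Assumed: an ACP attempt records which bundled session events it produced.
  sessionEventRange?: { start: number; end: number }; // inclusive indexes
}

// Parse NDJSON text: one JSON object per non-empty line.
function parseNdjson<T>(text: string): T[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as T);
}

// Return the session events linked to a trace event, or [] if it has no range.
function sessionEventsFor<E>(event: TraceEvent, sessionEvents: E[]): E[] {
  const range = event.sessionEventRange;
  if (!range) return [];
  return sessionEvents.slice(range.start, range.end + 1);
}
```

A viewer could call `sessionEventsFor` for the currently selected step to show only the conversation slice that step produced.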

Validation

  • pnpm run build:test && node --test dist-test/test/flows-store.test.js dist-test/test/flows.test.js
  • pnpm run check
  • pnpm run check:docs
  • tmpdir=$(mktemp -d /tmp/acpx-trace-smoke-XXXXXX) && printf '{"text":"trace-smoke"}' > "$tmpdir/input.json" && node dist/cli.js --approve-all --cwd "$tmpdir" --format json flow run examples/flows/shell.flow.ts --input-file "$tmpdir/input.json"

Notes

  • local Codex review surfaced a bundled-session event ordering race; this PR fixes it by serializing per-file appends in FlowRunStore and adds a regression test
  • the external visualizer is not part of this PR; this change focuses on the runtime-side bundle format and replay data
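The ordering fix mentioned above can be sketched as a per-path promise chain. This is an illustration of the general technique only: `AppendQueue` and the `Writer` callback are hypothetical names, and the real `FlowRunStore` may implement serialization differently.

```typescript
// Sketch: serialize async appends per file path with a promise chain.
// `AppendQueue` and `Writer` are hypothetical, not the PR's FlowRunStore API.
type Writer = (path: string, line: string) => Promise<void>;

class AppendQueue {
  // Tail of the pending write chain for each path.
  private tails = new Map<string, Promise<void>>();
  private write: Writer;

  constructor(write: Writer) {
    this.write = write;
  }

  append(path: string, line: string): Promise<void> {
    const prev = this.tails.get(path) ?? Promise.resolve();
    // Start this write only after the previous write for the same path settles.
    // A failed earlier write does not block later ones.
    const next = prev.catch(() => undefined).then(() => this.write(path, line));
    this.tails.set(path, next);
    return next;
  }
}
```

Appends to different paths still run concurrently; only writes to the same file are ordered by call order.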


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c589a3b3c7

Comment on lines +222 to +223
- `session_bound`
- `artifact_written`

P2: Mark session/artifact events as conditional

The required event list makes session_bound and artifact_written mandatory for every implementation, but non-ACP flows can complete without any session binding or artifact output (the runtime supports compute/checkpoint-only graphs in src/flows/types.ts). That forces producers to emit synthetic events or violate the spec, which will fragment replay tooling behavior. These event types should be explicitly conditional ("when relevant"), like the ACP/action-specific events below.
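As a rough sketch of what "conditional" could mean for a trace validator: the event type names come from the spec excerpt above, while `RunFacts` and `missingEvents` are hypothetical helpers invented for this illustration.

```typescript
// Hypothetical facts a validator might derive from the flow definition and run.
interface RunFacts {
  boundSession: boolean; // the run bound at least one ACP session
  wroteArtifact: boolean; // the run wrote at least one artifact
}

// Return expected event types that are absent from the trace.
// session_bound and artifact_written are expected only when relevant.
function missingEvents(seenTypes: Set<string>, facts: RunFacts): string[] {
  const expected = ["run_started"];
  if (facts.boundSession) expected.push("session_bound");
  if (facts.wroteArtifact) expected.push("artifact_written");
  return expected.filter((type) => !seenTypes.has(type));
}
```

Under this reading, a compute-only run that emits neither event would still validate.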

Contributor Author

Thanks. The implementation PR supersedes this doc-only draft and makes these event types conditional in runtime output; I am leaving this older doc review unresolved because it targets the earlier spec-only commit rather than the current diff.


- `run_started`
- `run_completed`
- `run_failed`

P2: Preserve timeout terminal status in trace schema

This document declares the trace as the source of truth, but the minimum terminal events only cover run_completed and run_failed, with no required representation for timeout termination. Since run state has a distinct timed_out status (FlowRunState.status in src/flows/types.ts), replay consumers cannot reliably reconstruct that outcome from the required trace contract alone. Please require an explicit timeout terminal event (or require terminal payloads to include final status).
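One way a replay consumer could recover the outcome under either suggestion is sketched below. The `timed_out` status and the `run_completed`/`run_failed` events are named in this comment; the `run_timed_out` event type, the `completed`/`failed` status strings, and the `payload.status` field are assumptions for illustration.

```typescript
// Assumed status values; only "timed_out" is named explicitly in the review.
type TerminalStatus = "completed" | "failed" | "timed_out";

interface TraceEvent {
  type: string;
  payload?: { status?: TerminalStatus }; // hypothetical payload field
}

// Default outcome per terminal event type. "run_timed_out" is hypothetical.
const DEFAULT_STATUS = new Map<string, TerminalStatus>([
  ["run_completed", "completed"],
  ["run_failed", "failed"],
  ["run_timed_out", "timed_out"],
]);

// Scan backwards for the last terminal event. Prefer an explicit payload
// status, otherwise fall back to the default for that event type.
function finalStatus(events: TraceEvent[]): TerminalStatus | undefined {
  for (let i = events.length - 1; i >= 0; i--) {
    const event = events[i];
    if (!event) continue;
    const fallback = DEFAULT_STATUS.get(event.type);
    if (fallback) return event.payload?.status ?? fallback;
  }
  return undefined; // no terminal event: the run did not finish
}
```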


Contributor Author

Thanks. The implementation PR now emits explicit timeout outcomes in flow state and trace, so this earlier spec-only concern no longer reflects the current branch. I am keeping the historical comment for context but treating it as stale relative to the current head.

@osolmaz
Contributor Author

osolmaz commented Mar 26, 2026

Final report for the current head 5ca4608.

Summary:

  • Implemented flow trace/replay run bundles end to end: manifest.json, flow.json, trace.ndjson, projection files, bundled session snapshots/events, artifact receipts, and per-attempt step trace linkage.
  • Added direct flow/runtime coverage for the new bundle model, including isolated and persistent ACP traces.
  • After local Codex review surfaced a real correctness issue, fixed bundled session event ordering by serializing per-file appends in FlowRunStore and added a regression test for concurrent appendSessionEvent() calls.

Validation run locally:

  • pnpm run build:test && node --test dist-test/test/flows-store.test.js dist-test/test/flows.test.js
  • pnpm run check
  • pnpm run check:docs
  • tmpdir=$(mktemp -d /tmp/acpx-trace-smoke-XXXXXX) && node dist/cli.js --approve-all --cwd "$tmpdir" --format json flow run examples/flows/shell.flow.ts --input-json '{"text":"trace-smoke"}'

Review status:

  • Local codex review --base main against the earlier head surfaced one actionable correctness issue in bundled session event ordering; that fix is included in 5ca4608.
  • A second codex review --base main pass was run against the current head and re-inspected the changed runtime/store code. It used the review budget without surfacing any additional concrete P0/P1 findings or new GitHub comments on the latest commit.
  • Existing Codex PR comments were older doc-spec comments from the pre-implementation head; I replied on-thread to mark them stale relative to the implemented branch.

CI/CD:

  • PR checks are green on 5ca4608: scope, Format, Typecheck, Lint, Build, Conformance Smoke, Test, Docs.

Non-blocking caveats:

  • I could validate the new bundle layout and replay artifacts locally, but I did not build the external visualizer itself in this PR.
  • The current local Codex review command was slow/noisy due to a local Codex state-db migration warning, but it still completed enough work to catch and drive the bundled-event ordering fix.

This PR looks ready for human review / landing.

@osolmaz osolmaz changed the title from "docs: add flow trace and replay spec" to "feat: add flow trace replay bundles for flow runs" Mar 26, 2026
@osolmaz
Contributor Author

osolmaz commented Mar 26, 2026

Final report for the current head e0df665.

Summary:

  • Added an external replay viewer example under examples/flows/replay-viewer/ that reads saved flow run bundles, renders the workflow with React Flow, and shows the bound ACP session, timeline, and trace/events in a side inspector.
  • Added a bundled sample run generated from a real two-turn.flow.ts execution with the mock ACP agent so the viewer opens with a complete replay immediately.
  • Wired the viewer into repo docs and checks with new viewer:typecheck, viewer:build, and viewer:preview scripts, plus docs/README updates and a changelog entry.

Validation run locally:

  • pnpm run viewer:typecheck
  • pnpm run viewer:build
  • pnpm run check:docs
  • pnpm run check
  • pnpm run build:test && node dist-test/src/cli.js --approve-all --cwd "$tmpdir" --format json --agent "node /Users/onur/offline/acpx/dist-test/test/mock-agent.js" flow run examples/flows/two-turn.flow.ts --input-json '{"topic":"How should we visualize a flow trace elegantly?"}' ✅ (used to generate and verify the sample bundle)
  • pnpm run viewer:preview + browser validation against http://127.0.0.1:4173/
    • verified the graph renders
    • verified replay/timeline selection updates the graph and inspector
    • verified the ACP session tab shows the bound conversation for the selected ACP step
    • verified the events tab shows flow trace plus bundled ACP session events

Review status:

  • Existing inline Codex comments on this PR are older doc-spec comments from the pre-viewer head; they were already replied to on-thread and remain stale relative to the current diff.
  • I reran local codex review --base main against the current head. On this machine the tool is currently unstable because of a local Codex state-db migration warning (migration 21 was previously applied but is missing in the resolved migrations) and it repeatedly stalled while diff-reading instead of returning a final verdict. It did not surface any concrete new P0/P1 finding on the current head before stalling.

CI/CD:

  • PR checks are green on e0df665: scope, Format, Typecheck, Lint, Build, Conformance Smoke, Test, Docs.

Non-blocking caveats:

  • The viewer is intentionally an external/example app, not a new acpx CLI command.
  • I validated the bundled sample path end to end and the browser UI itself. I did not fully automate the OS-level directory picker flow for opening an arbitrary local run bundle, though that code path is implemented.

This PR looks ready for human review / landing.

@osolmaz
Contributor Author

osolmaz commented Mar 27, 2026

Final report for the current head 03b6700.

Summary:

  • Renamed the flow API and replay-bundle schema fields away from kind to explicit names: nodeType, actionType, currentNodeType, and sourceType.
  • Updated the runtime, flow builders, persisted run-bundle projections/trace payloads, the PR-triage example flow, and the replay viewer to use the renamed schema consistently.
  • Added scripts/lint-flow-schema-terms.ts and wired it into pnpm run lint so kind cannot be reintroduced into the flow API or replay-schema surfaces.
  • Regenerated the bundled replay-viewer sample run from a fresh real two-turn.flow.ts execution so the checked-in sample matches the current schema.
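For readers unfamiliar with this kind of guard, a term lint can be as small as the sketch below. This is not the contents of scripts/lint-flow-schema-terms.ts, which is not shown in this thread; `findForbiddenTerm` and `Finding` are hypothetical, and the sketch assumes the term is a plain identifier with no regex metacharacters.

```typescript
interface Finding {
  line: number; // 1-based line number
  text: string; // trimmed source line
}

// Report every line where `term` appears as a whole word.
// Assumes `term` is a plain identifier (no regex metacharacters).
function findForbiddenTerm(source: string, term: string = "kind"): Finding[] {
  const pattern = new RegExp(`\\b${term}\\b`);
  const findings: Finding[] = [];
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const text = lines[i] ?? "";
    if (pattern.test(text)) findings.push({ line: i + 1, text: text.trim() });
  }
  return findings;
}
```

A whole-word match leaves identifiers such as `nodeType` untouched while still flagging a reintroduced `kind` key; a real lint would also need to scope itself to the flow API and replay-schema files.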

Validation run locally:

  • pnpm run viewer:typecheck
  • pnpm run build:test
  • pnpm run check
  • pnpm run check:docs
  • HOME=$(mktemp -d /tmp/acpx-sample-home-XXXXXX) node dist-test/src/cli.js --approve-all --cwd "$tmp_home/work" --format json --agent "node /Users/onur/offline/acpx/dist-test/test/mock-agent.js" flow run examples/flows/two-turn.flow.ts --input-json '{"topic":"How should a replay viewer show ACP turns?"}' ✅ (used to refresh the bundled sample run)
  • Browser smoke against http://127.0.0.1:4173/ with agent-browser ✅ (viewer loaded the regenerated sample run with the renamed fields)

Review status:

  • Existing inline comments on this PR are older spec-stage comments and already marked stale on-thread.
  • I ran codex review --base main on the current head. On this machine the review tool is still noisy/unstable because of a local Codex state DB migration warning, and it stalled while diff-reading instead of returning a final verdict. It did not surface any concrete new P0/P1 finding on 03b6700 before stalling.

CI/CD:

  • GitHub checks are green on 03b6700: scope, Format, Typecheck, Lint, Build, Conformance Smoke, Test, Docs.

Non-blocking caveats:

  • This is a breaking rename inside the experimental flow/replay surface, but it is still confined to the not-yet-landed branch/PR.

This PR still looks ready for human review / landing.

