This roadmap is an execution document, not an idea backlog. Each unchecked item requires an owner, an acceptance test, and a decision gate.
- Program status
- Level 3 — Causal Command Protocol (complete)
- Level 4 — Durable command journal + replay (active planning)
- Level 5 — Formal invariants + chaos harness
- Decision ledger (open)
- Operating discipline
- Current phase: Level 4 (Durable command journal + replay)
- State: 🚧 Active planning (Level 3 complete)
- Protocol baseline: `1.0.0`
- Normative contract: `PROTOCOL.md`
- Serialize commands by lane (`session:<id>/server`)
- Preserve causal order for burst traffic on the same session
- Add integration coverage
Acceptance evidence
- `integration: serializes create -> steer -> follow_up on same session lane`
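The lane serialization described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `LaneSerializer`, `demo`, and the lane key `session:s1` are hypothetical names, and it assumes an asyncio-style runtime where one lock per lane is enough to keep commands in submission order.

```python
import asyncio
from collections import defaultdict


class LaneSerializer:
    """Runs at most one command per lane at a time (hypothetical helper)."""

    def __init__(self) -> None:
        # asyncio.Lock is fair: waiters are woken in the order they started waiting.
        self._locks = defaultdict(asyncio.Lock)

    async def run(self, lane, handler):
        # Commands on different lanes do not block each other.
        async with self._locks[lane]:
            return await handler()


async def demo():
    lanes = LaneSerializer()
    finished = []

    async def command(name, delay):
        await asyncio.sleep(delay)
        finished.append(name)

    # The slowest command is submitted first; the lane still preserves submission order.
    await asyncio.gather(
        lanes.run("session:s1", lambda: command("create", 0.03)),
        lanes.run("session:s1", lambda: command("steer", 0.01)),
        lanes.run("session:s1", lambda: command("follow_up", 0.0)),
    )
    return finished
```

Calling `asyncio.run(demo())` returns the commands in the order they were submitted, even though the later ones would finish sooner if they ran concurrently.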
- Support `dependsOn`, `ifSessionVersion`, `idempotencyKey`
- Validate envelope fields at admission
- Gate execution on dependency + version preconditions
Acceptance evidence
- Validation coverage for envelope fields
- Session manager coverage for dependency + version checks
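Admission validation and precondition gating might look roughly like the sketch below. Only the field names `dependsOn`, `ifSessionVersion`, and `idempotencyKey` come from this roadmap; the function names, error strings, and the exact rules (for example, rejecting self-dependencies) are assumptions made for illustration.

```python
def validate_envelope(raw):
    """Return a list of admission errors; an empty list means the envelope is admissible."""
    errors = []
    command_id = raw.get("id")
    if not isinstance(command_id, str) or not command_id:
        errors.append("id must be a non-empty string")

    deps = raw.get("dependsOn", [])
    if not isinstance(deps, list) or not all(isinstance(d, str) and d for d in deps):
        errors.append("dependsOn must be a list of non-empty command ids")
    elif command_id in deps:
        errors.append("dependsOn must not reference the command itself")

    version = raw.get("ifSessionVersion")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if version is not None and (
        isinstance(version, bool) or not isinstance(version, int) or version < 0
    ):
        errors.append("ifSessionVersion must be a non-negative integer")

    key = raw.get("idempotencyKey")
    if key is not None and (not isinstance(key, str) or not key):
        errors.append("idempotencyKey must be a non-empty string")
    return errors


def check_preconditions(raw, finished_ok, session_version):
    """Return None when the command may execute, otherwise a reason string."""
    for dep in raw.get("dependsOn", []):
        if dep not in finished_ok:
            return f"dependency_not_satisfied:{dep}"
    expected = raw.get("ifSessionVersion")
    if expected is not None and expected != session_version:
        return f"version_mismatch:expected={expected},actual={session_version}"
    return None
```

Keeping validation (shape of the envelope) separate from gating (state of the session) lets malformed envelopes be rejected at admission, before they occupy a slot in the lane.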
- Emit `command_accepted`, `command_started`, `command_finished`
- Include causal metadata + outcome fields
- Document ordering guarantees
Acceptance evidence
- `integration: emits command lifecycle events`
- Lifecycle contract documented in `README.md` and `PROTOCOL.md`
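As an illustration of the lifecycle ordering, the sketch below emits the three event types around a handler call. The event names come from this roadmap; the `LifecycleRecorder` class, the `seq` counter, and the payload field names (`commandId`, `success`, `error`) are assumptions, and `PROTOCOL.md` remains the normative source for the real shapes.

```python
import itertools


class LifecycleRecorder:
    """Collects lifecycle events in emission order (hypothetical helper)."""

    def __init__(self):
        self.events = []
        self._seq = itertools.count(1)

    def emit(self, kind, command_id, **fields):
        event = {"type": kind, "commandId": command_id, "seq": next(self._seq), **fields}
        self.events.append(event)
        return event


def execute(recorder, envelope, handler):
    """Run handler() and emit accepted -> started -> finished, even when it raises."""
    command_id = envelope["id"]
    # Causal metadata travels with the accepted event.
    recorder.emit("command_accepted", command_id, dependsOn=envelope.get("dependsOn", []))
    recorder.emit("command_started", command_id)
    try:
        result = handler()
    except Exception as exc:
        recorder.emit("command_finished", command_id, success=False, error=str(exc))
    else:
        recorder.emit("command_finished", command_id, success=True, result=result)
```

The point of the sketch is that every accepted command reaches exactly one `command_finished` event, whether the handler succeeds or fails.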
- Implement TTL idempotency cache
- Replay duplicate command IDs from in-flight/completed outcomes
- Reject conflicting `id`/`idempotencyKey` fingerprints
Acceptance evidence
- Unit + integration coverage for replay and conflict paths
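A TTL idempotency cache with fingerprint conflict detection could be sketched like this. It is a simplified model under stated assumptions: the fingerprint is taken to be a SHA-256 hash of the envelope minus `id` and `idempotencyKey`, only completed outcomes are cached (in-flight replay is omitted), and all class and function names are hypothetical.

```python
import hashlib
import json
import time


class IdempotencyConflict(Exception):
    """Same key, different command content."""


def fingerprint(envelope):
    # Assumption: identity fields are excluded so a retry with a new id still matches.
    payload = {k: v for k, v in envelope.items() if k not in ("id", "idempotencyKey")}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()


class IdempotencyCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so TTL boundaries can be tested without sleeping
        self._entries = {}  # key -> (fingerprint, outcome, expires_at)

    def lookup(self, key, envelope):
        """Return the cached outcome, None on miss or expiry, or raise on conflict."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        cached_fp, outcome, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        if cached_fp != fingerprint(envelope):
            raise IdempotencyConflict(key)
        return outcome

    def store(self, key, envelope, outcome):
        self._entries[key] = (fingerprint(envelope), outcome, self._clock() + self._ttl)
```

Injecting the clock is a deliberate choice here: it lets expiry boundaries be exercised deterministically rather than with real waits.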
- Track monotonic per-session `sessionVersion`
- Keep read-only commands version-neutral
- Return `sessionVersion` in applicable successful responses
Acceptance evidence
- Tests for create/switch/mutation version behavior
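The version rules above reduce to a small counter per session. In this sketch the command type names in `READ_ONLY` are placeholders (the real set is defined by the protocol), and starting versions at 0 is an assumption.

```python
# Hypothetical read-only command types; the real list lives in the protocol contract.
READ_ONLY = frozenset({"get_state", "list_sessions"})


class SessionVersions:
    """Tracks a monotonic, gap-free version per session (illustrative only)."""

    def __init__(self):
        self._versions = {}

    def current(self, session_id):
        return self._versions.get(session_id, 0)

    def on_success(self, session_id, command_type):
        """Bump the version for successful mutations; read-only commands leave it unchanged."""
        if command_type not in READ_ONLY:
            self._versions[session_id] = self.current(session_id) + 1
        # The returned value is what a successful response would carry as sessionVersion.
        return self.current(session_id)
```

Failed commands never call `on_success`, which is what keeps the sequence gap-free on success.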
- Dependency timeout behavior
- Idempotency TTL expiry behavior
- Replay behavior around timeout edge cases
Acceptance evidence
- Runtime-tuned unit tests for timeout/TTL boundaries
Goal: make causality crash-survivable and replay-auditable.
- Persist envelope + lifecycle transitions + terminal outcome
- Assign deterministic per-lane sequence numbers
- Define stable on-disk schema + migration policy
Owner: TBD
Acceptance tests
- Restart preserves completed outcomes
- Journal entries are strictly ordered per lane
- One-version schema migration fixture passes
Decision gate (DG-L4.1)
- Backend choice: append-only JSONL vs embedded store (SQLite)
- Durability mode: strongest fsync vs throughput-biased
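To make the DG-L4.1 trade-off concrete, here is a minimal sketch of the append-only JSONL option with per-lane sequence numbers. It does not prejudge the decision: the backend is still open, and the class name, entry fields (`v`, `lane`, `seq`, `kind`, `payload`), and `fsync` flag are all assumptions. A torn final line after a crash is not handled here.

```python
import json
import os
import tempfile


class JsonlJournal:
    """Append-only journal; one JSON object per line (illustrative sketch)."""

    def __init__(self, path, fsync=True):
        self._path = path
        self._fsync = fsync  # True = strongest durability, False = throughput-biased
        self._next_seq = {}
        # Rebuild per-lane counters from disk so sequences continue after a restart.
        for entry in self.read_all():
            self._next_seq[entry["lane"]] = entry["seq"] + 1

    def append(self, lane, kind, payload):
        seq = self._next_seq.get(lane, 1)
        entry = {"v": 1, "lane": lane, "seq": seq, "kind": kind, "payload": payload}
        with open(self._path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry, sort_keys=True) + "\n")
            f.flush()
            if self._fsync:
                os.fsync(f.fileno())
        self._next_seq[lane] = seq + 1
        return entry

    def read_all(self):
        if not os.path.exists(self._path):
            return []
        with open(self._path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "journal.jsonl")
        journal = JsonlJournal(path)
        journal.append("session:a", "command_accepted", {"commandId": "c1"})
        journal.append("session:b", "command_accepted", {"commandId": "c2"})
        journal.append("session:a", "command_finished", {"commandId": "c1"})
        reopened = JsonlJournal(path)  # simulated restart
        entry = reopened.append("session:a", "command_accepted", {"commandId": "c3"})
        lane_a = [e["seq"] for e in reopened.read_all() if e["lane"] == "session:a"]
        return entry["seq"], lane_a
```

The `v` field is there so a schema migration fixture has something to key on; an embedded store such as SQLite would move ordering and durability concerns into the database instead.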
- Rehydrate journal on startup
- Classify pre-crash in-flight commands (`recoverable`/`failed`) with explicit reason
- Expose recovery summary event/endpoint
Owner: TBD
Acceptance tests
- Forced kill during active commands recovers without protocol corruption
- Recovery classification is deterministic across repeated boots
Decision gate (DG-L4.2)
- Side-effect policy for in-flight work: compensate vs mark failed
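Deterministic recovery classification can be sketched as a pure function over journal entries. This example assumes the "mark failed" side of DG-L4.2 for commands that had already started, which is still an open decision; the entry shape (`commandId`, `kind`) and the reason strings are hypothetical.

```python
def classify_in_flight(entries):
    """Map each command without a terminal entry to (classification, reason).

    `entries` must be in journal order. The result depends only on the
    journal content, so repeated boots over the same journal agree.
    """
    last_kind = {}
    for entry in entries:
        last_kind[entry["commandId"]] = entry["kind"]

    result = {}
    for command_id in sorted(last_kind):  # sorted for a stable summary order
        kind = last_kind[command_id]
        if kind == "command_finished":
            continue  # completed before the crash; nothing to recover
        if kind == "command_accepted":
            # Never started, so no side effects can have happened.
            result[command_id] = ("recoverable", "accepted_but_not_started")
        else:
            # Started with unknown side effects; assumed policy: mark failed.
            result[command_id] = ("failed", "started_without_terminal_outcome")
    return result
```

Because the function has no clock or randomness as input, the second acceptance test (deterministic classification across boots) reduces to calling it twice on the same journal.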
- Deterministic replay mode for audit/debug
- `get_command_history` API (session, commandId, time-window filters; bounded response)
- Redaction-aware export path for incident reports
Owner: TBD
Acceptance tests
- Replay output matches lane order + terminal outcomes
- History API round-trips trace fixtures
Decision gate (DG-L4.3)
- Replay placement: in-process feature vs offline tool
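A bounded history query with the three filters named above might look like the following. Only the name `get_command_history` and the filter dimensions come from this roadmap; the parameter names, the half-open time window, the `ts` field, and the `truncated` flag are assumptions for illustration.

```python
def get_command_history(entries, session=None, command_id=None,
                        since=None, until=None, limit=100):
    """Filter journal entries and cap the response size.

    The time window is half-open: since <= ts < until.
    """
    def matches(entry):
        return (
            (session is None or entry["session"] == session)
            and (command_id is None or entry["commandId"] == command_id)
            and (since is None or entry["ts"] >= since)
            and (until is None or entry["ts"] < until)
        )

    matched = [entry for entry in entries if matches(entry)]
    # Journal order is preserved; callers can tell when the bound cut results off.
    return {"entries": matched[:limit], "truncated": len(matched) > limit}
```

Returning an explicit `truncated` flag is one way to keep the response bounded without silently hiding entries from the caller.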
- Retention policy (time + size)
- Compaction that preserves replay correctness for retained outcomes + in-flight recovery
- PII redaction hooks before persistence/export
Owner: TBD
Acceptance tests
- Compaction remains replay-equivalent
- Redaction policy enforced on persistence + export
Decision gate (DG-L4.4)
- Compliance envelope: retainable vs prohibited data classes
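The "compaction remains replay-equivalent" test can be phrased as a comparison of two views of the journal. The sketch below assumes one possible compaction rule (drop non-terminal entries of finished commands) and one possible definition of replay equivalence (same terminal outcomes per lane, same in-flight set); both are assumptions, as are the entry fields.

```python
def compact(entries):
    """Drop accepted/started entries for commands that already have a terminal entry."""
    finished = {e["commandId"] for e in entries if e["kind"] == "command_finished"}
    return [
        e for e in entries
        if e["kind"] == "command_finished" or e["commandId"] not in finished
    ]


def replay_view(entries):
    """Return (terminal outcomes per lane in journal order, set of in-flight command ids)."""
    outcomes = {}
    seen, finished = set(), set()
    for e in entries:
        seen.add(e["commandId"])
        if e["kind"] == "command_finished":
            finished.add(e["commandId"])
            outcomes.setdefault(e["lane"], []).append((e["commandId"], e["success"]))
    return outcomes, seen - finished


def is_replay_equivalent(original, compacted):
    return replay_view(original) == replay_view(compacted)
```

In-flight commands keep all of their entries under this rule, so recovery classification after compaction sees the same input it would have seen before.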
Goal: adversarial confidence, not just happy-path confidence.
- Property tests: dependency causality
- Property tests: deterministic per-session order
- Property tests: sessionVersion monotonic/gap-free on success
Owner: TBD
Acceptance tests
- Invariants hold under randomized schedules
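A seeded property check for dependency causality could start from something like this. It uses only the standard library's `random` module rather than a property-testing framework, and the scheduler here is a toy model, not the real session manager; all names are hypothetical.

```python
import random


def random_schedule(commands, rng):
    """Start commands in a random order that respects dependsOn.

    `commands` maps command id -> list of dependency ids.
    """
    done, order = set(), []
    pending = dict(commands)
    while pending:
        ready = sorted(c for c, deps in pending.items() if set(deps) <= done)
        if not ready:
            raise ValueError("dependency cycle or missing dependency")
        chosen = rng.choice(ready)
        order.append(chosen)
        done.add(chosen)
        del pending[chosen]
    return order


def holds_dependency_causality(commands, order):
    """Invariant: every command appears after all of its dependencies."""
    position = {c: i for i, c in enumerate(order)}
    return all(
        position[dep] < position[c]
        for c, deps in commands.items()
        for dep in deps
    )


def check_seeds(commands, seeds):
    """Return the seeds whose schedule violates the invariant (empty list = pass)."""
    return [
        seed for seed in seeds
        if not holds_dependency_causality(commands, random_schedule(commands, random.Random(seed)))
    ]
```

Reporting the failing seeds, rather than a bare pass/fail, makes any violation reproducible, which matters more under randomized schedules than the check itself.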
- Inject transport drops, duplicates, delayed writes, partial frames
- Inject extension UI stalls and timeout races
- Inject journal write failures (L4+)
Owner: TBD
Acceptance tests
- No invariant regression under configured fault matrix
- Build randomized multi-client burst harness
- Differential-check against single-threaded reference model
Owner: TBD
Acceptance tests
- Differential checker reports zero semantic drift across seeded runs
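The differential check can be illustrated with two tiny models compared across seeded runs. Here `laned_model` is only a stand-in for the system under test (a real harness would drive the actual server), and the command shape `(session, text)` is an assumption.

```python
import random


def reference_model(commands):
    """Single-threaded reference: apply every command in arrival order."""
    state = {}
    for session, text in commands:
        state.setdefault(session, []).append(text)
    return state


def laned_model(commands, rng):
    """Stand-in for the system under test: lanes interleave arbitrarily,
    but each lane drains its own queue in arrival order."""
    queues = {}
    for session, text in commands:
        queues.setdefault(session, []).append(text)
    state = {}
    while queues:
        session = rng.choice(sorted(queues))
        state.setdefault(session, []).append(queues[session].pop(0))
        if not queues[session]:
            del queues[session]
    return state


def semantic_drift(seed, clients=3, burst=20):
    """True when the two models disagree on final per-session state for this seed."""
    rng = random.Random(seed)
    commands = [(f"s{rng.randrange(clients)}", f"msg-{i}") for i in range(burst)]
    return reference_model(commands) != laned_model(commands, rng)
```

Comparing final per-session state, not global interleaving, is the design choice that matters: cross-lane order is allowed to differ, per-lane order is not.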
- CI gate: invariant suite + chaos smoke
- Release gate: replay determinism non-regression
Owner: TBD
Acceptance tests
- Build fails on reliability SLO regression
- DL-001: journal backend + durability profile (L4.1)
- DL-002: recovery semantics for side-effecting in-flight commands (L4.2)
- DL-003: replay engine placement: in-process vs offline (L4.3)
- DL-004: retention/compliance data envelope (L4.4)
- Update this file in the same PR as architecture/protocol changes.
- No item closes without acceptance evidence (test name or artifact).
- Deferred findings require: rationale, owner, trigger, deadline, blast radius.
- Link incidents to roadmap items so failure becomes design input, not recurring surprise.