feat: compaction-resilient session state (Claude Code + Codex) #15

Open
gabelul wants to merge 2 commits into main from feat/compaction-resilience

gabelul (Owner) commented May 25, 2026

Why

The orchestrator/ideate flows run inside a host (Claude Code, Codex) that compacts the conversation when its context window fills. Today most of the work survives by luck (the PRD is on disk, screens live in Stitch's backend), but a long ideation session loses its gathered research, because the PRD is only written in the final phase. This PR makes the resilience deliberate instead of accidental, and ports it to Codex too.

How — two layers

State lives outside the conversation. One canonical writer, scripts/stitch-session.mjs, owns .stitch/session/state.json (active project, generated screens, applied design system, artifact pointers) and a running prd-draft.md. stitch-ideate flushes after each phase; stitch-orchestrator records the project/screens/design-system as it goes.
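A minimal sketch of what a single state shape and its update path could look like. The field names (`project`, `screens`, `designSystem`, `artifacts`, `updatedAt`) and the `recordScreen` helper are illustrative assumptions, not the actual schema in scripts/stitch-session.mjs. The real writer also persists the result to .stitch/session/state.json, which is left out here.

```javascript
// Hypothetical state shape; the real schema in stitch-session.mjs may differ.
function emptyState() {
  return { project: null, screens: [], designSystem: null, artifacts: {}, updatedAt: null };
}

// Return a new state with the screen recorded once, keyed by id.
// Kept pure on purpose: the caller decides when to serialize to state.json.
function recordScreen(state, screen, now = new Date()) {
  const others = state.screens.filter((s) => s.id !== screen.id);
  return { ...state, screens: [...others, screen], updatedAt: now.toISOString() };
}

const s1 = recordScreen(emptyState(), { id: "home", title: "Home" });
const s2 = recordScreen(s1, { id: "home", title: "Home v2" });
console.log(JSON.stringify(s2.screens)); // [{"id":"home","title":"Home v2"}]
```

Routing every update through one function like this is what makes a "single canonical writer" practical: re-recording the same screen replaces the entry instead of duplicating it.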

Hooks re-orient after a compaction. PreCompact snapshots the raw transcript (owner-only) and a RESUME.md breadcrumb; SessionStart (source: compact) injects a "you're mid-flow, re-read this, don't restart" line via additionalContext. Both no-op silently when there's no Stitch session and always exit 0.
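The re-orientation step can be sketched as a pure function from the hook's parsed JSON input to its JSON output. The `source` input field and the `hookSpecificOutput.additionalContext` output field follow Claude Code's hook format; the `state.project` field, the function name, and the message wording are assumptions for illustration.

```javascript
// Build the SessionStart hook output, or return null to stay silent.
// `input` is the parsed hook payload; `state` is the parsed state.json
// (null when no Stitch session exists). State fields are hypothetical and
// assumed to have been sanitized before reaching this function.
function buildReorientation(input, state) {
  if (!input || input.source !== "compact") return null; // only after a compaction
  if (!state || !state.project) return null;             // no session: no-op
  const lines = [
    `You are mid-flow on Stitch project "${state.project}".`,
    "Re-read RESUME.md and state.json before continuing.",
    "Do not restart the flow.",
  ];
  return {
    hookSpecificOutput: {
      hookEventName: "SessionStart",
      additionalContext: lines.join(" "),
    },
  };
}

const out = buildReorientation({ source: "compact" }, { project: "demo" });
if (out) console.log(JSON.stringify(out));
// A real hook script would exit 0 in every branch so it never blocks the host.
```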

Host support

  • Claude Code: hooks auto-register from hooks/hooks.json; skills reach the helper via ${CLAUDE_SKILL_DIR}.
  • Codex CLI: ships as a real Codex plugin (.codex-plugin/plugin.json). Codex also fires SessionStart:compact and provides CLAUDE_PLUGIN_ROOT as a compat env var, so the same hooks.json works. Codex skills can't reference a bundled script, so the helper is exposed as an on-PATH stitch-session command (npm bin + installer symlink). Skills call it through the ss wrapper, which falls back to the Claude Code path, and self-recover via ss read when hooks aren't trusted yet.
  • OpenCode / Crush: skill calls no-op cleanly; these keep the architectural resilience without the hook layer.
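The wrapper's fallback order described in the Codex bullet could look roughly like this: prefer the on-PATH `stitch-session` command, fall back to a script under `CLAUDE_SKILL_DIR`, and otherwise report that nothing is available. The function name, the `onPath` flag, and the default relative script path are assumptions, not the wrapper's actual code or the repository's real layout.

```javascript
// Decide how a skill should invoke the session helper.
// `onPath`: whether a `stitch-session` executable was found on PATH.
// `relScript`: helper path relative to the skill dir (hypothetical layout).
function resolveHelper(env, onPath, relScript = "scripts/stitch-session.mjs") {
  if (onPath) return { cmd: "stitch-session", args: [] };
  if (env.CLAUDE_SKILL_DIR) {
    const dir = env.CLAUDE_SKILL_DIR.replace(/\/+$/, ""); // drop trailing slashes
    return { cmd: "node", args: [`${dir}/${relScript}`] };
  }
  return null; // no helper available: the caller treats this as a no-op
}

console.log(resolveHelper({}, true));
console.log(resolveHelper({ CLAUDE_SKILL_DIR: "/x/skill/" }, false));
console.log(resolveHelper({}, false));
```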

Reviewed

OMC code-reviewer (APPROVE) + a Codex second opinion. Security model holds — state values are sanitized before context injection, no path traversal in snapshot filenames, no shell/JSON injection. The symlink main-guard bug (npm bin is a symlink, so a naive import.meta.url === argv[1] silently skipped main()) was caught and fixed with realpathSync.
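To make the sanitization claims concrete, here is one way state values and snapshot filename components could be cleaned. The specific rules (strip control characters, collapse whitespace, cap at 200 characters, restrict filenames to `[A-Za-z0-9_-]`) and both function names are assumptions; the PR only states that values are sanitized and that snapshot filenames are not open to path traversal.

```javascript
// Sanitize an untrusted state value for single-line context injection.
function sanitizeForContext(value, maxLen = 200) {
  return String(value ?? "")
    .replace(/[\u0000-\u001f\u007f]/g, " ") // control chars, including newlines
    .replace(/\s+/g, " ")                   // collapse runs of whitespace
    .trim()
    .slice(0, maxLen);
}

// Reduce an arbitrary id to a safe snapshot filename component:
// no slashes and no dots, so it cannot escape the snapshot directory.
function safeFileComponent(id) {
  const cleaned = String(id ?? "").replace(/[^A-Za-z0-9_-]/g, "_").slice(0, 64);
  return cleaned || "snapshot";
}

console.log(sanitizeForContext("a\nIGNORE\tb  ")); // a IGNORE b
console.log(safeFileComponent("../../etc/passwd")); // ______etc_passwd
```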

Verify before merge (live hosts — can't be tested offline)

  • Claude Code: /compact mid-flow → next turn knows the project
  • Codex: codex plugin add ., trust the hooks, /compact → re-orientation fires and ${CLAUDE_PLUGIN_ROOT} resolves

Honest limitations

  • Mid-phase reasoning gap. The per-phase flush captures structured state, not the reasoning the model develops within a phase (why it picked a direction, what surprised it during research). The durable fix is the exec plans pattern — a continuously-updated Decision Log + Surprises file written alongside prd-draft.md. Tracked as a follow-up in docs/dev-docs/exec-plans-followup.md; the raw-transcript backstop in snapshots/ is the recoverable-but-ugly stopgap until that lands.
  • Codex PostCompact stdout is ignored, so re-orientation rides on SessionStart:compact, not PostCompact.
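As a rough illustration of the planned exec plans follow-up (designed, not built), appending an entry to a five-section plan file could be sketched as below. The section names come from the follow-up's description; the `## ` heading syntax, the bullet format, and both function names are assumptions.

```javascript
const SECTIONS = ["Purpose", "Progress", "Surprises", "Decision Log", "Outcomes"];

// Build an empty plan with one heading per section (heading style is assumed).
function emptyPlan() {
  return SECTIONS.map((name) => `## ${name}\n`).join("\n");
}

// Append a bullet under the named section, leaving other sections untouched.
function appendEntry(plan, section, text) {
  if (!SECTIONS.includes(section)) throw new Error(`unknown section: ${section}`);
  const lines = plan.split("\n");
  const start = lines.indexOf(`## ${section}`);
  if (start === -1) throw new Error(`section missing from plan: ${section}`);
  // Insert before the next heading, or at the end of the file.
  let end = lines.findIndex((l, i) => i > start && l.startsWith("## "));
  if (end === -1) end = lines.length;
  // Step back over trailing blank lines so the bullet sits with its section.
  while (end > start + 1 && lines[end - 1] === "") end--;
  lines.splice(end, 0, `- ${text}`);
  return lines.join("\n");
}

let plan = appendEntry(emptyPlan(), "Decision Log", "chose dark theme");
plan = appendEntry(plan, "Decision Log", "dropped onboarding screen");
console.log(plan);
```

Writing entries at decision points, rather than once per phase, is what would let the reasoning behind a choice survive a mid-phase compaction.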

Details: docs/compaction-resilience.md. Landmines: docs/dev-docs/troubleshooting.md.

Gabi added 2 commits May 25, 2026 23:46
Active project, generated screens, and the PRD draft now survive a host
context compaction instead of living only in the conversation.

- scripts/stitch-session.mjs: one canonical state writer (CLI + helpers)
- hooks/: PreCompact snapshots a transcript backstop and breadcrumb;
  SessionStart re-orients the model on source=compact via additionalContext
- stitch-ideate / stitch-orchestrator: persist state as they go, plus a
  resume self-check for hosts where the hooks aren't trusted yet (Codex)
- on-PATH 'stitch-session' launcher (npm bin + installer symlink) so Codex
  skills, which can't reference a bundled script, can still write state
- .codex-plugin/ manifest so the hooks register under Codex CLI

State values are sanitized before context injection, the hooks always exit 0
(never block compaction), and the launcher's main-guard resolves symlinks.

Sharpens the 'honest limitations' framing in both the PR body and the
feature doc: the mid-phase reasoning loss isn't 'covered by the transcript
backstop', it's a real gap whose proper fix is the exec plans pattern.

Adds docs/dev-docs/exec-plans-followup.md with the planned shape:
.stitch/session/plan.md with five sections (Purpose / Progress /
Surprises / Decision Log / Outcomes), written at decision points via the
ss wrapper. Designed not built; the current PR stays scoped.

Pattern from https://developers.openai.com/cookbook/articles/codex_exec_plans