Track onboard FSM migration stack and resume compatibility #4533

@cv

Summary

This issue tracks the stacked onboarding finite-state-machine (FSM) migration work and explains the current intended end-state for reviewers and maintainers.

The stack moves onboarding from a large manual sequence in src/lib/onboard.ts toward explicit FSM state results, a strict runtime/runner, and phase slices. Within this stack, fresh onboarding is migrated onto FSM slices. Resume/ahead-state flows intentionally stay on the compatibility path for now, so historical repair and backstop checks still execute even when the saved machine state is already ahead of the phase that contains them.

Stack overview

1. FSM target, metadata, and mappings

2. Runtime lifecycle/events/result primitives

3. Existing handlers return FSM results

4. Runner, record-only steps, and compatibility bridge

5. Ordered results and sequence adapter

6. Flow context, phase adapters, and slice helpers

7. Live fresh-flow migration and boundary tests

Current behavior after the stack

Fresh onboarding is migrated to FSM slices:

  1. Initial slice: preflight/gateway → stops at provider_selection
  2. Core slice: provider/inference + sandbox → stops at openclaw or agent_setup
  3. Final slice: branch setup + policies + finalization → completion
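
The slice boundaries above can be pictured as a runner that executes handlers until it reaches a stop state, then hands that state to the next slice. The following is a minimal sketch, not the code in src/lib/onboard.ts: `runSlice`, `SliceResult`, the handler wiring, and every state name other than provider_selection, openclaw, and agent_setup are hypothetical and exist only for illustration.

```typescript
// Hypothetical state list. Only provider_selection, openclaw, and
// agent_setup are named in this issue; the rest are illustrative.
type State =
  | "preflight" | "gateway" | "provider_selection" | "inference"
  | "sandbox" | "openclaw" | "agent_setup" | "policies"
  | "finalization" | "complete";

// A handler does one phase's work and reports the next state.
type Handler = (state: State) => State;

interface SliceResult {
  visited: State[]; // states whose handlers ran, in order
  stoppedAt: State; // boundary state handed to the next slice
}

// Run handlers from `start` until a state in `stopAt` is reached.
// The stop state's own handler is NOT run by this slice.
function runSlice(
  start: State,
  stopAt: ReadonlySet<State>,
  handlers: Partial<Record<State, Handler>>,
): SliceResult {
  const visited: State[] = [];
  let current = start;
  while (!stopAt.has(current)) {
    const handler = handlers[current];
    if (!handler) throw new Error(`no handler for state ${current}`);
    visited.push(current);
    current = handler(current);
  }
  return { visited, stoppedAt: current };
}

// Illustrative linear wiring (assumption: the agent_setup branch is taken).
const handlers: Partial<Record<State, Handler>> = {
  preflight: () => "gateway",
  gateway: () => "provider_selection",
  provider_selection: () => "inference",
  inference: () => "sandbox",
  sandbox: () => "agent_setup",
  openclaw: () => "policies",
  agent_setup: () => "policies",
  policies: () => "finalization",
  finalization: () => "complete",
};

const initial = runSlice("preflight", new Set<State>(["provider_selection"]), handlers);
const core = runSlice(initial.stoppedAt, new Set<State>(["openclaw", "agent_setup"]), handlers);
const finalSlice = runSlice(core.stoppedAt, new Set<State>(["complete"]), handlers);
```

In this sketch each slice starts exactly where the previous one stopped, so a boundary test only needs to assert on `stoppedAt` and `visited` for each slice.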

Step helpers can now record step status without mutating the durable machine state. Fresh flows use this record-only mode so that OnboardRuntime owns machine transitions.
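
A rough sketch of the record-only idea follows. It assumes a session object that stores both per-step status and the durable machine state. `Session`, `markStepComplete`, the `recordOnly` option, and `SketchRuntime` are hypothetical names, and `SketchRuntime` is only a stand-in for the role OnboardRuntime plays, not its actual API.

```typescript
type StepStatus = "pending" | "in_progress" | "complete" | "failed";

// Hypothetical session shape: per-step status plus durable machine state.
interface Session {
  machineState: string;
  steps: Record<string, StepStatus>;
}

interface RecordOptions {
  recordOnly: boolean;
}

// Hypothetical step helper. It always records the step status; it only
// advances the machine state when recordOnly is false (legacy behavior).
function markStepComplete(
  session: Session,
  step: string,
  nextState: string,
  opts: RecordOptions,
): void {
  session.steps[step] = "complete";
  if (!opts.recordOnly) {
    session.machineState = nextState;
  }
}

// Stand-in for the runtime that owns transitions in fresh flows.
class SketchRuntime {
  private readonly session: Session;

  constructor(session: Session) {
    this.session = session;
  }

  transition(to: string): void {
    this.session.machineState = to;
  }
}

const session: Session = { machineState: "preflight", steps: {} };

// Fresh flow: the helper records status but leaves the machine state alone.
markStepComplete(session, "preflight", "gateway", { recordOnly: true });
const stateAfterRecord = session.machineState;

// The runtime performs the transition explicitly.
new SketchRuntime(session).transition("gateway");
const stateAfterTransition = session.machineState;
```

Keeping the status write and the state transition separate means there is a single place where the machine state changes, which is the property the fresh-flow migration relies on.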

Resume compatibility is intentional

Resume/ahead-state flows still use the compatibility execution path. This is intentional because many historical fixes are repair/backstop checks inside phase bodies. Those checks must still run even if a saved session has already advanced beyond the phase's nominal FSM state.

Examples of this pattern include preflight/gateway backstops, provider repair, sandbox reuse/repair decisions, policy reconciliation, and final verification/recovery checks.
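
To illustrate why the compatibility path matters, the sketch below contrasts two ways of resuming a session whose saved state is ahead. It assumes each phase body contains a repair/backstop check followed by its main work. The phase names, `makePhase`, `runResumeCompat`, and `runResumeStrict` are hypothetical and do not correspond to functions in the stack.

```typescript
// A phase body receives whether its main work already completed in the
// saved session, and appends what it did to a log.
type PhaseBody = (alreadyDone: boolean, log: string[]) => void;

// Hypothetical phase factory: the backstop check lives inside the body
// and runs unconditionally; the main work is skipped when already done.
function makePhase(name: string): PhaseBody {
  return (alreadyDone, log) => {
    log.push(`${name}:backstop`);
    if (!alreadyDone) log.push(`${name}:work`);
  };
}

const phaseNames = ["preflight", "provider", "sandbox", "policies"];
const phases = phaseNames.map(makePhase);

// Compatibility-style resume: every phase body runs, so backstops in
// phases behind the saved state still execute.
function runResumeCompat(bodies: PhaseBody[], savedIndex: number): string[] {
  const log: string[] = [];
  bodies.forEach((body, i) => body(i < savedIndex, log));
  return log;
}

// Strict FSM-style resume: start at the saved state, so bodies behind it
// (and their backstops) never run.
function runResumeStrict(bodies: PhaseBody[], savedIndex: number): string[] {
  const log: string[] = [];
  bodies.slice(savedIndex).forEach((body) => body(false, log));
  return log;
}

// Saved session is already at "sandbox" (index 2).
const compatLog = runResumeCompat(phases, 2);
const strictLog = runResumeStrict(phases, 2);
```

Under these assumptions, the strict runner silently drops the preflight and provider backstops, which is the kind of regression the conservative resume path is meant to avoid.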

A future project can model resume repairs as first-class FSM behavior, but this stack keeps resume conservative to avoid regressions.

Suggested review order

  1. Review and merge the foundation PRs first, starting with #4361 (docs(onboard): document FSM migration target).
  2. Keep later PRs draft until their bases are merged/restacked.
  3. Review handler-result PRs independently; most are small behavior-preserving changes.
  4. Review the live-slice PRs more carefully because they affect the live onboard.ts execution path for fresh runs:
     • #4499 refactor(onboard): run initial phases through FSM slice
     • #4500 refactor(onboard): run core phases through FSM slice
     • #4507 refactor(onboard): run final phases through FSM slice
     • #4530 refactor(onboard): extract live FSM slice runner helper
     • #4531 test(onboard): cover live FSM slice boundaries

Follow-up work after this stack

  • Move large inline live phase wiring out of src/lib/onboard.ts into a dedicated module.
  • Clarify/rename the resume compatibility result application helper.
  • Audit high-level smoke coverage for fresh, retry, branch, and resume paths after the stack lands.
  • Decide later whether resume repair/backstop behavior should become explicit FSM policy.

Labels: Getting Started
