fix(sandbox): non-destructive stop + reap; auto-start on reattach (bd openlock-27e) by vessux · Pull Request #40 · vessux/openlock

vessux · 2026-05-26T07:09:07Z

Summary

openlock stop and the 30-minute idle reaper no longer destroy workspace volumes; they now call the fork's new openshell sandbox stop verb instead.
Reattaching to a session whose container has been stopped now auto-starts it via openshell sandbox start before waiting for the supervisor.
Help text in _descriptions.ts already advertised "preserves state" / "no removal" — those statements are now finally accurate.

Why

session-ops.ts was routing both stopSession and reapIdleStaleSessions through deleteSandbox, which calls openshell sandbox delete and tears down the container, the workspace volume, and the session-scoped handshake secret atomically. A workspace that survived a Ctrl-C would silently disappear after half an hour of inactivity. Discovered while reviewing post-#39 sidebar items.

Filed as bd openlock-27e.

Changes

src/sandbox/container.ts: new buildSandboxStopArgv / buildSandboxStartArgv argv builders and stopSandbox / startSandbox async wrappers mirroring deleteSandbox.
src/sandbox/session-ops.ts:
- stopSession now calls stopSandbox (was deleteSandbox).
- reapIdleStaleSessions now calls stopSandbox.
- cleanSession still calls deleteSandbox — that path is the explicit-teardown verb.
src/sandbox/session.ts: reattachSession calls startSandbox when the existing container's state is exited, before waitForSandboxReady. The supervisor lives inside the container, so without an explicit start the existing wait path would time out.
src/sandbox/container.test.ts: argv-shape tests for both new builders.
package.json: scope bun test to ./src/ ./tests/ so the runner does not pick up the openshell-fork checkout's z3-sys build artifacts (a .test.ts file ships inside vendored z3 sources).
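
A minimal sketch of the stop/start/delete split described in the list above, assuming each verb takes the sandbox name as a single positional argument. Only buildSandboxStopArgv and buildSandboxStartArgv are names from this PR; their signatures, the OPENSHELL_BIN constant, buildSandboxDeleteArgv, and argvForSessionVerb are hypothetical and exist only to show which session operation maps to which CLI verb.

```typescript
// Hypothetical sketch: builder names come from this PR, signatures are assumed.
type Argv = readonly string[];

// Assumed binary name; the real code may resolve a fork binary path instead.
const OPENSHELL_BIN = "openshell";

function buildSandboxStopArgv(name: string): Argv {
  // Non-destructive: stops the container, leaves the workspace volume alone.
  return [OPENSHELL_BIN, "sandbox", "stop", name];
}

function buildSandboxStartArgv(name: string): Argv {
  // Restarts a previously stopped container.
  return [OPENSHELL_BIN, "sandbox", "start", name];
}

// Hypothetical builder for the destructive verb used by the clean path.
function buildSandboxDeleteArgv(name: string): Argv {
  return [OPENSHELL_BIN, "sandbox", "delete", name];
}

type SessionVerb = "stop" | "reap" | "clean";

// Hypothetical helper: stop and reap preserve state, clean is explicit teardown.
function argvForSessionVerb(verb: SessionVerb, name: string): Argv {
  switch (verb) {
    case "stop":
    case "reap":
      return buildSandboxStopArgv(name);
    case "clean":
      return buildSandboxDeleteArgv(name);
  }
}
```

Keeping the argv builders pure makes the argv-shape tests trivial, since no process has to be spawned to check which verb a session operation resolves to.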

Fork dependency

The CLI verbs openshell sandbox stop / openshell sandbox start are added by vessux/OpenShell#3. Dev-mode openlock builds pick up the new binaries from the local openshell-fork/target/debug checkout, but a production install will need a fork release tag bump (OPENSHELL_FORK_TAG in src/sandbox/fork-binaries.ts) once that fork PR merges and a release is cut.

This PR intentionally does not bump the fork tag — that should land as a separate small commit after the fork release is available.

Test plan

bun run lint — only pre-existing biome warning in tests/integration/post-create-openrouter-real.test.ts:154; no new warnings.
bun run typecheck — clean.
bun run knip — clean.
bun run test — 579 pass / 0 fail / 6 skip across 70 src files.
Manual smoke once the fork tag is bumped: run openlock sandbox foo, exit, run openlock stop foo, then run openlock sandbox foo again; the second run should reattach with the workspace intact instead of failing.
Manual smoke: openlock reap followed by reattach — same expectation.

…enlock-27e)

Bug: `openlock stop <name>` and the 30-minute idle-reaper both routed through `deleteSandbox`, which tears down the container, the workspace volume, and the session-scoped handshake secret in one call. Help text claimed "preserves state" and "no removal"; both were lies. A workspace that survived a `Ctrl-C` would silently disappear after half an hour of being idle.

Fix:
- container.ts: add buildSandboxStopArgv / buildSandboxStartArgv and stopSandbox / startSandbox wrappers around the new openshell sandbox stop / start verbs (vessux/OpenShell#3).
- session-ops.ts: stopSession and reapIdleStaleSessions now call stopSandbox instead of deleteSandbox. cleanSession keeps the destructive deleteSandbox call (that is the explicit-teardown path).
- session.ts: when reattaching to a session whose container is in "exited" state, call startSandbox before waitForSandboxReady. The supervisor lives inside the container, so without an explicit start the existing wait path would time out.

The existing _descriptions.ts already advertises "preserves state" and "no removal"; those statements are now finally true.

Other:
- package.json: scope `bun test` to `./src/ ./tests/` so the test runner does not pick up the openshell-fork checkout's z3-sys build artifacts (a TypeScript test file shipped inside the vendored z3 source).

The wired-up TS layer talks to the fork's new Stop/Start RPCs via the openshell CLI. Dev-mode builds (this repo plus a sibling openshell-fork checkout) pick up the binaries from openshell-fork/target/debug; production installs will need a fork release tag bump once the upstream fork PR merges.

Fork v0.5.0 ships the `openshell sandbox stop` / `openshell sandbox start` verbs that this branch's new `stopSandbox`/`startSandbox` wrappers call. Production installs need the binary tarballs that v0.5.0 publishes; dev-mode picks up the local `openshell-fork/target/debug` checkout regardless and ran fine without this bump. See vessux/OpenShell#3 + release notes for v0.5.0.

`openshell sandbox <verb>` operates on the gateway-registered sandbox name (the one passed to `--name` at create), not the underlying podman container name `openshell-sandbox-<name>`. openlock was prefixing every name with `SANDBOX_PREFIX` before invoking the binary, so the gateway always answered NotFound:

    $ openshell sandbox get openshell-sandbox-foo   → NotFound
    $ openshell sandbox get foo                     → Ready

`deleteSandbox` masked this by silently swallowing stderr/exit code (`{stdout:"ignore", stderr:"ignore"}`), so `openlock clean` reported success while leaving the podman container orphaned. This was discovered when trying to smoke-test PR-#40's auto-start path, which depends on `getSandboxState` returning a non-`missing` value.

Drop the prefix from every callsite that talks to the openshell binary (session.ts / session-ops.ts / cli/exec / cli/shell), drop the unused `SANDBOX_PREFIX` constant + file (integration tests hardcode the container name where they hit podman directly), and surface non-NotFound delete failures instead of swallowing them.

Also fixes `getSandboxState`:
- `openshell sandbox get` has never accepted `-o json` (only `--policy-only` on top of `[name]`), so the JSON.parse path always exit-coded to `"missing"`. Drop the flag, parse the human "Phase: X" line instead.
- Add ANSI-stripping + a `parseSandboxGetPhase` unit test.

Without this, PR-#40's `if (state === "exited") await startSandbox(...)` in `reattachSession` is dead code: `getSandboxState` returns `"missing"` first and we exit before reaching the start.
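
A rough sketch of the "strip ANSI, then read the Phase line" approach mentioned above. The names stripAnsi and parsePhaseLine are hypothetical, and this is not the PR's `parseSandboxGetPhase`; the only thing taken from the source is that the human-readable output contains a "Phase: X" line. The sample output in the usage lines is invented for illustration.

```typescript
// Matches ANSI CSI escape sequences such as colour codes ("\x1b[32m").
const ANSI_CSI = /\x1b\[[0-9;?]*[ -\/]*[@-~]/g;

// Hypothetical helper: remove terminal styling before any text matching.
function stripAnsi(text: string): string {
  return text.replace(ANSI_CSI, "");
}

// Hypothetical helper: return the raw phase word from the first "Phase: X"
// line, or null when the output contains no such line.
function parsePhaseLine(output: string): string | null {
  for (const line of stripAnsi(output).split(/\r?\n/)) {
    const phase = /^\s*Phase:\s*(\S+)/.exec(line)?.[1];
    if (phase !== undefined) return phase;
  }
  return null;
}

// Usage with invented sample output (the real layout may differ):
const sample = "Name: foo\n\x1b[1mPhase:\x1b[0m \x1b[32mReady\x1b[0m\n";
console.log(parsePhaseLine(sample));
```

Stripping escapes before matching matters because a styled label or value would otherwise break a plain "Phase: X" pattern and make the caller fall through to its not-found branch.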

Fork v0.5.0 has no SandboxPhase::Stopped — an explicit `openshell sandbox stop` transitions phase Ready -> Error because the watch loop treats any ContainerExited as a terminal failure. Before this, parseSandboxGetPhase mapped Error to "other"; reattachSession then printed "Attaching to running" but did not invoke startSandbox, and waitForSandboxReady hung probing a stopped container. Mapping Error to "exited" lets reattach trigger startSandbox; genuine provisioning failures still surface when startSandbox/waitForSandboxReady fail. Tracked as bd openlock-z9i (fork-side fix: add Stopped variant or intentional-stop flag).
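
The reattach decision described above can be sketched roughly as follows, assuming the sandbox operations are injected so the flow can be exercised without the openshell binary. phaseToState, SandboxOps, and reattach are hypothetical names, and the "running" state label is an assumption; only the Ready and Error phases and the "exited" / "missing" / "other" states appear in the source.

```typescript
type SandboxState = "running" | "exited" | "missing" | "other";

// Hypothetical mapping from a parsed phase word to a coarse state.
function phaseToState(phase: string | null): SandboxState {
  if (phase === null) return "missing";
  switch (phase) {
    case "Ready":
      return "running";
    case "Error":
      // Fork v0.5.0 reports an intentional stop as Error, so treat it as
      // restartable; real failures surface when start or the wait fails.
      return "exited";
    default:
      return "other";
  }
}

// Hypothetical seam standing in for getSandboxState / startSandbox /
// waitForSandboxReady.
interface SandboxOps {
  getState(name: string): Promise<SandboxState>;
  start(name: string): Promise<void>;
  waitReady(name: string): Promise<void>;
}

async function reattach(name: string, ops: SandboxOps): Promise<void> {
  const state = await ops.getState(name);
  if (state === "missing") {
    throw new Error(`sandbox ${name} not found`);
  }
  if (state === "exited") {
    // The supervisor lives inside the container, so start it before waiting.
    await ops.start(name);
  }
  await ops.waitReady(name);
}
```

The trade-off in this sketch is the same one the commit message names: an Error phase is treated optimistically as a stopped container, and a genuinely broken sandbox is reported one step later by the start or wait call.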

vessux added 3 commits May 26, 2026 09:08

vessux closed this May 26, 2026

vessux reopened this May 26, 2026

vessux merged commit c0d7241 into main May 26, 2026
5 checks passed

vessux deleted the feat/sandbox-stop-start branch May 26, 2026 13:49

vessux merged 4 commits into main from feat/sandbox-stop-start
