6 changes: 6 additions & 0 deletions AGENTS.md
@@ -53,3 +53,9 @@ Cut releases from `next` via an annotated `vX.Y.Z` tag — see `docs/releasing.m
- Keep skill instructions declarative. Let the binary own path resolution and mutation guards.
- Add skill smoke tests before changing first-officer or ensign command text.
- Preserve current FO/ensign write-scope rules: the first officer mutates entity state; ensigns write assigned code, reports, and artifacts.

## Runtime Support

- When adding a new runtime host or debugging first-contact runtime friction, read `docs/runtime-support.md` first.
- Use the documented "assume it already works" operating prompt before declaring a host impossible due to auth setup, extension/package discovery, or tool-shape mismatch.
- Prove runtime claims with live or fixture-backed durable state evidence: process exit, entity body, state-checkout git log, and clean status. Do not substitute transcript phrasing or instruction-prose grep for behavior.
184 changes: 184 additions & 0 deletions docs/runtime-support.md
@@ -0,0 +1,184 @@
# Adding runtime support

Runtime support means Spacedock can launch or drive a host as a first officer, dispatch ensigns through that host's native agent mechanism, and prove the resulting workflow state. A host is not supported because its instructions mention Spacedock; it is supported when a live or fixture-backed run exercises the host and verifies durable state.

Use this guide when adding a new host such as Pi, or when turning a spike into a supported runtime lane.

## Runtime layers

Add support in small layers. Each layer should have its own proof.

1. **Skill adapters**
   - Add `skills/first-officer/references/<host>-first-officer-runtime.md`.
   - Add `skills/ensign/references/<host>-ensign-runtime.md`.
   - Wire both from the corresponding `SKILL.md` runtime-adapter section.
   - The adapter must name the host's native mechanism. Do not emulate Claude `Agent`, `SendMessage`, `TeamCreate`, or `TeamDelete` unless the host really provides those tools.

2. **Dispatch host mode**
   - Teach `spacedock dispatch build` to accept `host: "<host>"` when the assignment shape differs by host.
   - Keep entity paths and worktree paths explicit, especially for split-root workflows (`state: .spacedock-state`).
   - Test both the positive shape and the banned-tool negative cases.

3. **Runtime contracts and registries**
   - If the host has long-lived workers, define the minimum worker record: label, substrate, run/session handle, entity, stage, state, and completion epoch.
   - Reject stale completion evidence after follow-up or reuse. A previous completion must never satisfy a later assignment.
   - If the host has a team API, adapt Spacedock lifecycle intents to the host's native action schema.

4. **Launch/install UX**
   - Add `spacedock <host>` only after the manual/live harness proves the runtime path.
   - Add `spacedock install --host <host>` only when the install path is known and can be checked without mutating unrelated global host state.
   - Add `doctor --host <host>` when there is a manifest, package, or runtime health check to verify.

5. **Live runner**
   - Prove the host with a live-gated test when the claim is runtime integration.
   - Prefer a temp workflow fixture, isolated host config/session dirs, and copied credentials over global host state.
   - Assert process exit, entity content, git log, and clean state. Do not let the test pass on transcript phrasing.

## Acceptance checklist

A new runtime support slice is not done until the entity or PR records evidence for each applicable item:

- Dispatch output uses the host-native contract and excludes incompatible host tool names.
- The first-officer and ensign skills load host runtime adapters.
- Split-root entity paths remain in the state checkout and are not rewritten into a code worktree.
- Follow-up/reuse cannot accept stale completion evidence, if reuse exists.
- Optional team substrates are represented as adapters over their real action schema.
- A live smoke proves the default dispatch path when runtime behavior is the claim.
- Install/launch commands exist only after the underlying mechanism is proven.

## Test strategy

Use the smallest proof at the same abstraction level as the claim:

- **Text claim:** parse or inspect the real instruction files.
- **Dispatch shape claim:** run `spacedock dispatch build` with a fixture and inspect emitted JSON/body.
- **Adapter claim:** table-test lifecycle intents to exact host-native payloads.
- **Registry claim:** unit-test persistence and stale epoch rejection.
- **Runtime claim:** live-gated host run that mutates a temp workflow and verifies durable state.

A substring search over code or prose is not proof of behavior. It is acceptable only when the claim itself is about text being present or absent.
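
As an illustration of the adapter-claim row, a table-driven check compares each lifecycle intent against an exact expected payload. Everything here is assumed for the sketch: the `Intent` values, the `hostPayload` shape, and the `adapt` function are hypothetical and do not describe any real host's action schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Intent is a hypothetical Spacedock lifecycle intent.
type Intent string

const (
	IntentDispatch Intent = "dispatch"
	IntentRelease  Intent = "release"
)

// hostPayload is a made-up host-native action shape, used only for illustration.
type hostPayload struct {
	Action string `json:"action"`
	Worker string `json:"worker"`
}

// adapt maps an intent to the illustrative host payload and rejects intents
// the (hypothetical) host cannot express.
func adapt(intent Intent, worker string) (string, error) {
	var action string
	switch intent {
	case IntentDispatch:
		action = "spawn"
	case IntentRelease:
		action = "stop"
	default:
		return "", fmt.Errorf("unsupported intent %q", intent)
	}
	b, err := json.Marshal(hostPayload{Action: action, Worker: worker})
	return string(b), err
}

func main() {
	// Each row pins an intent to the exact serialized payload.
	cases := []struct {
		intent Intent
		want   string
	}{
		{IntentDispatch, `{"action":"spawn","worker":"ensign-1"}`},
		{IntentRelease, `{"action":"stop","worker":"ensign-1"}`},
	}
	for _, c := range cases {
		got, err := adapt(c.intent, "ensign-1")
		if err != nil || got != c.want {
			fmt.Printf("FAIL %s: got %s, err %v\n", c.intent, got, err)
			continue
		}
		fmt.Printf("ok %s\n", c.intent)
	}
}
```

Comparing the full serialized payload, rather than searching for a field name, keeps the proof at the same abstraction level as the claim.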

## Manifesting from void

When a runtime seems unsupported on first contact, do not treat setup friction as proof the product path is impossible. Use a deliberate "assume it works" prompt to force the implementation loop to iron out auth, package paths, and tool-shape mismatches before declaring a blocker.

Use this operating prompt for the first implementation/validation loop:

```text
Assume <runtime> support is supposed to work. Do not treat missing polish, auth setup friction, or tool-shape mismatch as proof the runtime is impossible. In first-officer capacity, iron out the frictions:

- if auth is missing in an isolated harness, copy/reuse the existing host auth file correctly;
- if the dispatch substrate needs a local package/extension path, wire it explicitly;
- if the host tool shape differs from Claude/Codex, adapt to the host-native contract rather than emulating Claude tools;
- if a live test fails due to harness setup, fix the harness and rerun;
- only stop for a real product/design blocker, not for first-contact setup friction.
```

For Pi, the concrete version was:

```text
Assume Pi support is supposed to work. Do not treat missing polish, auth setup friction, or tool-shape mismatch as proof the runtime is impossible. In FO capacity, iron out the frictions:

- if Pi auth is missing in an isolated harness, copy/reuse the existing Pi OAuth auth file correctly;
- if the dispatch substrate needs a local package/extension path, wire it explicitly;
- if the Pi tool shape differs from Claude/Codex, adapt to the Pi-native contract rather than emulating Claude tools;
- if a live test fails due to harness setup, fix the harness and rerun;
- only stop for a real product/design blocker, not for first-contact setup friction.
```

That prompt matters because it changes the default failure interpretation. A missing `auth.json`, an extension not auto-discovered in a temp home, or a different subagent tool schema is harness work. A real blocker is a proven inability to launch, delegate, observe completion, or verify durable workflow state after the harness is correct.

## Pi live-smoke mechanism

The Pi proof used the live-gated test `TestLivePiSubagentEnsignSmoke`, run with:

```bash
go test -tags live -run TestLivePiSubagentEnsignSmoke ./internal/ensigncycle -v -count=1
```

The harness did this:

1. Resolve `pi` from `PATH` and the local Spacedock repo root.
2. Resolve the installed `pi-subagents` package root, defaulting to:

```text
~/.pi/agent/npm/node_modules/pi-subagents
```

3. Create temp runtime state:

```text
PI_CODING_AGENT_DIR=<temp>
PI_CODING_AGENT_SESSION_DIR=<temp>
--session-dir <temp>
HOME=<clean temp>
```

4. Copy only the operator's existing OAuth file into the isolated Pi home:

```text
~/.pi/agent/auth.json -> $PI_CODING_AGENT_DIR/auth.json
```

5. Launch `pi --print` with explicit local resources:

```text
--extension ~/.pi/agent/npm/node_modules/pi-subagents/src/extension/index.ts
--skill ~/.pi/agent/npm/node_modules/pi-subagents/skills/pi-subagents
--skill <spacedock checkout>/skills/first-officer
--skill <spacedock checkout>/skills/ensign
```

6. Create a temp split-root workflow:
   - `README.md` declares `state: .spacedock-state`.
   - The entity is folder-form in `.spacedock-state/pi-live-smoke/index.md`.
   - Both workflow root and state checkout are git repositories.

7. Ask the Pi parent to call `subagent(...)` exactly once.
8. Require the worker to append a stage report and commit only the state-checkout entity path.
9. Assert durable outcomes:
   - Pi process exits successfully.
   - Entity body contains the exact smoke marker and stage report shape.
   - State checkout git log contains the worker commit.
   - The entity path has no uncommitted changes.
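
The durable assertions in step 9 can be sketched as three small checks. This is not the real harness code: the function names `checkStageReport`, `commitLogged`, and `pathClean` are hypothetical, and the sketch assumes `git` is on `PATH`. The marker and report shape come from the worker instructions used by the live test.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

const smokeMarker = "PI-LIVE-SUBAGENT-ENSIGN-SMOKE"

// checkStageReport verifies that the entity body written by the worker holds
// the smoke marker, at least one DONE item, and a Summary subsection.
func checkStageReport(body string) error {
	for _, want := range []string{smokeMarker, "- DONE:", "### Summary"} {
		if !strings.Contains(body, want) {
			return fmt.Errorf("entity body missing %q", want)
		}
	}
	return nil
}

// commitLogged reports whether the state checkout's git log has a commit
// whose subject line equals subject.
func commitLogged(stateRoot, subject string) (bool, error) {
	out, err := exec.Command("git", "-C", stateRoot, "log", "--format=%s").Output()
	if err != nil {
		return false, err
	}
	for _, line := range strings.Split(string(out), "\n") {
		if line == subject {
			return true, nil
		}
	}
	return false, nil
}

// pathClean reports whether relPath has no uncommitted changes in stateRoot.
func pathClean(stateRoot, relPath string) (bool, error) {
	out, err := exec.Command("git", "-C", stateRoot, "status", "--porcelain", "--", relPath).Output()
	if err != nil {
		return false, err
	}
	return strings.TrimSpace(string(out)) == "", nil
}

func main() {
	// Only the pure body check runs here; the git helpers need a real checkout.
	body := "## Stage Report: implementation\n\n" + smokeMarker +
		"\n\n- DONE: appended report\n\n### Summary\nSmoke ok.\n"
	fmt.Println(checkStageReport(body))            // prints "<nil>"
	fmt.Println(checkStageReport("no report yet")) // prints the missing-marker error
}
```

All three checks read files and git state left behind by the run, so none of them depends on what the Pi transcript says happened.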

## Exact Pi parent prompt

The live test formats this prompt with repository and temp paths. Keep the structure when debugging Pi runtime support; only substitute the paths and marker.

```text
You are the Spacedock first officer for a live Pi smoke test.

Use the pi-subagents subagent(...) tool exactly once to dispatch one Pi ensign worker. Do not use or mention Claude Agent, SendMessage, TeamCreate, or TeamDelete tools.

Dispatch a worker with agent "delegate" and this task:

Load and follow the local Spacedock ensign skill at <repo>/skills/ensign/SKILL.md and the Pi ensign adapter at <repo>/skills/ensign/references/pi-ensign-runtime.md. This is a split-root Spacedock workflow.

Workflow directory: <workflowRoot>
State checkout: <stateRoot>
Entity file: <entityPath>
Target stage: implementation

Required worker actions:
1. Read the workflow README and entity file.
2. Do not edit YAML frontmatter.
3. Append an implementation stage report to the entity body containing the exact marker PI-LIVE-SUBAGENT-ENSIGN-SMOKE, at least one '- DONE:' item, and a '### Summary' subsection.
4. Commit only the entity path in the state checkout with message 'ensign: pi live smoke'. Use a path-scoped git add/commit for pi-live-smoke/index.md.
5. Return a concise completion result naming the entity file and commit evidence.

After subagent(...) returns, you as first officer must verify the entity file contains PI-LIVE-SUBAGENT-ENSIGN-SMOKE and verify the state checkout git log contains 'ensign: pi live smoke'. Exit successfully only after those durable checks pass.
```

## Skill install and load paths

For Pi, `spacedock pi` launches the proven front door by loading local resources explicitly:

```text
<spacedock checkout>/skills/first-officer
<spacedock checkout>/skills/ensign
~/.pi/agent/npm/node_modules/pi-subagents/skills/pi-subagents
~/.pi/agent/npm/node_modules/pi-subagents/src/extension/index.ts
```

`spacedock install --host pi` is an idempotent readiness check and setup guide for that substrate; it does not install a Claude/Codex-style marketplace plugin and does not accept `--plugin-dir`. Resolve the local skill checkout by running it from the checkout or setting `SPACEDOCK_REPO_ROOT`. `spacedock doctor --host pi` reports the Pi CLI, auth file, `pi-subagents` extension/skill, and local Spacedock skill health; it still accepts `--plugin-dir <spacedock checkout>` for local skill checkout diagnostics. Live tests should not mutate global `~/.pi/agent`; they should keep using isolated Pi homes with copied auth.
41 changes: 33 additions & 8 deletions internal/cli/cli.go
@@ -138,6 +138,7 @@ func newRootCommand(ctx context.Context, rawArgs []string, env []string, dir str
root.AddCommand(
newClaudeCommand(ctx, env, dir, stdout, stderr),
newCodexCommand(ctx, env, dir, stdout, stderr),
newPiCommand(ctx, env, dir, stdout, stderr),
newInstallCommand(ctx, env, stdout, stderr),
newDoctorCommand(ctx, env, stdout, stderr),
newStatusCommand(ctx, env, dir, stdin, stdout, stderr, runner),
@@ -197,14 +198,36 @@ func newCodexCommand(ctx context.Context, env []string, dir string, stdout, stde
return cmd
}

// newPiCommand wires `spacedock pi` to Pi's native skill/extension resource
// loading instead of Claude/Codex plugin or team-tool semantics.
func newPiCommand(ctx context.Context, env []string, dir string, stdout, stderr io.Writer) *cobra.Command {
	cmd := &cobra.Command{
		Use:                "pi [task] [-- pi-flags]",
		Short:              "Start Pi as your Spacedock first officer",
		GroupID:            "launch",
		DisableFlagParsing: true,
		RunE: func(cmd *cobra.Command, args []string) error {
			if wantsHelp(args) {
				return cmd.Help()
			}
			if code := runPi(ctx, args, dir, env, execPiRuntimeOps{}, stdout, stderr); code != 0 {
				return exitCodeError{code}
			}
			return nil
		},
	}
	setPiHelp(cmd, stdout)
	return cmd
}

// newInstallCommand wires `spacedock install` (the renamed `init`). Behavior is
// unchanged from init: install the per-host plugin then run doctor (claude), or
// emit the documented codex add prose. DisableFlagParsing keeps the post-subcommand
// argv verbatim for the existing hand-parsed runInit (so `--host`/`--check` parse
// exactly as before); `-h`/`--help` is intercepted here.
func newInstallCommand(ctx context.Context, env []string, stdout, stderr io.Writer) *cobra.Command {
cmd := &cobra.Command{
- Use: "install [--host claude|codex] [--check]",
+ Use: "install [--host claude|codex|pi] [--check]",
Short: "Install the Spacedock plugin for a host, then check it",
GroupID: "setup",
DisableFlagParsing: true,
@@ -213,18 +236,19 @@ func newInstallCommand(ctx context.Context, env []string, stdout, stderr io.Writ
return cmd.Help()
}
applyDevBranchOverride(env)
- if code := runInit(ctx, args, execHost{}, stdout, stderr); code != 0 {
+ if code := runInitWithPi(ctx, args, execHost{}, execPiRuntimeOps{}, env, stdout, stderr); code != 0 {
return exitCodeError{code}
}
return nil
},
}
- cmd.Flags().String("host", "claude", "Host to install the plugin for (claude or codex)")
+ cmd.Flags().String("host", "claude", "Host to install the plugin for (claude, codex, or pi)")
cmd.Flags().Bool("check", false, "Run the compatibility report without installing")
setSetupHelp(cmd, stdout, `
Examples:
spacedock install
spacedock install --host codex
spacedock install --host pi
spacedock install --check
`)
return cmd
@@ -234,7 +258,7 @@ Examples:
// `--host`/`--plugin-manifest` handling preserved verbatim.
func newDoctorCommand(ctx context.Context, env []string, stdout, stderr io.Writer) *cobra.Command {
cmd := &cobra.Command{
- Use: "doctor [--host claude|codex]",
+ Use: "doctor [--host claude|codex|pi]",
Short: "Check the installed plugin and this binary are compatible",
GroupID: "setup",
DisableFlagParsing: true,
@@ -243,18 +267,19 @@ func newDoctorCommand(ctx context.Context, env []string, stdout, stderr io.Write
return cmd.Help()
}
applyDevBranchOverride(env)
- if code := runDoctor(ctx, args, execHost{}, stdout, stderr); code != 0 {
+ if code := runDoctorWithPi(ctx, args, execHost{}, execPiRuntimeOps{}, env, stdout, stderr); code != 0 {
return exitCodeError{code}
}
return nil
},
}
- cmd.Flags().String("host", "claude", "Host to check (claude or codex)")
+ cmd.Flags().String("host", "claude", "Host to check (claude, codex, or pi)")
cmd.Flags().String("plugin-manifest", "", "Read this manifest directly instead of resolving the installed plugin")
setSetupHelp(cmd, stdout, `
Examples:
spacedock doctor
spacedock doctor --host codex
spacedock doctor --host pi --plugin-dir ./checkout
`)
return cmd
}
@@ -507,7 +532,7 @@ _spacedock() {
local cur prev verbs status_flags
cur="${COMP_WORDS[COMP_CWORD]}"
prev="${COMP_WORDS[COMP_CWORD-1]}"
- verbs="claude codex install doctor status new state completion dispatch --version --help"
+ verbs="claude codex pi install doctor status new state completion dispatch --version --help"
status_flags="--workflow-dir --next --next-id --boot --validate --archived --json --quiet --new --folder --set --where --archive --resolve --short-id --discover --root"
if [ "$COMP_CWORD" -eq 1 ]; then
COMPREPLY=( $(compgen -W "$verbs" -- "$cur") )
@@ -527,7 +552,7 @@ const zshCompletion = `#compdef spacedock
# spacedock zsh completion
_spacedock() {
local -a verbs status_flags
- verbs=(claude codex install doctor status new state completion dispatch --version --help)
+ verbs=(claude codex pi install doctor status new state completion dispatch --version --help)
status_flags=(--workflow-dir --next --next-id --boot --validate --archived --json --quiet --new --folder --set --where --archive --resolve --short-id --discover --root)
if (( CURRENT == 2 )); then
compadd -- $verbs
34 changes: 32 additions & 2 deletions internal/cli/help.go
@@ -19,9 +19,10 @@ const topLevelHelp = tagline + `
Launch
claude [task] [-- claude-flags] Start Claude Code as your Spacedock first officer
codex [task] [-- codex-flags] Start Codex as your Spacedock first officer
pi [task] [-- pi-flags] Start Pi as your Spacedock first officer
Setup
- install [--host claude|codex] Install the Spacedock plugin for a host, then check it
- doctor [--host claude|codex] Check the installed plugin and this binary are compatible
+ install [--host claude|codex|pi] Install the Spacedock plugin for a host, then check it
+ doctor [--host claude|codex|pi] Check the installed plugin and this binary are compatible
Workflow
status [args] Show or update workflow state
new [--folder] SLUG Create an entity from a stdin body (auto-discovers the workflow)
@@ -75,6 +76,35 @@ Examples:
})
}

// setPiHelp installs the Pi-specific launch help. Pi loads explicit skills and
// extensions instead of a Claude/Codex plugin manifest.
func setPiHelp(cmd *cobra.Command, w io.Writer) {
	cmd.Flags().String("plugin-dir", "", "Load a local Spacedock skill checkout")
	cmd.SetHelpFunc(func(c *cobra.Command, _ []string) {
		fmt.Fprint(w, tagline+`

Usage:
spacedock pi [task] [--plugin-dir <checkout>] [-- pi-flags]

Start Pi as your Spacedock first officer by loading the Pi-native pi-subagents
extension/skill and the Spacedock first-officer/ensign skills. The optional task
is appended to the launch prompt; everything after -- forwards verbatim to pi.

Flags:
`)
		fmt.Fprint(w, c.Flags().FlagUsages())
		fmt.Fprint(w, `
Forwarding:
Tokens before -- are spacedock's (the task + --plugin-dir). Tokens after --
forward verbatim to pi, e.g. --model, --print, or --session-dir.

Examples:
spacedock pi --plugin-dir ./checkout
spacedock pi "drive the workflow" --plugin-dir ./checkout -- --model google/gemini
`)
	})
}

// setSetupHelp installs a per-command help renderer for install/doctor: the
// command's own flags and an Examples block. A per-command HelpFunc is set so the
// root's grouped HelpFunc is not inherited.