Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
148 changes: 29 additions & 119 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,153 +1,63 @@
# Spacedock

Spacedock runs agent work through defined stages, so you can delegate in batches and make only the calls that need your judgment.
Spacedock runs agent work through defined stages so you can delegate in batches and only weigh in on the calls that need your judgment.

The first officer coordinates the flow: it dispatches workers to advance each work item and surfaces approval-worthy decisions to you, the captain, so batches move forward without pulling you into every session.
You queue up the work, the agents move each item through its stages, and you get pulled in at approval gates with a stage report (findings, verdicts, artifacts, anomalies) ready for a yes or a no. No raw output dumps, no babysitting one chat at a time.

**You want Spacedock if:**
## Is this for me?

- **You're a human tired of context-switching** between agent sessions to make approval decisions. Spacedock batches the decisions an agent wants to hand back to you and presents each with evidence, so you approve or redirect without re-loading context.
- **You're an agent delegating repeatable work** and want a structured place to queue up approval-worthy decisions for your human without interrupting them for every tiny step.
**You triage Gmail every morning.** Spacedock can fetch your inbox, sort receipts toward a tax folder, archive newsletters, and surface anything that smells like customer support back to you with a proposed reply. You approve the batch; it executes.

## What's Different
**You are planning two weeks in Japan.** Spacedock can research neighborhoods, draft an itinerary, and stop at the decisions that need you (which hotel in Kyoto, which day trip out of Tokyo). Once bookings are locked, the next stage produces a packing list and a daily run sheet.

- **Approval gates with structured evidence.** Every gate comes with a stage report: findings, verdicts, artifacts, anomalies. You approve, redirect, or bounce back faster than sifting through raw output or a sprawling log.
- **Adversarial review gates.** Review stages can be configured to push back rather than rubber-stamp. They target sycophancy, thin evidence, and work that looks busy without proving its claim. Work clears the gate when it survives the challenge.
- **Plan in batches, decide as work flows back.** Queue multiple work items at once; agents advance each through its stages independently while you handle approvals as they surface.
- **The workflow learns with you.** The first officer helps you adjust it when patterns emerge: a stage that never fires, a gate that keeps bouncing the same issue back, a schema field that always ends up empty.
- **Isolation when needed.** Stages that touch shared state run in their own git worktree; lightweight stages run inline. You declare which is which, and the first officer enforces it.
- **Work doesn't die at the context limit.** When an agent runs out of context, Spacedock swaps in a successor that carries forward what's in flight. Nothing gets lost in the handoff.
**Your inbound code-review queue keeps piling up.** Spacedock can pull each open PR, run an adversarial review, queue the verdict for a thumbs up or down, and post the approved review to GitHub.

## Quick Start
## What is Spacedock?

**Prerequisites:** Claude Code or Codex CLI.
A workflow is a directory of markdown work item files plus a README that defines the stages, the schema, and the gates. There are three roles: Captain (you), First Officer (orchestrator), Ensign (worker). The First Officer reads the workflow README, dispatches Ensigns for items ready to advance, and pauses at gates to ask the Captain to approve or reject. (Rejection can bounce the item back to an earlier stage for revision; details in `docs/USAGE.md`.)

### Claude Code
Spacedock is not a chat agent and not a single-skill loop. Gates present structured evidence so the Captain decides on findings, not transcript. Review gates can be adversarial: they push back instead of rubber-stamping. The Captain queues many work items and decides as each surfaces, instead of running one session at a time. When a pattern emerges (a stage that never fires, a gate that keeps bouncing), `/spacedock:refit` adjusts the workflow without losing local mods. Stages that touch shared state run in their own git worktree; lighter stages run inline. When an Ensign hits the context limit, a successor picks up the in-flight state from the markdown files and carries on.

1. Install the plugin:

```bash
claude plugin marketplace add clkao/spacedock && claude plugin install spacedock
```

2. Commission a workflow with your own mission prompt:

```bash
claude --agent spacedock:first-officer "/commission <your mission prompt>"
```

3. Or start from one of these example workflows — copy and run:

**Email triage:**
```bash
claude --agent spacedock:first-officer "/commission Email triage: fetch, categorize, and act on Gmail inbox. Entity: a batch of up to 50 emails. Stages: intake (use gws-cli, triage in:inbox and read email body if necessary, categorize, propose action per email, output as table) → approval (Captain reviews proposal) -> execute (carry out approved actions, do not mark as read). Use gws-cli (https://github.com/googleworkspace/cli/tree/main/skills/gws-gmail), GOOGLE_WORKSPACE_CLI_CONFIG_DIR=~/.config/gws/<account> for different accounts. Walk me through gws-cli setup if not already done."
```

**[Superpowers](https://github.com/obra/superpowers)-style dev task workflow:**
```bash
claude --agent spacedock:first-officer "/commission Dev task workflow: superpowers-style design → plan → implement → review with ## Design and ## Implementation Plan inlined in the entity body (no separate spec/plan files), implement on isolated worktrees with strict TDD, design and review gated for approval."
```

### Codex CLI

1. Clone Spacedock and start Codex from the repo root:

```bash
git clone https://github.com/clkao/spacedock.git /path/to/spacedock
cd /path/to/spacedock
codex --enable multi_agent
```
## Quick start

2. Restart Codex if it was already open, then open `/plugins` and install **Spacedock** from the repo-local marketplace entry.
Prerequisites: [Claude Code](https://docs.claude.com/en/docs/claude-code) installed (Anthropic's CLI; runs on macOS, Linux, and Windows via WSL). For the email example: a working `gws-cli` for Gmail (setup notes: https://github.com/googleworkspace/cli/tree/main/skills/gws-gmail; this includes a one-time Google OAuth flow).

The authoritative Codex plugin manifest is `.codex-plugin/plugin.json`, and the authoritative local catalog is `.agents/plugins/marketplace.json`. That catalog points to `./plugins/spacedock`, which is a checked-in symlink to the repository root so Codex loads the real plugin package directly.
Once those are in place, the steps below take about five minutes. With a clean machine and no tools installed, plan on twenty minutes total. Nothing runs against your inbox until the First Officer pauses at a gate and you approve.

3. Prompt Codex to use the first-officer skill and commission your workflow:
1. Install the plugin:

```bash
Use the spacedock:first-officer skill to run /commission <your mission prompt> in this directory.
claude plugin marketplace add clkao/spacedock && claude plugin install spacedock
```

Legacy compatibility: older Codex setups can still expose `~/.agents/skills/spacedock` directly:
2. Commission an email triage workflow:

```bash
mkdir -p ~/.agents/skills
ln -s /path/to/spacedock/skills ~/.agents/skills/spacedock
claude --agent spacedock:first-officer "/spacedock:commission Email triage: fetch, categorize, and act on Gmail inbox. Entity: an email batch (up to 50 messages). Stages: intake (use gws-cli, triage in:inbox and read email body if necessary, categorize, propose action per email, output as table) then approval (Captain reviews proposal) then execute (carry out approved actions, do not mark as read). Use gws-cli (https://github.com/googleworkspace/cli/tree/main/skills/gws-gmail), GOOGLE_WORKSPACE_CLI_CONFIG_DIR=~/.config/gws/<account> for different accounts. Walk me through gws-cli setup if not already done."
```

The `.claude-plugin/plugin.json` and `.claude-plugin/marketplace.json` files remain synchronized legacy mirrors of the Codex-first metadata for migration compatibility.

> Codex multi-agent is experimental. The Claude Code path is the primary supported surface.
The First Officer commissions the workflow, dispatches an Ensign to gather your inbox, then pauses with a categorized proposal and waits for your approval before touching anything.

## What a Work Item Looks Like
### If you are a developer

```yaml
---
id: 054
title: Session debrief command
status: done
---
Same install line. Commission with this mission instead:

Problem statement, design notes, acceptance criteria, and stage reports
all live in the body of this file as the work moves through stages.
```bash
claude --agent spacedock:first-officer "/spacedock:commission Dev task workflow: superpowers-style design then plan then implement then review with ## Design and ## Implementation Plan inlined in the entity body (no separate spec/plan files), implement on isolated worktrees with strict TDD, design and review gated for approval."
```

See [a completed example](https://github.com/clkao/spacedock/blob/main/docs/plans/_archive/session-debrief.md) from Spacedock's own workflow.

## Concepts

| Concept | What it is |
|---------|------------|
| **Mission** | The purpose of the workflow: what it processes and what it delivers. |
| **Work item** | A single markdown file describing one thing being worked on: an email batch, a dev task, a draft. |
| **Workflow** | A directory of work items plus the README that defines stages, schema, and gates. |
| **Stage** | A named step a work item passes through (e.g. design, implement, review). |
| **Gate** | A pause point at a stage boundary where the captain approves, redirects, or bounces the work back. |

*"I am the master of my fate, I am the captain of my soul."* -- William Ernest Henley, *Invictus*

| Role | Who |
|---------|------------|
| **Captain** | You. You define the mission and make the calls at approval gates. |
| **First Officer** | The orchestrator agent that manages the workflow and reports to you at gates. |
| **Ensign** | The worker agent that moves a single item forward through one stage. |

## How It Works

The first officer reads the workflow README, checks work item statuses, and dispatches ensigns for items ready to advance. Stages that need isolation (typically implementation work with commits) run inside their own git worktree; lightweight stages (design, review, triage) run inline. At approval gates the first officer pauses and presents the ensign's stage report for your review: approve, redo with feedback, or reject. Rejected work automatically bounces back for revision in a fresh round of the earlier stage, with a hard cap so you never get stuck in an infinite loop. When you end a session, `/spacedock:debrief` captures what happened (commits, task state changes, decisions, open issues) into a record the next session picks up automatically (see [an example debrief](https://github.com/clkao/spacedock/blob/main/docs/plans/_debriefs/2026-04-09-01.md) from a real session).

## What Gets Generated

When you run `/spacedock:commission`, the following files are added to your workflow directory:

- **`{dir}/README.md`**: workflow schema, stage definitions, and work item template
- **`{dir}/*.md`**: seed work item files
- **`{dir}/_mods/`**: local modifications carried across refits

**Shipped by the Spacedock plugin:**

- **`spacedock:first-officer`**: the orchestrator agent that reads workflow state and dispatches ensigns
- **`spacedock:ensign`**: the worker agent dispatched to do stage work
- **`skills/commission/bin/status`**: read and advance workflow state without switching to a separate tracking tool

The generated workflow README is the single source of truth. The first officer reads it to know what stages exist, what quality criteria to enforce, and when to pause for your review.

Workflows can extend their own behavior via markdown mod files (`_mods/*.md`) that declare hook handlers for lifecycle events like startup, idle, or merge. For example, the [`pr-merge` mod](docs/plans/_mods/pr-merge.md) opens a pull request automatically when a completed worktree branch is ready to land.

When a new Spacedock release is available, use `/spacedock:refit` to upgrade your workflow scaffolding while keeping local modifications.
The First Officer commissions a generic dev workflow with four stages, opens a worktree for implementation, and pauses at design and review for your call. For deeper dev shapes (PR review queue, Linear ticket ship, cross-repo upgrade), see [`docs/EXAMPLES.md`](docs/EXAMPLES.md).

## Tips
## Codex CLI

- **Run Spacedock inside a sandbox.** Recommended: [agent-safehouse](https://github.com/eugene1g/agent-safehouse) (macOS), [packnplay](https://github.com/obra/packnplay), a devcontainer, or a VM.
- **Talk directly to an ensign.** Claude Code supports agent team chat: while a dispatched ensign is running, you can `Shift+Up` / `Shift+Down` to switch panes and give the ensign feedback directly instead of routing everything through the first officer.
The Codex CLI path is supported but experimental. See the Codex section of [`docs/USAGE.md`](docs/USAGE.md) for the setup steps and the plugin manifest layout.

## Use Cases
## Where to go next

- **Email triage**: classify and route incoming messages with AI agents, escalate to a human at review gates
- **Dev task workflow**: [superpowers](https://github.com/obra/superpowers)-style design -> plan -> implement -> review with approval gates
- **Content publishing**: manage drafts through editing, review, and publication stages
- **Research workflows**: process papers or data through analysis, synthesis, and validation
- **Dogfooding Spacedock's own development.** Spacedock is self-hosted. Its own development runs on a plain text workflow at [`docs/plans/`](docs/plans/). Run `skills/commission/bin/status --workflow-dir docs/plans` to see the current state.
- [`docs/GETTING_STARTED.md`](docs/GETTING_STARTED.md): a guided first run, end to end.
- [`docs/USAGE.md`](docs/USAGE.md): the mental model, the YAML schema, and stage flags.
- [`docs/EXAMPLES.md`](docs/EXAMPLES.md): eight worked examples across household, knowledge work, and development.
- [`docs/PROMPTS.md`](docs/PROMPTS.md): the fill-in-the-blank Initiating Prompt template and persona variants.

## License

Expand Down
Loading
Loading