diff --git a/docs/USER_GUIDE.md b/docs/USER_GUIDE.md new file mode 100644 index 0000000..7b16776 --- /dev/null +++ b/docs/USER_GUIDE.md @@ -0,0 +1,792 @@ +# ClawControl — User Guide + +A task-oriented walkthrough of ClawControl. Pair it with [`README.md`](../README.md) +for the project overview, [`docs/architecture.md`](architecture.md) for the +internals, and [`docs/doctor-checks.md`](doctor-checks.md) for the health +check reference. + +## Contents + +1. [Install](#1-install) +2. [First launch — the wizard](#2-first-launch--the-wizard) +3. [Tour of the UI](#3-tour-of-the-ui) +4. [Common workflows](#4-common-workflows) +5. [CLI reference](#5-cli-reference) +6. [Keyboard shortcuts](#6-keyboard-shortcuts) +7. [Configuration](#7-configuration) +8. [Backups & restore](#8-backups--restore) +9. [Updates](#9-updates) +10. [One-click scaling](#10-one-click-scaling) +11. [Going to production](#11-going-to-production) +12. [Troubleshooting](#12-troubleshooting) +13. [FAQ](#13-faq) + +## 1. Install + +Three install paths. Pick whichever fits the host. + +### npm (recommended for personal / dev machines) + +```sh +npm install -g clawcontrol +clawcontrol start +``` + +The first run launches the onboarding wizard (see §2). Subsequent runs skip +straight to the server + opens `http://localhost:3000` in your default +browser. + +> **Heads up.** The npm package name `clawcontrol` is already taken on the +> public registry by an unrelated package. The real publish will use a +> scoped name (`@clawcontrol/cli` or similar). Override the package the +> installer fetches with `CLAWCONTROL_PACKAGE=@your-scope/clawcontrol`. + +### One-shot installer (Linux / macOS / WSL) + +```sh +curl -fsSL https://example.com/install.sh | bash +``` + +What it does: + +1. Detects the OS (macOS / Linux / WSL / Windows-shell). +2. Verifies Node.js ≥ 20 — installs via `nvm` or Homebrew if missing. +3. `npm install -g clawcontrol` (or `$CLAWCONTROL_PACKAGE`). +4. Runs `clawcontrol start`. + +Pass `--no-start` if you only want to install: `bash -s -- --no-start`. + +### Docker + +```sh +docker compose up -d # just clawcontrol +docker compose --profile ollama up -d # + local Ollama +docker compose --profile openclaw up -d # + OpenClaw runtime +docker compose --profile ollama --profile openclaw up -d # the works +``` + +The `clawcontrol` container exposes ports `3000` (UI) and `3001` (API + WS). +Persistent state lives in the `clawcontrol-data` named volume — `docker +compose down` keeps it, `down -v` wipes it. + +### From source (contributors) + +```sh +git clone https://github.com/clawcontrol/clawcontrol.git +cd clawcontrol +pnpm install +pnpm start # server :3001 + UI :3000 with hot reload +``` + +Requirements: Node ≥ 20, pnpm ≥ 9. + +## 2. First launch — the wizard + +The first time you run `clawcontrol start` (and `~/.clawcontrol/config.json` +doesn't exist), an interactive wizard collects everything the server needs. + +The whole thing takes well under two minutes. None of the answers are +permanent — you can change them later in **Settings** or by editing +`~/.clawcontrol/config.json` directly. + +| Prompt | What it's for | Default | +|---|---|---| +| **Gateway URL** | Where OpenClaw's WebSocket lives. Auto-detected when `openclaw` is on PATH or `~/.openclaw/` exists. | `ws://localhost:3002` | +| **Admin password** | Optional. If supplied, the wizard generates a random bearer token AND stores an `argon2id` hash of the password for future password-based login. Leave blank for "open mode" (no auth required — fine for localhost-only setups). | empty (open mode) | +| **Backup storage** | `Local` (writes tar.gz to `~/.clawcontrol/backups/`) · `S3` (asks for bucket + region + access keys, written to a separate 0600 file) · `Disabled` (no scheduled backups). | Local | +| **Default LLM provider** | One of: Anthropic / OpenAI / Gemini / OpenRouter / Ollama / Skip. Picking a provider triggers the next prompt. | Anthropic | +| **API key** | Sent to the server on first boot, encrypted with AES-256-GCM via the machine's `secret.key`, and inserted into the `api_keys` table. Skipped automatically for Ollama (local). | none | + +When the wizard finishes: + +- `~/.clawcontrol/secret.key` — 32 random bytes (mode 0600). Keep this file + safe — losing it means losing every encrypted API key. +- `~/.clawcontrol/config.json` — server config (mode 0600). +- `~/.clawcontrol/pending-keys.json` — the API key the wizard collected + (mode 0600). The server picks this up on first boot, encrypts it via the + same module that protects keys added later through the UI, and unlinks + the file. +- A bearer token is printed once to the terminal — copy it. The UI's + `/setup` page asks for it the first time you connect. + +The server is then started, `/api/health` is polled until it responds, and +your default browser opens `http://localhost:3000`. + +## 3. Tour of the UI + +Sidebar groups (collapsible — click the group header): + +### Overview + +- **Dashboard** (`/`) — single-page snapshot. Live agents, mission progress, + pending board approvals, tasks awaiting your sign-off, next heartbeats, + spend MTD and projection. Refresh button bottom-right of the page header. + +### Paperclip (governance — amber **NEW** chip) + +- **Goals** (`/goals`) — the 4-level hierarchy (Mission → Project → Agent + goal → Task). Each row shows level, owner, progress bar, status chip, + and a relative due date. Sorted by level then creation order. +- **Org chart** (`/org-chart`) — visual reports-to tree with budget bars + on every node. Right-side panel lists pending board governance actions + (hire requests, strategy reviews, budget changes). +- **Heartbeats** (`/heartbeats`) — cron-scheduled agent runs. Each row + has an enable toggle, a "Run now" button (queues if OpenClaw is offline), + the schedule expression, last run + status, and next run time. +- **Budgets** (`/budgets`) — summary ring (% used) plus per-agent rows + with progress bars. Rows turn red at 100% and the agent auto-pauses; + the **Override + Resume** button on a paused row bumps the budget cap + and flips the agent back to idle. + +### Fleet + +- **Agents** (`/agents`) — card grid with name, title, role, model, + status dot, budget bar, task count. Clicking a card opens a slide-over + with pause / resume / restart / clone / delete actions and the + agent's SOUL.md text. The **+ New agent** button in the header opens + a 3-field create modal. +- **Models & APIs** (`/models`) — list of stored API keys (labels only — + the secret never leaves the encrypted store). Each row has **Test** + (POSTs to the adapter, surfaces ok/error + latency) and **Delete**. + **+ Add key** opens a modal with the provider drop-down (driven by + the live `/api/api-keys/providers` registry). +- **Organizations** (`/organizations`) — card grid; clicking a card + opens a slide-over with the mission and the top of the org chart. + +### Work + +- **Mission Board** (`/mission-board`) — 5-column kanban (Inbox / + Assigned / In progress / Review / Done). Cards show priority chip, + approval flag, agent avatar, and ID. Clicking a card opens a slide-over; + if the task `requires_approval`, **Approve** and **Request revision** + (with feedback textarea) buttons appear. +- **Skills** (`/skills`) — registered tools / prompt templates / MCP + bindings / cron triggers. **Test** runs the skill stub with a sample + input. + +### Data + +- **MMR memory** (`/memory`) — vector collections. Search box runs + semantic queries (stubbed in this release — vector store wires up in + a future phase). Per-collection **Reindex** and **Delete** actions. +- **Channels** (`/channels`) — Telegram / Discord / Slack / email / + webhook integrations. Each row has an enable toggle. +- **Browser** (`/browser`) — focused view of the Doctor's `chrome` + check + auto-fix. + +### System + +- **Backups** (`/backups`) — table of archives with **Restore** and + **Delete** per row. **Backup now** in the header creates a manual + archive. +- **Doctor** (`/doctor`) — 10 health checks rendered as cards. Each + card has **Re-run** and (when applicable) an auto-fix button. Fix + output streams into a terminal panel above the grid. The + **Export report** button downloads a JSON snapshot. +- **Audit** (`/audit`) — paginated log with agent_id / action / limit + filters and a CSV export. +- **Settings** (`/settings`) — system status, update check + install, + notes on auth. + +### Header chrome + +- Search box (placeholder — Cmd+K is the real way to search). +- Live **gateway** chip — `connected` (green) / `connecting` (amber) / + `disconnected` (red). +- Live **WS** chip — `open` / `connecting` / `closed`. +- "Update available" amber chip when `/api/updates/check` says so. + +### Offline banner + +Whenever OpenClaw is unreachable or the WebSocket is down, a red ribbon +appears at the top of every page with a **Run Doctor** link. + +### Mobile + +At widths under 768px the sidebar disappears and a 5-tab bottom bar +takes its place: Dashboard / Agents / Mission Board / Doctor / Settings. +Cmd+K (or its on-screen palette equivalent) covers the rest of the +navigation. + +## 4. Common workflows + +### Deploy a new agent + +1. Sidebar → **Agents**. +2. **+ New agent** in the page header. +3. Fill in name, role, title, provider, model, monthly budget (in cents + — e.g. `3000` = $30/month), and a SOUL.md describing how the agent + behaves. +4. **Create**. The agent appears in the grid with status `idle`. +5. Click the card → slide-over → **Resume** to flip it to `running`. + +The card refetches automatically on the `agent:status_changed` WS event; +no need to reload. + +### Approve / reject a task + +The Dashboard's **Awaiting your sign-off** card and the Mission Board's +**Review** column both surface tasks with `requires_approval = true`. + +1. Click the task card → slide-over. +2. Read the description; the assigned agent's avatar + status appears + inline. +3. **Approve** flips `approval_status` to `approved` (toast + WS + broadcast). +4. **Request revision** is enabled once you type feedback. Posting it + sets `approval_status` to `rejected` and writes the feedback into the + task's audit thread. + +### Diagnose an issue with Doctor + +1. Sidebar → **Doctor** (or `Cmd+D`). +2. The page auto-runs all 10 checks every 30 s. Each card shows + PASS / WARN / FAIL with a one-line message. +3. For checks with an auto-fix, click the cyan fix button. Output streams + live into the terminal panel at the top of the page. +4. **Export report** downloads the latest results + a system fingerprint + as JSON — handy when filing an issue. + +### Take a manual backup + +Three equivalent paths: + +- UI: **Backups** → **Backup now**. +- CLI: `clawcontrol backup`. +- Shortcut: `Cmd+B` from anywhere in the UI. + +Each path writes a tar.gz to `~/.clawcontrol/backups/clawcontrol-.tar.gz`, +emits `backup:started` then `backup:completed` over WS, and a green toast +fires in the UI. Retention removes archives older than `backup.retention_days` +(default 30) on every successful backup. + +### Restore from a backup + +1. Sidebar → **Backups**. +2. Click **Restore** on the row you want to roll back to. +3. The UI confirms — restoring stops OpenClaw + the scheduler, swaps the + DB / config / MMR files, then restarts everything. +4. The page refreshes when the restore broadcasts `backup:restored`. + +If you don't have a working server, restore manually: + +```sh +clawcontrol stop +tar -xzf .tar.gz -C ~ # writes back ~/.clawcontrol +clawcontrol start +``` + +### Adjust a budget after auto-pause + +Agents auto-pause when `spent_cents >= budget_cents`. The UI shows them +red on the Budgets page. + +1. Sidebar → **Budgets**. +2. Find the red row. +3. **Override + Resume** — bumps the cap and flips the agent back to + idle in one action. The audit trail records `budget.override`. + +### Schedule a heartbeat + +1. Sidebar → **Heartbeats**. +2. **+ New heartbeat**. +3. Pick the agent from the drop-down, enter a 5-field cron expression + (e.g. `0 */4 * * *` for "every 4 hours"), and a task description. +4. **Schedule**. The scheduler picks it up immediately — no server + restart needed. + +If OpenClaw is offline when the heartbeat fires, the dispatcher queues +the run and replays it the moment OpenClaw reconnects. The +**Heartbeats** page row turns amber for queued runs. + +### Add a new model provider key + +1. Sidebar → **Models & APIs** → **+ Add key**. +2. Pick the provider, paste the key, label it (e.g. "prod" / "experiments"). +3. **Add**. The key is encrypted with AES-256-GCM via the machine's + `secret.key` before persistence. The list shows labels and a + `last_validated_at` column — never the secret. +4. Click **Test** on the row. The button POSTs to + `/api/api-keys/:id/test-connection`, the server decrypts and asks the + matching adapter to make a minimal validation call, and a toast shows + ok/error + latency. + +### Chase a budget anomaly + +1. Sidebar → **Audit**. +2. Filter `agent_id = ag_xxx` and `action = budget.updated` (or + `budget.limit_hit`) to see every spend update. +3. **Export CSV** for off-line analysis. + +## 5. CLI reference + +`clawcontrol --help` prints a condensed version of this section. + +| Command | Behavior | +|---|---| +| `clawcontrol start` | First run launches the wizard, then spawns the bundled server detached, writes `~/.clawcontrol/server.pid`, waits up to 15 s on `/api/health`, and opens the UI in your default browser. Set `CLAWCONTROL_NO_OPEN=1` to suppress the browser launch (used by Docker, CI, headless servers). | +| `clawcontrol stop` | Reads the PID file and sends SIGTERM. Waits up to 8 s for graceful exit, then SIGKILL. Removes the PID file. | +| `clawcontrol status` | Prints whether the server is running, its pid, and the response of `/api/health`. Stale PID files are detected and reported. | +| `clawcontrol backup` | POSTs to `/api/backups` with `type:'manual'`. Prints the archive path. | +| `clawcontrol doctor` | GETs `/api/doctor`, prints a colored PASS/WARN/FAIL table. Exits non-zero on any FAIL — drop it in CI to fail builds when the host is unhealthy. | +| `clawcontrol update` | Checks for updates, asks for confirmation, runs the staged install pipeline. | +| `clawcontrol reset` | Deletes `config.json` (and the `pending-keys.json` if any). Keeps DB, secrets, and backups. Asks for confirmation. | +| `clawcontrol reset --hard` | Deletes the entire `~/.clawcontrol/` directory. Asks for confirmation. | +| `clawcontrol export ` | tar-czf the entire `~/.clawcontrol/` directory (excluding `server.pid` and `server.log`). The server must be stopped first. | +| `clawcontrol import ` | Extracts a previous export over `~/.clawcontrol/` after confirmation. Stops the server first if it's running. | +| `clawcontrol logs` | `tail -F` on `~/.clawcontrol/server.log`. | +| `clawcontrol --version` (`-v`) | Prints the CLI version. | +| `clawcontrol --help` (`-h`) | Prints help. | + +### Environment variables the CLI honors + +| Variable | Effect | +|---|---| +| `CLAWCONTROL_NO_OPEN=1` | Don't try to open a browser on `start` (containers, CI, SSH). | +| `CLAWCONTROL_DEBUG=1` | Print full stack traces on errors instead of just the message. | +| `CLAWCONTROL_PACKAGE` | Override the npm package name `install.sh` fetches. | +| `OPENCLAW_BIN` | Path or name of the OpenClaw binary used by the process-manager and instance-manager. Defaults to `openclaw`. | +| `OPENCLAW_HOME` | OpenClaw's data root. Defaults to `~/.openclaw/`. | +| `OPENCLAW_RELEASES_URL` | GitHub releases URL the Doctor's `openclaw-version` check fetches. | +| `OLLAMA_URL` | Ollama HTTP endpoint. Defaults to `http://localhost:11434`. | +| `CODEX_BIN` | Path or name of the OpenAI Codex CLI binary the codex adapter spawns. | + +## 6. Keyboard shortcuts + +| Shortcut | Action | +|---|---| +| `⌘K` / `Ctrl+K` | Fuzzy palette over agents, tasks, goals, navigation, and quick actions. | +| `⌘D` / `Ctrl+D` | Jump to **Doctor**. | +| `⌘B` / `Ctrl+B` | Backup now (loading toast → result toast). | +| `⌘/` / `Ctrl+/` | Show the keyboard reference modal. | +| `Esc` | Close any open overlay (palette, modal, slide-over). | +| `G` then `A` | Go to **Agents** (chord — 1.2 s window after pressing G). | +| `G` then `M` | Go to **Mission Board**. | +| `G` then `O` | Go to **Org chart**. | + +Chord shortcuts (the G-then-X family) intentionally **don't fire while you're +typing** in an input, textarea, or content-editable region — your keystrokes +flow through to the form. + +## 7. Configuration + +ClawControl reads `~/.clawcontrol/config.json` on every start. The wizard +creates it on first run; manual edits are honored on the next restart. + +```jsonc +{ + "port": 3001, + "host": "127.0.0.1", + "authToken": null, // bearer; null = open mode + "authPasswordHash": null, // argon2id, optional + "dbPath": "/home/you/.clawcontrol/clawcontrol.db", + "openclaw": { + "gatewayUrl": "ws://localhost:3002" + }, + "backup": { + "schedule": "0 2 * * *", // node-cron expression + "retention_days": 30, + "encryption_enabled": false, // wraps tar.gz in AES-256-GCM + "s3_bucket": null + }, + "updates": { + "auto_check": true, + "auto_install": false, + "repo_url": "https://api.github.com/repos/clawcontrol/clawcontrol/releases/latest" + } +} +``` + +### Files in `~/.clawcontrol/` + +| Path | Contents | Mode | +|---|---|---| +| `config.json` | The above. | 0600 | +| `secret.key` | 32-byte AES-256-GCM master key, hex-encoded. | 0600 | +| `clawcontrol.db` (+ `-wal`, `-shm`) | SQLite database. | 0600 | +| `backups/` | Auto + manual archives (`clawcontrol-YYYYMMDD-HHMMSS.tar.gz[.enc]`). | 0700 | +| `instances.json` | Registry for one-click scaling — list of `{id, pid, port, …}`. | 0600 | +| `pending-keys.json` | Wizard-collected API keys awaiting first-boot ingestion. | 0600 | +| `server.pid` | The CLI writes the spawned server's PID here. | 0600 | +| `server.log` | Server stdout + stderr. Tail with `clawcontrol logs`. | 0600 | +| `s3-credentials.json` | Only when S3 backup is configured. | 0600 | + +> **`secret.key` is your one-and-only encryption root.** Lose it and every +> stored API key becomes unrecoverable. `clawcontrol export` includes it +> in the archive — keep that archive somewhere safe. + +### Auth + +When `authToken` is `null` the API is open (any caller on the host can hit +`/api/*`). Set a token to require `Authorization: Bearer ` on every +request. The UI's `/setup` page accepts the token; once entered it's +stored in `localStorage` and attached to every fetch. + +`authPasswordHash` is an `argon2id` digest of an admin password collected +by the wizard. The hash is parked for a future password-based login flow +and isn't checked at runtime today. + +## 8. Backups & restore + +### What's in a backup + +Each archive (default `clawcontrol-YYYYMMDD-HHMMSS.tar.gz`) contains: + +- `clawcontrol.db` — the SQLite database, captured after a WAL TRUNCATE + checkpoint so the file is self-consistent. +- `manifest.json` — id, type, timestamp, schema version, includes flags. +- `openclaw/config.json` (when present) — OpenClaw's runtime config. +- `openclaw/mmr.tar` (when present) — vector store snapshot. + +When `backup.encryption_enabled` is `true`, the tar.gz is wrapped in AES-256-GCM +using the same `secret.key` that protects API keys, and the file extension +becomes `.tar.gz.enc`. + +### Schedule + +`backup.schedule` is a 5-field cron expression. The default `0 2 * * *` +means "02:00 every day, host-local time". `node-cron` interprets it. + +`backup.retention_days` (default 30) controls cleanup — every successful +backup deletes archives older than that. + +### Manual snapshots + +```sh +clawcontrol backup # CLI +# OR Cmd+B from anywhere in the UI +# OR Backups page → Backup now +``` + +### Restore + +Two ways: + +```sh +# (1) Through the UI: Backups → Restore on the row. +# (2) From the CLI / file system, on a stopped server: +clawcontrol stop +tar -xzf ~/.clawcontrol/backups/clawcontrol-20260427-040256.tar.gz -C ~ +clawcontrol start +``` + +The UI restore stops OpenClaw and the scheduler, swaps the DB / config / +MMR files, then restarts everything. The stream of progress lines is +broadcast over WS as `backup:restored` and surfaces as a green toast. + +### Disaster recovery (lost host) + +The minimum you need to recover: + +1. The latest backup archive (e.g. via `clawcontrol export`). +2. Node 20+ on the new host. + +```sh +npm install -g clawcontrol +clawcontrol import /path/to/clawcontrol-export.tar.gz +clawcontrol start +``` + +The server picks up the restored config and DB and you're back online. + +## 9. Updates + +ClawControl checks GitHub for new releases on a daily cron (`updates.auto_check`, +default true). When a newer version is published, the header surfaces an +"update available" amber chip and **Settings** offers an Install button. + +### Manually check / install + +```sh +clawcontrol update +# OR Settings → Updates → Install update +``` + +The 8-stage install pipeline: + +1. **Pre-update backup** — manual-type backup so you can roll back. +2. **Download** the release tarball via curl. +3. **Extract** to a temp dir. +4. **`pnpm install --prod`** in the staged tree. +5. **Validate** (smoke check the entry point). +6. **Stage** the new release path. The actual binary swap is left to the + supervisor that owns the parent process — systemd / pm2 / Docker + restart picks it up. + +Any failure broadcasts `update:failed` with the stage; the pre-update +backup remains in `~/.clawcontrol/backups/` for restore. + +### Auto-install (advanced) + +Set `updates.auto_install = true` in `config.json`. The 03:00 daily cron +will then run the full pipeline whenever a new release is found. Most +deployments leave this off and review changelogs manually. + +## 10. One-click scaling + +For workloads that outgrow a single OpenClaw instance: + +```sh +curl -X POST http://localhost:3001/api/instances \ + -H "Content-Type: application/json" \ + -d '{"port": 3003}' +``` + +Each request spawns a fresh detached OpenClaw on the requested port, writes +its pid into `~/.clawcontrol/instances.json`, and broadcasts +`instance:status`. + +```sh +curl http://localhost:3001/api/instances +# → { instances: [{ id, pid, port, home, bin, startedAt }, …] } + +curl http://localhost:3001/api/instances//health +# → { state, pid, port, uptimeMs, cpu, memoryBytes, agentCount } + +curl -X DELETE http://localhost:3001/api/instances/ +# SIGTERMs and removes the row +``` + +`listInstances()` prunes dead PIDs on every read so the registry never +drifts from reality. + +## 11. Going to production + +A handful of changes you'll want before exposing ClawControl beyond +localhost. + +### 11.1 Set an admin token + +Open `~/.clawcontrol/config.json` and put a long random string in +`authToken`. Restart the server. Every `/api/*` request now requires +`Authorization: Bearer `. The UI's `/setup` page accepts the token +the first time you connect; it's then kept in `localStorage`. + +`/api/health` stays open so liveness probes keep working. + +### 11.2 Bind to a non-loopback interface (carefully) + +By default the server listens on `127.0.0.1` — only the local host can +reach it. To accept remote connections, change `host` in +`config.json` to `0.0.0.0` (or a specific NIC), restart, and put a +reverse proxy (nginx / Caddy / Traefik) in front of it terminating TLS. + +Don't expose ClawControl to the public internet without an admin token +set. + +### 11.3 Process supervision + +`bin/clawcontrol.js start` is fine for personal use, but production +deployments should run under a supervisor that handles crashes: + +- **systemd** — drop a unit file pointing at `node /path/to/clawcontrol/packages/server/dist/index.js`. +- **Docker** — `docker compose up -d` (the entrypoint is `tini`, signals + are forwarded correctly, restart policy is `unless-stopped`). +- **pm2** — `pm2 start /path/to/clawcontrol/packages/server/dist/index.js --name clawcontrol`. + +The Doctor's `process` check auto-detects all three and uses the +appropriate restart command for the **Restart OpenClaw** auto-fix. + +### 11.4 Off-host backups + +Local backups protect against accidental deletes. To survive losing the +host, configure S3 in the wizard or set `backup.s3_bucket` in +`config.json` (and drop credentials in `~/.clawcontrol/s3-credentials.json`). +S3 upload itself is wired but not yet implemented in this release; the +config keys exist so a future patch can light it up without a config +migration. + +For now, schedule an off-host pull instead — e.g. nightly: + +```sh +0 3 * * * rsync -az host:/root/.clawcontrol/backups/ /backup/clawcontrol/ +``` + +### 11.5 Logging + +`server.log` is append-only and never rotated by the server. Stick a +`logrotate` rule on it in production. Same for `~/.openclaw/logs/`. + +### 11.6 Doctor as a smoke test in CI + +```sh +clawcontrol doctor +echo "exit=$?" +``` + +`clawcontrol doctor` returns non-zero when any check is `fail`. Wire it +into a deploy gate or a periodic external probe. + +## 12. Troubleshooting + +The first thing to try, almost always, is `clawcontrol doctor` — its 10 +checks cover most failure modes and three of them have one-click fixes. +Below: symptom → likely cause → resolution. + +### "Server failed to come up within 15s" + +`clawcontrol start` waits on `/api/health` for 15 seconds. If the +server crashed during boot, you'll see this message. + +```sh +tail -n 100 ~/.clawcontrol/server.log +``` + +Common causes: + +- **Port 3001 already in use**. Change `port` in `config.json` or stop + whatever is on it. +- **better-sqlite3 native binding mismatch**. Reinstall: `pnpm install` + from source, or reinstall the global package. +- **Corrupted DB**. Move `~/.clawcontrol/clawcontrol.db*` aside and + restart — migrations + seed will recreate a fresh database. + +### Header gateway chip is red ("disconnected") + +ClawControl can't reach OpenClaw at `config.openclaw.gatewayUrl`. + +1. Click **Run Doctor** in the red ribbon (or `Cmd+D`). +2. The `process` check tells you whether OpenClaw is running. If it's + stopped, click **Restart OpenClaw**. +3. The `gateway` check tells you the latency or the connect error. + +The whole UI keeps working when OpenClaw is offline — Doctor, backups, +budgets, audit, and read-only views of agents/tasks/goals/heartbeats +all still respond. Heartbeats that fire while offline are queued and +replayed on reconnect. + +### Toasts not arriving / data feels stale + +The WS chip in the header should read `open`. If it's `closed` or +`connecting`, the UI is offline from the server even though it can still +hit REST. Check: + +- Browser console for connection errors. +- `/ws` on the server (proxy / firewall might be stripping the upgrade). +- The OfflineBanner — it surfaces both `gateway` and `ws` state. + +### "Agent auto-paused" toast won't go away + +That's the `budget:limit_hit` event firing on every page load until +you act on it. Two fixes: + +- Click **Override + Resume** on the agent in **Budgets** (bumps the cap + and flips status back to idle). +- Or pause the agent permanently from **Agents** → slide-over → Pause. + +### Doctor shows `chrome` failing + +Install Google Chrome or Chromium and click the auto-fix on the chrome +card. The fix only points OpenClaw at an installed binary; it does not +install Chrome. Standard paths it searches: `/usr/bin/google-chrome`, +`/usr/bin/chromium`, `/snap/bin/chromium`, +`/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`, +plus several others. + +### Doctor's `disk-space` is amber/red + +The auto-fix runs `find -mtime` cleanup on `~/.openclaw/logs/` and +`/tmp/clawcontrol-*`. Click it. If you're still tight, look at the +backup retention — `backup.retention_days` defaults to 30; lower it. + +### Lost the bearer token + +```sh +cat ~/.clawcontrol/config.json | grep authToken +``` + +…and paste it into the UI's `/setup` page. Or set `authToken` to `null` +to disable auth entirely (only safe for localhost-only setups). + +### Lost `secret.key` + +Encrypted API keys are gone — there's no recovery. Re-create them: + +1. **Models & APIs** → delete each row. +2. **+ Add key** for each provider with the original key string. + +The wizard's pre-stored key (if any) is plaintext under +`~/.clawcontrol/pending-keys.json` until first boot, but that file is +unlinked the moment the server picks it up. + +### Database upgrade between versions + +Migrations are append-only, so any older DB rolls forward automatically +on the next boot. If you ever need to roll back, restore from the most +recent backup taken on the older version. + +### Reset and start over + +```sh +clawcontrol reset # deletes config only — DB + secrets stay +clawcontrol reset --hard # nukes ~/.clawcontrol/ entirely +``` + +Both prompt for confirmation. + +## 13. FAQ + +**Why does the brief say "OpenClaw 3001" but my config says 3002?** +Phase-1 brief had a typo (`ws://localhost:3001` would conflict with the +control plane's own port). The default that actually works is +`ws://localhost:3002`. Everything in the code base uses the correct +default; the prompt text in the brief is the only place the older +port shows up. + +**Can I run ClawControl without OpenClaw?** +Yes. Every page reads from the local SQLite DB; the only things that +require a live OpenClaw are sending tasks to a real agent and receiving +push events from it. Doctor, backups, budgets, goals, the org chart, +and the audit log all keep working when OpenClaw is dead — that's the +whole architectural point of putting the control plane in a separate +process. + +**Where do I add a new model provider?** +One file under `packages/server/src/adapters/.ts` plus a one-line +registration in `registry.ts`. See +[`CONTRIBUTING.md`](../CONTRIBUTING.md#adapter-pattern--adding-a-new-model-provider) +for a worked example with Mistral. + +**How do I add a Doctor check?** +One file under `packages/server/src/doctor/checks/.ts` plus +registration in `doctor.ts`. Optional auto-fix lives under `fixes/`. +Hard rule: no check or fix is allowed to import `openclaw-client` — +they have to inspect via PID, file system, or direct network probe so +they keep working when OpenClaw is dead. + +**Are heartbeats persisted across server restarts?** +Yes — `heartbeats` rows live in SQLite. The scheduler reads enabled +rows on every boot and registers a node-cron job per row. +`next_run_at` is computed via `cron-parser` so the UI shows a real +future tick, even before the first fire. + +**What happens if the server crashes mid-backup?** +The next start re-runs migrations (idempotent) and the partial archive +in `~/.clawcontrol/backups/` is left on disk. Delete it manually or via +the UI; the integrity check on restore (manifest + schema_version) will +catch a malformed archive. + +**Are API keys ever sent to my browser?** +No. `GET /api/api-keys` returns labels and `last_validated_at` +timestamps only. The encrypted ciphertext stays server-side; decrypt +happens only when the server itself dispatches a `testConnection` or +makes an outbound model call. + +**Can I run multiple ClawControls on one host?** +Out of the box, no — they all expect to own `~/.clawcontrol/`. Override +`HOME` (or set custom paths in `config.json`) to run side-by-side +instances. + +**Why is the npm package called something else?** +The bare `clawcontrol` name is already claimed on the public npm +registry by an unrelated project. The actual publish uses a scoped name +(e.g. `@clawcontrol/cli`). `install.sh` and the CLI surface accept a +`CLAWCONTROL_PACKAGE` env override so you can point it wherever the +real artifact lives. + +**Where do I file bugs?** +Open an issue at https://github.com/clawcontrol/clawcontrol/issues with +the JSON output of `clawcontrol doctor` (`POST /api/doctor/export`) +attached — that report includes the system fingerprint plus every +check's status, message, and details, which usually narrows things down +on the first round trip. diff --git a/packages/server/src/services/backup-service.ts b/packages/server/src/services/backup-service.ts index 4ab0558..aac92ce 100644 --- a/packages/server/src/services/backup-service.ts +++ b/packages/server/src/services/backup-service.ts @@ -103,9 +103,13 @@ export async function createBackup(opts: CreateOptions, config: Config): Promise // 3-5. Stage everything in a temp dir, then tar.gz it. const stage = mkdtempSync(join(tmpdir(), 'clawcontrol-backup-')); + // Source the path from the live DB handle rather than re-reading config — + // tests (and any future runtime override) might have a different path + // than what was on disk when the server booted. + const liveDbPath = getDb().name; try { // 3. Copy DB file. - copyFileSync(config.dbPath, join(stage, 'clawcontrol.db')); + copyFileSync(liveDbPath, join(stage, 'clawcontrol.db')); // 4. Copy ~/.openclaw/config.json if present. const ocConfig = join(OPENCLAW_HOME, 'config.json'); if (existsSync(ocConfig)) { @@ -122,7 +126,7 @@ export async function createBackup(opts: CreateOptions, config: Config): Promise const manifest = { id, type: opts.type, stamp, created_at: Date.now(), - schema_version: readSchemaVersion(config.dbPath), + schema_version: readSchemaVersion(liveDbPath), includes: { db: true, openclaw_config: existsSync(ocConfig), @@ -227,13 +231,15 @@ export async function restoreBackup(backupId: string, config: Config): Promise