Merged
12 changes: 9 additions & 3 deletions CLAUDE.md
@@ -29,11 +29,17 @@ Standboy is a VSCode/Cursor extension that auto-expands a Game Boy emulator pane
- **ROM byte delivery.** ROM bytes do **not** travel through `postMessage`. A 32MB GBA ROM serialised via `Array.from(uint8)` produced a ~128MB smi array plus a JSON string of similar size and would OOM the extension host on import. Instead, `loadAndPostRom` (`src/extension.ts`) `access()`-checks the ROM file at `library.romFilePath(hash, ext)`, builds a webview-resource URI via `provider.asWebviewFileUri`, and posts only the URI — `<libraryRoot>` is already in `localResourceRoots` so the webview can `fetch()` the bytes directly into a `Blob` and hand EJS a `blob:` URL. Saves stay inline (≤128KB, harmless). The webview revokes the blob URL inside `EJS_onGameStart` so the ROM-sized Blob is released as soon as EJS has copied it into its Emscripten FS.
- **Cover fetcher (`src/covers.ts`).** Tries the canonical name first, then progressively-stripped variants of the user's filename. Network calls happen in the extension host; webview only loads cached files via `asWebviewUri`. CSP stays locked-down. Concurrency 4, `coverUpdate` messages stream back to the grid as art lands.
- **Auto-show.** Sidebar `WebviewViewProvider` in primary activity bar with `retainContextWhenHidden: true`. Auto-expand on activity = `vscode.commands.executeCommand("standboy.gameView.focus")`; auto-collapse = focus `workbench.view.explorer`. Gated by the `standboy.autoShow` boolean setting (default `true`); when off, the activity dot still pulses but no focus events fire. The setting is exposed both in VSCode's Settings UI and as a one-click pill in the menu drawer's **Auto-show** section — webview reads the current value from the host's `autoShow` message (sent on `ready` and after every config change), writes via `setAutoShow` webview→host. The host does **not** optimistically echo `setAutoShow`; it lets `onAutoShowChange` deliver the persisted value, so the pill's state is always disk-derived (a failed `writeAutoShow` simply leaves the pill on its prior value rather than fooling the user into thinking the change took). Writes target whichever scope already owns the value (`cfg.inspect()` → workspace folder / workspace / global) so an in-app toggle never silently no-ops against a workspace override. State is driven by `ActivityDetector` (`src/activity.ts`), which OR's two independent signals:
- **Override (authoritative).** A sentinel file at `~/.standboy/agent-active`, written/deleted by the user's agent via lifecycle hooks. `src/agent.ts` watches it via `vscode.workspace.createFileSystemWatcher`. Trusted when present, but **both edges are debounced** (`showDelayMs: 5000`, `hideDelayMs: 5000`) so trivial agent turns never strobe the panel and back-to-back turns hold it open without flicker. Detector exposes a separate `onSchedule` callback that fires when a hide is queued (with `durationMs`) so the webview can render a countdown progress bar and the user isn't surprised by the focus shift.
- **Override (authoritative).** A sentinel file at `~/.standboy/agent-active`, written/deleted by the user's agent via lifecycle hooks. `src/agent.ts` watches it and parses its `<kind>:<ts>` content (`kind ∈ {prompt, tool}`). Trusted when fresh; **both edges are debounced** (`showDelayMs: 5000`, `hideDelayMs: 5000`) so trivial agent turns never strobe the panel and back-to-back turns hold it open without flicker. The watcher is strictly event-driven — `fs.watch` on the parent dir handles transitions, and a single `setTimeout` (armed when the sentinel is fresh, reset on each fresh write) handles the timestamp-aging check. **No polling**, no `setInterval`. Two failure modes the watcher reconciles past the hooks:
- **Stale-timestamp TTL.** A sentinel whose recorded timestamp is older than `STALE_THRESHOLD_MS` (5 min) is treated as absent. Catches the "Stop hook didn't fire on user interrupt" case — Claude Code's `Stop` doesn't run on interrupt, so the sentinel would otherwise stay pinned until the next activation. The one-shot stale timer (armed inside the watcher's `check()` whenever the sentinel is fresh, cleared when it's absent) fires once after the TTL with no refresh and triggers a re-check; nothing is scheduled while the sentinel is idle. The threshold doubles as the on-activate cleanup window (`cleanupStaleSentinel`); one constant for both is enough since they're the same concept (sentinel age check).
- **Prompt-ping for re-show.** Watcher emits `onPromptPing` when a fresh `prompt`-kind write lands while it was already in the active state — i.e., a new user turn during an ongoing run. Extension wires this to the focus command (gated on `isVisible()`), so a manually-closed panel re-opens on the next prompt. `tool`-kind refreshes deliberately don't fire it — that way mid-run tool activity doesn't fight a user's deliberate close.

Detector exposes a separate `onSchedule` callback that fires when a hide is queued (with `durationMs`) so the webview can render a countdown progress bar and the user isn't surprised by the focus shift.

- **Burst (heuristic fallback).** Edit-burst detector — multi-character changes within a 1.5s window — for users who haven't connected an agent in the Detection menu, or for agents we don't have specific hook support for.

- **Agent-detection setup (`src/hooks.ts`).** Lives entirely in the menu drawer's **Detection** section — there is no command-palette entry (`contributes.commands` is empty). Host exposes `getAgentStatus()` (returns `{ claude: { detected, connected }, cursor: { detected, connected } }`) and `setExclusiveAgent(agent, enabled)`. **Mutually exclusive by design**: connecting one agent disconnects the other so the two never share the sentinel file at `~/.standboy/agent-active` — a stop hook from agent A can't race a start hook from agent B and prematurely hide the panel. Internal helpers `setClaudeHooks` / `setCursorHooks` are exported for tests; production code goes through `setExclusiveAgent`. The webview pulls status on `ready` (and after every toggle) via the `agentStatus` host→webview message, sends `setAgent` webview→host to flip a single agent. Detection logic: Claude Code present if `~/.claude/settings.json` or `~/.claude/projects/` exists; Cursor present if `vscode.env.appName.toLowerCase().includes("cursor")`. When neither is detected, the section renders an empty-state line; Standboy still works as a manual emulator, auto-show just stays off.
- **Claude Code** → `~/.claude/settings.json`. Events: `UserPromptSubmit` + `PreToolUse` (start), `Stop` (stop). Schema is `{ hooks: { <event>: [{ matcher?, hooks: [{type:"command", command}] }] } }` — we append rather than replace, identifying our entries by `command` containing the absolute marker path.
- **Cursor** → `~/.cursor/hooks/hooks.json`. Events: `beforeSubmitPrompt` (start), `afterAgentResponse` + `sessionEnd` (stop, the second is a safety-net). Schema is `{ version, hooks: { <event>: { command } | [{ command }] } }` — single object becomes an array if a user hook is already present so both fire.
- **Claude Code** → `~/.claude/settings.json`. Events: `UserPromptSubmit` (`marker.cjs prompt`), `PreToolUse` (`marker.cjs tool`), `Stop` (`marker.cjs stop`). Schema is `{ hooks: { <event>: [{ matcher?, hooks: [{type:"command", command}] }] } }`. We identify our entries by `command` containing the absolute marker path. **Install wipes ours-entries before re-adding** so users on the old single-`start`-command schema get migrated to the prompt/tool split on the next reinstall (driven by an activate-time idempotent re-install — see `extension.ts`).
- **Cursor** → `~/.cursor/hooks/hooks.json`. Events: `beforeSubmitPrompt` (`marker.cjs prompt`), `afterAgentResponse` + `sessionEnd` (`marker.cjs stop`, the second is a safety-net). Schema is `{ version, hooks: { <event>: { command } | [{ command }] } }` — single object becomes an array if a user hook is already present so both fire. Install also wipes ours-entries first for the same migration reason as Claude.
- The marker script (`~/.standboy/marker.cjs`) is **embedded as a string constant** in `src/agent.ts` and written out at activation. Doing it that way means uninstalling the extension can't leave hook commands pointing at a vanished `extensionPath/...` script. `extension.ts` calls `ensureMarkerInstalled()` at activation regardless of whether the user has connected an agent, so the FileSystemWatcher's parent dir always exists on first install.
- Both install and uninstall are **idempotent and no-op-when-nothing-changed** (a `mutated` flag avoids spurious atomic writes).
- **No automatic cleanup on extension uninstall.** VSCode's `vscode:uninstall` lifecycle is unreliable (microsoft/vscode#155561, #102260 — extension dir is deleted before the hook script runs; both still open as of 2026), and there is no other API that distinguishes uninstall from quit in `deactivate()`. Every comparable extension (Continue.dev, Cline, etc.) takes the same approach.
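
The `<kind>:<ts>` sentinel format and the age check described above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/agent.ts` (which this diff does not show). The names `parseSentinelSketch`, `isFreshSketch`, and `msUntilStale` are hypothetical, and the parsing behaviour is inferred only from the cases exercised in `src/agent.test.ts`: unrecognised kinds and bare timestamps map to `legacy`, and content without a numeric timestamp yields `null`. Treating an age exactly equal to the TTL as stale is also an assumption.

```typescript
// Hypothetical sketch: names and structure are assumptions, not the real src/agent.ts.
type SentinelKind = "prompt" | "tool" | "legacy";

interface SentinelRecord {
  kind: SentinelKind;
  ts: number; // epoch milliseconds recorded in the sentinel file
}

// Parse "<kind>:<ts>" or a legacy bare "<ts>"; return null when no numeric timestamp is present.
function parseSentinelSketch(raw: string): SentinelRecord | null {
  const text = raw.trim();
  const sep = text.indexOf(":");
  const kindPart = sep === -1 ? "" : text.slice(0, sep);
  const tsPart = sep === -1 ? text : text.slice(sep + 1);
  if (!/^\d+$/.test(tsPart)) return null;

  // Anything other than the two known kinds is treated as legacy.
  let kind: SentinelKind = "legacy";
  if (kindPart === "prompt" || kindPart === "tool") kind = kindPart;
  return { kind, ts: Number(tsPart) };
}

// A missing or unparseable sentinel is never fresh. An age at or past the TTL
// counts as stale (assumption), so a timer armed with msUntilStale sees a
// stale sentinel when it fires.
function isFreshSketch(rec: SentinelRecord | null, now: number, ttlMs: number): boolean {
  return rec !== null && now - rec.ts < ttlMs;
}

// Delay for a one-shot timer that should fire once the sentinel has aged out.
function msUntilStale(rec: SentinelRecord, now: number, ttlMs: number): number {
  return Math.max(0, rec.ts + ttlMs - now);
}
```

Under these assumptions, a watcher could arm a single `setTimeout` with `msUntilStale(rec, Date.now(), ttlMs)` each time a fresh write lands and run its re-check when the timer fires, which would fit the no-polling design described above.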
2 changes: 1 addition & 1 deletion package.json
@@ -2,7 +2,7 @@
"name": "standboy",
"displayName": "Standboy",
"description": "A Game Boy emulator that auto-shows during AI agent activity.",
"version": "0.3.0",
"version": "0.3.1",
"publisher": "mfbzme",
"license": "MIT",
"icon": "media/icon-512.png",
228 changes: 189 additions & 39 deletions src/agent.test.ts
@@ -6,6 +6,7 @@ import * as path from "node:path";
import {
STALE_THRESHOLD_MS,
cleanupStaleSentinel,
parseSentinelContent,
watchSentinel,
} from "./agent";

@@ -17,6 +18,39 @@ function sleep(ms: number): Promise<void> {
return new Promise((r) => setTimeout(r, ms));
}

describe("parseSentinelContent", () => {
it("parses new prompt-kind format", () => {
expect(parseSentinelContent("prompt:1234")).toEqual({
kind: "prompt",
ts: 1234,
});
});

it("parses new tool-kind format", () => {
expect(parseSentinelContent("tool:5678")).toEqual({
kind: "tool",
ts: 5678,
});
});

it("falls back to legacy when kind is unrecognized", () => {
expect(parseSentinelContent("other:9999")).toEqual({
kind: "legacy",
ts: 9999,
});
});

it("parses legacy bare-timestamp format", () => {
expect(parseSentinelContent("9999")).toEqual({ kind: "legacy", ts: 9999 });
});

it("returns null for malformed content", () => {
expect(parseSentinelContent("garbage")).toBeNull();
expect(parseSentinelContent("prompt:")).toBeNull();
expect(parseSentinelContent("")).toBeNull();
});
});

describe("cleanupStaleSentinel", () => {
let tmp: string;
let file: string;
@@ -36,18 +70,25 @@ describe("cleanupStaleSentinel", () => {

it("removes a sentinel whose recorded timestamp is older than the threshold", async () => {
const start = Date.now();
await fs.writeFile(file, String(start - STALE_THRESHOLD_MS - 1000));
await fs.writeFile(file, `tool:${start - STALE_THRESHOLD_MS - 1000}`);
expect(await cleanupStaleSentinel(start, file)).toBe(true);
expect(fsSync.existsSync(file)).toBe(false);
});

it("preserves a sentinel that is still fresh", async () => {
const start = Date.now();
await fs.writeFile(file, String(start - 1000));
await fs.writeFile(file, `prompt:${start - 1000}`);
expect(await cleanupStaleSentinel(start, file)).toBe(false);
expect(fsSync.existsSync(file)).toBe(true);
});

it("reads legacy bare-timestamp sentinels", async () => {
const start = Date.now();
await fs.writeFile(file, String(start - STALE_THRESHOLD_MS - 1000));
expect(await cleanupStaleSentinel(start, file)).toBe(true);
expect(fsSync.existsSync(file)).toBe(false);
});

it("falls back to mtime when contents are malformed", async () => {
await fs.writeFile(file, "not-a-number");
// Just-written file: mtime is now, so it shouldn't be considered stale.
@@ -71,39 +112,39 @@ describe("watchSentinel", () => {

it("emits the initial absent state once after construction", async () => {
const events: boolean[] = [];
const w = watchSentinel((active) => events.push(active), {
dir: tmp,
pollIntervalMs: 200,
});
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp }
);
// Allow the async initial check to flush.
await sleep(50);
await sleep(100);
expect(events).toEqual([false]);
w.dispose();
});

it("emits initial present state when the sentinel exists at start", async () => {
await fs.writeFile(file, String(Date.now()));
await fs.writeFile(file, `prompt:${Date.now()}`);
const events: boolean[] = [];
const w = watchSentinel((active) => events.push(active), {
dir: tmp,
pollIntervalMs: 200,
});
await sleep(50);
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp }
);
await sleep(100);
expect(events).toEqual([true]);
w.dispose();
});

it("fires on create and delete transitions, once each", async () => {
const events: boolean[] = [];
const w = watchSentinel((active) => events.push(active), {
dir: tmp,
pollIntervalMs: 100,
});
await sleep(50);
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp }
);
await sleep(100);
expect(events).toEqual([false]);

await fs.writeFile(file, String(Date.now()));
// Wait long enough for fs.watch or the poll to pick it up.
await fs.writeFile(file, `prompt:${Date.now()}`);
// Wait long enough for fs.watch to deliver the event.
await sleep(400);
expect(events).toEqual([false, true]);

@@ -116,46 +157,155 @@ describe("watchSentinel", () => {

it("does not emit duplicate events for unchanged state", async () => {
const events: boolean[] = [];
const w = watchSentinel((active) => events.push(active), {
dir: tmp,
// Aggressive polling — without state-change tracking we'd see
// multiple `false` events here.
pollIntervalMs: 50,
});
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp }
);
await sleep(400);
expect(events).toEqual([false]);
w.dispose();
});

it("recovers from rewriting the same sentinel file (back-to-back agent turns)", async () => {
const events: boolean[] = [];
const w = watchSentinel((active) => events.push(active), {
dir: tmp,
pollIntervalMs: 100,
});
await sleep(50);
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp }
);
await sleep(100);

await fs.writeFile(file, String(Date.now()));
await fs.writeFile(file, `prompt:${Date.now()}`);
await sleep(300);
// Marker script writes the file again on the next PreToolUse —
// sentinel still exists, state should NOT flap to false-then-true.
await fs.writeFile(file, String(Date.now()));
await fs.writeFile(file, `tool:${Date.now()}`);
await sleep(300);

expect(events).toEqual([false, true]);
w.dispose();
});

it("treats a stale sentinel as absent (catches interrupted agent runs)", async () => {
// Sentinel exists but its timestamp is well past the TTL — happens
// when the agent's Stop hook didn't fire (user interrupted).
await fs.writeFile(file, `tool:${Date.now() - 60_000}`);
const events: boolean[] = [];
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp, ttlMs: 1000 }
);
await sleep(150);
// Even though the file exists, age (60s) > ttl (1s) → reported as absent.
expect(events).toEqual([false]);
w.dispose();
});

it("flips to absent when a previously-fresh sentinel ages past the TTL", async () => {
const events: boolean[] = [];
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp, ttlMs: 300 }
);
await sleep(100);
// Write a fresh sentinel — watcher should report active.
await fs.writeFile(file, `tool:${Date.now()}`);
await sleep(300);
expect(events).toEqual([false, true]);

// Don't write again — let the one-shot stale timer fire when the
// recorded timestamp ages past the TTL.
await sleep(400);
expect(events).toEqual([false, true, false]);

w.dispose();
});

it("fires onPromptPing when a fresh prompt write lands during an active run", async () => {
const events: boolean[] = [];
let pings = 0;
const w = watchSentinel(
{
onChange: (active) => events.push(active),
onPromptPing: () => pings++,
},
{ dir: tmp }
);
await sleep(100);
// Initial prompt — fires onChange(true), but not promptPing (we
// were transitioning idle→active, the existing show path handles this).
await fs.writeFile(file, `prompt:${Date.now()}`);
await sleep(300);
expect(events).toEqual([false, true]);
expect(pings).toBe(0);

// Tool refresh during active run — no promptPing, no onChange.
await fs.writeFile(file, `tool:${Date.now()}`);
await sleep(300);
expect(events).toEqual([false, true]);
expect(pings).toBe(0);

// New user prompt arrives during the same active run — promptPing
// fires so the extension can re-show a manually-closed panel.
await fs.writeFile(file, `prompt:${Date.now()}`);
await sleep(300);
expect(events).toEqual([false, true]);
expect(pings).toBe(1);

w.dispose();
});

it("does not fire onPromptPing when transitioning from absent to prompt", async () => {
// The idle→active edge already drives the show command via onChange;
// firing promptPing too would be a redundant double-trigger.
let pings = 0;
const w = watchSentinel(
{
onChange: () => undefined,
onPromptPing: () => pings++,
},
{ dir: tmp }
);
await sleep(100);
await fs.writeFile(file, `prompt:${Date.now()}`);
await sleep(300);
expect(pings).toBe(0);
w.dispose();
});

it("recheck() picks up a state change that no event delivered", async () => {
// Simulates a dropped fs.watch event on macOS: write the sentinel
// without giving the watcher time to observe it via fs events,
// dispose the watcher's fs subscription, then verify recheck()
// surfaces the missed transition. (We can't actually force fs.watch
// to drop, but we can verify recheck() is the recovery path.)
const events: boolean[] = [];
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp }
);
await sleep(100);
expect(events).toEqual([false]);

// Pretend the OS dropped the create event — just check that
// recheck() observes the file we wrote.
await fs.writeFile(file, `prompt:${Date.now()}`);
w.recheck();
await sleep(100);
expect(events).toEqual([false, true]);

w.dispose();
});

it("stops firing after dispose", async () => {
const events: boolean[] = [];
const w = watchSentinel((active) => events.push(active), {
dir: tmp,
pollIntervalMs: 100,
});
await sleep(50);
const w = watchSentinel(
{ onChange: (active) => events.push(active) },
{ dir: tmp }
);
await sleep(100);
w.dispose();

await fs.writeFile(file, String(Date.now()));
await fs.writeFile(file, `prompt:${Date.now()}`);
await sleep(300);
expect(events).toEqual([false]);
});