From 72f4a0a6defe467981db355f423ddbc87161a4c8 Mon Sep 17 00:00:00 2001
From: NiveditJain <nivedit@exosphere.host>
Date: Fri, 8 May 2026 17:39:48 -0700
Subject: [PATCH 1/3] [luv-324] fix: enforce Stop hook on OpenCode
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Stop hooks fired on OpenCode (visible in dashboard activity feed) but
the agent stopped without retry — same failure mode Cursor had pre-#318
and Copilot had pre-#299. Root cause: no `cli === "opencode"` branch in
policy-evaluator's Stop / SubagentStop handling, so OpenCode fell into
the generic exit-2 path. The plugin shim's applyDecision turns exit-2
into `throw new Error(reason)`, but throwing from the `session.idle`
event callback is a no-op — OpenCode is already idle by the time the
event fires.

Fix: emit `{hookSpecificOutput: {additionalContext: <MANDATORY ACTION
reasonText>}}` for opencode Stop / SubagentStop in both deny and
instruct paths. The shim already routes `additionalContext` through
`client.session.prompt(...)` which submits a new user message that
re-triggers the agent loop — same model as Cursor's `followup_message`
and Copilot's `{decision: "block", reason}`. Promote applyDecision to
async and `await client.session.prompt` for Stop/SubagentStop events
so the SDK round-trip completes before the plugin context tears down;
keep fire-and-forget for tool events to avoid hot-path latency.
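
A minimal sketch of the await vs. fire-and-forget split described
above, in TypeScript for illustration. `PromptClient` and
`forwardContext` are hypothetical names, not the shim's real
identifiers; only the `client.session.prompt` argument shape and the
Stop / SubagentStop check are taken from this patch.

```typescript
// Hypothetical types and names, for illustration only.
type PromptArgs = {
  path: { id: string };
  body: { parts: Array<{ type: "text"; text: string }> };
};
interface PromptClient {
  session: { prompt(args: PromptArgs): Promise<unknown> };
}

async function forwardContext(
  client: PromptClient,
  sessionID: string,
  text: string,
  eventName: string,
): Promise<void> {
  const pending = client.session.prompt({
    path: { id: sessionID },
    body: { parts: [{ type: "text", text }] },
  });
  if (eventName === "Stop" || eventName === "SubagentStop") {
    // Stop path: wait for the SDK round-trip; swallow a rejection.
    try { await pending; } catch { /* agent is exiting anyway */ }
  } else {
    // Tool path: do not block the tool call, but silence rejections.
    pending.catch(() => {});
  }
}
```

With this split, only the Stop path pays for the SDK round-trip; tool
events return as soon as the prompt call has been issued.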

Sister CLIs verified while in here:
- Gemini AfterAgent (canonical Stop) was already correctly emitting
  `{decision: "block", reason}`; new unit tests pin both deny and
  instruct shapes to prevent regression.
- Pi `agent_end` is observation-only by upstream design — Pi's agent
  loop has already exited and `AgentEndEventResult` exposes no `block`
  field. CLAUDE.md already documents this; no code change.
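
For orientation, the per-CLI Stop response shapes mentioned in this
message could be summarized roughly as below. `stopResponse` and
`HookResult` are hypothetical names, and the `"cursor"` / `"copilot"`
identifiers are assumed (only `"opencode"` and `"gemini"` appear in
this patch's tests). Pi is omitted because its `agent_end` is
observation-only.

```typescript
// Hypothetical helper, not the evaluator's real API.
type HookResult = { exitCode: number; stdout: string; stderr: string };

function stopResponse(cli: string | undefined, reasonText: string): HookResult {
  // Force-retry shapes are JSON on stdout with exit code 0.
  const json = (body: unknown): HookResult => ({
    exitCode: 0,
    stdout: JSON.stringify(body),
    stderr: "",
  });
  switch (cli) {
    case "opencode":
      return json({ hookSpecificOutput: { additionalContext: reasonText } });
    case "cursor": // assumed identifier
      return json({ followup_message: reasonText });
    case "copilot": // assumed identifier
    case "gemini":
      return json({ decision: "block", reason: reasonText });
    default:
      // Generic Claude-style Stop: exit 2 with the reason on stderr.
      return { exitCode: 2, stdout: "", stderr: reasonText };
  }
}
```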

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 CHANGELOG.md                                 |   3 +
 __tests__/hooks/opencode-plugin-shim.test.ts |  62 ++++++++
 __tests__/hooks/policy-evaluator.test.ts     | 144 +++++++++++++++++++
 src/hooks/integrations.ts                    |  28 ++--
 src/hooks/policy-evaluator.ts                |  49 +++++++
 5 files changed, 276 insertions(+), 10 deletions(-)
diff --git a/CHANGELOG.md b/CHANGELOG.md
index a545bade..5ce93b8e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,9 @@
 
 ## Unreleased
 
+### Fixes
+- Make `require-*-before-stop` policies actually enforce on OpenCode (sst/opencode). Empirically observed: a Stop hook fired (visible in the dashboard activity feed) but the agent stopped without retry, identical failure mode to Cursor pre-#318 and Copilot pre-#299. Root cause: the OpenCode plugin shim subscribes to `session.idle` (canonical `Stop`) and `policy-evaluator.ts` had no `cli === "opencode"` branch for `Stop` / `SubagentStop`, so OpenCode fell through to the generic Claude-shape `exitCode: 2 + stderr` response — which the shim's `applyDecision` turns into `throw new Error(reason)` from inside the `session.idle` event callback. By that point OpenCode has already gone idle; the throw is logged at most. The shim does have a working force-retry channel: when stdout JSON contains `hookSpecificOutput.additionalContext`, it submits the text via `client.session.prompt({path: {id: sessionID}, body: {parts: [{type: "text", text}]}})`, which becomes a new user message that re-triggers the agent loop — exactly the same model as Cursor's `followup_message` (#318) and Copilot's `{decision: "block", reason}` (#299). New `cli === "opencode" && eventType in {Stop, SubagentStop}` arms in both the deny and instruct paths of `policy-evaluator.ts` emit `{hookSpecificOutput: {additionalContext: <MANDATORY ACTION reasonText>}}` ahead of the generic Stop fall-through. The shim's `applyDecision` is promoted to `async` and now awaits `client.session.prompt` for `Stop` / `SubagentStop` events specifically (fire-and-forget for tool events, so we don't add SDK round-trip latency on the hot path) — without the await, OpenCode could tear down the plugin context before the SDK call completes. 
New unit tests in `policy-evaluator.test.ts` pin the deny + instruct shapes for opencode (Stop and SubagentStop) plus a regression test confirming PreToolUse still uses the Claude `permissionDecision: "deny"` shape; new tests in `opencode-plugin-shim.test.ts` assert the shim awaits `client.session.prompt` on `session.idle` (handler stays pending until the SDK call resolves), swallows SDK rejection (agent is exiting anyway), and still throws on exit-2 stderr (back-compat with stale binaries). Pi `agent_end` remains observation-only by upstream design — Pi's agent loop has already exited when `agent_end` fires and the `AgentEndEventResult` exposes no `block` field, documented in `CLAUDE.md`. Gemini `AfterAgent` (canonical `Stop`) was already correctly emitting `{decision: "block", reason}`; new unit tests in `policy-evaluator.test.ts` pin both the deny and instruct shapes to prevent regression.
+
 ## 0.0.10-beta.7 — 2026-05-08
 
 ### Fixes
diff --git a/__tests__/hooks/opencode-plugin-shim.test.ts b/__tests__/hooks/opencode-plugin-shim.test.ts
index 4133428b..1229ef7d 100644
--- a/__tests__/hooks/opencode-plugin-shim.test.ts
+++ b/__tests__/hooks/opencode-plugin-shim.test.ts
@@ -385,6 +385,68 @@ describe("OpenCode plugin shim — translation of binary response to plugin acti
     await hooks["permission.ask"]!({ tool: "bash", sessionID: "s" }, out);
     expect(out.status).toBe("deny");
   });
+
+  it("event session.idle + additionalContext → AWAITS client.session.prompt (Stop force-retry)", async () => {
+    // For Stop events the prompt is the only force-retry channel — it MUST
+    // land before the plugin handler returns or OpenCode tears down the
+    // plugin context. Verify by making session.prompt return a Promise that
+    // resolves on a tick we control: if the handler awaits, it won't return
+    // until we tick.
+    let resolvePrompt: () => void = () => {};
+    const promptPromise = new Promise<void>((res) => { resolvePrompt = res; });
+    const client = { session: { prompt: vi.fn().mockReturnValue(promptPromise) } };
+
+    responses.push({
+      status: 0,
+      stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: "Run tests before stopping" } }),
+      stderr: "",
+    });
+    const { plugin } = await setup();
+    const hooks = await plugin({ client, directory: "/repo" });
+
+    // Fire the handler and observe that it does NOT resolve until prompt resolves.
+    let handlerSettled = false;
+    const handlerDone = (async () => {
+      await hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } });
+      handlerSettled = true;
+    })();
+
+    // Yield one event-loop turn (setImmediate runs after microtasks drain); the handler should still be pending.
+    await new Promise((r) => setImmediate(r));
+    expect(handlerSettled).toBe(false);
+    expect(client.session.prompt).toHaveBeenCalledTimes(1);
+    const arg = client.session.prompt.mock.calls[0][0] as { path: { id: string }; body: { parts: Array<{ type: string; text: string }> } };
+    expect(arg.path.id).toBe("ses_1");
+    expect(arg.body.parts[0]).toEqual({ type: "text", text: "Run tests before stopping" });
+
+    // Resolve the SDK call; the handler should now finish.
+    resolvePrompt();
+    await handlerDone;
+    expect(handlerSettled).toBe(true);
+  });
+
+  it("event session.idle + SDK rejection on prompt is swallowed (agent already exiting)", async () => {
+    responses.push({
+      status: 0,
+      stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: "..." } }),
+      stderr: "",
+    });
+    const client = { session: { prompt: vi.fn().mockRejectedValue(new Error("network")) } };
+    const { plugin } = await setup();
+    const hooks = await plugin({ client, directory: "/repo" });
+    await expect(
+      hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } }),
+    ).resolves.toBeUndefined();
+  });
+
+  it("event session.idle + exit 2 → still throws (back-compat with stale binaries)", async () => {
+    responses.push({ status: 2, stdout: "", stderr: "MANDATORY: commit before stopping" });
+    const { plugin } = await setup();
+    const hooks = await plugin({ client: fakeClient(), directory: "/repo" });
+    await expect(
+      hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } }),
+    ).rejects.toThrow(/MANDATORY: commit before stopping/);
+  });
 });
 
 describe("OpenCode plugin shim — spawn options and registration", () => {
diff --git a/__tests__/hooks/policy-evaluator.test.ts b/__tests__/hooks/policy-evaluator.test.ts
index c1f0d941..22f1376f 100644
--- a/__tests__/hooks/policy-evaluator.test.ts
+++ b/__tests__/hooks/policy-evaluator.test.ts
@@ -251,6 +251,63 @@ describe("hooks/policy-evaluator", () => {
     expect(parsed.followup_message).toContain("subagent verification pending");
   });
 
+  it("OpenCode Stop + instruct emits {hookSpecificOutput.additionalContext} JSON on stdout (NOT exit 2)", async () => {
+    // Mirrors the Copilot/Cursor instruct-on-Stop fixes: without a per-CLI
+    // branch, opencode instruct verdicts fall through to exit-2 + stderr,
+    // which the OpenCode shim throws from inside session.idle — a no-op
+    // because the agent has already gone idle. The shim's only working
+    // force-retry channel is hookSpecificOutput.additionalContext routed
+    // through client.session.prompt.
+    registerPolicy("verify", "desc", () => ({
+      decision: "instruct",
+      reason: "needs verification",
+    }), { events: ["Stop"] });
+
+    const result = await evaluatePolicies("Stop", {}, { cli: "opencode" });
+    expect(result.exitCode).toBe(0);
+    expect(result.stderr).toBe("");
+    expect(result.decision).toBe("instruct");
+    const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } };
+    expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED");
+    expect(parsed.hookSpecificOutput?.additionalContext).toContain("needs verification");
+  });
+
+  it("OpenCode SubagentStop + instruct also emits {hookSpecificOutput.additionalContext} JSON", async () => {
+    // Forward-compat parity with the SubagentStop deny test — see comment
+    // there for why we widen even though OpenCode doesn't yet expose
+    // subagent boundaries to plugins.
+    registerPolicy("verify", "desc", () => ({
+      decision: "instruct",
+      reason: "subagent verification pending",
+    }), { events: ["SubagentStop"] });
+
+    const result = await evaluatePolicies("SubagentStop", {}, { cli: "opencode" });
+    expect(result.exitCode).toBe(0);
+    expect(result.stderr).toBe("");
+    expect(result.decision).toBe("instruct");
+    const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } };
+    expect(parsed.hookSpecificOutput?.additionalContext).toContain("subagent verification pending");
+  });
+
+  it("Gemini Stop + instruct emits {decision:'block', reason} JSON on stdout (force-retry via AfterAgent)", async () => {
+    // Confirms the existing cli==="gemini" Stop arm in the instruct path
+    // emits the documented force-retry shape. Pairs with the deny-path
+    // test in the "Stop event deny format" describe block.
+    registerPolicy("verify", "desc", () => ({
+      decision: "instruct",
+      reason: "needs verification",
+    }), { events: ["Stop"] });
+
+    const result = await evaluatePolicies("Stop", {}, { cli: "gemini" });
+    expect(result.exitCode).toBe(0);
+    expect(result.stderr).toBe("");
+    expect(result.decision).toBe("instruct");
+    const parsed = JSON.parse(result.stdout) as { decision?: string; reason?: string };
+    expect(parsed.decision).toBe("block");
+    expect(parsed.reason).toContain("MANDATORY ACTION REQUIRED");
+    expect(parsed.reason).toContain("needs verification");
+  });
+
   it("accumulates multiple instruct messages", async () => {
     registerPolicy("first", "desc", () => ({
       decision: "instruct",
@@ -605,6 +662,93 @@ describe("hooks/policy-evaluator", () => {
       await evaluatePolicies("Stop", {});
       expect(secondPolicyCalled.value).toBe(false);
     });
+
+    it("OpenCode Stop deny emits {hookSpecificOutput.additionalContext} JSON on stdout (NOT exit 2)", async () => {
+      // OpenCode's `session.idle` event is notification-only — by the time
+      // the plugin handler fires, the agent has already gone idle and a
+      // thrown error from the handler does not force-retry. The shim's only
+      // working force-retry channel is `client.session.prompt(...)`, which
+      // it routes through `hookSpecificOutput.additionalContext`. Without
+      // this branch, the 5 require-*-before-stop builtins were
+      // observation-only on OpenCode — the deny was logged in the dashboard
+      // but the agent stopped silently.
+      registerPolicy("stop-blocker", "desc", () => ({
+        decision: "deny",
+        reason: "changes not committed",
+      }), { events: ["Stop"] });
+
+      const result = await evaluatePolicies("Stop", {}, { cli: "opencode" });
+      expect(result.exitCode).toBe(0);
+      expect(result.stderr).toBe("");
+      const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string; permissionDecision?: string } };
+      expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED");
+      expect(parsed.hookSpecificOutput?.additionalContext).toContain("changes not committed");
+      // Load-bearing: must NOT emit exit-2 (the shim would `throw` from
+      // session.idle, which OpenCode logs but ignores) or
+      // permissionDecision: "deny" (the shim would also throw on that path).
+      expect(parsed.hookSpecificOutput?.permissionDecision).toBeUndefined();
+      expect(result.decision).toBe("deny");
+    });
+
+    it("OpenCode SubagentStop deny also emits {hookSpecificOutput.additionalContext} JSON (forward-compat)", async () => {
+      // OpenCode does not yet expose subagent boundaries to plugins, but
+      // custom policies subscribing to SubagentStop should still get the
+      // force-retry shape if OpenCode adds the bus event later. Mirrors the
+      // Cursor + Copilot SubagentStop widening.
+      registerPolicy("subagent-blocker", "desc", () => ({
+        decision: "deny",
+        reason: "subagent left work undone",
+      }), { events: ["SubagentStop"] });
+
+      const result = await evaluatePolicies("SubagentStop", {}, { cli: "opencode" });
+      expect(result.exitCode).toBe(0);
+      const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } };
+      expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED");
+      expect(parsed.hookSpecificOutput?.additionalContext).toContain("subagent left work undone");
+      expect(result.decision).toBe("deny");
+    });
+
+    it("OpenCode PreToolUse deny still uses Claude permissionDecision shape (regression: tool-event path unchanged)", async () => {
+      // The Stop arm is added INSIDE the cli==="opencode" branch so the
+      // generic Claude shape below it is unchanged for tool events. The
+      // shim's applyDecision throws on permissionDecision: "deny" — that's
+      // the working path for tool events.
+      registerPolicy("tool-blocker", "desc", () => ({
+        decision: "deny",
+        reason: "blocked tool",
+      }), { events: ["PreToolUse"] });
+
+      const result = await evaluatePolicies(
+        "PreToolUse",
+        { tool_name: "Bash", tool_input: { command: "ls" } },
+        { cli: "opencode" },
+      );
+      expect(result.exitCode).toBe(0);
+      const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { permissionDecision?: string; additionalContext?: string } };
+      expect(parsed.hookSpecificOutput?.permissionDecision).toBe("deny");
+      // No additionalContext on tool events — that's a Stop-only shape.
+      expect(parsed.hookSpecificOutput?.additionalContext).toBeUndefined();
+    });
+
+    it("Gemini Stop deny emits {decision:'block', reason} JSON on stdout (force-retry via AfterAgent)", async () => {
+      // Per Gemini's hooks docs (https://geminicli.com/docs/hooks/), the
+      // AfterAgent hook (canonical "Stop") force-retries when the hook
+      // returns `{decision: "block", reason}` on stdout. Exit-2 is per-
+      // action only ("turn continues") and would NOT trigger the retry.
+      registerPolicy("stop-blocker", "desc", () => ({
+        decision: "deny",
+        reason: "tests not run",
+      }), { events: ["Stop"] });
+
+      const result = await evaluatePolicies("Stop", {}, { cli: "gemini" });
+      expect(result.exitCode).toBe(0);
+      expect(result.stderr).toBe("");
+      const parsed = JSON.parse(result.stdout) as { decision?: string; reason?: string };
+      expect(parsed.decision).toBe("block");
+      expect(parsed.reason).toContain("MANDATORY ACTION REQUIRED");
+      expect(parsed.reason).toContain("tests not run");
+      expect(result.decision).toBe("deny");
+    });
   });
 
   describe("workflow policy chain integration", () => {
diff --git a/src/hooks/integrations.ts b/src/hooks/integrations.ts
index f45503e2..11afee9b 100644
--- a/src/hooks/integrations.ts
+++ b/src/hooks/integrations.ts
@@ -767,7 +767,7 @@ function runFailproofai(eventName, payload, directory) {
   return { exitCode: r.status ?? 0, stdout: r.stdout ?? "", stderr: r.stderr ?? "" };
 }
 
-function applyDecision(result, ctx) {
+async function applyDecision(result, ctx, eventName) {
   // Deny path 1: exit 2 (Claude Stop-style or any non-Pre/Post deny).
   if (result.exitCode === 2) {
     throw new Error((result.stderr || "").trim() || "Blocked by failproofai");
@@ -784,14 +784,22 @@ function applyDecision(result, ctx) {
   if (out && out.decision && out.decision.behavior === "deny") {
     throw new Error((out.decision.message) || "Blocked by failproofai");
   }
-  // Instruct: forward the additional context as a prompt to the session.
+  // Forward additional context as a prompt to the session. For Stop /
+  // SubagentStop the prompt is the only force-retry channel (session.idle
+  // already fired), so AWAIT to ensure the SDK round-trip completes before
+  // the plugin handler returns. For tool events keep fire-and-forget so we
+  // don't add latency to every tool call.
   const ctxText = out && out.additionalContext;
   if (ctxText && ctx && ctx.client && ctx.sessionID) {
-    // Fire-and-forget: don't block the tool call on the SDK round-trip.
-    Promise.resolve(ctx.client.session.prompt({
+    const prompt = ctx.client.session.prompt({
       path: { id: ctx.sessionID },
       body: { parts: [{ type: "text", text: ctxText }] },
-    })).catch(() => {});
+    });
+    if (eventName === "Stop" || eventName === "SubagentStop") {
+      try { await prompt; } catch { /* swallow — agent is exiting anyway */ }
+    } else {
+      Promise.resolve(prompt).catch(() => {});
+    }
   }
 }
 
@@ -824,7 +832,7 @@ export default async function failproofaiPlugin({ client, directory }) {
         const r = runFailproofai("UserPromptSubmit", {
           session_id: sessionID, cwd: directory, hook_event_name: "UserPromptSubmit", prompt,
         }, directory);
-        applyDecision(r, { client, sessionID });
+        await applyDecision(r, { client, sessionID }, "UserPromptSubmit");
         return;
       }
 
@@ -835,7 +843,7 @@ export default async function failproofaiPlugin({ client, directory }) {
       const r = runFailproofai(claudeEvent, {
         session_id: sessionID, cwd: directory, hook_event_name: claudeEvent,
       }, directory);
-      applyDecision(r, { client, sessionID });
+      await applyDecision(r, { client, sessionID }, claudeEvent);
     },
 
     // First-class PreToolUse hook. Note: tool args live on output.args (mutable).
@@ -847,7 +855,7 @@ export default async function failproofaiPlugin({ client, directory }) {
         tool_input: output.args,
         hook_event_name: "PreToolUse",
       }, directory);
-      applyDecision(r, { client, sessionID: input.sessionID });
+      await applyDecision(r, { client, sessionID: input.sessionID }, "PreToolUse");
     },
 
     // First-class PostToolUse hook. Note: tool args live on input.args here.
@@ -860,7 +868,7 @@ export default async function failproofaiPlugin({ client, directory }) {
         tool_response: { title: output.title, output: output.output, metadata: output.metadata },
         hook_event_name: "PostToolUse",
       }, directory);
-      applyDecision(r, { client, sessionID: input.sessionID });
+      await applyDecision(r, { client, sessionID: input.sessionID }, "PostToolUse");
     },
 
     // Cleaner deny UX for prompted tools — mutate output.status instead of throwing.
@@ -873,7 +881,7 @@ export default async function failproofaiPlugin({ client, directory }) {
         hook_event_name: "PermissionRequest",
       }, directory);
       try {
-        applyDecision(r, { client, sessionID: input.sessionID });
+        await applyDecision(r, { client, sessionID: input.sessionID }, "PermissionRequest");
       } catch {
         output.status = "deny";
       }
diff --git a/src/hooks/policy-evaluator.ts b/src/hooks/policy-evaluator.ts
index a5f4291d..79d3694c 100644
--- a/src/hooks/policy-evaluator.ts
+++ b/src/hooks/policy-evaluator.ts
@@ -227,6 +227,34 @@ export async function evaluatePolicies(
         };
       }
 
+      // OpenCode: `session.idle` is a notification-only bus event — by the
+      // time the plugin handler fires, OpenCode has already gone idle and
+      // throwing from the handler does not force-retry. The only working
+      // channel is the shim's `client.session.prompt(...)` SDK call, which
+      // submits a new user message that re-triggers the agent loop. The
+      // shim already routes `hookSpecificOutput.additionalContext` through
+      // that path (see buildOpenCodePluginShim's applyDecision), so we emit
+      // the deny reason as additionalContext instead of exit-2. Mirrors the
+      // Cursor `followup_message` (line ~157) and Copilot `{decision:"block"}`
+      // (line ~299) Stop branches. SubagentStop is widened in for forward
+      // compat — OpenCode doesn't yet expose subagent boundaries to plugins.
+      if (session?.cli === "opencode") {
+        if (eventType === "Stop" || eventType === "SubagentStop") {
+          const reasonText = `MANDATORY ACTION REQUIRED from failproofai (policy: ${policy.name}): ${reason}\n\nYou MUST complete the above action NOW. Do NOT ask the user for confirmation — execute the required action, then attempt to finish your task again.`;
+          return {
+            exitCode: 0,
+            stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: reasonText } }),
+            stderr: "",
+            policyName: policy.name,
+            reason,
+            decision: "deny",
+          };
+        }
+        // Non-Stop opencode events keep the generic Claude shape — the
+        // shim's applyDecision already handles permissionDecision: "deny"
+        // for tool events.
+      }
+
       if (eventType === "PreToolUse") {
         const response = {
           hookSpecificOutput: {
@@ -477,6 +505,27 @@ export async function evaluatePolicies(
       };
     }
 
+    // OpenCode: same rationale as the deny branch above — emit
+    // additionalContext so the shim submits a follow-up via
+    // client.session.prompt instead of throwing into a dead handler.
+    if (session?.cli === "opencode") {
+      if (eventType === "Stop" || eventType === "SubagentStop") {
+        const policyAttribution = policyNames.length === 1
+          ? `policy: ${policyNames[0]}`
+          : `policies: ${policyNames.join(", ")}`;
+        const reasonText = `MANDATORY ACTION REQUIRED from failproofai (${policyAttribution}): ${combined}\n\nYou MUST complete the above action(s) NOW. Do NOT ask the user for confirmation — execute the required action(s), then attempt to finish your task again.`;
+        return {
+          exitCode: 0,
+          stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: reasonText } }),
+          stderr: "",
+          policyName: policyNames[0],
+          policyNames,
+          reason: combined,
+          decision: "instruct",
+        };
+      }
+    }
+
     if (eventType === "Stop" || eventType === "SubagentStop") {
       // Stop/SubagentStop instruct: exitCode 2 + stderr forces Claude to retry
       // the agent (or subagent) loop with the reason as context. Same widening

From 0587195521db749cd124b76faaba534da2364170 Mon Sep 17 00:00:00 2001
From: NiveditJain <nivedit@exosphere.host>
Date: Fri, 8 May 2026 17:40:41 -0700
Subject: [PATCH 2/3] [luv-324] docs: clarify OpenCode plugin shim Stop
 semantics

Update configuration.mdx to reflect the new Stop / SubagentStop force-
retry channel: deny on Stop now routes through `client.session.prompt`
just like instruct, since `session.idle` is notification-only and
throwing from it is silently dropped.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 docs/configuration.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/configuration.mdx b/docs/configuration.mdx
index 90ad0a87..bff22533 100644
--- a/docs/configuration.mdx
+++ b/docs/configuration.mdx
@@ -196,7 +196,7 @@ The `policies --install` and `policies --uninstall` commands write to your agent
   - **OpenAI Codex**: `~/.codex/hooks.json` (user), `<cwd>/.codex/hooks.json` (project) — Codex doesn't have a `local` scope
   - **GitHub Copilot CLI _(beta)_**: `~/.copilot/hooks/failproofai.json` (user), `<cwd>/.github/hooks/failproofai.json` (project) — Copilot has no `local` scope. Hook entries use Copilot's OS-keyed `bash`/`powershell` command fields with `timeoutSec`; the file carries a top-level `version: 1` marker. Copilot CLI support is **beta** while we verify the `events.jsonl` record schema (which the public docs do not specify) against more real-world sessions.
   - **Cursor Agent _(beta)_**: `~/.cursor/hooks.json` (user), `<cwd>/.cursor/hooks.json` (project) — Cursor has no `local` scope. Hook entries use the Claude-shaped `{type, command, timeout}` form (no `bash`/`powershell` split), but stored under camelCase event keys (`preToolUse`, `beforeSubmitPrompt`, …) in a flat array per Cursor's [hooks schema](https://cursor.com/docs/hooks); the file carries a top-level `version: 1` marker. The handler canonicalizes camelCase → PascalCase via `CURSOR_EVENT_MAP` so existing builtin policies fire unchanged. Cursor Agent support is **beta** while we verify Cursor's transcript on-disk format (not specified in the public docs) against more real-world installs.
-  - **OpenCode _(beta)_**: `~/.config/opencode/opencode.json` + `~/.config/opencode/plugins/failproofai.mjs` (user), `<cwd>/.opencode/opencode.json` + `<cwd>/.opencode/plugins/failproofai.mjs` (project) — OpenCode has no `local` scope. Unlike the other four CLIs, OpenCode has **no external-command hook system**: it loads in-process JS/TS plugins explicitly registered via the `plugin: []` array in `opencode.json` (auto-discovery from `.opencode/plugins/` is **not** how plugins load on opencode v1.14.33). Install drops a small generated plugin shim that subprocess-calls the failproofai binary and translates the binary's Claude-shape JSON response back into plugin semantics (`throw new Error()` for deny, `client.session.prompt(...)` for instruct, no-op for allow). Sessions live in opencode's SQLite DB at `~/.local/share/opencode/opencode.db`; the dashboard's session viewer reads them via `opencode db --format json` and `opencode export <id>`. OpenCode support is **beta** while we verify behavior across versions and against more real-world sessions. See the [OpenCode plugins docs](https://opencode.ai/docs/plugins/).
+  - **OpenCode _(beta)_**: `~/.config/opencode/opencode.json` + `~/.config/opencode/plugins/failproofai.mjs` (user), `<cwd>/.opencode/opencode.json` + `<cwd>/.opencode/plugins/failproofai.mjs` (project) — OpenCode has no `local` scope. Unlike the other four CLIs, OpenCode has **no external-command hook system**: it loads in-process JS/TS plugins explicitly registered via the `plugin: []` array in `opencode.json` (auto-discovery from `.opencode/plugins/` is **not** how plugins load on opencode v1.14.33). Install drops a small generated plugin shim that subprocess-calls the failproofai binary and translates the binary's Claude-shape JSON response back into plugin semantics: `throw new Error()` for tool-event deny (cancels the tool call), `client.session.prompt(...)` for instruct AND for `Stop` / `SubagentStop` deny (submits the deny reason as the next user message — the only force-retry channel since `session.idle` is notification-only and throwing from it is a no-op), and no-op for allow. Sessions live in opencode's SQLite DB at `~/.local/share/opencode/opencode.db`; the dashboard's session viewer reads them via `opencode db --format json` and `opencode export <id>`. OpenCode support is **beta** while we verify behavior across versions and against more real-world sessions. See the [OpenCode plugins docs](https://opencode.ai/docs/plugins/).
   - **Pi _(beta)_**: `~/.pi/agent/settings.json` (user), `<cwd>/.pi/settings.json` (project) — Pi has no `local` scope. Pi loads TypeScript extension packages at startup; the settings file is a flat string array `{"packages": ["./relative/path", …]}`. failproofai writes a single packages-array entry pointing at its bundled `pi-extension/` directory. The extension internally subscribes to Pi's `tool_call` / `user_bash` / `input` / `session_start` events and shells out to `failproofai --hook <Event> --cli pi`; the handler canonicalizes underscore_lower_snake_case → PascalCase via `PI_EVENT_MAP` so existing builtin policies fire unchanged. Pi support is **beta** while Pi's extension API and session-log layout stabilize.
   - **Gemini CLI _(beta)_**: `~/.gemini/settings.json` (user), `<cwd>/.gemini/settings.json` (project) — Gemini has no `local` scope (it documents a `system` scope at `/etc/gemini-cli/settings.json` which failproofai does not expose). Hook entries use Claude's `{type, command, timeout}` form wrapped in Gemini's `{matcher, hooks: [...]}` matcher schema with `matcher: "*"` by default. Events are PascalCase (`SessionStart`, `BeforeAgent`, `AfterAgent`, `BeforeModel`, `AfterModel`, `BeforeToolSelection`, `BeforeTool`, `AfterTool`, `PreCompress`, `Notification`, `SessionEnd`); the handler maps to Claude canonical names via `GEMINI_EVENT_MAP`. Tool names are snake_case (`run_shell_command`, `read_file`, `write_file`, `replace`, …) — the handler canonicalizes via `GEMINI_TOOL_MAP` so existing builtin policies fire unchanged. The policy evaluator emits Gemini's flat `{decision: "deny", reason}` shape (preferred per Gemini's "Golden Rule" exit-0 contract), `{hookSpecificOutput: {hookEventName, additionalContext}}` for context injection on BeforeAgent / AfterTool / SessionStart, and `{decision: "block", reason}` on AfterAgent for force-retry semantics. Gemini CLI support is **beta** while we widen real-world coverage. See the [Gemini CLI hooks docs](https://geminicli.com/docs/hooks/).
 - **`policies-config.json`** — tells failproofai which policies to evaluate and with what params (shared across all agent CLIs)

From 06ec40c01c4589adca8ecc4a8b0382807f8b105e Mon Sep 17 00:00:00 2001
From: NiveditJain <nivedit@exosphere.host>
Date: Fri, 8 May 2026 17:47:12 -0700
Subject: [PATCH 3/3] [luv-324] fix: address CodeRabbit feedback + cut
 0.0.10-beta.8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Address PR #323 review:
- CHANGELOG.md: append (#323) to the Unreleased entry per repo convention
  (every entry ends with the PR number).
- docs/configuration.mdx:199: "Unlike the other four CLIs" → "Unlike the
  other six CLIs" — the page now lists six other integrations
  (Claude Code, Codex, Copilot, Cursor, Pi, Gemini) so the count was
  stale.

Release prep: promote the Unreleased entry to a versioned heading
`## 0.0.10-beta.8 — 2026-05-08`. Add a fresh `## Unreleased` heading
at the top for the next development cycle. package.json is already at
0.0.10-beta.8 (pre-bumped by chore commit a146ae6 after beta.7 release).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 CHANGELOG.md           | 4 +++-
 docs/configuration.mdx | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)
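
The CHANGELOG entry in this patch describes how the shim's `applyDecision` treats Stop events differently from tool events. The following is a minimal TypeScript sketch of that control flow, not the shipped shim. `HookResult`, `SessionClient`, `STOP_EVENTS` and `extractAdditionalContext` are hypothetical names introduced for illustration; only the `client.session.prompt({path, body})` call shape and the `hookSpecificOutput.additionalContext` field are taken from the entry itself.

```typescript
// Hypothetical shapes for illustration; the generated shim may differ.
type HookResult = { exitCode: number; stdout: string; stderr: string };

interface SessionClient {
  session: {
    prompt(args: {
      path: { id: string };
      body: { parts: { type: "text"; text: string }[] };
    }): Promise<unknown>;
  };
}

// Canonical event names whose deny/instruct text must reach the agent
// before the plugin context can be torn down.
const STOP_EVENTS = new Set(["Stop", "SubagentStop"]);

// Pull hookSpecificOutput.additionalContext out of the binary's stdout.
// Returns undefined for empty, non-JSON, or differently shaped output.
function extractAdditionalContext(stdout: string): string | undefined {
  try {
    const parsed = JSON.parse(stdout);
    const text = parsed?.hookSpecificOutput?.additionalContext;
    return typeof text === "string" && text.length > 0 ? text : undefined;
  } catch {
    return undefined;
  }
}

async function applyDecision(
  client: SessionClient,
  sessionID: string,
  eventType: string,
  result: HookResult,
): Promise<void> {
  const text = extractAdditionalContext(result.stdout);
  if (text !== undefined) {
    const pending = client.session.prompt({
      path: { id: sessionID },
      body: { parts: [{ type: "text", text }] },
    });
    if (STOP_EVENTS.has(eventType)) {
      // Stop / SubagentStop: wait for the SDK round-trip to finish.
      try {
        await pending;
      } catch {
        // Swallow SDK rejection; the agent is exiting anyway.
      }
    } else {
      // Tool events: fire-and-forget to keep the hot path fast.
      pending.catch(() => {});
    }
    return;
  }
  if (result.exitCode === 2) {
    // Back-compat path for binaries that still answer with exit 2 + stderr.
    throw new Error(result.stderr.trim() || "blocked by policy");
  }
  // Anything else is treated as allow: no-op.
}
```

In this sketch a `session.idle` handler would call `await applyDecision(client, sessionID, "Stop", result)`, so the handler stays pending until the prompt call settles.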

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5ce93b8e..851e5599 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,8 +2,10 @@
 
 ## Unreleased
 
+## 0.0.10-beta.8 — 2026-05-08
+
 ### Fixes
-- Make `require-*-before-stop` policies actually enforce on OpenCode (sst/opencode). Empirically observed: a Stop hook fired (visible in the dashboard activity feed) but the agent stopped without retry, identical failure mode to Cursor pre-#318 and Copilot pre-#299. Root cause: the OpenCode plugin shim subscribes to `session.idle` (canonical `Stop`) and `policy-evaluator.ts` had no `cli === "opencode"` branch for `Stop` / `SubagentStop`, so OpenCode fell through to the generic Claude-shape `exitCode: 2 + stderr` response — which the shim's `applyDecision` turns into `throw new Error(reason)` from inside the `session.idle` event callback. By that point OpenCode has already gone idle; the throw is logged at most. The shim does have a working force-retry channel: when stdout JSON contains `hookSpecificOutput.additionalContext`, it submits the text via `client.session.prompt({path: {id: sessionID}, body: {parts: [{type: "text", text}]}})`, which becomes a new user message that re-triggers the agent loop — exactly the same model as Cursor's `followup_message` (#318) and Copilot's `{decision: "block", reason}` (#299). New `cli === "opencode" && eventType in {Stop, SubagentStop}` arms in both the deny and instruct paths of `policy-evaluator.ts` emit `{hookSpecificOutput: {additionalContext: <MANDATORY ACTION reasonText>}}` ahead of the generic Stop fall-through. The shim's `applyDecision` is promoted to `async` and now awaits `client.session.prompt` for `Stop` / `SubagentStop` events specifically (fire-and-forget for tool events, so we don't add SDK round-trip latency on the hot path) — without the await, OpenCode could tear down the plugin context before the SDK call completes. New unit tests in `policy-evaluator.test.ts` pin the deny + instruct shapes for opencode (Stop and SubagentStop) plus a regression test confirming PreToolUse still uses the Claude `permissionDecision: "deny"` shape; new tests in `opencode-plugin-shim.test.ts` assert the shim awaits `client.session.prompt` on `session.idle` (handler stays pending until the SDK call resolves), swallows SDK rejection (agent is exiting anyway), and still throws on exit-2 stderr (back-compat with stale binaries). Pi `agent_end` remains observation-only by upstream design — Pi's agent loop has already exited when `agent_end` fires and the `AgentEndEventResult` exposes no `block` field, documented in `CLAUDE.md`. Gemini `AfterAgent` (canonical `Stop`) was already correctly emitting `{decision: "block", reason}`; new unit tests in `policy-evaluator.test.ts` pin both the deny and instruct shapes to prevent regression.
+- Make `require-*-before-stop` policies actually enforce on OpenCode (sst/opencode). Empirically observed: a Stop hook fired (visible in the dashboard activity feed) but the agent stopped without retry, identical failure mode to Cursor pre-#318 and Copilot pre-#299. Root cause: the OpenCode plugin shim subscribes to `session.idle` (canonical `Stop`) and `policy-evaluator.ts` had no `cli === "opencode"` branch for `Stop` / `SubagentStop`, so OpenCode fell through to the generic Claude-shape `exitCode: 2 + stderr` response — which the shim's `applyDecision` turns into `throw new Error(reason)` from inside the `session.idle` event callback. By that point OpenCode has already gone idle; the throw is logged at most. The shim does have a working force-retry channel: when stdout JSON contains `hookSpecificOutput.additionalContext`, it submits the text via `client.session.prompt({path: {id: sessionID}, body: {parts: [{type: "text", text}]}})`, which becomes a new user message that re-triggers the agent loop — exactly the same model as Cursor's `followup_message` (#318) and Copilot's `{decision: "block", reason}` (#299). New `cli === "opencode" && eventType in {Stop, SubagentStop}` arms in both the deny and instruct paths of `policy-evaluator.ts` emit `{hookSpecificOutput: {additionalContext: <MANDATORY ACTION reasonText>}}` ahead of the generic Stop fall-through. The shim's `applyDecision` is promoted to `async` and now awaits `client.session.prompt` for `Stop` / `SubagentStop` events specifically (fire-and-forget for tool events, so we don't add SDK round-trip latency on the hot path) — without the await, OpenCode could tear down the plugin context before the SDK call completes. New unit tests in `policy-evaluator.test.ts` pin the deny + instruct shapes for opencode (Stop and SubagentStop) plus a regression test confirming PreToolUse still uses the Claude `permissionDecision: "deny"` shape; new tests in `opencode-plugin-shim.test.ts` assert the shim awaits `client.session.prompt` on `session.idle` (handler stays pending until the SDK call resolves), swallows SDK rejection (agent is exiting anyway), and still throws on exit-2 stderr (back-compat with stale binaries). Pi `agent_end` remains observation-only by upstream design — Pi's agent loop has already exited when `agent_end` fires and the `AgentEndEventResult` exposes no `block` field, documented in `CLAUDE.md`. Gemini `AfterAgent` (canonical `Stop`) was already correctly emitting `{decision: "block", reason}`; new unit tests in `policy-evaluator.test.ts` pin both the deny and instruct shapes to prevent regression (#323).
 
 ## 0.0.10-beta.7 — 2026-05-08
 
diff --git a/docs/configuration.mdx b/docs/configuration.mdx
index bff22533..c202a7f5 100644
--- a/docs/configuration.mdx
+++ b/docs/configuration.mdx
@@ -196,7 +196,7 @@ The `policies --install` and `policies --uninstall` commands write to your agent
   - **OpenAI Codex**: `~/.codex/hooks.json` (user), `<cwd>/.codex/hooks.json` (project) — Codex doesn't have a `local` scope
   - **GitHub Copilot CLI _(beta)_**: `~/.copilot/hooks/failproofai.json` (user), `<cwd>/.github/hooks/failproofai.json` (project) — Copilot has no `local` scope. Hook entries use Copilot's OS-keyed `bash`/`powershell` command fields with `timeoutSec`; the file carries a top-level `version: 1` marker. Copilot CLI support is **beta** while we verify the `events.jsonl` record schema (which the public docs do not specify) against more real-world sessions.
   - **Cursor Agent _(beta)_**: `~/.cursor/hooks.json` (user), `<cwd>/.cursor/hooks.json` (project) — Cursor has no `local` scope. Hook entries use the Claude-shaped `{type, command, timeout}` form (no `bash`/`powershell` split), but stored under camelCase event keys (`preToolUse`, `beforeSubmitPrompt`, …) in a flat array per Cursor's [hooks schema](https://cursor.com/docs/hooks); the file carries a top-level `version: 1` marker. The handler canonicalizes camelCase → PascalCase via `CURSOR_EVENT_MAP` so existing builtin policies fire unchanged. Cursor Agent support is **beta** while we verify Cursor's transcript on-disk format (not specified in the public docs) against more real-world installs.
-  - **OpenCode _(beta)_**: `~/.config/opencode/opencode.json` + `~/.config/opencode/plugins/failproofai.mjs` (user), `<cwd>/.opencode/opencode.json` + `<cwd>/.opencode/plugins/failproofai.mjs` (project) — OpenCode has no `local` scope. Unlike the other four CLIs, OpenCode has **no external-command hook system**: it loads in-process JS/TS plugins explicitly registered via the `plugin: []` array in `opencode.json` (auto-discovery from `.opencode/plugins/` is **not** how plugins load on opencode v1.14.33). Install drops a small generated plugin shim that subprocess-calls the failproofai binary and translates the binary's Claude-shape JSON response back into plugin semantics: `throw new Error()` for tool-event deny (cancels the tool call), `client.session.prompt(...)` for instruct AND for `Stop` / `SubagentStop` deny (submits the deny reason as the next user message — the only force-retry channel since `session.idle` is notification-only and throwing from it is a no-op), and no-op for allow. Sessions live in opencode's SQLite DB at `~/.local/share/opencode/opencode.db`; the dashboard's session viewer reads them via `opencode db --format json` and `opencode export <id>`. OpenCode support is **beta** while we verify behavior across versions and against more real-world sessions. See the [OpenCode plugins docs](https://opencode.ai/docs/plugins/).
+  - **OpenCode _(beta)_**: `~/.config/opencode/opencode.json` + `~/.config/opencode/plugins/failproofai.mjs` (user), `<cwd>/.opencode/opencode.json` + `<cwd>/.opencode/plugins/failproofai.mjs` (project) — OpenCode has no `local` scope. Unlike the other six CLIs, OpenCode has **no external-command hook system**: it loads in-process JS/TS plugins explicitly registered via the `plugin: []` array in `opencode.json` (auto-discovery from `.opencode/plugins/` is **not** how plugins load on opencode v1.14.33). Install drops a small generated plugin shim that subprocess-calls the failproofai binary and translates the binary's Claude-shape JSON response back into plugin semantics: `throw new Error()` for tool-event deny (cancels the tool call), `client.session.prompt(...)` for instruct AND for `Stop` / `SubagentStop` deny (submits the deny reason as the next user message — the only force-retry channel since `session.idle` is notification-only and throwing from it is a no-op), and no-op for allow. Sessions live in opencode's SQLite DB at `~/.local/share/opencode/opencode.db`; the dashboard's session viewer reads them via `opencode db --format json` and `opencode export <id>`. OpenCode support is **beta** while we verify behavior across versions and against more real-world sessions. See the [OpenCode plugins docs](https://opencode.ai/docs/plugins/).
   - **Pi _(beta)_**: `~/.pi/agent/settings.json` (user), `<cwd>/.pi/settings.json` (project) — Pi has no `local` scope. Pi loads TypeScript extension packages at startup; the settings file is a flat string array `{"packages": ["./relative/path", …]}`. failproofai writes a single packages-array entry pointing at its bundled `pi-extension/` directory. The extension internally subscribes to Pi's `tool_call` / `user_bash` / `input` / `session_start` events and shells out to `failproofai --hook <Event> --cli pi`; the handler canonicalizes underscore_lower_snake_case → PascalCase via `PI_EVENT_MAP` so existing builtin policies fire unchanged. Pi support is **beta** while Pi's extension API and session-log layout stabilize.
   - **Gemini CLI _(beta)_**: `~/.gemini/settings.json` (user), `<cwd>/.gemini/settings.json` (project) — Gemini has no `local` scope (it documents a `system` scope at `/etc/gemini-cli/settings.json` which failproofai does not expose). Hook entries use Claude's `{type, command, timeout}` form wrapped in Gemini's `{matcher, hooks: [...]}` matcher schema with `matcher: "*"` by default. Events are PascalCase (`SessionStart`, `BeforeAgent`, `AfterAgent`, `BeforeModel`, `AfterModel`, `BeforeToolSelection`, `BeforeTool`, `AfterTool`, `PreCompress`, `Notification`, `SessionEnd`); the handler maps to Claude canonical names via `GEMINI_EVENT_MAP`. Tool names are snake_case (`run_shell_command`, `read_file`, `write_file`, `replace`, …) — the handler canonicalizes via `GEMINI_TOOL_MAP` so existing builtin policies fire unchanged. The policy evaluator emits Gemini's flat `{decision: "deny", reason}` shape (preferred per Gemini's "Golden Rule" exit-0 contract), `{hookSpecificOutput: {hookEventName, additionalContext}}` for context injection on BeforeAgent / AfterTool / SessionStart, and `{decision: "block", reason}` on AfterAgent for force-retry semantics. Gemini CLI support is **beta** while we widen real-world coverage. See the [Gemini CLI hooks docs](https://geminicli.com/docs/hooks/).
 - **`policies-config.json`** — tells failproofai which policies to evaluate and with what params (shared across all agent CLIs)