From 72f4a0a6defe467981db355f423ddbc87161a4c8 Mon Sep 17 00:00:00 2001
From: NiveditJain
Date: Fri, 8 May 2026 17:39:48 -0700
Subject: [PATCH 1/3] [luv-324] fix: enforce Stop hook on OpenCode
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Stop hooks fired on OpenCode (visible in dashboard activity feed) but the agent stopped without retry — same failure mode Cursor had pre-#318 and Copilot had pre-#299.

Root cause: no `cli === "opencode"` branch in policy-evaluator's Stop / SubagentStop handling, so OpenCode fell into the generic exit-2 path. The plugin shim's applyDecision turns exit-2 into `throw new Error(reason)`, but throwing from the `session.idle` event callback is a no-op — OpenCode is already idle by the time the event fires.

Fix: emit `{hookSpecificOutput: {additionalContext: }}` for opencode Stop / SubagentStop in both deny and instruct paths. The shim already routes `additionalContext` through `client.session.prompt(...)` which submits a new user message that re-triggers the agent loop — same model as Cursor's `followup_message` and Copilot's `{decision: "block", reason}`.

Promote applyDecision to async and `await client.session.prompt` for Stop/SubagentStop events so the SDK round-trip completes before the plugin context tears down; keep fire-and-forget for tool events to avoid hot-path latency.

Sister CLIs verified while in here:
- Gemini AfterAgent (canonical Stop) was already correctly emitting `{decision: "block", reason}`; new unit tests pin both deny and instruct shapes to prevent regression.
- Pi `agent_end` is observation-only by upstream design — Pi's agent loop has already exited and `AgentEndEventResult` exposes no `block` field. CLAUDE.md already documents this; no code change.

Co-Authored-By: Claude Opus 4.7 --- CHANGELOG.md | 3 + __tests__/hooks/opencode-plugin-shim.test.ts | 62 ++++++++ __tests__/hooks/policy-evaluator.test.ts | 144 +++++++++++++++++++ src/hooks/integrations.ts | 28 ++-- src/hooks/policy-evaluator.ts | 49 +++++++ 5 files changed, 276 insertions(+), 10 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index a545bade..5ce93b8e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,9 @@ ## Unreleased +### Fixes +- Make `require-*-before-stop` policies actually enforce on OpenCode (sst/opencode). Empirically observed: a Stop hook fired (visible in the dashboard activity feed) but the agent stopped without retry, identical failure mode to Cursor pre-#318 and Copilot pre-#299. Root cause: the OpenCode plugin shim subscribes to `session.idle` (canonical `Stop`) and `policy-evaluator.ts` had no `cli === "opencode"` branch for `Stop` / `SubagentStop`, so OpenCode fell through to the generic Claude-shape `exitCode: 2 + stderr` response — which the shim's `applyDecision` turns into `throw new Error(reason)` from inside the `session.idle` event callback. By that point OpenCode has already gone idle; the throw is logged at most. The shim does have a working force-retry channel: when stdout JSON contains `hookSpecificOutput.additionalContext`, it submits the text via `client.session.prompt({path: {id: sessionID}, body: {parts: [{type: "text", text}]}})`, which becomes a new user message that re-triggers the agent loop — exactly the same model as Cursor's `followup_message` (#318) and Copilot's `{decision: "block", reason}` (#299). New `cli === "opencode" && eventType in {Stop, SubagentStop}` arms in both the deny and instruct paths of `policy-evaluator.ts` emit `{hookSpecificOutput: {additionalContext: }}` ahead of the generic Stop fall-through. 
The shim's `applyDecision` is promoted to `async` and now awaits `client.session.prompt` for `Stop` / `SubagentStop` events specifically (fire-and-forget for tool events, so we don't add SDK round-trip latency on the hot path) — without the await, OpenCode could tear down the plugin context before the SDK call completes. New unit tests in `policy-evaluator.test.ts` pin the deny + instruct shapes for opencode (Stop and SubagentStop) plus a regression test confirming PreToolUse still uses the Claude `permissionDecision: "deny"` shape; new tests in `opencode-plugin-shim.test.ts` assert the shim awaits `client.session.prompt` on `session.idle` (handler stays pending until the SDK call resolves), swallows SDK rejection (agent is exiting anyway), and still throws on exit-2 stderr (back-compat with stale binaries). Pi `agent_end` remains observation-only by upstream design — Pi's agent loop has already exited when `agent_end` fires and the `AgentEndEventResult` exposes no `block` field, documented in `CLAUDE.md`. Gemini `AfterAgent` (canonical `Stop`) was already correctly emitting `{decision: "block", reason}`; new unit tests in `policy-evaluator.test.ts` pin both the deny and instruct shapes to prevent regression. + ## 0.0.10-beta.7 — 2026-05-08 ### Fixes diff --git a/__tests__/hooks/opencode-plugin-shim.test.ts b/__tests__/hooks/opencode-plugin-shim.test.ts index 4133428b..1229ef7d 100644 --- a/__tests__/hooks/opencode-plugin-shim.test.ts +++ b/__tests__/hooks/opencode-plugin-shim.test.ts @@ -385,6 +385,68 @@ describe("OpenCode plugin shim — translation of binary response to plugin acti await hooks["permission.ask"]!({ tool: "bash", sessionID: "s" }, out); expect(out.status).toBe("deny"); }); + + it("event session.idle + additionalContext → AWAITS client.session.prompt (Stop force-retry)", async () => { + // For Stop events the prompt is the only force-retry channel — it MUST + // land before the plugin handler returns or OpenCode tears down the + // plugin context. 
Verify by making session.prompt return a Promise that + // resolves on a tick we control: if the handler awaits, it won't return + // until we tick. + let resolvePrompt: () => void = () => {}; + const promptPromise = new Promise((res) => { resolvePrompt = res; }); + const client = { session: { prompt: vi.fn().mockReturnValue(promptPromise) } }; + + responses.push({ + status: 0, + stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: "Run tests before stopping" } }), + stderr: "", + }); + const { plugin } = await setup(); + const hooks = await plugin({ client, directory: "/repo" }); + + // Fire the handler and observe that it does NOT resolve until prompt resolves. + let handlerSettled = false; + const handlerDone = (async () => { + await hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } }); + handlerSettled = true; + })(); + + // Yield to the microtask queue; the handler should still be pending. + await new Promise((r) => setImmediate(r)); + expect(handlerSettled).toBe(false); + expect(client.session.prompt).toHaveBeenCalledTimes(1); + const arg = client.session.prompt.mock.calls[0][0] as { path: { id: string }; body: { parts: Array<{ type: string; text: string }> } }; + expect(arg.path.id).toBe("ses_1"); + expect(arg.body.parts[0]).toEqual({ type: "text", text: "Run tests before stopping" }); + + // Resolve the SDK call; the handler should now finish. + resolvePrompt(); + await handlerDone; + expect(handlerSettled).toBe(true); + }); + + it("event session.idle + SDK rejection on prompt is swallowed (agent already exiting)", async () => { + responses.push({ + status: 0, + stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: "..." 
} }), + stderr: "", + }); + const client = { session: { prompt: vi.fn().mockRejectedValue(new Error("network")) } }; + const { plugin } = await setup(); + const hooks = await plugin({ client, directory: "/repo" }); + await expect( + hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } }), + ).resolves.toBeUndefined(); + }); + + it("event session.idle + exit 2 → still throws (back-compat with stale binaries)", async () => { + responses.push({ status: 2, stdout: "", stderr: "MANDATORY: commit before stopping" }); + const { plugin } = await setup(); + const hooks = await plugin({ client: fakeClient(), directory: "/repo" }); + await expect( + hooks.event!({ event: { type: "session.idle", properties: { sessionID: "ses_1" } } }), + ).rejects.toThrow(/MANDATORY: commit before stopping/); + }); }); describe("OpenCode plugin shim — spawn options and registration", () => { diff --git a/__tests__/hooks/policy-evaluator.test.ts b/__tests__/hooks/policy-evaluator.test.ts index c1f0d941..22f1376f 100644 --- a/__tests__/hooks/policy-evaluator.test.ts +++ b/__tests__/hooks/policy-evaluator.test.ts @@ -251,6 +251,63 @@ describe("hooks/policy-evaluator", () => { expect(parsed.followup_message).toContain("subagent verification pending"); }); + it("OpenCode Stop + instruct emits {hookSpecificOutput.additionalContext} JSON on stdout (NOT exit 2)", async () => { + // Mirrors the Copilot/Cursor instruct-on-Stop fixes: without a per-CLI + // branch, opencode instruct verdicts fall through to exit-2 + stderr, + // which the OpenCode shim throws from inside session.idle — a no-op + // because the agent has already gone idle. The shim's only working + // force-retry channel is hookSpecificOutput.additionalContext routed + // through client.session.prompt. 
+ registerPolicy("verify", "desc", () => ({ + decision: "instruct", + reason: "needs verification", + }), { events: ["Stop"] }); + + const result = await evaluatePolicies("Stop", {}, { cli: "opencode" }); + expect(result.exitCode).toBe(0); + expect(result.stderr).toBe(""); + expect(result.decision).toBe("instruct"); + const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } }; + expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED"); + expect(parsed.hookSpecificOutput?.additionalContext).toContain("needs verification"); + }); + + it("OpenCode SubagentStop + instruct also emits {hookSpecificOutput.additionalContext} JSON", async () => { + // Forward-compat parity with the SubagentStop deny test — see comment + // there for why we widen even though OpenCode doesn't yet expose + // subagent boundaries to plugins. + registerPolicy("verify", "desc", () => ({ + decision: "instruct", + reason: "subagent verification pending", + }), { events: ["SubagentStop"] }); + + const result = await evaluatePolicies("SubagentStop", {}, { cli: "opencode" }); + expect(result.exitCode).toBe(0); + expect(result.stderr).toBe(""); + expect(result.decision).toBe("instruct"); + const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } }; + expect(parsed.hookSpecificOutput?.additionalContext).toContain("subagent verification pending"); + }); + + it("Gemini Stop + instruct emits {decision:'block', reason} JSON on stdout (force-retry via AfterAgent)", async () => { + // Confirms the existing cli==="gemini" Stop arm in the instruct path + // emits the documented force-retry shape. Pairs with the deny-path + // test in the "Stop event deny format" describe block. 
+ registerPolicy("verify", "desc", () => ({ + decision: "instruct", + reason: "needs verification", + }), { events: ["Stop"] }); + + const result = await evaluatePolicies("Stop", {}, { cli: "gemini" }); + expect(result.exitCode).toBe(0); + expect(result.stderr).toBe(""); + expect(result.decision).toBe("instruct"); + const parsed = JSON.parse(result.stdout) as { decision?: string; reason?: string }; + expect(parsed.decision).toBe("block"); + expect(parsed.reason).toContain("MANDATORY ACTION REQUIRED"); + expect(parsed.reason).toContain("needs verification"); + }); + it("accumulates multiple instruct messages", async () => { registerPolicy("first", "desc", () => ({ decision: "instruct", @@ -605,6 +662,93 @@ describe("hooks/policy-evaluator", () => { await evaluatePolicies("Stop", {}); expect(secondPolicyCalled.value).toBe(false); }); + + it("OpenCode Stop deny emits {hookSpecificOutput.additionalContext} JSON on stdout (NOT exit 2)", async () => { + // OpenCode's `session.idle` event is notification-only — by the time + // the plugin handler fires, the agent has already gone idle and a + // thrown error from the handler does not force-retry. The shim's only + // working force-retry channel is `client.session.prompt(...)`, which + // it routes through `hookSpecificOutput.additionalContext`. Without + // this branch, the 5 require-*-before-stop builtins were + // observation-only on OpenCode — the deny was logged in the dashboard + // but the agent stopped silently. 
+ registerPolicy("stop-blocker", "desc", () => ({ + decision: "deny", + reason: "changes not committed", + }), { events: ["Stop"] }); + + const result = await evaluatePolicies("Stop", {}, { cli: "opencode" }); + expect(result.exitCode).toBe(0); + expect(result.stderr).toBe(""); + const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string; permissionDecision?: string } }; + expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED"); + expect(parsed.hookSpecificOutput?.additionalContext).toContain("changes not committed"); + // Load-bearing: must NOT emit exit-2 (the shim would `throw` from + // session.idle, which OpenCode logs but ignores) or + // permissionDecision: "deny" (the shim would also throw on that path). + expect(parsed.hookSpecificOutput?.permissionDecision).toBeUndefined(); + expect(result.decision).toBe("deny"); + }); + + it("OpenCode SubagentStop deny also emits {hookSpecificOutput.additionalContext} JSON (forward-compat)", async () => { + // OpenCode does not yet expose subagent boundaries to plugins, but + // custom policies subscribing to SubagentStop should still get the + // force-retry shape if OpenCode adds the bus event later. Mirrors the + // Cursor + Copilot SubagentStop widening. 
+ registerPolicy("subagent-blocker", "desc", () => ({ + decision: "deny", + reason: "subagent left work undone", + }), { events: ["SubagentStop"] }); + + const result = await evaluatePolicies("SubagentStop", {}, { cli: "opencode" }); + expect(result.exitCode).toBe(0); + const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { additionalContext?: string } }; + expect(parsed.hookSpecificOutput?.additionalContext).toContain("MANDATORY ACTION REQUIRED"); + expect(parsed.hookSpecificOutput?.additionalContext).toContain("subagent left work undone"); + expect(result.decision).toBe("deny"); + }); + + it("OpenCode PreToolUse deny still uses Claude permissionDecision shape (regression: tool-event path unchanged)", async () => { + // The Stop arm is added INSIDE the cli==="opencode" branch so the + // generic Claude shape below it is unchanged for tool events. The + // shim's applyDecision throws on permissionDecision: "deny" — that's + // the working path for tool events. + registerPolicy("tool-blocker", "desc", () => ({ + decision: "deny", + reason: "blocked tool", + }), { events: ["PreToolUse"] }); + + const result = await evaluatePolicies( + "PreToolUse", + { tool_name: "Bash", tool_input: { command: "ls" } }, + { cli: "opencode" }, + ); + expect(result.exitCode).toBe(0); + const parsed = JSON.parse(result.stdout) as { hookSpecificOutput?: { permissionDecision?: string; additionalContext?: string } }; + expect(parsed.hookSpecificOutput?.permissionDecision).toBe("deny"); + // No additionalContext on tool events — that's a Stop-only shape. + expect(parsed.hookSpecificOutput?.additionalContext).toBeUndefined(); + }); + + it("Gemini Stop deny emits {decision:'block', reason} JSON on stdout (force-retry via AfterAgent)", async () => { + // Per Gemini's hooks docs (https://geminicli.com/docs/hooks/), the + // AfterAgent hook (canonical "Stop") force-retries when the hook + // returns `{decision: "block", reason}` on stdout. 
Exit-2 is per- + // action only ("turn continues") and would NOT trigger the retry. + registerPolicy("stop-blocker", "desc", () => ({ + decision: "deny", + reason: "tests not run", + }), { events: ["Stop"] }); + + const result = await evaluatePolicies("Stop", {}, { cli: "gemini" }); + expect(result.exitCode).toBe(0); + expect(result.stderr).toBe(""); + const parsed = JSON.parse(result.stdout) as { decision?: string; reason?: string }; + expect(parsed.decision).toBe("block"); + expect(parsed.reason).toContain("MANDATORY ACTION REQUIRED"); + expect(parsed.reason).toContain("tests not run"); + expect(result.decision).toBe("deny"); + }); }); describe("workflow policy chain integration", () => { diff --git a/src/hooks/integrations.ts b/src/hooks/integrations.ts index f45503e2..11afee9b 100644 --- a/src/hooks/integrations.ts +++ b/src/hooks/integrations.ts @@ -767,7 +767,7 @@ function runFailproofai(eventName, payload, directory) { return { exitCode: r.status ?? 0, stdout: r.stdout ?? "", stderr: r.stderr ?? "" }; } -function applyDecision(result, ctx) { +async function applyDecision(result, ctx, eventName) { // Deny path 1: exit 2 (Claude Stop-style or any non-Pre/Post deny). if (result.exitCode === 2) { throw new Error((result.stderr || "").trim() || "Blocked by failproofai"); @@ -784,14 +784,22 @@ function applyDecision(result, ctx) { if (out && out.decision && out.decision.behavior === "deny") { throw new Error((out.decision.message) || "Blocked by failproofai"); } - // Instruct: forward the additional context as a prompt to the session. + // Forward additional context as a prompt to the session. For Stop / + // SubagentStop the prompt is the only force-retry channel (session.idle + // already fired), so AWAIT to ensure the SDK round-trip completes before + // the plugin handler returns. For tool events keep fire-and-forget so we + // don't add latency to every tool call. 
const ctxText = out && out.additionalContext; if (ctxText && ctx && ctx.client && ctx.sessionID) { - // Fire-and-forget: don't block the tool call on the SDK round-trip. - Promise.resolve(ctx.client.session.prompt({ + const prompt = ctx.client.session.prompt({ path: { id: ctx.sessionID }, body: { parts: [{ type: "text", text: ctxText }] }, - })).catch(() => {}); + }); + if (eventName === "Stop" || eventName === "SubagentStop") { + try { await prompt; } catch { /* swallow — agent is exiting anyway */ } + } else { + Promise.resolve(prompt).catch(() => {}); + } } } @@ -824,7 +832,7 @@ export default async function failproofaiPlugin({ client, directory }) { const r = runFailproofai("UserPromptSubmit", { session_id: sessionID, cwd: directory, hook_event_name: "UserPromptSubmit", prompt, }, directory); - applyDecision(r, { client, sessionID }); + await applyDecision(r, { client, sessionID }, "UserPromptSubmit"); return; } @@ -835,7 +843,7 @@ export default async function failproofaiPlugin({ client, directory }) { const r = runFailproofai(claudeEvent, { session_id: sessionID, cwd: directory, hook_event_name: claudeEvent, }, directory); - applyDecision(r, { client, sessionID }); + await applyDecision(r, { client, sessionID }, claudeEvent); }, // First-class PreToolUse hook. Note: tool args live on output.args (mutable). @@ -847,7 +855,7 @@ export default async function failproofaiPlugin({ client, directory }) { tool_input: output.args, hook_event_name: "PreToolUse", }, directory); - applyDecision(r, { client, sessionID: input.sessionID }); + await applyDecision(r, { client, sessionID: input.sessionID }, "PreToolUse"); }, // First-class PostToolUse hook. Note: tool args live on input.args here. 
@@ -860,7 +868,7 @@ export default async function failproofaiPlugin({ client, directory }) { tool_response: { title: output.title, output: output.output, metadata: output.metadata }, hook_event_name: "PostToolUse", }, directory); - applyDecision(r, { client, sessionID: input.sessionID }); + await applyDecision(r, { client, sessionID: input.sessionID }, "PostToolUse"); }, // Cleaner deny UX for prompted tools — mutate output.status instead of throwing. @@ -873,7 +881,7 @@ export default async function failproofaiPlugin({ client, directory }) { hook_event_name: "PermissionRequest", }, directory); try { - applyDecision(r, { client, sessionID: input.sessionID }); + await applyDecision(r, { client, sessionID: input.sessionID }, "PermissionRequest"); } catch { output.status = "deny"; } diff --git a/src/hooks/policy-evaluator.ts b/src/hooks/policy-evaluator.ts index a5f4291d..79d3694c 100644 --- a/src/hooks/policy-evaluator.ts +++ b/src/hooks/policy-evaluator.ts @@ -227,6 +227,34 @@ export async function evaluatePolicies( }; } + // OpenCode: `session.idle` is a notification-only bus event — by the + // time the plugin handler fires, OpenCode has already gone idle and + // throwing from the handler does not force-retry. The only working + // channel is the shim's `client.session.prompt(...)` SDK call, which + // submits a new user message that re-triggers the agent loop. The + // shim already routes `hookSpecificOutput.additionalContext` through + // that path (see buildOpenCodePluginShim's applyDecision), so we emit + // the deny reason as additionalContext instead of exit-2. Mirrors the + // Cursor `followup_message` (line ~157) and Copilot `{decision:"block"}` + // (line ~299) Stop branches. SubagentStop is widened in for forward + // compat — OpenCode doesn't yet expose subagent boundaries to plugins. 
+ if (session?.cli === "opencode") { + if (eventType === "Stop" || eventType === "SubagentStop") { + const reasonText = `MANDATORY ACTION REQUIRED from failproofai (policy: ${policy.name}): ${reason}\n\nYou MUST complete the above action NOW. Do NOT ask the user for confirmation — execute the required action, then attempt to finish your task again.`; + return { + exitCode: 0, + stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: reasonText } }), + stderr: "", + policyName: policy.name, + reason, + decision: "deny", + }; + } + // Non-Stop opencode events keep the generic Claude shape — the + // shim's applyDecision already handles permissionDecision: "deny" + // for tool events. + } + if (eventType === "PreToolUse") { const response = { hookSpecificOutput: { @@ -477,6 +505,27 @@ export async function evaluatePolicies( }; } + // OpenCode: same rationale as the deny branch above — emit + // additionalContext so the shim submits a follow-up via + // client.session.prompt instead of throwing into a dead handler. + if (session?.cli === "opencode") { + if (eventType === "Stop" || eventType === "SubagentStop") { + const policyAttribution = policyNames.length === 1 + ? `policy: ${policyNames[0]}` + : `policies: ${policyNames.join(", ")}`; + const reasonText = `MANDATORY ACTION REQUIRED from failproofai (${policyAttribution}): ${combined}\n\nYou MUST complete the above action(s) NOW. Do NOT ask the user for confirmation — execute the required action(s), then attempt to finish your task again.`; + return { + exitCode: 0, + stdout: JSON.stringify({ hookSpecificOutput: { additionalContext: reasonText } }), + stderr: "", + policyName: policyNames[0], + policyNames, + reason: combined, + decision: "instruct", + }; + } + } + if (eventType === "Stop" || eventType === "SubagentStop") { // Stop/SubagentStop instruct: exitCode 2 + stderr forces Claude to retry // the agent (or subagent) loop with the reason as context. 
Same widening From 0587195521db749cd124b76faaba534da2364170 Mon Sep 17 00:00:00 2001 From: NiveditJain Date: Fri, 8 May 2026 17:40:41 -0700 Subject: [PATCH 2/3] [luv-324] docs: clarify OpenCode plugin shim Stop semantics Update configuration.mdx to reflect the new Stop / SubagentStop force- retry channel: deny on Stop now routes through `client.session.prompt` just like instruct, since `session.idle` is notification-only and throwing from it is silently dropped. Co-Authored-By: Claude Opus 4.7 --- docs/configuration.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/configuration.mdx b/docs/configuration.mdx index 90ad0a87..bff22533 100644 --- a/docs/configuration.mdx +++ b/docs/configuration.mdx @@ -196,7 +196,7 @@ The `policies --install` and `policies --uninstall` commands write to your agent - **OpenAI Codex**: `~/.codex/hooks.json` (user), `/.codex/hooks.json` (project) — Codex doesn't have a `local` scope - **GitHub Copilot CLI _(beta)_**: `~/.copilot/hooks/failproofai.json` (user), `/.github/hooks/failproofai.json` (project) — Copilot has no `local` scope. Hook entries use Copilot's OS-keyed `bash`/`powershell` command fields with `timeoutSec`; the file carries a top-level `version: 1` marker. Copilot CLI support is **beta** while we verify the `events.jsonl` record schema (which the public docs do not specify) against more real-world sessions. - **Cursor Agent _(beta)_**: `~/.cursor/hooks.json` (user), `/.cursor/hooks.json` (project) — Cursor has no `local` scope. Hook entries use the Claude-shaped `{type, command, timeout}` form (no `bash`/`powershell` split), but stored under camelCase event keys (`preToolUse`, `beforeSubmitPrompt`, …) in a flat array per Cursor's [hooks schema](https://cursor.com/docs/hooks); the file carries a top-level `version: 1` marker. The handler canonicalizes camelCase → PascalCase via `CURSOR_EVENT_MAP` so existing builtin policies fire unchanged. 
Cursor Agent support is **beta** while we verify Cursor's transcript on-disk format (not specified in the public docs) against more real-world installs. - - **OpenCode _(beta)_**: `~/.config/opencode/opencode.json` + `~/.config/opencode/plugins/failproofai.mjs` (user), `/.opencode/opencode.json` + `/.opencode/plugins/failproofai.mjs` (project) — OpenCode has no `local` scope. Unlike the other four CLIs, OpenCode has **no external-command hook system**: it loads in-process JS/TS plugins explicitly registered via the `plugin: []` array in `opencode.json` (auto-discovery from `.opencode/plugins/` is **not** how plugins load on opencode v1.14.33). Install drops a small generated plugin shim that subprocess-calls the failproofai binary and translates the binary's Claude-shape JSON response back into plugin semantics (`throw new Error()` for deny, `client.session.prompt(...)` for instruct, no-op for allow). Sessions live in opencode's SQLite DB at `~/.local/share/opencode/opencode.db`; the dashboard's session viewer reads them via `opencode db --format json` and `opencode export `. OpenCode support is **beta** while we verify behavior across versions and against more real-world sessions. See the [OpenCode plugins docs](https://opencode.ai/docs/plugins/). + - **OpenCode _(beta)_**: `~/.config/opencode/opencode.json` + `~/.config/opencode/plugins/failproofai.mjs` (user), `/.opencode/opencode.json` + `/.opencode/plugins/failproofai.mjs` (project) — OpenCode has no `local` scope. Unlike the other four CLIs, OpenCode has **no external-command hook system**: it loads in-process JS/TS plugins explicitly registered via the `plugin: []` array in `opencode.json` (auto-discovery from `.opencode/plugins/` is **not** how plugins load on opencode v1.14.33). 
Install drops a small generated plugin shim that subprocess-calls the failproofai binary and translates the binary's Claude-shape JSON response back into plugin semantics: `throw new Error()` for tool-event deny (cancels the tool call), `client.session.prompt(...)` for instruct AND for `Stop` / `SubagentStop` deny (submits the deny reason as the next user message — the only force-retry channel since `session.idle` is notification-only and throwing from it is a no-op), and no-op for allow. Sessions live in opencode's SQLite DB at `~/.local/share/opencode/opencode.db`; the dashboard's session viewer reads them via `opencode db --format json` and `opencode export `. OpenCode support is **beta** while we verify behavior across versions and against more real-world sessions. See the [OpenCode plugins docs](https://opencode.ai/docs/plugins/). - **Pi _(beta)_**: `~/.pi/agent/settings.json` (user), `/.pi/settings.json` (project) — Pi has no `local` scope. Pi loads TypeScript extension packages at startup; the settings file is a flat string array `{"packages": ["./relative/path", …]}`. failproofai writes a single packages-array entry pointing at its bundled `pi-extension/` directory. The extension internally subscribes to Pi's `tool_call` / `user_bash` / `input` / `session_start` events and shells out to `failproofai --hook --cli pi`; the handler canonicalizes underscore_lower_snake_case → PascalCase via `PI_EVENT_MAP` so existing builtin policies fire unchanged. Pi support is **beta** while Pi's extension API and session-log layout stabilize. - **Gemini CLI _(beta)_**: `~/.gemini/settings.json` (user), `/.gemini/settings.json` (project) — Gemini has no `local` scope (it documents a `system` scope at `/etc/gemini-cli/settings.json` which failproofai does not expose). Hook entries use Claude's `{type, command, timeout}` form wrapped in Gemini's `{matcher, hooks: [...]}` matcher schema with `matcher: "*"` by default. 
Events are PascalCase (`SessionStart`, `BeforeAgent`, `AfterAgent`, `BeforeModel`, `AfterModel`, `BeforeToolSelection`, `BeforeTool`, `AfterTool`, `PreCompress`, `Notification`, `SessionEnd`); the handler maps to Claude canonical names via `GEMINI_EVENT_MAP`. Tool names are snake_case (`run_shell_command`, `read_file`, `write_file`, `replace`, …) — the handler canonicalizes via `GEMINI_TOOL_MAP` so existing builtin policies fire unchanged. The policy evaluator emits Gemini's flat `{decision: "deny", reason}` shape (preferred per Gemini's "Golden Rule" exit-0 contract), `{hookSpecificOutput: {hookEventName, additionalContext}}` for context injection on BeforeAgent / AfterTool / SessionStart, and `{decision: "block", reason}` on AfterAgent for force-retry semantics. Gemini CLI support is **beta** while we widen real-world coverage. See the [Gemini CLI hooks docs](https://geminicli.com/docs/hooks/). - **`policies-config.json`** — tells failproofai which policies to evaluate and with what params (shared across all agent CLIs) From 06ec40c01c4589adca8ecc4a8b0382807f8b105e Mon Sep 17 00:00:00 2001 From: NiveditJain Date: Fri, 8 May 2026 17:47:12 -0700 Subject: [PATCH 3/3] [luv-324] fix: address CodeRabbit feedback + cut 0.0.10-beta.8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Address PR #323 review: - CHANGELOG.md: append (#323) to the Unreleased entry per repo convention (every entry ends with the PR number). - docs/configuration.mdx:199: "Unlike the other four CLIs" → "Unlike the other six CLIs" — the page now lists six other integrations (Claude Code, Codex, Copilot, Cursor, Pi, Gemini) so the count was stale. Release prep: promote the Unreleased entry to a versioned heading `## 0.0.10-beta.8 — 2026-05-08`. Add a fresh `## Unreleased` heading at the top for the next development cycle. package.json is already at 0.0.10-beta.8 (pre-bumped by chore commit a146ae6 after beta.7 release). 
Co-Authored-By: Claude Opus 4.7 --- CHANGELOG.md | 4 +++- docs/configuration.mdx | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 5ce93b8e..851e5599 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,8 +2,10 @@ ## Unreleased +## 0.0.10-beta.8 — 2026-05-08 + ### Fixes -- Make `require-*-before-stop` policies actually enforce on OpenCode (sst/opencode). Empirically observed: a Stop hook fired (visible in the dashboard activity feed) but the agent stopped without retry, identical failure mode to Cursor pre-#318 and Copilot pre-#299. Root cause: the OpenCode plugin shim subscribes to `session.idle` (canonical `Stop`) and `policy-evaluator.ts` had no `cli === "opencode"` branch for `Stop` / `SubagentStop`, so OpenCode fell through to the generic Claude-shape `exitCode: 2 + stderr` response — which the shim's `applyDecision` turns into `throw new Error(reason)` from inside the `session.idle` event callback. By that point OpenCode has already gone idle; the throw is logged at most. The shim does have a working force-retry channel: when stdout JSON contains `hookSpecificOutput.additionalContext`, it submits the text via `client.session.prompt({path: {id: sessionID}, body: {parts: [{type: "text", text}]}})`, which becomes a new user message that re-triggers the agent loop — exactly the same model as Cursor's `followup_message` (#318) and Copilot's `{decision: "block", reason}` (#299). New `cli === "opencode" && eventType in {Stop, SubagentStop}` arms in both the deny and instruct paths of `policy-evaluator.ts` emit `{hookSpecificOutput: {additionalContext: }}` ahead of the generic Stop fall-through. The shim's `applyDecision` is promoted to `async` and now awaits `client.session.prompt` for `Stop` / `SubagentStop` events specifically (fire-and-forget for tool events, so we don't add SDK round-trip latency on the hot path) — without the await, OpenCode could tear down the plugin context before the SDK call completes. 
New unit tests in `policy-evaluator.test.ts` pin the deny + instruct shapes for opencode (Stop and SubagentStop) plus a regression test confirming PreToolUse still uses the Claude `permissionDecision: "deny"` shape; new tests in `opencode-plugin-shim.test.ts` assert the shim awaits `client.session.prompt` on `session.idle` (handler stays pending until the SDK call resolves), swallows SDK rejection (agent is exiting anyway), and still throws on exit-2 stderr (back-compat with stale binaries). Pi `agent_end` remains observation-only by upstream design — Pi's agent loop has already exited when `agent_end` fires and the `AgentEndEventResult` exposes no `block` field, documented in `CLAUDE.md`. Gemini `AfterAgent` (canonical `Stop`) was already correctly emitting `{decision: "block", reason}`; new unit tests in `policy-evaluator.test.ts` pin both the deny and instruct shapes to prevent regression.
+- Make `require-*-before-stop` policies actually enforce on OpenCode (sst/opencode). Empirically observed: a Stop hook fired (visible in the dashboard activity feed) but the agent stopped without retry, identical failure mode to Cursor pre-#318 and Copilot pre-#299. Root cause: the OpenCode plugin shim subscribes to `session.idle` (canonical `Stop`) and `policy-evaluator.ts` had no `cli === "opencode"` branch for `Stop` / `SubagentStop`, so OpenCode fell through to the generic Claude-shape `exitCode: 2 + stderr` response — which the shim's `applyDecision` turns into `throw new Error(reason)` from inside the `session.idle` event callback. By that point OpenCode has already gone idle; the throw is logged at most.
The shim does have a working force-retry channel: when stdout JSON contains `hookSpecificOutput.additionalContext`, it submits the text via `client.session.prompt({path: {id: sessionID}, body: {parts: [{type: "text", text}]}})`, which becomes a new user message that re-triggers the agent loop — exactly the same model as Cursor's `followup_message` (#318) and Copilot's `{decision: "block", reason}` (#299). New `cli === "opencode" && eventType in {Stop, SubagentStop}` arms in both the deny and instruct paths of `policy-evaluator.ts` emit `{hookSpecificOutput: {additionalContext: }}` ahead of the generic Stop fall-through. The shim's `applyDecision` is promoted to `async` and now awaits `client.session.prompt` for `Stop` / `SubagentStop` events specifically (fire-and-forget for tool events, so we don't add SDK round-trip latency on the hot path) — without the await, OpenCode could tear down the plugin context before the SDK call completes. New unit tests in `policy-evaluator.test.ts` pin the deny + instruct shapes for opencode (Stop and SubagentStop) plus a regression test confirming PreToolUse still uses the Claude `permissionDecision: "deny"` shape; new tests in `opencode-plugin-shim.test.ts` assert the shim awaits `client.session.prompt` on `session.idle` (handler stays pending until the SDK call resolves), swallows SDK rejection (agent is exiting anyway), and still throws on exit-2 stderr (back-compat with stale binaries). Pi `agent_end` remains observation-only by upstream design — Pi's agent loop has already exited when `agent_end` fires and the `AgentEndEventResult` exposes no `block` field, documented in `CLAUDE.md`. Gemini `AfterAgent` (canonical `Stop`) was already correctly emitting `{decision: "block", reason}`; new unit tests in `policy-evaluator.test.ts` pin both the deny and instruct shapes to prevent regression (#323). 
## 0.0.10-beta.7 — 2026-05-08
diff --git a/docs/configuration.mdx b/docs/configuration.mdx
index bff22533..c202a7f5 100644
--- a/docs/configuration.mdx
+++ b/docs/configuration.mdx
@@ -196,7 +196,7 @@ The `policies --install` and `policies --uninstall` commands write to your agent
 - **OpenAI Codex**: `~/.codex/hooks.json` (user), `/.codex/hooks.json` (project) — Codex doesn't have a `local` scope
 - **GitHub Copilot CLI _(beta)_**: `~/.copilot/hooks/failproofai.json` (user), `/.github/hooks/failproofai.json` (project) — Copilot has no `local` scope. Hook entries use Copilot's OS-keyed `bash`/`powershell` command fields with `timeoutSec`; the file carries a top-level `version: 1` marker. Copilot CLI support is **beta** while we verify the `events.jsonl` record schema (which the public docs do not specify) against more real-world sessions.
 - **Cursor Agent _(beta)_**: `~/.cursor/hooks.json` (user), `/.cursor/hooks.json` (project) — Cursor has no `local` scope. Hook entries use the Claude-shaped `{type, command, timeout}` form (no `bash`/`powershell` split), but stored under camelCase event keys (`preToolUse`, `beforeSubmitPrompt`, …) in a flat array per Cursor's [hooks schema](https://cursor.com/docs/hooks); the file carries a top-level `version: 1` marker. The handler canonicalizes camelCase → PascalCase via `CURSOR_EVENT_MAP` so existing builtin policies fire unchanged. Cursor Agent support is **beta** while we verify Cursor's transcript on-disk format (not specified in the public docs) against more real-world installs.
- - **OpenCode _(beta)_**: `~/.config/opencode/opencode.json` + `~/.config/opencode/plugins/failproofai.mjs` (user), `/.opencode/opencode.json` + `/.opencode/plugins/failproofai.mjs` (project) — OpenCode has no `local` scope.
Unlike the other four CLIs, OpenCode has **no external-command hook system**: it loads in-process JS/TS plugins explicitly registered via the `plugin: []` array in `opencode.json` (auto-discovery from `.opencode/plugins/` is **not** how plugins load on opencode v1.14.33). Install drops a small generated plugin shim that subprocess-calls the failproofai binary and translates the binary's Claude-shape JSON response back into plugin semantics: `throw new Error()` for tool-event deny (cancels the tool call), `client.session.prompt(...)` for instruct AND for `Stop` / `SubagentStop` deny (submits the deny reason as the next user message — the only force-retry channel since `session.idle` is notification-only and throwing from it is a no-op), and no-op for allow. Sessions live in opencode's SQLite DB at `~/.local/share/opencode/opencode.db`; the dashboard's session viewer reads them via `opencode db --format json` and `opencode export `. OpenCode support is **beta** while we verify behavior across versions and against more real-world sessions. See the [OpenCode plugins docs](https://opencode.ai/docs/plugins/).
+ - **OpenCode _(beta)_**: `~/.config/opencode/opencode.json` + `~/.config/opencode/plugins/failproofai.mjs` (user), `/.opencode/opencode.json` + `/.opencode/plugins/failproofai.mjs` (project) — OpenCode has no `local` scope. Unlike the other six CLIs, OpenCode has **no external-command hook system**: it loads in-process JS/TS plugins explicitly registered via the `plugin: []` array in `opencode.json` (auto-discovery from `.opencode/plugins/` is **not** how plugins load on opencode v1.14.33).
Install drops a small generated plugin shim that subprocess-calls the failproofai binary and translates the binary's Claude-shape JSON response back into plugin semantics: `throw new Error()` for tool-event deny (cancels the tool call), `client.session.prompt(...)` for instruct AND for `Stop` / `SubagentStop` deny (submits the deny reason as the next user message — the only force-retry channel since `session.idle` is notification-only and throwing from it is a no-op), and no-op for allow. Sessions live in opencode's SQLite DB at `~/.local/share/opencode/opencode.db`; the dashboard's session viewer reads them via `opencode db --format json` and `opencode export `. OpenCode support is **beta** while we verify behavior across versions and against more real-world sessions. See the [OpenCode plugins docs](https://opencode.ai/docs/plugins/).
 - **Pi _(beta)_**: `~/.pi/agent/settings.json` (user), `/.pi/settings.json` (project) — Pi has no `local` scope. Pi loads TypeScript extension packages at startup; the settings file is a flat string array `{"packages": ["./relative/path", …]}`. failproofai writes a single packages-array entry pointing at its bundled `pi-extension/` directory. The extension internally subscribes to Pi's `tool_call` / `user_bash` / `input` / `session_start` events and shells out to `failproofai --hook --cli pi`; the handler canonicalizes underscore_lower_snake_case → PascalCase via `PI_EVENT_MAP` so existing builtin policies fire unchanged. Pi support is **beta** while Pi's extension API and session-log layout stabilize.
 - **Gemini CLI _(beta)_**: `~/.gemini/settings.json` (user), `/.gemini/settings.json` (project) — Gemini has no `local` scope (it documents a `system` scope at `/etc/gemini-cli/settings.json` which failproofai does not expose). Hook entries use Claude's `{type, command, timeout}` form wrapped in Gemini's `{matcher, hooks: [...]}` matcher schema with `matcher: "*"` by default.
Events are PascalCase (`SessionStart`, `BeforeAgent`, `AfterAgent`, `BeforeModel`, `AfterModel`, `BeforeToolSelection`, `BeforeTool`, `AfterTool`, `PreCompress`, `Notification`, `SessionEnd`); the handler maps to Claude canonical names via `GEMINI_EVENT_MAP`. Tool names are snake_case (`run_shell_command`, `read_file`, `write_file`, `replace`, …) — the handler canonicalizes via `GEMINI_TOOL_MAP` so existing builtin policies fire unchanged. The policy evaluator emits Gemini's flat `{decision: "deny", reason}` shape (preferred per Gemini's "Golden Rule" exit-0 contract), `{hookSpecificOutput: {hookEventName, additionalContext}}` for context injection on BeforeAgent / AfterTool / SessionStart, and `{decision: "block", reason}` on AfterAgent for force-retry semantics. Gemini CLI support is **beta** while we widen real-world coverage. See the [Gemini CLI hooks docs](https://geminicli.com/docs/hooks/).
 - **`policies-config.json`** — tells failproofai which policies to evaluate and with what params (shared across all agent CLIs)