[luv-342] feat: enforce Pi Stop policies via before_agent_start handoff #341

Merged
NiveditJain merged 2 commits into main from
luv-342
May 10, 2026
Conversation

@NiveditJain NiveditJain commented May 10, 2026

Summary

  • Root cause: Pi's AgentEndEvent has no Result type — by the time the handler fires, Pi's agent loop has already exited, so a deny return cannot keep Pi running the way Claude's exit-2-from-Stop does. Verified against pi-coding-agent v0.73.1 d.ts at packages/coding-agent/src/core/extensions/types.ts:1112. Empirically: a user enabled require-commit-before-stop on Pi, the deny reason propagated, but Pi exited anyway.
  • Fix: the pi-extension/index.ts shim now captures the deny reason from agent_end into a per-sessionId in-memory map, then on the next before_agent_start (Pi v0.73.x — fires after a new user prompt, before the agent loop runs) returns {systemPrompt: <event.systemPrompt> + "\n\n" + reason} so the LLM sees a MANDATORY ACTION REQUIRED directive at the top of its next turn. The map is one-shot per drain and is cleared on every session_shutdown reason (including quit), so a stale gate cannot leak into a fresh session in the same Pi process.
  • Wording: policy-evaluator.ts now emits the MANDATORY ACTION REQUIRED from failproofai (policy: …) wrapper inside reason for cli === "pi" && eventType === "Stop" (deny + instruct paths), mirroring the Cursor/Gemini/Copilot/OpenCode Stop branches; non-Stop Pi events keep the existing flat {permission, reason} shape.
  • Caveat (documented): bounded by Pi process lifetime — same bound Claude has on exit-2-from-Stop. If the user kills Pi between turns, the gate is missed.
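
The capture, drain, and cleanup steps described above can be sketched roughly as follows. This is a minimal illustration, not the shim's actual code: the handler names (`onAgentEnd`, `onBeforeAgentStart`, `onSessionShutdown`), the `HookReply` and `BeforeAgentStartEvent` types, and the handling of a missing `systemPrompt` are assumptions; only the `pendingStopBlockBySession` map name and the `systemPrompt + "\n\n" + reason` shape come from the description.

```typescript
// Hypothetical event and reply shapes; the real Pi extension types may differ.
interface BeforeAgentStartEvent {
  systemPrompt?: string;
}

interface HookReply {
  permission: "allow" | "deny";
  reason?: string;
}

// One pending Stop block per session, held only in memory.
const pendingStopBlockBySession = new Map<string, string>();

// agent_end: Pi has already left its loop, so only remember the deny reason.
function onAgentEnd(sessionId: string, reply: HookReply): void {
  if (reply.permission === "deny" && reply.reason) {
    pendingStopBlockBySession.set(sessionId, reply.reason);
  }
}

// before_agent_start: drain the pending reason (one-shot) and append it.
function onBeforeAgentStart(
  sessionId: string,
  event: BeforeAgentStartEvent,
): { systemPrompt: string } | undefined {
  const reason = pendingStopBlockBySession.get(sessionId);
  if (reason === undefined) return undefined; // nothing pending: no-op
  pendingStopBlockBySession.delete(sessionId);
  // Assumption: with no incoming systemPrompt, the reason is used on its own.
  const base = event.systemPrompt ?? "";
  return { systemPrompt: base ? `${base}\n\n${reason}` : reason };
}

// session_shutdown: drop any pending entry, whatever the shutdown reason.
function onSessionShutdown(sessionId: string): void {
  pendingStopBlockBySession.delete(sessionId);
}
```

Deleting the entry on drain is what keeps the directive one-shot: a second `before_agent_start` in the same session finds nothing pending and leaves the prompt unchanged.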

Files

  • pi-extension/index.ts — pendingStopBlockBySession map; agent_end captures deny; new before_agent_start handler drains; session_shutdown cleanup on every reason (sessionId captured before cache reset to avoid disk re-discovery).
  • src/hooks/policy-evaluator.ts — Pi Stop branch in deny + instruct paths.
  • __tests__/hooks/policy-evaluator.test.ts — 4 new tests pinning the Pi-Stop deny + instruct payload shapes (and regression guards confirming non-Stop Pi events keep the legacy shape) — added per CodeRabbit suggestion.
  • __tests__/hooks/pi-extension-shim.test.ts — 7 new shim tests (capture/drain/one-shot/session_shutdown-clear/no-pending-noop/missing-systemPrompt/no-resolvable-sessionId).
  • __tests__/e2e/hooks/pi-integration.e2e.test.ts — new dirty-repo Pi case asserting the binary's stdout JSON shape.
  • CLAUDE.md — capability matrix (agent_end: shifted (next turn) + new before_agent_start row, version bumped to 0.73.1).
  • CHANGELOG.md — 0.0.10-beta.12 — 2026-05-10 Features entry.

Test plan

  • bun run test:run — 1607/1607 pass (incl. 4 new policy-evaluator + 7 new shim tests)
  • bun run test:e2e — 298/298 pass (incl. new Pi dirty-repo case)
  • bun run lint — 0 errors (1 pre-existing img-element warning)
  • bunx tsc --noEmit — clean
  • bun build --target=node --format=cjs --outfile=dist/index.js src/index.ts + bun run build:cli — both succeed
  • Live binary smoke (dirty repo, --hook agent_end --cli pi): emits {permission:"deny", reason:"MANDATORY ACTION REQUIRED from failproofai (policy: exospherehost/require-commit-before-stop): You have uncommitted changes …"}
  • CI: quality + test + build + test-e2e all green

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Stop policies now enforce across turns for Pi: denials at end-of-turn are captured and injected as a one-shot “MANDATORY ACTION REQUIRED” directive into the next system prompt; pending blocks are cleared on session shutdown.
  • Documentation

    • Updated docs/table to describe Pi event counts and the new next-turn Stop enforcement and session handling.
  • Tests

    • Added/updated E2E and unit tests (including per-event stdout simulation) to validate Pi deny formatting and one-shot behavior.

Pi's `agent_end` event has no Result type — by the time the handler fires,
Pi's agent loop has already exited, so a deny return cannot keep Pi running
the way Claude's exit-2-from-Stop does. Empirically: a user on Pi with
`require-commit-before-stop` enabled saw the deny reason propagate but Pi
exited anyway.

Fix: the `pi-extension/index.ts` shim now captures the deny `reason` from
`agent_end` into a per-`sessionId` in-memory map and re-injects it as a
`systemPrompt` suffix on the next `before_agent_start` (Pi v0.73.x — fires
after a new user prompt, before the agent loop runs). The map is one-shot
per drain and is cleared on every `session_shutdown` reason (including
`quit`), so a stale gate cannot leak into a fresh session in the same
process. `policy-evaluator.ts` emits the `MANDATORY ACTION REQUIRED from
failproofai (policy: …)` wrapper inside `reason` for `cli === "pi" &&
eventType === "Stop"` (deny + instruct paths), mirroring the
Cursor/Gemini/Copilot/OpenCode Stop branches; non-Stop Pi events keep the
existing flat `{permission, reason}` shape.

Bounded by Pi process lifetime — same bound Claude has on exit-2-from-Stop.

Verified:
- Unit: 1603/1603 (incl. 7 new shim tests)
- E2E: 298/298 (incl. new dirty-repo Pi case)
- Live binary smoke in dirty repo emits the MANDATORY ACTION reason JSON

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
coderabbitai Bot commented May 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 02789ac5-1c7b-4cf6-a72a-6a881c1872a7

📥 Commits

Reviewing files that changed from the base of the PR and between 0aecd8f and ebe427d.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • __tests__/hooks/policy-evaluator.test.ts
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md

📝 Walkthrough

This PR implements cross-turn Stop-policy enforcement for Pi by capturing deny reasons at the agent_end hook, storing them per-session, formatting them with "MANDATORY ACTION REQUIRED" wording in policy-evaluator, and injecting them into the next before_agent_start's systemPrompt, with cleanup on session_shutdown.
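
As a rough illustration of the reason formatting mentioned above, the Pi stdout shaping might look like the sketch below. The function name `shapePiStdout`, its input type, and the `toolName` fallback are hypothetical, and the text placed after the legacy prefixes is an assumption, since the PR elides it; the `MANDATORY ACTION REQUIRED from failproofai (policy: …)` wrapper and the `{permission, reason}` shapes are taken from the PR description and tests.

```typescript
type Decision = "deny" | "instruct";

interface PiStdout {
  permission: "allow" | "deny";
  reason: string;
}

// Hypothetical input shape for this sketch; not the evaluator's real signature.
interface ShapeInput {
  eventType: string;
  decision: Decision;
  policyName: string;
  reason: string;
  toolName?: string;
}

function shapePiStdout(input: ShapeInput): PiStdout {
  if (input.eventType === "Stop") {
    // Both deny and instruct become "deny" so the shim's agent_end
    // handler captures the reason for the next turn.
    return {
      permission: "deny",
      reason: `MANDATORY ACTION REQUIRED from failproofai (policy: ${input.policyName}): ${input.reason}`,
    };
  }
  if (input.decision === "deny") {
    // Legacy flat shape for non-Stop events (suffix content is assumed).
    return {
      permission: "deny",
      reason: `Blocked ${input.toolName ?? "tool"} by failproofai because: ${input.reason}`,
    };
  }
  return { permission: "allow", reason: `Instruction from failproofai: ${input.reason}` };
}
```

The key design point is the instruct path on Stop: if it returned `permission: "allow"`, the shim would have nothing to capture and the instruction would never reach the next turn.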

Changes

Pi Cross-turn Stop-policy Enforcement

| Layer / File(s) | Summary |
| --- | --- |
| Data Contract & Session Storage: `pi-extension/index.ts` | PiBeforeAgentStartEvent interface types the before_agent_start hook payload. pendingStopBlockBySession Map stores deny reasons per sessionId for cross-turn enforcement. |
| Stop Deny/Instruct Reason Formatting: `src/hooks/policy-evaluator.ts` | For Pi CLI and Stop events, both deny and instruct paths now return { permission: "deny", reason: wrappedText } with "MANDATORY ACTION REQUIRED" wording including policy attribution instead of generic deny/instruct shape. |
| Extension Handlers: `pi-extension/index.ts` | agent_end handler captures and stores deny reason by sessionId. New before_agent_start handler drains pending reason and appends it to systemPrompt (one-shot per session). session_shutdown cleanup deletes any pending entry keyed by the ending sessionId and computes sessionId before cache reset. |
| Mock Per-Event Stdout Support: `__tests__/hooks/pi-extension-shim.test.ts` | spawnSync mock parses --hook event name from args and returns configurable stdout from mockSpawnReplyByEvent map. beforeEach clears the map before each test to prevent leakage. |
| Unit & E2E Tests: `__tests__/hooks/pi-extension-shim.test.ts`, `__tests__/e2e/hooks/pi-integration.e2e.test.ts`, `__tests__/hooks/policy-evaluator.test.ts` | New unit suite covers agent_end → before_agent_start handoff including one-shot draining, session_shutdown cleanup, and no-op scenarios. Tests added to pin Pi stdout JSON shape for Stop instruct/deny. E2E test verifies agent_end hook with require-commit-before-stop policy returns correct deny reason shape and exit code. |
| Release Notes & Docs: `CHANGELOG.md`, `CLAUDE.md` | CHANGELOG documents the new Pi Stop-policy enforcement mechanism. CLAUDE.md updates Pi event subscription count (7→8) and describes the new cross-turn deny-reason capture and systemPrompt injection flow. |
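
The per-event stdout mock described in the table could be sketched along these lines. This is an assumption-laden illustration: `fakeSpawnSync` and `hookNameFromArgs` are hypothetical names, the return shape is simplified, and the real test suite presumably wires this through its test runner's mocking API rather than a plain function; only the `mockSpawnReplyByEvent` map name and the `--hook` argument come from the PR.

```typescript
// Configurable stdout per hook event; tests would clear this in beforeEach.
const mockSpawnReplyByEvent = new Map<string, string>();

// Find the value that follows "--hook" in the argument list, if any.
function hookNameFromArgs(args: readonly string[]): string | undefined {
  const i = args.indexOf("--hook");
  return i >= 0 && i + 1 < args.length ? args[i + 1] : undefined;
}

// Hypothetical stand-in for a mocked spawnSync: returns the stdout configured
// for the requested hook, or an empty string when none is set.
function fakeSpawnSync(
  _command: string,
  args: readonly string[],
): { status: number; stdout: string } {
  const hook = hookNameFromArgs(args);
  const stdout = hook !== undefined ? mockSpawnReplyByEvent.get(hook) ?? "" : "";
  return { status: 0, stdout };
}
```

A test would set `mockSpawnReplyByEvent.set("agent_end", JSON.stringify({ permission: "deny", reason: "…" }))` before firing the handler, so each event can receive a different reply within one test.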

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • exospherehost/failproofai#323: Modifies hooks/policy-evaluator.ts to add CLI-specific handling for Stop events, related to Pi-specific stdout shaping.
  • exospherehost/failproofai#94: Changes how policy decisions are serialized/returned in evaluatePolicies; touches similar evaluator behavior.
  • exospherehost/failproofai#109: Alters Stop-event deny/instruct handling and built-in Stop messages, related to the formatting changes here.

Poem

🐰 A rabbit's verse on Stop-denied turns:
One turn the policy says "nay,"
We store the reason, come what may,
Next turn arrives, systemPrompt gleams,
With "MANDATORY ACTION" in its beams,
Across turns we enforce, one-shot clean!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: enforcing Pi Stop policies via a before_agent_start handoff mechanism, which is the primary objective of this PR. |
| Description check | ✅ Passed | The description is comprehensive and well-structured, covering root cause, fix details, wording changes, test plan with passing results, and all required checklist items verified. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/hooks/policy-evaluator.ts (1)

191-202: ⚡ Quick win

Please pin both new Pi Stop output branches with direct evaluator tests.

The new shim/e2e coverage in this PR exercises the deny handoff, but it doesn't directly lock down these two cli === "pi" && eventType === "Stop" payload shapes—especially the instruct branch, which now also relies on permission: "deny" to trigger next-turn enforcement. A small policy-evaluator test matrix here would catch regressions before they silently weaken Pi Stop enforcement.

As per coding guidelines, "Always add unit tests for new behaviour. Place tests in __tests__/."

Also applies to: 438-452

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hooks/policy-evaluator.ts` around lines 191 - 202, Add unit tests in
__tests__/ for policy-evaluator.ts that explicitly cover the pi Stop branches
where session?.cli === "pi" and eventType === "Stop": create tests that invoke
the evaluator with a pi CLI session and Stop event and assert the returned
object contains decision: "deny", policyName matching policy.name, reason
matching input, and stdout JSON with permission: "deny" and the instruct-style
reasonText; also add the parallel test covering the other Stop branch referenced
around the other block (lines ~438-452) so both deny handoffs are pinned and
regressions in permission: "deny" behavior are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Around line 3-6: The changelog entry under the "## 0.0.10-beta.12 —
2026-05-10" header contains an unresolved placeholder "#PR"; replace that
placeholder with the actual PR number "#341" so the release note for the
pi-extension/agent_end change reads with the correct PR reference.

---

Nitpick comments:
In `@src/hooks/policy-evaluator.ts`:
- Around line 191-202: Add unit tests in __tests__/ for policy-evaluator.ts that
explicitly cover the pi Stop branches where session?.cli === "pi" and eventType
=== "Stop": create tests that invoke the evaluator with a pi CLI session and
Stop event and assert the returned object contains decision: "deny", policyName
matching policy.name, reason matching input, and stdout JSON with permission:
"deny" and the instruct-style reasonText; also add the parallel test covering
the other Stop branch referenced around the other block (lines ~438-452) so both
deny handoffs are pinned and regressions in permission: "deny" behavior are
caught.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3002b6ef-3112-45cb-9e30-2bac584a16a4

📥 Commits

Reviewing files that changed from the base of the PR and between 0fb27f6 and 0aecd8f.

📒 Files selected for processing (6)
  • CHANGELOG.md
  • CLAUDE.md
  • __tests__/e2e/hooks/pi-integration.e2e.test.ts
  • __tests__/hooks/pi-extension-shim.test.ts
  • pi-extension/index.ts
  • src/hooks/policy-evaluator.ts

Comment thread CHANGELOG.md Outdated
…elog PR ref

CodeRabbit follow-ups on PR #341:

1. Direct policy-evaluator coverage for the new Pi-Stop branches. The
   shim and e2e tests already exercise the agent_end → before_agent_start
   handoff end-to-end, but neither directly locks the JSON shape that
   policy-evaluator emits. Adds 4 tests in
   `__tests__/hooks/policy-evaluator.test.ts`:
   - Pi Stop + deny: pins `{permission:"deny", reason: /MANDATORY
     ACTION REQUIRED.*policy: exospherehost\/stop-blocker.*tests not
     run/}`.
   - Pi non-Stop deny: regression guard — PreToolUse on Pi must keep
     the legacy `{permission:"deny", reason:"Blocked Bash by failproofai
     because: …"}` shape (the Stop split must not leak into tool
     events).
   - Pi Stop + instruct: pins `{permission:"deny"}` (NOT the regular Pi
     instruct `{permission:"allow"}`) so the shim's agent_end handler
     captures the reason — silently switching back to "allow" here
     would be invisible to the shim and turn Stop instructs into a
     no-op.
   - Pi non-Stop instruct: regression guard — PreToolUse instruct on Pi
     must keep `{permission:"allow", reason: "Instruction from
     failproofai: …"}`.

2. CHANGELOG #PR placeholder → #341.

Verified: 1607/1607 unit tests pass (was 1603 + 4 new).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@NiveditJain NiveditJain merged commit ccc5546 into main May 10, 2026
9 checks passed
NiveditJain added a commit that referenced this pull request May 10, 2026
CodeRabbit catch on #342 — same placeholder fixup as #341 had.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
NiveditJain added a commit that referenced this pull request May 10, 2026
* [luv-343] docs: document per-CLI Stop semantics + Pi limitation

Follow-up to #341: explain to end-users how `require-*-before-stop`
behaves across the 7 supported CLIs, with a dedicated note for Pi.

Adds a "Per-CLI Stop semantics" subsection to the Workflow chapter of
docs/built-in-policies.mdx:

- 7-row table covering Claude / Codex / Copilot / Cursor / Gemini /
  OpenCode / Pi, showing where the gate fires and what the user sees.
- A `<Note>` callout walking through the Pi limitation: Pi's
  `AgentEndEvent` has no Result type, so failproofai shifts enforcement
  to `before_agent_start` (next user turn). Pi visibly stops between
  turns; the gate fires the moment you submit the next prompt. Bounds
  (Pi process lifetime, `session_shutdown` cleanup) are spelled out so
  users enabling Stop policies on Pi understand the behavior before
  filing a bug.

Translated copies under docs/{ar,de,es,fr,he,hi,it,ja,ko,pt-br,ru,tr,vi,zh}/
will be regenerated by the translate-docs workflow on merge.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* [luv-343] docs: replace #PR placeholder with #342 in CHANGELOG

CodeRabbit catch on #342 — same placeholder fixup as #341 had.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>