Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

### Fixes
- Activity dashboard: populate the `Transcript:` field across every harness, not just Claude. Only Claude's hook stdin reliably carries `transcript_path`; Codex/Copilot/Cursor don't include one, the OpenCode and Pi shims don't forward one, and Gemini's coverage is uneven across versions, so before this fix the `/policies` activity detail panel rendered `Transcript: —` for nearly every non-Claude row even though every harness *except* OpenCode has the transcript on disk and the repo already shipped per-CLI `find*Transcript(sessionId)` helpers (`lib/codex-sessions.ts`, `lib/copilot-sessions.ts`, `lib/cursor-sessions.ts`, `lib/pi-sessions.ts`, `lib/gemini-sessions.ts`). New `src/hooks/resolve-transcript-path.ts` mirrors the existing `resolve-permission-mode.ts` dispatch pattern: trust `parsed.transcript_path` from stdin first, then fall back to the per-CLI helper when sessionId is known. OpenCode (transcripts in `~/.local/share/opencode/opencode.db`, no on-disk file) gets a synthetic `opencode-db://<sessionId>` marker so the field is non-empty, distinguishable from a genuine miss, and parseable by tooling — both detail panels (`app/components/session-hooks-panel.tsx`, `app/policies/hooks-client.tsx`) render an extra muted "(stored in opencode DB)" suffix when the value carries that scheme. The handler-side fallback covers OpenCode and Pi without touching their shims (no duplicated disk-walk; the Pi shim already discovers session IDs from disk for related reasons in `pi-extension/index.ts:discoverPiSessionId`). New `__tests__/hooks/resolve-transcript-path.test.ts` is a 23-case matrix: stdin-passthrough for all 7 CLIs, missing-sessionId returns undefined for all 7, per-CLI fallback dispatch, and stdin-precedence-beats-fallback (#296).
- Extend per-CLI tool-name canonicalization across the four CLIs PR #293 left incomplete: Copilot's `view` (used for both file reads and directory listings — empirically confirmed against Copilot CLI 1.0.39 with `{"toolName":"view","arguments":{"path":"/some/dir"}}`), Cursor's `Shell` (Cursor's name for what Claude calls `Bash`; PR #293 left Cursor as passthrough), Codex's `apply_patch` and `write_stdin` (Codex was passthrough), plus OpenCode's `apply_patch` and `websearch`. User-reported regression: under Copilot CLI, listing `$HOME` via `view` ran successfully despite an enabled `block-read-outside-cwd` policy (the same `ls -la` flow PR #293 already fixed for Bash). Adds `CURSOR_TOOL_MAP` and `CODEX_TOOL_MAP` (handler-side; mirror `COPILOT_TOOL_MAP` / `GEMINI_TOOL_MAP`) and extends `COPILOT_TOOL_MAP` with the full Copilot CLI tool surface — `view`/`show_file` → `Read`, `create` → `Write`, `apply_patch` → `Edit`, `web_fetch` → `WebFetch`, `powershell` and the eight `*_bash` / `*_powershell` session-management tools → `Bash`, `rg` → `Grep`. New e2e regression test in `__tests__/e2e/hooks/copilot-integration.e2e.test.ts` pins the `view` fix; new unit-test blocks in `__tests__/hooks/handler.test.ts` cover every Cursor and Codex map entry plus passthrough for unmapped tools (#295).
- Canonicalize tool names across all agent CLIs so builtin Bash/Read/Write/Edit policies fire under Copilot, OpenCode, and Pi (verified for Codex/Cursor/Gemini). Builtin policies match tool names in PascalCase (`["Bash"]`, `["Read","Glob","Grep","Bash"]`, …) via case-sensitive `Array.includes`, but Copilot's tool registry emits lowercase IDs (`bash`, `read`, …) and OpenCode's plugin SDK exposes the same. Without canonicalization every Bash/Read/Write/Edit builtin silently no-ops under those CLIs. Adds `COPILOT_TOOL_MAP` (handler-side) and `OPENCODE_TOOL_MAP` / `PI_TOOL_MAP` (shim-side, embedded inline in the self-contained plugin shims). User-reported regression: under Copilot CLI, `ls -la --almost-all $HOME | sed -n '1,200p'` ran successfully despite an enabled `block-read-outside-cwd` policy. Also fixes the Pi shim's naive `charAt(0).toUpperCase()` heuristic which only worked for single-word tool IDs (`bash` → `Bash`) but would have mis-canonicalized future multi-word tools (`todo_write` → `Todo_write`, not `TodoWrite`). E2e fixture for Copilot now uses the real lowercase shape so the suite catches future regressions in this layer (#293).
- Session log viewer: stop rendering log entries at the wrong y-offset (which exposed the page background and looked like the page "going blue" while scrolling). `app/components/raw-log-viewer.tsx` was capturing the virtualizer's `scrollMargin` once via a callback ref on the list wrapper's `offsetTop`. That ref fires only on mount, so any layout shift above the list — most commonly the async `searchHookActivityAction` resolving and mounting the `<SessionHooksPanel>` panel above the Logs section, but also Subagents-section / per-subagent collapse-expand, and window resizes that re-flow the StatsBar / ToolStatsGrid — left `scrollMargin` stale. The `useWindowVirtualizer` then computed the wrong visible window in list-local coordinates and positioned each item at `transform: translateY(virtualRow.start - staleScrollMargin)`, so items appeared shifted by the layout-delta (typically tens to hundreds of pixels). Replace the callback ref with a stable `useRef<HTMLDivElement>` and a `useLayoutEffect` that reads `getBoundingClientRect().top + window.scrollY` (more robust than `offsetTop` against future positioned ancestors) and re-reads it from a `ResizeObserver` watching `document.body` plus a `window` resize listener. Functional `setScrollMargin(prev => prev === top ? prev : top)` short-circuits same-value updates so the body-resize that the state-update itself causes can't loop. (#292)
- Detect when `failproofai` on the user's PATH is shadowed by a different, older install (classic cause: a leftover `bun link` from a prior dev session, or a previously-installed `bun install -g failproofai` whose `~/.bun/bin` prefix sorts ahead of npm's). New `scripts/install-diagnosis.mjs` helper resolves the PATH-first install via `command -v` (POSIX) / `where` (Win32), compares its package root + version against the running install, and surfaces a copy-pasteable cleanup command (`rm -f ~/.bun/bin/failproofai && rm -rf ~/.bun/install/global/node_modules/failproofai` for bun-side shadows, `npm rm -g failproofai` for npm-side ones). Wired into two places: (1) `scripts/postinstall.mjs` warns at install time when the just-installed copy is being shadowed, before the customer ever sees the runtime error, (2) `scripts/launch.ts` rewrites the existing "Cannot find server.js at" error to point at the actual stale install (with both versions and the cleanup command) when the missing build output is caused by a PATH shadow rather than a genuinely broken build. Replaces the previous misleading recommendation (`npm install -g failproofai@latest`) which doesn't help when the new install is itself being shadowed (#286).
Expand Down
16 changes: 16 additions & 0 deletions __tests__/e2e/helpers/payloads.ts
Original file line number Diff line number Diff line change
Expand Up @@ -299,6 +299,22 @@ export const CopilotPayloads = {
tool_input: { file_path: filePath },
};
},
// Copilot's `view` reads files OR lists directory contents depending on
// whether `path` resolves to a file or a dir — verified empirically
// against Copilot CLI 1.0.39 (`{"toolName":"view","arguments":{"path":"/some/dir"}}`).
// Canonicalizes to `Read`; the block-read-outside-cwd policy reads
// tool_input.path as a fallback to file_path so directory listings get
// covered by the same path check.
view(path: string, cwd: string): Record<string, unknown> {
return {
session_id: COPILOT_SESSION_ID,
transcript_path: TRANSCRIPT_PATH,
cwd,
hook_event_name: "PreToolUse",
tool_name: "view",
tool_input: { path },
};
},
},
postToolUse: {
bash(command: string, output: string, cwd: string): Record<string, unknown> {
Expand Down
22 changes: 22 additions & 0 deletions __tests__/e2e/hooks/copilot-integration.e2e.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -178,6 +178,28 @@ describe("E2E: Copilot integration — hook protocol", () => {
env.cleanup();
}
});

// Regression for #295: Copilot uses a single `view` tool for both file
// reads AND directory listings. Without `view → Read` in COPILOT_TOOL_MAP
// the case-sensitive registry filter skips `block-read-outside-cwd`
// entirely. Real session evidence: ~/.copilot/session-state/.../events.jsonl
// shows `{"toolName":"view","arguments":{"path":"/home/nivedit"}}` — i.e.
// listing $HOME via a single `view` call instead of `bash ls`.
it("blocks `view` of a path outside cwd under Copilot (regression for #295)", () => {
const env = createCopilotEnv();
try {
writeConfig(env.cwd, ["block-read-outside-cwd"]);
const outsidePath = "/etc";
const result = runHook(
"PreToolUse",
CopilotPayloads.preToolUse.view(outsidePath, env.cwd),
{ homeDir: env.home, cli: "copilot" },
);
assertPreToolUseDeny(result);
} finally {
env.cleanup();
}
});
});

describe("E2E: Copilot integration — install/uninstall", () => {
Expand Down
98 changes: 98 additions & 0 deletions __tests__/hooks/handler.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -272,15 +272,33 @@ describe("hooks/handler", () => {

it("canonicalizes every Copilot tool name in COPILOT_TOOL_MAP", async () => {
const { evaluatePolicies } = await import("../../src/hooks/policy-evaluator");
// `view → Read` is listed first because it's the regression that drove
// this audit (Copilot uses `view` for both file reads and directory
// listings, so without canonicalization `block-read-outside-cwd` no-ops).
const cases: Array<[string, string]> = [
["view", "Read"],
["bash", "Bash"],
["powershell", "Bash"],
["list_bash", "Bash"],
["read_bash", "Bash"],
["stop_bash", "Bash"],
["write_bash", "Bash"],
["list_powershell", "Bash"],
["read_powershell", "Bash"],
["stop_powershell", "Bash"],
["write_powershell", "Bash"],
["read", "Read"],
["show_file", "Read"],
["write", "Write"],
["create", "Write"],
["edit", "Edit"],
["apply_patch", "Edit"],
["str_replace_editor", "Edit"],
["glob", "Glob"],
["grep", "Grep"],
["rg", "Grep"],
["ls", "LS"],
["web_fetch", "WebFetch"],
];
for (const [raw, canonical] of cases) {
vi.mocked(evaluatePolicies).mockResolvedValueOnce({
Expand Down Expand Up @@ -314,6 +332,86 @@ describe("hooks/handler", () => {
);
});

it("canonicalizes Cursor 'Shell' → 'Bash' so Bash builtins fire under Cursor", async () => {
const { evaluatePolicies } = await import("../../src/hooks/policy-evaluator");
vi.mocked(evaluatePolicies).mockResolvedValueOnce({
exitCode: 0, stdout: "", stderr: "", policyName: null, reason: null, decision: "allow",
});
mockStdin(JSON.stringify({ tool_name: "Shell", hook_event_name: "preToolUse" }));
const { persistHookActivity } = await import("../../src/hooks/hook-activity-store");

await handleHookEvent("preToolUse", "cursor");

// Pre-fix: case-sensitive `Array.includes` would skip every builtin
// matching `["Bash"]` because Cursor sent `Shell`.
expect(evaluatePolicies).toHaveBeenCalledWith(
"PreToolUse",
expect.objectContaining({ tool_name: "Bash" }),
expect.any(Object),
expect.any(Object),
);
expect(persistHookActivity).toHaveBeenCalledWith(
expect.objectContaining({ integration: "cursor", toolName: "Bash" }),
);
});

it("passes through other Cursor tool names (Read/Write/Grep already canonical, MCP:* unchanged)", async () => {
const { evaluatePolicies } = await import("../../src/hooks/policy-evaluator");
const cases: string[] = ["Read", "Write", "Grep", "Delete", "Task", "MCP:linear/create_issue"];
for (const raw of cases) {
vi.mocked(evaluatePolicies).mockResolvedValueOnce({
exitCode: 0, stdout: "", stderr: "", policyName: null, reason: null, decision: "allow",
});
mockStdin(JSON.stringify({ tool_name: raw, hook_event_name: "preToolUse" }));
await handleHookEvent("preToolUse", "cursor");
expect(evaluatePolicies).toHaveBeenLastCalledWith(
"PreToolUse",
expect.objectContaining({ tool_name: raw }),
expect.any(Object),
expect.any(Object),
);
}
});

it("canonicalizes Codex 'apply_patch' → 'Edit' and 'write_stdin' → 'Bash'", async () => {
const { evaluatePolicies } = await import("../../src/hooks/policy-evaluator");
const cases: Array<[string, string]> = [
["apply_patch", "Edit"],
["write_stdin", "Bash"],
];
for (const [raw, canonical] of cases) {
vi.mocked(evaluatePolicies).mockResolvedValueOnce({
exitCode: 0, stdout: "", stderr: "", policyName: null, reason: null, decision: "allow",
});
mockStdin(JSON.stringify({ tool_name: raw, hook_event_name: "pre_tool_use" }));
await handleHookEvent("pre_tool_use", "codex");
expect(evaluatePolicies).toHaveBeenLastCalledWith(
"PreToolUse",
expect.objectContaining({ tool_name: canonical }),
expect.any(Object),
expect.any(Object),
);
}
});

it("passes through Codex Bash + MCP tool names unchanged (already canonical / no canonical equivalent)", async () => {
const { evaluatePolicies } = await import("../../src/hooks/policy-evaluator");
const cases: string[] = ["Bash", "mcp__filesystem__read_file"];
for (const raw of cases) {
vi.mocked(evaluatePolicies).mockResolvedValueOnce({
exitCode: 0, stdout: "", stderr: "", policyName: null, reason: null, decision: "allow",
});
mockStdin(JSON.stringify({ tool_name: raw, hook_event_name: "pre_tool_use" }));
await handleHookEvent("pre_tool_use", "codex");
expect(evaluatePolicies).toHaveBeenLastCalledWith(
"PreToolUse",
expect.objectContaining({ tool_name: raw }),
expect.any(Object),
expect.any(Object),
);
}
});

it("canonicalizes Cursor camelCase event names to PascalCase before evaluating", async () => {
const { evaluatePolicies } = await import("../../src/hooks/policy-evaluator");
vi.mocked(evaluatePolicies).mockResolvedValueOnce({
Expand Down
2 changes: 2 additions & 0 deletions __tests__/hooks/opencode-plugin-shim.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -177,10 +177,12 @@ describe("OpenCode plugin shim — translation of plugin events to binary stdin"
["read", "Read"],
["write", "Write"],
["edit", "Edit"],
["apply_patch", "Edit"],
["glob", "Glob"],
["grep", "Grep"],
["list", "LS"],
["webfetch", "WebFetch"],
["websearch", "WebSearch"],
["todowrite", "TodoWrite"],
["todoread", "TodoRead"],
];
Expand Down
31 changes: 20 additions & 11 deletions src/hooks/handler.ts
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,16 @@ import type {
PiHookEventType,
GeminiHookEventType,
} from "./types";
import { CODEX_EVENT_MAP, CURSOR_EVENT_MAP, PI_EVENT_MAP, GEMINI_EVENT_MAP, GEMINI_TOOL_MAP, COPILOT_TOOL_MAP } from "./types";
import {
CODEX_EVENT_MAP,
CURSOR_EVENT_MAP,
PI_EVENT_MAP,
GEMINI_EVENT_MAP,
GEMINI_TOOL_MAP,
COPILOT_TOOL_MAP,
CURSOR_TOOL_MAP,
CODEX_TOOL_MAP,
} from "./types";
import type { PolicyFunction, PolicyResult } from "./policy-types";
import { readMergedHooksConfig } from "./hooks-config";
import { registerBuiltinPolicies } from "./builtin-policies";
Expand Down Expand Up @@ -71,10 +80,12 @@ function canonicalizeEventType(raw: string, cli: IntegrationType): HookEventType
*
* Per-CLI tool-name shapes (verified from in-repo evidence and vendor docs):
* • Claude: PascalCase native — passthrough
* • Codex: PascalCase per current install — passthrough (no map until
* empirical evidence shows otherwise)
* • Copilot: lowercase IDs (`bash`, `read`, …) — COPILOT_TOOL_MAP
* • Cursor: PascalCase per Cursor docs — passthrough
* • Codex: `Bash` PascalCase passthrough; `apply_patch` → `Edit`,
* `write_stdin` → `Bash` via CODEX_TOOL_MAP
* • Copilot: lowercase IDs (`bash`, `read`, `view`, …) — COPILOT_TOOL_MAP
* • Cursor: PascalCase per Cursor docs but uses `Shell` for the bash-
* equivalent — CURSOR_TOOL_MAP rewrites `Shell → Bash`; other
* tool names already canonical and pass through
* • OpenCode: handled in the OpenCode plugin shim (in-process,
* self-contained) before the JSON crosses to this binary
* • Pi: handled in the Pi extension shim (same)
Expand All @@ -85,12 +96,10 @@ function canonicalizeEventType(raw: string, cli: IntegrationType): HookEventType
*/
function canonicalizeToolName(raw: string | undefined, cli: IntegrationType): string | undefined {
if (!raw) return raw;
if (cli === "copilot") {
return COPILOT_TOOL_MAP[raw] ?? raw;
}
if (cli === "gemini") {
return GEMINI_TOOL_MAP[raw] ?? raw;
}
if (cli === "copilot") return COPILOT_TOOL_MAP[raw] ?? raw;
if (cli === "cursor") return CURSOR_TOOL_MAP[raw] ?? raw;
if (cli === "codex") return CODEX_TOOL_MAP[raw] ?? raw;
if (cli === "gemini") return GEMINI_TOOL_MAP[raw] ?? raw;
return raw;
}

Expand Down
2 changes: 2 additions & 0 deletions src/hooks/integrations.ts
Original file line number Diff line number Diff line change
Expand Up @@ -736,10 +736,12 @@ const TOOL_NAME_MAP = {
read: "Read",
write: "Write",
edit: "Edit",
apply_patch: "Edit",
glob: "Glob",
grep: "Grep",
list: "LS",
webfetch: "WebFetch",
websearch: "WebSearch",
todowrite: "TodoWrite",
todoread: "TodoRead",
};
Expand Down
Loading