[luv-295] fix: canonicalize Copilot view + complete PR #293 audit#297
[luv-295] fix: canonicalize Copilot view + complete PR #293 audit#297NiveditJain merged 1 commit intomainfrom
Conversation
…atch + complete the PR #293 audit PR #293 wired per-CLI tool-name canonicalization so builtin policies that match Claude PascalCase names (`Bash`, `Read`, `Write`, `Edit`, `Glob`, `Grep`) fire under non-Claude CLIs. The map for Copilot was incomplete: Copilot's `view` tool — used by the model both to read files and to list directory contents — was not mapped, so `block-read-outside-cwd` never fired on `view` calls. User-reported regression: with `block-read-outside-cwd` enabled under Copilot CLI, the model could still list `$HOME` via a `view` call (a single tool invocation with `tool_input: {path: "/home/user"}`) where PR #293 had only fixed the `bash ls -la` flow. Empirical confirmation in this user's local session log at `~/.copilot/session-state/.../events.jsonl`: `{"type":"tool.execution_start","data":{"toolName":"view","arguments":{"path":"/home/nivedit"}}}` against Copilot CLI 1.0.39. Auditing all seven supported CLIs against their public tool registries plus on-disk session evidence revealed three more gaps in the same class: - Copilot was missing `view`, `create`, `apply_patch`, `web_fetch`, `powershell`, `*_bash`/`*_powershell` (the eight session-management tools), `rg`, `show_file` — directory/file reads, file creation, patches, PowerShell, web fetches all bypassed policies. - Cursor (PR #293 left it as passthrough) — Cursor uses `Shell` for what Claude calls `Bash`, so every Bash builtin (`block-sudo`, `block-rm-rf`, `block-read-outside-cwd` Bash branch, …) silently no-op'd on Cursor sessions. - Codex (PR #293 left it as passthrough) — Codex hooks report `tool_name: "apply_patch"` even when matchers say `Edit`/`Write`; live sessions also expose `write_stdin` which sends input to a running shell. - OpenCode was missing `apply_patch` and `websearch`. Fix: - Extend `COPILOT_TOOL_MAP` in `src/hooks/types.ts` with the full Copilot CLI tool surface — `view`/`show_file` → `Read`, `create` → `Write`, `apply_patch` → `Edit`, `web_fetch` → `WebFetch`, `powershell` and the `*_bash`/`*_powershell` session-management tools → `Bash`, `rg` → `Grep`. - Extend `OPENCODE_TOOL_MAP` with `apply_patch` → `Edit`, `websearch` → `WebSearch`. Mirror the same entries in the OpenCode shim template at `src/hooks/integrations.ts:734` (the shim must stay self-contained — it's loaded in-process by opencode). - Add `CURSOR_TOOL_MAP` (`Shell` → `Bash`) and `CODEX_TOOL_MAP` (`apply_patch` → `Edit`, `write_stdin` → `Bash`) in `src/hooks/types.ts`, plus matching cursor/codex branches in `handler.ts:canonicalizeToolName`. `apply_patch` maps to `Edit` (not `Write`) for consistency with the existing `str_replace_editor` → `Edit` entry; the choice was confirmed via AskUserQuestion. The trade-off is documented: `Edit` preserves Claude semantics (Claude's own `Edit` tool doesn't trigger `block-write-outside-cwd` either), while `Write` would have been stricter but inconsistent with Claude. Tests: - `__tests__/hooks/handler.test.ts` — extend the existing per-Copilot canonicalization loop to cover every new entry (with `[view, Read]` listed first as the regression anchor); add new test blocks for Cursor (`Shell` → `Bash`, plus passthrough for `Read`/`Write`/`Grep`/`Delete`/`Task`/`MCP:*`) and Codex (`apply_patch` → `Edit`, `write_stdin` → `Bash`, plus passthrough for `Bash`/`mcp__*`). - `__tests__/e2e/hooks/copilot-integration.e2e.test.ts` — pinned regression test "blocks `view` of a path outside cwd under Copilot (regression for #295)" mirroring the PR #293 `ls -la` test, plus a new `CopilotPayloads.preToolUse.view` factory in `__tests__/e2e/helpers/payloads.ts`. - `__tests__/hooks/opencode-plugin-shim.test.ts` — extend the OPENCODE_TOOL_MAP coverage loop with `apply_patch` and `websearch`. Verified: `bun run test:run` → 1461 passed, `bun run test:e2e` → 291 passed (Copilot e2e went 11 → 12), `bunx tsc --noEmit` → clean. Manual repro of all three: Copilot `view /etc` denies, Cursor `Shell sudo …` denies, Codex `write_stdin` denies (canonicalized to `Bash`). Pre-existing item not in this PR: the dogfood `.opencode/plugins/failproofai.mjs` was never updated with `TOOL_NAME_MAP` after PR #293 (the template was updated but the dogfood file is hand-maintained). Production users get the correct map via the template; only contributors running OpenCode against this repo are affected. Reading or rewriting that file requires temporarily disabling the `block-read-outside-cwd` agent-settings guard — deferred to a follow-up PR. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
📝 WalkthroughWalkthroughThis PR centralizes and expands tool-name canonicalization across six agent CLIs (Copilot, Cursor, Codex, Gemini, OpenCode, Pi). It introduces CURSOR_TOOL_MAP and CODEX_TOOL_MAP as new exports, extends existing maps (COPILOT_TOOL_MAP, OPENCODE_TOOL_MAP) with broader tool coverage, updates the handler canonicalization logic to use these maps, and adds comprehensive unit and e2e test coverage for per-CLI mappings and fallback behavior. ChangesTool-Name Canonicalization Across Agent CLIs
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
Comment |
Summary
PR #293 wired per-CLI tool-name canonicalization so builtin policies that match Claude PascalCase names (
Bash,Read,Write,Edit,Glob,Grep) fire under non-Claude CLIs. The map for Copilot was incomplete: Copilot'sviewtool — used by the model both to read files and to list directory contents — was not mapped, soblock-read-outside-cwdnever fired onviewcalls.User-reported regression: with
block-read-outside-cwdenabled under Copilot CLI, the model could still list$HOMEvia aviewcall (a single tool invocation withtool_input: {path: "/home/user"}) where PR #293 had only fixed thebash ls -laflow. Empirical confirmation in this user's local session log at~/.copilot/session-state/.../events.jsonl:{"type":"tool.execution_start","data":{"toolName":"view","arguments":{"path":"/home/nivedit"}}}against Copilot CLI 1.0.39.While auditing the seven supported CLIs against their public tool registries plus on-disk session evidence, three more gaps in the same class came up:
view,create,apply_patch,web_fetch,powershell,*_bash/*_powershell(the eight session-management tools),rg,show_file— directory/file reads, file creation, patches, PowerShell, web fetches all bypassed policies.Shellfor what Claude callsBash, so every Bash builtin (block-sudo,block-rm-rf,block-read-outside-cwdBash branch, …) silently no-op'd on Cursor sessions.tool_name: "apply_patch"even when matchers sayEdit/Write; live sessions also exposewrite_stdinwhich sends input to a running shell.apply_patchandwebsearch.Changes
COPILOT_TOOL_MAPinsrc/hooks/types.tswith the full Copilot CLI tool surface.OPENCODE_TOOL_MAPwithapply_patch → Edit,websearch → WebSearch. Mirror the same entries in the OpenCode shim template atsrc/hooks/integrations.ts:734(the shim must stay self-contained — it's loaded in-process by opencode).CURSOR_TOOL_MAP(Shell → Bash) andCODEX_TOOL_MAP(apply_patch → Edit,write_stdin → Bash) insrc/hooks/types.ts, plus matching cursor/codex branches inhandler.ts:canonicalizeToolName.apply_patchmaps toEdit(notWrite) for consistency with the existingstr_replace_editor → Editentry; the choice was confirmed via AskUserQuestion. The trade-off:Editpreserves Claude semantics (Claude's ownEdittool doesn't triggerblock-write-outside-cwdeither), whileWritewould have been stricter but inconsistent with Claude.Test plan
__tests__/hooks/handler.test.ts— extended the existing per-Copilot canonicalization loop to cover every new entry (with[view, Read]listed first as the regression anchor); added new test blocks for Cursor (Shell → Bash, plus passthrough forRead/Write/Grep/Delete/Task/MCP:*) and Codex (apply_patch → Edit,write_stdin → Bash, plus passthrough forBash/mcp__*).__tests__/e2e/hooks/copilot-integration.e2e.test.ts— new pinned regression test "blocksviewof a path outside cwd under Copilot (regression for fix: write Codex session cache atomically #295)" mirroring the PR [luv-293] fix: canonicalize tool names across all agent CLIs so builtin policies fire #293ls -latest, plus a newCopilotPayloads.preToolUse.viewfactory in__tests__/e2e/helpers/payloads.ts.__tests__/hooks/opencode-plugin-shim.test.ts— extended the OPENCODE_TOOL_MAP coverage loop withapply_patchandwebsearch.bun run test:run→ 1461 passed, 0 failed.bun run test:e2e→ 291 passed, 0 failed (Copilot e2e went 11 → 12).bunx tsc --noEmit→ clean.bun run lint→ clean (1 pre-existing warning unrelated).viewof a path outside cwd →permissionDecision: denyviablock-read-outside-cwd.Shellsudo command →permission: denyviablock-sudo.write_stdinrm-rf command →permissionDecision: denyviablock-rm-rf(canonicalized toBash).Out of scope (deferred)
The dogfood
.opencode/plugins/failproofai.mjslacks any tool-name canonicalization (regression from PR #293 — the template was updated but the dogfood file is hand-maintained). Production users get the correct map via the template; only contributors running OpenCode against this repo are affected. Reading or rewriting that file requires temporarily disabling theblock-read-outside-cwdagent-settings guard — worth a separate follow-up PR.🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
New Features
Tests