[luv-295] fix: canonicalize Copilot view + complete PR #293 audit #297

Merged
NiveditJain merged 1 commit into main from luv-295
May 5, 2026

@NiveditJain NiveditJain commented May 5, 2026

Summary

PR #293 wired per-CLI tool-name canonicalization so builtin policies that match Claude PascalCase names (Bash, Read, Write, Edit, Glob, Grep) fire under non-Claude CLIs. The map for Copilot was incomplete: Copilot's view tool — used by the model both to read files and to list directory contents — was not mapped, so block-read-outside-cwd never fired on view calls.

User-reported regression: with block-read-outside-cwd enabled under Copilot CLI, the model could still list $HOME via a view call (a single tool invocation with tool_input: {path: "/home/user"}) where PR #293 had only fixed the bash ls -la flow. Empirical confirmation in this user's local session log at ~/.copilot/session-state/.../events.jsonl: {"type":"tool.execution_start","data":{"toolName":"view","arguments":{"path":"/home/nivedit"}}} against Copilot CLI 1.0.39.
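The log evidence above can be checked mechanically. The following is a minimal sketch, assuming only the event shape visible in the quoted log line. `toolNameFromEventLine` and `PRE_FIX_COPILOT_MAP` are hypothetical names, and the single `bash` entry is an assumption standing in for whatever PR #293 actually shipped.

```typescript
// Hypothetical stand-in for the Copilot map before this PR. The single
// `bash` entry is an assumption; the real pre-fix map may have held more.
const PRE_FIX_COPILOT_MAP: ReadonlyMap<string, string> = new Map([["bash", "Bash"]]);

// Shape of the one event type quoted from events.jsonl.
interface ToolStartEvent {
  type: string;
  data?: { toolName?: string; arguments?: Record<string, unknown> };
}

// Returns the tool name for tool.execution_start events, undefined otherwise.
function toolNameFromEventLine(jsonLine: string): string | undefined {
  const parsed = JSON.parse(jsonLine) as ToolStartEvent;
  return parsed.type === "tool.execution_start" ? parsed.data?.toolName : undefined;
}

const loggedLine =
  '{"type":"tool.execution_start","data":{"toolName":"view","arguments":{"path":"/home/nivedit"}}}';
const observedTool = toolNameFromEventLine(loggedLine);

// An unmapped name falls through unchanged, so a policy keyed on "Read" never sees it.
const seenByPolicies =
  observedTool === undefined ? undefined : PRE_FIX_COPILOT_MAP.get(observedTool) ?? observedTool;
console.log(observedTool, "->", seenByPolicies);
```

Because `view` is absent from the map, the name reaching the policies is still `view`, which is why a matcher on `Read` could not fire.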

While auditing the seven supported CLIs against their public tool registries plus on-disk session evidence, three more gaps in the same class came up; the commit message lists them per CLI.

Changes

  • Extend COPILOT_TOOL_MAP in src/hooks/types.ts with the full Copilot CLI tool surface.
  • Extend OPENCODE_TOOL_MAP with apply_patch → Edit, websearch → WebSearch. Mirror the same entries in the OpenCode shim template at src/hooks/integrations.ts:734 (the shim must stay self-contained — it's loaded in-process by opencode).
  • Add CURSOR_TOOL_MAP (Shell → Bash) and CODEX_TOOL_MAP (apply_patch → Edit, write_stdin → Bash) in src/hooks/types.ts, plus matching cursor/codex branches in handler.ts:canonicalizeToolName.
  • apply_patch maps to Edit (not Write) for consistency with the existing str_replace_editor → Edit entry; the choice was confirmed via AskUserQuestion. The trade-off: Edit preserves Claude semantics (Claude's own Edit tool doesn't trigger block-write-outside-cwd either), while Write would have been stricter but inconsistent with Claude.
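The mappings listed above can be pictured as per-CLI lookup tables with passthrough for unmapped names. This is a sketch only: the map names and entries come from this description, but the `(cli, toolName)` signature, the CLI key strings, and the `MAPS_BY_CLI` helper are assumptions, and the Copilot map is reduced to its regression entry.

```typescript
// Entries taken from the PR description; the Copilot map is a one-entry subset.
const COPILOT_TOOL_MAP: ReadonlyMap<string, string> = new Map([["view", "Read"]]);
const CURSOR_TOOL_MAP: ReadonlyMap<string, string> = new Map([["Shell", "Bash"]]);
const CODEX_TOOL_MAP: ReadonlyMap<string, string> = new Map([
  ["apply_patch", "Edit"],
  ["write_stdin", "Bash"],
]);
const OPENCODE_TOOL_MAP: ReadonlyMap<string, string> = new Map([
  ["apply_patch", "Edit"],
  ["websearch", "WebSearch"],
]);

// Hypothetical dispatch table; the CLI key strings are assumptions.
const MAPS_BY_CLI: ReadonlyMap<string, ReadonlyMap<string, string>> = new Map([
  ["copilot", COPILOT_TOOL_MAP],
  ["cursor", CURSOR_TOOL_MAP],
  ["codex", CODEX_TOOL_MAP],
  ["opencode", OPENCODE_TOOL_MAP],
]);

// Unknown CLIs and unmapped tool names pass through unchanged.
function canonicalizeToolName(cli: string, toolName: string): string {
  return MAPS_BY_CLI.get(cli)?.get(toolName) ?? toolName;
}

console.log(canonicalizeToolName("cursor", "Shell"));
console.log(canonicalizeToolName("codex", "write_stdin"));
console.log(canonicalizeToolName("cursor", "Read"));
```

Keeping passthrough as the default means names that are already canonical (Cursor's `Read`, Codex's `Bash`) need no entry.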

Test plan

  • __tests__/hooks/handler.test.ts — extended the existing per-Copilot canonicalization loop to cover every new entry (with [view, Read] listed first as the regression anchor); added new test blocks for Cursor (Shell → Bash, plus passthrough for Read/Write/Grep/Delete/Task/MCP:*) and Codex (apply_patch → Edit, write_stdin → Bash, plus passthrough for Bash/mcp__*).
  • __tests__/e2e/hooks/copilot-integration.e2e.test.ts — new pinned regression test "blocks view of a path outside cwd under Copilot (regression for #295)" mirroring the PR #293 ls -la test, plus a new CopilotPayloads.preToolUse.view factory in __tests__/e2e/helpers/payloads.ts.
  • __tests__/hooks/opencode-plugin-shim.test.ts — extended the OPENCODE_TOOL_MAP coverage loop with apply_patch and websearch.
  • bun run test:run → 1461 passed, 0 failed.
  • bun run test:e2e → 291 passed, 0 failed (Copilot e2e went 11 → 12).
  • bunx tsc --noEmit → clean.
  • bun run lint → clean (1 pre-existing warning unrelated).
  • Manual repro of all three new mappings:
    • Copilot view of a path outside cwd → permissionDecision: deny via block-read-outside-cwd.
    • Cursor Shell sudo command → permission: deny via block-sudo.
    • Codex write_stdin rm-rf command → permissionDecision: deny via block-rm-rf (canonicalized to Bash).
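To make the first manual repro concrete, here is a rough sketch of how a read-outside-cwd check could produce a deny once `view` has been canonicalized to `Read`. Everything in it is hypothetical: the `Decision` shape only mirrors the `permissionDecision: deny` output quoted above, the path handling is POSIX-only and ignores symlinks, and the real `block-read-outside-cwd` policy covers more than this (for example its Bash branch).

```typescript
// Hypothetical decision shape, modeled on the quoted "permissionDecision: deny".
interface Decision {
  permissionDecision: "allow" | "deny";
  policy?: string;
}

// Splits a POSIX path into segments, resolving "." and ".." lexically.
function normalizePosix(p: string): string[] {
  const segments: string[] = [];
  for (const seg of p.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") segments.pop();
    else segments.push(seg);
  }
  return segments;
}

// True when target (absolute, or relative to cwd) does not sit under cwd.
function isOutsideCwd(target: string, cwd: string): boolean {
  const absolute = target.startsWith("/") ? target : `${cwd}/${target}`;
  const targetSegs = normalizePosix(absolute);
  const cwdSegs = normalizePosix(cwd);
  return !cwdSegs.every((seg, i) => targetSegs[i] === seg);
}

// Simplified policy: only the canonical "Read" name is considered.
function blockReadOutsideCwd(canonicalTool: string, targetPath: string, cwd: string): Decision {
  if (canonicalTool === "Read" && isOutsideCwd(targetPath, cwd)) {
    return { permissionDecision: "deny", policy: "block-read-outside-cwd" };
  }
  return { permissionDecision: "allow" };
}

console.log(blockReadOutsideCwd("Read", "/home/user", "/work/repo"));
console.log(blockReadOutsideCwd("view", "/home/user", "/work/repo"));
```

The second call shows the pre-fix behavior: with the raw name `view`, the same path is allowed because the policy never matches.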

Out of scope (deferred)

The dogfood .opencode/plugins/failproofai.mjs lacks any tool-name canonicalization (regression from PR #293 — the template was updated but the dogfood file is hand-maintained). Production users get the correct map via the template; only contributors running OpenCode against this repo are affected. Reading or rewriting that file requires temporarily disabling the block-read-outside-cwd agent-settings guard — worth a separate follow-up PR.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Resolved cross-CLI consistency regressions in tool handling and policy routing across supported platforms.
  • New Features

    • Extended standardized tool-name mappings across Copilot, Cursor, Codex, and Pi for improved consistency.
  • Tests

    • Added comprehensive test coverage for CLI-specific tool mappings and fallback behavior.

…atch + complete the PR #293 audit

PR #293 wired per-CLI tool-name canonicalization so builtin policies that match Claude PascalCase names (`Bash`, `Read`, `Write`, `Edit`, `Glob`, `Grep`) fire under non-Claude CLIs. The map for Copilot was incomplete: Copilot's `view` tool — used by the model both to read files and to list directory contents — was not mapped, so `block-read-outside-cwd` never fired on `view` calls. User-reported regression: with `block-read-outside-cwd` enabled under Copilot CLI, the model could still list `$HOME` via a `view` call (a single tool invocation with `tool_input: {path: "/home/user"}`) where PR #293 had only fixed the `bash ls -la` flow.

Empirical confirmation in this user's local session log at `~/.copilot/session-state/.../events.jsonl`: `{"type":"tool.execution_start","data":{"toolName":"view","arguments":{"path":"/home/nivedit"}}}` against Copilot CLI 1.0.39.

Auditing all seven supported CLIs against their public tool registries plus on-disk session evidence revealed three more gaps in the same class:

- Copilot was missing `view`, `create`, `apply_patch`, `web_fetch`, `powershell`, `*_bash`/`*_powershell` (the eight session-management tools), `rg`, `show_file` — directory/file reads, file creation, patches, PowerShell, web fetches all bypassed policies.
- Cursor (PR #293 left it as passthrough) — Cursor uses `Shell` for what Claude calls `Bash`, so every Bash builtin (`block-sudo`, `block-rm-rf`, `block-read-outside-cwd` Bash branch, …) silently no-op'd on Cursor sessions.
- Codex (PR #293 left it as passthrough) — Codex hooks report `tool_name: "apply_patch"` even when matchers say `Edit`/`Write`; live sessions also expose `write_stdin` which sends input to a running shell.
- OpenCode was missing `apply_patch` and `websearch`.

Fix:
- Extend `COPILOT_TOOL_MAP` in `src/hooks/types.ts` with the full Copilot CLI tool surface — `view`/`show_file` → `Read`, `create` → `Write`, `apply_patch` → `Edit`, `web_fetch` → `WebFetch`, `powershell` and the `*_bash`/`*_powershell` session-management tools → `Bash`, `rg` → `Grep`.
- Extend `OPENCODE_TOOL_MAP` with `apply_patch` → `Edit`, `websearch` → `WebSearch`. Mirror the same entries in the OpenCode shim template at `src/hooks/integrations.ts:734` (the shim must stay self-contained — it's loaded in-process by opencode).
- Add `CURSOR_TOOL_MAP` (`Shell` → `Bash`) and `CODEX_TOOL_MAP` (`apply_patch` → `Edit`, `write_stdin` → `Bash`) in `src/hooks/types.ts`, plus matching cursor/codex branches in `handler.ts:canonicalizeToolName`. `apply_patch` maps to `Edit` (not `Write`) for consistency with the existing `str_replace_editor` → `Edit` entry; the choice was confirmed via AskUserQuestion. The trade-off is documented: `Edit` preserves Claude semantics (Claude's own `Edit` tool doesn't trigger `block-write-outside-cwd` either), while `Write` would have been stricter but inconsistent with Claude.
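The `Edit` versus `Write` trade-off can be illustrated with a toy matcher. This is a sketch under stated assumptions: the `Policy` shape and `policyApplies` are hypothetical, and the assumption that `block-write-outside-cwd` matches only `Write` is inferred from the note above that Claude's own `Edit` tool does not trigger it.

```typescript
// Hypothetical policy shape: a name plus the canonical tool names it matches.
interface Policy {
  name: string;
  matchers: readonly string[];
}

// Assumption: matches only "Write", inferred from the commit message.
const blockWriteOutsideCwd: Policy = { name: "block-write-outside-cwd", matchers: ["Write"] };

function policyApplies(policy: Policy, canonicalTool: string): boolean {
  return policy.matchers.includes(canonicalTool);
}

// The mapping chosen in this PR versus the stricter alternative that was rejected.
const chosenMap: ReadonlyMap<string, string> = new Map([
  ["create", "Write"],
  ["apply_patch", "Edit"],
]);
const stricterMap: ReadonlyMap<string, string> = new Map([
  ["create", "Write"],
  ["apply_patch", "Write"],
]);

// Canonicalizes with passthrough, then asks whether the policy would fire.
function firesFor(map: ReadonlyMap<string, string>, rawTool: string): boolean {
  return policyApplies(blockWriteOutsideCwd, map.get(rawTool) ?? rawTool);
}

console.log("chosen, apply_patch:", firesFor(chosenMap, "apply_patch"));
console.log("stricter, apply_patch:", firesFor(stricterMap, "apply_patch"));
console.log("chosen, create:", firesFor(chosenMap, "create"));
```

Under the chosen map, `apply_patch` behaves like Claude's `Edit` and does not trigger the write policy, while `create` still does.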

Tests:
- `__tests__/hooks/handler.test.ts` — extend the existing per-Copilot canonicalization loop to cover every new entry (with `[view, Read]` listed first as the regression anchor); add new test blocks for Cursor (`Shell` → `Bash`, plus passthrough for `Read`/`Write`/`Grep`/`Delete`/`Task`/`MCP:*`) and Codex (`apply_patch` → `Edit`, `write_stdin` → `Bash`, plus passthrough for `Bash`/`mcp__*`).
- `__tests__/e2e/hooks/copilot-integration.e2e.test.ts` — pinned regression test "blocks `view` of a path outside cwd under Copilot (regression for #295)" mirroring the PR #293 `ls -la` test, plus a new `CopilotPayloads.preToolUse.view` factory in `__tests__/e2e/helpers/payloads.ts`.
- `__tests__/hooks/opencode-plugin-shim.test.ts` — extend the OPENCODE_TOOL_MAP coverage loop with `apply_patch` and `websearch`.

Verified: `bun run test:run` → 1461 passed, `bun run test:e2e` → 291 passed (Copilot e2e went 11 → 12), `bunx tsc --noEmit` → clean. Manual repro of all three: Copilot `view /etc` denies, Cursor `Shell sudo …` denies, Codex `write_stdin` denies (canonicalized to `Bash`).

Pre-existing item not in this PR: the dogfood `.opencode/plugins/failproofai.mjs` was never updated with `TOOL_NAME_MAP` after PR #293 (the template was updated but the dogfood file is hand-maintained). Production users get the correct map via the template; only contributors running OpenCode against this repo are affected. Reading or rewriting that file requires temporarily disabling the `block-read-outside-cwd` agent-settings guard — deferred to a follow-up PR.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

coderabbitai Bot commented May 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9e72a06d-0876-444a-9a5d-d4656fe82da3

📥 Commits

Reviewing files that changed from the base of the PR and between 5ac5312 and cdf022c.

📒 Files selected for processing (8)
  • CHANGELOG.md
  • __tests__/e2e/helpers/payloads.ts
  • __tests__/e2e/hooks/copilot-integration.e2e.test.ts
  • __tests__/hooks/handler.test.ts
  • __tests__/hooks/opencode-plugin-shim.test.ts
  • src/hooks/handler.ts
  • src/hooks/integrations.ts
  • src/hooks/types.ts

📝 Walkthrough

This PR centralizes and expands tool-name canonicalization across six agent CLIs (Copilot, Cursor, Codex, Gemini, OpenCode, Pi). It introduces CURSOR_TOOL_MAP and CODEX_TOOL_MAP as new exports, extends existing maps (COPILOT_TOOL_MAP, OPENCODE_TOOL_MAP) with broader tool coverage, updates the handler canonicalization logic to use these maps, and adds comprehensive unit and e2e test coverage for per-CLI mappings and fallback behavior.

Changes

Tool-Name Canonicalization Across Agent CLIs

| Layer | File(s) | Summary |
|---|---|---|
| Type Exports & Canonical Maps | src/hooks/types.ts | Adds CODEX_TOOL_MAP and CURSOR_TOOL_MAP exports; expands COPILOT_TOOL_MAP with shell variants (powershell, bash-alikes), view/read/write/edit/patch operations, and fetch/search tools; extends OPENCODE_TOOL_MAP with apply_patch, glob, grep, list, webfetch, websearch, todo tools. |
| Handler Canonicalization Logic | src/hooks/handler.ts | Updates imports to include CODEX_TOOL_MAP, CURSOR_TOOL_MAP, and event maps; expands canonicalizeToolName to route cursor, codex, copilot, and gemini CLIs to their respective tool maps with concise inline lookups. |
| Integration Shim Updates | src/hooks/integrations.ts | Extends OpenCode plugin TOOL_NAME_MAP with apply_patch → Edit and websearch → WebSearch mappings, aligning with broader tool surface. |
| Test Payloads & Fixtures | __tests__/e2e/helpers/payloads.ts | Adds CopilotPayloads.preToolUse.view(path, cwd) helper for e2e testing Copilot's view operation. |
| Unit Test Coverage | __tests__/hooks/handler.test.ts, __tests__/hooks/opencode-plugin-shim.test.ts | Expands canonicalization test blocks with per-CLI cases (Cursor Shell → Bash, Codex apply_patch/write_stdin mappings, Gemini event/tool mappings), unmapped tool fallthrough, telemetry tagging, and activity-store integration. |
| E2E Regression Tests | __tests__/e2e/hooks/copilot-integration.e2e.test.ts | Adds regression test for blocking Copilot view operation outside cwd (addresses #295). |
| Documentation | CHANGELOG.md | Documents per-CLI tool-name canonicalization enhancements and new test coverage. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • exospherehost/failproofai#220: Adds Codex-related canonicalization mappings (CODEX_TOOL_MAP/CODEX_EVENT_MAP) and handler logic updates with overlapping scope.
  • exospherehost/failproofai#293: Addresses the same core goal of canonicalizing tool names across CLIs so built-in policies fire, with overlapping edits to handler, types, shims, and tests.
  • exospherehost/failproofai#236: Modifies the hooks integration surface by adding/expanding CLI tool canonicalization maps and updating handler/type definitions.

Poem

🐰 A curious hop through CLI lands,
Where tool maps dance in careful hands—
Codex, Cursor, now aligned,
Each agent's tools canonized.
Policies fire true and bright! 🎯

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | Title clearly summarizes the main change: fixing Copilot view canonicalization and completing a cross-CLI audit for tool-name mappings. |
| Description check | ✅ Passed | Description is comprehensive, covering summary, detailed changes, test plan with results, and deferred work. Follows template with all required sections completed. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
