Skip to content

feat(hooks): MAP_INVOKED_BY recursion-guard contract for MAP hooks (Phase A)#152

Merged
azalio merged 11 commits into
mainfrom
prairie-dog-thermal
May 29, 2026
Merged

feat(hooks): MAP_INVOKED_BY recursion-guard contract for MAP hooks (Phase A)#152
azalio merged 11 commits into
mainfrom
prairie-dog-thermal

Conversation

@azalio
Copy link
Copy Markdown
Owner

@azalio azalio commented May 29, 2026

Summary

Phase A of the MAP hook hardening: make every MAP hook safe against recursive re-entry when a MAP workflow spawns a nested Claude/Codex subprocess (which re-fires the entire hook chain). Introduces the MAP_INVOKED_BY recursion-guard contract, a mechanical enforcer, and full test coverage.

  • REQUIRE_GUARD hooks (context/nudge/telemetry/transcript) early-exit on MAP_INVOKED_BY as the first statement of their entry point — 7 Python hooks + end-of-turn.sh.
  • FORBID_GUARD hooks (deny gates + post-compaction re-prime) must never carry the guard — applying it would be a safety/correctness regression. Enforced in both directions.
  • scripts/lint-hooks.py — AST/regex linter classifying every hook and verifying guard presence and position; wired into make lint / make check; ships a --self-test covering every failure mode.
  • .claude/references/hook-patterns.md — authoritative contract + per-hook rationale (mirrored to templates).
  • tests/test_hook_patterns.py — parametrized over dev + template trees (Claude + Codex), proving INV-A1 (FORBID hooks still deny with the flag set) and INV-A2 (REQUIRE hooks correctly positioned).
  • Also: detect_truncated_agent_output CLI now distinguishes empty stdin (status:"no_input") from truncated content; blast-radius analysis excludes the generic main entrypoint.

Code-review fixes (this PR's last two commits)

A max-effort review found no active bugs (gates were green) but surfaced soundness gaps in the new linter, now fixed:

  • Shell false negative: _SH_PREAMBLE_RE accepted VAR=$(...) / backtick command substitution before the guard — now rejected (INV-A2 honored).
  • Python false negative: module-scope fallback skipped any assignment; now skips only side-effect-free constant assignments, so a data = sys.stdin.read() before the guard flags MISPLACED. Also recognizes async def main.
  • is_recursion_guard now checks orelse as well as body.
  • FORBID enforcement aligned byte-for-byte with the pytest gate (raw-source substring) — make lint and pytest can no longer disagree on the same hook.
  • --self-test gained expected-FAIL fixtures (command-substitution, module-scope stdin read) and expected-PASS fixtures (sync/async main, module-scope-after-constant, conformant shell).
  • New drift guard test_doc_tables_match_classification cross-checks the README + hook-patterns.md tables against the linter's classification sets.
  • Minor cleanups: state, _ = read_step_state(...), dropped redundant name and, removed dead Colors fields.

Verification

make checkEXIT 0: ruff clean, mypy clean, pyright 0 errors, 0 warnings, 0 informations, lint-hooks 12/12 conform, 1776 passed, 4 skipped. Linter --self-test: 9 expected-FAIL + 4 expected-PASS. Dev ↔ template copies verified byte-identical.

Notes

  • MAP_INVOKED_BY is set by the orchestrator/harness at spawn time, not by repo code; the guard is intentionally dormant until Phase E lands (documented in hook-patterns.md).
  • scripts/lint-hooks.py is a dev-only tool and intentionally not synced to templates.

🤖 Generated with Claude Code

azalio and others added 11 commits May 29, 2026 16:10
…oks (ST-001)

Insert byte-identical early-exit guard as first main() statement in the 7
recursion-suppressed Python hooks so they no-op when MAP spawns a nested
Claude/Codex subprocess. Also add missing `import os` to
detect-clarification-triggers.py.

Fix 4 user-approved pre-existing Pyright diagnostics in
workflow-context-injector.py (3 unused tuple-unpack discards + 1 dead
parameter), behavior-preserving.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-002)

Early-exit no-op when MAP spawned this subprocess, before any git/tooling.
nounset-safe via ${MAP_INVOKED_BY:-} default expansion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…E (ST-004)

Document the MAP_INVOKED_BY recursion-guard contract: REQUIRE_GUARD vs
FORBID_GUARD classification with per-hook rationale, the guard-position rule
(INV-A2), a pointer to the LockState marker enum, and a Phase E forward
reference for the detached-Popen contract (explicitly not-yet-implemented).
Refresh the README inventory to all 11 hooks (adds context-meter.py +
map-token-meter.py) with a Class column.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AST-walk .py + regex .sh over .claude/hooks and .codex/hooks, enforcing the
MAP_INVOKED_BY classification: REQUIRE_GUARD hooks must early-exit as the first
entry-function statement; FORBID_GUARD hooks must contain zero references;
unclassified hooks fail. Glob discovery (no hardcoded allowlist), argparse +
Colors + exit-code conventions mirroring lint-agent-templates.py, and a
--self-test that proves every failure mode exits non-zero.

Hardened per Monitor review: FORBID hooks are checked for zero MAP_INVOKED_BY
string constants (defeats indirect-variable guards), and shell guard detection
strips inline comments (a prose mention is not a guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Append `python scripts/lint-hooks.py` to the lint recipe so the recursion-guard
contract is enforced in `make lint` and `make check`. Proven: conformant tree
passes; a guard injected into a FORBID hook makes `make lint` exit non-zero.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-007)

make sync-templates mirrors the recursion-guard edits, the refreshed README
inventory, and the new references/hook-patterns.md into src/mapify_cli/templates/.
Byte-identical dev<->template (INV-A3); idempotent; test_template_sync.py passes.
scripts/lint-hooks.py is intentionally not synced (dev tool).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tests/test_hook_patterns.py validates the MAP_INVOKED_BY contract per-hook
across the dev and shipped trees (Claude + Codex): INV-A2 (REQUIRE_GUARD guard
positioned), INV-A1 (FORBID_GUARD guard absent + zero flag references), and the
behavioral proof that safety-guardrails still denies with the flag set.
Classification + guard detection are imported from scripts/lint-hooks.py (single
source of truth). Discovery guard asserts every tree contributes hooks (no
vacuous pass). make check is green (1767 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…Phase A

Two MANDATORY gates produced misleading signals during this run:

1. Truncation gate false-positive: the CLI reads the agent response from stdin,
   but the skill documented a bare call with nothing piped → empty stdin →
   `truncated:true "empty response"` on every subtask. Add a distinct
   non-blocking `status:"no_input"` for empty stdin (a bare call is "you forgot
   to pipe", not a truncation), keep the pure function's empty→truncated
   semantics for programmatic callers, and update SKILL.md / efficient-reference
   to pipe the captured response.

2. Symbol blast-radius noise: a changed `def main()` matched the literal word
   "main" in ~168 SKILL.md / settings.json lines and recommended validate_callers.
   Exclude generic process entrypoints (`main`) from blast-radius symbols via a
   new `_is_reportable_symbol` predicate (same place dunders are already excluded);
   markdown callers of *meaningful* symbols are still flagged by design.

Regression tests for both. make check green (1772 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Use python3 uniformly (lint/build/release/test-cli) for portability — `python`
is absent on macOS 12.3+ outside an activated venv. Includes the lint-hooks line.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Max-effort code review of the MAP_INVOKED_BY recursion-guard feature surfaced
soundness gaps in the new linter plus minor cleanups. `make check` stays green
(1776 passed, +4 new drift tests; ruff/mypy/pyright 0/0/0; lint-hooks 12/12).

scripts/lint-hooks.py (dev-only):
- _SH_PREAMBLE_RE: reject `VAR=$(...)` / backtick command substitution before
  the shell guard (was a false negative defeating INV-A2's position promise).
- _find_entry_body: at module scope skip only side-effect-free constant
  assignments, so a stdin-read assignment before the guard flags MISPLACED;
  also recognize `async def main`.
- is_recursion_guard: check `orelse` as well as `body`.
- FORBID check (python + shell): raw-source substring scan, aligned
  byte-for-byte with tests/test_hook_patterns.py (catches comment / embedded
  mentions the AST-constant scan missed); linter and pytest can no longer
  disagree on the same FORBID hook.
- _self_test: add expected-FAIL fixtures (shell command-substitution,
  module-scope stdin read) and 4 expected-PASS fixtures (sync/async main,
  module-scope-after-constant, conformant shell).
- drop dead Colors.YELLOW / Colors.BLUE.

tests/test_hook_patterns.py:
- add test_doc_tables_match_classification: drift guard cross-checking the
  README + hook-patterns.md tables (dev + template trees) against the
  lint-hooks REQUIRE_GUARD / FORBID_GUARD sets.

cleanups (dev + template copies synced and verified identical):
- workflow-context-injector.py: `state, _ = read_step_state(branch)` instead of
  the less-readable `[0]` (x3 sites).
- map_step_runner.py: drop redundant `name and` in _enclosing_changed_symbols.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Capture lessons surfaced while building/reviewing the MAP_INVOKED_BY
recursion-guard feature:
- architecture: CLI stdin gate must distinguish "no input piped" from invalid
  content; always-loaded skill body has a hard line budget.
- implementation: blast-radius detectors must exclude generic process-entrypoint
  names (`main`) in the shared _is_reportable_symbol predicate.
- testing: integration-test framework gates via real invocation; filesystem-
  parametrized tests need a non-empty discovery guard; a gate linter must ship a
  --self-test covering every failure mode, wired into CI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 29, 2026 15:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Phase A introduces a MAP_INVOKED_BY recursion-guard contract for all MAP hooks, with a dev linter and tests proving the contract on both dev and shipped template hook trees. It also hardens the detect_truncated_agent_output CLI to distinguish empty stdin from invalid content, and excludes the generic main entrypoint from blast-radius analysis to suppress false positives.

Changes:

  • Add early-exit MAP_INVOKED_BY guard to 7 Python REQUIRE_GUARD hooks + end-of-turn.sh; classify FORBID hooks (safety-guardrails.py, workflow-gate.py, post-compact-context.py) and enforce they never reference the flag.
  • New scripts/lint-hooks.py (AST + regex) wired into make lint/make check, with --self-test covering every failure mode; new tests/test_hook_patterns.py validates dev + template trees and doc-table drift; new .claude/references/hook-patterns.md documents the contract.
  • CLI detect_truncated_agent_output returns status:"no_input" (truncated=False) on empty stdin and status:"ok" otherwise; _is_reportable_symbol excludes main from blast-radius detection; map-efficient skill docs updated to pipe the captured response.

Reviewed changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated no comments.

Show a summary per file
File Description
scripts/lint-hooks.py New AST/regex linter classifying hooks and enforcing guard presence + position.
tests/test_hook_patterns.py New parametrized tests across 4 hook trees, including behavioral deny check and doc-table drift guard.
tests/test_map_step_runner.py Adds CLI subprocess coverage for truncation detector and predicate tests for _is_reportable_symbol.
.claude/references/hook-patterns.md & template copy New authoritative contract doc for REQUIRE/FORBID guard classes.
.claude/hooks/*.py + end-of-turn.sh & template copies Add first-statement MAP_INVOKED_BY early-exit to REQUIRE_GUARD hooks; minor cleanups in workflow-context-injector.
.claude/hooks/README.md & template copy Inventory table updated with Class column; pointer to hook-patterns.md.
src/mapify_cli/templates/map/scripts/map_step_runner.py & .map/scripts/map_step_runner.py Adds _is_reportable_symbol excluding main; CLI distinguishes no_input from truncated content.
.claude/skills/map-efficient/* & template copies Updates docs to pipe captured response into the truncation detector.
.claude/rules/learned/*.md New learned-rule entries (testing, implementation, architecture patterns).
Makefile Adds python3 scripts/lint-hooks.py to lint; standardizes python3.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@azalio azalio merged commit 0ae8def into main May 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants