feat(hooks): MAP_INVOKED_BY recursion-guard contract for MAP hooks (Phase A)#152
Merged
Conversation
…oks (ST-001) Insert byte-identical early-exit guard as first main() statement in the 7 recursion-suppressed Python hooks so they no-op when MAP spawns a nested Claude/Codex subprocess. Also add missing `import os` to detect-clarification-triggers.py. Fix 4 user-approved pre-existing Pyright diagnostics in workflow-context-injector.py (3 unused tuple-unpack discards + 1 dead parameter), behavior-preserving. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-002)
Early-exit no-op when MAP spawned this subprocess, before any git/tooling.
nounset-safe via ${MAP_INVOKED_BY:-} default expansion.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…E (ST-004) Document the MAP_INVOKED_BY recursion-guard contract: REQUIRE_GUARD vs FORBID_GUARD classification with per-hook rationale, the guard-position rule (INV-A2), a pointer to the LockState marker enum, and a Phase E forward reference for the detached-Popen contract (explicitly not-yet-implemented). Refresh the README inventory to all 11 hooks (adds context-meter.py + map-token-meter.py) with a Class column. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AST-walk .py + regex .sh over .claude/hooks and .codex/hooks, enforcing the MAP_INVOKED_BY classification: REQUIRE_GUARD hooks must early-exit as the first entry-function statement; FORBID_GUARD hooks must contain zero references; unclassified hooks fail. Glob discovery (no hardcoded allowlist), argparse + Colors + exit-code conventions mirroring lint-agent-templates.py, and a --self-test that proves every failure mode exits non-zero. Hardened per Monitor review: FORBID hooks are checked for zero MAP_INVOKED_BY string constants (defeats indirect-variable guards), and shell guard detection strips inline comments (a prose mention is not a guard). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Append `python scripts/lint-hooks.py` to the lint recipe so the recursion-guard contract is enforced in `make lint` and `make check`. Proven: conformant tree passes; a guard injected into a FORBID hook makes `make lint` exit non-zero. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-007) make sync-templates mirrors the recursion-guard edits, the refreshed README inventory, and the new references/hook-patterns.md into src/mapify_cli/templates/. Byte-identical dev<->template (INV-A3); idempotent; test_template_sync.py passes. scripts/lint-hooks.py is intentionally not synced (dev tool). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tests/test_hook_patterns.py validates the MAP_INVOKED_BY contract per-hook across the dev and shipped trees (Claude + Codex): INV-A2 (REQUIRE_GUARD guard positioned), INV-A1 (FORBID_GUARD guard absent + zero flag references), and the behavioral proof that safety-guardrails still denies with the flag set. Classification + guard detection are imported from scripts/lint-hooks.py (single source of truth). Discovery guard asserts every tree contributes hooks (no vacuous pass). make check is green (1767 passed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…Phase A Two MANDATORY gates produced misleading signals during this run: 1. Truncation gate false-positive: the CLI reads the agent response from stdin, but the skill documented a bare call with nothing piped → empty stdin → `truncated:true "empty response"` on every subtask. Add a distinct non-blocking `status:"no_input"` for empty stdin (a bare call is "you forgot to pipe", not a truncation), keep the pure function's empty→truncated semantics for programmatic callers, and update SKILL.md / efficient-reference to pipe the captured response. 2. Symbol blast-radius noise: a changed `def main()` matched the literal word "main" in ~168 SKILL.md / settings.json lines and recommended validate_callers. Exclude generic process entrypoints (`main`) from blast-radius symbols via a new `_is_reportable_symbol` predicate (same place dunders are already excluded); markdown callers of *meaningful* symbols are still flagged by design. Regression tests for both. make check green (1772 passed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Use python3 uniformly (lint/build/release/test-cli) for portability — `python` is absent on macOS 12.3+ outside an activated venv. Includes the lint-hooks line. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Max-effort code review of the MAP_INVOKED_BY recursion-guard feature surfaced soundness gaps in the new linter plus minor cleanups. `make check` stays green (1776 passed, +4 new drift tests; ruff/mypy/pyright 0/0/0; lint-hooks 12/12). scripts/lint-hooks.py (dev-only): - _SH_PREAMBLE_RE: reject `VAR=$(...)` / backtick command substitution before the shell guard (was a false negative defeating INV-A2's position promise). - _find_entry_body: at module scope skip only side-effect-free constant assignments, so a stdin-read assignment before the guard flags MISPLACED; also recognize `async def main`. - is_recursion_guard: check `orelse` as well as `body`. - FORBID check (python + shell): raw-source substring scan, aligned byte-for-byte with tests/test_hook_patterns.py (catches comment / embedded mentions the AST-constant scan missed); linter and pytest can no longer disagree on the same FORBID hook. - _self_test: add expected-FAIL fixtures (shell command-substitution, module-scope stdin read) and 4 expected-PASS fixtures (sync/async main, module-scope-after-constant, conformant shell). - drop dead Colors.YELLOW / Colors.BLUE. tests/test_hook_patterns.py: - add test_doc_tables_match_classification: drift guard cross-checking the README + hook-patterns.md tables (dev + template trees) against the lint-hooks REQUIRE_GUARD / FORBID_GUARD sets. cleanups (dev + template copies synced and verified identical): - workflow-context-injector.py: `state, _ = read_step_state(branch)` instead of the less-readable `[0]` (x3 sites). - map_step_runner.py: drop redundant `name and` in _enclosing_changed_symbols. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Capture lessons surfaced while building/reviewing the MAP_INVOKED_BY recursion-guard feature: - architecture: CLI stdin gate must distinguish "no input piped" from invalid content; always-loaded skill body has a hard line budget. - implementation: blast-radius detectors must exclude generic process-entrypoint names (`main`) in the shared _is_reportable_symbol predicate. - testing: integration-test framework gates via real invocation; filesystem- parametrized tests need a non-empty discovery guard; a gate linter must ship a --self-test covering every failure mode, wired into CI. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Phase A introduces a MAP_INVOKED_BY recursion-guard contract for all MAP hooks, with a dev linter and tests proving the contract on both dev and shipped template hook trees. It also hardens the detect_truncated_agent_output CLI to distinguish empty stdin from invalid content, and excludes the generic main entrypoint from blast-radius analysis to suppress false positives.
Changes:
- Add early-exit
MAP_INVOKED_BYguard to 7 Python REQUIRE_GUARD hooks +end-of-turn.sh; classify FORBID hooks (safety-guardrails.py,workflow-gate.py,post-compact-context.py) and enforce they never reference the flag. - New
scripts/lint-hooks.py(AST + regex) wired intomake lint/make check, with--self-testcovering every failure mode; newtests/test_hook_patterns.pyvalidates dev + template trees and doc-table drift; new.claude/references/hook-patterns.mddocuments the contract. - CLI
detect_truncated_agent_outputreturnsstatus:"no_input"(truncated=False) on empty stdin andstatus:"ok"otherwise;_is_reportable_symbolexcludesmainfrom blast-radius detection; map-efficient skill docs updated to pipe the captured response.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/lint-hooks.py | New AST/regex linter classifying hooks and enforcing guard presence + position. |
| tests/test_hook_patterns.py | New parametrized tests across 4 hook trees, including behavioral deny check and doc-table drift guard. |
| tests/test_map_step_runner.py | Adds CLI subprocess coverage for truncation detector and predicate tests for _is_reportable_symbol. |
| .claude/references/hook-patterns.md & template copy | New authoritative contract doc for REQUIRE/FORBID guard classes. |
| .claude/hooks/*.py + end-of-turn.sh & template copies | Add first-statement MAP_INVOKED_BY early-exit to REQUIRE_GUARD hooks; minor cleanups in workflow-context-injector. |
| .claude/hooks/README.md & template copy | Inventory table updated with Class column; pointer to hook-patterns.md. |
| src/mapify_cli/templates/map/scripts/map_step_runner.py & .map/scripts/map_step_runner.py | Adds _is_reportable_symbol excluding main; CLI distinguishes no_input from truncated content. |
| .claude/skills/map-efficient/* & template copies | Updates docs to pipe captured response into the truncation detector. |
| .claude/rules/learned/*.md | New learned-rule entries (testing, implementation, architecture patterns). |
| Makefile | Adds python3 scripts/lint-hooks.py to lint; standardizes python3. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase A of the MAP hook hardening: make every MAP hook safe against recursive re-entry when a MAP workflow spawns a nested Claude/Codex subprocess (which re-fires the entire hook chain). Introduces the
MAP_INVOKED_BYrecursion-guard contract, a mechanical enforcer, and full test coverage.MAP_INVOKED_BYas the first statement of their entry point — 7 Python hooks +end-of-turn.sh.scripts/lint-hooks.py— AST/regex linter classifying every hook and verifying guard presence and position; wired intomake lint/make check; ships a--self-testcovering every failure mode..claude/references/hook-patterns.md— authoritative contract + per-hook rationale (mirrored to templates).tests/test_hook_patterns.py— parametrized over dev + template trees (Claude + Codex), proving INV-A1 (FORBID hooks still deny with the flag set) and INV-A2 (REQUIRE hooks correctly positioned).detect_truncated_agent_outputCLI now distinguishes empty stdin (status:"no_input") from truncated content; blast-radius analysis excludes the genericmainentrypoint.Code-review fixes (this PR's last two commits)
A max-effort review found no active bugs (gates were green) but surfaced soundness gaps in the new linter, now fixed:
_SH_PREAMBLE_REacceptedVAR=$(...)/ backtick command substitution before the guard — now rejected (INV-A2 honored).data = sys.stdin.read()before the guard flags MISPLACED. Also recognizesasync def main.is_recursion_guardnow checksorelseas well asbody.make lintandpytestcan no longer disagree on the same hook.--self-testgained expected-FAIL fixtures (command-substitution, module-scope stdin read) and expected-PASS fixtures (sync/async main, module-scope-after-constant, conformant shell).test_doc_tables_match_classificationcross-checks the README + hook-patterns.md tables against the linter's classification sets.state, _ = read_step_state(...), dropped redundantname and, removed deadColorsfields.Verification
make check→ EXIT 0: ruff clean, mypy clean, pyright0 errors, 0 warnings, 0 informations, lint-hooks12/12 conform, 1776 passed, 4 skipped. Linter--self-test: 9 expected-FAIL + 4 expected-PASS. Dev ↔ template copies verified byte-identical.Notes
MAP_INVOKED_BYis set by the orchestrator/harness at spawn time, not by repo code; the guard is intentionally dormant until Phase E lands (documented inhook-patterns.md).scripts/lint-hooks.pyis a dev-only tool and intentionally not synced to templates.🤖 Generated with Claude Code