MAP framework hardening: skill routing, conftest PYTHONPATH, pyright gate, spec citations by azalio · Pull Request #149 · azalio/map-framework

azalio · 2026-05-28T20:54:33Z

Summary

Four friction fixes surfaced while running /map-efficient on the hogback-gap PR. Each was a real footgun that either wasted iteration tokens or risked shipping a bug; each commit is independent and atomic.

`d13a9ad` — map-efficient: document script-routing dispatcher

The skill body referenced helpers like detect_symbol_blast_radius and build_json_retry_prompt without naming a script. Calling the wrong one fails with invalid choice — and the routing isn't by name prefix (record_subtask_result lives on the orchestrator while record_test_baseline lives on step_runner). Added a one-line routing rule to Execution Rules in SKILL.md (within the 500-line compact cap), a full per-command table to efficient-reference.md, and prefixed every bare detect_* / build_json_retry_prompt / log_agent_failure call in the MONITOR/ACTOR gates with the correct python3 .map/scripts/map_step_runner.py invocation.

`481329e` — conftest: prepend worktree src to sys.path + PYTHONPATH

Worktree-vs-editable-install footgun: running pytest from .warp/worktrees/<feature>/ while pip install -e . still points at the main checkout silently imports stale mapify_cli. Same trap hits subprocess.Popen([sys.executable, …]) test helpers — fresh interpreters inherit only PYTHONPATH, not the parent's sys.path. Conftest now prepends <this-repo>/src to both, idempotently. (Also silenced Pylance's _restore_cwd is not accessed hint via an explicit module-namespace sentinel.)

`044895e` — make check: gate on pyright src/ too

make check previously ran ruff + mypy on src/. Pyright deprecation diagnostics (e.g. @contextmanager + Iterator[T] — exactly the issue hit on hogback-gap ST-003) sailed past Monitor and only surfaced because the IDE language-server happened to emit them. Without that luck, the deprecated code would have shipped. Added pyright src/ to the Makefile lint target, pyright>=1.1.400 to the [dev] extra, and a best-effort which pyright line to the CI linters step. src/ is already 0/0/0, so no existing diagnostics need fixing.

`f5847f5` — /map-plan: gate spec on file:line citation validator

The hogback-gap spec cited src/mapify_cli/__init__.py:96 for MAP_DEBUG when the symbol had moved to :207, and cited tests/test_template_sync.py:303-355 for a class living at :317-356. Both stale citations propagated unchecked into research, Actor prompts, and review. New script .map/scripts/validate_spec_citations.py greps every file:line citation, checks the path exists and the line range is in bounds, and (when a backticked identifier sits within ~80 chars) verifies the cited line actually contains it. Returns a JSON verdict + exit 1 on stale-citation / file-missing / out-of-range. Wired into /map-plan as mandatory Step 2a.5 (before devil's-advocate review). 10 unit tests for the validator.

Test plan

make check — ruff clean, mypy clean (36 src files), pyright clean (37 src files, new gate), full pytest 1709 passed / 4 skipped / 0 failed / 12 deselected
pytest tests/test_validate_spec_citations.py -v — 10 passed (covering happy path, stale identifier, missing file, out-of-range line, line-range validation, no-identifier-window fallback, external-path skip, extension allowlist, path-escapes-repo-root, nearest-identifier precedence)
pytest tests/test_skills.py — 223 passed (high-traffic skill compactness + run-health closeout ordering still satisfied after Script Routing edit)
Smoke-tested the citation validator against the hogback-gap spec — caught BOTH known stale citations (src/mapify_cli/__init__.py:96 and tests/test_template_sync.py:303-355)
Templates synced (make sync-templates)

🤖 Generated with Claude Code

…t_* calls Surface friction from the hogback-gap run: the skill body referenced helpers like `detect_symbol_blast_radius` and `build_json_retry_prompt` without naming a script, leaving callers to guess between map_orchestrator.py and map_step_runner.py (calling the wrong one fails with an `invalid choice` error). Routing isn't by name prefix — `record_subtask_result` lives on the orchestrator while `record_test_baseline` lives on step_runner, etc. - Add a one-line routing rule to Execution Rules in SKILL.md. - Move the full per-command table + `record_*` / `validate_*` disambiguation into efficient-reference.md (keeps SKILL.md under the 500-line compact cap enforced by test_high_traffic_workflow_skills_keep_active_bodies_compact). - Prefix every bare `detect_*`, `build_json_retry_prompt`, `log_agent_failure`, and `detect_cross_subtask_regression_risk` call in the MONITOR/ACTOR gates with `python3 .map/scripts/map_step_runner.py`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Worktree-vs-editable-install footgun: running pytest from a .warp/worktrees/<feature>/ directory while `pip install -e .` still points at the main checkout silently imports stale `mapify_cli` code. Same trap hits subprocess.Popen([sys.executable, …]) helpers — fresh interpreters inherit only PYTHONPATH, not the parent's sys.path. - Prepend <this-repo>/src to sys.path so in-process imports prefer the current worktree. - Prepend <this-repo>/src to PYTHONPATH env var so test-spawned subprocesses inherit the same priority. - Idempotent: repeated insertions deduplicated, so re-importing conftest is a no-op. Also silence Pylance's "_restore_cwd is not accessed" hint via an explicit module-namespace sentinel — pytest discovers autouse fixtures by name lookup, which static analyzers can't see. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`make check` previously ran ruff + mypy on src/, but pyright deprecation diagnostics (e.g. @contextmanager + Iterator return type, the bug we just hit on hogback-gap ST-003) sailed past Monitor and only surfaced because the IDE language-server happened to emit them — without that luck, the deprecated code would have shipped. - Makefile lint: append `pyright src/` after mypy. - pyproject [project.optional-dependencies].dev: add `pyright>=1.1.400` so `pip install -e ".[dev,ssl]"` brings the binary in. - .github/workflows/ci.yml linters step: best-effort pyright run with the same `which … || echo skipping` pattern as ruff/mypy, so non- pyright runners stay green while pyright-equipped ones gate hard. Current `src/` is 0 errors / 0 warnings / 0 informations, so no existing diagnostics need fixing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The hogback-gap spec cited src/mapify_cli/__init__.py:96 for MAP_DEBUG when the symbol had actually moved to :207, and the same spec cited tests/test_template_sync.py:303-355 for a sync-test class that lived at :317-356. Both stale citations propagated unchecked into research, Actor prompts, and the final review — silent drift between the spec and reality. - .map/scripts/validate_spec_citations.py: greps every file:line citation out of a spec, checks the path exists, the line range is in bounds, and (when a backticked identifier sits within ~80 chars) that the cited line(s) actually contain the symbol. Returns a JSON verdict on stdout; exit 1 on any stale-citation / file-missing / line-out-of- range, exit 2 on invocation error. - tests/test_validate_spec_citations.py: 10 unit tests covering happy path, stale identifier, missing file, line out of range, line-range validation, no-identifier-window fallback, external-path skip, extension allowlist, path-escapes-repo-root, and nearest-identifier precedence. - .claude/skills/map-plan/SKILL.md: insert mandatory Step 2a.5 between spec writing and the devil's-advocate review so a red validator blocks decomposition. - Templates synced (src/mapify_cli/templates/{map/scripts,skills/map-plan/}). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR hardens MAP workflow tooling by clarifying script routing, making tests prefer the current worktree source, adding Pyright to local checks, and introducing a citation validator for /map-plan specs.

Changes:

Adds spec citation validation script, template copy, mandatory map-plan step, and unit tests.
Documents map-efficient script routing and updates several command examples to use explicit runner paths.
Prepends worktree src/ for pytest/import subprocess consistency and adds Pyright to dev linting.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
`.github/workflows/ci.yml`	Adds best-effort Pyright invocation to CI lint step.
`.map/scripts/validate_spec_citations.py`	Adds source validator for spec file:line citations.
`.claude/skills/map-efficient/SKILL.md`	Adds script routing rule and explicit step-runner command examples.
`.claude/skills/map-efficient/efficient-reference.md`	Adds script routing dispatcher reference.
`.claude/skills/map-plan/SKILL.md`	Adds mandatory spec citation validation step.
`Makefile`	Adds `pyright src/` to lint target.
`pyproject.toml`	Adds Pyright to dev dependencies.
`src/mapify_cli/templates/map/scripts/validate_spec_citations.py`	Adds packaged template copy of citation validator.
`src/mapify_cli/templates/skills/map-efficient/SKILL.md`	Syncs map-efficient skill routing updates into templates.
`src/mapify_cli/templates/skills/map-efficient/efficient-reference.md`	Syncs routing reference into templates.
`src/mapify_cli/templates/skills/map-plan/SKILL.md`	Syncs map-plan citation validation step into templates.
`tests/conftest.py`	Ensures pytest and subprocesses import from current worktree `src/`.
`tests/test_validate_spec_citations.py`	Adds unit coverage for the new citation validator.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+            "reason": f"could not read file: {exc}",
+        }
+
+    if line_no < 1 or end_no > len(lines):


+            "reason": f"could not read file: {exc}",
+        }
+
+    if line_no < 1 or end_no > len(lines):


+  - `save_*` / `load_*`: `save_research`, `load_research`, `save_artifact_manifest`, `load_artifact_manifest`, `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`
+  - `refresh_*`: `refresh_blueprint_affected_files`
+  - `validate_*` (non-state): `validate_blueprint_contract`, `validate_mutation_boundary`, `validate_retry_quarantine`, `validate_run_health_report`, `validate_checkpoint`
+  - `record_*` (artifacts, not state): `record_test_baseline`, `record_diagnostics_baseline`, `record_scope_baseline`, `record_subtask_baseline`, `record_token_event`, `record_learning_consumption`, `record_repeated_learning_violations`, `record_workflow_fit`, `record_plan_artifacts`, `record_test_contract_handoff`, `record_review_ordering`, `record_token_budget_decision`


+  - `save_*` / `load_*`: `save_research`, `load_research`, `save_artifact_manifest`, `load_artifact_manifest`, `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`
+  - `refresh_*`: `refresh_blueprint_affected_files`
+  - `validate_*` (non-state): `validate_blueprint_contract`, `validate_mutation_boundary`, `validate_retry_quarantine`, `validate_run_health_report`, `validate_checkpoint`
+  - `record_*` (artifacts, not state): `record_test_baseline`, `record_diagnostics_baseline`, `record_scope_baseline`, `record_subtask_baseline`, `record_token_event`, `record_learning_consumption`, `record_repeated_learning_violations`, `record_workflow_fit`, `record_plan_artifacts`, `record_test_contract_handoff`, `record_review_ordering`, `record_token_budget_decision`


+- **`python3 .map/scripts/map_step_runner.py <cmd>`** — pure analysis/persistence helpers (no state-machine side effect):
+  - `detect_*` family: `detect_truncated_agent_output`, `detect_already_done`, `detect_cross_subtask_regression_risk`, `detect_actor_files_changed_mismatch`, `detect_symbol_blast_radius`
+  - `build_*` family: `build_context_block`, `build_json_retry_prompt`, `build_acceptance_coverage_report`, `build_prior_stage_consumption_report`, `build_retry_quarantine`, `build_handoff_bundle`, `build_review_handoff`, `build_review_prompts`
+  - `save_*` / `load_*`: `save_research`, `load_research`, `save_artifact_manifest`, `load_artifact_manifest`, `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`


+- **`python3 .map/scripts/map_step_runner.py <cmd>`** — pure analysis/persistence helpers (no state-machine side effect):
+  - `detect_*` family: `detect_truncated_agent_output`, `detect_already_done`, `detect_cross_subtask_regression_risk`, `detect_actor_files_changed_mismatch`, `detect_symbol_blast_radius`
+  - `build_*` family: `build_context_block`, `build_json_retry_prompt`, `build_acceptance_coverage_report`, `build_prior_stage_consumption_report`, `build_retry_quarantine`, `build_handoff_bundle`, `build_review_handoff`, `build_review_prompts`
+  - `save_*` / `load_*`: `save_research`, `load_research`, `save_artifact_manifest`, `load_artifact_manifest`, `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`


Three findings, all substantive: 1. validate_spec_citations.py line-range bounds check was incomplete: - reversed ranges (`file.py:20-10`) passed because both bounds were compared independently against the file length, not each other. - out-of-bounds start with in-bounds end (`file.py:50-5` on a 10-line file) passed because only `end_no > len(lines)` was guarded. Fixed by validating all four conditions independently and reporting a reversed-range hint in the failure reason. Two new tests cover both edge cases: test_flags_reversed_range, test_flags_out_of_bounds_start_with_in_bounds_end. 2. efficient-reference.md routing table listed save_*/load_* functions that aren't CLI-dispatchable (`save_artifact_manifest`, `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`) — they exist in map_step_runner.py as internal helpers but have no `func_name` dispatch branch, so `python3 map_step_runner.py save_learning_metrics` fails with "Unknown function". Pruned to the three actually-dispatchable names: save_research, load_research, load_artifact_manifest. Added validate_prior_stage_consumption (was missing). Added an "if you see 'Unknown function', grep map_step_runner.py for func_name ==" note so the dispatcher remains the ground truth even if this list drifts. 3. Same problem in the record_* family: `record_repeated_learning_violations` and `record_token_budget_decision` are internal helpers, not CLI commands. `record_review_ordering` IS dispatched, but with a HYPHEN (`record-review-ordering`) not an underscore — pure trap. Removed the two non-dispatchable names; renamed to `record-review-ordering` with a callout that this single command uses hyphen syntax. Also inlined the artifact-writer family list (six write_* commands) directly into the routing table instead of leaving it as a separate prose sentence — now every CLI-callable name appears in one place. make check: ruff + mypy + pyright clean; 1713 passed / 4 skipped / 0 failed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

azalio and others added 4 commits May 28, 2026 23:40

Copilot AI review requested due to automatic review settings May 28, 2026 20:54

Copilot started reviewing on behalf of azalio May 28, 2026 20:54 View session

Copilot AI reviewed May 28, 2026

View reviewed changes

azalio merged commit 0322815 into main May 29, 2026
6 checks passed

azalio deleted the map-friction-fixes branch May 29, 2026 04:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MAP framework hardening: skill routing, conftest PYTHONPATH, pyright gate, spec citations#149

MAP framework hardening: skill routing, conftest PYTHONPATH, pyright gate, spec citations#149
azalio merged 5 commits into
mainfrom
map-friction-fixes

azalio commented May 28, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

azalio commented May 28, 2026

Summary

d13a9ad — map-efficient: document script-routing dispatcher

481329e — conftest: prepend worktree src to sys.path + PYTHONPATH

044895e — make check: gate on pyright src/ too

f5847f5 — /map-plan: gate spec on file:line citation validator

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`d13a9ad` — map-efficient: document script-routing dispatcher

`481329e` — conftest: prepend worktree src to sys.path + PYTHONPATH

`044895e` — make check: gate on pyright src/ too

`f5847f5` — /map-plan: gate spec on file:line citation validator