MAP framework hardening: skill routing, conftest PYTHONPATH, pyright gate, spec citations#149

Merged
azalio merged 5 commits into main from map-friction-fixes on May 29, 2026

@azalio
Owner

@azalio azalio commented May 28, 2026

Summary

Four friction fixes surfaced while running /map-efficient on the hogback-gap PR. Each was a real footgun that either wasted iteration tokens or risked shipping a bug; each commit is independent and atomic.

d13a9ad — map-efficient: document script-routing dispatcher

The skill body referenced helpers like detect_symbol_blast_radius and build_json_retry_prompt without naming a script. Calling the wrong one fails with invalid choice — and the routing isn't by name prefix (record_subtask_result lives on the orchestrator while record_test_baseline lives on step_runner). Added a one-line routing rule to Execution Rules in SKILL.md (within the 500-line compact cap), a full per-command table to efficient-reference.md, and prefixed every bare detect_* / build_json_retry_prompt / log_agent_failure call in the MONITOR/ACTOR gates with the correct python3 .map/scripts/map_step_runner.py invocation.
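
The routing rule can be pictured as a lookup from command name to script. The sketch below is illustrative only: `ROUTES` and `build_invocation` are hypothetical names that do not appear in the repository, the table holds just the commands named in this description, and the `.map/scripts/map_orchestrator.py` path is assumed by analogy with the step-runner path.

```python
# Hypothetical routing lookup; only commands named in this PR are listed.
ORCHESTRATOR = ".map/scripts/map_orchestrator.py"  # directory assumed by analogy
STEP_RUNNER = ".map/scripts/map_step_runner.py"

ROUTES = {
    "record_subtask_result": ORCHESTRATOR,
    "record_test_baseline": STEP_RUNNER,
    "detect_symbol_blast_radius": STEP_RUNNER,
    "build_json_retry_prompt": STEP_RUNNER,
    "log_agent_failure": STEP_RUNNER,
}


def build_invocation(command: str, *args: str) -> list[str]:
    """Return the argv for a command, or raise if no route is known."""
    try:
        script = ROUTES[command]
    except KeyError:
        raise ValueError(f"no known route for {command!r}") from None
    return ["python3", script, command, *args]
```

An explicit table makes the point that the name prefix alone does not decide the script: the two `record_*` entries resolve to different files.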

481329e — conftest: prepend worktree src to sys.path + PYTHONPATH

Worktree-vs-editable-install footgun: running pytest from .warp/worktrees/<feature>/ while pip install -e . still points at the main checkout silently imports stale mapify_cli. Same trap hits subprocess.Popen([sys.executable, …]) test helpers — fresh interpreters inherit only PYTHONPATH, not the parent's sys.path. Conftest now prepends <this-repo>/src to both, idempotently. (Also silenced Pylance's _restore_cwd is not accessed hint via an explicit module-namespace sentinel.)
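
A minimal sketch of the idempotent prepend described above, not the actual `tests/conftest.py` code. The helper name `prepend_src` is hypothetical, and the deduplication strategy (drop any earlier copy, then insert at the front) is an assumption about how idempotency might be achieved.

```python
import os
import sys
from pathlib import Path


def prepend_src(src: Path) -> None:
    """Put `src` first on sys.path and PYTHONPATH without duplicating it."""
    src_str = str(src)

    # In-process imports: drop any earlier copy, then insert at the front.
    sys.path[:] = [p for p in sys.path if p != src_str]
    sys.path.insert(0, src_str)

    # Subprocesses: a fresh interpreter sees PYTHONPATH, not the parent's sys.path.
    existing = os.environ.get("PYTHONPATH", "")
    parts = [p for p in existing.split(os.pathsep) if p and p != src_str]
    os.environ["PYTHONPATH"] = os.pathsep.join([src_str, *parts])
```

In a conftest under `tests/`, this would presumably be called once at import time as `prepend_src(Path(__file__).resolve().parent.parent / "src")`. Calling it again leaves both lists unchanged.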

044895e — make check: gate on pyright src/ too

make check previously ran ruff + mypy on src/. Pyright deprecation diagnostics (e.g. @contextmanager + Iterator[T] — exactly the issue hit on hogback-gap ST-003) sailed past Monitor and only surfaced because the IDE language-server happened to emit them. Without that luck, the deprecated code would have shipped. Added pyright src/ to the Makefile lint target, pyright>=1.1.400 to the [dev] extra, and a best-effort which pyright line to the CI linters step. src/ is already 0/0/0, so no existing diagnostics need fixing.
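
The PR does not show the offending code or the diagnostic text. As a sketch of the pattern named above, assuming the fix is to annotate the generator function with `Generator[...]` rather than `Iterator[...]`, a context manager might be typed like this (`tagged` is a hypothetical example, not code from the repository):

```python
from collections.abc import Generator
from contextlib import contextmanager


@contextmanager
def tagged(log: list[str], tag: str) -> Generator[list[str], None, None]:
    """Append enter/exit markers to `log` around the managed block."""
    log.append(f"enter {tag}")
    try:
        yield log
    finally:
        # Runs even if the body raises, so the exit marker is always recorded.
        log.append(f"exit {tag}")
```

Runtime behaviour is the same under either annotation. Only the static checker sees the difference, which is why a lint gate rather than a test is what catches it.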

f5847f5 — /map-plan: gate spec on file:line citation validator

The hogback-gap spec cited src/mapify_cli/__init__.py:96 for MAP_DEBUG when the symbol had moved to :207, and cited tests/test_template_sync.py:303-355 for a class living at :317-356. Both stale citations propagated unchecked into research, Actor prompts, and review. New script .map/scripts/validate_spec_citations.py greps every file:line citation, checks the path exists and the line range is in bounds, and (when a backticked identifier sits within ~80 chars) verifies the cited line actually contains it. Returns a JSON verdict + exit 1 on stale-citation / file-missing / out-of-range. Wired into /map-plan as mandatory Step 2a.5 (before devil's-advocate review). 10 unit tests for the validator.
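
A minimal sketch of the path and line-range part of such a check, not the actual `validate_spec_citations.py`. The regex, the function name `check_citations`, and the shape of the findings are assumptions. The identifier-proximity check, extension allowlist, and repo-root escape handling described above are left out for brevity.

```python
import re
from pathlib import Path

# Matches e.g. src/pkg/mod.py:96 or tests/test_x.py:303-355 (pattern is an assumption).
CITATION = re.compile(r"(?P<path>[\w./-]+\.\w+):(?P<start>\d+)(?:-(?P<end>\d+))?")


def check_citations(spec_text: str, root: Path) -> list[dict]:
    """Return one finding per citation whose file is missing or range is out of bounds."""
    findings = []
    for m in CITATION.finditer(spec_text):
        path = root / m["path"]
        start = int(m["start"])
        end = int(m["end"] or start)  # a single line is a one-line range
        if not path.is_file():
            findings.append({"citation": m[0], "reason": "file-missing"})
            continue
        n_lines = len(path.read_text(encoding="utf-8").splitlines())
        if start < 1 or end < start or end > n_lines:
            findings.append({"citation": m[0], "reason": "out-of-range"})
    return findings
```

A caller could serialize the findings with `json.dumps` and exit non-zero when the list is non-empty, mirroring the JSON verdict plus exit code described above.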

Test plan

  • make check — ruff clean, mypy clean (36 src files), pyright clean (37 src files, new gate), full pytest 1709 passed / 4 skipped / 0 failed / 12 deselected
  • pytest tests/test_validate_spec_citations.py -v — 10 passed (covering happy path, stale identifier, missing file, out-of-range line, line-range validation, no-identifier-window fallback, external-path skip, extension allowlist, path-escapes-repo-root, nearest-identifier precedence)
  • pytest tests/test_skills.py — 223 passed (high-traffic skill compactness + run-health closeout ordering still satisfied after Script Routing edit)
  • Smoke-tested the citation validator against the hogback-gap spec — caught BOTH known stale citations (src/mapify_cli/__init__.py:96 and tests/test_template_sync.py:303-355)
  • Templates synced (make sync-templates)

🤖 Generated with Claude Code

azalio and others added 4 commits May 28, 2026 23:40
…t_* calls

Surface friction from the hogback-gap run: the skill body referenced
helpers like `detect_symbol_blast_radius` and `build_json_retry_prompt`
without naming a script, leaving callers to guess between
map_orchestrator.py and map_step_runner.py (calling the wrong one fails
with an `invalid choice` error). Routing isn't by name prefix —
`record_subtask_result` lives on the orchestrator while
`record_test_baseline` lives on step_runner, etc.

- Add a one-line routing rule to Execution Rules in SKILL.md.
- Move the full per-command table + `record_*` / `validate_*`
  disambiguation into efficient-reference.md (keeps SKILL.md under the
  500-line compact cap enforced by test_high_traffic_workflow_skills_keep_active_bodies_compact).
- Prefix every bare `detect_*`, `build_json_retry_prompt`,
  `log_agent_failure`, and `detect_cross_subtask_regression_risk` call
  in the MONITOR/ACTOR gates with `python3 .map/scripts/map_step_runner.py`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Worktree-vs-editable-install footgun: running pytest from a
.warp/worktrees/<feature>/ directory while `pip install -e .` still
points at the main checkout silently imports stale `mapify_cli` code.
Same trap hits subprocess.Popen([sys.executable, …]) helpers — fresh
interpreters inherit only PYTHONPATH, not the parent's sys.path.

- Prepend <this-repo>/src to sys.path so in-process imports prefer
  the current worktree.
- Prepend <this-repo>/src to PYTHONPATH env var so test-spawned
  subprocesses inherit the same priority.
- Idempotent: repeated insertions deduplicated, so re-importing
  conftest is a no-op.

Also silence Pylance's "_restore_cwd is not accessed" hint via an
explicit module-namespace sentinel — pytest discovers autouse fixtures
by name lookup, which static analyzers can't see.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`make check` previously ran ruff + mypy on src/, but pyright deprecation
diagnostics (e.g. @contextmanager + Iterator return type, the bug we
just hit on hogback-gap ST-003) sailed past Monitor and only surfaced
because the IDE language-server happened to emit them — without that
luck, the deprecated code would have shipped.

- Makefile lint: append `pyright src/` after mypy.
- pyproject [project.optional-dependencies].dev: add `pyright>=1.1.400`
  so `pip install -e ".[dev,ssl]"` brings the binary in.
- .github/workflows/ci.yml linters step: best-effort pyright run with
  the same `which … || echo skipping` pattern as ruff/mypy, so non-
  pyright runners stay green while pyright-equipped ones gate hard.

Current `src/` is 0 errors / 0 warnings / 0 informations, so no
existing diagnostics need fixing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The hogback-gap spec cited src/mapify_cli/__init__.py:96 for MAP_DEBUG
when the symbol had actually moved to :207, and the same spec cited
tests/test_template_sync.py:303-355 for a sync-test class that lived
at :317-356. Both stale citations propagated unchecked into research,
Actor prompts, and the final review — silent drift between the spec
and reality.

- .map/scripts/validate_spec_citations.py: greps every file:line
  citation out of a spec, checks the path exists, the line range is in
  bounds, and (when a backticked identifier sits within ~80 chars) that
  the cited line(s) actually contain the symbol. Returns a JSON verdict
  on stdout; exit 1 on any stale-citation / file-missing / line-out-of-
  range, exit 2 on invocation error.
- tests/test_validate_spec_citations.py: 10 unit tests covering happy
  path, stale identifier, missing file, line out of range, line-range
  validation, no-identifier-window fallback, external-path skip,
  extension allowlist, path-escapes-repo-root, and nearest-identifier
  precedence.
- .claude/skills/map-plan/SKILL.md: insert mandatory Step 2a.5 between
  spec writing and the devil's-advocate review so a red validator
  blocks decomposition.
- Templates synced (src/mapify_cli/templates/{map/scripts,skills/map-plan/}).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 28, 2026 20:54
Contributor

Copilot AI left a comment

Pull request overview

This PR hardens MAP workflow tooling by clarifying script routing, making tests prefer the current worktree source, adding Pyright to local checks, and introducing a citation validator for /map-plan specs.

Changes:

  • Adds spec citation validation script, template copy, mandatory map-plan step, and unit tests.
  • Documents map-efficient script routing and updates several command examples to use explicit runner paths.
  • Prepends worktree src/ for pytest/import subprocess consistency and adds Pyright to dev linting.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `.github/workflows/ci.yml` | Adds best-effort Pyright invocation to CI lint step. |
| `.map/scripts/validate_spec_citations.py` | Adds source validator for spec file:line citations. |
| `.claude/skills/map-efficient/SKILL.md` | Adds script routing rule and explicit step-runner command examples. |
| `.claude/skills/map-efficient/efficient-reference.md` | Adds script routing dispatcher reference. |
| `.claude/skills/map-plan/SKILL.md` | Adds mandatory spec citation validation step. |
| `Makefile` | Adds `pyright src/` to lint target. |
| `pyproject.toml` | Adds Pyright to dev dependencies. |
| `src/mapify_cli/templates/map/scripts/validate_spec_citations.py` | Adds packaged template copy of citation validator. |
| `src/mapify_cli/templates/skills/map-efficient/SKILL.md` | Syncs map-efficient skill routing updates into templates. |
| `src/mapify_cli/templates/skills/map-efficient/efficient-reference.md` | Syncs routing reference into templates. |
| `src/mapify_cli/templates/skills/map-plan/SKILL.md` | Syncs map-plan citation validation step into templates. |
| `tests/conftest.py` | Ensures pytest and subprocesses import from current worktree `src/`. |
| `tests/test_validate_spec_citations.py` | Adds unit coverage for the new citation validator. |

Comment thread .map/scripts/validate_spec_citations.py Outdated
"reason": f"could not read file: {exc}",
}

if line_no < 1 or end_no > len(lines):
- `save_*` / `load_*`: `save_research`, `load_research`, `save_artifact_manifest`, `load_artifact_manifest`, `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`
- `refresh_*`: `refresh_blueprint_affected_files`
- `validate_*` (non-state): `validate_blueprint_contract`, `validate_mutation_boundary`, `validate_retry_quarantine`, `validate_run_health_report`, `validate_checkpoint`
- `record_*` (artifacts, not state): `record_test_baseline`, `record_diagnostics_baseline`, `record_scope_baseline`, `record_subtask_baseline`, `record_token_event`, `record_learning_consumption`, `record_repeated_learning_violations`, `record_workflow_fit`, `record_plan_artifacts`, `record_test_contract_handoff`, `record_review_ordering`, `record_token_budget_decision`
- **`python3 .map/scripts/map_step_runner.py <cmd>`** — pure analysis/persistence helpers (no state-machine side effect):
- `detect_*` family: `detect_truncated_agent_output`, `detect_already_done`, `detect_cross_subtask_regression_risk`, `detect_actor_files_changed_mismatch`, `detect_symbol_blast_radius`
- `build_*` family: `build_context_block`, `build_json_retry_prompt`, `build_acceptance_coverage_report`, `build_prior_stage_consumption_report`, `build_retry_quarantine`, `build_handoff_bundle`, `build_review_handoff`, `build_review_prompts`
- `save_*` / `load_*`: `save_research`, `load_research`, `save_artifact_manifest`, `load_artifact_manifest`, `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`
Three findings, all substantive:

1. validate_spec_citations.py line-range bounds check was incomplete:
   - reversed ranges (`file.py:20-10`) passed because both bounds were
     compared independently against the file length, not each other.
   - out-of-bounds start with in-bounds end (`file.py:50-5` on a 10-line
     file) passed because only `end_no > len(lines)` was guarded.
   Fixed by validating all four conditions independently and reporting
   a reversed-range hint in the failure reason. Two new tests cover
   both edge cases: test_flags_reversed_range,
   test_flags_out_of_bounds_start_with_in_bounds_end.

2. efficient-reference.md routing table listed save_*/load_* functions
   that aren't CLI-dispatchable (`save_artifact_manifest`,
   `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`)
   — they exist in map_step_runner.py as internal helpers but have no
   `func_name` dispatch branch, so `python3 map_step_runner.py
   save_learning_metrics` fails with "Unknown function". Pruned to the
   three actually-dispatchable names: save_research, load_research,
   load_artifact_manifest. Added validate_prior_stage_consumption
   (was missing). Added an "if you see 'Unknown function', grep
   map_step_runner.py for func_name ==" note so the dispatcher
   remains the ground truth even if this list drifts.

3. Same problem in the record_* family: `record_repeated_learning_violations`
   and `record_token_budget_decision` are internal helpers, not CLI
   commands. `record_review_ordering` IS dispatched, but with a HYPHEN
   (`record-review-ordering`) not an underscore — pure trap. Removed
   the two non-dispatchable names; renamed to `record-review-ordering`
   with a callout that this single command uses hyphen syntax.
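
For finding 1, the commit does not list the four conditions. The sketch below assumes they are each bound checked against both 1 and the file length, with the reversed-range hint added separately. `range_problem` is a hypothetical name, not the validator's own function.

```python
from typing import Optional


def range_problem(start: int, end: int, n_lines: int) -> Optional[str]:
    """Return a failure reason for a cited line range, or None if it is valid."""
    problems = []
    # Each bound is checked on its own so one bad bound cannot mask another.
    if start < 1:
        problems.append(f"start {start} < 1")
    if start > n_lines:
        problems.append(f"start {start} > file length {n_lines}")
    if end < 1:
        problems.append(f"end {end} < 1")
    if end > n_lines:
        problems.append(f"end {end} > file length {n_lines}")
    # Hint for ranges written backwards, e.g. file.py:20-10.
    if end < start:
        problems.append("range is reversed (end < start)")
    return "; ".join(problems) or None
```

With this shape, `file.py:20-10` is rejected as reversed even when both numbers fit the file, and `file.py:50-5` on a 10-line file is rejected for the out-of-bounds start as well.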

Also inlined the artifact-writer family list (six write_* commands)
directly into the routing table instead of leaving it as a separate
prose sentence — now every CLI-callable name appears in one place.
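
The "grep for `func_name ==`" advice from finding 2 can also be scripted. This is a sketch under the assumption that dispatch branches compare `func_name` against string literals. The real layout of `map_step_runner.py` is not shown in this PR, and `dispatchable_commands` and `SAMPLE` are hypothetical.

```python
import re

# Captures the literal in comparisons such as: func_name == "save_research"
DISPATCH = re.compile(r"""func_name\s*==\s*["']([\w-]+)["']""")


def dispatchable_commands(source: str) -> set[str]:
    """Collect every name compared against func_name in the given source text."""
    return set(DISPATCH.findall(source))


# Stand-in for the dispatcher source; real branches may look different.
SAMPLE = '''
if func_name == "save_research":
    ...
elif func_name == "record-review-ordering":
    ...
'''
```

Reading the real file with `Path(".map/scripts/map_step_runner.py").read_text()` and diffing the result against the documented list would flag drift in either direction, including hyphen-versus-underscore mismatches.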

make check: ruff + mypy + pyright clean; 1713 passed / 4 skipped / 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@azalio azalio merged commit 0322815 into main May 29, 2026
6 checks passed
@azalio azalio deleted the map-friction-fixes branch May 29, 2026 04:59