Merged
25 changes: 14 additions & 11 deletions .claude/skills/map-efficient/SKILL.md
@@ -35,6 +35,7 @@ parallel_tool_policy: guarded_wave_only
4. Batch mode is default. Sequential subtask execution is default.
5. After Monitor pass, record files changed in `step_state.json` for guard isolation.
6. Validate planning metadata before Actor starts: `expected_diff_size`, `concern_type`, `one_logical_step`, `split_rationale`, `concern_justification`, `coverage_map`, `hard_constraints`, `soft_constraints`, `validation_criteria`, `[AC-1]` bracket tags, and `tradeoff_rationale`.
7. Script routing: `map_orchestrator.py` owns state-machine transitions (`get_next_step`, `validate_step`, `monitor_failed`, `record_subtask_result`, `set_waves`, `resume_from_plan`, …); `map_step_runner.py` owns every `detect_*` / `build_*` / `save_*` / `load_*` / `refresh_*` / `log_*` helper plus baseline `record_*` and artifact writers. Full table + the `record_*` / `validate_*` disambiguation in [efficient-reference.md#script-routing-dispatcher-reference](efficient-reference.md#script-routing-dispatcher-reference).

## Mutation Boundary Constraints

@@ -296,18 +297,19 @@ Return files_changed, tests_run, validation_notes, and any blocker.
### Actor truncated-response gate (MANDATORY — pre-MONITOR)

Before invoking Monitor, validate Actor's response via
-`detect_truncated_agent_output --agent actor`. If `truncated: true`, log via
-`log_agent_failure` and re-invoke ONCE using the prompt from
-`build_json_retry_prompt --agent actor --errors '<reasons>'`; if still
-malformed, stop with CLARIFICATION_NEEDED.
+`python3 .map/scripts/map_step_runner.py detect_truncated_agent_output --agent actor`.
+If `truncated: true`, log via
+`python3 .map/scripts/map_step_runner.py log_agent_failure` and re-invoke ONCE using the prompt from
+`python3 .map/scripts/map_step_runner.py build_json_retry_prompt --agent actor --errors '<reasons>'`;
+if still malformed, stop with CLARIFICATION_NEEDED.

**Files-changed mismatch check (MANDATORY):** Run
-`detect_actor_files_changed_mismatch "$BRANCH" "$SUBTASK_ID" --declared "<Actor's files_changed, comma-joined>"`.
+`python3 .map/scripts/map_step_runner.py detect_actor_files_changed_mismatch "$BRANCH" "$SUBTASK_ID" --declared "<Actor's files_changed, comma-joined>"`.
If `status_mismatch == true`, surface `recovery_instruction` and re-invoke Actor to finish `declared_not_written` files; do NOT record the subtask until clear. Full recipe: [efficient-reference.md](efficient-reference.md).

### Symbol blast-radius gate (MANDATORY — pre-dispatch)

-Run `detect_symbol_blast_radius "$BRANCH" "$SUBTASK_ID"`. If
+Run `python3 .map/scripts/map_step_runner.py detect_symbol_blast_radius "$BRANCH" "$SUBTASK_ID"`. If
`recommended_gate == "validate_callers"`, append `external_callers` to the Monitor
`<documents>` context and require Monitor to validate each external caller's contract.
Full recipe: [efficient-reference.md](efficient-reference.md).
@@ -337,11 +339,12 @@ Return JSON with valid, summary, issues, files_changed, tests_run, and escalatio
# After Monitor returns:

- **Truncated-response gate (MANDATORY — pre-verdict):** Before reading
-`valid`/`recommendation`, run `detect_truncated_agent_output --agent monitor`
+`valid`/`recommendation`, run
+`python3 .map/scripts/map_step_runner.py detect_truncated_agent_output --agent monitor`
(JSON with `valid`, `summary`, `issues`, ends `}`). On truncation: log via
-`log_agent_failure` and re-invoke Monitor ONCE using the prompt from
-`build_json_retry_prompt --agent monitor --errors '<reasons>'`; if still
-malformed, stop with CLARIFICATION_NEEDED. Do NOT record the
+`python3 .map/scripts/map_step_runner.py log_agent_failure` and re-invoke Monitor ONCE using the prompt from
+`python3 .map/scripts/map_step_runner.py build_json_retry_prompt --agent monitor --errors '<reasons>'`;
+if still malformed, stop with CLARIFICATION_NEEDED. Do NOT record the
prose-response subtask as complete. Three signs:
(a) doesn't parse as JSON, (b) missing one of
`valid`/`summary`/`issues`, (c) ends mid-sentence with no closing `}`.
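
The three signs listed above can be sketched as a small standalone check. This is an illustration only and is not the code behind `detect_truncated_agent_output`; the function name `truncation_reasons` and the reason strings are hypothetical.

```python
import json

# Keys the Monitor verdict is expected to carry, per the gate description.
REQUIRED_KEYS = ("valid", "summary", "issues")


def truncation_reasons(raw: str) -> list[str]:
    """Return the reasons a Monitor response looks truncated (empty list if none)."""
    reasons = []
    text = raw.strip()
    # Sign (c): the response ends without a closing brace.
    if not text.endswith("}"):
        reasons.append("no closing brace")
    # Sign (a): the response does not parse as JSON.
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        reasons.append("not valid JSON")
        return reasons
    if not isinstance(parsed, dict):
        reasons.append("top-level value is not an object")
        return reasons
    # Sign (b): one of the required keys is missing.
    missing = [key for key in REQUIRED_KEYS if key not in parsed]
    if missing:
        reasons.append("missing keys: " + ", ".join(missing))
    return reasons
```

A non-empty result would be the kind of text one might pass as `'<reasons>'` to the retry prompt builder, assuming that flag accepts free-form text.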
@@ -380,7 +383,7 @@ Return JSON with valid, summary, issues, files_changed, tests_run, and escalatio
Full recipe: [efficient-reference.md](efficient-reference.md).
- If `valid=false`, write `code-review-N.md`, run `python3 .map/scripts/map_orchestrator.py monitor_failed --feedback "<feedback>"`, inspect `retry_isolation`, and invoke Predictor only when stuck/high-risk escalation rules apply.
- If `retry_isolation=clean_retry_required`, run `python3 .map/scripts/map_step_runner.py validate_retry_quarantine` before the next Actor call. The next Actor prompt must use CLEAN_RETRY mode from `.map/<branch>/retry_quarantine.json` and must not reuse the rejected approach unless the quarantine artifact preserves it.
-- Treat test failures after Monitor approval as Monitor failure. **Cross-subtask regression gate (MANDATORY):** before the test gate, run `detect_cross_subtask_regression_risk "$BRANCH" "$SUBTASK_ID"`; if `recommended_gate == "full_suite"` you MUST run the FULL suite (never a `-k` subset) before commit / `record_subtask_result` — per-subtask Monitor is blind to regressions on prior subtasks' code. Recipe: [efficient-reference.md](efficient-reference.md).
+- Treat test failures after Monitor approval as Monitor failure. **Cross-subtask regression gate (MANDATORY):** before the test gate, run `python3 .map/scripts/map_step_runner.py detect_cross_subtask_regression_risk "$BRANCH" "$SUBTASK_ID"`; if `recommended_gate == "full_suite"` you MUST run the FULL suite (never a `-k` subset) before commit / `record_subtask_result` — per-subtask Monitor is blind to regressions on prior subtasks' code. Recipe: [efficient-reference.md](efficient-reference.md).
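
As a rough sketch of the regression gate's decision, the snippet below turns the detector's JSON verdict into a test command. It assumes the test runner is `pytest` (suggested by the `-k` flag but not stated in this file); `choose_test_command` and `subset_expr` are hypothetical names.

```python
def choose_test_command(verdict: dict, subset_expr: str) -> list[str]:
    """Pick the test invocation from a cross-subtask regression verdict."""
    # A full_suite recommendation forbids any -k subset.
    if verdict.get("recommended_gate") == "full_suite":
        return ["pytest"]
    # Otherwise a per-subtask subset is acceptable (assumption for illustration).
    return ["pytest", "-k", subset_expr]
```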

### Phase: ADVANCE_SUBTASK (synthetic boundary)

21 changes: 21 additions & 0 deletions .claude/skills/map-efficient/efficient-reference.md
@@ -2,6 +2,27 @@

This file holds low-frequency MAP Efficient details so `SKILL.md` stays focused on the active state-machine path.

## Script Routing (dispatcher reference)

Two CLI scripts back the workflow; calling the wrong one fails with `invalid choice`. Route by purpose, not by name prefix:

- **`python3 .map/scripts/map_orchestrator.py <cmd>`** — state-machine transitions and step-state writes:
`get_next_step`, `peek_current_step`, `validate_step`, `initialize`, `set_plan_approved`, `set_execution_mode`, `set_tdd_mode`, `skip_step`, `set_subtasks`, `mark_contract_ready`, `resume_from_plan`, `resume_from_test_contract`, `check_circuit_breaker`, `set_waves`, `get_wave_step`, `validate_wave_step`, `advance_wave`, `resume_single_subtask`, `get_plan_progress`, `monitor_failed`, `wave_monitor_failed`, `reopen_for_fixes`, `mark_workflow_complete`, `mark_subtask_complete`, `record_subtask_result`, `backfill_subtask_ids`, `finalize_plan`.

- **`python3 .map/scripts/map_step_runner.py <cmd>`** — pure analysis/persistence helpers (no state-machine side effect). The list below names ONLY commands that have a `func_name` dispatch branch in `map_step_runner.py` and are thus invocable from the shell; the module defines additional internal helpers (`save_artifact_manifest`, `save_learning_metrics`, `load_learning_metrics`, `load_blueprint`, `record_repeated_learning_violations`, `record_token_budget_decision`, …) that are used by other dispatch branches but cannot be called directly:
  - `detect_*` family: `detect_truncated_agent_output`, `detect_already_done`, `detect_cross_subtask_regression_risk`, `detect_actor_files_changed_mismatch`, `detect_symbol_blast_radius`
  - `build_*` family: `build_context_block`, `build_json_retry_prompt`, `build_acceptance_coverage_report`, `build_prior_stage_consumption_report`, `build_retry_quarantine`, `build_handoff_bundle`, `build_review_handoff`, `build_review_prompts`
  - `save_*` / `load_*`: `save_research`, `load_research`, `load_artifact_manifest`
  - `refresh_*`: `refresh_blueprint_affected_files`
  - `validate_*` (non-state): `validate_blueprint_contract`, `validate_mutation_boundary`, `validate_retry_quarantine`, `validate_run_health_report`, `validate_checkpoint`, `validate_prior_stage_consumption`
  - `record_*` (artifacts, not state): `record_test_baseline`, `record_diagnostics_baseline`, `record_scope_baseline`, `record_subtask_baseline`, `record_token_event`, `record_learning_consumption`, `record_workflow_fit`, `record_plan_artifacts`, `record_test_contract_handoff`, `record-review-ordering` (note: this one is dispatched with a hyphen, not an underscore)
  - artifact writers: `write_verification_summary`, `write_run_health_report`, `write_pr_draft`, `write_plan_review`, `write_stage_gate`, `write_learning_handoff`
  - `log_*`: `log_agent_failure`

Rule of thumb: anything that mutates `step_state.json` → orchestrator. Anything that reads the repo, writes a sidecar artifact, or returns a JSON verdict without touching `step_state.json` → step_runner. The two `record_subtask_result` (orchestrator) vs `record_test_baseline` (step_runner) cases are the most common confusion point — orchestrator advances the cursor, step_runner just persists a baseline file.

If a command above ever returns `Unknown function`, grep `map_step_runner.py` for `func_name ==` to confirm the dispatch branch still exists. This list was accurate as of the PR that added it, but the dispatcher itself is the ground truth.
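
A minimal sketch of routing by explicit lookup rather than by name prefix. The command sets are abbreviated from the lists above, and `route` is a hypothetical helper, not part of either script.

```python
# Abbreviated command tables; the full lists are in the section above.
ORCHESTRATOR_CMDS = {
    "get_next_step", "validate_step", "monitor_failed",
    "record_subtask_result", "set_waves", "resume_from_plan",
}
STEP_RUNNER_CMDS = {
    "detect_truncated_agent_output", "build_json_retry_prompt",
    "validate_retry_quarantine", "record_test_baseline",
    "log_agent_failure", "record-review-ordering",
}


def route(cmd: str) -> str:
    """Return the script that dispatches `cmd`, or raise if it is not listed."""
    if cmd in ORCHESTRATOR_CMDS:
        return ".map/scripts/map_orchestrator.py"
    if cmd in STEP_RUNNER_CMDS:
        return ".map/scripts/map_step_runner.py"
    # Internal helpers such as save_artifact_manifest are deliberately absent:
    # they have no dispatch branch and cannot be called from the shell.
    raise ValueError(f"no dispatch entry for {cmd!r}")
```

An explicit table keeps `record_subtask_result` and `record_test_baseline` apart even though both share the `record_` prefix.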

## Wave Execution

Sequential is default. Parallel execution is allowed only when a wave has satisfied dependencies, low risk, and disjoint new-file writes, or when the user explicitly requests it. Use `get_wave_step`, `validate_wave_step`, and `advance_wave`; do not mix wave APIs with the single-current-subtask API.
12 changes: 12 additions & 0 deletions .claude/skills/map-plan/SKILL.md
@@ -128,6 +128,18 @@ Write `.map/<branch>/spec_<branch>.md`. The full spec template is in [plan-refer

Write the same spec artifact from the provided requirements and discovery evidence. Do not invent unresolved decisions; put them in Open Questions.

### Step 2a.5: Validate Spec Citations (MANDATORY)

Before the devil's-advocate review, gate on `file:line` citation correctness — stale citations in the spec ship to every downstream phase (research, Actor, Monitor) and cause real bugs (e.g., the hogback-gap ST-002 cited `src/mapify_cli/__init__.py:96` for `MAP_DEBUG` when the symbol had moved to :207). The validator finds every `<path>:<line>[-<line>]` pattern, checks the path exists and line is in range, and — when a backticked identifier sits next to the citation — verifies the cited line contains it.

```bash
python3 .map/scripts/validate_spec_citations.py --branch "$BRANCH"
```

- Exit 0 + `"passed": true` → proceed to Step 2b.
- Exit 1 + `"failures": [...]` with `status` in `{stale-citation, error}` → fix the spec (correct the line number, update the symbol name, or remove the citation) and re-run. Do NOT proceed to decomposition with red failures.
- Exit 2 → invocation error (missing branch / spec file); fix invocation, do not skip.
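
The exit-code contract above could be consumed by a caller roughly as follows. This is a sketch under the assumption that the JSON verdict arrives on stdout for exit codes 0 and 1; `interpret_gate`, `run_gate`, and the returned action strings are hypothetical.

```python
import json
import subprocess


def interpret_gate(exit_code: int, stdout: str) -> str:
    """Map the validator's exit code and JSON verdict to a next action."""
    if exit_code == 2:
        return "fix-invocation"  # missing branch or spec file
    try:
        verdict = json.loads(stdout)
    except json.JSONDecodeError:
        return "fix-invocation"  # no parseable verdict; treat as a bad run
    if exit_code == 0 and verdict.get("passed") is True:
        return "proceed"  # continue to Step 2b
    return "fix-spec"  # correct or remove the failing citations, then re-run


def run_gate(branch: str) -> str:
    """Run the validator for `branch` and return the next action."""
    proc = subprocess.run(
        ["python3", ".map/scripts/validate_spec_citations.py", "--branch", branch],
        capture_output=True,
        text=True,
        check=False,  # non-zero exits are part of the contract, not errors
    )
    return interpret_gate(proc.returncode, proc.stdout)
```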

### Step 2b: Devil's Advocate Review (SPEC_REVIEW)

Run Monitor as a spec reviewer before decomposition.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -88,6 +88,7 @@ jobs:
# Run linters if installed (Linux/macOS only)
which ruff > /dev/null 2>&1 && ruff check src/ tests/ || echo "Ruff not installed, skipping"
which mypy > /dev/null 2>&1 && mypy src/ || echo "Mypy not installed, skipping"
which pyright > /dev/null 2>&1 && pyright src/ || echo "Pyright not installed, skipping"

- name: Run Codex provider regression checks
run: |
250 changes: 250 additions & 0 deletions .map/scripts/validate_spec_citations.py
@@ -0,0 +1,250 @@
#!/usr/bin/env python3
"""Validate file:line citations inside a /map-plan spec.

Scans `.map/<branch>/spec_<branch>.md` for `<path>:<line>[-<line>]` patterns,
checks that each path exists, the line range is in bounds, and (when a nearby
backticked identifier is detected in the spec text) the cited line actually
contains that identifier. Prints a JSON verdict on stdout; exits non-zero on
any failure so /map-plan can gate decomposition.

Usage:
    python3 .map/scripts/validate_spec_citations.py [--branch BRANCH] \
        [--spec-path PATH] [--repo-root PATH]

The branch slug follows the same sanitization rule as the orchestrator
(`/` and other special chars replaced with `-`). Citations whose path looks
like a URL or starts with `http`/`/Users/` are skipped (out of repo scope).
"""

from __future__ import annotations

import argparse
import json
import re
import subprocess
import sys
from pathlib import Path

# Match `<path>:<line>` or `<path>:<start>-<end>`, where the path ends in a
# recognised extension. The negative lookbehind on `[:\w]` keeps a match from
# starting mid-word or directly after a colon, and the positive lookahead on
# whitespace, closing punctuation, or a quote/backtick (or EOL) keeps the line
# number from gluing onto trailing characters.
_CITATION_RE = re.compile(
    r"""
    (?<![:\w])                # not preceded by another colon-citation or word
    (?P<path>
        [\w./\-]+             # path component
        \.(?:py|md|sh|toml|yaml|yml|json|js|ts|go|rs|tsx|jsx)
    )
    :
    (?P<line>\d+)
    (?:-(?P<endline>\d+))?
    (?=[\s,;)\]'`"]|$)
    """,
    re.VERBOSE | re.MULTILINE,
)

# Match a `\`identifier\`` adjacent to a citation; identifiers may be Python
# symbols, env-var names, or hyphen-cased module names.
_IDENT_RE = re.compile(r"`([A-Za-z_][\w./\-]{1,79})`")

# Citations whose path looks like one of these are skipped (not in-repo).
_SKIP_PREFIXES = ("http://", "https://", "/Users/", "/home/", "~/", "$HOME")


def _branch_slug() -> str:
    try:
        raw = subprocess.check_output(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"], text=True
        ).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return ""
    return re.sub(r"-{2,}", "-", re.sub(r"[^a-zA-Z0-9_.-]", "-", raw)).strip("-")


def _resolve_repo_root() -> Path:
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "--show-toplevel"], text=True
        ).strip()
        return Path(out)
    except (subprocess.CalledProcessError, FileNotFoundError):
        return Path.cwd()


def _nearest_identifier(spec_text: str, citation_start: int, window: int = 80) -> str | None:
    """Return the closest backticked identifier within `window` chars of the citation."""
    left = spec_text[max(0, citation_start - window) : citation_start]
    right = spec_text[citation_start : citation_start + window]
    # Prefer the rightmost identifier on the LEFT of the citation
    # (e.g. ``MAP_DEBUG`` `src/mapify_cli/__init__.py:96`)
    left_matches = list(_IDENT_RE.finditer(left))
    if left_matches:
        return left_matches[-1].group(1)
    right_matches = _IDENT_RE.search(right)
    if right_matches:
        return right_matches.group(1)
    return None


def _check_citation(
    repo_root: Path,
    spec_text: str,
    match: re.Match[str],
) -> dict[str, object]:
    raw_path = match.group("path")
    if any(raw_path.startswith(prefix) for prefix in _SKIP_PREFIXES):
        return {"path": raw_path, "status": "skipped", "reason": "out-of-repo path"}

    line_no = int(match.group("line"))
    end_no = int(match.group("endline")) if match.group("endline") else line_no

    target = (repo_root / raw_path).resolve()
    try:
        target.relative_to(repo_root)
    except ValueError:
        return {
            "path": raw_path,
            "line": line_no,
            "status": "error",
            "reason": "resolved path escapes repo root",
        }

    if not target.is_file():
        return {
            "path": raw_path,
            "line": line_no,
            "status": "error",
            "reason": f"file does not exist at {target.relative_to(repo_root)}",
        }

    try:
        lines = target.read_text(encoding="utf-8", errors="replace").splitlines()
    except OSError as exc:
        return {
            "path": raw_path,
            "line": line_no,
            "status": "error",
            "reason": f"could not read file: {exc}",
        }

    # Validate the line range against the file. The previous version only
    # checked `line_no < 1 or end_no > len(lines)`, which missed two cases:
    #   1. Reversed range (e.g. `file.py:20-10`) — end is below start.
    #   2. Out-of-bounds start where end happens to be in range
    #      (e.g. file has 10 lines, citation `file.py:50-5`: start=50 fails
    #      but end_no=5 passes the upper-bound check).
    # Validate every bound independently.
    if (
        line_no < 1
        or end_no < line_no
        or line_no > len(lines)
        or end_no > len(lines)
    ):
        reason_parts = [f"line out of range (file has {len(lines)} lines)"]
        if end_no < line_no:
            reason_parts.append(f"reversed range: end {end_no} < start {line_no}")
        return {
            "path": raw_path,
            "line": line_no,
            "end_line": end_no,
            "status": "error",
            "reason": "; ".join(reason_parts),
        }

    ident = _nearest_identifier(spec_text, match.start())
    if ident is None:
        return {
            "path": raw_path,
            "line": line_no,
            "end_line": end_no,
            "status": "ok-no-identifier",
            "reason": "path/line valid; no adjacent identifier to cross-check",
        }

    cited_block = "\n".join(lines[line_no - 1 : end_no])
    if ident in cited_block:
        return {
            "path": raw_path,
            "line": line_no,
            "end_line": end_no,
            "identifier": ident,
            "status": "ok",
        }

    return {
        "path": raw_path,
        "line": line_no,
        "end_line": end_no,
        "identifier": ident,
        "status": "stale-citation",
        "reason": (
            f"identifier {ident!r} not found at line {line_no}"
            + (f"-{end_no}" if end_no != line_no else "")
            + "; cited block does not contain it"
        ),
    }


def validate_spec(spec_path: Path, repo_root: Path) -> dict[str, object]:
    text = spec_path.read_text(encoding="utf-8", errors="replace")
    results = [
        _check_citation(repo_root, text, m) for m in _CITATION_RE.finditer(text)
    ]
    failures = [r for r in results if r["status"] in ("error", "stale-citation")]
    return {
        "spec_path": str(spec_path),
        "repo_root": str(repo_root),
        "total_citations": len(results),
        "failures": failures,
        "passed": len(failures) == 0,
        "details": results,
    }


def main() -> int:
    parser = argparse.ArgumentParser(
        description=(__doc__ or "").splitlines()[0] or "Validate spec citations."
    )
    parser.add_argument("--branch", help="branch slug (default: current git HEAD)")
    parser.add_argument("--spec-path", help="explicit spec file path (overrides --branch)")
    parser.add_argument(
        "--repo-root", help="repo root path (default: `git rev-parse --show-toplevel`)"
    )
    parser.add_argument(
        "--quiet",
        action="store_true",
        help="suppress per-citation details on success",
    )
    args = parser.parse_args()

    repo_root = Path(args.repo_root).resolve() if args.repo_root else _resolve_repo_root()

    if args.spec_path:
        spec_path = Path(args.spec_path).resolve()
    else:
        branch = args.branch or _branch_slug()
        if not branch:
            print(
                json.dumps({"status": "error", "reason": "could not determine branch"}),
                file=sys.stderr,
            )
            return 2
        spec_path = repo_root / ".map" / branch / f"spec_{branch}.md"

    if not spec_path.is_file():
        print(
            json.dumps({"status": "error", "reason": f"spec not found: {spec_path}"}),
            file=sys.stderr,
        )
        return 2

    verdict = validate_spec(spec_path, repo_root)
    if args.quiet and verdict["passed"]:
        verdict.pop("details", None)
    print(json.dumps(verdict, indent=2))
    return 0 if verdict["passed"] else 1


if __name__ == "__main__":
    sys.exit(main())
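
The line-range check in `_check_citation` tests four bounds independently. As a standalone illustration (not part of the script; `range_in_bounds` is a hypothetical name), the same condition can be written as one chained comparison and exercised against the two cases called out in the comment:

```python
def range_in_bounds(start: int, end: int, total_lines: int) -> bool:
    """True when start..end is a forward range that lies inside a file of total_lines lines."""
    # Equivalent to rejecting: start < 1, end < start, start > total, end > total.
    return 1 <= start <= end <= total_lines


# Reversed range, e.g. `file.py:20-10` in a 100-line file.
reversed_ok = range_in_bounds(20, 10, 100)
# Out-of-bounds start with an in-range end, e.g. `file.py:50-5` in a 10-line file.
oob_start_ok = range_in_bounds(50, 5, 10)
# An ordinary citation, e.g. `file.py:3-5` in a 10-line file.
normal_ok = range_in_bounds(3, 5, 10)
```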
1 change: 1 addition & 0 deletions Makefile
@@ -49,6 +49,7 @@ test-integration:
lint:
	ruff check src/ tests/
	mypy src/
	pyright src/

format:
	black src/ tests/