MAP Framework Platform Refactor Spec

Status: Proposed
Date: 2026-03-23
Target: map-framework next major platform iteration

Summary

MAP Framework has a strong execution model: specialized agents, explicit quality gates, branch-scoped workflow artifacts, and a usable /map-plan -> /map-efficient -> /map-check loop.

The main weakness is not execution quality. The main weakness is that too much product behavior is encoded in large prompt files, large installer codepaths, and hand-maintained runtime conventions. This makes the framework harder to evolve, harder to customize, and harder to extend beyond its current Claude Code-first delivery model.

This spec proposes a platform refactor that keeps MAP's workflow philosophy intact while making the system:

declarative instead of prompt-hardcoded
configurable instead of template-edited
adapter-driven instead of delivery-coupled
schema-validated instead of convention-only
upgradable without blind overwrite

Problem Statement

The current platform has five structural problems.

1. Workflow behavior is too hardcoded

Core behavior currently lives across:

installer/runtime code in src/mapify_cli/__init__.py
orchestrator logic in src/mapify_cli/templates/map/scripts/map_orchestrator.py
long command prompts such as .claude/commands/map-efficient.md

This makes workflow changes expensive because behavior is split between Python, markdown prompts, hooks, and docs.

2. The execution engine is tightly coupled to Claude delivery

mapify init currently installs one concrete delivery shape: .claude/agents, .claude/commands, .claude/hooks, .claude/settings.json, .claude/workflow-rules.json.

That works, but it means the platform layer and the delivery layer are effectively the same thing. As a result:

prompt structure becomes platform architecture
update behavior is file-copy based
future support for other assistants would multiply template debt

3. Customization requires source edits instead of configuration

Today, most meaningful customization still means editing shipped templates or project files directly.

MAP needs a first-class configuration layer for:

workflow profile selection
repo-specific context
verification commands and gates
risk thresholds
research/guard defaults
delivery toggles

4. `.map/<branch>/` is useful but under-specified as a platform contract

The .map/<branch>/ directory already carries the right idea: persistent workflow state and artifacts that survive context resets.

However, only part of that contract is formally defined. The platform still relies on a mix of:

JSON files with partial schema coverage
markdown files with implicit structure
runtime assumptions spread across hooks, commands, and scripts

This makes resume, migration, CI integration, and tooling interoperability less reliable than they should be.

5. Upgrade semantics are too coarse

mapify upgrade refreshes shipped files, but it does not provide a robust managed-update model:

generated files do not have a complete metadata contract
drift detection is limited
local customization and generated content are not clearly separated
upgrade behavior is closer to "refresh templates" than "reconcile managed artifacts"

Goals

The refactor must achieve the following goals.

G1. Introduce a declarative workflow model

MAP workflows must be representable as data, not only as prompt prose.

G2. Separate platform engine from delivery adapters

The workflow runtime must be independent from the specific assistant integration layer.

G3. Add first-class project configuration and profiles

Users must be able to tune behavior through config files, not prompt surgery.

G4. Formalize `.map/<branch>/` as a stable artifact contract

Workflow state, planning outputs, validation results, and handoff artifacts must have explicit schemas and lifecycle rules.

G5. Make updates managed and version-aware

Generated files must carry metadata so mapify upgrade can detect drift, refresh safely, and distinguish generated content from local edits.

G6. Preserve MAP's existing strengths

The refactor must preserve:

the MAP planning/execution/validation loop
branch-scoped persistence
state-gated orchestration
guardrails and hooks
the current default Claude Code experience

Landscape Analysis: OpenSpec as Reference

OpenSpec (Fission-AI/OpenSpec) is an open-source spec framework that solves a related but different problem: it structures what to build before code is written. Analyzing OpenSpec reveals mature patterns that MAP can adopt for its platform layer without changing its workflow philosophy.

What OpenSpec does well that MAP should learn from

Schema-driven artifact DAGs. OpenSpec defines workflow schemas as YAML files where artifacts form a directed acyclic graph with topological sort. Artifact state is determined by filesystem existence (BLOCKED → READY → DONE), not runtime flags. Dependencies are enablers, not gates. This is a proven implementation of the pattern MAP proposes in Section 1, and MAP should adopt a similar model for workflow phase graphs.

Schema management CLI. OpenSpec provides schema init, schema fork, schema validate, schema which with clear precedence resolution (CLI flag → change metadata → project config → default). MAP should consider an analogous mapify workflow inspect / mapify workflow validate surface.

Context injection via structured config. OpenSpec's config.yaml separates context: (injected into all artifacts) from rules: (per-artifact only). This is more reliable than freeform CLAUDE.md blocks because injection is deterministic. MAP's project config (Section 3) should adopt a similar structured injection model.

Multi-tool delivery with capability metadata. OpenSpec supports 20+ AI assistants via per-tool adapter metadata that declares path patterns and scope support per delivery surface (skills, commands). This validates MAP's adapter layer design (Section 5) and provides a concrete reference for capability metadata shape.

Three-dimensional verification. OpenSpec's /opsx:verify validates Completeness, Correctness, and Coherence as separate dimensions. MAP already has Monitor/Predictor agents doing similar work, but formalizing verification dimensions in the artifact contract would improve consistency.

Migration with legacy detection. OpenSpec detects legacy files, strips managed markers while preserving user content, and supports --force for CI. MAP's upgrade model (Section 10) should adopt similar precision.

Where MAP and OpenSpec fundamentally differ

Aspect	OpenSpec	MAP
Core problem	Agree on what to build before code	Execute how to build with quality gates
Workflow model	Fluid actions, no phase gates	Structured phases with hard-stop gates
Agent architecture	Single AI assistant, human-guided	Multi-agent (Actor, Monitor, Predictor, Evaluator)
Quality enforcement	Verify is advisory, doesn't block	Monitor `valid=false` is a hard stop
Artifact scope	Specs, proposals, design docs	Execution state, blueprints, verification results
Branch awareness	Change folders (non-git)	`.map/<branch>/` (git-branch-scoped)
Orchestration	Dependency graph on artifacts	State-machine on execution phases

What MAP should NOT adopt from OpenSpec

Fluid/no-gate philosophy. MAP's strength is structured gates. Making them advisory would weaken quality enforcement.
Delta spec model. Irrelevant to MAP — MAP doesn't manage behavioral specifications, it manages execution artifacts.
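Archive lifecycle. MAP's branch-scoped artifacts already follow the lifecycle of their git branch, so a separate archive step adds little. Automated cleanup of .map/<branch>/ is still missing (see the Phase 5 codebase analysis).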

Non-Goals

This spec does not propose:

replacing MAP's agent architecture
removing Claude Code support or making it secondary
building multi-assistant support in the first implementation phase
redesigning every prompt from scratch
changing the user-facing philosophy from structured execution to freeform specing

Design Principles

Declarative before procedural

Workflow sequencing, gates, retries, and artifact requirements should be declared in data and interpreted by runtime code.

Thin prompts, thick runtime contracts

Prompt files should describe the local reasoning task. They should not be the primary storage location for orchestration rules.

Managed generation with explicit ownership

Generated files should state:

what generated them
which workflow/profile they belong to
which template version they came from
whether local divergence exists

Single source of truth per concern

workflow sequencing: workflow schema + runtime state
project defaults: project config
branch execution state: .map/<branch>/
delivery-specific rendering: adapter

Proposed Architecture

1. Workflow Schema Layer

Introduce a workflow schema format, stored under a new directory such as:

src/mapify_cli/workflows/
  core/
    plan.yaml
    efficient.yaml
    check.yaml
    resume.yaml

Each workflow schema should define:

workflow id
description
entry command
required artifacts
phases
phase ordering and dependency graph
conditional phase rules
retry policy
stuck-recovery policy
per-wave guard policy
final verification policy
resume semantics

Phase dependency model (informed by OpenSpec's artifact DAG)

OpenSpec demonstrates that a dependency graph with topological sort and filesystem-based state detection is both sufficient and robust. MAP should adopt a similar model for phase ordering.

Key design choices:

Phases form a DAG, not a linear sequence. Some phases can run in parallel when their dependencies are satisfied.
Phase state is derived from .map/<branch>/ artifacts, not runtime flags. A phase is DONE when its output artifact exists and validates. A phase is READY when all predecessor phases are DONE. A phase is BLOCKED otherwise.
Dependencies are hard gates in MAP (unlike OpenSpec where they are enablers). If Monitor returns valid=false, the dependent phases cannot proceed.
Schema resolution precedence, from lowest to highest priority: workflow schema defaults → project config overrides → CLI flags.
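The state rules above can be sketched in a few lines. This is a minimal illustration, not the engine's actual implementation: the function name, the dict shape of `phases`, and the use of one artifact file per phase are assumptions, and it checks only that the artifact exists (the "and validates" half of the DONE rule is left out).

```python
from enum import Enum
from pathlib import Path


class PhaseState(Enum):
    BLOCKED = "blocked"
    READY = "ready"
    DONE = "done"


def phase_states(phases: dict[str, dict], branch_dir: Path) -> dict[str, PhaseState]:
    """Derive each phase's state from artifact files under branch_dir.

    `phases` maps phase id -> {"produces": <file name>, "requires": [phase ids]}.
    This shape is hypothetical; a real loader would read it from the workflow schema.
    """

    def is_done(phase_id: str) -> bool:
        # Existence check only; schema validation of the artifact is omitted here.
        return (branch_dir / phases[phase_id]["produces"]).is_file()

    states = {}
    for phase_id, spec in phases.items():
        if is_done(phase_id):
            states[phase_id] = PhaseState.DONE
        elif all(is_done(dep) for dep in spec["requires"]):
            states[phase_id] = PhaseState.READY
        else:
            states[phase_id] = PhaseState.BLOCKED
    return states
```

Because state is recomputed from the filesystem on every call, a resumed session needs no extra bookkeeping to find the next READY phase.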

Schema validation and introspection

Following OpenSpec's schema validate / schema which pattern, MAP should expose:

mapify workflow list — show available workflow schemas
mapify workflow show <id> — display phase graph and dependencies
mapify workflow validate <id> — check schema for circular dependencies, missing artifact producers, unreachable phases

Example shape:

id: map-efficient
version: 1
entry: /map-efficient

requires:
  - blueprint
  - task_plan

phases:
  - id: research
    run_if: "risk == high or existing_files >= 3"
    actor: research-agent
    produces: repo_insight
    requires: []

  - id: actor
    actor: actor
    produces: implementation
    requires: [research]    # dependency is waived when research is skipped

  - id: monitor
    actor: monitor
    produces: review_result
    requires: [actor]
    retry:
      max_attempts: 5
      stuck_recovery_at: 3
    gate: hard              # valid=false blocks downstream

guards:
  per_wave:
    checks:
      - tests
      - lint

final_verification:
  run_if: "has_high_risk_subtasks or subtask_count >= 5"

state_detection:
  method: artifact_existence  # check .map/<branch>/ for outputs

This layer is the biggest structural upgrade in the entire refactor.
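Two of the checks listed for mapify workflow validate (unknown dependencies and circular dependencies) could look roughly like the sketch below. The function name and the list-of-dicts input are assumptions that mirror the example schema's phases list. Unreachable-phase detection is not covered.

```python
from graphlib import CycleError, TopologicalSorter


def validate_phase_graph(phases: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the graph is valid."""
    ids = {p["id"] for p in phases}
    problems = []

    # A phase may only depend on phases defined in the same schema.
    for p in phases:
        for dep in p.get("requires", []):
            if dep not in ids:
                problems.append(f"phase '{p['id']}' requires unknown phase '{dep}'")

    # Cycle detection via the standard-library topological sorter.
    graph = {p["id"]: [d for d in p.get("requires", []) if d in ids] for p in phases}
    try:
        TopologicalSorter(graph).prepare()
    except CycleError as exc:
        cycle = " -> ".join(exc.args[1])
        problems.append(f"circular dependency: {cycle}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a CLI command report every schema issue in a single run.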

2. Workflow Engine

Move orchestration into a dedicated runtime package, for example:

src/mapify_cli/workflow_engine/
  loader.py
  validator.py
  graph.py
  executor.py
  state_store.py
  policies.py

Responsibilities:

load workflow schemas
validate them
compute next executable phase
enforce dependencies and retries
persist and restore workflow state
evaluate guard and recovery policies
expose machine-readable commands for prompts/hooks

The existing map_orchestrator.py should become a thin CLI wrapper around this engine, not the canonical implementation.
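As one illustration of what policies.py might hold, the retry and stuck-recovery settings from the example schema can be evaluated by a small pure function. The action names and the exact meaning of stuck_recovery_at (recovery fires on that attempt, escalation on the last one) are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 5        # matches retry.max_attempts in the example schema
    stuck_recovery_at: int = 3   # matches retry.stuck_recovery_at


def next_action(policy: RetryPolicy, attempt: int, valid: bool) -> str:
    """Decide what follows Monitor's verdict on the given 1-based attempt.

    Returned strings are hypothetical action names, not an existing MAP API.
    """
    if valid:
        return "proceed"
    if attempt >= policy.max_attempts:
        return "escalate"
    if attempt == policy.stuck_recovery_at:
        return "stuck_recovery"
    return "retry"
```

Keeping the decision free of I/O makes it easy to unit-test and lets prompts consume the resulting action instead of restating the thresholds.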

3. Project Configuration

Introduce a project config file, for example:

mapify.yaml

or:

.map/config.yaml

Initial supported configuration should include:

active profile
enabled workflows
repo context (injected into all agent prompts)
per-phase rules (injected only into matching phase prompts)
verification commands
research defaults
risk thresholds
guard policy tuning
MCP defaults
delivery settings
language preference

Context injection model (informed by OpenSpec)

OpenSpec demonstrates that separating context (global) from rules (per-artifact) provides reliable, deterministic injection:

context appears in every agent prompt — tech stack, conventions, constraints.
rules appear only in the matching phase prompt — phase-specific guidance.
Both are injected programmatically, not via freeform markdown that agents may or may not read.

This is more reliable than MAP's current approach of embedding context in CLAUDE.md blocks.
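A minimal sketch of the injection step, assuming the config has already been parsed into a dict with the context and rules keys described above. The function name and the section headings it emits are illustrative only.

```python
def build_injected_context(config: dict, phase_id: str) -> str:
    """Compose the text injected into one phase's prompt.

    `context` goes into every phase; `rules` only into the matching phase.
    """
    parts = []
    context = (config.get("context") or "").strip()
    if context:
        parts.append("## Project context\n" + context)
    rules = config.get("rules", {}).get(phase_id, [])
    if rules:
        bullet_list = "\n".join(f"- {rule}" for rule in rules)
        parts.append("## Rules for this phase\n" + bullet_list)
    return "\n\n".join(parts)
```

Because the output depends only on the config and the phase id, the same inputs always produce the same injected text.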

Config validation

Following OpenSpec's pattern:

Unknown phase IDs in rules should generate warnings.
context should have a size limit (e.g., 50KB) to avoid prompt bloat.
Invalid YAML should be reported with line numbers.
mapify doctor should validate config and report issues.
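Two of these checks are easy to sketch against an already-parsed config dict. The function name and message wording are hypothetical, the 50KB limit is the example figure from the list above, and YAML parsing with line-number reporting is left out.

```python
MAX_CONTEXT_BYTES = 50 * 1024  # example limit from the text, not a settled value


def validate_config(config: dict, known_phases: set[str]) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for a parsed project config."""
    errors, warnings = [], []

    size = len((config.get("context") or "").encode("utf-8"))
    if size > MAX_CONTEXT_BYTES:
        errors.append(f"context is {size} bytes; limit is {MAX_CONTEXT_BYTES}")

    # Unknown phase ids only warn, so a config can outlive a renamed phase.
    for phase_id in config.get("rules", {}):
        if phase_id not in known_phases:
            warnings.append(f"rules: unknown phase id '{phase_id}'")
    return errors, warnings
```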

Example:

profile: core

context: |
  Python CLI project.
  Prefer deterministic shell commands.
  Keep .claude templates in sync with shipped templates.

rules:
  research:
    - Check for existing patterns before proposing new abstractions
    - Read tests first to understand expected behavior
  monitor:
    - Verify template sync between .claude/ and src/mapify_cli/templates/
    - Check for OWASP top-10 vulnerabilities

verification:
  checks:
    - make check
    - pytest tests/test_template_sync.py -v

policies:
  research_threshold_existing_files: 3
  final_verify_subtask_threshold: 5
  actor_monitor_max_retries: 5

delivery:
  assistant: claude
  hooks: true
  mcp: essential

language: ru  # optional: agent response language

4. Profiles

Add built-in profiles to reduce default complexity:

core: /map-plan, /map-efficient, /map-check, /map-review
full: all advanced workflows (map-fast, map-debug, map-tdd, map-release, map-learn)
custom: explicit workflow selection

Profiles should affect both generated delivery files and runtime defaults.

Profile selection UX (informed by OpenSpec)

OpenSpec's config profile provides an interactive wizard with current-state summary before changes. MAP should follow a similar pattern:

mapify config profile — interactive selection with preview
mapify config profile core — fast preset switch
mapify update — regenerate delivery files for current profile

Workflow selection granularity

OpenSpec allows per-workflow toggle within profiles. MAP should support similar granularity:

profile: custom
workflows:
  - map-plan
  - map-efficient
  - map-check
  - map-tdd        # user opted into TDD
  # map-debug etc. — omitted, not installed

This reduces generated file count and cognitive load for users who don't need every workflow.

Together, built-in profiles and per-workflow selection give MAP a cleaner on-ramp and a cleaner upgrade path.
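Profile resolution could be a single lookup, sketched below. The workflow names come from the profile list above. Treating full as "core plus advanced" is an assumption, and resolve_workflows is a hypothetical name.

```python
CORE = ["map-plan", "map-efficient", "map-check", "map-review"]
ADVANCED = ["map-fast", "map-debug", "map-tdd", "map-release", "map-learn"]


def resolve_workflows(config: dict) -> list[str]:
    """Return the workflows to install for a parsed project config."""
    profile = config.get("profile", "core")
    if profile == "core":
        return list(CORE)
    if profile == "full":
        # Assumption: "full" is a superset of "core".
        return CORE + ADVANCED
    if profile == "custom":
        selected = list(config.get("workflows", []))
        unknown = [w for w in selected if w not in CORE + ADVANCED]
        if unknown:
            raise ValueError(f"unknown workflows: {', '.join(unknown)}")
        return selected
    raise ValueError(f"unknown profile: {profile}")
```

Both the installer and the runtime could call the same function, which keeps generated delivery files and runtime defaults in agreement.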

5. Delivery Adapter Layer

Split delivery generation from workflow semantics.

Example package shape:

src/mapify_cli/delivery/
  base.py
  claude.py

Per-tool capability metadata (informed by OpenSpec)

OpenSpec's multi-tool delivery model demonstrates that each adapter needs explicit capability metadata declaring what delivery surfaces it supports and where files go.

MAP should define adapter capability metadata even if initially only Claude is supported:

from dataclasses import dataclass

@dataclass
class DeliveryCapability:
    tool_id: str                    # e.g., "claude"
    supports_agents: bool           # .claude/agents/
    supports_commands: bool         # .claude/commands/
    supports_hooks: bool            # .claude/hooks/
    supports_skills: bool           # .claude/skills/ (emerging standard)
    config_files: list[str]         # ["settings.json", "workflow-rules.json"]
    path_pattern: str               # ".claude/{surface}/{name}"

This metadata allows:

Programmatic validation of adapter completeness
Future adapters to declare their capabilities upfront
mapify doctor to verify delivery state against capability declarations
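A sketch of how mapify doctor might compare the filesystem against a capability declaration. It repeats the dataclass so the block is self-contained. missing_paths is a hypothetical helper, and unlike the comment in the dataclass above, this sketch assumes config_files holds project-relative paths.

```python
from dataclasses import dataclass
from pathlib import Path

SURFACES = ("agents", "commands", "hooks", "skills")


@dataclass
class DeliveryCapability:
    tool_id: str
    supports_agents: bool
    supports_commands: bool
    supports_hooks: bool
    supports_skills: bool
    config_files: list[str]   # assumption: project-relative, e.g. ".claude/settings.json"
    path_pattern: str         # e.g. ".claude/{surface}/{name}"


def missing_paths(cap: DeliveryCapability, project_root: Path) -> list[str]:
    """List declared config files and surface directories that do not exist."""
    expected = [Path(name) for name in cap.config_files]
    for surface in SURFACES:
        if getattr(cap, f"supports_{surface}"):
            # An empty name yields the surface directory itself.
            expected.append(Path(cap.path_pattern.format(surface=surface, name="")))
    return [p.as_posix() for p in expected if not (project_root / p).exists()]
```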

Delivery surfaces

OpenSpec distinguishes between skills (richer metadata, cross-tool compatible) and commands (tool-specific). MAP currently uses commands + agents. Consider adding skills support for forward compatibility:

Surface	Current	Future
Agents	`.claude/agents/*.md`	adapter-generated
Commands	`.claude/commands/*.md`	adapter-generated
Hooks	`.claude/hooks/`	adapter-generated
Skills	not used	`.claude/skills/map-*/SKILL.md`
Config	`.claude/settings.json`	adapter-generated

The Claude adapter should generate:

.claude/agents/*
.claude/commands/*
.claude/hooks/*
.claude/settings.json
.claude/workflow-rules.json

The workflow engine should not care where those files end up. It should only expose:

workflow definitions
prompt inputs
state transitions
artifact contracts

This keeps MAP Claude-first without making the platform Claude-dependent.

6. Managed File Metadata

Every generated file should include structured metadata, either in frontmatter or JSON fields.

Minimum metadata contract:

generated_by: mapify-cli
generated_by_version
template_id
template_version
workflow_id or profile
managed: true

This enables:

drift detection
targeted upgrade
selective refresh
future migration tooling
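For markdown files, the contract could be written as frontmatter at generation time and read back during upgrade. The sketch below assumes the template body has no frontmatter of its own and adds a template_hash field, which is not in the minimum list above. Both function names are hypothetical, and the parser handles only flat key: value lines.

```python
import hashlib


def render_managed_markdown(body: str, template_id: str,
                            template_version: str, cli_version: str) -> str:
    """Prefix a template body with a managed-file metadata block."""
    template_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
    header = [
        "---",
        "generated_by: mapify-cli",
        f"generated_by_version: {cli_version}",
        f"template_id: {template_id}",
        f"template_version: {template_version}",
        f"template_hash: {template_hash}",
        "managed: true",
        "---",
    ]
    return "\n".join(header) + "\n" + body


def split_managed_markdown(text: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body); metadata is empty for unmanaged files."""
    lines = text.split("\n")
    if lines[0] != "---":
        return {}, text
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text
    meta = dict(line.split(": ", 1) for line in lines[1:end])
    return meta, "\n".join(lines[end + 1:])
```

Hashing the body without the header means a later drift check can recompute the hash from the file on disk and compare it with the stored value.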

7. Branch Artifact Contract

Formalize .map/<branch>/ into explicitly versioned artifacts.

Required machine-readable artifacts should include:

step_state.json
blueprint.json
verification_results_<branch>.json
repo_insight_<branch>.json
active-issues.json

Recommended human-readable artifacts should include:

task_plan_<branch>.md
findings_<branch>.md
verification-summary.md
pr-draft.md
code-review-XXX.md

Each artifact must define:

producer
consumer
schema/version
lifecycle
overwrite/append semantics

8. Runtime Policy Extraction

Move these rules out of prompt text and into code/config:

when research is required
when final verification runs
when guard checks run
when stuck recovery triggers
when predictor participates
what counts as a blocking failure
when resume should re-open or continue a subtask

Prompts should consume policy outputs, not embed policy logic themselves.
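Two of these rules, expressed as code driven by the policies block of the project config. The field names mirror the config example in Section 3, the conditions mirror the run_if expressions in the example workflow schema, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policies:
    research_threshold_existing_files: int = 3
    final_verify_subtask_threshold: int = 5


def research_required(policies: Policies, risk: str, existing_files: int) -> bool:
    """Mirror of run_if: risk == high or existing_files >= threshold."""
    return risk == "high" or existing_files >= policies.research_threshold_existing_files


def final_verification_required(policies: Policies, has_high_risk_subtasks: bool,
                                subtask_count: int) -> bool:
    """Mirror of run_if: has_high_risk_subtasks or subtask_count >= threshold."""
    return has_high_risk_subtasks or subtask_count >= policies.final_verify_subtask_threshold
```

A prompt would then receive only the boolean outcome, so changing a threshold never requires editing prompt text.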

9. Verification Contract

Formalize verification as a structured contract, not just "run tests".

Verification dimensions (informed by OpenSpec)

OpenSpec's /opsx:verify checks three independent dimensions. MAP's /map-check should adopt a similar structured model:

Dimension	What it validates	MAP equivalent
Completeness	All subtasks done, all artifacts present	Task plan checkboxes, artifact existence
Correctness	Implementation matches intent, tests pass	`make check`, Monitor review
Coherence	Design decisions reflected in code, patterns consistent	Predictor analysis, cross-file consistency

Verification result schema

verification_results_<branch>.json should report per-dimension:

{
  "branch": "feature-x",
  "timestamp": "2026-03-23T12:00:00Z",
  "dimensions": {
    "completeness": {
      "status": "pass",
      "subtasks_done": 5,
      "subtasks_total": 5,
      "artifacts_present": ["blueprint", "task_plan", "step_state"]
    },
    "correctness": {
      "status": "pass",
      "checks_passed": ["make check", "pytest"],
      "checks_failed": []
    },
    "coherence": {
      "status": "warning",
      "issues": ["Design mentions event-driven but implementation uses polling"]
    }
  },
  "overall": "pass_with_warnings",
  "blocking": false
}
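The overall and blocking fields can be derived from the per-dimension statuses instead of being set by hand. A minimal sketch, assuming a third status value "fail" alongside the "pass" and "warning" values shown in the example; the function name is hypothetical.

```python
def summarize(dimensions: dict[str, dict]) -> dict:
    """Derive the top-level verdict from per-dimension results."""
    statuses = [dim["status"] for dim in dimensions.values()]
    if "fail" in statuses:          # "fail" is an assumed status value
        overall = "fail"
    elif "warning" in statuses:
        overall = "pass_with_warnings"
    else:
        overall = "pass"
    return {"overall": overall, "blocking": overall == "fail"}
```

Applied to the example result, this yields overall "pass_with_warnings" with blocking false, matching the JSON shown.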

10. Upgrade and Migration Model

Add a real migration path:

mapify doctor
mapify upgrade
mapify migrate

Migration approach (informed by OpenSpec)

OpenSpec's migration model demonstrates several best practices MAP should adopt:

Legacy detection. Automatically detect older generated files by presence of outdated metadata, missing version fields, or legacy file layouts. OpenSpec scans for legacy command directories and marker blocks — MAP should scan for pre-metadata generated files.

Selective cleanup. OpenSpec removes only the files it manages and preserves user content. MAP should do the same for its own managed files:

Strip managed content markers from generated files
Preserve any user edits outside managed blocks
Move deprecated files to a backup location before deletion

Non-interactive mode. Support --force flag for CI environments:

mapify upgrade --force — skip confirmation prompts
mapify migrate --force — auto-accept all migration steps

User attention items. When files require manual review (e.g., custom hooks that conflict with new schema), report them clearly without auto-deleting.

Scope-aware cleanup. If install scope or profile changes, track last-applied state and clean up stale files at previous locations.

Migration responsibilities:

detect older generated file versions
detect legacy artifact layouts
reconcile old and new metadata
regenerate safely when files are untouched
warn when managed files were locally edited
track last-applied scope/profile for drift detection

Implementation Plan

Phase 1. Extract Platform Boundaries

Create internal module boundaries without changing end-user behavior.

Deliverables:

dedicated workflow engine package
dedicated delivery package
dedicated config package
reduced responsibilities inside src/mapify_cli/__init__.py

Acceptance criteria:

existing commands still work
current tests still pass
installer behavior remains unchanged for Claude projects

Codebase analysis

Current state. src/mapify_cli/__init__.py is 2692 lines (41% of all source) and contains at least four distinct concerns: CLI wiring + UI widgets (~200 LOC), delivery/installer logic (~1500 LOC including 8 create_*_content() agent generators), config generation (~445 LOC for settings/MCP/permissions), and health checks (~370 LOC for doctor/check). State management is already well-separated into five modules (workflow_state.py, ralph_state.py, dependency_graph.py, verification_recorder.py, schemas.py — 1957 LOC total). The workflow engine (map_orchestrator.py + map_step_runner.py, 2556 LOC) lives in templates/map/scripts/ as runtime code copied to user projects. An empty validation/ directory already exists.

Proposed decomposition of __init__.py:

src/mapify_cli/
  __init__.py              → ~300 lines (CLI wiring, Typer commands, re-exports)
  cli_ui.py                → select_with_arrows(), StepTracker, show_banner()
  delivery/
    __init__.py
    installer.py           → init() orchestration logic
    agent_generator.py     → create_*_content() functions (8 agent generators)
    file_copier.py         → create_reference/command/skill/hook_files()
    health.py              → check(), doctor(), get_project_health()
  config/
    __init__.py
    settings.py            → settings merge logic
    mcp.py                 → MCP config generation
    permissions.py         → configure_global_permissions()

Risks specific to this phase: Refactoring a 2692-line file requires incremental moves with re-exports to avoid breaking existing imports. Existing tests import from the mapify_cli top level, so backward-compatible re-exports are essential. The templates/map/scripts/ runtime engine stays in place (it is copied to user projects and runs independently).

Phase 2. Workflow Schema for `map-plan` and `map-efficient`

Implement the workflow schema format and runtime loader.

Deliverables:

schema format
validator
runtime phase graph
migrated schemas for map-plan and map-efficient

Acceptance criteria:

map-plan and map-efficient execute from workflow definitions
orchestration decisions no longer rely on prompt-only phase descriptions

Codebase analysis

Triple redundancy problem. The same orchestration rules are currently defined in three independent places that must be kept in sync manually:

map_orchestrator.py — phase ordering (STEP_PHASES dict, lines 99-110; STEP_ORDER list, lines 113-122; TDD_STEP_ORDER, lines 125-136), step instructions (lines 412-458), retry default (max_retries=5, line 278), skippable steps set (line 972).
map-efficient.md — re-describes the same phases and ordering in prose, embeds retry thresholds (5 main / 3 stuck trigger / 2 guard rework), run conditions (RESEARCH: "3+ existing files OR risk=high"), gate definitions (per-wave: tests + linter), confidence threshold (0.7), stuck recovery policy.
workflow-gate.py — independently defines EDITING_PHASES = {"ACTOR", "APPLY", "TEST_WRITER"} (line 32) for phase-to-permission mapping.

What should become schema data (currently spread across 5+ files):

Rule	Current location	Current format
10 phase definitions and ordering	`map_orchestrator.py:99-136`	Python dict + list
Phase-to-agent mapping	`map-efficient.md` + `map_orchestrator.py:412-458`	Prose + Python strings
Retry thresholds (5/3/2)	`map-efficient.md:456-504`	Prose
RESEARCH run_if condition	`map-efficient.md:318` + `map_orchestrator.py:436`	Prose + instruction text
Per-wave gate checks	`map-efficient.md:525-575`	Prose
Phase edit permissions	`workflow-gate.py:32`	Python constant
Skippable steps set	`map_orchestrator.py:972`	Python constant
TDD step ordering variant	`map_orchestrator.py:125-136`	Python list
Test runner detection	`map_step_runner.py:820-842`	Python if/elif chain
Guard rework max	`map-efficient.md:571`	Prose
Confidence threshold	`map-efficient.md:676`	Prose

What should remain as code: state machine transitions (wave computation, subtask rotation), DAG algorithms (wave splitting by file conflicts), resume detection (regex parsing of task_plan), constraint enforcement (scope_glob fnmatch), circuit breaker formula.

What should remain as prompts: interview dimensions in map-plan.md, agent reasoning instructions, UX formatting, context distillation rules, troubleshooting guidance.

Recommendation: Start with map-efficient — it has the largest rule surface. Migrate map-plan second — it has less runtime orchestration, more reasoning guidance. The prompt files should reference parameters from the schema via injection (e.g., {{max_retries}}) rather than duplicating values.
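Parameter injection of the {{max_retries}} kind needs little machinery. A minimal sketch, with a hypothetical render_prompt helper that fails loudly when a prompt references a parameter the schema does not define:

```python
import re

_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, params: dict[str, object]) -> str:
    """Replace {{name}} placeholders with values taken from the workflow schema."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"prompt references undefined schema parameter: {name}")
        return str(params[name])

    return _PLACEHOLDER.sub(substitute, template)
```

Raising on an undefined name, instead of leaving the placeholder in place, surfaces schema and prompt drift at generation time.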

Phase 3. Project Config and Profiles

Add mapify.yaml or .map/config.yaml and built-in profiles.

Deliverables:

config parser and validator
default config generation
profile selection
runtime policy overrides

Acceptance criteria:

users can change workflow/profile behavior without editing templates
installer and runtime both read the same config source

Codebase analysis

Current state: 7 configuration sources, zero unification. Configuration is scattered across workflow-rules.json (workflow selection triggers), settings.json (permissions + hooks), ralph-loop-config.json (circuit breaker thresholds), skill-rules.json (skill triggers), 3 Python hooks (~60 hardcoded constants), CLAUDE.md (behavioral rules in prose), and the init() CLI (4 flags, 0 profiles).

Key gaps identified:

Gap	Impact
No `config.yaml`	Users must edit 5+ files to customize MAP
No language/framework detection	init installs identical config for Python/Go/TS — yet `settings.json` contains `go vet`, `gofmt`, `kubectl` in allow rules
Hook constants are fully hardcoded	`EDITING_PHASES`, `DANGEROUS_FILE_PATTERNS`, `SAFE_PATH_PREFIXES`, `READONLY_COMMANDS` — all Python constants, user must edit source
No per-workflow overrides	`ralph-loop-config.json` is global; `map-fast` cannot have `max_total_iterations: 10` while `map-efficient` has 50
No profiles	init installs ALL: 11 agents, 13 commands, 8 hooks, 3 skills — no `core`/`full`/`custom`
No CLAUDE.md for user projects	Template CLAUDE.md is MAP's own dev instructions, not a user-project skeleton

Existing infrastructure that helps: repo_insight.py already detects project language and can generate suggested verification commands. This can feed auto-detected defaults into config.yaml.

Migration path for hooks: Hooks should read overridable constants from config.yaml via a shared load_map_config() utility, with fallback to current hardcoded defaults. This preserves behavior for users who don't create a config file.
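The fallback behaviour can be sketched as a merge over hardcoded defaults. Several things here are assumptions: the hooks config key, the setting names, and the function name are all hypothetical, and reading config.yaml from disk is omitted so the block has no third-party dependency. The EDITING_PHASES default is the value quoted from workflow-gate.py.

```python
from typing import Optional

# Current hardcoded values serve as defaults when no config is present.
DEFAULTS = {
    "editing_phases": ["ACTOR", "APPLY", "TEST_WRITER"],
    "actor_monitor_max_retries": 5,
}


def resolve_hook_settings(user_config: Optional[dict]) -> dict:
    """Merge user overrides (under a hypothetical `hooks` key) over the defaults."""
    overrides = (user_config or {}).get("hooks", {})
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown hook settings: {', '.join(unknown)}")
    return {**DEFAULTS, **overrides}
```

With no config file, a hook would call resolve_hook_settings(None) and get exactly today's behaviour.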

Phase 4. Managed Generation Metadata

Add metadata to generated files and upgrade detection.

Deliverables:

metadata contract
drift detection
selective upgrade behavior
clearer reporting in mapify doctor

Acceptance criteria:

generated files show origin/version
mapify upgrade can distinguish stale files from customized files

Codebase analysis

Current state: almost no metadata, upgrade is blind overwrite. All four generation functions (create_agent_files, create_command_files, create_hook_files, create_config_files) use shutil.copy2() — pure file copy with zero metadata injection. No generated_by marker, no content hash, no mapify_version stamp exists in any generated file. The upgrade() command (line 2519) calls the same create_*_files() functions and blindly overwrites everything. Only two exceptions have merge logic: .mcp.json (preserved) and settings.local.json (merged with create_or_merge_project_settings_local()).

What exists today:

Category	Has frontmatter	Has version	Has `generated_by`	Upgrade behavior
Agents (11 .md)	YAML: name, description, model	Manual (e.g., `2.4.0`)	No	Blind overwrite
Commands (13 .md)	YAML: description only	No	No	Blind overwrite
Hooks (7 .py)	No (docstring only)	No	No	Blind overwrite
Config (3 .json)	N/A	No (schema version only)	No	Blind overwrite

Implementation approach: Replace shutil.copy2() with a copy_managed_file() function that injects metadata. For .md files: YAML frontmatter fields. For .py files: header comment block. For .json files: _mapify_metadata key or sidecar .map/manifest.json. The upgrade() function should compare template_hash values: overwrite only when the source template changed AND the user did not modify the file.

Drift detection logic:

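import hashlib

def classify_drift(body: str, metadata: dict, current_version: tuple) -> str:
    # body is the file content with the metadata block already stripped;
    # versions are assumed to be tuples such as (2, 4, 0) so they compare numerically
    current_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
    if current_hash != metadata["template_hash"]:
        return "user_modified"   # warn, don't overwrite
    if tuple(metadata["mapify_version"]) < current_version:
        return "stale"           # safe to overwrite
    return "up_to_date"          # skip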

Phase 5. Formal `.map` Artifact Schemas

Extend schema coverage to all critical branch artifacts.

Deliverables:

explicit schema/version for each artifact
validators for write/read paths
migration helpers for old artifacts

Acceptance criteria:

resume/check/doctor rely on validated artifacts
artifact consumers fail with actionable errors, not silent drift

Codebase analysis

Current state: 25 artifacts discovered, only 2 have JSON Schema validation. The full artifact inventory:

Scope	Machine-readable (JSON)	Human-readable (MD)	With schema	Without schema
`.map/<branch>/`	10	8	0	18
`.map/` root	2	1	2	1
`.map/logs/`, `.map/scripts/`	—	—	—	—

Critical gaps:

step_state.json — central artifact, no runtime validation. Read by orchestrator + 4 hooks + step_runner. STATE_ARTIFACT_SCHEMA exists in schemas.py but is never imported or used — it is an orphaned schema definition.
blueprint.json — no schema at all. Created by /map-plan, consumed by orchestrator for DAG computation. Expected keys (subtasks[].id, .dependencies, .affected_files) are described only in prompt text.
Two state systems coexist. progress.md (root-level, workflow_state.py) is legacy; step_state.json (branch-scoped, map_orchestrator.py) is current. Both remain in active code.
Location inconsistency. verification_results_<branch>.json and repo_insight_<branch>.json live at .map/ root (branch in filename), while all other branch-scoped artifacts use .map/<branch>/ directories.
No lifecycle cleanup. No artifact has a deletion mechanism. .map/<branch>/ accumulates files indefinitely.
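For blueprint.json, even a hand-written structural check would turn silent drift into actionable errors. The sketch below covers only the keys named above (subtasks[].id, .dependencies, .affected_files). validate_blueprint is a hypothetical name, and a real implementation would more likely use a JSON Schema.

```python
REQUIRED_SUBTASK_KEYS = ("id", "dependencies", "affected_files")


def validate_blueprint(blueprint: dict) -> list[str]:
    """Return actionable error messages; an empty list means the shape is valid."""
    subtasks = blueprint.get("subtasks")
    if not isinstance(subtasks, list):
        return ["blueprint.subtasks must be a list"]

    errors = []
    known_ids = {s.get("id") for s in subtasks if isinstance(s, dict)}
    for index, subtask in enumerate(subtasks):
        if not isinstance(subtask, dict):
            errors.append(f"subtasks[{index}] must be an object")
            continue
        for key in REQUIRED_SUBTASK_KEYS:
            if key not in subtask:
                errors.append(f"subtasks[{index}] is missing '{key}'")
        # Dependencies must point at subtasks defined in the same blueprint.
        for dep in subtask.get("dependencies", []):
            if dep not in known_ids:
                errors.append(f"subtasks[{index}] depends on unknown subtask '{dep}'")
    return errors
```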

Priority for schema formalization:

Priority	Artifact	Reason
P0	`step_state.json`	Central, 6+ consumers, schema already written but orphaned
P0	`blueprint.json`	DAG computation input, no schema at all
P1	`final_verification.json`	Gate decision artifact
P1	`task_plan_<branch>.md`	Custom XML-like format with regex parsing
P2	`active-issues.json`, `known-issues.json`	Simple, but no schema

Quick win: Activate the orphaned STATE_ARTIFACT_SCHEMA — connect it to map_orchestrator.py with jsonschema.validate() calls on read/write paths.

Phase 6. Verification Contract and Doctor Improvements

Formalize the verification model.

Deliverables:

structured verification result schema
per-dimension reporting (completeness, correctness, coherence)
improved mapify doctor output with config validation

Acceptance criteria:

/map-check reports structured results per verification dimension
mapify doctor validates project config and reports issues with line numbers
verification results are machine-readable JSON

Codebase analysis

Current state: a three-tier verification model already exists, but it is not formalized into unified dimensions. The three tiers:

Monitor agent (per-subtask, during execution) — 10-dimension quality model with structured JSON verdict (valid, issues[], passed_checks, failed_checks). Decision rules: CRITICAL → always valid=false; ≥2 HIGH → valid=false. Hard stop.
final-verifier agent (whole-task, after all subtasks) — adversarial verification with confidence scoring (threshold 0.7). Outputs JSON with passed, confidence, evidence, root_cause. Verdicts: COMPLETE / RE_DECOMPOSE / ESCALATE.
/map-check command (orchestration) — runs final-verifier + tests + linter + git status. Produces verification-summary.md, <stage>-gate.json, active-issues.json, pr-draft.md, runs/<timestamp>/RESULTS.md.

Mapping to Completeness/Correctness/Coherence:

Tier	Completeness	Correctness	Coherence
Monitor	dims 5,8,10 (testability, deps, research)	dims 1,2,4,6 (correctness, security, perf, CLI)	dims 3,7,9 (quality, maintainability, docs)
final-verifier	subtasks done, criteria met	tests pass, edge cases, ground truth	integration between subtasks
`/map-check`	step_state all COMPLETE	tests+lint pass, verifier APPROVED	acceptance criteria from plan

Four incompatible result formats currently coexist: verification_results_<branch>.json (recipe-based), final_verification.json (confidence-based), verification-summary.md (human-readable verdict), <stage>-gate.json (gate verdict). A unified umbrella schema with per-dimension reporting would consolidate these.
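One possible shape for such an umbrella result is sketched below. All class and field names (DimensionResult, VerificationResult, findings) are assumptions for illustration; only the three dimension names come from this spec.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DimensionResult:
    passed: bool
    findings: list[str] = field(default_factory=list)


@dataclass
class VerificationResult:
    branch: str
    completeness: DimensionResult
    correctness: DimensionResult
    coherence: DimensionResult

    @property
    def passed(self) -> bool:
        # The overall verdict passes only if every dimension passes.
        dimensions = (self.completeness, self.correctness, self.coherence)
        return all(dimension.passed for dimension in dimensions)

    def to_json(self) -> str:
        payload = asdict(self)
        payload["passed"] = self.passed
        return json.dumps(payload, indent=2)
```

Under this sketch, the existing formats would become views or inputs of one record: the human-readable summary could be rendered from it, and a gate file could carry just the overall passed flag.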

Hardcoded policy values to extract to config: confidence threshold (0.7) in final-verifier.md, Monitor decision rules ("CRITICAL → always valid=false", "≥2 HIGH → valid=false") in monitor.md, test timeout (300s) in map_step_runner.py:858, output truncation (5000 chars) in map_step_runner.py:863.
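As a rough illustration of the extraction, the values listed above could be gathered into a single policy object with the current numbers as defaults. The class name VerificationPolicy, its field names, and the monitor_valid helper are hypothetical; the default values and the two Monitor rules are the ones quoted above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VerificationPolicy:
    confidence_threshold: float = 0.7     # currently in final-verifier.md
    max_high_issues: int = 1              # monitor.md: >=2 HIGH means valid=false
    test_timeout_seconds: int = 300       # currently in map_step_runner.py
    output_truncation_chars: int = 5000   # currently in map_step_runner.py

    @classmethod
    def from_config(cls, raw: dict) -> "VerificationPolicy":
        # Reject unknown keys so typos in config fail loudly instead of being ignored.
        known = {f.name for f in fields(cls)}
        unknown = sorted(set(raw) - known)
        if unknown:
            raise ValueError(f"unknown verification policy keys: {unknown}")
        return cls(**raw)


def monitor_valid(severities: list[str], policy: VerificationPolicy) -> bool:
    """Apply the Monitor decision rules to a list of issue severities."""
    if "CRITICAL" in severities:
        return False  # CRITICAL always invalidates
    return severities.count("HIGH") <= policy.max_high_issues
```

With the rules in code, the prompt files would only need to describe how to classify severity, not how to turn severities into a verdict.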

Phase 7. Additional Delivery Adapters

Start this phase only after the platform boundary is complete.

Deliverables:

second adapter if desired
adapter capability metadata model
adapter test harness
generated-file snapshots per adapter

Acceptance criteria:

adding a new delivery target does not require changing workflow semantics
adapter declares capabilities via metadata, not implicit code paths

Codebase analysis

Current state: monolithic coupling to Claude Code with no abstraction layer. There are 73 references to "claude" in __init__.py, 12+ functions with hardcoded .claude/ paths, and 8 hook scripts that depend fully on the Claude Code protocol (PreToolUse/PostToolUse events, CLAUDE_PROJECT_DIR env var, JSON stdout response format). selected_ai = "claude" is hardcoded (line 2167) with the comment "the only supported AI assistant". No adapter/factory/provider pattern exists.

Natural separation boundary already present:

Layer	Content	Portability
`.map/` scripts (orchestrator, step runner, diagnostics)	Workflow engine	100% portable
Agent/command body text (prompt content)	Reasoning instructions	~90% portable
Delivery envelope (frontmatter, paths, hooks, config)	Claude Code integration	0% portable

What would need per-adapter reimplementation:

Component	Files affected	Effort
Directory layout (`.claude/` paths)	`__init__.py` (12+ functions)	High
`settings.json` generation	3 functions (settings, permissions, settings.local)	High — entirely different format per platform
Hook scripts	8 files, ~700 LOC total	High — protocol completely different per platform
Agent file format (frontmatter)	`create_agent_files()`	Medium — body reusable, envelope differs
Command/slash format	`create_command_files()`	Medium
MCP config	2 functions	Medium

What is reusable as-is across adapters: .map/ scripts, ralph-loop-config.json, workflow-rules.json body content, all state management modules, schemas.py, static analysis handlers, planning skill templates.

Minimum viable adapter interface:

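from __future__ import annotations  # AgentSpec, CommandSpec, HookSpec, ProjectConfig are defined elsewhere

from pathlib import Path
from typing import Protocol

class DeliveryAdapter(Protocol):
    tool_id: str  # stable adapter identifier, e.g. "claude"
    # Each create_* method writes under `path`; the int is assumed here to be the number of files written.
    def create_agents(self, path: Path, agents: list[AgentSpec]) -> int: ...
    def create_commands(self, path: Path, commands: list[CommandSpec]) -> int: ...
    def create_hooks(self, path: Path, hooks: list[HookSpec]) -> int: ...
    def create_config(self, path: Path, config: ProjectConfig) -> int: ...
    def detect_installed(self, path: Path) -> bool: ...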

Each adapter maps tool-agnostic content (AgentSpec.prompt_body) into platform-specific envelopes (ClaudeAgentEnvelope.model_frontmatter).
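To make the envelope mapping concrete, here is a small sketch of a Claude adapter covering two of the interface methods. The AgentSpec fields, the frontmatter keys written, and the default_model attribute are assumptions for illustration; the .claude/agents location and the model tier names come from the coupling analysis in this spec.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class AgentSpec:
    # Hypothetical tool-agnostic description of one agent.
    name: str
    description: str
    prompt_body: str  # reasoning instructions, reusable across adapters


class ClaudeAdapter:
    tool_id = "claude"
    default_model = "sonnet"  # Claude-specific tier name; other adapters would differ

    def create_agents(self, path: Path, agents: list[AgentSpec]) -> int:
        target = path / ".claude" / "agents"
        target.mkdir(parents=True, exist_ok=True)
        for agent in agents:
            # The envelope is the only Claude-specific part of the file.
            envelope = (
                "---\n"
                f"name: {agent.name}\n"
                f"description: {agent.description}\n"
                f"model: {self.default_model}\n"
                "---\n\n"
            )
            (target / f"{agent.name}.md").write_text(
                envelope + agent.prompt_body, encoding="utf-8"
            )
        return len(agents)

    def detect_installed(self, path: Path) -> bool:
        return (path / ".claude" / "agents").is_dir()
```

A second adapter would reuse the same AgentSpec values and change only the target directory and envelope, which is the property the Phase 7 acceptance criteria ask for.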

Implementation Priority and Dependencies

Based on codebase analysis, the phases have the following dependency structure and priority assessment:

Phase 1 (Extract Boundaries) ─────────────────────────────────┐
  │                                                             │
  ├──> Phase 4 (Managed Metadata)         [parallel, low dep]  │
  │                                                             │
  ├──> Phase 3 (Config + Profiles)        [parallel, low dep]  │
  │                                                             │
  └──> Phase 2 (Workflow Schema)          [highest value]  ────┤
                                                                │
       Phase 5 (Artifact Schemas)         [incremental]   ─────┤
                                                                │
       Phase 6 (Verification Contract)    [incremental]   ─────┤
                                                                │
       Phase 7 (Delivery Adapters)        [only if needed] ────┘

Phase	Complexity	Value	Risk	LOC affected	Recommendation
1. Boundaries	Medium	High (prerequisite)	Low	~2700 refactor	First, incremental moves with re-exports
2. Schema	High	Highest (core problem)	Medium	~2500 across 5 files	Start with map-efficient, then map-plan
3. Config	Medium	High (UX)	Low	~500 new + hooks migration	Parallel with Phase 2, after Phase 1
4. Metadata	Low-medium	Medium (upgrade safety)	Low	~200 new + 4 functions	Parallel with Phase 3, after Phase 1
5. Artifacts	Low-medium	Medium (reliability)	Low	~300 (activate + add schemas)	Incremental, P0 = step_state + blueprint
6. Verification	Low	Medium (formalization)	Low	~200 (unified schema)	After Phases 2-5
7. Adapters	High	Low (one platform now)	Medium	~1500+ (hooks reimpl)	Only if multi-platform demand exists

Quick wins available today (before full refactor):

Activate orphaned STATE_ARTIFACT_SCHEMA in schemas.py → connect to map_orchestrator.py
Add BLUEPRINT_SCHEMA to schemas.py
Add generated_by + mapify_version to agent frontmatter (agents already have YAML frontmatter)
Move 60 hardcoded hook constants to a hook_defaults.json that hooks read at startup
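For the last quick win, a hook could load its constants along the lines of the sketch below. The key names and values in BUILTIN_HOOK_DEFAULTS are invented placeholders, not the actual hook constants, and load_hook_defaults is a hypothetical helper.

```python
import json
from pathlib import Path

# Placeholder keys and values for illustration; the real constants live in the hook scripts.
BUILTIN_HOOK_DEFAULTS = {
    "hook_timeout_seconds": 10,
    "max_injected_chars": 4000,
}


def load_hook_defaults(path: Path) -> dict:
    """Merge hook_defaults.json over built-in values; a missing file is not an error."""
    merged = dict(BUILTIN_HOOK_DEFAULTS)
    try:
        overrides = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return merged  # keep current behavior for projects without the file
    except json.JSONDecodeError as exc:
        raise ValueError(f"{path}: invalid JSON at line {exc.lineno}: {exc.msg}") from exc
    if not isinstance(overrides, dict):
        raise ValueError(f"{path}: expected a JSON object at the top level")
    merged.update(overrides)
    return merged
```

Falling back to built-in values when the file is absent keeps existing installs working unchanged while the constants migrate.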

Acceptance Criteria

This refactor is successful when all of the following are true.

Platform

workflow sequencing is defined in workflow schemas, not only in markdown command prompts
runtime policies are loaded from code/config instead of duplicated across prompts
.map/<branch>/ artifacts have documented producers, consumers, and schemas

Product

a new project can still run mapify init and get a working Claude Code setup
advanced users can tune behavior through config
mapify doctor can explain state, drift, and missing artifacts clearly
mapify upgrade can refresh managed files without acting as a blind overwrite

Engineering

src/mapify_cli/__init__.py is reduced to CLI wiring and installer composition
orchestrator logic is testable without parsing prompt files
core workflow state transitions have unit tests
generated templates have snapshot or fixture-based regression tests

Risks

R1. Over-engineering the schema layer

If the workflow schema becomes too abstract, it will be hard to use and hard to debug.

Mitigation:

start with map-plan and map-efficient
keep the schema minimal
move only stable orchestration rules into the schema

R2. Breaking the current Claude experience

MAP already works today. The refactor must not damage the default experience.

Mitigation:

keep Claude adapter as the reference implementation
ship migration incrementally
preserve existing generated file locations in the first major rollout

R3. Two systems temporarily coexisting

During migration there may be prompt-defined and schema-defined workflow logic in parallel.

Mitigation:

make one workflow at a time authoritative
document migration status clearly
add tests that fail if both sources diverge

R4. Config injection bloating prompts

If project config context and rules are too large, they consume token budget and degrade agent reasoning quality.

Mitigation:

enforce a size limit on context (e.g., 50KB, following OpenSpec's precedent)
warn in mapify doctor when context exceeds recommended size
inject rules only into matching phases, not globally
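The mitigations above might translate into checks like the following sketch. The 50KB figure is taken from this section and interpreted here as 50 * 1024 bytes; the 80% warning ratio, the function names, and the rules layout (a mapping from phase name to rule strings) are assumptions.

```python
CONTEXT_LIMIT_BYTES = 50 * 1024  # 50KB limit, interpreted as KiB for this sketch
WARN_RATIO = 0.8                 # assumed threshold for a doctor warning


def check_context_size(context: str, limit: int = CONTEXT_LIMIT_BYTES) -> str:
    """Classify injected context as 'ok', 'warn', or 'error' by UTF-8 byte size."""
    size = len(context.encode("utf-8"))
    if size > limit:
        return "error"
    if size > limit * WARN_RATIO:
        return "warn"
    return "ok"


def rules_for_phase(rules: dict[str, list[str]], phase: str) -> list[str]:
    """Return only the rules registered for the given phase, never the global set."""
    return list(rules.get(phase, []))
```

Measuring bytes rather than characters keeps the check conservative for non-ASCII context, since token cost tends to grow with encoded size.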

R5. Schema complexity creep

If workflow schemas support too many conditional features (run_if, retry policies, gate types), they become hard to debug.

Mitigation:

start with a minimal schema format covering only map-plan and map-efficient
add features incrementally based on real usage patterns
provide mapify workflow show <id> for visual inspection of phase graphs

Open Questions

Should project config live at mapify.yaml or .map/config.yaml?
- OpenSpec uses openspec/config.yaml inside its own directory. Codebase analysis supports .map/config.yaml — all branch-scoped artifacts already live under .map/, and hooks already read from .map/<branch>/. Co-locating config with artifacts reduces path management complexity.
Should workflow schemas be user-overridable, or only built-in in the first iteration?
- OpenSpec allows custom schemas from day one via schema fork. Given that MAP has 10 hardcoded phases with complex interactions, start built-in only but design the YAML format to be extensible.
Should managed metadata be embedded directly in generated markdown files, or tracked in a sidecar manifest?
- Codebase analysis shows agents already have YAML frontmatter (adding fields is trivial), but commands have minimal frontmatter and hooks have none. A hybrid approach is recommended: embed in .md frontmatter, use a sidecar .map/manifest.json for .py/.json files.
Should human-readable .map artifacts remain markdown-first, or should some become JSON-first with generated markdown views?
- Codebase analysis shows task_plan_<branch>.md is parsed via regex to extract ST-XXX IDs and status. Converting it to JSON-first with a markdown view would eliminate fragile regex parsing.
Should /map-review remain mostly prompt-driven longer than /map-plan and /map-efficient, or migrate with them?
Should MAP adopt a skills-based delivery model alongside commands/agents for forward compatibility with emerging AI tool standards?
- OpenSpec has moved to skills as the primary delivery surface. MAP already ships skills in templates/skills/. Consider expanding skills coverage during Phase 7.
Should context and rules injection be validated against token budgets to prevent prompt bloat?
- OpenSpec limits context to 50KB. MAP should define a similar limit and warn when approaching it.
Should mapify doctor validate delivery adapter state against capability metadata (similar to OpenSpec's schema validate)?
Should the legacy progress.md / WorkflowState system be removed or migrated as part of Phase 5?
- Codebase analysis confirmed two state systems coexist (progress.md + step_state.json). Removing the legacy system would simplify the artifact contract but requires verifying no consumers depend on it.
Should verification_results_<branch>.json and repo_insight_<branch>.json move from .map/ root into .map/<branch>/ for location consistency?
- All other branch-scoped artifacts use .map/<branch>/ directories. The current location-in-filename pattern is an inconsistency that complicates cleanup and discovery.

Recommended First Slice

The first implementation slice should be:

Extract boundaries — decompose __init__.py (2692 LOC) into delivery/, config/, cli_ui.py with backward-compatible re-exports
Quick wins — activate orphaned STATE_ARTIFACT_SCHEMA, add BLUEPRINT_SCHEMA, add generated_by to agent frontmatter
Workflow schemas for map-efficient first (largest rule surface, triple redundancy), then map-plan
Project config — .map/config.yaml with context injection, per-phase rules, profile selection; hooks read from config via load_map_config() with fallback to current defaults
Managed file metadata — replace shutil.copy2() with copy_managed_file() that injects mapify_version + template_hash; rewrite upgrade() with drift detection
Schema introspection CLI — mapify workflow list / mapify workflow show

This sequence provides the highest leverage while keeping the current product usable throughout the migration. Phases 3-4 can proceed in parallel after Phase 1.
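For the managed file metadata item, one way copy_managed_file() and drift detection could work for markdown files with frontmatter is sketched below. The function bodies and the has_drifted helper are illustrative assumptions; only the mapify_version and template_hash field names come from this spec.

```python
import hashlib
from pathlib import Path

MANAGED_KEYS = ("mapify_version", "template_hash")


def _hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def copy_managed_file(template: Path, dest: Path, version: str) -> None:
    """Copy a template, recording its version and content hash in the frontmatter."""
    text = template.read_text(encoding="utf-8")
    if not text.startswith("---\n"):
        # Files without frontmatter would need the sidecar manifest instead.
        raise ValueError(f"{template}: no frontmatter to embed metadata in")
    meta = f"mapify_version: {version}\ntemplate_hash: {_hash(text)}\n"
    dest.write_text("---\n" + meta + text[4:], encoding="utf-8")


def has_drifted(dest: Path) -> bool:
    """True if the file was edited after generation or carries no metadata."""
    meta: dict[str, str] = {}
    kept: list[str] = []
    for line in dest.read_text(encoding="utf-8").splitlines(keepends=True):
        key, sep, value = line.partition(":")
        if sep and key in MANAGED_KEYS and key not in meta:
            meta[key] = value.strip()  # strip metadata lines before re-hashing
        else:
            kept.append(line)
    if "template_hash" not in meta:
        return True
    return meta["template_hash"] != _hash("".join(kept))
```

An upgrade command built on this could overwrite files where has_drifted() is false and ask before touching the rest, instead of overwriting blindly.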

Appendix A: `.map/` Artifact Inventory (as of codebase analysis)

Full inventory of artifacts discovered in the current codebase. This serves as the baseline for Phase 5 schema formalization.

Branch-scoped artifacts (`.map/<branch>/`)

Artifact	Format	Producer	Key consumers	Schema exists	Semantics
`step_state.json`	JSON	`map_orchestrator.py`	orchestrator, 4 hooks, step_runner	Doc only (orphaned schema in `schemas.py`)	Overwrite (atomic)
`blueprint.json`	JSON	`/map-plan` (Write tool)	orchestrator `set_waves()`	No	Overwrite (once)
`task_plan_<branch>.md`	MD + XML tags	`/map-plan` (Write tool)	orchestrator, step_runner (regex parse)	No	In-place status updates
`spec_<branch>.md`	MD	`/map-plan` (Write tool)	task-decomposer, `/map-tdd`	No	Overwrite + append
`findings_<branch>.md`	MD + XML tags	research-agent, `/map-plan`	`/map-efficient`	No	Append
`ralph_state.json`	JSON	`ralph_state.py`	Ralph loop orchestrator	No (dataclass only)	Overwrite (atomic)
`final_verification.json`	JSON	final-verifier agent	`ralph_state.py`, orchestrator	No	Overwrite
`verification-summary.md`	MD	`map_step_runner.py`	orchestrator (resume), `/map-check`	No	Overwrite
`code-review-NNN.md`	MD (numbered)	`/map-efficient` (Write tool)	orchestrator (resume briefing)	No	Create new
`plan-review-NNN.md`	MD (numbered)	`map_step_runner.py`	review handoff	No	Create new
`pr-draft.md`	MD	`map_step_runner.py`	handoff bundle	No	Overwrite
`qa-001.md`	MD	`ensure_human_artifacts()` / agent	handoff bundle	No	Overwrite
`known-issues.json`	JSON	`map_step_runner.py`	`/map-check`, diagnostics	No	Append
`active-issues.json`	JSON	`map_step_runner.py`	handoff bundle, `/map-review`	No	Overwrite (replace)
`<stage>-gate.json`	JSON	`map_step_runner.py`	handoff bundle	No	Overwrite
`diagnostics.json`	JSON	`diagnostics.py`	`workflow-context-injector.py`	No	Overwrite
`run-summary.json`	JSON	`diagnostics.py`	`/map-check`	No	Overwrite
`runs/<ts>/RESULTS.md`	MD	`/map-check` (Write tool)	Audit	No	Create new
`transcript-*.md`	MD	`pre-compact-save-transcript.py`	Manual review	No	Create new

Root-level artifacts (`.map/`)

Artifact	Format	Producer	Schema exists	Notes
`verification_results_<branch>.json`	JSON	`verification_recorder.py`	Yes (validated on write)	Location inconsistency: branch in filename, not path
`repo_insight_<branch>.json`	JSON	`repo_insight.py`	Yes (validated on write)	Same location inconsistency
`progress.md`	MD + YAML frontmatter	`workflow_state.py`	No	Legacy — coexists with `step_state.json`

Appendix B: Claude Code Coupling Map (as of codebase analysis)

Coupling assessment for Phase 7 planning. Shows what is Claude-specific vs tool-agnostic.

Claude-specific (0% portable)

Component	Files	Coupling points
`__init__.py` delivery functions	12+ functions	`.claude/` paths hardcoded, `selected_ai = "claude"`
`settings.json`	Template	Claude Code proprietary schema, `$schema` URL, `permissions.deny/allow` format, hook event names
Hook scripts (8 files)	`templates/hooks/`	`CLAUDE_PROJECT_DIR` env var, `PreToolUse`/`PostToolUse` event model, JSON stdout protocol
Agent frontmatter	11 `.md` files	`model: sonnet/opus/haiku` (Claude-specific tier names)
Skill format	`SKILL.md` files	`allowed-tools`, `${CLAUDE_PLUGIN_ROOT}`, Claude hook sections
`is_map_initialized()`	`__init__.py:549`	Checks for `.claude/agents`, `.claude/commands`, `.claude/settings.json`

Tool-agnostic (100% portable)

Component	Files	Notes
`.map/` scripts	`map_orchestrator.py`, `map_step_runner.py`, `diagnostics.py`, `map_utils.py`	Only 1 comment mentions Claude
State management	`workflow_state.py`, `ralph_state.py`, `dependency_graph.py`, `schemas.py`	Pure logic
`ralph-loop-config.json`	Template	Circuit breaker thresholds
Static analysis	`analyze.sh` + language handlers	Language-specific, not platform-specific
Agent/command body text	Prompt content (sans frontmatter)	~90% portable (`AskUserQuestion()` is Claude-specific)
`verification_recorder.py`	State module	Pure logic
`repo_insight.py`	State module	1 line excludes `.claude` from scan

Appendix C: OpenSpec Reference

This spec was informed by analysis of OpenSpec (v2025+, OPSX workflow). Key documents referenced:

docs/concepts.md — artifact model, delta specs, schemas
docs/workflows.md — fluid actions, dependency graph, verification dimensions
docs/customization.md — project config, custom schemas, context injection
docs/opsx.md — OPSX architecture, component model, iteration model
docs/cli.md — CLI surface, schema management commands
docs/migration-guide.md — legacy detection, cleanup, config migration
docs/supported-tools.md — multi-tool delivery, capability metadata
openspec/changes/add-global-install-scope/ — install scope design (global vs project)
