Status: Proposed
Date: 2026-03-23
Target: map-framework next major platform iteration
MAP Framework has a strong execution model: specialized agents, explicit quality gates, branch-scoped workflow artifacts, and a usable /map-plan -> /map-efficient -> /map-check loop.
The main weakness is not execution quality. The main weakness is that too much product behavior is encoded in large prompt files, large installer codepaths, and hand-maintained runtime conventions. This makes the framework harder to evolve, harder to customize, and harder to extend beyond its current Claude Code-first delivery model.
This spec proposes a platform refactor that keeps MAP's workflow philosophy intact while making the system:
- declarative instead of prompt-hardcoded
- configurable instead of template-edited
- adapter-driven instead of delivery-coupled
- schema-validated instead of convention-only
- upgradable without blind overwrite
The current platform has five structural problems.
Core behavior currently lives across:
- installer/runtime code in `src/mapify_cli/__init__.py`
- orchestrator logic in `src/mapify_cli/templates/map/scripts/map_orchestrator.py`
- long command prompts such as `.claude/commands/map-efficient.md`
This makes workflow changes expensive because behavior is split between Python, markdown prompts, hooks, and docs.
mapify init currently installs one concrete delivery shape: .claude/agents, .claude/commands, .claude/hooks, .claude/settings.json, .claude/workflow-rules.json.
That works, but it means the platform layer and the delivery layer are effectively the same thing. As a result:
- prompt structure becomes platform architecture
- update behavior is file-copy based
- future support for other assistants would multiply template debt
Today, most meaningful customization still means editing shipped templates or project files directly.
MAP needs a first-class configuration layer for:
- workflow profile selection
- repo-specific context
- verification commands and gates
- risk thresholds
- research/guard defaults
- delivery toggles
The .map/<branch>/ directory already carries the right idea: persistent workflow state and artifacts that survive context resets.
However, only part of that contract is formally defined. The platform still relies on a mix of:
- JSON files with partial schema coverage
- markdown files with implicit structure
- runtime assumptions spread across hooks, commands, and scripts
This makes resume, migration, CI integration, and tooling interoperability less reliable than they should be.
mapify upgrade refreshes shipped files, but it does not provide a robust managed-update model:
- generated files do not have a complete metadata contract
- drift detection is limited
- local customization and generated content are not clearly separated
- upgrade behavior is closer to "refresh templates" than "reconcile managed artifacts"
The refactor must achieve the following goals.
- MAP workflows must be representable as data, not only as prompt prose.
- The workflow runtime must be independent from the specific assistant integration layer.
- Users must be able to tune behavior through config files, not prompt surgery.
- Workflow state, planning outputs, validation results, and handoff artifacts must have explicit schemas and lifecycle rules.
- Generated files must carry metadata so `mapify upgrade` can detect drift, refresh safely, and distinguish generated content from local edits.
The refactor must preserve:
- the MAP planning/execution/validation loop
- branch-scoped persistence
- state-gated orchestration
- guardrails and hooks
- the current default Claude Code experience
OpenSpec (Fission-AI/OpenSpec) is an open-source spec framework that solves a related but different problem: it structures what to build before code is written. Analyzing OpenSpec reveals mature patterns that MAP can adopt for its platform layer without changing its workflow philosophy.
Schema-driven artifact DAGs. OpenSpec defines workflow schemas as YAML files where artifacts form a directed acyclic graph with topological sort. Artifact state is determined by filesystem existence (BLOCKED → READY → DONE), not runtime flags. Dependencies are enablers, not gates. This is a proven implementation of the pattern MAP proposes in Section 1, and MAP should adopt a similar model for workflow phase graphs.
Schema management CLI. OpenSpec provides schema init, schema fork, schema validate, schema which with clear precedence resolution (CLI flag → change metadata → project config → default). MAP should consider an analogous mapify workflow inspect / mapify workflow validate surface.
Context injection via structured config. OpenSpec's config.yaml separates context: (injected into all artifacts) from rules: (per-artifact only). This is more reliable than freeform CLAUDE.md blocks because injection is deterministic. MAP's project config (Section 3) should adopt a similar structured injection model.
Multi-tool delivery with capability metadata. OpenSpec supports 20+ AI assistants via per-tool adapter metadata that declares path patterns and scope support per delivery surface (skills, commands). This validates MAP's adapter layer design (Section 5) and provides a concrete reference for capability metadata shape.
Three-dimensional verification. OpenSpec's /opsx:verify validates Completeness, Correctness, and Coherence as separate dimensions. MAP already has Monitor/Predictor agents doing similar work, but formalizing verification dimensions in the artifact contract would improve consistency.
Migration with legacy detection. OpenSpec detects legacy files, strips managed markers while preserving user content, and supports --force for CI. MAP's upgrade model (Section 9) should adopt similar precision.
| Aspect | OpenSpec | MAP |
|---|---|---|
| Core problem | Agree on what to build before code | Execute how to build with quality gates |
| Workflow model | Fluid actions, no phase gates | Structured phases with hard-stop gates |
| Agent architecture | Single AI assistant, human-guided | Multi-agent (Actor, Monitor, Predictor, Evaluator) |
| Quality enforcement | Verify is advisory, doesn't block | Monitor valid=false is a hard stop |
| Artifact scope | Specs, proposals, design docs | Execution state, blueprints, verification results |
| Branch awareness | Change folders (non-git) | .map/<branch>/ (git-branch-scoped) |
| Orchestration | Dependency graph on artifacts | State-machine on execution phases |
- Fluid/no-gate philosophy. MAP's strength is structured gates. Making them advisory would weaken quality enforcement.
- Delta spec model. Irrelevant to MAP — MAP doesn't manage behavioral specifications, it manages execution artifacts.
- Archive lifecycle. MAP's branch-scoped artifacts already have clear lifecycle via git branch deletion.
This spec does not propose:
- replacing MAP's agent architecture
- removing Claude Code support or making it secondary
- building multi-assistant support in the first implementation phase
- redesigning every prompt from scratch
- changing the user-facing philosophy from structured execution to freeform specing
Workflow sequencing, gates, retries, and artifact requirements should be declared in data and interpreted by runtime code.
Prompt files should describe the local reasoning task. They should not be the primary storage location for orchestration rules.
Generated files should state:
- what generated them
- which workflow/profile they belong to
- which template version they came from
- whether local divergence exists
- workflow sequencing: workflow schema + runtime state
- project defaults: project config
- branch execution state: `.map/<branch>/`
- delivery-specific rendering: adapter
Introduce a workflow schema format, stored under a new directory such as:
```
src/mapify_cli/workflows/
  core/
    plan.yaml
    efficient.yaml
    check.yaml
    resume.yaml
```
Each workflow schema should define:
- workflow id
- description
- entry command
- required artifacts
- phases
- phase ordering and dependency graph
- conditional phase rules
- retry policy
- stuck-recovery policy
- per-wave guard policy
- final verification policy
- resume semantics
OpenSpec demonstrates that a dependency graph with topological sort and filesystem-based state detection is both sufficient and robust. MAP should adopt a similar model for phase ordering.
Key design choices:
- Phases form a DAG, not a linear sequence. Some phases can run in parallel when their dependencies are satisfied.
- Phase state is derived from `.map/<branch>/` artifacts, not runtime flags. A phase is DONE when its output artifact exists and validates. A phase is READY when all predecessor phases are DONE. A phase is BLOCKED otherwise.
- Dependencies are hard gates in MAP (unlike OpenSpec, where they are enablers). If Monitor returns `valid=false`, the dependent phases cannot proceed.
- Schema resolution precedence: workflow schema defaults → project config overrides → CLI flags (highest priority).
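To make the state-detection rule concrete, here is a minimal sketch of deriving phase state from artifact existence. The `<produces>.json` filename convention is an assumption for illustration, and artifact schema validation is omitted:

```python
from pathlib import Path

# Sketch only: a phase's output artifact is assumed to live at
# <branch_dir>/<produces>.json; validating the artifact against its
# schema (required for true DONE status) is omitted here.
def phase_state(phase: dict, phases_by_id: dict, branch_dir: Path) -> str:
    def done(p: dict) -> bool:
        return (branch_dir / f"{p['produces']}.json").exists()

    if done(phase):
        return "DONE"
    if all(done(phases_by_id[dep]) for dep in phase.get("requires", [])):
        return "READY"
    return "BLOCKED"
```

Because state is recomputed from the filesystem on every invocation, resume after a context reset requires no in-memory bookkeeping.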
Following OpenSpec's schema validate / schema which pattern, MAP should expose:
- `mapify workflow list` — show available workflow schemas
- `mapify workflow show <id>` — display phase graph and dependencies
- `mapify workflow validate <id>` — check schema for circular dependencies, missing artifact producers, unreachable phases
Example shape:
```yaml
id: map-efficient
version: 1
entry: /map-efficient
requires:
  - blueprint
  - task_plan
phases:
  - id: research
    run_if: "risk == high or existing_files >= 3"
    actor: research-agent
    produces: repo_insight
    requires: []
  - id: actor
    actor: actor
    produces: implementation
    requires: [research]   # skipped if research was skipped
  - id: monitor
    actor: monitor
    produces: review_result
    requires: [actor]
    retry:
      max_attempts: 5
      stuck_recovery_at: 3
    gate: hard             # valid=false blocks downstream
guards:
  per_wave:
    checks:
      - tests
      - lint
final_verification:
  run_if: "has_high_risk_subtasks or subtask_count >= 5"
state_detection:
  method: artifact_existence   # check .map/<branch>/ for outputs
```

This layer is the biggest structural upgrade in the entire refactor.
Move orchestration into a dedicated runtime package, for example:
```
src/mapify_cli/workflow_engine/
  loader.py
  validator.py
  graph.py
  executor.py
  state_store.py
  policies.py
```
Responsibilities:
- load workflow schemas
- validate them
- compute next executable phase
- enforce dependencies and retries
- persist and restore workflow state
- evaluate guard and recovery policies
- expose machine-readable commands for prompts/hooks
The existing map_orchestrator.py should become a thin CLI wrapper around this engine, not the canonical implementation.
Introduce a project config file, for example `mapify.yaml` or `.map/config.yaml`.
Initial supported configuration should include:
- active profile
- enabled workflows
- repo context (injected into all agent prompts)
- per-phase rules (injected only into matching phase prompts)
- verification commands
- research defaults
- risk thresholds
- guard policy tuning
- MCP defaults
- delivery settings
- language preference
OpenSpec demonstrates that separating context (global) from rules (per-artifact) provides reliable, deterministic injection:
- `context` appears in every agent prompt — tech stack, conventions, constraints.
- `rules` appear only in the matching phase prompt — phase-specific guidance.
- Both are injected programmatically, not via freeform markdown that agents may or may not read.
This is more reliable than MAP's current approach of embedding context in CLAUDE.md blocks.
Following OpenSpec's pattern:
- Unknown phase IDs in `rules` should generate warnings.
- `context` should have a size limit (e.g., 50KB) to avoid prompt bloat.
- Invalid YAML should be reported with line numbers.
- `mapify doctor` should validate config and report issues.
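These validation rules could be sketched as follows. The phase-ID set and the exact warning strings are illustrative assumptions, not a shipped mapify API:

```python
# Illustrative sketch of config validation. KNOWN_PHASES and the 50KB
# limit are assumptions for this example.
KNOWN_PHASES = {"research", "actor", "monitor", "predictor", "evaluator"}
MAX_CONTEXT_BYTES = 50 * 1024

def validate_config(config: dict) -> list[str]:
    warnings = []
    # Unknown phase IDs in `rules` produce warnings, not errors.
    for phase_id in config.get("rules", {}):
        if phase_id not in KNOWN_PHASES:
            warnings.append(f"unknown phase id in rules: {phase_id!r}")
    # Oversized `context` bloats every agent prompt.
    context = config.get("context", "")
    if len(context.encode("utf-8")) > MAX_CONTEXT_BYTES:
        warnings.append(f"context exceeds {MAX_CONTEXT_BYTES} bytes")
    return warnings
```

Warnings rather than hard failures keep a slightly stale config usable while still surfacing drift in `mapify doctor`.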
Example:
```yaml
profile: core
context: |
  Python CLI project.
  Prefer deterministic shell commands.
  Keep .claude templates in sync with shipped templates.
rules:
  research:
    - Check for existing patterns before proposing new abstractions
    - Read tests first to understand expected behavior
  monitor:
    - Verify template sync between .claude/ and src/mapify_cli/templates/
    - Check for OWASP top-10 vulnerabilities
verification:
  checks:
    - make check
    - pytest tests/test_template_sync.py -v
policies:
  research_threshold_existing_files: 3
  final_verify_subtask_threshold: 5
  actor_monitor_max_retries: 5
delivery:
  assistant: claude
  hooks: true
  mcp: essential
language: ru   # optional: agent response language
```

Add built-in profiles to reduce default complexity:
- `core`: /map-plan, /map-efficient, /map-check, /map-review
- `full`: all advanced workflows (map-fast, map-debug, map-tdd, map-release, map-learn)
- `custom`: explicit workflow selection
Profiles should affect both generated delivery files and runtime defaults.
OpenSpec's config profile provides an interactive wizard with current-state summary before changes. MAP should follow a similar pattern:
- `mapify config profile` — interactive selection with preview
- `mapify config profile core` — fast preset switch
- `mapify update` — regenerate delivery files for current profile
OpenSpec allows per-workflow toggle within profiles. MAP should support similar granularity:
```yaml
profile: custom
workflows:
  - map-plan
  - map-efficient
  - map-check
  - map-tdd        # user opted into TDD
  # map-debug etc. — omitted, not installed
```

This reduces generated file count and cognitive load for users who don't need every workflow.
This gives MAP a cleaner on-ramp and a cleaner upgrade path.
Split delivery generation from workflow semantics.
Example package shape:
```
src/mapify_cli/delivery/
  base.py
  claude.py
```
OpenSpec's multi-tool delivery model demonstrates that each adapter needs explicit capability metadata declaring what delivery surfaces it supports and where files go.
MAP should define adapter capability metadata even if initially only Claude is supported:
```python
from dataclasses import dataclass

@dataclass
class DeliveryCapability:
    tool_id: str              # e.g., "claude"
    supports_agents: bool     # .claude/agents/
    supports_commands: bool   # .claude/commands/
    supports_hooks: bool      # .claude/hooks/
    supports_skills: bool     # .claude/skills/ (emerging standard)
    config_files: list[str]   # ["settings.json", "workflow-rules.json"]
    path_pattern: str         # ".claude/{surface}/{name}"
```

This metadata allows:
- Programmatic validation of adapter completeness
- Future adapters to declare their capabilities upfront
- `mapify doctor` to verify delivery state against capability declarations
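A doctor-style check against capability declarations might look like the following sketch. `missing_surfaces` is a hypothetical helper, and the defaults shown are illustrative, not the final metadata values:

```python
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class DeliveryCapability:  # mirrors the shape proposed above
    tool_id: str
    supports_agents: bool = True
    supports_commands: bool = True
    supports_hooks: bool = True
    supports_skills: bool = False
    config_files: list = field(default_factory=list)
    path_pattern: str = ".claude/{surface}/{name}"

# Hypothetical doctor check: every surface the adapter claims to support
# must actually exist on disk as a directory.
def missing_surfaces(cap: DeliveryCapability, root: Path) -> list[str]:
    surfaces = [("agents", cap.supports_agents),
                ("commands", cap.supports_commands),
                ("hooks", cap.supports_hooks),
                ("skills", cap.supports_skills)]
    missing = []
    for surface, supported in surfaces:
        directory = root / cap.path_pattern.format(surface=surface, name="")
        if supported and not directory.is_dir():
            missing.append(surface)
    return missing
```

Because the check is driven entirely by the metadata, a future adapter gets the same verification for free by declaring its own capabilities.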
OpenSpec distinguishes between skills (richer metadata, cross-tool compatible) and commands (tool-specific). MAP currently uses commands + agents. Consider adding skills support for forward compatibility:
| Surface | Current | Future |
|---|---|---|
| Agents | `.claude/agents/*.md` | adapter-generated |
| Commands | `.claude/commands/*.md` | adapter-generated |
| Hooks | `.claude/hooks/` | adapter-generated |
| Skills | not used | `.claude/skills/map-*/SKILL.md` |
| Config | `.claude/settings.json` | adapter-generated |
The Claude adapter should generate:
- `.claude/agents/*`
- `.claude/commands/*`
- `.claude/hooks/*`
- `.claude/settings.json`
- `.claude/workflow-rules.json`
The workflow engine should not care where those files end up. It should only expose:
- workflow definitions
- prompt inputs
- state transitions
- artifact contracts
This keeps MAP Claude-first without making the platform Claude-dependent.
Every generated file should include structured metadata, either in frontmatter or JSON fields.
Minimum metadata contract:
- `generated_by: mapify-cli`
- `generated_by_version`
- `template_id`
- `template_version`
- `workflow_id` or `profile`
- `managed: true`
This enables:
- drift detection
- targeted upgrade
- selective refresh
- future migration tooling
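A generated command file carrying this contract might look like the following frontmatter (all version values are illustrative):

```yaml
---
description: Execute the MAP efficient workflow
generated_by: mapify-cli
generated_by_version: "3.0.0"
template_id: commands/map-efficient
template_version: 7
profile: core
managed: true
---
```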
Formalize .map/<branch>/ into explicitly versioned artifacts.
Required machine-readable artifacts should include:
- `step_state.json`
- `blueprint.json`
- `verification_results_<branch>.json`
- `repo_insight_<branch>.json`
- `active-issues.json`
Recommended human-readable artifacts should include:
- `task_plan_<branch>.md`
- `findings_<branch>.md`
- `verification-summary.md`
- `pr-draft.md`
- `code-review-XXX.md`
Each artifact must define:
- producer
- consumer
- schema/version
- lifecycle
- overwrite/append semantics
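Expressed as data, one such artifact contract could look like this sketch (field names are suggestions, not a final format):

```yaml
artifact: blueprint
file: blueprint.json
schema_version: 1
producer: /map-plan
consumers: [map_orchestrator, /map-check]
lifecycle: branch            # deleted with the branch directory
write_semantics: overwrite   # never appended
```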
Move these rules out of prompt text and into code/config:
- when research is required
- when final verification runs
- when guard checks run
- when stuck recovery triggers
- when predictor participates
- what counts as a blocking failure
- when resume should re-open or continue a subtask
Prompts should consume policy outputs, not embed policy logic themselves.
Formalize verification as a structured contract, not just "run tests".
OpenSpec's /opsx:verify checks three independent dimensions. MAP's /map-check should adopt a similar structured model:
| Dimension | What it validates | MAP equivalent |
|---|---|---|
| Completeness | All subtasks done, all artifacts present | Task plan checkboxes, artifact existence |
| Correctness | Implementation matches intent, tests pass | make check, Monitor review |
| Coherence | Design decisions reflected in code, patterns consistent | Predictor analysis, cross-file consistency |
verification_results_<branch>.json should report per-dimension:
```json
{
  "branch": "feature-x",
  "timestamp": "2026-03-23T12:00:00Z",
  "dimensions": {
    "completeness": {
      "status": "pass",
      "subtasks_done": 5,
      "subtasks_total": 5,
      "artifacts_present": ["blueprint", "task_plan", "step_state"]
    },
    "correctness": {
      "status": "pass",
      "checks_passed": ["make check", "pytest"],
      "checks_failed": []
    },
    "coherence": {
      "status": "warning",
      "issues": ["Design mentions event-driven but implementation uses polling"]
    }
  },
  "overall": "pass_with_warnings",
  "blocking": false
}
```

Add a real migration path:
- `mapify doctor`
- `mapify upgrade`
- `mapify migrate`
OpenSpec's migration model demonstrates several best practices MAP should adopt:
Legacy detection. Automatically detect older generated files by presence of outdated metadata, missing version fields, or legacy file layouts. OpenSpec scans for legacy command directories and marker blocks — MAP should scan for pre-metadata generated files.
Selective cleanup. Remove only OpenSpec-managed files, preserve user content. MAP should:
- Strip managed content markers from generated files
- Preserve any user edits outside managed blocks
- Move deprecated files to a backup location before deletion
Non-interactive mode. Support --force flag for CI environments:
- `mapify upgrade --force` — skip confirmation prompts
- `mapify migrate --force` — auto-accept all migration steps
User attention items. When files require manual review (e.g., custom hooks that conflict with new schema), report them clearly without auto-deleting.
Scope-aware cleanup. If install scope or profile changes, track last-applied state and clean up stale files at previous locations.
Migration responsibilities:
- detect older generated file versions
- detect legacy artifact layouts
- reconcile old and new metadata
- regenerate safely when files are untouched
- warn when managed files were locally edited
- track last-applied scope/profile for drift detection
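Marker-based selective cleanup could be sketched like this. The `mapify:managed` marker syntax is hypothetical; the point is that everything outside managed blocks survives untouched:

```python
import re

# Sketch of marker-based cleanup. The "mapify:managed" comment markers
# are an assumed convention, not the current on-disk format.
MANAGED_BLOCK = re.compile(
    r"<!-- mapify:managed:start -->.*?<!-- mapify:managed:end -->\n?",
    re.DOTALL,
)

def strip_managed_blocks(text: str) -> str:
    """Remove generated blocks; preserve all user-authored content."""
    return MANAGED_BLOCK.sub("", text)
```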
Create internal module boundaries without changing end-user behavior.
Deliverables:
- dedicated workflow engine package
- dedicated delivery package
- dedicated config package
- reduced responsibilities inside `src/mapify_cli/__init__.py`
Acceptance criteria:
- existing commands still work
- current tests still pass
- installer behavior remains unchanged for Claude projects
Current state. src/mapify_cli/__init__.py is 2692 lines (41% of all source) and contains at least four distinct concerns: CLI wiring + UI widgets (~200 LOC), delivery/installer logic (~1500 LOC including 8 create_*_content() agent generators), config generation (~445 LOC for settings/MCP/permissions), and health checks (~370 LOC for doctor/check). State management is already well-separated into five modules (workflow_state.py, ralph_state.py, dependency_graph.py, verification_recorder.py, schemas.py — 1957 LOC total). The workflow engine (map_orchestrator.py + map_step_runner.py, 2556 LOC) lives in templates/map/scripts/ as runtime code copied to user projects. An empty validation/ directory already exists.
Proposed decomposition of __init__.py:
```
src/mapify_cli/
  __init__.py          → ~300 lines (CLI wiring, Typer commands, re-exports)
  cli_ui.py            → select_with_arrows(), StepTracker, show_banner()
  health.py            → check(), doctor(), get_project_health()
  delivery/
    __init__.py
    installer.py       → init() orchestration logic
    agent_generator.py → create_*_content() functions (8 agent generators)
    file_copier.py     → create_reference/command/skill/hook_files()
  config/
    __init__.py
    settings.py        → settings merge logic
    mcp.py             → MCP config generation
    permissions.py     → configure_global_permissions()
```
Risks specific to this phase: Refactoring a 2692-line file requires incremental moves with re-exports to avoid breaking existing imports. Existing tests import from mapify_cli top-level — backward-compatible re-exports are essential. The templates/map/scripts/ runtime engine stays in-place (it is copied to user projects and runs independently).
Implement the workflow schema format and runtime loader.
Deliverables:
- schema format
- validator
- runtime phase graph
- migrated schemas for `map-plan` and `map-efficient`
Acceptance criteria:
- `map-plan` and `map-efficient` execute from workflow definitions
- orchestration decisions no longer rely on prompt-only phase descriptions
Triple redundancy problem. The same orchestration rules are currently defined in three independent places that must be kept in sync manually:
- `map_orchestrator.py` — phase ordering (`STEP_PHASES` dict, lines 99-110; `STEP_ORDER` list, lines 113-122; `TDD_STEP_ORDER`, lines 125-136), step instructions (lines 412-458), retry default (`max_retries=5`, line 278), skippable steps set (line 972).
- `map-efficient.md` — re-describes the same phases and ordering in prose, embeds retry thresholds (5 main / 3 stuck trigger / 2 guard rework), run conditions (RESEARCH: "3+ existing files OR risk=high"), gate definitions (per-wave: tests + linter), confidence threshold (0.7), stuck recovery policy.
- `workflow-gate.py` — independently defines `EDITING_PHASES = {"ACTOR", "APPLY", "TEST_WRITER"}` (line 32) for phase-to-permission mapping.
What should become schema data (currently spread across 5+ files):
| Rule | Current location | Current format |
|---|---|---|
| 10 phase definitions and ordering | `map_orchestrator.py:99-136` | Python dict + list |
| Phase-to-agent mapping | `map-efficient.md` + `map_orchestrator.py:412-458` | Prose + Python strings |
| Retry thresholds (5/3/2) | `map-efficient.md:456-504` | Prose |
| RESEARCH run_if condition | `map-efficient.md:318` + `map_orchestrator.py:436` | Prose + instruction text |
| Per-wave gate checks | `map-efficient.md:525-575` | Prose |
| Phase edit permissions | `workflow-gate.py:32` | Python constant |
| Skippable steps set | `map_orchestrator.py:972` | Python constant |
| TDD step ordering variant | `map_orchestrator.py:125-136` | Python list |
| Test runner detection | `map_step_runner.py:820-842` | Python if/elif chain |
| Guard rework max | `map-efficient.md:571` | Prose |
| Confidence threshold | `map-efficient.md:676` | Prose |
What should remain as code: state machine transitions (wave computation, subtask rotation), DAG algorithms (wave splitting by file conflicts), resume detection (regex parsing of task_plan), constraint enforcement (scope_glob fnmatch), circuit breaker formula.
What should remain as prompts: interview dimensions in map-plan.md, agent reasoning instructions, UX formatting, context distillation rules, troubleshooting guidance.
Recommendation: Start with map-efficient — it has the largest rule surface. Migrate map-plan second — it has less runtime orchestration, more reasoning guidance. The prompt files should reference parameters from the schema via injection (e.g., {{max_retries}}) rather than duplicating values.
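Parameter injection into prompt templates could be as simple as this sketch. The `{{name}}` placeholder syntax follows the example above; failing loudly on unknown parameters keeps prompts and schemas in sync:

```python
import re

# Sketch: substitute {{param}} placeholders in a prompt with values from
# the workflow schema. Unknown placeholders raise rather than silently
# surviving into the rendered prompt.
def inject_params(prompt: str, params: dict) -> str:
    def repl(match: re.Match) -> str:
        key = match.group(1)
        if key not in params:
            raise KeyError(f"prompt references unknown schema parameter: {key}")
        return str(params[key])
    return re.sub(r"\{\{(\w+)\}\}", repl, prompt)
```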
Add mapify.yaml or .map/config.yaml and built-in profiles.
Deliverables:
- config parser and validator
- default config generation
- profile selection
- runtime policy overrides
Acceptance criteria:
- users can change workflow/profile behavior without editing templates
- installer and runtime both read the same config source
Current state: 7 configuration sources, zero unification. Configuration is scattered across workflow-rules.json (workflow selection triggers), settings.json (permissions + hooks), ralph-loop-config.json (circuit breaker thresholds), skill-rules.json (skill triggers), 3 Python hooks (~60 hardcoded constants), CLAUDE.md (behavioral rules in prose), and the init() CLI (4 flags, 0 profiles).
Key gaps identified:
| Gap | Impact |
|---|---|
| No `config.yaml` | Users must edit 5+ files to customize MAP |
| No language/framework detection | init installs identical config for Python/Go/TS — yet settings.json contains go vet, gofmt, kubectl in allow rules |
| Hook constants are fully hardcoded | `EDITING_PHASES`, `DANGEROUS_FILE_PATTERNS`, `SAFE_PATH_PREFIXES`, `READONLY_COMMANDS` — all Python constants, user must edit source |
| No per-workflow overrides | `ralph-loop-config.json` is global; map-fast cannot have `max_total_iterations: 10` while map-efficient has 50 |
| No profiles | init installs ALL: 11 agents, 13 commands, 8 hooks, 3 skills — no core/full/custom |
| No CLAUDE.md for user projects | Template CLAUDE.md is MAP's own dev instructions, not a user-project skeleton |
Existing infrastructure that helps: repo_insight.py already detects project language and can generate suggested verification commands. This can feed auto-detected defaults into config.yaml.
Migration path for hooks: Hooks should read overridable constants from config.yaml via a shared load_map_config() utility, with fallback to current hardcoded defaults. This preserves behavior for users who don't create a config file.
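A minimal sketch of such a `load_map_config()` utility follows; the config keys and defaults shown are illustrative, and PyYAML is imported lazily so the hardcoded fallback works without it:

```python
from pathlib import Path

# Illustrative defaults mirroring today's hardcoded hook constants.
DEFAULTS = {
    "editing_phases": ["ACTOR", "APPLY", "TEST_WRITER"],
}

def load_map_config(project_dir: Path) -> dict:
    """Read overridable constants from config, falling back to defaults."""
    config = dict(DEFAULTS)
    for candidate in ("mapify.yaml", ".map/config.yaml"):
        path = project_dir / candidate
        if path.exists():
            import yaml  # PyYAML; lazy so defaults work without it installed
            config.update(yaml.safe_load(path.read_text()) or {})
            break
    return config
```

Hooks calling this at startup behave identically for users with no config file, which preserves today's behavior by construction.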
Add metadata to generated files and upgrade detection.
Deliverables:
- metadata contract
- drift detection
- selective upgrade behavior
- clearer reporting in `mapify doctor`
Acceptance criteria:
- generated files show origin/version
- `mapify upgrade` can distinguish stale files from customized files
Current state: almost no metadata, upgrade is blind overwrite. All four generation functions (create_agent_files, create_command_files, create_hook_files, create_config_files) use shutil.copy2() — pure file copy with zero metadata injection. No generated_by marker, no content hash, no mapify_version stamp exists in any generated file. The upgrade() command (line 2519) calls the same create_*_files() functions and blindly overwrites everything. Only two exceptions have merge logic: .mcp.json (preserved) and settings.local.json (merged with create_or_merge_project_settings_local()).
What exists today:
| Category | Has frontmatter | Has version | Has generated_by | Upgrade behavior |
|---|---|---|---|---|
| Agents (11 .md) | YAML: name, description, model | Manual (e.g., 2.4.0) | No | Blind overwrite |
| Commands (13 .md) | YAML: description only | No | No | Blind overwrite |
| Hooks (7 .py) | No (docstring only) | No | No | Blind overwrite |
| Config (3 .json) | N/A | No (schema version only) | No | Blind overwrite |
Implementation approach: Replace shutil.copy2() with a copy_managed_file() function that injects metadata. For .md files: YAML frontmatter fields. For .py files: header comment block. For .json files: _mapify_metadata key or sidecar .map/manifest.json. The upgrade() function should compare template_hash values: overwrite only when the source template changed AND the user did not modify the file.
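A sketch of `copy_managed_file()` for the `.md` case follows. Merging with an agent file's existing frontmatter is omitted for brevity, and the field names follow the metadata contract proposed in this spec:

```python
import hashlib
from pathlib import Path

# Sketch: copy a template while injecting managed-file metadata.
# Real implementation would merge with existing YAML frontmatter
# instead of prepending a second block.
def copy_managed_file(src: Path, dst: Path, template_id: str, version: str) -> None:
    body = src.read_text()
    frontmatter = (
        "---\n"
        "generated_by: mapify-cli\n"
        f"generated_by_version: {version}\n"
        f"template_id: {template_id}\n"
        f"template_hash: {hashlib.sha256(body.encode()).hexdigest()}\n"
        "managed: true\n"
        "---\n"
    )
    dst.write_text(frontmatter + body)
```

Storing `template_hash` at copy time is what makes the drift comparison below possible: the hash of the body on disk can later be checked against the hash recorded at generation.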
Drift detection logic:

```python
import hashlib

def classify_drift(content_without_metadata: str, metadata, current_mapify_version: str) -> str:
    current_hash = hashlib.sha256(content_without_metadata.encode()).hexdigest()
    if current_hash != metadata.template_hash:
        return "user_modified"  # warn, don't overwrite
    if metadata.mapify_version < current_mapify_version:
        return "stale"          # safe to overwrite
    return "up_to_date"         # skip
```
Extend schema coverage to all critical branch artifacts.
Deliverables:
- explicit schema/version for each artifact
- validators for write/read paths
- migration helpers for old artifacts
Acceptance criteria:
- resume/check/doctor rely on validated artifacts
- artifact consumers fail with actionable errors, not silent drift
Current state: 25 artifacts discovered, only 2 have JSON Schema validation. The full artifact inventory:
| Scope | Machine-readable (JSON) | Human-readable (MD) | With schema | Without schema |
|---|---|---|---|---|
| `.map/<branch>/` | 10 | 8 | 0 | 18 |
| `.map/` root | 2 | 1 | 2 | 1 |
| `.map/logs/`, `.map/scripts/` | — | — | — | — |
Critical gaps:
- `step_state.json` — central artifact, no runtime validation. Read by orchestrator + 4 hooks + step_runner. `STATE_ARTIFACT_SCHEMA` exists in `schemas.py` but is never imported or used — it is an orphaned schema definition.
- `blueprint.json` — no schema at all. Created by `/map-plan`, consumed by orchestrator for DAG computation. Expected keys (`subtasks[].id`, `.dependencies`, `.affected_files`) are described only in prompt text.
- Two state systems coexist. `progress.md` (root-level, `workflow_state.py`) is legacy; `step_state.json` (branch-scoped, `map_orchestrator.py`) is current. Both remain in active code.
- Location inconsistency. `verification_results_<branch>.json` and `repo_insight_<branch>.json` live at `.map/` root (branch in filename), while all other branch-scoped artifacts use `.map/<branch>/` directories.
- No lifecycle cleanup. No artifact has a deletion mechanism. `.map/<branch>/` accumulates files indefinitely.
Priority for schema formalization:
| Priority | Artifact | Reason |
|---|---|---|
| P0 | `step_state.json` | Central, 6+ consumers, schema already written but orphaned |
| P0 | `blueprint.json` | DAG computation input, no schema at all |
| P1 | `final_verification.json` | Gate decision artifact |
| P1 | `task_plan_<branch>.md` | Custom XML-like format with regex parsing |
| P2 | `active-issues.json`, `known-issues.json` | Simple, but no schema |
Quick win: Activate the orphaned STATE_ARTIFACT_SCHEMA — connect it to map_orchestrator.py with jsonschema.validate() calls on read/write paths.
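Wiring that in could look like the following sketch. The schema body here is abbreviated and hypothetical; the real `STATE_ARTIFACT_SCHEMA` lives in `schemas.py`:

```python
import json
from pathlib import Path

# Abbreviated, hypothetical stand-in for the real STATE_ARTIFACT_SCHEMA.
STATE_ARTIFACT_SCHEMA = {
    "type": "object",
    "required": ["branch", "phase"],  # the real schema declares more keys
}

def read_step_state(path: Path) -> dict:
    """Load and validate step_state.json; fail with an actionable error."""
    data = json.loads(path.read_text())
    try:
        import jsonschema  # third-party; full validation when available
        jsonschema.validate(data, STATE_ARTIFACT_SCHEMA)
    except ImportError:
        # Minimal fallback: check required keys only.
        missing = [k for k in STATE_ARTIFACT_SCHEMA["required"] if k not in data]
        if missing:
            raise ValueError(f"step_state.json missing keys: {missing}")
    return data
```

Applying the same wrapper on the write path prevents any producer from ever persisting an invalid state file.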
Formalize the verification model.
Deliverables:
- structured verification result schema
- per-dimension reporting (completeness, correctness, coherence)
- improved `mapify doctor` output with config validation
Acceptance criteria:
- `/map-check` reports structured results per verification dimension
- `mapify doctor` validates project config and reports issues with line numbers
- verification results are machine-readable JSON
Current state: three-tier verification model already exists, but is not formalized into unified dimensions. The three tiers:
- Monitor agent (per-subtask, during execution) — 10-dimension quality model with structured JSON verdict (`valid`, `issues[]`, `passed_checks`, `failed_checks`). Decision rules: CRITICAL → always `valid=false`; ≥2 HIGH → `valid=false`. Hard stop.
- final-verifier agent (whole-task, after all subtasks) — adversarial verification with confidence scoring (threshold 0.7). Outputs JSON with `passed`, `confidence`, `evidence`, `root_cause`. Verdicts: COMPLETE / RE_DECOMPOSE / ESCALATE.
- `/map-check` command (orchestration) — runs final-verifier + tests + linter + git status. Produces `verification-summary.md`, `<stage>-gate.json`, `active-issues.json`, `pr-draft.md`, `runs/<timestamp>/RESULTS.md`.
Mapping to Completeness/Correctness/Coherence:
| Tier | Completeness | Correctness | Coherence |
|---|---|---|---|
| Monitor | dims 5,8,10 (testability, deps, research) | dims 1,2,4,6 (correctness, security, perf, CLI) | dims 3,7,9 (quality, maintainability, docs) |
| final-verifier | subtasks done, criteria met | tests pass, edge cases, ground truth | integration between subtasks |
| `/map-check` | step_state all COMPLETE | tests+lint pass, verifier APPROVED | acceptance criteria from plan |
Four incompatible result formats currently coexist: verification_results_<branch>.json (recipe-based), final_verification.json (confidence-based), verification-summary.md (human-readable verdict), <stage>-gate.json (gate verdict). A unified umbrella schema with per-dimension reporting would consolidate these.
Hardcoded policy values to extract to config: confidence threshold (0.7) in final-verifier.md, Monitor decision rules ("CRITICAL → always valid=false", "≥2 HIGH → valid=false") in monitor.md, test timeout (300s) in map_step_runner.py:858, output truncation (5000 chars) in map_step_runner.py:863.
Only after the platform boundary is complete.
Deliverables:
- second adapter if desired
- adapter capability metadata model
- adapter test harness
- generated-file snapshots per adapter
Acceptance criteria:
- adding a new delivery target does not require changing workflow semantics
- adapter declares capabilities via metadata, not implicit code paths
Current state: monolithic coupling to Claude Code with zero abstraction layer. There are 73 references to "claude" in `__init__.py`, 12+ functions with `.claude/` hardcoded paths, and 8 hook scripts fully dependent on the Claude Code protocol (PreToolUse/PostToolUse events, `CLAUDE_PROJECT_DIR` env var, JSON stdout response format). `selected_ai = "claude"` is hardcoded (line 2167) with the comment "the only supported AI assistant". No adapter/factory/provider pattern exists.
Natural separation boundary already present:
| Layer | Content | Portability |
|---|---|---|
| .map/ scripts (orchestrator, step runner, diagnostics) | Workflow engine | 100% portable |
| Agent/command body text (prompt content) | Reasoning instructions | ~90% portable |
| Delivery envelope (frontmatter, paths, hooks, config) | Claude Code integration | 0% portable |
What would need per-adapter reimplementation:
| Component | Files affected | Effort |
|---|---|---|
| Directory layout (.claude/ paths) | __init__.py (12+ functions) | High |
| settings.json generation | 3 functions (settings, permissions, settings.local) | High — entirely different format per platform |
| Hook scripts | 8 files, ~700 LOC total | High — protocol completely different per platform |
| Agent file format (frontmatter) | create_agent_files() | Medium — body reusable, envelope differs |
| Command/slash format | create_command_files() | Medium |
| MCP config | 2 functions | Medium |
What is reusable as-is across adapters: `.map/` scripts, `ralph-loop-config.json`, `workflow-rules.json` body content, all state management modules, `schemas.py`, static analysis handlers, planning skill templates.
Minimum viable adapter interface:
```python
from pathlib import Path
from typing import Protocol


class DeliveryAdapter(Protocol):
    tool_id: str

    def create_agents(self, path: Path, agents: list[AgentSpec]) -> int: ...
    def create_commands(self, path: Path, commands: list[CommandSpec]) -> int: ...
    def create_hooks(self, path: Path, hooks: list[HookSpec]) -> int: ...
    def create_config(self, path: Path, config: ProjectConfig) -> int: ...
    def detect_installed(self, path: Path) -> bool: ...
```

Each adapter maps tool-agnostic content (`AgentSpec.prompt_body`) into platform-specific envelopes (`ClaudeAgentEnvelope.model_frontmatter`).
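As a concrete illustration, the protocol above could be satisfied by an adapter along these lines. This is a minimal sketch, not the shipped implementation: the class name `ClaudeDeliveryAdapter`, the `model_hint` field, and the exact frontmatter keys are assumptions for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class AgentSpec:
    """Tool-agnostic agent definition: a name plus the portable prompt body."""
    name: str
    prompt_body: str
    model_hint: str = "default"


class ClaudeDeliveryAdapter:
    """Illustrative adapter: wraps portable prompt bodies in a Claude Code
    envelope (YAML frontmatter) under .claude/agents/."""
    tool_id = "claude-code"

    def create_agents(self, path: Path, agents: list[AgentSpec]) -> int:
        target = path / ".claude" / "agents"
        target.mkdir(parents=True, exist_ok=True)
        for spec in agents:
            # The envelope is the platform-specific part; the body is reusable.
            envelope = f"---\nname: {spec.name}\nmodel: {spec.model_hint}\n---\n"
            (target / f"{spec.name}.md").write_text(envelope + spec.prompt_body)
        return len(agents)

    def detect_installed(self, path: Path) -> bool:
        return (path / ".claude" / "agents").is_dir()
```

A hypothetical second adapter would consume the same `AgentSpec` objects and change only the envelope and directory layout — which is the point of the boundary.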
Based on codebase analysis, the phases have the following dependency structure and priority assessment:
```
Phase 1 (Extract Boundaries) ─────────────────────────────────┐
     │                                                        │
     ├──> Phase 4 (Managed Metadata) [parallel, low dep]      │
     │                                                        │
     ├──> Phase 3 (Config + Profiles) [parallel, low dep]     │
     │                                                        │
     └──> Phase 2 (Workflow Schema) [highest value] ──────────┤
                                                              │
Phase 5 (Artifact Schemas) [incremental] ─────────────────────┤
                                                              │
Phase 6 (Verification Contract) [incremental] ────────────────┤
                                                              │
Phase 7 (Delivery Adapters) [only if needed] ─────────────────┘
```
| Phase | Complexity | Value | Risk | LOC affected | Recommendation |
|---|---|---|---|---|---|
| 1. Boundaries | Medium | High (prerequisite) | Low | ~2700 refactor | First, incremental moves with re-exports |
| 2. Schema | High | Highest (core problem) | Medium | ~2500 across 5 files | Start with map-efficient, then map-plan |
| 3. Config | Medium | High (UX) | Low | ~500 new + hooks migration | Parallel with Phase 2, after Phase 1 |
| 4. Metadata | Low-medium | Medium (upgrade safety) | Low | ~200 new + 4 functions | Parallel with Phase 3, after Phase 1 |
| 5. Artifacts | Low-medium | Medium (reliability) | Low | ~300 (activate + add schemas) | Incremental, P0 = step_state + blueprint |
| 6. Verification | Low | Medium (formalization) | Low | ~200 (unified schema) | After Phases 2-5 |
| 7. Adapters | High | Low (one platform now) | Medium | ~1500+ (hooks reimpl) | Only if multi-platform demand exists |
Quick wins available today (before full refactor):
- Activate orphaned `STATE_ARTIFACT_SCHEMA` in `schemas.py` → connect to `map_orchestrator.py`
- Add `BLUEPRINT_SCHEMA` to `schemas.py`
- Add `generated_by` + `mapify_version` to agent frontmatter (agents already have YAML frontmatter)
- Move 60 hardcoded hook constants to a `hook_defaults.json` that hooks read at startup
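The last quick win could work roughly as follows. A minimal sketch, assuming the hooks share a small loader; the file location under `.map/` and the key names are illustrative, while the fallback values mirror the hardcoded constants named in this spec (300s timeout, 5000-char truncation):

```python
import json
from pathlib import Path

# Fallback values mirror today's hardcoded constants (numbers from this spec).
FALLBACK_DEFAULTS = {
    "test_timeout_seconds": 300,
    "output_truncation_chars": 5000,
}


def load_hook_defaults(project_root: Path) -> dict:
    """Read shared hook constants from .map/hook_defaults.json at startup,
    falling back to built-in values when the file is absent or invalid."""
    config_path = project_root / ".map" / "hook_defaults.json"
    try:
        loaded = json.loads(config_path.read_text())
    except (OSError, json.JSONDecodeError):
        return dict(FALLBACK_DEFAULTS)
    # Unknown keys are ignored; missing keys keep their fallback values.
    merged = dict(FALLBACK_DEFAULTS)
    merged.update({k: v for k, v in loaded.items() if k in FALLBACK_DEFAULTS})
    return merged
```

Because invalid or missing config degrades to the current behavior, this change is safe to ship before the full config layer exists.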
This refactor is successful when all of the following are true.
- workflow sequencing is defined in workflow schemas, not only in markdown command prompts
- runtime policies are loaded from code/config instead of duplicated across prompts
- `.map/<branch>/` artifacts have documented producers, consumers, and schemas
- a new project can still run `mapify init` and get a working Claude Code setup
- advanced users can tune behavior through config
- `mapify doctor` can explain state, drift, and missing artifacts clearly
- `mapify upgrade` can refresh managed files without acting as blind overwrite
- `src/mapify_cli/__init__.py` is reduced to CLI wiring and installer composition
- orchestrator logic is testable without parsing prompt files
- core workflow state transitions have unit tests
- generated templates have snapshot or fixture-based regression tests
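The state-transition criterion can be made concrete with small, prompt-free tests. A minimal sketch, assuming a hypothetical transition table — the real state machine lives in `step_state.json` / `map_orchestrator.py` and may define different states:

```python
# Hypothetical step states and allowed transitions — the real state machine
# lives in step_state.json / map_orchestrator.py and may differ.
ALLOWED_TRANSITIONS = {
    "PENDING": {"RUNNING"},
    "RUNNING": {"COMPLETE", "FAILED"},
    "FAILED": {"RUNNING"},  # retry path
    "COMPLETE": set(),      # terminal
}


def transition(current: str, target: str) -> str:
    """Return the new state, raising on any transition the table forbids."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

Tests like these run without any prompt files or LLM calls, which is exactly what the criterion asks for.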
If the workflow schema becomes too abstract, it will be hard to use and hard to debug.
Mitigation:
- start with `map-plan` and `map-efficient`
- keep the schema minimal
- move only stable orchestration rules into the schema
MAP already works today. The refactor must not damage the default experience.
Mitigation:
- keep Claude adapter as the reference implementation
- ship migration incrementally
- preserve existing generated file locations in the first major rollout
During migration there may be prompt-defined and schema-defined workflow logic in parallel.
Mitigation:
- make one workflow at a time authoritative
- document migration status clearly
- add tests that fail if both sources diverge
If project config context and rules are too large, they consume token budget and degrade agent reasoning quality.
Mitigation:
- enforce a size limit on
context(e.g., 50KB, following OpenSpec's precedent) - warn in
mapify doctorwhen context exceeds recommended size - inject
rulesonly into matching phases, not globally
If workflow schemas support too many conditional features (run_if, retry policies, gate types), they become hard to debug.
Mitigation:
- start with a minimal schema format covering only
map-planandmap-efficient - add features incrementally based on real usage patterns
- provide
mapify workflow show <id>for visual inspection of phase graphs
- Should project config live at `mapify.yaml` or `.map/config.yaml`?
  - OpenSpec uses `openspec/config.yaml` inside its own directory. Codebase analysis supports `.map/config.yaml` — all branch-scoped artifacts already live under `.map/`, and hooks already read from `.map/<branch>/`. Co-locating config with artifacts reduces path management complexity.
- Should workflow schemas be user-overridable, or only built-in in the first iteration?
  - OpenSpec allows custom schemas from day one via `schema fork`. Given that MAP has 10 hardcoded phases with complex interactions, start built-in only but design the YAML format to be extensible.
- Should managed metadata be embedded directly in generated markdown files, or tracked in a sidecar manifest?
  - Codebase analysis shows agents already have YAML frontmatter (adding fields is trivial), but commands have minimal frontmatter and hooks have none. A hybrid approach is recommended: embed in .md frontmatter, use a sidecar `.map/manifest.json` for .py/.json files.
- Should human-readable `.map` artifacts remain markdown-first, or should some become JSON-first with generated markdown views?
  - Codebase analysis shows `task_plan_<branch>.md` is parsed via regex to extract `ST-XXX` IDs and status. Converting it to JSON-first with a markdown view would eliminate fragile regex parsing.
- Should `/map-review` remain mostly prompt-driven longer than `/map-plan` and `/map-efficient`, or migrate with them?
- Should MAP adopt a skills-based delivery model alongside commands/agents for forward compatibility with emerging AI tool standards?
  - OpenSpec has moved to skills as the primary delivery surface. MAP already ships skills in `templates/skills/`. Consider expanding skills coverage during Phase 7.
- Should `context` and `rules` injection be validated against token budgets to prevent prompt bloat?
  - OpenSpec limits context to 50KB. MAP should define a similar limit and warn when approaching it.
- Should `mapify doctor` validate delivery adapter state against capability metadata (similar to OpenSpec's `schema validate`)?
- Should the legacy `progress.md` / `WorkflowState` system be removed or migrated as part of Phase 5?
  - Codebase analysis confirmed two state systems coexist (`progress.md` + `step_state.json`). Removing the legacy system would simplify the artifact contract but requires verifying no consumers depend on it.
- Should `verification_results_<branch>.json` and `repo_insight_<branch>.json` move from `.map/` root into `.map/<branch>/` for location consistency?
  - All other branch-scoped artifacts use `.map/<branch>/` directories. The current location-in-filename pattern is an inconsistency that complicates cleanup and discovery.
The first implementation slice should be:
1. Extract boundaries — decompose `__init__.py` (2692 LOC) into `delivery/`, `config/`, `cli_ui.py` with backward-compatible re-exports
2. Quick wins — activate orphaned `STATE_ARTIFACT_SCHEMA`, add `BLUEPRINT_SCHEMA`, add `generated_by` to agent frontmatter
3. Workflow schemas for `map-efficient` first (largest rule surface, triple redundancy), then `map-plan`
4. Project config — `.map/config.yaml` with context injection, per-phase rules, profile selection; hooks read from config via `load_map_config()` with fallback to current defaults
5. Managed file metadata — replace `shutil.copy2()` with `copy_managed_file()` that injects `mapify_version` + `template_hash`; rewrite `upgrade()` with drift detection
6. Schema introspection CLI — `mapify workflow list` / `mapify workflow show`
This sequence provides the highest leverage while keeping the current product usable throughout the migration. Phases 3-4 can proceed in parallel after Phase 1.
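The managed-file step could work roughly like this sketch. The frontmatter field names come from the list above; the truncated-SHA-256 hashing scheme and the exact frontmatter parsing are illustrative assumptions, not the shipped design:

```python
import hashlib
from pathlib import Path


def copy_managed_file(src: Path, dst: Path, mapify_version: str) -> None:
    """Copy a managed template while stamping provenance frontmatter,
    so a later upgrade can detect drift instead of blindly overwriting."""
    body = src.read_text()
    template_hash = hashlib.sha256(body.encode()).hexdigest()[:12]
    stamp = (
        "---\n"
        "generated_by: mapify\n"
        f"mapify_version: {mapify_version}\n"
        f"template_hash: {template_hash}\n"
        "---\n"
    )
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_text(stamp + body)


def has_drifted(installed: Path) -> bool:
    """Re-hash the installed body and compare against the stamped hash."""
    _, frontmatter, body = installed.read_text().split("---\n", 2)
    fields = dict(
        line.split(": ", 1) for line in frontmatter.strip().splitlines()
    )
    return hashlib.sha256(body.encode()).hexdigest()[:12] != fields["template_hash"]
```

With this in place, `upgrade()` can skip or warn on drifted files instead of overwriting user edits.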
Full inventory of artifacts discovered in the current codebase. This serves as the baseline for Phase 5 schema formalization.
| Artifact | Format | Producer | Key consumers | Schema exists | Semantics |
|---|---|---|---|---|---|
| `step_state.json` | JSON | `map_orchestrator.py` | orchestrator, 4 hooks, step_runner | Doc only (orphaned schema in `schemas.py`) | Overwrite (atomic) |
| `blueprint.json` | JSON | `/map-plan` (Write tool) | orchestrator `set_waves()` | No | Overwrite (once) |
| `task_plan_<branch>.md` | MD + XML tags | `/map-plan` (Write tool) | orchestrator, step_runner (regex parse) | No | In-place status updates |
| `spec_<branch>.md` | MD | `/map-plan` (Write tool) | task-decomposer, `/map-tdd` | No | Overwrite + append |
| `findings_<branch>.md` | MD + XML tags | research-agent, `/map-plan` | `/map-efficient` | No | Append |
| `ralph_state.json` | JSON | `ralph_state.py` | Ralph loop orchestrator | No (dataclass only) | Overwrite (atomic) |
| `final_verification.json` | JSON | final-verifier agent | `ralph_state.py`, orchestrator | No | Overwrite |
| `verification-summary.md` | MD | `map_step_runner.py` | orchestrator (resume), `/map-check` | No | Overwrite |
| `code-review-NNN.md` | MD (numbered) | `/map-efficient` (Write tool) | orchestrator (resume briefing) | No | Create new |
| `plan-review-NNN.md` | MD (numbered) | `map_step_runner.py` | review handoff | No | Create new |
| `pr-draft.md` | MD | `map_step_runner.py` | handoff bundle | No | Overwrite |
| `qa-001.md` | MD | `ensure_human_artifacts()` / agent | handoff bundle | No | Overwrite |
| `known-issues.json` | JSON | `map_step_runner.py` | `/map-check`, diagnostics | No | Append |
| `active-issues.json` | JSON | `map_step_runner.py` | handoff bundle, `/map-review` | No | Overwrite (replace) |
| `<stage>-gate.json` | JSON | `map_step_runner.py` | handoff bundle | No | Overwrite |
| `diagnostics.json` | JSON | `diagnostics.py` | `workflow-context-injector.py` | No | Overwrite |
| `run-summary.json` | JSON | `diagnostics.py` | `/map-check` | No | Overwrite |
| `runs/<ts>/RESULTS.md` | MD | `/map-check` (Write tool) | Audit | No | Create new |
| `transcript-*.md` | MD | `pre-compact-save-transcript.py` | Manual review | No | Create new |
| Artifact | Format | Producer | Schema exists | Notes |
|---|---|---|---|---|
| `verification_results_<branch>.json` | JSON | `verification_recorder.py` | Yes (validated on write) | Location inconsistency: branch in filename, not path |
| `repo_insight_<branch>.json` | JSON | `repo_insight.py` | Yes (validated on write) | Same location inconsistency |
| `progress.md` | MD + YAML frontmatter | `workflow_state.py` | No | Legacy — coexists with `step_state.json` |
Coupling assessment for Phase 7 planning. Shows what is Claude-specific vs tool-agnostic.
| Component | Files | Coupling points |
|---|---|---|
| `__init__.py` delivery functions | 12+ functions | `.claude/` paths hardcoded, `selected_ai = "claude"` |
| `settings.json` | Template | Claude Code proprietary schema, `$schema` URL, `permissions.deny/allow` format, hook event names |
| Hook scripts (8 files) | `templates/hooks/` | `CLAUDE_PROJECT_DIR` env var, PreToolUse/PostToolUse event model, JSON stdout protocol |
| Agent frontmatter | 11 .md files | `model: sonnet/opus/haiku` (Claude-specific tier names) |
| Skill format | SKILL.md files | `allowed-tools`, `${CLAUDE_PLUGIN_ROOT}`, Claude hook sections |
| `is_map_initialized()` | `__init__.py:549` | Checks for `.claude/agents`, `.claude/commands`, `.claude/settings.json` |
| Component | Files | Notes |
|---|---|---|
| `.map/` scripts | `map_orchestrator.py`, `map_step_runner.py`, `diagnostics.py`, `map_utils.py` | Only 1 comment mentions Claude |
| State management | `workflow_state.py`, `ralph_state.py`, `dependency_graph.py`, `schemas.py` | Pure logic |
| `ralph-loop-config.json` | Template | Circuit breaker thresholds |
| Static analysis | `analyze.sh` + language handlers | Language-specific, not platform-specific |
| Agent/command body text | Prompt content (sans frontmatter) | ~90% portable (`AskUserQuestion()` is Claude-specific) |
| `verification_recorder.py` | State module | Pure logic |
| `repo_insight.py` | State module | 1 line excludes `.claude` from scan |
This spec was informed by analysis of OpenSpec (v2025+, OPSX workflow). Key documents referenced:
- `docs/concepts.md` — artifact model, delta specs, schemas
- `docs/workflows.md` — fluid actions, dependency graph, verification dimensions
- `docs/customization.md` — project config, custom schemas, context injection
- `docs/opsx.md` — OPSX architecture, component model, iteration model
- `docs/cli.md` — CLI surface, schema management commands
- `docs/migration-guide.md` — legacy detection, cleanup, config migration
- `docs/supported-tools.md` — multi-tool delivery, capability metadata
- `openspec/changes/add-global-install-scope/` — install scope design (global vs project)