Skip to content

Commit 41c5e39

Browse files
authored
feat: automatic ACTOR retry on Monitor failure (#92)
* feat: add monitor_failed() for automatic ACTOR retry on Monitor failure When Monitor returns valid=false, the orchestrator now has a proper state transition back to ACTOR phase. This fixes the deadlock where Monitor detects issues but workflow-gate blocks edits during MONITOR. - Add monitor_failed() for sequential execution mode - Add wave_monitor_failed() for parallel/wave execution mode - Both persist feedback to .map/<branch>/monitor_feedback*.md - Both enforce max_retries with escalation status - Register as CLI commands: monitor_failed, wave_monitor_failed - Update map-efficient.md to use orchestrator instead of pseudocode * chore: formatting cleanup and add DevOpsConf case study link - Add DevOpsConf 2026 case study link to README.md - Minor formatting cleanup in ralph-iteration-logger.py, diagnostics.py, map_step_runner.py * fix: address review findings for monitor_failed Issues found by Monitor/Predictor/Evaluator review: - Fix CLI feedback truncation: join all positional args for multi-word feedback (was silently dropping words after the first) - Fix TDD mode: always requeue only ["2.3", "2.4"] on retry, never re-run TDD pre-steps (2.25/2.26) that were already completed - Fix feedback overwrite: use retry-numbered filenames (monitor_feedback_retry{N}.md) per error-patterns.md lesson - Extract _write_feedback_file() helper to DRY the feedback persistence - Update Phase D wave retry pseudocode to use wave_monitor_failed - Clarify Actor feedback file path comment (explicit, not "automatic") - Add 17 unit tests: TestMonitorFailed (10) + TestWaveMonitorFailed (7) * fix: address review findings and add reopen_for_fixes for post-review edits Review findings (11 items): - Extract _check_retry_limit() helper to DRY up monitor_failed/wave_monitor_failed - Add phase guard to monitor_failed() rejecting non-MONITOR phase calls - Add --feedback CLI flag for symmetric argument passing - Fix SUBTASK_ID placeholder, feedback filename comment, Continue label text - Document serial/wave retry counter invariant in StepState - Update ARCHITECTURE.md CLI command list New feature: reopen_for_fixes() - Transitions COMPLETE → ACTOR so workflow gate unlocks for review fixes - Solves: /map-review finds issues but edits blocked in COMPLETE phase - map-review.md updated with "Workflow Gate Unlock" section Tests: 13 new tests (6 reopen_for_fixes + 4 review findings + 3 integration) * fix: add test quality gate to map-tdd to prevent trivial "2+2=4" tests TEST_WRITER prompt now includes explicit anti-patterns: - Every test must verify semantic behavior, not just single branch execution - Tests must assert multiple consequences (side effects, return values, state) - Prefer scenario-based tests over single-field unit checks - Aim for 60%+ semantic tests TEST_FAIL_GATE now includes quality gate: - Classifies tests as semantic vs trivial - Rejects if >40% tests are trivial single-branch checks - Sends feedback to TEST_WRITER to merge trivial tests into richer scenarios Addresses feedback: 3/9 tests were trivial branch checks in production use. * fix: prevent stale Red-phase comments in TDD test files TEST_WRITER could add temporal comments like "currently FAILS" or "expected to FAIL until fix is applied" that become stale after the Green phase. Actor in code_only mode cannot clean them up. - Add rule to TEST_WRITER prohibiting temporal state-marking comments (map-tdd.md, map-efficient.md, actor.md) - Add TDD Refactor cleanup step after Actor returns, before Monitor, to scan and clean any stale Red-phase markers in test files * chore: remove deprecated skills (map-workflows-guide, map-cli-reference, map-debate) Remove unused skill definitions and their trigger rules from both dev config (.claude/) and shipped templates (src/mapify_cli/templates/). * fix: remove map-debate from expected command templates in tests * chore: remove all map-debate references from docs, tests, and plugin metadata Clean up remaining mentions after command/skill deletion. CHANGELOG entries preserved as historical record. * chore: ignore .map/scripts/.map/ in gitignore
1 parent 56e2aeb commit 41c5e39

52 files changed

Lines changed: 1111 additions & 6229 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.claude-plugin/PLUGIN.md

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -39,7 +39,6 @@ MAP (Modular Agentic Planner) is a cognitive architecture that orchestrates 11 s
3939
- `/map-efficient` — implement features, refactor code, complex tasks with full MAP workflow
4040
- `/map-debug` — debug issues using MAP analysis
4141
- `/map-fast` — small, low-risk changes with minimal overhead
42-
- `/map-debate` — multi-variant synthesis with Opus arbiter
4342
- `/map-review` — comprehensive review of changes
4443
- `/map-check` — quality gates and verification
4544
- `/map-plan` — architecture decomposition

.claude-plugin/marketplace.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@
3333
"features": [
3434
"11 specialized agents (TaskDecomposer, Actor, Monitor, Predictor, Evaluator, Reflector, DocumentationReviewer, Debate-Arbiter, Synthesizer, Research-Agent, Final-Verifier)",
3535
"5 Claude Code hooks for automation",
36-
"12 slash commands (/map-efficient, /map-debug, /map-fast, /map-debate, /map-review, /map-check, /map-plan, /map-task, /map-tdd, /map-release, /map-resume, /map-learn)",
36+
"11 slash commands (/map-efficient, /map-debug, /map-fast, /map-review, /map-check, /map-plan, /map-task, /map-tdd, /map-release, /map-resume, /map-learn)",
3737
"Professional code review integration",
3838
"Cost optimization (40-60% savings)"
3939
],

.claude-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,7 +31,7 @@
3131
"features": [
3232
"11 specialized MAP agents (TaskDecomposer, Actor, Monitor, Predictor, Evaluator, Reflector, DocumentationReviewer, Debate-Arbiter, Synthesizer, Research-Agent, Final-Verifier)",
3333
"5 Claude Code hooks for automation (validate-agent-templates, auto-store-knowledge, enrich-context, session-init, track-metrics)",
34-
"12 slash commands (/map-efficient, /map-debug, /map-fast, /map-debate, /map-review, /map-check, /map-plan, /map-task, /map-tdd, /map-release, /map-resume, /map-learn)",
34+
"11 slash commands (/map-efficient, /map-debug, /map-fast, /map-review, /map-check, /map-plan, /map-task, /map-tdd, /map-release, /map-resume, /map-learn)",
3535
"Chain-of-thought reasoning with sequential-thinking MCP",
3636
"Semantic pattern search with embeddings cache",
3737
"Cost optimization with per-agent model selection (haiku/sonnet/opus)"

.claude/agents/actor.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -276,6 +276,9 @@ Rules:
276276
5. Include edge cases from the spec's `## Edge Cases` section if available in the packet.
277277
6. Use standard test patterns for the project's language and framework.
278278
7. Tests SHOULD fail when run (implementation doesn't exist yet). This is expected.
279+
8. Do NOT add temporal comments about test failure status (e.g., "currently FAILS",
280+
"expected to FAIL", "will PASS once fix is applied"). Write tests as permanent,
281+
clean code — the Red/Green state is transient and must not leak into comments.
279282

280283
Output:
281284
- Test files created via Write tool

0 commit comments

Comments
 (0)