diff --git a/.claude/skills/extracting-plan-dag/SKILL.md b/.claude/skills/extracting-plan-dag/SKILL.md new file mode 100644 index 00000000..4265887a --- /dev/null +++ b/.claude/skills/extracting-plan-dag/SKILL.md @@ -0,0 +1,613 @@ +--- +name: extracting-plan-dag +description: Extract the inter-task dependency structure from a written plan into a queryable, derived artifact. Chains after plan-review-cycle when the execution model warrants it — multi-builder concurrent dispatch, or any project where a Beads-backed orchestrator (e.g. Gas City) is load-bearing. Methodology-focused — task tracker and graph format are adapter points, not assumptions. Detects gc / non-gc projects and adjusts mandatoriness accordingly. +--- + +# Extracting Plan DAG + +## Terminology + +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", +"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this +document are to be interpreted as described in RFC 2119. + +A "**gc project**" is any project where Gas City (or another +Beads-backed orchestrator) is load-bearing — the orchestrator +dispatches work by reading from Beads and atomic-claims issues on +behalf of agents. A "**non-gc project**" is everything else: the plan +markdown plus the Living Document Contract (per `writing-plans-enhanced` +Step 5) is sufficient runtime state. + +## Overview + +Force every plan that warrants it to declare its inter-task dependency +structure explicitly, then make that structure queryable by whatever +coordinator dispatches the work. The plan markdown stays as the source +of truth and archival record; the DAG is a derived view co-located +with the plan; the tracker (Beads or otherwise), if used, is a runtime +cache. + +Edits flow plan → DAG → tracker. Never the other way. + +**Core principles (two asymmetries):** + +1. **Cheap to extract, expensive to reconstruct ad hoc.** A plan's + inter-task dependency structure exists whether or not it's written + down. 
If it's not written down, every coordinator (and every future + reader) reconstructs it from prose, often inconsistently. The cost + of one rigorous extraction beats N noisy re-extractions, multiplied + across every dispatch the plan receives. + +2. **Wrong DAG is worse than no DAG.** A DAG that ships with fabricated + edges, missed dependencies, or superseded nodes promoted as live + gives downstream coordinators false confidence. The asymmetry favors + adversarial review over speed: a half-DAG quietly corrupts dispatch + decisions; the fix surfaces only when builders collide. Err toward + more review. + +## When to use + +- After `plan-review-cycle` completes with zero findings on a plan + written by `writing-plans-enhanced`. +- On any **gc project**, regardless of plan size — Gas City needs every + plan in Beads to dispatch from it. +- On non-gc projects when the execution model is "Parallel agents" + (3+ concurrent builders on independent tracks) per + `writing-plans-enhanced` Step 2. +- On non-gc projects when the execution model is "Subagent-driven" AND + the plan has ≥15 tasks AND ≥1 phase has internal parallelism. +- Whenever an existing plan grows (new phases added, new builders + introduced) such that its execution model changes after initial + authoring. + +## When NOT to use + +- On non-gc projects with execution model "Parallel session" (one + builder, sequential checkpoints). The Living Document Contract's + banners are sufficient runtime state and the DAG is overhead. +- On research / exploratory plans where structure is itself uncertain. + Premature DAG-ification freezes structure that should remain fluid. +- Before `plan-review-cycle` has produced a zero-finding round. A DAG + built from an unreviewed plan inherits the plan's defects with + amplification. +- When the plan lacks `Files:` sections per task. This skill MUST + refuse to produce a DAG in that case (see Core discipline §2). 
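
The conditions in the two lists above can be read as a single gate function. The following is a minimal sketch under stated assumptions: `PlanProfile`, its field names, and the execution-model strings are hypothetical and are not defined by this skill or by `writing-plans-enhanced`. The research/exploratory exclusion is left out because it calls for human judgment rather than a flag.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PlanProfile:
    """Hypothetical summary of a plan; every field name here is illustrative."""
    is_gc: bool                 # Gas City or another Beads-backed orchestrator is load-bearing
    execution_model: str        # "parallel-agents" | "subagent-driven" | "parallel-session"
    task_count: int
    has_parallel_phase: bool    # at least one phase has internal parallelism
    review_clean: bool          # plan-review-cycle ended with a zero-finding round
    all_tasks_have_files: bool  # every task carries a Files: section


def should_extract(plan: PlanProfile) -> Tuple[bool, str]:
    """Return (run, reason), checking the refusal conditions before the gate."""
    if not plan.review_clean:
        return False, "plan-review-cycle has not produced a zero-finding round"
    if not plan.all_tasks_have_files:
        return False, "at least one task lacks a Files: section"
    if plan.is_gc:
        return True, "gc project: extraction applies regardless of plan size"
    if plan.execution_model == "parallel-agents":
        return True, "non-gc with 3+ concurrent builders"
    if plan.execution_model == "subagent-driven":
        if plan.task_count >= 15 and plan.has_parallel_phase:
            return True, "non-gc, subagent-driven, large plan with internal parallelism"
        return False, "non-gc, subagent-driven, below the size/parallelism heuristic"
    return False, "non-gc, single builder: banners are sufficient runtime state"
```

The 15-task figure is treated as a hard cutoff here only for brevity; a real runner would likely surface borderline plans to the user instead.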
+ +## Prerequisites + +The runner MUST verify ALL of the following before any other step: + +1. The plan was written by `writing-plans-enhanced` and carries the + Living Document Contract block, per-phase Execution Status banners, + and `Files:` sections per task. +2. `plan-review-cycle` has been run to completion (a round produced + zero findings) against the **current** plan content. If the plan + has been revised after the prior `plan-review-cycle` run, the + runner MUST require a fresh `plan-review-cycle` pass before + proceeding. +3. The plan's execution strategy (selected in `writing-plans-enhanced` + Step 2) is recorded in or near the plan, so the gate (Phase 1) can + read it. +4. The runner has determined whether the project is gc or non-gc by + checking project markers. Common markers include: a `.gc/` + directory at the repo root, a Beads database file (typically + `.beads/` or a SQLite file referenced in project config), a + `gas-city` or `bd` configuration block in the project's main + config, or an explicit setting in `CLAUDE.md` or the project's + equivalent. The runner MUST cite which marker(s) it found. If no + marker is found and the project type is genuinely ambiguous, the + runner MUST ask the user before proceeding. + +If any prerequisite is missing or ambiguous, the runner MUST STOP and +request remediation. Extracting against an unreviewed plan or a plan +without `Files:` sections produces a half-DAG and false confidence. + +## Core discipline + +A DAG extraction MUST do five things. Skipping any one degrades the +extraction into a sketch that should not be committed. + +1. **Cite every edge.** Every hard edge in the DAG MUST be traceable + to a specific line, paragraph, or quoted phrase in the plan. + Uncited edges are fabricated; the runner MUST delete them. + +2. 
**Classify every `Files:` overlap.** For every pair of tasks that + share a file path in their `Files:` sections, the runner MUST + classify the relationship as either a hard edge (with citation) or + a soft conflict (recorded separately). A pair that ends up in + neither category is a missed dependency. If any task lacks a + `Files:` section, the runner MUST refuse to produce a DAG. + +3. **Detect superseded content.** Before edges are extracted, the + runner MUST scan the plan for `<details>` blocks, "REVISED", + "SUPERSEDED", "Do not execute", strikethroughs, deferred-phase + banners (⏸) that reroute to other plans, and similar markers. + Tasks inside superseded sections are NOT DAG nodes. Missing this + step corrupts the entire downstream graph. + +4. **Bridge phase-level banners to task-level nodes.** The Living + Document Contract specifies banners at the **phase** level + (⬜ / 🚧 / ✅ / ⏸), but DAG nodes are at the **task** level — one + phase contains multiple task nodes. The runner MUST capture, for + each task node, the parent phase's current banner. The tracker + (Phase 8), if used, holds the finer-grained per-task state. The + phase banner is derived from the aggregate of its task states (any + task in 🚧 → banner is 🚧; all tasks in ✅ → banner flips to ✅; an + external blocker on the phase as a whole → banner is ⏸). The plan + banner wins on disagreement — the plan is the archival source of + truth. + +5. **Run a minimum of 4 rounds of adversarial review.** Three canonical + perspectives — Citation auditor, Coverage auditor, + Inference-discipline auditor — each targeting a specific failure mode this + skill exists to prevent (fabricated edges, missed dependencies, + inferred-not-cited ordering). Plus at least one plan-specific + perspective the runner chooses based on the plan's character. + Additional rounds MAY be run; they SHOULD be run when any earlier + round produced material findings or when the plan's content + suggests further perspectives would catch additional issues. See + Phase 7 for round structure and loop rules. + +## Process + +### Phase 1: Gate decision + +The runner MUST evaluate whether this skill should run before any +other step. 
+ +| Project type | Execution strategy | Action | +|---|---|---| +| gc | any | RUN (mandatory; tracker sync is required for the orchestrator to dispatch) | +| non-gc | Parallel agents (3+ concurrent builders on independent tracks) | RUN | +| non-gc | Subagent-driven | RUN if the plan has roughly 15+ tasks AND at least one phase has internal parallelism. Otherwise SKIP. The 15-task threshold is a heuristic, not a hard cutoff — a 12-task plan with heavy fan-out warrants extraction, while a 25-task plan that's strictly sequential does not. | +| non-gc | Parallel session (one builder, sequential checkpoints) | SKIP — banners alone suffice | + +If the runner SKIPs, it MUST add a one-line note inside the plan +("DAG extraction skipped: ; revisit if execution model +changes") so future readers see the gate was considered. + +If the runner RUNs, it proceeds to Phase 2. + +### Phase 2: Extract hard edges + +A hard edge is "task B MUST NOT start until task A's commit lands." +Sources, in descending order of authority: + +1. **Explicit dependency blocks** in the plan (e.g. "Stage Overview: + Dependencies"). Authoritative — the runner copies these verbatim. +2. **Task body sentences** containing "depends on", "must complete + after", "after X is committed", "before Y starts." +3. **Type/symbol creation chains.** If Task A creates a type, + interface, schema element, migration, or shared helper that Task B + references, A blocks B. +4. **Phase prologues/epilogues** that gate batches of tasks. + +The runner MUST NOT promote to a hard edge: + +- "It's cleaner to do X before Y" — preference, not blocker. +- "X may shift after Y" — heads-up, not blocker. +- "X is similar to Y" — relationship, not edge. +- Numeric task ordering within a phase — assume parallel unless the + plan explicitly says otherwise. 
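
One way to keep the sources and exclusions above honest is to tag each candidate edge with where it came from, then drop anything that lacks a citation or rests on a non-blocking basis. This is a sketch only: `CandidateEdge`, the `basis` tags, and `keep_hard_edges` are hypothetical names, and the skill does not prescribe a data model.

```python
from dataclasses import dataclass

# Hypothetical tags for the four accepted sources of a hard edge.
HARD_BASES = {"explicit-block", "task-sentence", "symbol-chain", "phase-gate"}


@dataclass(frozen=True)
class CandidateEdge:
    source: str         # task whose commit must land first
    target: str         # task that must not start before then
    basis: str          # where the edge came from (accepted tag or anything else)
    citation: str = ""  # plan line ref or quoted phrase; empty when none was found


def keep_hard_edges(candidates):
    """Split candidates into (kept, dropped): kept edges are cited and blocking."""
    kept, dropped = [], []
    for edge in candidates:
        cited = bool(edge.citation.strip())
        if edge.basis in HARD_BASES and cited:
            kept.append(edge)
        else:
            dropped.append(edge)  # preference, numeric order, or uncited
    return kept, dropped


# Illustrative input with made-up task IDs and a made-up citation.
candidates = [
    CandidateEdge("A", "B", "task-sentence", 'line 120: "B depends on A"'),
    CandidateEdge("B", "C", "numeric-order"),  # listed in order, never stated
    CandidateEdge("A", "C", "symbol-chain"),   # plausible, but no citation yet
]
kept, dropped = keep_hard_edges(candidates)
```

Keeping the dropped list, rather than discarding it silently, gives the later review rounds something to audit.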
+ +For each edge, the runner MUST record: +- source task ID +- target task ID +- a citation: line number, section reference, or quoted phrase from + the plan justifying the edge + +If a citation cannot be produced, the edge is not real and MUST be +discarded. + +### Phase 3: Extract soft conflicts + +A soft conflict is "tasks A and B touch the same file but neither +depends on the other." Soft conflicts are NOT edges in the DAG. They +are separate metadata used at dispatch time to prevent parallel +builders from serializing into merge conflicts. + +For each task, the runner MUST list every file path under its +`Files:` section (Create / Modify / Test). For each file, the runner +MUST collect the set of tasks that touch it. Any pair within that set +without a hard edge between them is a soft conflict. + +The runner MUST record soft conflicts as a separate table, NOT as +edges in the graph. Coordinators treat them as mutual-exclusion locks +at dispatch time, not as ordering constraints. + +If any task lacks a `Files:` section, the runner MUST STOP and refuse +to produce a DAG. The fix belongs upstream in `writing-plans-enhanced`, +not here. + +### Phase 4: Extract per-node metadata + +For each task, the runner MUST collect: + +- **`priority`** — security / correctness / quality / cleanup +- **`blast_radius`** — single-file / package / codebase-wide +- **`kind`** — code / research / design-decision / review-gate +- **`effort`** — if the plan provides it; otherwise omit +- **`external_blockers`** — references to other plans, manual approvals, + upstream events outside this plan's scope +- **`parent_phase`** — the phase this task belongs to (so the DAG can + associate the task with the phase whose banner governs it) +- **`parent_phase_banner`** — current Execution Status banner of the + parent phase (⬜ / 🚧 / ✅ / ⏸), read from the plan markdown + +Per-node metadata is NOT used for edge construction. 
It is used by the +coordinator to decide WHICH ready node to dispatch next, not WHEN it +becomes ready. + +### Phase 5: Detect plan-structural hazards + +The runner MUST re-read the plan looking specifically for the following hazards: + +- **Superseded sections** (Core discipline §3). Tasks inside `<details>` blocks marked "SUPERSEDED" or similar are excluded from the graph + entirely. +- **Mass-rename / freeze events.** Tasks described as "large blast + radius", "every file that imports X", "must execute after all other + tasks in this batch." Flag as freeze points; coordinators serialize + against them. +- **External plan handoffs.** Phrases like "implementation plan is in + another document" or "see proposal doc" mean that phase is a single + external-reference node, not its inline task list. The runner MUST + NOT inline external task lists. +- **Non-task tasks.** Research timeboxes, design gates, manual review + approvals. Tag with `kind:research`, `kind:design-decision`, or + `kind:review-gate` (the Phase 4 `kind` values) so coordinators + do not dispatch them as code work. +- **Banner state.** Per Phase 4, capture each phase's current Execution + Status banner. + +### Phase 6: Render the DAG artifact + +The runner MUST write the DAG to `-dag.md` (e.g. +`docs/superpowers/plans/2026-04-08-mcp-tools-plan-dag.md`). Co-location +keeps plan and DAG paired in directory listings, code review diffs, +and any tooling that walks the plans directory. + +The artifact MUST contain: + +1. A scope statement: "models inter-task ordering only; intra-task + ordering (TDD steps, sub-step sequencing) is not modeled." +2. Every edge cited back to the plan (line number or quoted phrase). +3. The soft-conflicts table. +4. The per-node metadata table including each node's parent phase and + parent-phase banner state. +5. A topological-layers view showing which tasks fan out together at + each layer. +6. Freeze events and external handoffs listed explicitly. +7. An "excluded from graph" section for superseded, invalidated, and + resolved-by-prerequisite tasks, each with a cited reason. +8. A pointer to the plan's Living Document Contract noting that the + DAG records the parent-phase banner per node, that fine-grained + per-task state lives in the tracker (if Phase 8 was performed), and + that the plan banner wins on disagreement. 
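
The topological-layers view in item 5 above can be derived mechanically from the hard edges. The following sketch uses Kahn's algorithm, peeling off all zero-in-degree nodes at once so each peel becomes one layer. The function name and the task IDs in the example are illustrative assumptions, not part of the skill.

```python
from collections import defaultdict


def topological_layers(nodes, edges):
    """Group tasks into layers; each layer depends only on earlier layers.

    nodes: iterable of task IDs. edges: iterable of (source, target) hard edges.
    Raises ValueError on an unknown task ID or a cycle.
    """
    indegree = {n: 0 for n in nodes}
    children = defaultdict(list)
    for src, dst in edges:
        if src not in indegree or dst not in indegree:
            raise ValueError(f"edge references unknown task: {src} -> {dst}")
        children[src].append(dst)
        indegree[dst] += 1

    layers = []
    placed = 0
    layer = sorted(n for n, d in indegree.items() if d == 0)
    while layer:
        layers.append(layer)
        placed += len(layer)
        next_layer = []
        for node in layer:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_layer.append(child)
        layer = sorted(next_layer)  # sorted for deterministic output

    if placed != len(indegree):
        raise ValueError("cycle detected: the extracted edges do not form a DAG")
    return layers


# Illustrative only: A and B fan out together, C waits on both, D waits on C.
example = topological_layers(
    ["A", "B", "C", "D"],
    [("A", "C"), ("B", "C"), ("C", "D")],
)
```

Soft conflicts are deliberately absent from the input: they are dispatch-time locks, not ordering constraints, so two tasks in the same layer may still need to be serialized by the coordinator.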
+ +Format choice: + +- **Mermaid** is the default human-facing format because it renders + inline on GitHub and is readable in plain text. Most projects + SHOULD use Mermaid unless they have a specific reason not to. +- **Graphviz/DOT** is acceptable for projects that already render DOT + elsewhere and want a single rendering toolchain. +- **Plain structured text** (YAML/JSON only, no diagram) is acceptable + when the artifact is consumed primarily by tooling and the human + view comes from the tracker. + +Whatever the human-facing format, the artifact MUST also contain a +machine-readable form (YAML or JSON sidecar, or a fenced code block +beneath the diagram) that the tracker adapter (Phase 8) reads. The +two MUST be derived from the same source data — divergence between +the diagram and the structured form is a defect. + +### Phase 7: Adversarial review (minimum 4 rounds, until zero findings) + +The first-pass DAG is wrong. The runner MUST re-read the artifact +adversarially. + +Run these rounds sequentially, documenting findings at each: + +**Round 1 — Citation auditor.** Audit every edge in the graph. Can +each one cite a specific plan line, section, or quoted phrase that +justifies it? If a citation cannot be produced, the edge is fabricated +and MUST be deleted. Walk the entire graph; do not skip "obvious" +edges. + +**Round 2 — Coverage auditor.** Re-read every `Files:` section in the +plan. For each file that appears in more than one task, verify the +relationship is captured either as a hard edge (with citation) or a +soft conflict. Pairs that appear in neither are missed dependencies +and MUST be added. Also re-scan for superseded sections, external +handoffs, and freeze events; any that were missed in Phase 5 MUST be +captured now. + +**Round 3 — Inference-discipline auditor.** Walk the graph hunting for +edges the runner inferred from numeric ordering, narrative flow, or +"obvious sequence" rather than from a plan citation. 
Numbered tasks +are siblings unless the plan says otherwise. A → B → C MUST appear +only if the plan explicitly orders them. Strike any edge whose +justification reduces to "they're listed in this order." + +**Round 4 — Plan-specific perspective (runner-chosen).** Rounds 1-3 +cover known-in-general failure modes. This plan has its own character +— security-heavy, schema-heavy, frontend-heavy, methodology-novel, +cross-plan-coupled, something else — and that character has its own +failure modes the canonical rounds will not catch. The runner MUST +choose a perspective specifically relevant to what this plan actually +contains and review from it. + +Requirements for the Round 4 perspective choice: + +- MUST be a perspective not already covered by Rounds 1-3. +- MUST be specifically relevant to THIS plan, not a generic auditor + template. If the plan is auth-heavy, "security gate auditor" is + legitimate; if the plan is pure refactoring, it isn't. +- MUST be named and described explicitly in the DAG artifact under a + heading like `### Round 4 — [chosen perspective] — [N findings + applied]`, so future readers can see the reasoning. +- SHOULD be concrete enough to produce findings. "General quality + pass" is too vague; "cross-plan handoff fidelity to the external + Stage 3 plan" is actionable. + +**Loop rule (applies to all rounds).** If any round produces material +findings, the runner MUST re-run every round in sequence after applying +fixes. Fixes can surface issues earlier rounds missed or introduce new +issues those rounds would have caught. Exit only when a full pass +through every round (1-3 canonical + Round 4 + any additional rounds +the runner elected to run) produces zero material findings. + +**Additional rounds (5+) — encouraged when warranted.** 4 is the floor, +not a ceiling. Run further rounds if the plan has unusual structural +risk, cross-plan dependencies, or a freeze event with broad scope. 
+Each additional round MUST be named and described like Round 4 and +MUST NOT duplicate a prior round's lens. + +### Phase 8: Sync to a queryable substrate + +Mandatoriness depends on project type and execution model. + +| Project type & execution model | Phase 8 status | +|---|---| +| gc (any execution model) | MANDATORY — Gas City reads from Beads to dispatch work; the orchestrator is non-functional without sync | +| non-gc, "Parallel agents" (3+ concurrent builders) | RECOMMENDED — cross-phase ready-queue queries pay back the sync cost | +| non-gc, "Subagent-driven" (≥15 tasks AND parallelism) | OPTIONAL — banner system suffices for most cases; sync only if cross-plan visibility or finer-grained queries are wanted | +| non-gc, "Parallel session" or below the gate threshold | N/A — Phase 1 should have skipped this skill entirely | + +The DAG → tracker step is tool-specific. This skill defines the adapter +contract; it does not specify the tracker. + +When sync is performed, the adapter MUST: + +- Create one issue per node, keyed `-` (deterministic + so re-runs are idempotent). +- Encode hard edges as blocker dependencies in the tracker's native + format. +- Encode soft conflicts as `mutex:` labels on each side of + every conflict pair. +- Encode per-node metadata as labels (`priority:`, + `blast_radius:`, `kind:`, etc.). +- **Encode parent-phase banner state on each node**: ⬜ → open; + 🚧 → in-progress (with claim timestamp + branch if available); + ✅ → closed-shipped with the shipping SHA; ⏸ → blocked, with the + prose unblock condition AND the link from the plan's banner. The + tracker holds per-task state at finer granularity; the parent phase + banner is recorded per node so the tracker can render either view. +- Create already-closed anchor issues for prerequisite work outside + this plan's scope (shipped phases of upstream plans, completed + prerequisites) so cross-plan dependency queries still resolve + correctly. 
+- Mark superseded and invalidated nodes as closed-on-creation with a + reason field. + +The adapter MUST NOT: + +- Propagate tracker edits back to the DAG or plan. Authority flows + plan → DAG → tracker, never the other way. +- Invent dependencies the DAG didn't declare. +- Skip the closed-anchor pattern for prerequisites — silent gaps in + the dependency graph become silent gaps in `ready`-queue queries. + +Sync MUST be idempotent: re-running the skill regenerates tracker +state deterministically from the plan + DAG. The runner SHOULD verify +idempotency by re-running the sync immediately after the first run +and confirming the tracker reports zero changes (no new issues, no +modified labels, no edge churn). If a second run produces changes, +the adapter is non-deterministic and the divergence MUST be diagnosed +before relying on the tracker for dispatch. + +### Phase 9: Plan-revision protocol + +The Living Document Contract from `writing-plans-enhanced` Step 5 +specifies events that update the plan. Each event has a defined DAG +action. + +| Plan event | DAG action | +|---|---| +| Phase claim — non-gc (⬜ → 🚧 banner update) | No structural DAG change. If a tracker is in use, the tracker MUST update the affected nodes' parent-phase banner state. | +| Phase claim — gc (Beads claim on a task issue, no banner change) | No structural DAG change and no banner update; gc owns claim state in Beads. The phase banner stays ⬜ until the phase ships. | +| Phase ship — non-gc (🚧 → ✅) | No structural change. Shipping commit MUST update both the banner and the tracker (if used) atomically. | +| Phase ship — gc (⬜ → ✅; banner skipped 🚧 entirely) | No structural change. Shipping commit MUST update both the banner and the Beads issue atomically. | +| Phase defer (→ ⏸) | If the unblock condition references a NEW external dependency, the runner MUST re-run Phase 2 to record the `external_blocker`. The banner's prose + link is the durable coordination signal; the DAG mirrors it. 
| +| Stale-claim reclaim (per writing-plans-enhanced Step 5) | The runner MUST update the tracker node's claim timestamp and branch; prior claim history is preserved per the reclaim protocol. | +| Deviation (scope edit, dropped task, reordered phase) | The runner MUST re-run Phases 2-7 on the affected sub-graph and update the artifact. If the deviation changes plan structure substantially, the runner MUST require a fresh `plan-review-cycle` pass before re-extracting. | +| Discovery (new task added) | The runner MUST add the new node, re-extract its edges, and re-run Phase 7 on its neighbors. | +| Banner-state internal inconsistency detected (e.g., a phase shows ✅ while a hard-prerequisite phase still shows ⬜) | The runner MUST flag this as a defect in the plan, NOT silently reconcile it. Surface to the user; the plan is the source of truth and must be repaired before re-extraction proceeds. | + +The runner MUST NOT silently delete tracker issues for removed nodes. +They MUST be closed with reason "superseded by plan revision " +so future dispatches see the transition trail. + +Plan revisions are a common failure mode for this workflow — banner +state drifts, scope edits are not propagated to the DAG, and +downstream readers consume a stale graph. Treating revisions as +normal events with a defined protocol — not exceptions — is what +keeps the DAG honest over time. + +**Detecting that a plan was revised since the last DAG extraction.** +The runner SHOULD compare the plan file's git history against the DAG +artifact's last-modified commit. If the plan has commits newer than +the DAG, treat the DAG as potentially stale and re-run the affected +phases. The runner MAY add a one-line comment to the DAG artifact +(e.g. ``) so +future readers can audit alignment without git archaeology. 
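
The staleness check described above could be sketched as follows. The helper names are hypothetical, and comparing committer timestamps is a simplification chosen for brevity: it assumes both files live in the same repository and that commit dates are trustworthy, which rebases and imported history can violate. Comparing commit ancestry would be stricter.

```python
import subprocess
from typing import Optional


def last_commit_time(path: str, repo: str = ".") -> Optional[int]:
    """Unix committer time of the newest commit touching `path`, or None if untracked."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "-1", "--format=%ct", "--", path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()
    return int(out) if out else None


def dag_is_stale(plan_time: Optional[int], dag_time: Optional[int]) -> bool:
    """True when the DAG should be treated as potentially stale."""
    if dag_time is None:
        return True   # no committed DAG: nothing to trust yet
    if plan_time is None:
        return False  # plan has no commits, so nothing is newer than the DAG
    return plan_time > dag_time  # plan changed after the DAG was last written
```

A plan and DAG landed in the same commit share a timestamp and therefore compare as not stale, which matches the checklist's preference for committing them together.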
+ +### Phase 10: Log to the pattern store + +Following `plan-review-cycle`'s post-completion convention, the runner +SHOULD log to the project's pattern store (private journal, MCP store, +dated `docs/learnings/` file, or whatever the project uses): + +- **Type:** pattern +- **Key:** `dag-extraction-[plan-slug]` +- **Insight:** Plan-shape patterns observed (sequential vs parallel; + freeze events; superseded sections; cross-plan handoffs; banner + conventions that rendered ambiguously). Recurring extraction-time + discoveries SHOULD feed back into `writing-plans-enhanced` if a + pattern keeps appearing. + +## Red flags (STOP) + +These mean the extraction is not yet complete or correct: + +- "The plan ordering is obvious" — Then cite the line that says so. If + you can't, it's not an edge, it's an inference. +- "These tasks are clearly sequential because they're numbered" — + Numbered tasks are siblings unless the plan orders them. Strike the + inferred edges. +- "The `Files:` section was missing for one task; I worked around it" + — Refuse and surface the gap upstream. A workaround silently fails + to detect soft conflicts for the missing task. +- "The superseded section is short; I'll just include those tasks + anyway" — Promoting superseded content corrupts the entire + downstream graph. Exclude. +- "I'll skip Phase 8 sync; the user can run it later" — On gc projects, + no Beads sync means Gas City can't dispatch this plan. Skip is not + an option. +- "One review round is enough; the DAG is small" — Small DAGs are + cheaper to review, not exempt from review. Run the four rounds. +- "The plan revised mid-extraction; I'll just patch the affected + edges" — Re-run the affected phases (Phase 9). Patches accumulate + drift. +- "The banner says ⏸ but I'll model it as ⬜ so it shows up in ready + queues" — Authority is plan → DAG → tracker; if the plan banner is + wrong, fix the plan first, then re-extract. 
+- "I can't find a plan-specific perspective for Round 4" — Try + harder. If you genuinely can't, document the attempt explicitly per + Phase 7's Round 4 requirements; don't silently skip. + +## Common rationalizations (rebuttals) + +| Rationalization | Reality | +|---|---| +| "The plan is small; the DAG is overhead" | Phase 1's gate handles this. Either the gate says skip (legitimate) or the gate says run (do it). Don't override the gate with vibes. | +| "The plan author already declared the dependencies in prose" | Prose declarations are not queryable. Extracting them into a structured form is the entire point of this skill. | +| "Citing every edge slows me down" | Uncited edges are the failure mode this skill exists to prevent. The cost of one careful pass beats the cost of a wrong DAG corrupting downstream dispatch. | +| "Soft conflicts are obvious from the `Files:` sections; I don't need to enumerate them" | Coordinators don't read `Files:` sections at dispatch time. They query the soft-conflicts table. Implicit conflicts become merge conflicts. | +| "I'll write the artifact and skip the adversarial review; my first pass is good" | Single-pass extraction lets through fabricated edges, overlooked soft conflicts, and superseded content promoted as live. The handoff skill's review discipline applies here for the same reasons. | +| "The plan revision is small; I'll just edit the artifact directly" | Edits without re-extracting Phases 2-7 introduce drift the next reader can't trust. Re-run the affected phases. | +| "Beads is overkill for this project" | On gc projects, the orchestrator can't dispatch without it — that's not aesthetic, it's a hard requirement. On non-gc projects, the Phase 8 table determines status (recommended for "Parallel agents", optional otherwise) — and "optional" genuinely means optional. Don't dismiss the requirement on gc; don't force the recommendation on non-gc. 
| +| "The banner discipline is enough; we don't need a DAG" | True for "Parallel session" execution and small plans. False for "Parallel agents" and gc. Phase 1's gate captures this. | + +## Checklist + +Before declaring the extraction complete, verify: + +- [ ] All four prerequisites verified: plan written by + `writing-plans-enhanced`, `plan-review-cycle` complete with zero + findings, execution strategy known, gc / non-gc determined. +- [ ] Phase 1 gate decision recorded (RUN or SKIP with reason). +- [ ] Every hard edge has a plan citation (line number or quoted + phrase). +- [ ] Every `Files:`-section overlap is captured as either a hard + edge or a soft conflict; no orphan pairs. +- [ ] Every task has per-node metadata recorded, including current + banner state. +- [ ] All superseded sections are excluded from the graph and listed + in the artifact's "excluded" section with reasons. +- [ ] Freeze events and external plan handoffs are flagged explicitly. +- [ ] The DAG artifact is at `-dag.md` and contains all + eight required sections (scope statement, edges, soft conflicts, + metadata, layers, freezes/handoffs, exclusions, LDC pointer). +- [ ] At least 4 adversarial review rounds complete (3 canonical + + Round 4 plan-specific; additional rounds run as judgment + suggested); the final full pass through every round produced + zero material findings. +- [ ] Round 4 (and any 5+ the runner elected to run) is documented by + name in the artifact with its findings count; perspective choice + is plan-specific, not a generic template. +- [ ] On gc projects: Phase 8 sync to Beads is performed; idempotency + is verified by running the sync a second time and confirming + zero changes. On non-gc: sync is performed if recommended by the + Phase 8 table, or skipped with a note in the artifact. +- [ ] Pattern-store log entry written (per Phase 10). 
+- [ ] Artifact committed in the same commit (or commit chain) that + lands the plan, so plan and DAG stay paired in git history. + +## Social proof + +Observed across multi-agent coordination cycles in plans of +sufficient size and parallelism: DAG extraction reduces dispatch +prompts from "figure out which tasks are unblocked given the current +plan state" reads to short pointer sequences. With Beads as an +example tracker, that looks like: `bd ready` returns N unblocked +issues; mutex labels show two of them both touch +`internal/notify/worker.go`, so the coordinator dispatches one and +queues the other. The principle is tracker-agnostic — the same +shape of query against any structured tracker yields the same +short-prompt dispatch. + +DAGs that ship without adversarial review create the opposite: a +fabricated edge sends a builder onto work that isn't actually ready; +a missed soft conflict produces parallel branches that collide at +merge; a superseded section promoted as live pushes builders onto +work the plan author marked "do not execute." Every one of those +failure modes was observed in real plan executions before this +skill's discipline was codified. + +The cost asymmetry favors rigorous extraction by a wide margin and +compounds across every dispatch the plan receives. A plan with three +agents over three days takes ~9 builder-cycles from the DAG; a wrong +DAG poisons all of them. + +## Related conventions + +- **`writing-plans-enhanced`** is the upstream that produces the plan + this skill consumes. The Living Document Contract (its Step 5) + defines the banner format that this skill mirrors. If + `writing-plans-enhanced`'s contract changes, this skill's Phase 4 + and Phase 9 SHOULD be updated to match. + +- **`plan-review-cycle`** is the immediate prerequisite. This skill + refuses to run before `plan-review-cycle` produces a zero-finding + round. The two are designed to chain. 
+ +- **gc / non-gc determination.** A project is gc if Gas City (or + another Beads-backed orchestrator) is load-bearing. The detection + mechanism (`.gc/`, Beads database, project setting) is project-local; + this skill assumes the convention is recorded somewhere the runner + can check. + +- **Banner format and stale-claim reclaim.** Banner conventions and + the reclaim protocol come from `writing-plans-enhanced` Step 5. This + skill does not redefine them; it consumes them as input. + +- **Strategy & rationale.** The decision framework for gc vs non-gc + handling, why banners and Beads divide LDC events the way they do, + and the wider context for this skill's design SHOULD be documented + in a project-local strategy doc (e.g. + `dev/research-findings/dag-extraction-and-orchestration.md` or + whatever the project uses for methodology research). + +## The bottom line + +A plan's inter-task dependency structure exists whether or not it's +written down. If it's not written down, every coordinator reconstructs +it from prose and gets it slightly wrong each time. Extract once, +adversarially review, sync to whatever queryable substrate the +orchestration needs, mirror banner state on revision. The cost is one +session; the saving compounds across every dispatch the plan receives. + +If a downstream coordinator dispatches work that turns out to be +blocked, the DAG failed. If `bd ready` (or the equivalent) returns +exactly the set of tasks a careful human would, it succeeded. diff --git a/dev/plans/2026-03-10-phase9-health-review-remediation-dag.md b/dev/plans/2026-03-10-phase9-health-review-remediation-dag.md new file mode 100644 index 00000000..3bd860e3 --- /dev/null +++ b/dev/plans/2026-03-10-phase9-health-review-remediation-dag.md @@ -0,0 +1,226 @@ +# Phase 9 Health Review Remediation — Task DAG + +Inter-task dependency graph for `dev/plans/2026-03-10-phase9-health-review-remediation-plan.md`. + +**Scope:** ordering between named tasks (e.g. `1.11 → 2C.1`). 
Intra-task ordering (TDD steps inside a single task body, such as 6B's "scaffolding → stub → failing test → real impl → wire to readiness") is not modeled here. Read the task body in the plan for those details.
+
+**Methodology:** this DAG was extracted before the methodology was codified. The standardized procedure is now documented in `.claude/skills/extracting-plan-dag/SKILL.md` and the rationale for the gc / non-gc split it embeds is in `dev/research-findings/dag-extraction-and-orchestration.md`. Re-extractions (e.g. on plan revision) SHOULD follow the skill's process going forward.
+
+Sources of edges (line refs into the plan):
+- Stage Overview "Dependency graph" block (lines 43–51)
+- Stage 1 prologue: 1.11 must run after all other Stage 1 tasks (line 83)
+- Stage 2B prologue: 2B finishes before Stage 2C wires into runtime (line 961)
+- Task 2A.2: cross-tenant test asserts via `tdb.AppStore` (the restricted role enabled by 2A.1) (lines 929–941)
+- Task 6C: "DEFERRED — depends on Stage 3 completing" (line 2761)
+- Stage 3 wrapper: tasks 3.0–3.12 are inside a `<details>` block marked **superseded — Do not execute** (lines 1280, 1284, 1912). The actual Stage 3 work lives in `dev/plans/2026-03-15-phase9-stage3-api-contract-convergence-plan.md`.
+- Per-pillar Phase 8 notes in the prerequisites table (lines 17–28) and per-task warnings (e.g. 6A line 2589, 6B line 2660, 5A appendix line 3001, 2B.1/2B.2 lines 969, 1079).
+
+## Mermaid graph
+
+```mermaid
+graph TD
+    %% ── Phase 8 prerequisite (per pillar) ───────────────
+    subgraph P8["Phase 8 merges (prerequisite)"]
+        P8B["8B Observe<br/>(metrics, instrumentation)"]:::prereq
+        P8C["8C Operate<br/>(/healthz, /readyz, doctor, admin)"]:::prereq
+        P8D["8D Generic feed adapter"]:::prereq
+        P8E["8E (other operational work)"]:::prereq
+    end
+
+    %% ── Stage 1 ──────────────────────────────────────────
+    subgraph S1["Stage 1: Quick Wins"]
+        T1_1["1.1 Close api.Server"]
+        T1_2["1.2 Close stdlib DB wrappers"]
+        T1_3["1.3 Validate COOKIE_SECURE"]
+        T1_4["1.4 Worker pool ctx cancel"]
+        T1_5["1.5 Remove dead readTx"]
+        T1_6["1.6 Fix GetCVEDetail comment"]
+        T1_7["1.7 Add missing assertion"]
+        T1_8["1.8 Stop discarding setup errors"]
+        T1_9["1.9 DownloadToTemp pkg state"]
+        T1_10["1.10 Validate InCISAKEV bool"]
+        T1_12["1.12 Dedup toNullString"]
+        T1_11["1.11 sqlc rename Cfe → CVE<br/>(after all other Stage 1)"]:::ordering
+    end
+
+    %% ── Stage 2A ─────────────────────────────────────────
+    subgraph S2A["Stage 2A: RLS Security"]
+        T2A_1["2A.1 Restricted app DB role"]
+        T2A_2["2A.2 RLS cross-tenant test"]
+    end
+
+    %% ── Stage 2B ─────────────────────────────────────────
+    subgraph S2B["Stage 2B: Evaluator Refactor"]
+        T2B_1["2B.1 Extract post-filter"]
+        T2B_2["2B.2 Merge queryCandidates"]
+    end
+
+    %% ── Stage 2C ─────────────────────────────────────────
+    subgraph S2C["Stage 2C: Alert Wiring (parallel siblings)"]
+        T2C_1["2C.1 Schedule batch + EPSS jobs"]
+        T2C_2["2C.2 Realtime post-merge hook"]
+    end
+
+    %% ── Stage 3 (gate only; implementation lives elsewhere) ──
+    GATE["OpenAPI evaluation gate<br/>(timeboxed, in-plan)"]:::gate
+    S3EXT["Stage 3 implementation<br/>(external plan:<br/>2026-03-15-phase9-stage3-<br/>api-contract-convergence-plan.md)"]:::external
+
+    %% ── Stage 4 ──────────────────────────────────────────
+    subgraph S4["Stage 4: Ops Hardening"]
+        T4D["4D Notification semaphore eviction"]
+        T4E["4E Configurable statement timeout"]
+    end
+
+    %% ── Stage 5 ──────────────────────────────────────────
+    subgraph S5["Stage 5: Test Quality"]
+        T5A["5A Feed adapter golden tests"]
+        T5B["5B Ingest handler integration test"]
+        T5C["5C Email testcontainer"]
+        T5D["5D Advisory lock concurrency test"]
+    end
+
+    %% ── Stage 6 ──────────────────────────────────────────
+    subgraph S6["Stage 6: Architecture"]
+        T6A["6A ServerDeps options struct"]
+        T6B["6B Notification worker health"]
+        T6E["6E merge.Store interface"]
+        T6F["6F BootstrapFirstUserOrg refactor"]
+        T6C["6C Extract buildApp()<br/>(deferred)"]:::deferred
+    end
+
+    %% ── Phase 8 prerequisite edges (whole-stage gating) ──
+    P8B --> S1
+    P8B --> S2A
+    P8B --> S2B
+    P8B --> S2C
+    P8B --> GATE
+    P8B --> S4
+    P8B --> S5
+    P8C --> S1
+    P8C --> S2A
+    P8C --> S2B
+    P8C --> S2C
+    P8C --> GATE
+    P8C --> S4
+    P8C --> S5
+    P8D --> S5
+    P8E --> S1
+    P8E --> S2A
+
+    %% ── Phase 8 prerequisite edges (per-task call-outs) ──
+    P8C -.->|adds Server deps captured by ServerDeps| T6A
+    P8C -.->|exposes /readyz target| T6B
+    P8D -.->|generic adapter covered by golden tests| T5A
+    P8B -.->|metric instrumentation may shift| T2B_1
+    P8B -.->|metric instrumentation may shift| T2B_2
+    P8B -.->|alert metrics activate once wired| T2C_1
+    P8B -.->|alert metrics activate once wired| T2C_2
+
+    %% ── Stage 1 fan-in to 1.11 ───────────────────────────
+    T1_1 --> T1_11
+    T1_2 --> T1_11
+    T1_3 --> T1_11
+    T1_4 --> T1_11
+    T1_5 --> T1_11
+    T1_6 --> T1_11
+    T1_7 --> T1_11
+    T1_8 --> T1_11
+    T1_9 --> T1_11
+    T1_10 --> T1_11
+    T1_12 --> T1_11
+
+    %% ── Stage 2A internal edge ───────────────────────────
+    T2A_1 --> T2A_2
+
+    %% ── Stage 2B → 2C (both 2B tasks must complete) ──────
+    T2B_1 --> T2C_1
+    T2B_2 --> T2C_1
+    T2B_1 --> T2C_2
+    T2B_2 --> T2C_2
+
+    %% ── Stage 3 gate → external plan → 6C ────────────────
+    GATE --> S3EXT
+    S3EXT --> T6C
+
+    classDef prereq fill:#fce4a6,stroke:#a06b00,color:#3a2a00
+    classDef ordering fill:#e8d5ff,stroke:#5b27a8,color:#22094a
+    classDef gate fill:#d4edff,stroke:#1f6feb,color:#0a2540
+    classDef external fill:#dff5e0,stroke:#1a7f37,color:#0a2a12
+    classDef deferred fill:#f0f0f0,stroke:#999,color:#444
+```
+
+Solid arrows = hard ordering required by the plan. Dotted arrows = pillar-specific pre-conditions / metric instrumentation hand-offs called out in the plan body.
+
+## Topological layers (parallel-execution view)
+
+A subagent coordinator can fan out each layer in parallel; later layers wait for the prior layer's edges.
**Read the soft-conflicts section before dispatching L1 in parallel.** + +| Layer | Tasks | Notes | +|-------|-------|-------| +| L0 | 8B Observe · 8C Operate · 8D · 8E | Phase 8 prerequisite — out of scope for this plan | +| L1 | 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 1.10, 1.12 · 2A.1 · 2B.1 · 2B.2 · 4D · 4E · 5A · 5B · 5C · 5D · 6A · 6B · 6E · 6F · OpenAPI gate | All independent per the plan's dependency block. 2A.1, 2B.1, 2B.2 have no stated Stage 1 prerequisites, so they belong here. | +| L2 | 1.11 · 2A.2 | 1.11 waits on the Stage 1 fan-in. 2A.2 waits on 2A.1. | +| L3 | 2C.1 · 2C.2 | Both wait on 2B.1 + 2B.2. They are siblings — the plan does not require one before the other. | +| L4 | Stage 3 implementation (external plan) | Waits on the OpenAPI gate's outcome. | +| L5 | 6C (extract `buildApp()`) | Waits on the external Stage 3 plan landing. | + +`6D` is excluded entirely — invalidated, no node in the graph. + +## Soft conflicts (file-level, not logical) + +These pairs are independent in the plan's dependency model but touch the same file. Dispatching them simultaneously will produce merge conflicts; sequence them in the queue. + +| Pair | Shared file | Conflict | +|------|-------------|----------| +| 4D ↔ 6B | `internal/notify/worker.go` | Both add fields to `Worker` struct and modify `Start()` | +| 1.12 ↔ 6E | `internal/merge/pipeline.go` | toNullString call sites vs. `merge.Store` interface signature change | +| 1.4 ↔ 2C.1 | worker pool registration sites | Context-cancel fix vs. new `alert_batch`/`alert_epss`/`alert_zombie_sweep` handlers | +| 5B ↔ 2C.2 | `internal/ingest/handler.go` | Integration test vs. realtime-eval hook on the same handler | +| 1.11 ↔ 2B.1, 5B, 6E, Stage 3 work | every file importing `generated.Cfe` | 1.11 mass-renames the type. Tasks that write code referencing the type pre-rename will need a trivial rebase — not a hard dep, but a real coordination cost. 
The plan resolves this by sequencing 1.11 last in Stage 1 before later stages start writing new code against the type. |
+| 6A ↔ 8C-derived setters | `internal/api/server.go`, `cmd/cvert-ops/main.go` | If Phase 8C added new `Set*Deps` methods, 6A must absorb them too (called out in plan §6A Step 2 note). |
+
+## Critical path
+
+There are two largely independent chains; the plan does not connect them, and the second is only partially defined here.
+
+**Chain A (alert wiring):**
+
+```
+P8 (8B + 8C + 8E) → {Stage 1 batch, longest task} → 1.11 → {2B.1 ∥ 2B.2} → {2C.1 ∥ 2C.2}
+```
+
+The `2C.x` fan-out at the end means the chain-A bottleneck is `max(2C.1, 2C.2)` after Stage 2B completes; neither blocks the other.
+
+**Chain B (API contract convergence):**
+
+```
+P8 (8B + 8C) → OpenAPI gate → external Stage 3 plan (Tasks 0–14b) → 6C
+```
+
+Chain B's true length is set by `2026-03-15-phase9-stage3-api-contract-convergence-plan.md`, which has 14+ tasks of its own. From this plan's perspective the depth is unknown; treat Chain B as the project critical path until the external plan's own DAG is summarized.
+
+The two chains share only the Phase 8 prerequisite, so they run concurrently after L0. Stages 4, 5, 6A, 6B, 6E, 6F are off the critical path entirely: they can land any time after their Phase 8 pillar is in.
+
+## Resolved / invalidated (excluded from the graph)
+
+- **Findings 3, 10, 11, 38**: resolved by Phase 8 or already correct; no task ever existed.
+- **Tasks 4A, 4B**: subsumed by Phase 8B/8C; removed from Stage 4 scope.
+- **Task 4C**: moved into Stage 6 as 6C (already a node).
+- **Task 6D (Finding 19)**: invalidated; NVD has no bulk download archives. Not a node.
+- **Original Tasks 3.0–3.12**: superseded; they live behind `<details>` in the plan with "Do not execute." Not nodes; replaced by the single `S3EXT` node pointing at the external implementation plan.
+
+## Adversarial review
+
+This DAG went through nine rounds of review during its initial production, summarized retrospectively against the standardized rounds in `.claude/skills/extracting-plan-dag/SKILL.md` Phase 7. Finding counts are approximate, recovered from the conversation arc rather than logged at the time.
+
+| Round | Lens | Findings applied |
+|---|---|---|
+| 1 | Citation auditor: every edge cites a plan line | 4 fabricated edges removed (incl. an unjustified `2C.1 → 2C.2`) |
+| 2 | Coverage auditor: `Files:`-overlap pairs captured | 6 soft-conflict pairs added (4D↔6B, 1.12↔6E, 1.4↔2C.1, 5B↔2C.2, 1.11 mass-rename row, 6A↔8C-derived setters) |
+| 3 | Inference-discipline auditor: no edges from numeric ordering | Numeric-order edge from `2C.1 → 2C.2` deleted (was inferred, not cited) |
+| 4 | Plan-specific perspective: **`<details>` block / supersession audit**, chosen because the plan revised Stage 3 mid-flight and wrapped the original task list in a `<details>` block marked "Do not execute" | 13 superseded tasks (3.0–3.12 + cleanup) removed from the graph; replaced with a single external-plan node `S3EXT` and a re-routed `T6C` dependency |
+| 5+ | Loop check: graph/text contradictions, dangling edges, scope-clarification gaps | DAG-scope statement added; `T6D` removed entirely (graph said "node," text said "excluded"); critical-path claim recomputed as two independent chains; Phase 8 split into 8B/8C/8D/8E with per-task dotted edges; 1.11→2B.1 demoted from a graph edge to a soft-conflict row |
+
+The Round 4 perspective was specifically motivated by this plan's mid-flight Stage 3 revision (the `<details>` block at lines 1280–1912 of the plan). On a plan without such revisions, a different Round 4 perspective would have applied, which is why the skill mandates plan-specific choice rather than a fixed canonical lens.
+
+A final loop pass produced zero material findings.
diff --git a/dev/research-findings/dag-extraction-and-orchestration.md b/dev/research-findings/dag-extraction-and-orchestration.md
new file mode 100644
index 00000000..e55d90d1
--- /dev/null
+++ b/dev/research-findings/dag-extraction-and-orchestration.md
@@ -0,0 +1,434 @@
+# DAG Extraction and Multi-Agent Orchestration: Strategy & Context
+
+## Terminology
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in RFC 2119.
+
+A "**gc project**" is any project where Gas City (or another
+Beads-backed orchestrator) is load-bearing: the orchestrator
+dispatches work by reading from Beads and atomic-claims issues on
+behalf of agents. Gas City is non-functional without a populated,
+current Beads database.
+
+A "**non-gc project**" is everything else: the plan markdown plus
+the Living Document Contract from `writing-plans-enhanced` Step 5
+(per-phase Execution Status banners + stale-claim reclaim protocol)
+is sufficient runtime state.
+
+## Why this doc exists
+
+Plan execution coordination is a hard problem. Several attempts to
+solve it have produced complementary tools that overlap in awkward
+ways:
+
+- The **Living Document Contract** evolved through trial and error
+  with multi-agent coordination cycles. It keeps plan markdown
+  honest as execution progresses: banners flip ⬜ → 🚧 → ✅ or → ⏸,
+  deviations get inlined, discoveries get captured. It works well
+  for solo and small-team execution and produces excellent archival
+  records.
+ +- **Beads** (and orchestrators built on it like Gas City) provides + atomic claim, globally-visible runtime state, and structured + ready-queue queries. It solves the worktree-divergence problem + that LDC banners hit when 3+ builders concurrently update the + same plan markdown file. + +- The **DAG extraction skill** (`.claude/skills/extracting-plan-dag/`) + forces a plan's inter-task dependency structure to be made + explicit and queryable. It chains after `plan-review-cycle` and + produces a co-located DAG artifact. + +These three tools answer overlapping questions ("what's the runtime +state of this work?" "what's actionable now?" "what's the dependency +structure?") at different layers and with different durability +profiles. Without a clear strategy, agents on a project either: + +- Use Beads where LDC would suffice and accept the extra tooling + burden; +- Use LDC where Beads would prevent worktree-divergence pain and + accept the merge-conflict tax; +- Use both inconsistently and spend cognitive overhead reconciling + state across substrates. + +This doc records the strategy converged on across discussion: how +the three tools layer, who has authority for what, and how the +workflow stays the same on gc and non-gc projects with a small, +mechanical delta. + +## Core principles (two asymmetries) + +1. **Cheap to layer correctly, expensive to reconcile after the + fact.** Each tool has a defined role and direction of authority. + Setting that up at plan-writing time is cheap. Discovering mid- + execution that two substrates disagree about a phase's state is + expensive and erodes trust in both. + +2. **Same workflow, small mechanical delta.** A builder on a non-gc + project and a builder on a gc project SHOULD use nearly the same + skills, the same banner conventions, the same plan format. 
The + gc-specific behavior SHOULD be a small reduction (skip a banner + transition gc handles in Beads) plus an automatic Phase 8 sync + triggered by gc-project detection. Anything more grows the + maintenance burden of keeping two workflows aligned. + +## The three tools, their roles, their authority + +| Tool | Role | Authority over... | +|---|---|---| +| Plan markdown + Living Document Contract | Source of truth + archival record | Phase-granularity state, deviations, discoveries, narrative context, why work was done a certain way | +| DAG (derived view, co-located with plan) | Inter-task dependency structure | Edges, soft conflicts, per-node metadata, exclusion of superseded content | +| Beads / tracker (runtime cache, optional on non-gc) | Live coordination layer | Per-task state at finer granularity than banners, atomic claim, ready-queue queries | + +**Direction of authority flows plan → DAG → tracker. Never the other +way.** Tracker edits don't propagate back to the plan. Banner state +in the plan is authoritative on disagreement. The plan is what gets +read in archaeology a year from now. + +## What each tool is good at (and not) + +### Plan markdown + LDC banners + +**Good at:** +- Co-located narrative + state. A banner sits above its task body; + any reader sees state before reading the task. +- Archival record. A year later, the plan tells the story of what + shipped, what got deferred, what was discovered. +- Tool-independent. Just markdown in git. Survives the death of any + external tracking tool. +- Self-propagating discipline. The contract is in the plan; every + session that opens the plan inherits the rules. + +**Not good at:** +- Atomic claim across worktrees. Two builders updating banners in + different worktrees → eventual consistency at merge time, with + line-level conflicts on the same banner. +- Cross-plan ready queries. To know what's actionable across a + project, you read N markdown files. +- Structured queries. 
"What's blocked on Sam's review?" is a grep, + not a typed filter. +- Fine-grained per-task state. Banners are at phase granularity; + per-task state during a phase is invisible. + +### DAG (the artifact this skill produces) + +**Good at:** +- Forcing implicit dependencies to become explicit. The act of + building it is the value, regardless of whether a tracker + consumes it. +- Catching plan defects early. Citation discipline surfaces + fabricated edges before they corrupt downstream dispatch. +- Cross-tool portability. Same DAG can be consumed by Gas City, + by another tracker, or by a coordinator reading the markdown + directly. + +**Not good at:** +- Live runtime state. The DAG is structural, not stateful. The + parent-phase banner state on each node is a snapshot, not a + live signal. +- Continuous synchronization. The skill prescribes re-extraction on + plan revision, but the DAG can drift between revisions if events + happen mid-stream without a re-run. + +### Beads / tracker (when present) + +**Good at:** +- Atomic claim. `bd claim` is race-free across worktrees in a way + banner-edit never can be. +- Ready-queue queries. `bd ready` returns exactly the unblocked + set, across all plans synced. +- Structured per-task state. Status, labels, blocker dependencies, + queryable filters. +- Mutex-via-labels. Soft conflicts encoded as `mutex:` + labels become first-class dispatch-time signals. + +**Not good at:** +- Narrative context. Beads issues record events; they don't tell + the story of why a phase was deferred or what was discovered + along the way. +- Long-term archival. A closed Beads issue is greppable but the + plan markdown is the durable artifact. +- Tool-portability. A workflow that depends on Beads commands + doesn't transfer to a project without Beads. 
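
As a rough illustration of the mutex-via-labels idea above, the sketch below filters a ready queue so that no two tasks dispatched in the same pass share a mutex label. It is not Beads code and makes no `bd` calls: the `ReadyTask` type, the `pick_dispatchable` helper, and the exact label spelling are assumptions made for this example. Only the 4D / 6B overlap on `internal/notify/worker.go` comes from the worked DAG.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReadyTask:
    """Hypothetical view of one unblocked tracker issue."""
    task_id: str
    labels: frozenset = field(default_factory=frozenset)


def pick_dispatchable(ready):
    """Split a ready queue into (dispatch_now, hold_back).

    A task is held back when it shares any label starting with "mutex:"
    with a task already chosen earlier in the same pass.
    """
    claimed = set()
    dispatch_now, hold_back = [], []
    for task in ready:
        mutexes = {label for label in task.labels if label.startswith("mutex:")}
        if mutexes & claimed:
            hold_back.append(task.task_id)
        else:
            dispatch_now.append(task.task_id)
            claimed |= mutexes
    return dispatch_now, hold_back


# Assumed label spelling; the shared file is the 4D / 6B soft conflict.
worker_mutex = "mutex:internal/notify/worker.go"
ready_queue = [
    ReadyTask("4D", frozenset({worker_mutex})),
    ReadyTask("6B", frozenset({worker_mutex})),
    ReadyTask("4E"),
]
dispatch_now, hold_back = pick_dispatchable(ready_queue)
print(dispatch_now, hold_back)
```

Queue order decides which member of a conflicting pair goes first; a real coordinator would presumably apply its own priority rule before this filter.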
+ +## When to use which combination + +| Scenario | Plan + LDC | DAG | Tracker | +|---|---|---|---| +| Solo builder, sequential plan | Yes | Skip (gate) | Skip | +| Solo builder, large parallelizable plan | Yes | Yes | Optional | +| 2 builders, parallel work | Yes | Yes | Optional but useful | +| 3+ builders concurrent ("Parallel agents") | Yes | Yes | Recommended | +| gc project, any size | Yes | Yes (mandatory) | Yes (mandatory; this is what gc dispatches from) | + +The gate that decides DAG extraction lives in +`.claude/skills/extracting-plan-dag/` Phase 1. The gate that decides +tracker sync lives in the same skill's Phase 8. Both gates are +project-and-execution-model-dependent, not aesthetic. + +## The gc / non-gc split: how the tools divide LDC events + +The Living Document Contract specifies five event types. Each is +allocated to whichever substrate handles it best, and the allocation +differs slightly between gc and non-gc projects. + +| LDC event | Non-gc handling | gc handling | +|---|---|---| +| Phase claim (⬜ → 🚧) | Banner edit + stale-claim reclaim protocol | Beads claim only — **no 🚧 banner update** (banner stays ⬜ until ship) | +| Phase ship (→ ✅) | Banner edit in shipping commit | Banner edit AND `bd close` in shipping commit (atomic) | +| Phase defer (→ ⏸) | Banner edit with prose unblock condition + link | Banner edit AND `bd block` with the same prose + link | +| Deviation | Plan inline + top-of-plan summary | Same — plan inline + `bd comment` cross-link | +| Discovery | Plan "Discoveries" subsection | Same — plan inline + new `bd issue` if discovery becomes a task | + +The gc-specific reduction is concrete and small: **skip the 🚧 +banner update because Beads has the claim atomically and globally.** +Everything else stays the same. + +This gives gc projects: +- **Worktree-divergence on banners largely evaporates** because the + noisy mid-flight 🚧 updates are gone. 
Ship/defer/deviation banner + updates are infrequent and on different phases — minimal merge + friction. +- **Atomic claim** without any banner contention. +- **Live cross-plan visibility** via Beads. +- **Archival narrative preserved** — ship, defer, deviation, + discovery still hit the plan markdown in the same commit as the + work. + +And it gives non-gc projects: +- **Unchanged LDC discipline** — the contract is exactly the same + as it always was. +- **Optional tracker sync** when 3+ builders concurrent execution + benefits from cross-phase ready-queue queries. + +## Detection mechanism + +Skills SHOULD NOT ask the user "are you on gc?" each invocation. +Detect once via a project marker. Common markers: + +- A `.gc/` directory at the repo root. +- A Beads database file (typically `.beads/` or a SQLite file + referenced in project config). +- A `gas-city` or `bd` configuration block in the project's main + config. +- An explicit setting in `CLAUDE.md` or the project's equivalent. + +Detection happens in `writing-plans-enhanced` and propagates to the +chained skills (`plan-review-cycle`, `extracting-plan-dag`). When a +skill needs to know, it reads the project marker rather than asking. + +If detection is genuinely ambiguous, the skill asks the user once +and records the answer in the project marker for next time. + +## Worktree-divergence: what each tool does about it + +**The problem:** with 3+ builders in 3+ worktrees, each updating +banners in their own copy of the plan markdown, you get: + +- Eventual consistency at merge time. +- Two-line conflicts on adjacent banner edits. +- Race conditions on phase claims (two builders both flip ⬜ → 🚧 + in their own worktrees, second push gets rejected; reclaim + protocol then fires reactively). + +**Non-gc mitigation (LDC's reclaim protocol):** +- Detect stale claims by observable git signals (PR existence, + commit recency). +- Reactive cleanup. The race already happened; the protocol + resolves it. 
+- Works ~80% of the time at low overhead. + +**gc mitigation (Beads atomic claim):** +- Claim is a `bd` operation against a single global database. + Race-free by construction. +- Banner stays ⬜ during execution; only the shipping commit + updates it. No mid-flight banner edits → no merge conflicts on + banner edits. +- Coordination state lives outside the worktree. Worktree markdown + diverges, but the divergence doesn't matter for coordination. + +**Choice:** if you have 3+ concurrent builders regularly, gc is +genuinely better at this and the LDC's reclaim protocol is doing +work that should be unnecessary. If you have 1-2 builders, the LDC +reclaim protocol is sufficient and Beads is overhead. + +## Why the LDC stays — even when Beads exists + +It would be tempting on gc projects to drop the LDC banner +discipline entirely "because Beads has it." That would be a +regression. The LDC is not redundant with Beads; they record +different things: + +- **Beads is a runtime tool.** It tracks live state for orchestrator + dispatch. Issue history is queryable but verbose. +- **LDC banners are an archival record.** A year later, the plan + tells the story. + +The shipping-commit pattern (atomic banner update + bd close in the +same commit) keeps both views consistent without duplicating effort. +A builder shipping work updates one source — the plan banner — and +runs `bd close`. Both happen in the same commit. The plan's +narrative is preserved; Beads' runtime state stays current. + +If banner discipline ever lapses on a gc project, the symptom is +silent: Beads keeps working, the plan's archival quality erodes, +and a year later "what shipped here?" requires Beads archaeology +instead of a plan read. Don't let that happen. + +## Why the DAG stays — even when banners are sufficient + +On small / sequential / single-builder plans, the LDC banners are +genuinely sufficient runtime state. The DAG extraction skill's +Phase 1 gate skips extraction in those cases. 
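
A minimal sketch of how such a gate might be evaluated, assuming the rules this strategy describes: gc projects always extract, "Parallel agents" always extracts, and "Subagent-driven" extracts at roughly 15 or more tasks with internal parallelism. The `PlanProfile` fields, the execution-model strings, and `should_extract_dag` are illustrative names rather than part of the skill, and the numeric threshold is a heuristic that a runner may override.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanProfile:
    """Hypothetical summary of the facts the gate looks at."""
    is_gc_project: bool
    execution_model: str  # "parallel-session", "subagent-driven", "parallel-agents"
    task_count: int
    has_parallel_phase: bool


def should_extract_dag(plan, task_threshold=15):
    """Return (extract, reason) for a Phase 1 style gate."""
    if plan.is_gc_project:
        return True, "gc project: orchestrator dispatches from the tracker"
    if plan.execution_model == "parallel-agents":
        return True, "3+ concurrent builders"
    if (
        plan.execution_model == "subagent-driven"
        and plan.task_count >= task_threshold
        and plan.has_parallel_phase
    ):
        return True, "large subagent-driven plan with internal parallelism"
    return False, "banners are sufficient runtime state"


small_gc = PlanProfile(True, "parallel-session", 4, False)
solo_sequential = PlanProfile(False, "parallel-session", 25, False)
large_subagent = PlanProfile(False, "subagent-driven", 18, True)

for profile in (small_gc, solo_sequential, large_subagent):
    print(profile.execution_model, should_extract_dag(profile))
```

A fixed threshold cannot capture every plan shape, so the returned reason is best treated as an input to the runner's decision and recorded alongside it, not as the final word.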
+
+But the gate is opinionated:
+
+- **Subagent-driven** with ≥15 tasks AND parallelism → extract.
+- **Parallel agents** with 3+ concurrent builders → extract
+  unconditionally.
+- **gc** projects → extract unconditionally regardless of plan
+  size, because the orchestrator dispatches from Beads which
+  requires populated tracker issues.
+
+The gate threshold (~15 tasks) is heuristic. A 12-task plan with
+heavy fan-out warrants extraction; a 25-task strictly-sequential
+plan does not. The runner exercises judgment.
+
+When the gate skips, the DAG extraction skill leaves a one-line
+note in the plan ("DAG extraction skipped: ; revisit if
+execution model changes"). That keeps the decision auditable
+without forcing a stub artifact.
+
+## Common failure modes (and what prevents each)
+
+| Failure mode | Substrate that causes it | What prevents it |
+|---|---|---|
+| Banner-edit merge conflicts mid-execution | LDC at 3+ builders | gc adoption (Beads atomic claim) OR fewer concurrent builders |
+| Stale plan after several phases ship | LDC discipline lapse | Mandatory banner update in the shipping commit |
+| Beads and plan disagreeing about phase state | Builder shipping non-atomically | The LDC + tracker sync pattern: atomic banner-edit + `bd close` in the same commit |
+| Wrong DAG corrupting downstream dispatch | DAG extraction without adversarial review | Phase 7 of the DAG skill: minimum 4 rounds, plan-specific Round 4, loop until zero |
+| Superseded plan content promoted as live work | First-pass DAG extraction missing `<details>` blocks | Core discipline §3 of the DAG skill: explicit superseded-content scan |
+| Cross-plan dependencies invisible | Banner-only state on multi-plan projects | Tracker sync (gc-mandatory, non-gc-recommended above 3 builders) |
+| Plan revision drifts from DAG | Ad-hoc edits to either artifact | Phase 9 plan-revision protocol: re-run affected DAG phases on each LDC event class |
+| Builders working concurrently on same file | Soft conflicts not enumerated | Phase 3 of the DAG skill: `Files:`-section overlap → soft-conflict table |
+
+## Workflow on a non-gc project (today)
+
+```
+1. /writing-plans-enhanced
+   → produces plan with LDC banners and `Files:` sections
+   → saves to docs/superpowers/plans/YYYY-MM-DD-.md
+2. /plan-review-cycle
+   → minimum 3 rounds, until zero findings
+3. /extracting-plan-dag (gate-conditional)
+   → Phase 1 gate: skip for solo/sequential, run for parallel/large
+   → if RUN: produces -dag.md with full process
+   → if SKIP: leaves a one-line note in the plan
+4. Execute the plan
+   → builders update banners as they work (LDC discipline)
+   → reclaim protocol handles stale claims if any
+   → DAG re-extraction triggered by LDC events per Phase 9
+```
+
+## Workflow on a gc project
+
+```
+1. /writing-plans-enhanced
+   → same plan format, same LDC contract
+   → gc detection happens here; skill records project type
+   → contract block omits the 🚧 row (gc-mode)
+2. /plan-review-cycle
+   → same as non-gc
+3. /extracting-plan-dag (mandatory)
+   → Phase 1 gate triggers RUN automatically (gc project)
+   → produces DAG artifact
+   → Phase 8 sync to Beads is mandatory
+   → idempotency verified by re-running sync (zero changes)
+4. Gas City takes over for dispatch
+   → reads `bd ready` to find unblocked work
+   → atomic-claims on agent's behalf
+   → agents work in worktrees, ship via atomic commits
+     (banner update + bd close together)
+   → no mid-flight banner edits → no banner merge conflicts
+```
+
+The differences are mechanical, not philosophical:
+
+- gc detection is automatic; builders don't need to remember the
+  project type.
+- 🚧 banner row is omitted from the contract on gc projects so
+  builders don't see the discipline they don't need.
+- Phase 8 sync is automatic on gc; gc detection in the DAG skill
+  flips it to mandatory.
+
+Everything else (banner format, plan structure, deviation/discovery
+discipline, DAG artifact format, adversarial review rounds) is
+identical.
+
+## When this strategy might be wrong
+
+Three honest concerns to track over time:
+
+1. **The "skip 🚧 on gc" subtraction is easy to forget.** Builders
+   trained on non-gc projects will instinctively flip 🚧 banners.
+   The cost is benign (visual noise, no correctness issue) but it
+   dilutes the "Beads is authoritative for claim" rule. Mitigation:
+   the LDC contract block on gc projects omits the 🚧 row, so the
+   discipline isn't visible. Watch for builders adding 🚧 anyway;
+   if it keeps happening, the contract block needs a STOP-style
+   warning.
+
+2. **Two sources for ship-time state.** Both the banner and the
+   Beads issue carry "Phase 3 shipped at SHA." If they disagree,
+   who wins? The strategy says: Beads is authoritative for
+   runtime; the banner is archival; on disagreement, repair the
+   banner from Beads' record. The atomic shipping-commit pattern
+   prevents the gap, but a lapsed commit (only the banner, only
+   the bd close) creates one.
+
+3. **The fork in the skill tree.** Adding gc-mode to skills grows
+   maintenance burden with each new skill. If the divergence stays
+   at "skip 🚧 + force DAG extraction + force Phase 8 sync," it's
+   manageable. If it grows to 5+ deltas across multiple skills, a
+   single skill with a `mode: gc` parameter would be cleaner. Track
+   the delta count; reorganize if it crosses ~5.
+
+## What this strategy does NOT solve
+
+- **Builder competence.** No coordination tool catches semantically
+  bad work. Beads can dispatch the right next task; only
+  test-on-mainline catches whether a builder did the task well.
+- **Reviewer bottleneck.** N concurrent builders produce N pending
+  PRs. Reviewer throughput is the practical ceiling on parallelism,
+  and no orchestration substrate raises it.
+- **Cross-project coordination.** Each project has its own plans,
+  its own DAGs, its own Beads database. Coordinating work that
+  spans projects (e.g. a CVErt-Ops plan that depends on a
+  Gas-City plan) is out of scope here.
+
+## Related artifacts
+
+- `.claude/skills/writing-plans-enhanced/SKILL.md`: Living Document
+  Contract definition (Step 5).
+- `.claude/skills/plan-review-cycle/SKILL.md`: Adversarial plan
+  review, prerequisite to DAG extraction.
+- `.claude/skills/extracting-plan-dag/SKILL.md`: DAG extraction
+  methodology with gc / non-gc handling.
+- `dev/plans/2026-03-10-phase9-health-review-remediation-dag.md`:
+  Worked example of a DAG produced retrospectively against the
+  methodology.
+
+## The bottom line
+
+Three tools, three layers, one direction of authority: plan → DAG →
+tracker. Banners stay; Beads adds atomic claim and queryable runtime
+state when the project's execution model warrants it. The gc /
+non-gc split is small and mechanical: detect once, drop the 🚧
+banner row, force DAG extraction and Phase 8 sync. Everything else
+is the same workflow on both.
+
+If a builder on a gc project has to think about Beads more than
+once a session, the strategy failed. If they ship a phase by
+updating one banner and running `bd close`, it succeeded.