Merged
72 changes: 54 additions & 18 deletions skills/alignment-check/SKILL.md
@@ -25,33 +25,62 @@ Invoked automatically by `writing-plans` in autonomous mode. Can also be invoked

## Dispatching the Alignment Agent

Dispatch a Sonnet agent to perform the comparison:
Dispatch a `balanced`-tier subagent to verify alignment. The subagent reads both documents and produces an Alignment Report:

**Input:**
- Design document: `docs/plans/YYYY-MM-DD-<topic>-design.md`
- Implementation plan: `docs/plans/YYYY-MM-DD-<feature>.md`

**Forward trace (design → plan):**
For each requirement in the design:
- Find the plan task(s) that implement it
- If no task covers it: flag as MISSING

**Reverse trace (plan → design):**
For each task in the plan:
- Find the design requirement it satisfies
- If no requirement justifies it: flag as SCOPE CREEP

**Report format:**

### Alignment Report

**Status:** PASS | FAIL

**Coverage:**
| Design Requirement | Plan Task(s) | Status |
|---|---|---|
| [requirement] | Task N | ✅ Covered |
| [requirement] | — | ❌ MISSING |

**Scope Check:**
| Plan Task | Design Requirement | Status |
|---|---|---|
| Task N | [requirement] | ✅ Justified |
| Task N | — | ⚠️ SCOPE CREEP |

**Drift Items:** [list specific items to fix]
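
As a rough illustration of the two traces, the sketch below models the design as a list of item IDs and the plan as a mapping from task name to the design items it claims to satisfy. The IDs (`R1`, `Task 1`) and the `trace_alignment` helper are hypothetical; the skill itself delegates this comparison to a subagent that reads both documents, not to code.

```python
def trace_alignment(design_items, task_links):
    """Run both traces and return (missing, scope_creep).

    design_items: list of design item IDs (hypothetical, e.g. "R1").
    task_links: dict mapping each plan task to the design item IDs it cites.
    """
    known = set(design_items)
    covered = set()
    scope_creep = []

    # Reverse trace (plan -> design): each task needs at least one real design item.
    for task, items in task_links.items():
        justified = [item for item in items if item in known]
        if justified:
            covered.update(justified)
        else:
            scope_creep.append(task)

    # Forward trace (design -> plan): each design item needs at least one task.
    missing = [item for item in design_items if item not in covered]
    return missing, scope_creep


missing, scope_creep = trace_alignment(
    design_items=["R1", "R2", "R3"],
    task_links={"Task 1": ["R1"], "Task 2": ["R1", "R2"], "Task 3": []},
)
print(missing)      # ['R3']
print(scope_creep)  # ['Task 3']
```

Here `R3` would be flagged MISSING and `Task 3` would be flagged SCOPE CREEP.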

<host: claude-code>
Dispatch using the Agent tool:

```
Agent tool (general-purpose, model: sonnet):
Agent tool (general-purpose, model: balanced):
description: "Check alignment: design vs plan"
prompt: |
You are verifying that an implementation plan aligns with its design document.

## Design Document
[Read: docs/plans/YYYY-MM-DD-<topic>-design.md]

## Implementation Plan
[Read: docs/plans/YYYY-MM-DD-<feature>.md]
Read docs/plans/YYYY-MM-DD-<topic>-design.md and docs/plans/YYYY-MM-DD-<feature>.md.

## Your Job
Perform a forward trace (design → plan):
- For each requirement, constraint, and acceptance criterion in the design, find the plan task(s) that implement it.
- If no plan task covers a design item, flag it as MISSING.

**Forward trace (design → plan):**
For each requirement in the design:
- Find the plan task(s) that implement it
- If no task covers it: flag as MISSING
Perform a reverse trace (plan → design):
- For each task in the implementation plan, find the design requirement, constraint, or acceptance criterion it satisfies.
- If no design item justifies a plan task, flag it as SCOPE CREEP.

**Reverse trace (plan → design):**
For each task in the plan:
- Find the design requirement it satisfies
- If no requirement justifies it: flag as SCOPE CREEP

**Report format:**
Return exactly this report format:

### Alignment Report

@@ -70,7 +99,14 @@ Agent tool (general-purpose, model: sonnet):
| Task N | — | ⚠️ SCOPE CREEP |

**Drift Items:** [list specific items to fix]

Set **Status:** to PASS only if every design item is covered and every plan task is justified. Otherwise set it to FAIL.
```
</host>
Comment on lines 64 to +105

Copilot AI Apr 25, 2026

This section is now mostly host-neutral, but the actual “dispatch” mechanism is only described for <host: claude-code>. For portability, add a corresponding block for other supported hosts (e.g., codex/opencode/cursor) describing how to run the alignment check as a separate subagent/thread (or explicitly state that the procedure can be executed inline without spawning a subagent).

<host: codex, opencode, cursor>
Run the alignment check inline: read both documents, perform the forward and reverse traces using the Comparison Procedure above, and produce the Alignment Report.
</host>
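
One way to picture the expected output is a small renderer that turns trace results into the report format above. This is a sketch under assumptions: `render_alignment_report`, its argument shapes, and the wording of the drift list are illustrative and not part of the skill.

```python
def render_alignment_report(coverage, scope):
    """Render trace results as Markdown.

    coverage: list of (design requirement, [plan tasks]) pairs (forward trace).
    scope: list of (plan task, design requirement or None) pairs (reverse trace).
    """
    missing = [req for req, tasks in coverage if not tasks]
    creep = [task for task, req in scope if req is None]
    # PASS only if every design item is covered and every plan task is justified.
    status = "PASS" if not missing and not creep else "FAIL"

    lines = [
        "### Alignment Report",
        "",
        f"**Status:** {status}",
        "",
        "**Coverage:**",
        "| Design Requirement | Plan Task(s) | Status |",
        "|---|---|---|",
    ]
    for req, tasks in coverage:
        if tasks:
            lines.append(f"| {req} | {', '.join(tasks)} | ✅ Covered |")
        else:
            # Empty-cell placeholder copied from the report format.
            lines.append(f"| {req} | — | ❌ MISSING |")

    lines += ["", "**Scope Check:**",
              "| Plan Task | Design Requirement | Status |", "|---|---|---|"]
    for task, req in scope:
        if req is not None:
            lines.append(f"| {task} | {req} | ✅ Justified |")
        else:
            lines.append(f"| {task} | — | ⚠️ SCOPE CREEP |")

    # Drift wording is an assumption; the skill only asks for specific items to fix.
    drift = [f"add a task for: {r}" for r in missing]
    drift += [f"justify or remove: {t}" for t in creep]
    lines += ["", "**Drift Items:** " + ("; ".join(drift) if drift else "none")]
    return "\n".join(lines)


report = render_alignment_report(
    coverage=[("Login form", ["Task 1"]), ("Rate limiting", [])],
    scope=[("Task 1", "Login form"), ("Task 2", None)],
)
print(report)
```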

## On FAIL

21 changes: 15 additions & 6 deletions skills/brainstorming/SKILL.md
@@ -9,7 +9,7 @@ description: "You MUST use this before any creative work - creating features, bu

Help turn ideas into fully formed designs and specs through natural collaborative dialogue.

Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design and get user approval.
Start by understanding the current project context, then ask questions using adaptive batching to refine the idea. Once you understand what you're building, present the design and get user approval.

<HARD-GATE>
Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity.
@@ -24,7 +24,7 @@ Every project goes through this process. A todo list, a single-function utility,
You MUST create a task for each of these items and complete them in order:

1. **Explore project context** — check files, docs, recent commits
2. **Ask clarifying questions** — adaptive batching: group 2-4 related questions per form, follow up with targeted singles
2. **Ask clarifying questions** — adaptive batching: group related questions to reduce round-trips; use targeted singles for follow-ups
3. **Propose 2-3 approaches** — with trade-offs and your recommendation
Comment on lines 26 to 28

Copilot AI Apr 25, 2026

The updated guidance emphasizes batching multiple questions per turn, but the Overview still says to “ask questions one at a time” (now conflicting with this checklist/process section). Consider updating wording to consistently describe small-batch questioning throughout the skill to avoid confusing operators.
4. **Present design** — in sections scaled to their complexity, get user approval after each section
5. **Write design doc** — save to `docs/plans/YYYY-MM-DD-<topic>-design.md` and commit
@@ -58,11 +58,20 @@ digraph brainstorming {

**Understanding the idea:**
- Check out the current project state first (files, docs, recent commits)
- Ask questions using adaptive batching with AskUserQuestion:
- **First form:** Group 2-4 related questions covering purpose, constraints, scope, and tech choices
- **Follow-ups:** Targeted single questions based on interesting or ambiguous answers from previous forms
- Ask questions using adaptive batching — group related questions to reduce round-trips:
- **First batch:** covers purpose, constraints, scope, and tech choices
- **Follow-ups:** Targeted single questions based on interesting or ambiguous answers
Comment on lines +61 to +63

Copilot AI Apr 25, 2026

The updated batching guidance (“First batch: 2–3 questions…”) conflicts with the earlier guidance in this same skill that says to group 2–4 questions per form. Please align these numbers (or explicitly explain why the first batch is narrower than later batches) so the instructions don’t contradict each other.
Comment on lines +61 to +63

Copilot AI Apr 25, 2026

This section now recommends a 2–3 question first batch, but the earlier checklist still says “group 2–4 related questions per form”. Please align these to a single rule so the guidance is consistent.

<host: claude-code>
- Use multiple choice options when possible (AskUserQuestion supports 2-4 options per question)
- AskUserQuestion supports up to 4 questions per form — use this to reduce round-trips
</host>

<host: codex, opencode, cursor>
- Present options as a numbered list and ask the user to reply with the chosen number
- Group no more than 3 questions per turn to avoid overloading the chat
Comment on lines +61 to +72

Copilot AI Apr 25, 2026

The host-neutral guidance says the first batch is “2–4 questions”, but the <host: codex, opencode, cursor> block then caps grouping at 3 questions per turn. On those hosts this is contradictory (a “first batch” of 4 would violate the per-turn cap). Consider making the batch-size guidance consistent by either adjusting the generic text (e.g., 2–3) or moving the batch-size guidance into per-host blocks.
</host>

- Focus on understanding: purpose, constraints, success criteria
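
The per-host caps can be sketched as a small helper. The limits (up to 4 questions per `AskUserQuestion` form, no more than 3 per turn on chat hosts) come from the host blocks above; the function names and the choice to spill overflow into an extra turn are assumptions made for illustration.

```python
# Per-turn caps taken from the host blocks; the dict itself is illustrative.
MAX_QUESTIONS_PER_TURN = {"claude-code": 4, "codex": 3, "opencode": 3, "cursor": 3}


def first_batch(questions, host):
    """Split the opening questions into turns that respect the host's cap."""
    cap = MAX_QUESTIONS_PER_TURN[host]
    return [questions[i:i + cap] for i in range(0, len(questions), cap)]


def format_numbered_options(question, options):
    """Chat-host style: numbered options the user answers by number."""
    lines = [question]
    lines += [f"{n}. {option}" for n, option in enumerate(options, start=1)]
    return "\n".join(lines)


batches = first_batch(["purpose", "constraints", "scope", "tech choices"], host="cursor")
print(batches)  # [['purpose', 'constraints', 'scope'], ['tech choices']]
print(format_numbered_options("Which storage layer?", ["SQLite", "Postgres"]))
```

On `claude-code` the same four questions would fit in a single form.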

**Exploring approaches:**
@@ -106,7 +115,7 @@ When the user wants design exploration without execution, they pass `--design-on

## Key Principles

- **Adaptive question batching** - Group 2-4 related questions per form, follow up with targeted singles
- **Adaptive question batching** - Group related questions to reduce round-trips; use targeted singles for follow-ups
- **Multiple choice preferred** - Easier to answer than open-ended when possible
- **YAGNI ruthlessly** - Remove unnecessary features from all designs
- **Explore alternatives** - Always propose 2-3 approaches before settling
27 changes: 21 additions & 6 deletions skills/pr-monitoring/SKILL.md
@@ -17,10 +17,13 @@ Invoked automatically by `finishing-a-development-branch` in autonomous mode aft

## The Process

Spawn a background agent that monitors the PR in a loop:
Run a `balanced`-tier agent that monitors the PR in a loop until all CI checks pass and no unresolved reviews remain.
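
The shape of that loop can be sketched as follows. This is an illustration under assumptions, not the skill's implementation: `fetch_state` and `handle` are hypothetical callables (in practice they might wrap `gh` calls), the state dictionary keys are invented for the example, `review_states` is assumed to hold each reviewer's latest review state, and `max_cycles` stands in for whatever the Safety Limits section specifies.

```python
import time

POLL_INTERVAL_SECONDS = 60  # the skill says not to poll more often than this


def is_done(state):
    """Exit only when CI is green and nothing in review is outstanding."""
    return (
        not state["failing_checks"]
        and not state["unresolved_comments"]
        and "CHANGES_REQUESTED" not in state["review_states"]
    )


def monitor(fetch_state, handle, max_cycles, sleep=time.sleep):
    """Poll until is_done() or max_cycles is hit; return True on a clean exit."""
    for _ in range(max_cycles):
        state = fetch_state()
        if is_done(state):
            return True
        handle(state)                 # fix CI failures, answer review feedback
        sleep(POLL_INTERVAL_SECONDS)  # wait between check cycles
    return False


# Simulated run: one failing cycle, then a clean one. The sleep is stubbed out.
states = iter([
    {"failing_checks": ["lint"], "unresolved_comments": [], "review_states": []},
    {"failing_checks": [], "unresolved_comments": [], "review_states": ["APPROVED"]},
])
waits = []
done = monitor(lambda: next(states), handle=lambda s: None, max_cycles=5, sleep=waits.append)
print(done, waits)  # True [60]
```

Injecting `sleep` keeps the loop testable without waiting a real minute per cycle.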

```
Agent tool (general-purpose, model: sonnet, run_in_background: true):
<host: claude-code>
Use the Agent tool to run the monitor in the background:

````
Agent tool (general-purpose, model: balanced, run_in_background: true):
Comment on lines +22 to +26

Copilot AI Apr 25, 2026

The new outer fenced code block (at line 23) wraps the entire Agent prompt, but the prompt itself contains fenced code blocks like ```bash. Markdown doesn’t support nested triple-backtick fences, so the inner ```bash will terminate the outer block early and break rendering for the rest of the skill. Use a different outer fence (e.g., 4 backticks) or convert the inner shell snippets to indented code blocks so the Agent tool block can contain them safely.
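
The fence rule described in this comment can be checked mechanically. In CommonMark, a closing fence must use the same character as the opening fence and be at least as long, so an outer fence longer than any backtick run in the content cannot be closed early. A minimal sketch, with `wrap_in_fence` as a hypothetical helper name; it conservatively counts every backtick run, not only those at the start of a line:

```python
import re


def wrap_in_fence(content, info=""):
    """Wrap content in a backtick fence that its inner fences cannot close."""
    runs = re.findall(r"`+", content)
    longest = max((len(run) for run in runs), default=0)
    fence = "`" * max(3, longest + 1)
    return f"{fence}{info}\n{content}\n{fence}"


# The inner fence is built programmatically so this example holds no literal fence.
TICKS = "`" * 3
prompt = f"Run:\n{TICKS}bash\ngh pr checks <number>\n{TICKS}"
wrapped = wrap_in_fence(prompt)
print(len(wrapped.splitlines()[0]))  # 4
```

Content with no backticks still gets the ordinary three-backtick fence.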
description: "Monitor PR #N for CI and reviews"
prompt: |
You are monitoring PR #<number> on <repo> and automatically fixing issues.
@@ -33,9 +36,7 @@ Agent tool (general-purpose, model: sonnet, run_in_background: true):
Design doc: <path>
Plan doc: <path>

## Monitor Loop

Repeat until exit conditions met:
Repeat the Monitor Loop until exit conditions are met:

### 1. Check CI Status
Comment on lines +39 to 41

Copilot AI Apr 25, 2026

The Monitor Loop instructions are duplicated: they’re fully embedded inside the Agent tool prompt and then repeated again as a standalone “## Monitor Loop” section below. This makes the skill easy to get out of sync—consider keeping the loop in one place (e.g., keep the standalone section and have the Agent prompt reference it, or vice versa).

@@ -88,6 +89,20 @@ Agent tool (general-purpose, model: sonnet, run_in_background: true):
### 4. Wait Between Checks

Sleep 60 seconds between check cycles. Do not poll more frequently.
````
</host>

<host: codex, opencode, cursor>

Use your host's equivalent mechanism to periodically poll the following in a loop:
- `gh pr checks <number>` — fix any failing CI checks
- `gh api repos/<owner>/<repo>/pulls/<number>/comments` — respond to inline review comments
- `gh api repos/<owner>/<repo>/pulls/<number>/reviews` — handle any "CHANGES_REQUESTED" reviews

Continue until all checks pass, no unresolved inline comments remain, and no "changes requested" reviews are pending.
Comment on lines +97 to +102

Copilot AI Apr 25, 2026

The non-Claude fallback loop only mentions polling PR checks and review comments, but the main Monitor Loop also checks PR reviews for CHANGES_REQUESTED (and the exit conditions depend on it). This fallback can miss requested-changes reviews and incorrectly decide it’s “done”; include equivalent review polling (e.g., the reviews endpoint) and reflect the same exit conditions.

Suggested change
Use your host's equivalent mechanism to periodically poll the following in a loop:
- `gh pr checks <number>` — fix any failing CI checks
- `gh api repos/<owner>/<repo>/pulls/<number>/comments` — respond to inline review comments
- `gh api repos/<owner>/<repo>/pulls/<number>/reviews` — handle any "CHANGES_REQUESTED" reviews
Continue until all checks pass, no unresolved inline comments remain, and no "changes requested" reviews are pending.
Use your host's equivalent mechanism to periodically poll the following in a loop (sleep 60 seconds between cycles; do not poll more frequently):
- `gh pr checks <number>` — inspect all CI checks and fix any failures
- `gh api repos/<owner>/<repo>/pulls/<number>/comments` — inspect and respond to inline review comments
- `gh api repos/<owner>/<repo>/pulls/<number>/reviews` — inspect PR review states separately from comments and handle any reviews with state `"CHANGES_REQUESTED"`
Only exit when all of the following are true: all checks pass, no unresolved inline comments remain, and there are no pending PR reviews whose state is `"CHANGES_REQUESTED"`.

</host>


## Safety Limits
