Merged
203 changes: 178 additions & 25 deletions .claude/skills/issue-lifecycle/SKILL.md
@@ -53,48 +53,184 @@ systeminit/swamp#850":
```

5. **Generate an implementation plan**:

First, write a single YAML file (e.g. `/tmp/plan.yaml`) containing both
`steps` and `potentialChallenges` as top-level keys. The CLI only supports
one `--input-file` flag per invocation, and the file must be a YAML object
(not a bare array).

```yaml
# /tmp/plan.yaml
steps:
  - order: 1
    description: "Add the new schema types"
    files:
      - src/domain/schemas.ts
    risks: "May need migration"
  - order: 2
    description: "Update the service layer"
    files:
      - src/domain/service.ts
      - src/domain/service_test.ts
potentialChallenges:
  - "Backwards compatibility with existing data"
  - "Test coverage for edge cases"
```

Then run the method. The `--input` flags handle scalar values, and
`--input-file` handles the structured data:

```
swamp model method run issue-<N> plan \
--input summary="..." \
--input dddAnalysis="..." \
--input testingStrategy="..." \
--input-file /tmp/plan.yaml --json
```

6. **Show the plan to the human**, then **run adversarial review** (see below).

## Adversarial Plan Review

After **every** `plan` or `iterate` call, you MUST run an adversarial review
before the human can approve. This is automatic — do not skip it.

### Step 1: Challenge the plan

Re-read the plan, then critically evaluate it across these dimensions:

- **Architecture**: Does this follow DDD principles? Are domain boundaries
correct? Is this the right abstraction level? Are there better patterns?
- **Scope**: Is this doing too much or too little? Does it match the issue? Is
there scope creep? Are there unnecessary changes?
- **Risk**: Are all failure modes identified? What about edge cases, race
conditions, backwards compatibility? What could go wrong that isn't listed?
- **Testing**: Is the testing strategy sufficient? Are edge cases covered? Are
there integration test gaps? Is there over-reliance on unit tests?
- **Complexity**: Is this over-engineered? Could it be simpler? Are there
unnecessary abstractions or indirections?
- **Correctness**: Will this actually solve the problem? Are there logical gaps?
Does the approach match established patterns in the codebase?

### Step 2: Verify against the codebase

For each plan step, **read the actual code**:

- Read every file referenced in the plan steps — verify they exist
- Confirm functions, classes, and types exist where claimed
- Grep for related code the plan might have missed
- Check for conflicts with existing patterns or naming conventions
- Verify that proposed test paths and test patterns match the codebase
- Look for code that already does what a step proposes (duplication risk)

### Step 3: Record findings

Write findings to a YAML file (e.g. `/tmp/findings.yaml`) as a YAML object with
a `findings` key:

```yaml
# /tmp/findings.yaml
findings:
  - id: ADV-1
    severity: high
    category: architecture
    description: "The plan adds a new service but doesn't define its domain boundary"
  - id: ADV-2
    severity: medium
    category: testing
    description: "No integration tests planned for the new API endpoint"
```

Then record them:

```
swamp model method run issue-<N> adversarial_review \
--input-file /tmp/findings.yaml --json
```

Each finding must have:

- `id`: Sequential identifier (ADV-1, ADV-2, ...)
- `severity`: critical, high, medium, or low
- `category`: architecture, scope, risk, testing, complexity, or correctness
- `description`: Clear explanation of the concern

**Critical/high findings block approval.** Medium/low are shown as warnings.

### Step 4: Present to the human

Show findings grouped by severity:

1. Critical findings (blockers)
2. High findings (blockers)
3. Medium findings (warnings)
4. Low findings (warnings)

If there are blocking findings, say:

> "The adversarial review found N blocking issue(s) that need to be addressed
> before approval. Here's what needs to change: ..."

If no blocking findings, say:

> "Adversarial review passed — no blocking findings. N warning(s) noted. Ready
> for your approval when you are."
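For presentation, findings can be bucketed by severity with blockers first. A minimal TypeScript sketch, assuming a simplified finding shape (`groupBySeverity` and `isBlocking` are illustrative names, not part of the skill):

```typescript
// Order findings for display: critical and high first, then warnings.
type Severity = "critical" | "high" | "medium" | "low";

interface Finding {
  id: string;
  severity: Severity;
  description: string;
}

const SEVERITY_ORDER: Severity[] = ["critical", "high", "medium", "low"];

// Bucket findings by severity, preserving the display order of buckets.
function groupBySeverity(findings: Finding[]): Map<Severity, Finding[]> {
  const groups = new Map<Severity, Finding[]>();
  for (const sev of SEVERITY_ORDER) groups.set(sev, []);
  for (const f of findings) groups.get(f.severity)!.push(f);
  return groups;
}

// Critical and high findings block approval; medium and low are warnings.
function isBlocking(f: Finding): boolean {
  return f.severity === "critical" || f.severity === "high";
}
```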

## Plan Iteration Loop

When the human gives feedback OR adversarial findings need addressing:

1. **Call iterate** with both the feedback text and your revised plan. Write the
   revised `steps` and `potentialChallenges` to a YAML file (same format as the
   `plan` step above), then run:

```
swamp model method run issue-<N> iterate \
--input feedback="<human's feedback or adversarial findings>" \
--input summary="..." \
--input dddAnalysis="..." \
--input testingStrategy="..." \
--input-file /tmp/plan.yaml --json
```

2. **Resolve addressed findings**. Write resolutions to a YAML file:

```yaml
# /tmp/resolutions.yaml
resolutions:
  - findingId: ADV-1
    resolutionNote: "Added domain boundary definition in step 2"
  - findingId: ADV-2
    resolutionNote: "Added integration test step covering the new endpoint"
```

```
swamp model method run issue-<N> resolve_findings \
--input-file /tmp/resolutions.yaml --json
```

3. **Re-run adversarial review** on the new plan version. The review must be
current with the latest plan version — stale reviews block approval.

4. **Show what changed** between the previous and new plan version. Be specific:
   - "Reordered steps 2 and 3 based on your feedback"
   - "Added regression analysis for module X"
   - "Removed step 4 — you're right, that's already handled by..."

5. **Ask for feedback again.** Keep iterating until:
   - All critical/high adversarial findings are resolved, AND
   - The human says: "approve", "approved", "looks good", "ship it", "go",
     "LGTM"

6. Only then call `approve`:
```
swamp model method run issue-<N> approve --json
```

**The `approve` method will fail** if critical/high findings are unresolved or
if the adversarial review is stale (wrong plan version). This is enforced by the
`adversarial-review-clear` pre-flight check.
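The gate that `approve` enforces can be sketched as pure logic. This is an illustrative TypeScript model, not the actual pre-flight implementation; the field names follow the finding fields documented above:

```typescript
// Sketch of the `adversarial-review-clear` pre-flight check: approval is
// allowed only when the review covers the current plan version and every
// critical/high finding is resolved.
interface Finding {
  id: string;
  severity: "critical" | "high" | "medium" | "low";
  resolved: boolean;
}

interface Review {
  planVersion: number;
  findings: Finding[];
}

// Returns human-readable blockers; an empty array means approval may proceed.
function approvalBlockers(review: Review, currentPlanVersion: number): string[] {
  const blockers: string[] = [];
  if (review.planVersion !== currentPlanVersion) {
    blockers.push(
      `stale review: plan is v${currentPlanVersion}, review covers v${review.planVersion}`,
    );
  }
  for (const f of review.findings) {
    if ((f.severity === "critical" || f.severity === "high") && !f.resolved) {
      blockers.push(`unresolved ${f.severity} finding ${f.id}`);
    }
  }
  return blockers;
}
```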

## Implementation & CI Flow

After plan approval, when the human says to implement:
@@ -107,17 +243,31 @@ After plan approval, when the human says to implement:
--input prNumber=<N> --json
```

4. **Wait 3 minutes for CI to start.** Use `sleep 180` — CI takes at least this
long, so there's no point polling earlier.

5. **Poll for CI results.** Call `ci_status` and check if checks are still
pending. If pending, wait 60 seconds and retry. Keep polling until all checks
have completed (passed or failed). Do NOT hand control back to the human
during this wait — stay in the loop.

```
swamp model method run issue-<N> ci_status --json
```
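The wait-then-poll decision reduces to a single rule: retry while any check is still pending, report once all have completed. A TypeScript sketch under that simplification (the real `ci_status` output is richer than this shape):

```typescript
// Decide the next action in the CI polling loop from check states alone.
type CheckState = "pending" | "passed" | "failed";

function nextPollAction(checks: CheckState[]): "wait-and-retry" | "report" {
  // Keep polling while anything is still running; report once every check
  // has completed, whether it passed or failed.
  return checks.some((c) => c === "pending") ? "wait-and-retry" : "report";
}
```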

6. **Show the CI results** to the human, grouped by reviewer and severity:
   - Which checks passed/failed
   - Review comments grouped by reviewer
   - Comments sorted by severity (critical first)

7. **If everything is green and approved**, the PR will auto-merge. Call
`complete` immediately — no need to ask the human:
```
swamp model method run issue-<N> complete --json
```

8. **If there are failures or review comments**, present them and wait for the
human's direction. Parse their instruction:
   - "fix the CRITICAL issues from adversarial review" ->
     `targetReview: "claude-adversarial-review"`, `targetSeverity: "critical"`
   - "address all the review comments" -> no filters
@@ -130,11 +280,14 @@ After plan approval, when the human says to implement:
--input targetSeverity="<severity>" --json
```
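The mapping from human instruction to fix filters can be sketched as a small parser. The substring heuristics below are illustrative assumptions, not the skill's actual parsing logic:

```typescript
// Derive optional fix filters from a free-form human instruction.
interface FixTarget {
  targetReview?: string;
  targetSeverity?: string;
}

function parseFixDirective(text: string): FixTarget {
  const target: FixTarget = {};
  const lower = text.toLowerCase();
  // Take the most severe level mentioned in the instruction, if any.
  for (const sev of ["critical", "high", "medium", "low"]) {
    if (lower.includes(sev)) {
      target.targetSeverity = sev;
      break;
    }
  }
  // "adversarial" scopes the fix to the adversarial review's comments.
  if (lower.includes("adversarial")) {
    target.targetReview = "claude-adversarial-review";
  }
  return target;
}
```

An instruction like "address all the review comments" mentions neither a severity nor a reviewer, so it yields no filters.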

9. **After pushing fixes, loop back to step 4.** Wait 3 minutes, poll for CI,
show results. Repeat until clean or the human says to stop.

**IMPORTANT:** Do not break out of this loop voluntarily. The human should never
have to manually check CI or come back to ask "what happened?" — the skill stays
in the conversation and drives through to completion. If the human wants to
break out (e.g. "I'll come back to this later", "pause", "stop"), respect that
immediately — but it must be their decision, never yours.

## Reviewing Plan History

28 changes: 28 additions & 0 deletions extensions/models/_lib/schemas.ts
@@ -61,6 +61,8 @@ export const TRANSITIONS: Record<string, Phase[]> = {
  approve: ["plan_generated"],
  implement: ["approved"],
  ci_status: ["implementing"],
  adversarial_review: ["plan_generated"],
  resolve_findings: ["plan_generated"],
  fix: ["ci_review"],
  complete: ["ci_review"],
};
@@ -161,6 +163,32 @@ export const CiResultsSchema = z.object({
  fetchedAt: z.string(),
});

export const AdversarialFindingSchema = z.object({
  id: z.string().describe("Unique finding identifier, e.g. ADV-1"),
  severity: z.enum(["critical", "high", "medium", "low"]),
  category: z.enum([
    "architecture",
    "scope",
    "risk",
    "testing",
    "complexity",
    "correctness",
  ]),
  description: z.string(),
  resolved: z.boolean().default(false),
  resolutionNote: z.string().optional(),
});

export const AdversarialReviewSchema = z.object({
  planVersion: z.number().describe(
    "The plan version this review applies to",
  ),
  findings: z.array(AdversarialFindingSchema),
  reviewedAt: z.string(),
});

export type AdversarialReviewData = z.infer<typeof AdversarialReviewSchema>;

export const FixDirectiveSchema = z.object({
  round: z.number(),
  directive: z.string(),