Improve validation loops and next-task skill

etr · etr · commit 05407ce42afa · 2026-02-27T16:08:06.000-08:00
- Skip irrelevant agents in validation-loop and task-validation-loop
  when their review subjects don't exist (e.g., no design system)
- Re-run only failed agents on retry instead of all 8, with domain
  spillover detection for cross-cutting fixes
- Clarify next-task handoff to execute-task with explicit task-id format
- Bump version to 1.0.4
diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
   "name": "groundwork",
   "description": "Comprehensive skills library for Claude Code: planning, design, TDD, debugging, collaboration patterns, and proven techniques",
-  "version": "1.0.3",
+  "version": "1.0.4",
   "author": {
     "name": "Groundwork Contributors"
   },
diff --git a/skills/next-task/SKILL.md b/skills/next-task/SKILL.md
@@ -62,7 +62,8 @@ Parse the tasks to find the next task to work on:
 
 ### Step 4: Delegate to Execute Task
 
-Once a task is identified, **you MUST call the Skill tool:**
-  `Skill(skill="groundwork:execute-task", args="TASK-NNN")`
+**You MUST call the Skill tool now:** `Skill(skill="groundwork:execute-task", args="<task-id>")`
 
-Do NOT load project context, present summaries, or begin task execution yourself. The execute-task skill handles the complete workflow. Call it NOW with the identified task identifier.
+Replace `<task-id>` with the identified task identifier (e.g., `TASK-004`).
+
+Do NOT load project context, explore the codebase, present summaries, or begin task execution yourself. The execute-task skill handles the complete workflow including worktree setup, TDD, and validation.
diff --git a/skills/task-validation-loop/SKILL.md b/skills/task-validation-loop/SKILL.md
@@ -31,6 +31,16 @@ design_system ← Read specs/design_system.md (if exists, optional)
 
 **Detection:** Check for file first (takes precedence), then directory. When reading a directory, aggregate all `.md` files recursively.
 
+### 1.5. Determine Active Agents
+
+| Agent | Skip when |
+|---|---|
+| `design-task-alignment-checker` | No `design_system` found AND no UI/frontend tasks in task list |
+
+`prd-task-alignment-checker` and `architecture-task-alignment-checker` always run (their inputs are prerequisites).
+
+Record skipped agents with verdict `skipped`.
+
 ### 2. Launch Validation Agents
 
 Use Task tool to launch all 3 agents in parallel:
@@ -93,7 +103,12 @@ Present results in table format:
    - **accessibility-missing**: Add acceptance criteria to task
    - **over-tasked**: Remove task or add requirement to PRD (user decision)
 
-3. **Re-run Agent Validation** - Launch all 3 agents again with updated task list
+3. **Re-run Agent Validation** — Re-launch ONLY agents that returned `request-changes`. Agents that approved retain their verdict unless the fix changed content in their domain:
+   - **PRD alignment checker**: re-run if tasks were added/removed or requirements mapping changed
+   - **Architecture alignment checker**: re-run if component assignments or technology references changed
+   - **Design alignment checker**: re-run if accessibility criteria or design token references changed
+
+   For agents NOT re-run, carry forward their previous `approve` verdict and score.
 
 4. **Check Results**
    - ALL approve → **PASS**, return success
diff --git a/skills/validation-loop/SKILL.md b/skills/validation-loop/SKILL.md
@@ -30,6 +30,25 @@ Collect for the agents:
 - Architecture path: path to `specs/architecture.md` or `specs/architecture/` (do NOT read contents)
 - Design system path: path to `specs/design_system.md` or `specs/design_system/` (do NOT read contents)
 
+### 1.5. Determine Active Agents
+
+Based on context gathered, skip agents whose primary review subject does not exist:
+
+| Agent | Skip when |
+|---|---|
+| `design-consistency-checker` | No `design_system_path` AND no CSS/styling files in `changed_file_paths` |
+| `spec-alignment-checker` | No `specs_path` found |
+| `architecture-alignment-checker` | No `architecture_path` found |
+
+**Always run** regardless of context:
+- `code-quality-reviewer` — always applicable to code changes
+- `security-reviewer` — always applicable to code changes
+- `code-simplifier` — always applicable to code changes
+- `performance-reviewer` — always applicable to code changes
+- `housekeeper` — handles missing paths gracefully, still checks task status
+
+Record skipped agents in the aggregation table with verdict `skipped` and a note explaining why.
+
 ### 2. Launch Verification Agents
 
 Use Task tool to launch all 8 agents in parallel:
@@ -98,7 +117,25 @@ Each returns JSON:
    - Run tests - must pass
    - Confirm acceptance criteria
 
-4. **Re-run Agent Validation** - Launch all 8 agents again
+4. **Re-run Agent Validation** — Re-launch ONLY agents that returned `request-changes` in the previous iteration.
+
+   **Domain spillover**: If a fix modified code relevant to an agent that previously approved, re-run that agent too:
+
+   | Fix touches... | Also re-run |
+   |---|---|
+   | Auth, crypto, input validation | security-reviewer |
+   | Layer boundaries, component structure | architecture-alignment-checker |
+   | CSS, design tokens, accessibility | design-consistency-checker |
+   | Spec/requirement behavior | spec-alignment-checker |
+   | Test files | code-quality-reviewer |
+   | Task status, docs, spec files | housekeeper |
+   | Hot paths, algorithmic changes | performance-reviewer |
+   | Code structure, naming | code-simplifier |
+
+   **When in doubt, re-run.** False passes are worse than extra agent runs.
+
+   For agents NOT re-run, carry forward their previous `approve` verdict and score into the aggregation table.
+
    - Do NOT re-read updated files into the orchestrator context — agents will re-read the updated files themselves
    - Only update `changed_file_paths` or `diff_stat` if the set of changed files has changed
 
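The two changes to `validation-loop` (skipping agents whose review subject is missing, and re-running only failed agents plus spillover agents) can be sketched roughly as below. This is an illustrative sketch, not the plugin's implementation: the skill is prose instructions for the orchestrator, and the function names, the domain keys in `SPILLOVER`, and the set of styling file suffixes are all hypothetical. Treating `skipped` agents as staying skipped on retry is also an assumption; the diff only describes re-running agents that previously approved.

```python
from pathlib import PurePosixPath

# Agents the diff lists as always running.
ALWAYS_RUN = [
    "code-quality-reviewer",
    "security-reviewer",
    "code-simplifier",
    "performance-reviewer",
    "housekeeper",
]

# Assumption: which suffixes count as "CSS/styling files".
STYLE_SUFFIXES = {".css", ".scss", ".less"}

# Hypothetical domain keys, one per row of the spillover table.
SPILLOVER = {
    "security": "security-reviewer",
    "architecture": "architecture-alignment-checker",
    "design": "design-consistency-checker",
    "spec": "spec-alignment-checker",
    "tests": "code-quality-reviewer",
    "housekeeping": "housekeeper",
    "performance": "performance-reviewer",
    "structure": "code-simplifier",
}


def select_active_agents(design_system_path, specs_path,
                         architecture_path, changed_file_paths):
    """Return (active agent names, {skipped agent: reason})."""
    touches_styles = any(
        PurePosixPath(p).suffix in STYLE_SUFFIXES for p in changed_file_paths
    )
    skipped = {}
    if not design_system_path and not touches_styles:
        skipped["design-consistency-checker"] = "no design system, no styling changes"
    if not specs_path:
        skipped["spec-alignment-checker"] = "no specs path found"
    if not architecture_path:
        skipped["architecture-alignment-checker"] = "no architecture path found"

    conditional = [
        "design-consistency-checker",
        "spec-alignment-checker",
        "architecture-alignment-checker",
    ]
    active = ALWAYS_RUN + [a for a in conditional if a not in skipped]
    return active, skipped


def agents_to_rerun(previous_verdicts, fix_domains):
    """Failed agents, plus approved agents whose domain the fix touched."""
    failed = {a for a, v in previous_verdicts.items() if v == "request-changes"}
    spillover = {
        SPILLOVER[d] for d in fix_domains
        if d in SPILLOVER and previous_verdicts.get(SPILLOVER[d]) == "approve"
    }
    return failed | spillover
```

Agents not returned by `agents_to_rerun` would keep their previous `approve` verdict and score in the aggregation table. Per the "when in doubt, re-run" rule, a caller unsure which domains a fix touched should pass more domain keys rather than fewer.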
