- Metric direction (`lower_is_better` or `higher_is_better`)
- Iteration budget (default: 20)

**Batch 2** (scope and constraints — 3 questions):
- Scope (files/directories to modify)
- Files to NEVER modify (test files, guard files, config)
- Starting approach (optional — first idea to try)
3. Dry-run: Execute the metric command once to establish baseline. Execute the guard command to confirm it passes. If either fails, ask the user to fix before proceeding.
4. Show the proposed loop configuration and confirm with user
5. Exit Plan Mode via ExitPlanMode
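
The dry-run in step 3 and the metric direction collected during setup can be sketched in Python as follows. This is a minimal illustration, not the tool's implementation: the helper names `dry_run` and `is_improvement` are hypothetical, and the sketch assumes the metric command prints a number on the last line of stdout and that both commands signal success with exit code 0.

```python
import subprocess


def dry_run(metric_cmd: list[str], guard_cmd: list[str]) -> float:
    """Run the metric once for a baseline, then confirm the guard passes.

    Assumptions (not from the source): the metric is the last line of the
    command's stdout, and both commands exit with 0 on success.
    """
    metric = subprocess.run(metric_cmd, capture_output=True, text=True)
    lines = metric.stdout.strip().splitlines()
    if metric.returncode != 0 or not lines:
        raise RuntimeError(f"metric command failed: {metric.stderr.strip()}")
    baseline = float(lines[-1])  # raises ValueError if the line is not numeric

    guard = subprocess.run(guard_cmd, capture_output=True, text=True)
    if guard.returncode != 0:
        raise RuntimeError(f"guard command failed: {guard.stderr.strip()}")
    return baseline


def is_improvement(candidate: float, best: float, direction: str) -> bool:
    """Compare two metric values according to the configured direction."""
    if direction == "lower_is_better":
        return candidate < best
    if direction == "higher_is_better":
        return candidate > best
    raise ValueError(f"unknown metric direction: {direction!r}")
```

If either command fails, the raised error is the point at which the user would be asked to fix the setup before the loop starts.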

**Phase 2 — Optimization Loop**

@@ -46,7 +52,7 @@ Run the 8-phase autoresearch loop, one iteration at a time:
6. **Guard** — run the guard command to check for regressions
When `verification.strict_mode` is enabled in the project config, run an additional two-stage sequential review after the parallel agents complete. Each stage uses a fresh verifier subagent to prevent anchoring bias.

**Stage 1 — Spec Compliance:**

Spawn a fresh verifier agent:
```
Agent(
  subagent_type="Explore",
  model="{verifier_model}",
  prompt="
    You are performing a spec compliance review for phase {phase_number}: {phase_name}.

    Read the phase requirements from GitHub Issue #{phase_issue_number}.
    Read all files modified in this phase.

    For EACH requirement listed in the issue, verify it is implemented with evidence:

    CLAIM: Requirement [ID] — [description]
    EVIDENCE: [file:line or command]
    OUTPUT: [actual result observed]
    VERDICT: PASS | FAIL — [reason]

    End with: SPEC COMPLIANCE: PASS or SPEC COMPLIANCE: FAIL — [list of unmet requirements]
  "
)
```

Wait for Stage 1 to complete. If it fails, include the failures in the final report.
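
To carry Stage 1 failures into the final report, the verifier's output has to be read back in some structured way. The following Python sketch shows one possible parser for the CLAIM / EVIDENCE / OUTPUT / VERDICT layout requested in the prompt. The names `Finding` and `parse_report` are hypothetical, and the sketch assumes the verifier follows the requested layout exactly, which a real agent may not always do.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    claim: str
    evidence: str
    output: str
    passed: bool
    reason: str


# One finding is four consecutive labelled lines, as requested in the prompt.
FINDING_RE = re.compile(
    r"CLAIM:\s*(?P<claim>.+?)\s*\n"
    r"\s*EVIDENCE:\s*(?P<evidence>.+?)\s*\n"
    r"\s*OUTPUT:\s*(?P<output>.+?)\s*\n"
    r"\s*VERDICT:\s*(?P<verdict>PASS|FAIL)\b[ \t]*(?:[\u2014-][ \t]*)?(?P<reason>[^\n]*)"
)
# The closing summary line of either stage.
SUMMARY_RE = re.compile(
    r"^\s*(SPEC COMPLIANCE|CODE QUALITY):\s*(PASS|FAIL)\b", re.MULTILINE
)


def parse_report(text: str) -> tuple[list[Finding], bool]:
    """Return the individual findings and the overall PASS/FAIL of a report."""
    findings = [
        Finding(
            claim=m["claim"],
            evidence=m["evidence"],
            output=m["output"],
            passed=m["verdict"] == "PASS",
            reason=m["reason"].strip(),
        )
        for m in FINDING_RE.finditer(text)
    ]
    summaries = SUMMARY_RE.findall(text)
    if not summaries:
        raise ValueError("report has no final PASS/FAIL summary line")
    return findings, summaries[-1][1] == "PASS"
```

The failed findings (`[f for f in findings if not f.passed]`) are what would be copied into the final report.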

**Stage 2 — Code Quality (fresh subagent):**

Spawn a NEW verifier agent (do NOT reuse the Stage 1 agent):
```
Agent(
  subagent_type="Explore",
  model="{verifier_model}",
  prompt="
    You are performing a code quality deep review for phase {phase_number}: {phase_name}.

    Context: Spec compliance review has already been completed.
    Read all files modified in this phase.

    Focus on implementation quality beyond spec compliance:
    - Architecture and design pattern adherence
    - Error handling completeness
    - Edge case coverage
    - Code maintainability and clarity
    - No dead code, no unnecessary complexity

    For each finding:
    CLAIM: [what was checked]
    EVIDENCE: [file:line]
    OUTPUT: [observed behavior or code pattern]
    VERDICT: PASS | FAIL — [reason]

    End with: CODE QUALITY: PASS or CODE QUALITY: FAIL — [issues found]
  "
)
```
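
The two stages above run one after the other, each with its own verifier. A minimal Python sketch of that control flow follows. `spawn_verifier` is a hypothetical factory standing in for the `Agent(...)` call, and the sketch assumes Stage 2 still runs when Stage 1 fails, since the text only asks for Stage 1 failures to be reported.

```python
from typing import Callable

# Hypothetical: a verifier takes a prompt and returns its full text report.
Verifier = Callable[[str], str]


def stage_passed(report: str, marker: str) -> bool:
    """Read the last '<marker>: PASS|FAIL' line of a report."""
    for line in reversed(report.splitlines()):
        line = line.strip()
        if line.startswith(marker + ":"):
            return line[len(marker) + 1:].strip().startswith("PASS")
    return False  # assumption: a missing summary line counts as a failure


def run_strict_review(
    spawn_verifier: Callable[[], Verifier],
    stage1_prompt: str,
    stage2_prompt: str,
) -> dict[str, dict]:
    """Run spec compliance, then code quality, with a fresh verifier each."""
    stages = (
        ("spec_compliance", "SPEC COMPLIANCE", stage1_prompt),
        ("code_quality", "CODE QUALITY", stage2_prompt),
    )
    results: dict[str, dict] = {}
    for name, marker, prompt in stages:
        verifier = spawn_verifier()  # new agent per stage, never reused
        report = verifier(prompt)    # blocks until done, so stages are sequential
        results[name] = {"passed": stage_passed(report, marker), "report": report}
    return results
```

Spawning inside the loop is what keeps the Stage 2 verifier from seeing Stage 1's conversation, which is the anchoring concern the strict mode is meant to address.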

## Step 5 — Identify Human Verification Items

Some checks cannot be automated. Flag these for human review: