fix(self-improvement): TSV columns, two-batch wizard, two-stage review#186

Merged
maystudios merged 1 commit into main from worktree-agent-a0bcccc3 on Mar 25, 2026

@maystudios
Owner

Summary

  • Fix improve.md TSV columns to match spec §11.4 canonical 7-column format
  • Add two-batch AskUserQuestion pattern with dry-run baseline
  • Add optional two-stage sequential review (Spec Compliance → Code Quality) to verify-phase.md

Test plan

  • Verify improve.md TSV description matches spec §11.4
  • Verify two-batch setup is clear and complete
  • Verify two-stage review is gated on strict_mode

🤖 Generated with Claude Code

…-stage review

- improve.md: Replace 8-column TSV with spec §11.4 canonical 7-column format
  (iteration, commit, metric, delta, guard, status, description)
- improve.md: Restructure parameter collection into two-batch AskUserQuestion
  pattern with dry-run baseline step
- verify-phase.md: Add Step 4b two-stage sequential review (Spec Compliance
  then Code Quality) gated on strict_mode config
- verify-phase.md: Update status determination and comment template with
  spec_compliance_review and code_quality_review fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
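
As a rough illustration of the 7-column log format named in the commit message above, the sketch below serializes one iteration record in that column order. The function name `format_tsv_row` and every example value (commit hash, guard and status strings) are hypothetical; spec §11.4 is not reproduced in this PR, so treat the value formats as assumptions.

```python
import csv
import io

# Column order as listed in this PR's commit message (spec §11.4).
TSV_COLUMNS = ["iteration", "commit", "metric", "delta", "guard", "status", "description"]


def format_tsv_row(row: dict) -> str:
    """Serialize one iteration record as a single tab-separated line."""
    missing = [c for c in TSV_COLUMNS if c not in row]
    extra = [k for k in row if k not in TSV_COLUMNS]
    if missing or extra:
        # Reject rows that drift from the 7-column layout (e.g. an 8th column).
        raise ValueError(f"missing={missing} extra={extra}")
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow([row[c] for c in TSV_COLUMNS])
    return buf.getvalue()


# Illustrative values only; real guard/status vocabularies come from the spec.
example = {
    "iteration": 1,
    "commit": "abc1234",
    "metric": 0.82,
    "delta": 0.03,
    "guard": "pass",
    "status": "keep",
    "description": "tighten prompt wording",
}
line = format_tsv_row(example)
```

Rejecting unknown keys is a deliberate choice here: it would surface a leftover eighth column immediately instead of silently dropping it.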
Copilot AI review requested due to automatic review settings March 25, 2026 17:37
@maystudios maystudios merged commit 80755c3 into main Mar 25, 2026
3 checks passed
@github-actions
Contributor

🎉 This PR is included in version 5.13.2 🎉

The release is available on:

Your semantic-release bot 📦🚀

Copilot AI left a comment

Pull request overview

Updates Maxsim workflow/command templates to align self-improvement logging with the canonical 7-column TSV spec and to add an optional strict-mode verification pass with sequential spec/quality reviews.

Changes:

  • Add optional strict-mode “two-stage sequential review” (Spec Compliance → Code Quality) to verify-phase.md, and extend the verification report schema/table with two new check fields.
  • Update /maxsim:improve setup to use a two-batch AskUserQuestion pattern plus a dry-run baseline, and align its TSV logging description to the 7-column format.
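
The strict-mode gating described in the first bullet could be sketched roughly as follows. This is not the template's actual logic: `run_two_stage_review` and its callable parameters are hypothetical stand-ins for spawning the two verifier agents, and running Stage 2 even when Stage 1 fails is an assumption, since the PR does not say whether a failed Stage 1 short-circuits.

```python
from typing import Callable, Dict

# A review stage is modeled as a callable returning "pass" or "fail".
Review = Callable[[], str]


def run_two_stage_review(strict_mode: bool, spec_review: Review, quality_review: Review) -> Dict[str, str]:
    """Run Spec Compliance, then Code Quality, only when strict mode is on."""
    if not strict_mode:
        # Neither stage runs; both fields are reported as skipped.
        return {"spec_compliance_review": "skipped", "code_quality_review": "skipped"}
    # Stages run one after the other, never in parallel.
    spec_result = spec_review()
    # Assumption: Stage 2 still runs if Stage 1 failed.
    quality_result = quality_review()
    return {"spec_compliance_review": spec_result, "code_quality_review": quality_result}
```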

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| templates/workflows/verify-phase.md | Adds strict-mode sequential review stages and new verification check fields/rows in the posted report. |
| templates/commands/maxsim/improve.md | Clarifies setup via two question batches + baseline dry-run; updates TSV log column description to canonical 7-column format. |

Comment on lines +359 to +360
spec_compliance_review: pass | fail | skipped
code_quality_review: pass | fail | skipped

Copilot AI Mar 25, 2026

Since these new check fields allow skipped, it would help to explicitly state (near the strict_mode section or here) that they MUST be set to skipped when strict_mode is disabled, so the verification YAML and the checks_passed/checks_total math stay consistent.

Copilot uses AI. Check for mistakes.
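
One way to keep the report consistent, as the comment above asks, is a small validation pass over the check fields. The sketch below is hypothetical: `count_checks` is an invented name, the `tests` check is a placeholder, and excluding skipped checks from both `checks_passed` and `checks_total` is an assumption, because the PR does not define how skipped entries are counted.

```python
from typing import Dict, Tuple

REVIEW_FIELDS = ("spec_compliance_review", "code_quality_review")
VALID_VALUES = ("pass", "fail", "skipped")


def count_checks(checks: Dict[str, str], strict_mode: bool) -> Tuple[int, int]:
    """Validate the two review fields, then return (checks_passed, checks_total)."""
    for field in REVIEW_FIELDS:
        value = checks.get(field)
        if value not in VALID_VALUES:
            raise ValueError(f"{field} must be one of {VALID_VALUES}, got {value!r}")
        if not strict_mode and value != "skipped":
            # With strict mode off, the review stages never ran.
            raise ValueError(f"{field} must be 'skipped' when strict_mode is disabled")
    # Assumption: skipped checks count toward neither passed nor total.
    counted = [v for v in checks.values() if v != "skipped"]
    passed = sum(1 for v in counted if v == "pass")
    return passed, len(counted)
```

For example, a non-strict report with one passing check and both review fields skipped yields `(1, 1)` under this counting rule.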
Spawn a fresh verifier agent:
```
Agent(
subagent_type="Explore",
```

Copilot AI Mar 25, 2026

subagent_type is set to "Explore", but existing workflow templates consistently use subagent_type="verifier" for verifier agents. If "Explore" isn’t a valid agent type, this will fail at runtime; switch this to verifier (and keep the model as {verifier_model}).

Suggested change
- subagent_type="Explore",
+ subagent_type="verifier",

Comment on lines +264 to +267
Agent(
subagent_type="Explore",
model="{verifier_model}",
prompt="

Copilot AI Mar 25, 2026

Same issue as Stage 1: subagent_type="Explore" is inconsistent with other verifier spawns (subagent_type="verifier") and may not be a valid agent type. Use verifier here as well to avoid agent-spawn failures.
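
A quick consistency check for this class of issue might look like the sketch below, which scans template text for `subagent_type="..."` values outside an allowed set. The helper name and the default allowed set of just `verifier` are assumptions based on this comment's claim about existing templates; the full list of valid agent types is not given in the PR.

```python
import re
from typing import FrozenSet, List, Tuple

# Matches subagent_type="value" with optional spaces around the equals sign.
SUBAGENT_RE = re.compile(r'subagent_type\s*=\s*"([^"]+)"')


def unexpected_subagent_types(
    template_text: str,
    allowed: FrozenSet[str] = frozenset({"verifier"}),  # assumed allow-list
) -> List[Tuple[int, str]]:
    """Return (line_number, value) for each subagent_type not in `allowed`."""
    findings = []
    for lineno, line in enumerate(template_text.splitlines(), start=1):
        for match in SUBAGENT_RE.finditer(line):
            if match.group(1) not in allowed:
                findings.append((lineno, match.group(1)))
    return findings
```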


### Step 4b — Two-Stage Sequential Review (Optional)

When `verification.strict_mode` is enabled in the project config, run an additional two-stage sequential review after the parallel agents complete. Each stage uses a fresh verifier subagent to prevent anchoring bias.

Copilot AI Mar 25, 2026

The config key referenced here (verification.strict_mode) doesn’t match the config schema and other docs, which use execution.verification.strict_mode. This mismatch can lead to strict-mode behavior being configured incorrectly (or not at all); update the key/path in the text accordingly.

Suggested change
- When `verification.strict_mode` is enabled in the project config, run an additional two-stage sequential review after the parallel agents complete. Each stage uses a fresh verifier subagent to prevent anchoring bias.
+ When `execution.verification.strict_mode` is enabled in the project config, run an additional two-stage sequential review after the parallel agents complete. Each stage uses a fresh verifier subagent to prevent anchoring bias.
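
To show why the key path matters, here is a minimal sketch of a lookup along `execution.verification.strict_mode`. It assumes the project config has already been parsed into nested dictionaries and that a missing key means strict mode is off; both are assumptions, and `get_strict_mode` is a hypothetical helper rather than part of Maxsim.

```python
def get_strict_mode(config: dict) -> bool:
    """Read execution.verification.strict_mode, defaulting to False."""
    node = config
    for key in ("execution", "verification", "strict_mode"):
        if not isinstance(node, dict) or key not in node:
            # Assumption: an absent key leaves the optional review disabled.
            return False
        node = node[key]
    return node is True
```

Under this lookup, a flag placed at the top-level `verification.strict_mode` path is simply never read, which is the misconfiguration the comment warns about.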


Wait for all three review agents to complete before proceeding.

### Step 4b — Two-Stage Sequential Review (Optional)

Copilot AI Mar 25, 2026

This new section is the only step header that uses ### rather than the file’s standard ## Step N — ... format, which makes the step structure inconsistent and can break skimming/TOC behavior. Consider promoting this to ## Step 4b — ... (or renumbering) to match the rest of the document.

Suggested change
- ### Step 4b — Two-Stage Sequential Review (Optional)
+ ## Step 4b — Two-Stage Sequential Review (Optional)
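
A heading-level inconsistency like this one can be caught mechanically. The sketch below flags any `Step` heading whose number of `#` characters differs from the expected level. The helper name is hypothetical, the sample headings in the usage note are invented, and the check does not skip fenced code blocks, so it is only a rough lint.

```python
import re
from typing import List, Tuple

# Captures the run of '#' before a heading that starts with "Step <id>".
STEP_HEADING_RE = re.compile(r"^(#+)\s+Step\s+\S+")


def misleveled_step_headings(markdown_text: str, expected_level: int = 2) -> List[Tuple[int, str]]:
    """Return (line_number, line) for Step headings not at `expected_level`."""
    findings = []
    for lineno, line in enumerate(markdown_text.splitlines(), start=1):
        match = STEP_HEADING_RE.match(line)
        if match and len(match.group(1)) != expected_level:
            findings.append((lineno, line))
    return findings
```

Run against a file containing `## Step 4 ...` followed by `### Step 4b ...`, it would report only the second heading.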

Comment on lines +28 to +30
2. Gather loop parameters via two AskUserQuestion calls:
**Batch 1** (required — 4 questions):
- Metric command (the command to run and extract a number from)

Copilot AI Mar 25, 2026

Step 2 now implies the metric command is always collected via AskUserQuestion, but the command’s <context> section says $ARGUMENTS should be treated as the metric command when provided. To keep the setup flow consistent, update this step to only ask for the metric command when $ARGUMENTS is empty (or explicitly say it’s pre-filled from $ARGUMENTS).

Suggested change
- 2. Gather loop parameters via two AskUserQuestion calls:
- **Batch 1** (required — 4 questions):
- - Metric command (the command to run and extract a number from)
+ 2. Gather loop parameters (using AskUserQuestion where needed):
+ **Batch 1** (required — 4 parameters):
+ - Metric command (if `$ARGUMENTS` is provided, treat it as the metric command and optionally confirm/edit with the user; otherwise, ask the user for the metric command to run and extract a number from)
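
The pre-fill behavior proposed in this suggestion can be sketched as a small resolver: use `$ARGUMENTS` when it is non-empty, otherwise fall back to asking. Here `resolve_metric_command` and the `ask` callable are hypothetical stand-ins for the AskUserQuestion step, and the optional confirm/edit round-trip is left out for brevity.

```python
from typing import Callable


def resolve_metric_command(arguments: str, ask: Callable[[str], str]) -> str:
    """Prefer the pre-filled $ARGUMENTS value; ask the user only if it is empty."""
    prefilled = arguments.strip()
    if prefilled:
        # $ARGUMENTS was provided, so no question is needed for this parameter.
        return prefilled
    return ask("Which command should be run to extract the metric?").strip()
```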

2 participants