feat(cross-llm): Group II skill rewrites for cross-host portability#15
Conversation
|
@copilot review |
Code Review Summary The 3 changed files all correctly apply the portability pattern: ✅ What's correct
The description claims: But Minor nit in The added
Since this is already inside Otherwise the changes are minimal, surgical, and correctly implement the portability pattern. |
63c6808 to
d285044
Compare
There was a problem hiding this comment.
Pull request overview
This PR continues the cross-LLM portability series by rewriting Group II skills to avoid Claude-Code-specific tool/model names in shared skill text, using <host: claude-code> blocks and model-tier roles instead.
Changes:
- Updated skills to reference the
balancedtier instead of brand/model names (e.g.,sonnet). - Gated Claude-Code-specific tool references (e.g.,
Agent tool,TodoWrite) behind<host: claude-code>blocks. - Adjusted skill instructions to keep host-neutral phrasing outside host-specific sections (intended), while still passing the grep guard.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| skills/pr-monitoring/SKILL.md | Moves PR monitoring “Agent tool” content behind <host: claude-code> and switches model to balanced. |
| skills/executing-plans/SKILL.md | Moves TodoWrite mention into a <host: claude-code> block and adjusts Step 1 wording. |
| skills/alignment-check/SKILL.md | Wraps Agent tool prompt behind <host: claude-code> and switches model to balanced. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Inside a <host: claude-code> block, the phrase “If your host supports task tracking” is redundant/ambiguous (the block already scopes the instruction to Claude Code). Consider either stating the instruction directly for Claude Code, or moving a host-neutral “If your host supports task tracking…” line outside and using the host block only for the TodoWrite-specific wording.
| Create a TodoWrite entry per plan task before starting. |
| <host: claude-code> | ||
| ``` | ||
| Agent tool (general-purpose, model: sonnet, run_in_background: true): | ||
| Agent tool (general-purpose, model: balanced, run_in_background: true): |
There was a problem hiding this comment.
The new outer fenced code block (at line 23) wraps the entire Agent prompt, but the prompt itself contains fenced code blocks likebash. Markdown doesn’t support nested triple-backtick fences, so the inner ```bash will terminate the outer block early and break rendering for the rest of the skill. Use a different outer fence (e.g., 4 backticks) or convert the inner shell snippets to indented code blocks so the Agent tool block can contain them safely.
alignment-check (Task 4): - Replace "Dispatch a Sonnet agent" with "Dispatch a balanced-tier subagent" - Wrap Agent tool code block in <host: claude-code> / </host> - model: sonnet → model: balanced pr-monitoring (Task 5): - Replace "background agent" with "background balanced-tier agent" in prose - Wrap entire Agent tool code block in <host: claude-code> / </host> - model: sonnet → model: balanced brainstorming (Task 8): - Replace inline AskUserQuestion references with host-neutral phrasing - Wrap AskUserQuestion-specific bullet points in <host: claude-code> - Add <host: codex, opencode, cursor> fallback: numbered list + 3 questions/turn receiving-code-review (Task 6) and finishing-a-development-branch (Task 7): - Audited; no forbidden tokens found — no changes required. All changes verified: tests/skill-content-grep.sh PASS Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
d285044 to
50ea2a0
Compare
|
@copilot review |
|
@copilot review |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| - Ask questions using adaptive batching — group related questions to reduce round-trips: | ||
| - **First batch:** 2–4 questions covering purpose, constraints, scope, and tech choices | ||
| - **Follow-ups:** Targeted single questions based on interesting or ambiguous answers | ||
|
|
||
| <host: claude-code> | ||
| - Use multiple choice options when possible (AskUserQuestion supports 2-4 options per question) | ||
| - AskUserQuestion supports up to 4 questions per form — use this to reduce round-trips | ||
| </host> | ||
|
|
||
| <host: codex, opencode, cursor> | ||
| - Present options as a numbered list and ask the user to reply with the chosen number | ||
| - Group no more than 3 questions per turn to avoid overloading the chat |
There was a problem hiding this comment.
The host-neutral guidance says the first batch is “2–4 questions”, but the <host: codex, opencode, cursor> block then caps grouping at 3 questions per turn. On those hosts this is contradictory (a “first batch” of 4 would violate the per-turn cap). Consider making the batch-size guidance consistent by either adjusting the generic text (e.g., 2–3) or moving the batch-size guidance into per-host blocks.
| <host: claude-code> | ||
| - Use multiple choice options when possible (AskUserQuestion supports 2-4 options per question) | ||
| - AskUserQuestion supports up to 4 questions per form — use this to reduce round-trips | ||
| </host> | ||
|
|
||
| <host: codex, opencode, cursor> | ||
| - Present options as a numbered list and ask the user to reply with the chosen number | ||
| - Group no more than 3 questions per turn to avoid overloading the chat |
There was a problem hiding this comment.
The PR description’s “Skills changed” list doesn’t mention skills/brainstorming, but this PR updates it. Either add brainstorming to the PR summary/table or split it into the intended PR in the portability series so the description matches the actual change set.
…itoring alignment-check: - Fix misleading "using the following prompt:" sentence; prompt is inside a <host: claude-code> block and hidden on other hosts - Add "On Claude Code, use the Agent tool:" header inside host block to make the scoping explicit pr-monitoring: - Fix nested fences bug: inner ```bash fences inside Agent prompt were terminating the outer ``` fence; switch outer fence to 4 backticks - Move Monitor Loop steps outside <host: claude-code> as host-neutral ## Monitor Loop section so non-Claude-Code hosts see the procedure - Keep full Agent tool invocation (with detailed prompt) inside host block for Claude Code background agent Agent-Logs-Url: https://github.com/GoCodeAlone/claude-superpowers/sessions/023600d7-9f65-420f-8d6e-9ccf6b448388 Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
Addressed the review feedback from
All 3 changed files remain free of forbidden tokens outside |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…instorming alignment-check: - Move forward/reverse trace instructions and report format outside <host: claude-code> as host-neutral prose; all hosts now see the full alignment procedure - Slim the claude-code Agent tool prompt down to a minimal wrapper that references the skill instructions, avoiding duplication brainstorming: - Fix batch-size inconsistency: host-neutral "2–4 questions" contradicted the <host: codex, opencode, cursor> cap of 3 per turn; changed to "2–3 questions" for consistency across all hosts Agent-Logs-Url: https://github.com/GoCodeAlone/claude-superpowers/sessions/6c1dc754-4fe3-4943-aab2-97223c410498 Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
Addressed both new review issues in commit
The third review comment (brainstorming not in the PR description table) doesn't apply — brainstorming is listed in the PR summary table under "Skills changed". |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
skills/pr-monitoring/SKILL.md:45
- The Monitor Loop instructions are duplicated (once inside the claude-code Agent tool prompt and again as the standalone “## Monitor Loop” section). This duplication is likely to drift over time. Consider keeping the loop defined only once (e.g., in the standalone section) and having the Agent prompt reference it (similar to alignment-check’s “follow the instructions in this skill”).
Repeat the Monitor Loop until exit conditions are met:
### 1. Check CI Status
```bash
gh pr checks <number> --json name,state,conclusion
```
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| - Ask questions using adaptive batching — group related questions to reduce round-trips: | ||
| - **First batch:** 2–3 questions covering purpose, constraints, scope, and tech choices | ||
| - **Follow-ups:** Targeted single questions based on interesting or ambiguous answers |
There was a problem hiding this comment.
The updated batching guidance (“First batch: 2–3 questions…”) conflicts with the earlier guidance in this same skill that says to group 2–4 questions per form. Please align these numbers (or explicitly explain why the first batch is narrower than later batches) so the instructions don’t contradict each other.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ## Monitor Loop | ||
|
|
||
| Repeat until exit conditions met: | ||
|
|
||
| ### 1. Check CI Status |
There was a problem hiding this comment.
The Monitor Loop instructions are duplicated: once inside the <host: claude-code> Agent-tool prompt and again as a standalone ## Monitor Loop section below. This creates a maintenance risk where one copy can drift from the other. Consider keeping a single canonical Monitor Loop (either in the skill body or in the Agent prompt) and have the other reference it instead of repeating the full content.
intel352
left a comment
There was a problem hiding this comment.
Please re-review PR-B (latest commit on skills/cross-llm-pr-b-group-ii).
…r Loop duplication brainstorming: - Remove specific count from "First batch" line (was "2–3 questions", conflicting with the "2-4 questions per form" principle elsewhere); now reads "First batch: covers purpose, constraints, scope, and tech choices" pr-monitoring: - Remove standalone ## Monitor Loop section (lines 95-149) that duplicated the host-wrapped Agent tool prompt — single source of truth is inside <host: claude-code> where the loop runs autonomously - Add <host: codex, opencode, cursor> block with brief gh polling guidance for non-Claude hosts Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
skills/pr-monitoring/SKILL.md:108
- The CI failure step references
gh run view <run-id> --log-failed, but the precedinggh pr checkscommand doesn’t provide a run id. Add a concrete way to obtain the run id (or use an alternative command that links directly to the failing run) so the loop is actionable as written.
</host>
## Safety Limits
| Limit | Value | On Exceed |
|-------|-------|-----------|
| CI fix attempts per failure | 5 | Comment on PR, stop fixing that check |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| - Ask questions using adaptive batching — group related questions to reduce round-trips: | ||
| - **First batch:** covers purpose, constraints, scope, and tech choices | ||
| - **Follow-ups:** Targeted single questions based on interesting or ambiguous answers |
There was a problem hiding this comment.
This section now recommends a 2–3 question first batch, but the earlier checklist still says “group 2–4 related questions per form”. Please align these to a single rule so the guidance is consistent.
| Repeat the Monitor Loop until exit conditions are met: | ||
|
|
||
| ### 1. Check CI Status |
There was a problem hiding this comment.
The Monitor Loop instructions are duplicated: they’re fully embedded inside the Agent tool prompt and then repeated again as a standalone “## Monitor Loop” section below. This makes the skill easy to get out of sync—consider keeping the loop in one place (e.g., keep the standalone section and have the Agent prompt reference it, or vice versa).
…ng + pr-monitoring brainstorming: - Remove "2-4 per form" count from checklist and Key Principles; now reads "group related questions to reduce round-trips; use targeted singles for follow-ups" — no number to conflict with <host: codex> cap of 3 per turn pr-monitoring: - Fix dangling "Monitor Loop below" reference (standalone section was removed in previous commit); now "following the Monitor Loop procedure" Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Read docs/plans/YYYY-MM-DD-<topic>-design.md and docs/plans/YYYY-MM-DD-<feature>.md, | ||
| then produce an Alignment Report following the instructions in the alignment-check skill. |
There was a problem hiding this comment.
The Agent-tool prompt is no longer self-contained: it asks the subagent to follow “instructions in the alignment-check skill”, but the spawned agent may not have access to this skill text. To keep alignment results deterministic, include the forward/reverse trace steps + required report format directly in the Agent tool prompt body (as it was previously).
| Read docs/plans/YYYY-MM-DD-<topic>-design.md and docs/plans/YYYY-MM-DD-<feature>.md, | |
| then produce an Alignment Report following the instructions in the alignment-check skill. | |
| Read docs/plans/YYYY-MM-DD-<topic>-design.md and docs/plans/YYYY-MM-DD-<feature>.md. | |
| Perform a forward trace (design → plan): | |
| - For each requirement, constraint, and acceptance criterion in the design, find the plan task(s) that implement it. | |
| - If no plan task covers a design item, flag it as MISSING. | |
| Perform a reverse trace (plan → design): | |
| - For each task in the implementation plan, find the design requirement, constraint, or acceptance criterion it satisfies. | |
| - If no design item justifies a plan task, flag it as SCOPE CREEP. | |
| Return exactly this report format: | |
| ### Alignment Report | |
| **Status:** PASS | FAIL | |
| **Coverage:** | |
| | Design Requirement | Plan Task(s) | Status | | |
| |---|---|---| | |
| | [requirement] | Task N | ✅ Covered | | |
| | [requirement] | — | ❌ MISSING | | |
| **Scope Check:** | |
| | Plan Task | Design Requirement | Status | | |
| |---|---|---| | |
| | Task N | [requirement] | ✅ Justified | | |
| | Task N | — | ⚠️ SCOPE CREEP | | |
| **Drift Items:** [list specific items to fix] | |
| Set **Status:** to PASS only if every design item is covered and every plan task is justified. Otherwise set it to FAIL. |
| <host: claude-code> | ||
| - Use multiple choice options when possible (AskUserQuestion supports 2-4 options per question) | ||
| - AskUserQuestion supports up to 4 questions per form — use this to reduce round-trips | ||
| </host> | ||
|
|
||
| <host: codex, opencode, cursor> | ||
| - Present options as a numbered list and ask the user to reply with the chosen number | ||
| - Group no more than 3 questions per turn to avoid overloading the chat | ||
| </host> |
There was a problem hiding this comment.
The <host: ...> markers are at column 1 in the middle of a nested bullet list, which breaks the list structure and can cause the host-specific bullets to render as separate top-level items. Since the grep-guard allows surrounding whitespace, indent the <host: ...> / </host> lines to match the nesting level of the bullets they guard so the list renders correctly.
| <host: claude-code> | |
| - Use multiple choice options when possible (AskUserQuestion supports 2-4 options per question) | |
| - AskUserQuestion supports up to 4 questions per form — use this to reduce round-trips | |
| </host> | |
| <host: codex, opencode, cursor> | |
| - Present options as a numbered list and ask the user to reply with the chosen number | |
| - Group no more than 3 questions per turn to avoid overloading the chat | |
| </host> | |
| <host: claude-code> | |
| - Use multiple choice options when possible (AskUserQuestion supports 2-4 options per question) | |
| - AskUserQuestion supports up to 4 questions per form — use this to reduce round-trips | |
| </host> | |
| <host: codex, opencode, cursor> | |
| - Present options as a numbered list and ask the user to reply with the chosen number | |
| - Group no more than 3 questions per turn to avoid overloading the chat | |
| </host> |
… brainstorming host markers Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ## The Process | ||
|
|
||
| Spawn a background agent that monitors the PR in a loop: | ||
| Spawn a background `balanced`-tier agent that monitors the PR in a loop following the Monitor Loop procedure. |
There was a problem hiding this comment.
The host-neutral sentence references “the Monitor Loop procedure”, but the detailed Monitor Loop steps live only inside the <host: claude-code> Agent prompt, so non-Claude hosts won’t see any defined “procedure” to follow. Consider either moving the Monitor Loop steps into host-neutral prose (outside host blocks) or rewording this line to match the shorter cross-host fallback instructions below.
| Spawn a background `balanced`-tier agent that monitors the PR in a loop following the Monitor Loop procedure. | |
| Spawn a background `balanced`-tier agent that monitors the PR in a loop, checking CI status and review comments, fixing issues, and continuing until all checks pass and no unresolved comments remain. |
…to non-Claude host guidance Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Use your host's equivalent mechanism to periodically poll the following in a loop: | ||
| - `gh pr checks <number>` — fix any failing CI checks | ||
| - `gh api repos/<owner>/<repo>/pulls/<number>/comments` — respond to inline review comments | ||
| - `gh api repos/<owner>/<repo>/pulls/<number>/reviews` — handle any "CHANGES_REQUESTED" reviews | ||
|
|
||
| Continue until all checks pass, no unresolved inline comments remain, and no "changes requested" reviews are pending. |
There was a problem hiding this comment.
The non-Claude fallback loop only mentions polling PR checks and review comments, but the main Monitor Loop also checks PR reviews for CHANGES_REQUESTED (and the exit conditions depend on it). This fallback can miss requested-changes reviews and incorrectly decide it’s “done”; include equivalent review polling (e.g., the reviews endpoint) and reflect the same exit conditions.
| Use your host's equivalent mechanism to periodically poll the following in a loop: | |
| - `gh pr checks <number>` — fix any failing CI checks | |
| - `gh api repos/<owner>/<repo>/pulls/<number>/comments` — respond to inline review comments | |
| - `gh api repos/<owner>/<repo>/pulls/<number>/reviews` — handle any "CHANGES_REQUESTED" reviews | |
| Continue until all checks pass, no unresolved inline comments remain, and no "changes requested" reviews are pending. | |
| Use your host's equivalent mechanism to periodically poll the following in a loop (sleep 60 seconds between cycles; do not poll more frequently): | |
| - `gh pr checks <number>` — inspect all CI checks and fix any failures | |
| - `gh api repos/<owner>/<repo>/pulls/<number>/comments` — inspect and respond to inline review comments | |
| - `gh api repos/<owner>/<repo>/pulls/<number>/reviews` — inspect PR review states separately from comments and handle any reviews with state `"CHANGES_REQUESTED"` | |
| Only exit when all of the following are true: all checks pass, no unresolved inline comments remain, and there are no pending PR reviews whose state is `"CHANGES_REQUESTED"`. |
| @@ -70,7 +99,10 @@ Agent tool (general-purpose, model: sonnet): | |||
| | Task N | — | ⚠️ SCOPE CREEP | | |||
|
|
|||
| **Drift Items:** [list specific items to fix] | |||
|
|
|||
| Set **Status:** to PASS only if every design item is covered and every plan task is justified. Otherwise set it to FAIL. | |||
| ``` | |||
| </host> | |||
There was a problem hiding this comment.
This section is now mostly host-neutral, but the actual “dispatch” mechanism is only described for <host: claude-code>. For portability, add a corresponding block for other supported hosts (e.g., codex/opencode/cursor) describing how to run the alignment check as a separate subagent/thread (or explicitly state that the procedure can be executed inline without spawning a subagent).
| 1. **Explore project context** — check files, docs, recent commits | ||
| 2. **Ask clarifying questions** — adaptive batching: group 2-4 related questions per form, follow up with targeted singles | ||
| 2. **Ask clarifying questions** — adaptive batching: group related questions to reduce round-trips; use targeted singles for follow-ups | ||
| 3. **Propose 2-3 approaches** — with trade-offs and your recommendation |
There was a problem hiding this comment.
The updated guidance emphasizes batching multiple questions per turn, but the Overview still says to “ask questions one at a time” (now conflicting with this checklist/process section). Consider updating wording to consistently describe small-batch questioning throughout the skill to avoid confusing operators.
…gnment dispatch + fix brainstorming one-at-a-time Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@copilot review |
Both issues flagged by the automated review are already resolved in earlier commits on this branch:
All 3 PR files pass the grep guard with no violations. |
|
@copilot review |
Summary
Rewrites 3 of the 5 Group II skills to replace Claude-Code-specific tool names and model brand names with host-neutral equivalents, gated behind `<host: ...>` blocks where needed.
The remaining 2 skills (receiving-code-review, finishing-a-development-branch) were audited and contain no forbidden tokens — no changes required.
Skills changed
Skills audited / no changes needed
Verification
The 3 modified files are individually compliant with `tests/skill-content-grep.sh`. Pre-existing violations in other skills (subagent-driven-development, using-superpowers, writing-skills, executing-plans) are addressed in PR-C (#16).
Part of
Cross-LLM portability series: PR-A (infra) → PR-B (Group II) → PR-C (Group I+III) → PR-D (docs) → PR-E (coverage table)
🤖 Generated with Claude Code