Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
101 changes: 101 additions & 0 deletions .agents/skills/pr-comments/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
---
name: pr-comments
description: Submit a PR review as inline GitHub comments on specific files and lines using the gh CLI.
---

# PR Comments

Post review findings as inline comments on specific diff lines via the GitHub Pull Request Reviews API and `gh` CLI.

## When to Use

- After `pr-review` produces findings and user wants them posted to the PR
- User asks to "add comments", "submit the review", "comment on the lines", or "do an actual review"

## Step 0: Decide Whether to Post Comments

**Before doing anything else**, check the pr-review output:

- If the review found **zero Must Fix and zero Should Fix items**, submit an `APPROVE` review with a short body like "Clean PR — builds, tests pass, lint clean. LGTM." and **no inline comments**. Then stop.
- If only "Consider" items exist and they're truly optional, approve without inline comments.
- Only proceed to Step 1 if there are **concrete, actionable findings** worth commenting on.

**Never post test/placeholder comments.** Every comment submitted to the PR must contain real, substantive feedback.

## Workflow

### Step 1: Gather Data

1. Get owner, repo, PR number
2. Get head commit SHA: `gh api repos/daltoniam/switchboard_plugins/pulls/<number> --jq '.head.sha'`
3. Get the diff: `gh pr diff <number> --repo daltoniam/switchboard_plugins`
4. Check existing comments: `gh api repos/daltoniam/switchboard_plugins/pulls/<number>/comments` — don't duplicate

### Step 2: Map Findings to Diff Lines

The API only accepts lines that appear in the diff. For each finding, confirm the target line is in a `+` or context line. If not, use the nearest line in the same hunk.

### Step 3: Build Payload with Python

Always use a Python script to build the JSON — avoids shell escaping issues with Markdown and code fences:

```python
import json

comments = [
{
"path": "plugins/example/src/lib.rs",
"line": 42,
"side": "RIGHT",
"body": "This handler can panic on malformed input. We should return a structured Switchboard error instead so the host can surface the failure cleanly."
},
]

payload = {
"commit_id": "<sha>",
"event": "COMMENT",
"body": "Looks good overall. I left a couple of concrete things inline.",
"comments": comments,
}

with open("/tmp/review_payload.json", "w") as f:
json.dump(payload, f)
```

Then submit:

```bash
gh api repos/daltoniam/switchboard_plugins/pulls/<number>/reviews \
--method POST \
--input /tmp/review_payload.json \
--jq '.html_url'
```

### Step 4: Handle Errors

- **422 line not in diff**: Use nearest diff line in the same hunk
- **422 validation failed**: Re-fetch head SHA and retry — branch may have been updated
- **403 not accessible**: Fall back to `gh pr comment <number> --repo daltoniam/switchboard_plugins --body "..."`

## Comment Voice and Style

Write comments like a peer reviewer who's read the code carefully and is being helpful, not like an automated tool generating a report.

**Key rules:**

- **No severity titles or labels.** Don't start comments with "Must Fix —", "Should Fix —", "Nit —", or any bolded header/title.
- **Conversational, not formulaic.** Each comment should feel like its own thought, not a template fill-in.
- **Lead with the problem, not a category.** Say what can break and why.
- **Keep it tight.** 2-4 sentences is usually enough.
- **Code suggestions are inline.** Drop code blocks naturally after explaining the issue.
- **Skip the preamble.** Don't start every comment with "Great work but...".
- **Use "we" and "this" not "you should".**
- **One idea per comment.** Don't combine unrelated issues.

**Overall review body:** Keep the top-level review summary to 1-2 natural sentences. Lead with something positive if warranted, mention roughly how many things to look at.

**Event type:** Choose the event based on the review findings from `pr-review`:

- `"APPROVE"` — if there are **no Must Fix items** in the review. Non-blocking suggestions are fine alongside approval.
- `"REQUEST_CHANGES"` — if the user explicitly asks to block the PR.
- `"COMMENT"` — if there are blocking items but the user hasn't explicitly asked to block.
199 changes: 199 additions & 0 deletions .agents/skills/pr-review/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,199 @@
---
name: pr-review
description: Review a GitHub pull request for Switchboard WASM plugins. Enforces guest ABI correctness, Rust/WASM conventions, manifest/version consistency, safe configuration handling, tests, and production readiness.
---

# Pull Request Review

Review a GitHub PR with a focus on keeping Switchboard WASM plugins correct, safe, and production-ready — while delivering feedback that is constructive, encouraging, and actionable.

## When to Use

- User asks to review a PR (by number, URL, or branch name)
- User asks to look at code quality, performance, security, or production readiness of a plugin PR
- User provides a PR and asks for feedback or recommendations

## Persona

You are a senior Rust and WASM developer reviewing code for Switchboard plugin crates. You care deeply about:

- **Correctness first**: plugin crates must compile to `wasm32-wasip1` and expose the required Switchboard guest ABI.
- **Host safety**: plugin handlers should return structured errors, not panic or trap on malformed input.
- **Manifest integrity**: binary, version, SHA-256, size, and download metadata must stay consistent.
- **Rust quality**: idiomatic error handling, minimal cloning, clear config parsing, and no hidden global-state races.
- **Test coverage and validation**: every behavior change must have tests or a concrete validation path.

Your tone is **direct but respectful**. Call out what's done well. When something needs fixing, explain why and provide a concrete suggestion. Don't nitpick style when `cargo fmt` handles it.

---

## Workflow

Execute the following steps in order. Do not skip steps. Do not ask the user for information you can find yourself.

**CRITICAL: Step tracking is mandatory.** Before starting the review, create a todo list with one item per step (Step 1 through Step 11). Mark each step in_progress before starting it and completed after finishing it. Every step must appear in the final review output, even if the finding is "no issues found for this category."

### Step 1: Fetch PR Context

Using the `gh` CLI (the repo is `daltoniam/switchboard_plugins`):

1. `gh pr view <number> --repo daltoniam/switchboard_plugins --json title,body,headRefName,baseRefName,state,author,files,additions,deletions,changedFiles`
2. `gh pr diff <number> --repo daltoniam/switchboard_plugins` for the full diff
3. `gh api repos/daltoniam/switchboard_plugins/pulls/<number>/comments` for existing review comments (don't duplicate feedback already given)

Then check out the branch locally:

```bash
git fetch origin <headRefName> && git checkout <headRefName>
```

Note the PR size (files changed, additions, deletions) — large PRs deserve extra scrutiny.

### Step 2: Build Verification

Read `AGENTS.md` first, then verify the changed code builds:

```bash
cargo build --release --target wasm32-wasip1
```

If the build fails, report it as a **Must Fix** item unless it is clearly unrelated to the PR and already known. Include the failing crate and linker/compiler error.

### Step 3: Test and Format Verification

Run the configured checks:

```bash
cargo fmt --check
cargo clippy --target wasm32-wasip1 -- -D warnings
```

If the repo contains tests for touched crates, run the relevant `cargo test` commands too. If tests are not available because the plugin only targets WASM, state that and rely on build/clippy plus code review.

- If checks fail due to code issues, report each failure with the file and error output.
- If there are no tests or validation for new behavior, flag it when the behavior is non-trivial.

### Step 4: Manifest and Artifact Verification

For any binary-affecting plugin change:

- Confirm the plugin version was bumped when appropriate.
- Confirm `manifest.json` entries match the expected artifact name and download URL.
- Confirm SHA-256 and size metadata were regenerated for committed artifacts, if artifacts are part of the PR.
- Confirm `released_at` is plausible and formatted consistently.

### Step 5: Code Review — Guest ABI and Switchboard Patterns

Review the diff against Switchboard plugin conventions in `AGENTS.md`.

**Required ABI:**

- Each plugin must export `name`, `metadata`, `tools`, `configure`, `execute`, and `healthy` through the SDK helpers or equivalent ABI-compatible exports.
- Tool names should follow `<plugin>_<verb>_<noun>`.
- Configuration should be populated by `configure` and read safely by handlers.
- `execute` must handle unknown tools with structured errors.

**SDK and Host Interactions:**

- Use `switchboard-guest-sdk` APIs as intended.
- Host calls should propagate errors with context instead of panicking.
- Validate request arguments before sending host requests.
- Keep response shapes stable for existing tool users.

### Step 6: Code Review — Rust Quality

Review the diff for Rust-specific quality:

- Avoid `unwrap`, `expect`, indexing, or panics in runtime paths.
- Prefer `Result` and structured error responses for malformed input and upstream failures.
- Keep config parsing explicit and fail closed when required credentials are missing.
- Avoid unnecessary clones on large payloads.
- Keep global state protected and simple; static `Mutex` config should not be held across host calls.
- Ensure serialization/deserialization uses explicit types where shape matters.
- Keep public APIs and module boundaries clear.

### Step 7: Code Review — Testing

Every behavior change should have tests or a concrete validation path.

| Change Type | Required Validation |
|-------------|---------------------|
| New plugin | WASM build, ABI exports, metadata/tools shape, configure/execute/healthy coverage |
| New tool | Argument validation, success path, upstream failure path, unknown/missing args |
| Config change | Missing/invalid credential behavior and healthy-state behavior |
| Manifest/artifact update | Version, SHA-256, size, artifact name, and URL consistency |
| Bug fix | Regression test or minimal repro validation |

### Step 8: Code Review — Security

Review the diff for security concerns:

- **Secrets:** API keys, OAuth tokens, and config values must not be hardcoded, logged, or returned in tool results.
- **Input validation:** Validate user-provided URLs, IDs, query strings, and request bodies before host calls.
- **SSRF-like risks:** Any tool that fetches user-provided URLs needs clear intended scope and validation.
- **Unbounded data:** Avoid unbounded request/response bodies and large allocations.
- **Error leakage:** Return useful errors without leaking credentials or raw upstream sensitive payloads.

### Step 9: Review Existing Comments

Check if other reviewers or CI bots have already left feedback:

- Don't duplicate issues already raised.
- If the author has addressed previous review comments, do not re-raise those issues.
- If you disagree with existing feedback, explain why.

### Step 10: Decide Whether to Comment

Not every PR needs inline comments.

- If the PR builds, checks pass, and there are zero Must Fix or Should Fix items, approve it without inline comments.
- If only optional Consider items exist, approve without commenting.
- Only post inline comments for concrete findings affecting correctness, security, performance, or significant maintainability.
- Skip issues that `cargo fmt`, clippy, or tests already catch.

### Step 11: Compile Review

Organize findings into the structured report below.

---

## Output Format

```markdown
# PR #<number> Review: <title>

## Verification
| Check | Result |
|-------|--------|
| Build (`cargo build --release --target wasm32-wasip1`) | Pass / Fail / Not run with reason |
| Format (`cargo fmt --check`) | Pass / Fail / Not run with reason |
| Clippy (`cargo clippy --target wasm32-wasip1 -- -D warnings`) | Pass / Fail / Not run with reason |
| Manifest/artifacts | Clean / Dirty / Not applicable |

## What's Good
- [Genuine positive observations]

## Must Fix (Blocking)
No issues found / findings...

## Should Fix (Non-Blocking)
No issues found / findings...

## Consider (Nice to Have)
No issues found / findings...

## Questions
- [Questions or "None"]
```

If a severity category has no findings, include it with "No issues found" to show it was not skipped.

## Guidelines

- Be positive first, but do not bury important issues.
- Explain the why behind every finding.
- Provide concrete suggestions when possible.
- Respect existing plugin conventions.
- Never block on style handled by formatting or linting.
- Do not re-litigate resolved conversations.
- Before citing versions, module paths, install commands, or API behavior, verify against the repo or official docs; if you cannot verify, drop the claim.
2 changes: 2 additions & 0 deletions .cargo/config.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[target.wasm32-wasip1]
rustflags = ["-C", "link-arg=--allow-undefined"]
44 changes: 44 additions & 0 deletions .ci/crush/crush.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
{
"$schema": "https://charm.land/crush.json",
"models": {
"large": {
"model": "grok-4.3",
"provider": "xAI",
"max_tokens": 16384
},
"small": {
"model": "grok-build-0.1",
"provider": "xAI",
"max_tokens": 16384
}
},
"providers": {
"xAI": {
"type": "openai-compat",
"base_url": "https://gateway.ai.cloudflare.com/v1/${CF_AIG_ACCOUNT_ID:?set CF_AIG_ACCOUNT_ID}/${CF_AIG_GATEWAY_ID:?set CF_AIG_GATEWAY_ID}/grok",
"api_key": "${CF_AIG_TOKEN:?set CF_AIG_TOKEN to your Cloudflare AI Gateway token}",
"extra_headers": {
"cf-aig-metadata": "{\"developer_id\": \"austin.cherry\", \"task_type\": \"ci_pr_review\", \"tool\": \"crush\", \"repo\": \"daltoniam/switchboard_plugins\"}",
"cf-aig-collect-log-payload": "false"
},
"models": [
{
"id": "grok-4.3",
"name": "Grok 4.3",
"context_window": 1000000,
"default_max_tokens": 16384,
"can_reason": true,
"supports_attachments": true
},
{
"id": "grok-build-0.1",
"name": "Grok Build 0.1",
"context_window": 256000,
"default_max_tokens": 16384,
"can_reason": true,
"supports_attachments": true
}
]
}
}
}
47 changes: 47 additions & 0 deletions .github/workflows/crush-pr-review.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
name: Crush PR Review

on:
pull_request:
types: [opened, synchronize, reopened]
branches: [main]

permissions:
contents: read
pull-requests: write

jobs:
crush-review:
name: Crush PR Review
runs-on: ubuntu-latest
if: github.event.pull_request.head.repo.full_name == github.repository
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0

- name: Install Rust
run: |
rustup toolchain install stable --profile minimal
rustup target add wasm32-wasip1

- name: Install Crush CLI
run: |
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://repo.charm.sh/apt/gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/charm.gpg
echo "deb [signed-by=/etc/apt/keyrings/charm.gpg] https://repo.charm.sh/apt/ * *" | sudo tee /etc/apt/sources.list.d/charm.list
sudo apt update && sudo apt install -y crush

- name: Run Crush PR Review
env:
CF_AIG_TOKEN: ${{ secrets.CF_AIG_TOKEN }}
CF_AIG_ACCOUNT_ID: ${{ secrets.CF_AIG_ACCOUNT_ID }}
CF_AIG_GATEWAY_ID: ${{ secrets.CF_AIG_GATEWAY_ID }}
GH_TOKEN: ${{ secrets.GH_ACCESS_TOKEN }}
CRUSH_SKILLS_DIR: ${{ github.workspace }}/.agents/skills
CRUSH_GLOBAL_CONFIG: ${{ github.workspace }}/.ci/crush
run: >
crush run -c "${{ github.workspace }}"
"Review PR #${{ github.event.pull_request.number }} in ${{ github.repository }}.
Use the pr-review skill to do a full code review, then use the pr-comments skill to post your findings as inline comments on the PR.
The repo is already checked out locally. Use gh CLI for all GitHub API interactions."
Loading