feat(ci): automated PR audit workflow #182

Draft
auxesis wants to merge 8 commits into main from feat/auto-pr-review

auxesis commented May 28, 2026

Summary

Two-tier automated review of PRs to vitaminc:

  • Tier 1 — pr-coverage-review.yml: lightweight test-coverage
    review on every PR via Claude Sonnet. Eventually a required check
    on main.
  • Tier 2 — pr-crypto-audit.yml: deep crypto + coverage audit
    on PRs touching packages/{aead,encrypt,protected,permutation,random,kms,password,traits}/** via Claude Opus. Advisory only.

Both wrap anthropics/claude-code-action@v1. Prompts live as
markdown under .github/audit-prompts/ so they version-control
with the rest of the repo.
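
A minimal sketch of the Tier 2 path gate described above, written in Python purely for illustration (the workflows themselves are YAML and are not reproduced here). It assumes the packages/{...}/** shorthand stands for one glob per listed package. The names CRYPTO_PACKAGES, crypto_path_globs, and tier2_should_fire are hypothetical, the example file paths are made up, and fnmatchcase only approximates how GitHub matches path filters.

```python
from fnmatch import fnmatchcase

# Package directories named in the PR summary as Tier 2 (crypto audit) scope.
# The -derive crates mentioned in the commit log are not modeled here.
CRYPTO_PACKAGES = (
    "aead", "encrypt", "protected", "permutation",
    "random", "kms", "password", "traits",
)


def crypto_path_globs(packages=CRYPTO_PACKAGES):
    """Expand the packages/{...}/** shorthand into one glob per package."""
    return [f"packages/{name}/**" for name in packages]


def tier2_should_fire(changed_files, globs=None):
    """Return True if any changed file falls under a crypto package path."""
    globs = crypto_path_globs() if globs is None else globs
    # In fnmatch, '*' also matches '/', so 'packages/aead/**' covers nested files.
    return any(fnmatchcase(path, g) for path in changed_files for g in globs)


if __name__ == "__main__":
    # A change confined to .github/ (as in this PR's test plan): no Tier 2 run.
    print(tier2_should_fire([".github/workflows/pr-coverage-review.yml"]))  # False
    # A hypothetical change inside a crypto package: Tier 2 runs.
    print(tier2_should_fire(["packages/aead/src/lib.rs"]))  # True
```

Spelling the gate out as eight explicit globs makes it easy to see which package directories are in scope and which are not.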

Design

Full design at .work/auto-audit-on-pr/design.md on this branch.

Test plan

  • actionlint .github/workflows/pr-coverage-review.yml — clean
  • actionlint .github/workflows/pr-crypto-audit.yml — clean
  • Tier 1 fires on this PR itself (touches .github/ only — no
    crypto paths, so Tier 2 should NOT fire). Watch the run and
    review comments.
  • Once this PR is merged, manually dogfood both workflows
    against a recent PR (e.g. #175, "feat(aead, encrypt): HList-based
    static cipher shape behind hlist feature") via workflow_dispatch. See
    the rollout runbook in .work/2026-05-29-auto-audit-on-pr.md.
  • After dogfood looks correct, flip Tier 1 to required check on
    main branch protection.

Rollout

This PR ships both workflows as advisory. Required-check
promotion happens in a follow-up PR (or via repo settings) after
dogfooding.

auxesis and others added 8 commits May 29, 2026 01:08
Two-tier GitHub Actions design derived from the manual review pattern
applied to PR #175. Tier 1 (Sonnet, required) reviews every PR for
test coverage gaps; Tier 2 (Opus, advisory) audits crypto-package PRs
for vulnerabilities + coverage gaps. Both built on
anthropics/claude-code-action with version-controlled prompts in
.github/audit-prompts/.

Includes a four-phase rollout plan that exits when Tier 2 catches
>=80% of the M-* findings from the manual PR #175 audit and Tier 1
catches >=80% of the top-priority coverage gaps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Bite-sized TDD-style plan covering pre-flight setup, two workflow
files, two prompt files, dogfood validation against PR #175 ground
truth, and a post-merge operational runbook for Phase A→B promotion
of Tier 1 to required check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lightweight per-PR coverage review wrapped around
anthropics/claude-code-action. Runs on every PR (incl. doc-only
changes) so the eventual branch-protection required check is
always satisfiable. Includes workflow_dispatch for dogfooding
against arbitrary PR numbers.

The initial Tier 1 workflow used pre-v1 input names (model,
prompt_file, mode, allowed_tools) that do not exist on
anthropics/claude-code-action@v1. The v1 idiom is `prompt`
(multi-line inline) + `claude_args` for CLI flags. Updates the
plan to match upstream's action.yml and rewrites the workflow
file accordingly. Also pre-emptively updates Task 4 (Tier 2)
in the plan so the same defect does not recur.
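
One way to keep this defect from recurring is a small check over each step's `with:` inputs. The sketch below is an assumption, not part of this PR: it takes the inputs as an already-parsed mapping (no YAML parsing is shown), and PRE_V1_INPUTS and stale_inputs are hypothetical names. The four flagged names are the ones the commit message above lists as missing on anthropics/claude-code-action@v1; the model placeholder is illustrative.

```python
# Input names the commit message says do not exist on claude-code-action@v1.
PRE_V1_INPUTS = frozenset({"model", "prompt_file", "mode", "allowed_tools"})


def stale_inputs(step_with):
    """Return the pre-v1 input names present in a step's `with:` mapping, sorted."""
    return sorted(PRE_V1_INPUTS & set(step_with))


if __name__ == "__main__":
    # Shape of the original Tier 1 step (values are placeholders).
    old_step = {
        "model": "<model-id>",
        "prompt_file": ".github/audit-prompts/crypto-audit.md",
    }
    # Shape after the rewrite: inline `prompt` plus `claude_args` for CLI flags.
    new_step = {
        "prompt": "<inline prompt text>",
        "claude_args": "--model <model-id>",
    }
    print(stale_inputs(old_step))  # ['model', 'prompt_file']
    print(stale_inputs(new_step))  # []
```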

Path-gated to
packages/{aead,encrypt,protected,permutation,random,kms,password,traits}/**
and their -derive crates.
Advisory only (does not block merge). Uses Claude Opus and the
crypto-audit.md prompt to perform context-build + vulnerability
hunt + coverage gap analysis in a single PR review.

Three-phase prompt (context-build → vuln hunt → coverage gaps)
producing a single PR review with inline comments tagged by
finding ID. Calibrated against PR #175 ground truth: M-01, M-02,
L-01, L-02, L-03 patterns are explicitly enumerated as reference
categories the prompt should be able to surface.
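
The rollout exit criterion from the design commit (Tier 2 catching >=80% of the M-* findings from the manual PR #175 audit) could be checked with something like the sketch below. The function names are hypothetical, and the ground-truth list holds only the finding IDs named in this commit log; the manual audit may contain others. The reported list is invented for illustration.

```python
def m_findings(finding_ids):
    """Keep only the M-* finding IDs, which the Tier 2 exit criterion counts."""
    return [fid for fid in finding_ids if fid.startswith("M-")]


def catch_rate(reported_ids, ground_truth_ids):
    """Fraction of ground-truth finding IDs that the automated review also reported."""
    truth = set(ground_truth_ids)
    if not truth:
        raise ValueError("ground truth must contain at least one finding")
    return len(truth & set(reported_ids)) / len(truth)


def meets_exit_criterion(reported_ids, ground_truth_ids, threshold=0.80):
    """True when the catch rate reaches the threshold (80% by default)."""
    return catch_rate(reported_ids, ground_truth_ids) >= threshold


if __name__ == "__main__":
    # Finding IDs named in the commit log above.
    ground_truth = ["M-01", "M-02", "L-01", "L-02", "L-03"]
    # Hypothetical output of one dogfood run.
    reported = ["M-01", "L-02"]
    # One of two M-* findings caught: 0.5, below the 0.80 threshold.
    print(meets_exit_criterion(m_findings(reported), m_findings(ground_truth)))  # False
```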