VR-AUDIT — Master Architecture

Version: 1.0 (Wave 4 master synthesis)

Status: Accepted. Supersedes all prior design artifacts for agent count, gate ladder, state schema, and execution modes.

Authoritative paths: ${REPO}/ARCHITECTURE.md, ${REPO}/.claude/, ${REPO}/.vr-state/, /opt/vr-sessions/ (Rocky VM).


0. Executive Summary

What this is. A vulnerability research pipeline built as a Claude Code plugin. Given a target repository (or an external lead from one of 15 ingestion sources), the framework produces disclosure-ready findings and surfaces them at a single human review point — Gate D. Target FP rate: P(FP | all gates pass) ≤ 0.01 under the independence assumptions of §2.1. Measured rate is tracked in the calibration ledger (§13), not asserted anywhere until enough disclosures accumulate to compute it honestly.

Production line.

intake → Lead-Auditor (long-context Opus) ──┐
          ├── ralph-loops: harness, invest, ├── artifacts ──▶ Gate A
          │   variant, campaign, symex       │   (target scoped)
          │                                  │
          ├── Fuzzing Engineer + Crash Triage│
          │       │                          │
          │       ▼                          │
          ├── Reproduction Agent (VM, ASan) ─┤── artifacts ──▶ Gate B
          │   (physical-grounding oracle)    │   (PoC crashes)
          │                                  │
          ├── Symex-Juror (solver: SAT/UNSAT)┤── witness   ──▶ Gate B.5
          │   (mechanical juror)             │   (reachability proved)
          │                                  │
          ├── Fresh Validator (prior-quarantined Opus) ─┐
          ├── Devils Advocate (opposing prior Opus) ────┼──▶ Gate C
          │                                  │          │   (fresh TP)
          ├── Counter-Symex-Juror (adversarial UNSAT) ──┼──▶ Gate C.5
          │                                  │          │   (not killed on release)
          ├── Patch Drafter ─▶ Patch Fresh-Validator ───┤──▶ Gate D
          │                                  │          │   (fix-scope + 99%)
          ├── Release Pinner (T-15min) ──────┘          │
          │                                              │
          └──────────────────────── Report Writer ─────▶ Human batch review

Target FP bound (math). Design goal: P(FP | all gates pass) ≤ 0.01. The bound comes from composing seven near-independent mechanical checks V1–V7 for all classes, plus V8/V9 symex checks for tractable classes (§19). See §2. The derivation is a ceiling; the observed rate depends on how well V1–V7 independence actually holds at runtime. Until the calibration ledger has accumulated ≥30 disclosures, treat this as an architectural target, not a measurement.

Scope — in. C/C++ (nginx, httpd, openssl), Go (net/http, k8s), Rust (hyper, tokio), Python C-extensions (asan-python), Protobuf wire. Open-source targets with live source. Rocky 8.10 VM at 167.71.84.75 (8 vCPU, 16GB, clang 20, full sanitizer toolchain).

Scope — out. Closed-source binaries (no decompiler in roster), firmware, mobile apps, web-app/auth testing (separate skill tree exists). Commercial targets without explicit authorization. Any target whose LICENSE/SECURITY.md forbids external research.


1. Design Principles

  1. Autonomous-until-human-required. The orchestrator proceeds through every gate without human input. Humans are pulled in only at: (a) campaign scoping at /vuln-audit start, (b) batch-review when candidates reach Gate D, (c) must-keep tie-break gates (§9), (d) budget overrun, (e) FP kill-switch.
  2. Single-agent where context matters, multi-agent where structural requirements dictate. Lead Auditor (Opus, long-lived) owns all reasoning-intensive work. Specialists spawn only when narrower tools, cleaner prior, longer-running profile, or persistent artifact is structurally required (§3, §Appendix A).
  3. Mechanical gates > narrative checks. Every gate is a shell hook against an artifact; no gate asks an LLM "are you sure?" Convergence hooks inspect files and tool outputs, not self-reports (§4, §7).
  4. Ground-truth anchored. Every finding declares an oracle_class and must produce an unfakeable signal from it: ASan stack trace for C/C++, panic + -race for Go, miri/panic for Rust, parse-reparse byte-diff for protobuf. Logical-only findings go through the heaviest human scrutiny (§2.4).
  5. Every finding traceable to file:line + commit SHA. Provenance atoms (Technique 3) carry (claim, source_type, source_ref, sha256). A hook live-rehashes every atom at disclosure send time; any drift hard-blocks.
  6. Fail closed. Unknown state → escalate to human. Orchestrator is a deterministic Python/shell program, not an LLM — it does not improvise.
  7. Tool allowlist propagates down. A sub-agent's tool profile is always the intersection of its parent's profile and its own defaults. Bash side-channel writes (tee, dd, heredocs) are blocked by a PreToolUse hook on Bash.

2. The 99% Confidence Architecture

Goal. P(FP | Gate A ∧ B ∧ C ∧ D pass) ≤ 0.01 before a finding is surfaced for human review.

2.1 The seven mechanical checks

Each check targets a distinct failure mode. Independence is argued below; conjoined probability under assumed independence gives ≤ 0.01.

| # | Check | What it catches | P(says-TP \| actually-FP) | Notes |
|---|---|---|---|---|
| V1 | ASan / panic / miri crash reproduced 3/3 on pinned VM with job-id attestation (Technique 9) | Fabricated stderr; flaky non-bug; allocator-induced crash | 0.30 | Defeats Attacks 7, 11, 8 |
| V2 | Provenance Merkle tree live-rehashed at send time + all file:line/CVE/commit SHAs verified (Technique 3) | Line drift; confabulated CVE IDs; SHA-plausible-but-wrong | 0.25 | Defeats Attacks 4, 15, 23 |
| V3 | External-oracle triangulation: ≥2 of {CodeQL, Semgrep, GitNexus Cypher} fire on claim with query-specificity gate (≤10 results) | Claim internally consistent but not in AST | 0.25 | Defeats Attacks 25, 26 |
| V4 | Counterfactual diff inversion: patch applied, rebuilt, PoC stops crashing; fresh agent on patched file sees no bug (Technique 1) | Mislocated root cause; fix covers symptom not cause | 0.30 | Defeats Attacks 16, 6 |
| V5 | Dual-prior Fresh Validators: one defensive-primed + one aggressive-primed, both derive compatible triggers from raw source only (Technique 5) | Shared-training-prior hallucination; mode collapse on CVE-lookalike | 0.35 | Defeats Attacks 1, 5 |
| V6 | Devils Advocate exhaustion (da-ralph): 2 consecutive iters with new_challenges: [] AND verbatim hunk quote of every fix_diffs_read[] SHA | Handwaved diff-read; superset-fix claim; engineered commit message | 0.30 | Defeats Attacks 3, 15, 16, 19 |
| V7 | Last-mile Release Pinner (T-15min): fetch upstream HEAD + latest tag, re-run PoC on release tarball and tip, provenance-rehash atoms | Silent upstream fix between audit and send; tag-vs-main race | 0.20 | Defeats Attacks 23, 24, 17 |
| V8 | Symex SAT-witness replays concrete crash on Rocky VM (§19, Gate B.5); applies only when symex_class != "N/A" | Agent-claimed reachability that math refutes; wrong bound / off-by-one algebra | ~0.01 (solver-bounded, not agent-bounded) | Tractable classes only (integer, bounds, branch-feasibility, invariant); N/A for concurrency/logic |
| V9 | Counter-symex UNSAT on release tag (§19, Gate C.5); adversarial attempt to prove bug absent on upstream release | Silent upstream fix missed by V7; spec-intended behavior | ~0.02 | Adversarial mode, pinned to release-tag commit; TIMEOUT does not block |

2.2 Independence argument

The checks are designed to not share failure modes:

  • V1 is a physical-VM oracle; it's false-positive-prone only when the VM's runtime agrees with the attacker narrative (crash matches claim text). It cannot agree when V2/V3 disagree because V2 verifies the claim text against source SHA, and V3 verifies against AST queries.
  • V2 is pure cryptographic grounding. It has zero LLM judgment. Its failure mode is "the ground-truth URL / git object is lying" which requires a supply-chain-poisoned source tree, caught by V7.
  • V3 fails only when all three query engines' quirks align with the false claim — query engines were picked for diversity (CodeQL is semantic taint, Semgrep is syntactic pattern, GitNexus is graph traversal).
  • V4 is counterfactual: the patch mechanically silencing the crash is independent of whether any agent agrees.
  • V5 uses opposing-prior agents. A shared hallucination would have to survive both priors and produce identical trigger-input bytes in independent derivations — the probability of two independent random derivations matching is negligible.
  • V6 is a structured adversary with a binding mandate. It demands verbatim hunks with line-overlap check, which is a syntactic property of the diff.
  • V7 is the freshness gate. It protects against every upstream change between V1–V6 and send.

Joint P(FP | all pass), assuming pairwise near-independence:

P ≤ 0.30 × 0.25 × 0.25 × 0.30 × 0.35 × 0.30 × 0.20
  = 1.18e-4

Even with correlation slack (assume effective-independence penalty of 10× on the product, due to shared model weights across V3/V4/V5/V6 generation prompts), residual P(FP) ≈ 0.0012 — under 1%.

The conservative design target is P(FP) ≤ 0.01, so we have ~8× headroom for correlation. Calibration ledger (§13) monitors residual FP empirically over rolling 30-disclosure windows; if measured FP > 5%, the gate weightings are re-tuned.

Adding V8 + V9 (tractable classes only): P_tractable ≤ 0.0012 × 0.01 × 0.02 ≈ 2.4e-7. V8/V9 are mechanical SAT/UNSAT, not agent-bounded; treated as near-independent (solver does not share model weights). For non-tractable classes (symex_class: N/A), V1–V7 alone deliver the target FP bound; V8/V9 abstain without blocking. Calibration ledger (§13) tracks residual FP per class over rolling 30-disclosure windows.
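The arithmetic above can be re-derived mechanically. The following is a minimal sketch that uses only the per-check rates from the §2.1 table, the 10× correlation penalty, and the V8/V9 rates stated in the text; the function and variable names are illustrative, not part of the framework.

```python
import math

# Per-check P(says-TP | actually-FP) values copied from the §2.1 table.
V1_TO_V7 = [0.30, 0.25, 0.25, 0.30, 0.35, 0.30, 0.20]
CORRELATION_PENALTY = 10   # the 10x effective-independence slack assumed in the text
V8, V9 = 0.01, 0.02        # symex checks, tractable classes only
TARGET = 0.01              # design target for HIGH findings


def joint_fp_bound(rates, penalty=1.0):
    """Product of per-check rates under assumed independence, scaled by a penalty."""
    return math.prod(rates) * penalty


independent = joint_fp_bound(V1_TO_V7)
with_slack = joint_fp_bound(V1_TO_V7, CORRELATION_PENALTY)
tractable = with_slack * V8 * V9

print(f"independent product : {independent:.3g}")
print(f"with 10x slack      : {with_slack:.3g}")
print(f"headroom vs target  : {TARGET / with_slack:.1f}x")
print(f"tractable (V8 + V9) : {tractable:.3g}")
```

The tractable figure printed here uses the unrounded product, so it comes out slightly below the 2.4e-7 quoted above, which starts from the rounded 0.0012.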

2.5 Symex tractability classes

Every HIGH finding is classified at Lead Auditor hypothesis-gen time:

  • Tractable (symex applies): integer arithmetic, bounds / OOB, branch feasibility, small-state invariants, protocol parsers (<10k LoC slice). Gets V8 + V9.
  • N/A (symex skipped): concurrency / TOCTOU, logic / spec bugs, heap-corruption exploitability, UI/side-channel, Go stdlib at full scale. Relies on V1–V7 only, which still meet the P(FP) ≤ 0.01 target (§2.2).

See §19.1 and §19.4 for the full classifier.

2.3 What is NOT claimed

  • 99% recall. The framework is tuned for precision, not recall. Findings that don't pass all seven checks are not disclosed; some real bugs will be missed. Periodic seed-bug injection (§13) bounds false-negative rate.
  • Target FP bound varies by severity tier. MEDIUM findings skip V5 and V6 (per tiering rule, §3). MEDIUM target: P(FP) ≤ 0.10. LOW target: P(FP) ≤ 0.25. Only HIGH gets the full stack and the P(FP) ≤ 0.01 target.

2.4 Oracle class per target class

Every finding declares oracle_class and is validated against that class's mechanical signal:

| Language | Oracle | Check |
|---|---|---|
| C/C++ | ASan/UBSan/MSan stack trace | ERROR: AddressSanitizer: <kind> + top-3 app frames + write-vs-read flag |
| Go | panic + -race detector | Stack trace + deterministic repro under race mode |
| Rust | miri on unsafe, sanitizer build otherwise | Panic not #[should_panic] |
| Python C-ext | atheris + ASan-Python | Same as C/C++ |
| Protobuf | parse→serialize→parse byte-diff | Any byte divergence |
| Logical-only | None mechanical | Human scrutiny heaviest; severity capped at MEDIUM absent oracle |

3. Agent Roster (Final)

11 agents. Down from Wave 1's 11 + Wave 2's 15 proposed additions. Justification per §Appendix A.

| # | Agent | Mode | Model | Tools | Trigger | Input | Output | Mandate |
|---|---|---|---|---|---|---|---|---|
| 1 | Lead Auditor | single-agent (campaign-long session) | Opus (1M) | Read, Write, Edit, Bash, Grep, Glob, all mcp__gitnexus__*, mcp__claude-code-ssh__*, Skill (most) | /vuln-audit | target repo + CVE feeds + archaeology brief | candidate_queue.json, dead-ends.jsonl, harness scaffolds | Own the audit end-to-end; pivot strategy; dispatch specialists; integrate verdicts |
| 2 | Fresh Validator | ephemeral | Opus | Read (source + finding JSON only, path-allowlisted), Grep, Bash (read-only), mcp__gitnexus__context/impact | HIGH finding ready for Gate C | finding JSON + target source snapshot only | validation.json | Prior-quarantined re-derivation of claim; two instances (defensive + aggressive prior) |
| 3 | Devils Advocate | ephemeral | Opus | Read, Grep, Bash(git show/log), mcp__gitnexus__context, NO Write | HIGH finding ready for Gate C | finding JSON + target + fix_diffs_read[] requirement | devils_advocate.json | Exhaustion-loop until new_challenges: [] × 2; verbatim hunk quotes |
| 4 | Fuzzing Engineer | ephemeral | Opus 4.6 | Read, Write, mcp__claude-code-ssh__*, Skill(libfuzzer, aflpp, honggfuzz, harness-writing) | Candidate needs dynamic validation | harness spec + invariants.json + corpus | campaigns/<id>/queue/, crashes | Run harness-ralph + campaign-ralph; emit crash corpus |
| 5 | Crash Triage | ephemeral | Opus 4.6 | Read, Write, Bash(afl-tmin/gdb-batch/addr2line), mcp__claude-code-ssh__* | >10 crashes in a campaign | raw crashes + ASan stderrs | triage.json + minimized PoCs | Cluster by stack-hash, rank, run min-ralph per cluster |
| 6 | Reproduction Agent | ephemeral | Opus 4.6 | mcp__claude-code-ssh__*, Read, Write | Finding needs PoC confirmation | PoC + target SHA | repro/<id>/{run1,run2,run3}.log, VM job-ids | Run stab-ralph; produce physical-grounding attestations |
| 6.5 | Symex-Juror | ephemeral (two flavors: juror + counter-adversary) | Opus 4.6 | Read, Write, mcp__claude-code-ssh__ssh_execute, mcp__gitnexus__context/impact, Skill(vr-symex-slice, vr-symex-harness, vr-symex-solve, vr-symcc-concolic) | Finding has symex_class != N/A ready for Gate B.5 or C.5 | symex_query.json + target slice | symex_result.json with verdict ∈ {SAT_CONCRETE_VERIFIED, UNSAT_COMPLETE, INCONCLUSIVE, TIMEOUT} + witness/core | Run symex-ralph; emit SAT/UNSAT verdict; counter-mode pins to release-tag and tries to kill |
| 7 | Patch Drafter | ephemeral | Opus | Read, Write, Edit, Bash(git), Grep | Post-Gate C | validated finding + target source | patch.diff, patch_notes.md | Run patch-ralph; produce minimal upstream-style diff |
| 8 | Patch Fresh-Validator | ephemeral | Opus | Read, Write, Bash(build/test), mcp__claude-code-ssh__* | Post-draft | finding + patch (no patch-drafter rationale) | patch_validation.json | Counterfactual inversion: does patch close root cause? |
| 9 | Release Pinner | ephemeral | Opus 4.6 | Read, Bash(git fetch/show/tag), mcp__claude-code-ssh__*, WebFetch (NVD/GHSA) | T-15min before send | finding + disclosure draft | release_pin.json | Re-fetch HEAD + latest tag, replay PoC, rehash provenance atoms |
| 10 | Report Writer | ephemeral | Opus 4.6 | Read (findings only), Write (disclosure path only), Skill(vr-disclosure-template) | Post-Gate D | approved finding + patch | disclosure.md, disclosure_plan.md | Template-fill; numeric-specifics-to-artifact hook check |

Ephemeral deconfliction-agent spawns on demand for Meta-Process P3 (candidate merge/split); not in permanent roster.

Dropped from Wave 1/2 (folded into Lead Auditor as skill calls): cve-analyst, architecture-mapper, novelty-checker, exploit-primitive-mapper, maintainer-profiler, disclosure-timing-optimizer, dependency-reachability-agent, race-condition-hunter, crypto-auditor, variant-hunter-cross-language, session-archaeologist, invariant-extractor.

Dropped to Meta-Process (scheduled cron, not interactive agent): sanitizer-differential, coverage-gap-mapper, corpus-curator.

Folded into Symex-Juror as flavors + skill calls (Wave 5): concolic-runner (= vr-symcc-concolic skill), counter-symex-juror (= symex-juror with mode: adversarial + release-tag commit pin, same pattern as Fresh-Val dual-prior).

Tiering. HIGH findings get all 11 agents and V1–V9. MEDIUM gets V1–V4, V7, and V8 (skips V5 + V6 + V9 = dual Fresh Validators + Devils Advocate + counter-symex; cost drops ~60%). LOW is Lead-only. Only HIGH reaches the 99% + symex guarantee.
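The tiering rule can be expressed as a small lookup. This is a sketch under stated assumptions: MEDIUM is read as skipping V5, V6, and V9 (per the parenthetical above), V8/V9 abstain when symex_class is "N/A" (§2.2), and LOW is modeled as running none of the numbered checks, which the text does not state explicitly (it only says LOW is Lead-only). The name checks_for is hypothetical.

```python
ALL_CHECKS = [f"V{i}" for i in range(1, 10)]   # V1..V9
MEDIUM_SKIPS = {"V5", "V6", "V9"}              # dual Fresh-Val, DA, counter-symex
SYMEX_CHECKS = {"V8", "V9"}                    # only meaningful for tractable classes


def checks_for(tier, symex_class="N/A"):
    """Return the ordered list of checks that apply to a finding."""
    tier = tier.lower()
    if tier not in ("high", "medium", "low"):
        # Fail closed on unknown tiers rather than guessing.
        raise ValueError(f"unknown tier: {tier!r}")
    if tier == "low":
        return []  # assumption: Lead-only means no numbered checks
    checks = list(ALL_CHECKS)
    if tier == "medium":
        checks = [c for c in checks if c not in MEDIUM_SKIPS]
    if symex_class == "N/A":
        checks = [c for c in checks if c not in SYMEX_CHECKS]
    return checks


print(checks_for("medium", symex_class="bounds"))
```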


4. Gate System (Complete Spec)

Gates A through D are the canonical ladder. Every finding, regardless of origin (fresh audit or intake), traverses these gates. The gate-skip matrix (§8b) applies only to Gate B for sources that ship a known-good reproducer.

Gate A — Target Scoped & Statically Viable

  • Entry. candidate_queue.json has a new entry from Lead Auditor or intake.
  • Checks.
    1. mcp__gitnexus__list_repos confirms target indexed at pinned commit.
    2. mcp__gitnexus__route_map(repo) returns ≥1 entrypoint OR Cypher MATCH (f:Function) WHERE NOT ()-[:CALLS]->(f) AND f.name =~ 'main|<entry>' RETURN count(f) returns ≥1.
    3. For focus function F: mcp__gitnexus__impact(F, upstream, depth=5) returns ≥3 distinct callers.
    4. mcp__gitnexus__impact(F, downstream, depth=3) contains at least one dangerous-sink node (memcpy|exec|eval|unmarshal|system).
    5. gitnexus://repo/{name}/context staleness flag = clean.
    6. Scope file .vr-state/targets/<project>/scope.yaml is present and operator-signed (orchestrator_pre_audit_hook.sh).
  • Exit. All six green → advance to Gate B. Any red → reject + reprioritize.
  • Hook. hooks/gate-a-scope.sh.
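#!/bin/bash
# Gate A hook: check the operator-signed scope and the pinned-commit index, then record the verdict.
set -euo pipefail
CAND=$1
# REPO, COMMIT, and EVIDENCE_JSON are assumed to be exported by the orchestrator.
: "${REPO:?REPO not set}" "${COMMIT:?COMMIT not set}"
# scope.yaml is YAML, which jq cannot parse; yq is assumed to be installed for this check.
yq -e '.scope_ack_sig != null' ".vr-state/targets/$REPO/scope.yaml" >/dev/null || exit 1
gitnexus list-repos \
  | jq -e --arg repo "$REPO" --arg commit "$COMMIT" '.[] | select(.name == $repo and .commit == $commit)' >/dev/null || exit 1
# ... invoke gitnexus MCP via the client lib ...
vr-write-gate "$CAND" "G_A" "pass" "$EVIDENCE_JSON"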

Gate B — PoC Reproduces with Ground-Truth Oracle

  • Entry. Reproduction Agent produces repro/<id>/run{1,2,3}.log and VM job-ids.
  • Checks.
    1. V1 attestation. Each run{1,2,3}.log has a job_id in /var/log/vr-framework/jobs.jsonl on Rocky VM with matching artifact SHA.
    2. ASan kind match. The signature ERROR: AddressSanitizer: <kind> matches finding's bug_class (heap-buffer-overflow → heap-buffer-overflow, etc.). No exit-139 laundering.
    3. Top-3-app-frame stack hash identical across 3 runs (filter out libc.so, libc++.so; require 3+ target-codebase frames).
    4. For C/C++: crash reproduces across glibc AND musl containers (crash_cross_allocator_verified: true).
    5. For every ASan stack frame: mcp__gitnexus__context(name=symbol, file=file) resolves.
    6. Crash-site line ∈ [startLine, endLine] of resolved function.
    7. Fingerprint corpus lookup. Top-3 app-frame hash NOT in logs/crash_corpus.sqlite benign bucket; if matches patched-CVE bucket, Lead must justify why it's a new instance.
  • Exit. All green → advance to Gate B.5 (tractable classes) or, when symex_class == "N/A", straight through to Gate C. Flaky (< 8/10) → mark flaky_repro, confidence × 0.6, do not reject (Wave 2 §6). Failure → kick back to Lead Auditor.
  • Hook. hooks/gate-b-poc.sh calls mcp__claude-code-ssh__ssh_execute to verify job-id existence VM-side.
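Check 3 (identical top-3 app-frame stack hash across runs) can be sketched as follows. The frame regex assumes the usual ASan line shape "#N 0xADDR in symbol location"; symbols containing spaces (for example some C++ template names) are not handled, and the helper names are hypothetical. Hashing symbol names only makes the comparison insensitive to load addresses.

```python
import hashlib
import re

# Assumed ASan frame shape: "    #1 0x4a1b2c in parse_field /src/parse.c:412:5"
FRAME_RE = re.compile(r"^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+(\S+)")
RUNTIME_MARKERS = ("libc.so", "libc++.so")  # frames filtered out per check 3


def top3_app_frame_hash(asan_log):
    """Hash the first three non-runtime frame symbols; None if fewer than three exist."""
    frames = []
    for line in asan_log.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            continue
        symbol, location = match.groups()
        if any(marker in location for marker in RUNTIME_MARKERS):
            continue
        frames.append(symbol)
        if len(frames) == 3:
            break
    if len(frames) < 3:
        return None
    return hashlib.sha256("\n".join(frames).encode()).hexdigest()


def runs_agree(logs):
    """True only if every run yields the same non-empty top-3 hash."""
    hashes = [top3_app_frame_hash(log) for log in logs]
    return None not in hashes and len(set(hashes)) == 1
```

A real hook would additionally confirm that the three frames belong to the target codebase, for instance by resolving them through the symbol lookup in check 5.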

Gate B.5 — Symex SAT-Witness (Tractable Classes)

  • Entry. Gate B pass + finding's symex_query.json present with symex_class != "N/A".
  • Checks (V8).
    1. Symex-Juror invoked via symex-ralph (§7) with symex_query.json.
    2. symex_result.json#verdict ∈ {SAT_CONCRETE_VERIFIED, UNSAT_COMPLETE, INCONCLUSIVE, TIMEOUT}.
    3. If SAT: the produced concrete input is replayed on Rocky VM under ASan; top-3-app-frame stack hash MUST match Gate B's stack hash (same bug, not a different one).
    4. Provenance: symex_query.json and symex_result.json hashes enrolled in the finding's Merkle tree.
  • Exit.
    • SAT_CONCRETE_VERIFIED → pass, advance to Gate C.
    • UNSAT_COMPLETE → fail, candidate closed as closed_fp_symex_unsat with the UNSAT core as evidence.
    • INCONCLUSIVE or TIMEOUT → abstain (not a fail): Gate B.5 is marked skipped_with_reason and the pipeline proceeds to Gate C. No confidence boost from V8.
    • symex_class == "N/A" → skipped, passthrough. No confidence regression; V1–V7 still in force.
  • Hook. hooks/gate-b5-symex.sh (see §19.8 for shell body).
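The exit rules above amount to a small decision table. The sketch below is illustrative rather than the body of hooks/gate-b5-symex.sh. Two behaviors are assumptions not spelled out in the text: a SAT witness whose replay hash does not match Gate B's hash escalates to a human, and any unrecognized verdict also escalates (following the fail-closed principle in §1).

```python
def gate_b5_outcome(symex_class, verdict=None, replay_hash=None, gate_b_hash=None):
    """Map a symex verdict to (gate_status, next_step) for Gate B.5."""
    if symex_class == "N/A":
        return ("skipped", "gate_c")            # passthrough, V1-V7 still in force
    if verdict == "SAT_CONCRETE_VERIFIED":
        # The replayed witness must hit the same bug that Gate B recorded.
        if replay_hash is not None and replay_hash == gate_b_hash:
            return ("pass", "gate_c")
        return ("escalate", "human")            # assumption: mismatch is not auto-resolved
    if verdict == "UNSAT_COMPLETE":
        return ("fail", "closed_fp_symex_unsat")
    if verdict in ("INCONCLUSIVE", "TIMEOUT"):
        return ("skipped_with_reason", "gate_c")  # abstain, no confidence boost
    return ("escalate", "human")                # assumption: unknown verdict fails closed


print(gate_b5_outcome("bounds", "TIMEOUT"))
```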

Gate C — Fresh TP + Devils Advocate Exhausted

  • Entry. Gate B pass + finding promoted to HIGH by Lead.
  • Checks (V5 + V6 + V3 + V4).
    1. Dual-prior Fresh Validators (two instances with system_prompt_flavor ∈ {defensive, aggressive}) both produce validation.json with compatible trigger inputs. PoC ASan stack hashes match modulo address. Disagreement → escalate human tie-break.
    2. Validator isolation verified. Path-allowlist proven: logs/fresh-validator-{a,b}.jsonl shows no reads of .vr-state/, logs/, sibling findings.
    3. Devils Advocate exhaustion. da-ralph converged with new_challenges: [] × 2. fix_diffs_read[] contains verbatim hunks from git show with diff_line_overlap_with_finding_location ≥ 1 line.
    4. External-oracle triangulation (V3). Agent-generated CodeQL + Semgrep + GitNexus queries; claim location appears in ≥2 oracles; each oracle query returns ≤10 results OR claim is in top-5.
    5. Counterfactual inversion (V4). Lead-proposed patch mechanically applied in worktree; PoC replayed; must NOT crash. Fresh agent reads patched file (no knowledge of PoC) and reports "no bug" or unrelated concern.
    6. Mutation-reachability (HIGH only, path length ≤ 4). abort() inserted at each claimed-traversed function; PoC must hit every abort when compiled -O0 -fno-inline -fno-optimize-sibling-calls.
    7. Dep-graph snapshot recorded. go.sum/Cargo.lock/package-lock.json SHAs pinned to candidates.json#target.dep_graph_snapshot. Validator checked that vulnerable symbol is in the target repo, not a dep.
  • Exit. All green → advance to Gate D. Validator disagreement → AskUserQuestion tie-break. Mutation-reach failure → not auto-reject; may indicate inlining (Attack 31) — retry with stricter compile flags before rejecting.
  • Hook. hooks/gate-c-fresh-tp.sh.
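The triangulation rule in check 4 (claim appears in ≥2 oracles, each query returning ≤10 results or ranking the claim in its top 5) could be evaluated as below. This is a sketch: it assumes each oracle's output has already been normalized to an ordered list of location strings, and the function names are hypothetical.

```python
def oracle_supports(result_locations, claim_location, max_results=10, top_k=5):
    """An oracle supports the claim if it reports it and the query is specific enough."""
    if claim_location not in result_locations:
        return False
    specific = len(result_locations) <= max_results
    ranked_high = claim_location in result_locations[:top_k]
    return specific or ranked_high


def triangulated(results_by_oracle, claim_location, min_oracles=2):
    """True when at least min_oracles independent oracles support the claim."""
    supporting = [
        name
        for name, locations in results_by_oracle.items()
        if oracle_supports(locations, claim_location)
    ]
    return len(supporting) >= min_oracles


results = {
    "codeql": ["src/a.c:10", "src/b.c:3"],
    "semgrep": [f"src/noise.c:{i}" for i in range(20)] + ["src/a.c:10"],  # too broad
    "gitnexus": ["src/a.c:10"],
}
print(triangulated(results, "src/a.c:10"))
```

The specificity gate is what keeps an overly broad query (the semgrep entry here) from counting as corroboration.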

Gate C.5 — Counter-Symex Adversarial UNSAT (Tractable Classes)

  • Entry. Gate C pass + HIGH tier + symex_class != "N/A".
  • Checks (V9).
    1. Symex-Juror invoked in adversarial flavor: same symex_query.json, but commit_sha is rebound to the latest upstream release tag (via git ls-remote --tags --refs --sort=v:refname | tail -1; without a version sort, lexicographic ref order can rank 1.9 above 1.10). Its binding mandate: prove the bug does NOT exist on the released artifact.
    2. Budget: 60 CPU-min (longer than Gate B.5 because adversary is allowed to explore more aggressively).
    3. symex_result.json#verdict inspected:
      • UNSAT_COMPLETE → finding killed: state = closed_patched_upstream_symex_confirmed. Release Pinner notified; no disclosure.
      • SAT_CONCRETE_VERIFIED → finding survives; advance to Gate D. Witness added to finding artifacts as "reproduces on release tag X" → upgrades disclosure severity.
      • INCONCLUSIVE / TIMEOUT → pass-through; no change to finding state.
    4. Counter-symex output is a provenance atom; rehashed at Gate D per V2.
  • Exit. Same as above. Only UNSAT_COMPLETE kills; everything else advances.
  • Hook. hooks/gate-c5-counter-symex.sh.
  • Rationale. Gate C.5 is adversarial UNSAT on release-tag. It complements V7 (Release Pinner) which is a DYNAMIC freshness check (PoC replay on release tarball). V9 is the STATIC / solver-based equivalent. V7 catches bug-silently-patched; V9 proves bug-mathematically-absent under release-tag semantics.
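Rebinding to "the latest upstream release tag" depends on a version-aware comparison, because plain string ordering ranks 1.9 above 1.10. A minimal sketch follows; the pre-release filter (tags ending in rc/alpha/beta suffixes) is an assumption about what counts as a release, and the tag names in the example are made up.

```python
import re

PRERELEASE_RE = re.compile(r"(rc|alpha|beta)\.?\d*$", re.IGNORECASE)


def version_key(tag):
    """Numeric components of a tag, e.g. 'release-1.10.0' -> (1, 10, 0)."""
    return tuple(int(part) for part in re.findall(r"\d+", tag))


def latest_release_tag(tags):
    """Highest-versioned tag that is not a pre-release; None if nothing qualifies."""
    candidates = [
        tag for tag in tags
        if re.search(r"\d", tag) and not PRERELEASE_RE.search(tag)
    ]
    return max(candidates, key=version_key) if candidates else None


tags = ["release-1.9.0", "release-1.10.0", "release-1.10.1-rc1"]
print(latest_release_tag(tags))
print(sorted(tags)[-1])  # what a plain lexicographic "last line" would pick
```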

Gate D — Fix-Scope Verified + Last-Mile Freshness

  • Entry. Gate C pass AND (Gate C.5 pass OR symex_class: N/A OR C.5 abstained) + Patch Drafter produces patch.diff.
  • Checks (V4 + V7).
    1. Patch Fresh-Validator verdict. patch_validation.json confirms: patch applies at declared commit; build succeeds; PoC replayed under ASan produces no crash; make check/go test ./... pass rate ≥ pre-patch baseline; independent read of patch says it closes root cause, not symptom.
    2. API-impact clean. mcp__gitnexus__api_impact(symbol=patched_fn) shows no breaking change, OR if signature changed, every downstream caller verified post-patch.
    3. Intersect vuln callers with patched callers. impact(vuln_fn, upstream) ∩ impact(patched_fn, upstream); every vuln caller must go through the patch. Any bypass = incomplete fix.
    4. Release Pinner (V7). Within 15 min of scheduled send: git fetch origin; if detect_changes(since=claim_commit) shows intersection with vuln's call path → HOLD disclosure 24h, re-validate. Re-run PoC on latest-tag AND HEAD-of-main. Re-hash every provenance atom (V2 live re-verification).
    5. Group-query cross-repo. mcp__gitnexus__group_query(group=<lang>, q=<bug pattern>) finds no known-fixed-elsewhere instance with same signature (Attack 16 defense).
    6. Patch-counter-symex. Re-run symex-juror on the PATCHED code with the original bug's SAT witness constraint pinned. Expected UNSAT_COMPLETE. If SAT, the patch does not semantically close the bug — kick back to Patch Drafter with the surviving witness. (This is technique §19.13.5.)
  • Exit. All green → surface to human batch review. Release-Pin finds upstream fixed → auto-close as patched_upstream, notify Lead, do NOT disclose.
  • Hook. hooks/gate-d-send.sh runs as the literal last step before disclosure-tracker MCP sends email; defense-in-depth against race-window.
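Check 3 reduces to a set difference: any upstream caller of the vulnerable function that is not also an upstream caller of the patched function bypasses the fix. A sketch, assuming the two impact queries have already been flattened to lists of caller names (the names below are hypothetical):

```python
def bypassing_callers(vuln_callers, patched_callers):
    """Callers that reach the vulnerable function without going through the patch."""
    return sorted(set(vuln_callers) - set(patched_callers))


def fix_is_complete(vuln_callers, patched_callers):
    """The fix is complete only when no vulnerable caller bypasses the patch."""
    return not bypassing_callers(vuln_callers, patched_callers)


vuln = ["parse_frame", "parse_header", "legacy_entry"]
patched = ["parse_frame", "parse_header"]
print(bypassing_callers(vuln, patched))  # callers to hand back to the Patch Drafter
```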

Sub-gates (cross-cutting)

  • Gate-0 (Target-Supplied Instruction Quarantine) — on every target load: hooks/quarantine-target-instructions.sh moves CLAUDE.md, AGENTS.md, .cursorrules, SECURITY.md from target root into .vr-state/targets/<project>/quarantine/, displays to operator, never into agent prompt. Defeats Attack 30.
  • Gate-Provenance (runs at every gate): hooks/pre-gate-provenance.sh validates all atoms in the finding's atoms[] array: re-shell git cat-file, curl NVD, sha256sum; any mismatch hard-blocks. Technique 3 enforced every gate transition, not just at disclosure.
  • Gate-Physical (runs at every VM interaction): hooks/post-ssh-execute.sh writes job_id + artifact_sha256 to /var/log/vr-framework/jobs.jsonl VM-side. Future gates verify job-id existence. Technique 9.
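The Gate-Provenance rehash can be sketched as a pure function over atoms. The fetch callable is a hypothetical stand-in for the real retrieval step (git cat-file, curl); it is injected so the comparison logic stays testable. A fetch failure is treated as drift, in line with the fail-closed principle.

```python
import hashlib


def sha256_hex(data):
    """Hex SHA-256 of a bytes payload."""
    return hashlib.sha256(data).hexdigest()


def verify_atoms(atoms, fetch):
    """Return the atoms whose live content no longer matches the recorded sha256.

    fetch(source_type, source_ref) -> bytes is a hypothetical retrieval hook.
    An empty result means every atom re-verified.
    """
    drifted = []
    for atom in atoms:
        try:
            live = fetch(atom["source_type"], atom["source_ref"])
        except Exception:
            drifted.append(atom)  # unreachable source counts as a mismatch
            continue
        if sha256_hex(live) != atom["sha256"]:
            drifted.append(atom)
    return drifted
```

A gate hook built on this would hard-block whenever the returned list is non-empty.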

5. State Schema

All files under .vr-state/. Invariants: version: "1"; updated_at ISO-8601 UTC; checksum sha256 of body; atomic writes under flock (write-tmp → fsync → rename). Schema-validated on read; validation failure → quarantine + alert.
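The write discipline above might look like the following on a POSIX host. This is a sketch, not the vr-atomic-state skill: it assumes the checksum covers the JSON body with the checksum field itself excluded, uses a sidecar .lock file for flock, and takes the lock before writing the temp file.

```python
import fcntl
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone


def _digest(payload):
    """sha256 over the canonical (sorted-key) JSON body, checksum field excluded."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def atomic_write_state(path, body):
    """Write a state file via lock -> temp file -> fsync -> rename; returns the checksum."""
    directory = os.path.dirname(os.path.abspath(path))
    payload = dict(body, version="1", updated_at=datetime.now(timezone.utc).isoformat())
    payload["checksum"] = _digest(payload)
    with open(path + ".lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)          # released when the lock file closes
        fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
        try:
            with os.fdopen(fd, "w") as handle:
                json.dump(payload, handle, sort_keys=True)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp, path)                 # atomic within one filesystem
        except BaseException:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise
    return payload["checksum"]


def verify_state(path):
    """Recompute the checksum on read; False means quarantine + alert."""
    with open(path) as handle:
        payload = json.load(handle)
    claimed = payload.pop("checksum", None)
    return claimed == _digest(payload)
```

Keeping the temp file in the same directory as the target matters, since a rename across filesystems is not atomic.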

.vr-state/candidates.json

Unified schema for both fresh-audit and intake-resumed candidates. See research-resumption.md for origin field taxonomy.

{
  "version": "1",
  "candidates": {
    "C-2026-0423-001": {
      "id": "C-2026-0423-001",
      "created_at": "...", "updated_at": "...",
      "target": {
        "repo": "nginx/nginx", "commit": "a1b2c3d",
        "path": "src/http/v3/ngx_http_v3_parse.c", "lines": [412, 447],
        "dep_graph_snapshot": {"go.sum": "sha256:..."}
      },
      "origin": {
        "source_type": "fresh_audit|ossfuzz|reverted_fix|silent_patch|syzbot|...",
        "source_url": "...",
        "ingested_at": "...",
        "ingested_by_agent": "lead-auditor|intake-revert-hunter|...",
        "raw_artifact_ref": ".vr-state/artifacts/<hash>/"
      },
      "hypothesis": "...",
      "bug_class": "memory-corruption|logic|auth-bypass|...",
      "oracle_class": "asan-crash|panic|race|idempotency-violation|differential|reachability-mutation|logical-only",
      "severity_estimate": {"tier": "high|medium|low", "cvss_v3": 7.5, "confidence": "low"},
      "reproduction_state": "unreproduced|known_reproducer|reproduced_locally|weaponized",
      "poc_status": "none|described|crash_only|partial|full",
      "gates": {
        "G_A": {"status": "pass|fail|pending|skipped", "evidence_ref": "gates.json#eid-001", "ts": "..."},
        "G_B": {...}, "G_C": {...}, "G_D": {...}
      },
      "gates_skipped": {"G_B": "ossfuzz testcase pre-reproduced"},
      "atoms": [
        {"claim": "line 412", "source_type": "file_content", "source_ref": "<file>#L412@<commit>", "sha256": "..."},
        {"claim": "CVE-2023-NNNNN", "source_type": "external_url", "source_ref": "https://nvd.nist.gov/...", "sha256": "..."}
      ],
      "owner": {"kind": "agent|human", "id": "agent:lead-7|roger", "locked_until": "..."},
      "artifacts": {
        "poc": ".vr-state/artifacts/C-.../poc.py",
        "crash_input": ".vr-state/artifacts/C-.../crash-0001.bin",
        "asan_log": ".vr-state/artifacts/C-.../asan.log",
        "vm_job_ids": ["job-ab12", "job-cd34", "job-ef56"],
        "patch": ".vr-state/artifacts/C-.../patch.diff",
        "disclosure_draft": ".vr-state/artifacts/C-.../disclosure.md"
      },
      "state": "triage|investigate|repro|validate|draft|awaiting_human|disclosed|closed_tp|closed_fp|closed_dup|closed_patched_upstream",
      "confidence": {"current": 0.78, "history": [...]},
      "trust_tier": "native|imported_unverified|imported_trusted",
      "dedupe_keys": {
        "stacktrace_top3_hash": "...",
        "patch_fingerprint": "...",
        "cve_candidates": []
      },
      "lineage": {"parent": null, "variant_of": "CVE-2025-12345", "spawned_by": "agent:lead-3"},
      "predictions": {"P_real": 0.85, "P_poc_reproduces": 0.92, "P_attacker_reachable": 0.70, "P_patched_in_release": 0.05},
      "strategy_id": "0day-strategy-11-copy-fn",
      "tags": ["hpack", "integer-overflow"]
    }
  }
}

Invariants. state monotonic except via explicit reopen; owner.locked_until past → lock auto-expires; gate pass requires non-null evidence_ref; confidence.current ∈ [0,1]; atoms[] all hashable & live-re-verifiable.
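A few of these invariants are cheap to check on read. The sketch below covers three of them (confidence range, evidence on gate pass, atom completeness) against the candidate shape shown above; monotonic state and lock expiry need history and a clock, so they are left out. The function name is hypothetical.

```python
REQUIRED_ATOM_KEYS = {"claim", "source_type", "source_ref", "sha256"}


def invariant_violations(candidate):
    """Return human-readable violations for one candidate record (empty list = clean)."""
    problems = []

    confidence = candidate.get("confidence", {}).get("current")
    if not isinstance(confidence, (int, float)) or not 0 <= confidence <= 1:
        problems.append("confidence.current must be a number in [0, 1]")

    for gate, entry in candidate.get("gates", {}).items():
        if entry.get("status") == "pass" and not entry.get("evidence_ref"):
            problems.append(f"{gate} is 'pass' without an evidence_ref")

    for index, atom in enumerate(candidate.get("atoms", [])):
        missing = REQUIRED_ATOM_KEYS - atom.keys()
        if missing:
            problems.append(f"atoms[{index}] missing {sorted(missing)}")

    return problems
```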

.vr-state/gates.json (append-only)

{"entries": [
  {"eid": "eid-00001847", "candidate": "C-...", "gate": "G_B",
   "verdict": "pass", "score": 1.0,
   "evidence": {"asan_signature": "heap-buffer-overflow READ 8", "reproducible_runs": "10/10",
                "vm_job_ids": ["job-ab12","job-cd34","job-ef56"], "log_ref": "artifacts/C-.../asan.log"},
   "reviewer": "agent:repro-5", "ts": "...", "duration_s": 412, "prior_eid": null}
]}

.vr-state/vm-lock.json, .vr-state/agents.json, .vr-state/queue.json, .vr-state/hallucination-log.json, .vr-state/calibration.json, .vr-state/disclosure-pipeline.json

Per Wave 2 infrastructure.md §1. Full schemas preserved verbatim. Summary:

  • vm-lock.json — slot-based VM ownership, flock + CAS atomic acquisition, systemd-scope-backed hard cap.
  • agents.json — registry + 60s heartbeat; > 180s = stalled, > 600s = dead → watchdog reclaims.
  • queue.json — priority queue per stage (triage|investigate|repro|validate|disclose) + DLQ.
  • hallucination-log.json — every caught hallucination with {claim_type, caught_by, detection_method}.
  • calibration.json — agent-type-level Brier + FP_rate_{7d,30d} + stratified-by-bug-class.
  • disclosure-pipeline.json — outbound tracking with deadline + status timeline.
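The agents.json heartbeat thresholds map to a three-way classification. A sketch using the 180s and 600s cutoffs from the summary above; the "alive" label and the function name are assumptions.

```python
STALLED_AFTER_S = 180
DEAD_AFTER_S = 600


def heartbeat_status(seconds_since_heartbeat):
    """Classify an agent by the age of its last heartbeat."""
    if seconds_since_heartbeat > DEAD_AFTER_S:
        return "dead"      # watchdog reclaims
    if seconds_since_heartbeat > STALLED_AFTER_S:
        return "stalled"
    return "alive"


print([heartbeat_status(age) for age in (60, 200, 900)])
```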

.vr-state/events/<ts>-<candidate>-<gate>.evt

Zero-byte sentinel files. PostToolUse hook on orchestrator streams them via Monitor tool; each event = one dispatch decision.

.vr-state/journal/<txn-id>.json

Multi-file transaction journal for crash recovery.

.vr-state/targets/<project>/{scope.yaml, maintainer_profile.md, archaeology_brief.md, invariants.json, coverage_gaps.json, quarantine/}

Per-target cached artifacts. scope.yaml carries operator-signed scope_ack_sig.

.vr-state/logs/<agent>.jsonl

Structured log per agent. Never deleted; compressed after 30d.


6. Claude Code Primitives — Usage Map

One line per primitive plus a concrete use. Every primitive earns its weight.

  • Plugins. Single plugin vr-audit bundling commands, agents, skills, hooks, MCP servers. Installed from private GitHub marketplace.
  • Subagents (agents/<name>.md). 11-agent roster (§3), each with tools: allowlist, model: tier, mode: worktree for destructive agents, explicit invariants:.
  • Skills (skills/<name>/SKILL.md). Agent-only (imperative library): vr-gate-schema, vr-write-gate, vr-atomic-state, vr-asan-poc, vr-vm-slot-lease, vr-corpus-snapshot. Agent+human: vr-variant-from-cve, vr-bisect, vr-harness-draft, vr-disclosure-template. Symex skills: vr-symex-hypothesis, vr-symex-slice, vr-symex-harness, vr-symex-solve, vr-symcc-concolic, vr-symex-replay (the last verifies a SAT witness replays to concrete ASan). Skills intersect with parent agent's tool profile.
  • Hooks (every event):
    • PreToolUse → pre-gate-enforce.sh: block Bash side-channel writes (tee, dd, heredocs), check VM slot lease, enforce owner lock, rate-limit token bucket.
    • PostToolUse → post-tool-log.sh + post-gate-dispatch.sh: append to logs/<agent>.jsonl, touch .vr-state/events/<ts>.evt, update calibration counters.
    • UserPromptSubmit → prompt-inject-status.sh: prepend gate-blocked/DLQ/VM/budget snippet; warn on rm -rf .vr-state.
    • SessionStart → session-brief.sh: replay journal on resume; print campaign target + gate ladder on fresh; load maintainers/<project>.md memory.
    • SessionEnd → session-dump.sh: flush in-memory buffers; rotate logs > 30d; integrity-check .vr-state/; push audit commit to private state repo.
    • Stop → stop-watchdog.sh: release any candidate locks + VM slots; mark agent stalled; re-queue unfinished task.
    • SubagentStop → subagent-verify.sh: confirm subagent produced expected gate result; if not, decrement agent-type trust score.
    • PreCompact → preserve-vr-state.sh: dump active-candidate digest + hot paths + last 5 gate decisions to .vr-state/sessions/<sid>/snapshot.json.
    • Notification → notify-dispatch.sh: append records to .vr-state/notifications/<stream>.jsonl per vr-config.json#notifications.events; surfaced at next session-brief. No external integrations (no Discord/email webhooks; the operator reviews inline).
  • Slash commands. /vuln-audit, /vuln-audit-resume, /vuln-audit-status, /vuln-audit-kill, /vuln-audit-gate-check, /vuln-audit-intake, /vuln-audit-autonomous, /vuln-audit-dashboard, /vuln-audit-calibrate, /vuln-audit-postmortem, /vuln-audit-import-session, /vuln-audit-scale-up, /vuln-audit-intake-batch, /vuln-audit-resume-abandoned.
  • MCP servers. gitnexus (primary graph), claude-code-ssh (Rocky VM; hunchom/claude-code-ssh plugin), context7 (target-library docs), nvd-query (CVE feed cache), oss-fuzz-crash, gh-sec-advisory (GHSA), codeql-runner, semgrep-runner, disclosure-tracker, angr-runner (symex executor; wraps angr + claripy + SymCC on Rocky, returns SAT/UNSAT/INCONCLUSIVE JSON — proposed Phase 5), klee-runner (KLEE-on-LLVM-IR executor via Docker klee/klee:3.0 — proposed Phase 5).
  • EnterWorktree / ExitWorktree. Patch-drafter, fix-scope-verifier, ASan PoC build, bisection (parallel worktrees — one per candidate commit), cross-version repro, fuzz campaign, DA PoC rerun with proposed fix applied, variant-hunter at CVE-fix commit. Pool cap: 8 concurrent; auto-reap > 24h to .vr-state/worktree-graveyard/.
  • TeamCreate. Per-sprint team (sprint-nginx-2026W15): orchestrator, watchdog, vm-provisioner. Workers spawn outside team via RemoteTrigger.
  • Memory (.claude/memory/). maintainers/<project>.md, false-positive-patterns/<repo>.md, novel-techniques.md, trusted-reviewers.md, dead-ends.md, user-prefs.md. Loaded on SessionStart.
  • CronCreate. vr-variant-hunt-weekly (Mon 06:00), vr-calibration-replay (daily 05:30), vr-stale-sweep (/15min), vr-disclosure-timeline (hourly), vr-target-rotation-audit (quarterly), vr-poc-regression (Sun 04:00), vr-corpus-archive (daily 03:00), vr-state-integrity (daily 02:00), vr-budget-watchdog (/30min), vr-batch-review-prompt (Sun 19:00).
  • RemoteTrigger. Fire-and-forget: variant-hunter (weekly), fuzz-campaign-starter, corpus-snapshotter, calibration-replay, vm-provisioner. Hybrid: fresh-validator, repro-engineer, bisector (triggered with task_id, polled via TaskGet).
  • ScheduleWakeup. Autonomous driver (§10). Delays: 60s events-pending, 270s tasks-in-flight (cache-warm), 1800s idle (cache cold OK). Never 300–1000s (worst cache cliff).
  • AskUserQuestion. Four canonical checkpoints: pre-disclosure approval, tie-break on similar candidates, scope confirmation, batch disclosure authorization. Multi-question payload when multiple pending — never spam.
  • Monitor. Fuzz-campaign ndjson stream (/tmp/vr/fuzz-campaign-<id>.ndjson) and .vr-state/events/ — each new crash / event = one notification, no polling.
  • Plan mode / ExitPlanMode. Canonical G_D human-in-loop gate: plan shows finding + full validation ladder + draft disclosure + proposed action. Human approves/rejects/edits.
  • Status line (bin/vr-status.sh). One line: vr | inflight: 47 | blocked: 3 | ready-send: 2 | DLQ: 1 | VM: rocky-0 67% | gate-C-042: review | next-tick: 11m | budget: $34/$100. Refreshed 5s.
  • Output styles. default (terse autonomous), plan (pre-disclosure), explanatory (--learn flag, triples cost).
  • TaskCreate / TaskGet / TaskUpdate / TaskList. Every long-running specialist dispatched via TaskCreate with deadline. Orchestrator polls TaskList(status=running|completed) per ScheduleWakeup tick. TaskUpdate(status=cancelled) for human overrides.
  • Background (run_in_background). Always: fuzz, corpus archive, VM provisioning, ASan rebuild, bisection. Never: scope-gate, dedup-triage, gate-result writes.
  • Model tier selection. All agents use Opus 4.6. Rationale: per user directive and Anthropic §1.2 (single high-capability agent beats specialized cheaper tiers when ground-truth is noisy). Haiku/Sonnet reserved for utility scripts only (crash-classifier regex, status-line shell). Cost trade-off accepted against fewer false-positive escalations to human review.
  • SendMessage. Exceptions-only: orchestrator→human (urgent ping), orchestrator→watchdog (immediate lock release), fresh-validator→orchestrator (CRITICAL halt). Default = write to shared state.
  • Permissions (.claude/settings.json). allow block for read-only + most MCP; ask for git push, disclosure send, CronCreate, ssh_deploy; deny for rm -rf .vr-state, git push --force, Write(.claude/**), sudo:*. Per-agent override narrows allowlist.
  • Marketplace. Private GitHub marketplace initially: claude plugin install github.com/roger/vr-audit. Public eventually with maintainer-specific memory stripped.
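The side-channel check that the PreToolUse hook performs can be illustrated with a short sketch. This is not the contents of pre-gate-enforce.sh (which is a shell script); it is a hypothetical Python rendering of the detection logic only, assuming the protected paths are `.vr-state/` and `/var/log/vr-framework/`. How the result maps to a hook exit code is left out.

```python
import re

# Assumed protected locations: framework state and the VM job log.
PROTECTED = (".vr-state/", "/var/log/vr-framework/")

# Write-capable shell constructs named in the hook description.
WRITE_PATTERNS = [
    re.compile(r"(^|[\s;|&])tee(\s|$)"),      # tee as a standalone command
    re.compile(r"(^|[\s;|&])dd\s+.*\bof="),   # dd with an output file
    re.compile(r"<<-?\s*['\"]?\w+"),          # heredoc opener (also matches here-strings)
]


def is_side_channel_write(command: str) -> bool:
    """True if a Bash command mentions a protected path and uses tee, dd, or a heredoc."""
    if not any(p in command for p in PROTECTED):
        return False
    return any(rx.search(command) for rx in WRITE_PATTERNS)
```

The check is deliberately conservative: a read-only `cat` of a protected file passes, while any of the three write constructs in the same command line is refused even if its actual target is elsewhere.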

7. Ralph-Loop Integration Points

Ten loops. Each has mechanical convergence, hard budget cap, fresh-validator at exit, registered with safety-net service at spawn.

| Loop | Agent | Mechanism | Convergence hook | Budget |
|---|---|---|---|---|
| harness-ralph | Fuzzing Engineer | /loop + ScheduleWakeup (270s) | bin/check-harness.sh: afl-showmap edge-delta <5% × 3 + target_hit | 12 iters / 4h / $10 |
| min-ralph | Crash Triage | agent-internal while | bin/check-min.sh: size + stack_hash stable across 2 iters | 30 iters / 2h wallclock (MIN_WALLCLOCK_S env override) |
| stab-ralph | Reproduction Agent | /loop + ScheduleWakeup (60s) | bin/check-stab.sh: sha256(exit \|\| asan_stack) identical 3 consecutive | 20 iters / 1h |
| campaign-ralph | Fuzzing Engineer | external shell (while true) | bin/check-campaign.sh: edges_total Δ=0 × 8h + 0 uniq_crashes × 24h | 7d / $100 |
| patch-ralph | Patch Drafter | /loop + ScheduleWakeup (270s) | bin/check-patch.sh: make check ok + PoC no-crash + diff ≤ 3× smallest | 8 iters / 2h |
| nov-ralph | Lead Auditor (inline) | agent-internal while | bin/check-nov.sh: 3 orthogonal NVD/GHSA/git log queries return 0 sim>0.85 hits | 5 iters |
| da-ralph | Devils Advocate | /loop + ScheduleWakeup (270s) | bin/check-da.sh: new_challenges: [] × 2 | 4 iters (diminishing) |
| variant-ralph | Lead Auditor | external shell | bin/check-variant.sh: 0 candidates_added on final iter (yield plateau) | 7d per family / $30/iter |
| invest-ralph | Lead Auditor (inline) | agent-internal while | bin/check-invest.sh: hypothesis confirmed/refuted OR dead-end added OR PoC reproduced | 6 iters per candidate |
| symex-ralph | Symex-Juror | /loop + ScheduleWakeup (270s) | bin/check-symex.sh: inspects symex_result.json#verdict ∈ {SAT_CONCRETE_VERIFIED, UNSAT_COMPLETE} OR budget_exhausted==true | 15 CPU-min HIGH (B.5), 60 CPU-min HIGH (C.5), 5 CPU-min MED |
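As one concrete instance of a mechanical convergence check, the stab-ralph criterion (identical sha256 over exit code and ASan stack for 3 consecutive runs) might be sketched as follows. The real check lives in bin/check-stab.sh; this Python version is illustrative, and the newline used to join exit code and stack is an assumption.

```python
import hashlib


def run_fingerprint(exit_code: int, asan_stack: str) -> str:
    """sha256 over the exit code concatenated with the ASan stack text."""
    return hashlib.sha256(f"{exit_code}\n{asan_stack}".encode()).hexdigest()


def stab_converged(fingerprints: list[str], needed: int = 3) -> bool:
    """True when the last `needed` fingerprints exist and are all identical."""
    tail = fingerprints[-needed:]
    return len(tail) == needed and len(set(tail)) == 1
```

Because the decision depends only on recorded fingerprints, the loop can be hard-reset between iterations without losing its convergence state.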

Safety nets.

  • Convergence criterion is always a shell hook inspecting an artifact — never the agent's self-report.
  • Hard-reset per iter: context reloaded from persisted artifacts; no accumulated chat history.
  • Divergence detector: if monotonic-expected metric moves wrong 2 iters in a row → abort, preserve best-iter.
  • Hourly budget auditor: projects trend; halts on 110% of allocated budget.
  • Fresh-validator at convergence: loops never bypass gates (patch-ralph's converged patch still goes through Gate D; harness-ralph's harness feeds normal G_A/G_B).
  • Symex-ralph has two additional hard caps: ulimit -t (CPU time) and ulimit -v 8388608 (8 GiB of virtual address space; ulimit -v takes KiB and caps address space, not RSS). Without them, symbolic memory explosion tanks the VM. SAT witnesses MUST replay to concrete ASan crash on VM; solver self-verdict is not binding without physical replay (Technique 9 extension).
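The divergence detector in the list above can be sketched in a few lines. This is an illustrative version, assuming each loop records one scalar metric per iteration; the function name and the `patience` parameter are hypothetical.

```python
def check_divergence(history: list[float], higher_is_better: bool = True,
                     patience: int = 2) -> tuple[bool, int]:
    """Return (abort, best_iter) for a metric expected to move monotonically.

    abort is True when the last `patience` steps all moved the wrong way.
    best_iter is the index of the best value seen, to be preserved on abort.
    """
    if not history:
        raise ValueError("history must contain at least one measurement")
    sign = 1 if higher_is_better else -1
    best_iter = max(range(len(history)), key=lambda i: sign * history[i])
    # Signed step sizes: positive means the metric moved in the expected direction.
    steps = [sign * (b - a) for a, b in zip(history, history[1:])]
    recent = steps[-patience:]
    abort = len(recent) == patience and all(s < 0 for s in recent)
    return abort, best_iter
```

For example, an edge-coverage history of `[10, 12, 11, 9]` aborts and points back at iteration 1 as the artifact to keep.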

8. Two Execution Modes

8a. Fresh research mode: /vuln-audit <target>

Full pipeline, one target, end-to-end.

/vuln-audit github.com/nginx/nginx [--scope path]
  │
  ▼ AskUserQuestion: confirm scope, severity-floor, budget-cap, disclosure-style
  ▼ orchestrator creates campaign dir, inits .vr-state/, loads maintainer_profile.md
  ▼
  Phase 1 — Lead Auditor SessionStart (Opus, long-context)
    ├─ audit-context-building skill — build mental model
    ├─ GitNexus: list_repos, route_map, impact, api_impact (§15)
    ├─ variant-hunting: CVE fix-diff analysis via vr-variant-from-cve
    ├─ vuln_density_score ranks all functions (§15)
    ├─ invest-ralph per candidate — propose HIGH/MED/LOW
    └─ write candidate_queue.json
  Phase 2 — Fuzzing Engineer (if ASan-class target)
    ├─ harness-ralph converges on harness
    ├─ campaign-ralph runs 7d fuzz
    └─ Crash Triage + min-ralph dedupe & minimize
  Phase 3 — Reproduction Agent per candidate
    └─ stab-ralph: 3x identical runs with VM job-id attestation
  Phase 4 — Gate A (static) → Gate B (PoC+oracle)
  Phase 5 — HIGH candidates: Fresh Validators (dual prior) + Devils Advocate (da-ralph)
    └─ Gate C
  Phase 6 — Patch Drafter (patch-ralph) → Patch Fresh-Validator
    └─ Gate D
  Phase 7 — await batch review (Sunday 19:00 UTC)
    ▼ Human approves → Release Pinner (T-15min) → Report Writer → disclosure-tracker MCP sends

Lead's session runs concurrently with specialist sessions. Lead keeps working on other candidates while Fuzzing Engineer grinds and Fresh Validator deliberates. Specialists return JSON; Lead integrates asynchronously. Session rotation daily via archaeology_brief.md handoff.

8b. Research resumption mode: /vuln-audit-intake <source>

Ingest external leads. Source auto-detected from URL or file extension:

| URL / file | Agent | Skip matrix |
|---|---|---|
| issues.oss-fuzz.com/* | intake-ossfuzz-client | skip G_B (use testcase) |
| syzkaller.appspot.com/* | intake-syzbot-client | skip G_B |
| zerodayinitiative.com, talosintelligence.com | intake-advisory-scraper | require patch public before autoingest |
| seclists.org/oss-sec/* | intake-osssec-scraper | manual-trigger only |
| .patch/.diff | intake-patch-analyzer | |
| .pdf (academic) | intake-paper-miner | manual-trigger only |
| GitHub branch URL | intake-branch-archaeologist | restrict to past-contributor forks |
| git log matching Revert "..." | intake-revert-hunter | HIGH: original patch IS bug spec |
| stable-branch commit with terse fix msg | intake-backport-differ | HIGH: cross-ref NVD first |
| reverted-then-relanded with different wording | intake-relanding-detector | HIGH |
| Other session's .vr-state/ | intake-session-merger | always re-run Gate C + Gate D; trust_tier=imported_unverified |
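A rough sketch of the URL and file-extension part of this auto-detection, covering only the rows that can be decided from the source string itself (the git-log-based rows need repository access). The function name is hypothetical, and the GitHub branch URL shape `/<owner>/<repo>/tree/<branch>` is an assumption.

```python
from typing import Optional
from urllib.parse import urlparse

HOST_AGENTS = {
    "issues.oss-fuzz.com": "intake-ossfuzz-client",
    "syzkaller.appspot.com": "intake-syzbot-client",
    "zerodayinitiative.com": "intake-advisory-scraper",
    "talosintelligence.com": "intake-advisory-scraper",
}
SUFFIX_AGENTS = {
    ".patch": "intake-patch-analyzer",
    ".diff": "intake-patch-analyzer",
    ".pdf": "intake-paper-miner",
}


def detect_intake_agent(source: str) -> Optional[str]:
    """Map a URL or file path to an intake agent name; None means no match."""
    path = source
    if "://" in source:
        url = urlparse(source)
        host = (url.hostname or "").removeprefix("www.")
        if host in HOST_AGENTS:
            return HOST_AGENTS[host]
        if host == "seclists.org" and url.path.startswith("/oss-sec/"):
            return "intake-osssec-scraper"
        parts = url.path.strip("/").split("/")
        if host == "github.com" and len(parts) >= 4 and parts[2] == "tree":
            return "intake-branch-archaeologist"
        path = url.path  # fall through to the extension check
    path = path.rstrip("/")
    if path.endswith(".vr-state"):
        return "intake-session-merger"
    for suffix, agent in SUFFIX_AGENTS.items():
        if path.lower().endswith(suffix):
            return agent
    return None
```

Returning `None` rather than guessing keeps unknown sources on the manual path.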

Unified candidate schema (§5). All origins normalize to the same candidates.json shape; origin.source_type records where each candidate came from.

Gate-skip matrix (from Wave 1 research-resumption.md):

| Source | G_A | G_B | G_C | G_D |
|---|---|---|---|---|
| fresh_audit | required | required | required | required |
| ossfuzz | required | skip (use testcase) | required | required |
| reverted_fix | required | required (reconstruct PoC) | required | required |
| silent_patch | required | required | required | required |
| imported_session | required | skip if completed | always required | always required |
| syzbot | required | skip (kernel reproducer) | required | required |

Rule. Gate C (fresh TP) and Gate D (patch + last-mile freshness) are NEVER skipped, even for imported work. Trust but verify.
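The matrix and the never-skip rule can be expressed as data plus one guard. The sketch below is illustrative: `gates_to_run` and the `gb_completed` flag are hypothetical names, and the orchestrator's real representation may differ.

```python
ALL_GATES = ("G_A", "G_B", "G_C", "G_D")
NEVER_SKIPPED = frozenset({"G_C", "G_D"})

# Gates each source type may skip, transcribed from the gate-skip matrix.
SKIPPABLE = {
    "fresh_audit": set(),
    "ossfuzz": {"G_B"},            # testcase stands in for the PoC
    "reverted_fix": set(),
    "silent_patch": set(),
    "imported_session": {"G_B"},   # only if the imported session completed G_B
    "syzbot": {"G_B"},             # kernel reproducer stands in for the PoC
}


def gates_to_run(source_type: str, gb_completed: bool = False) -> list[str]:
    """Return the ordered gates a candidate must pass; raises KeyError on unknown sources."""
    skip = set(SKIPPABLE[source_type])
    if source_type == "imported_session" and not gb_completed:
        skip.discard("G_B")
    if skip & NEVER_SKIPPED:
        raise ValueError("Gate C and Gate D are never skipped")
    return [g for g in ALL_GATES if g not in skip]
```

Keeping the guard inside the function means a future edit to `SKIPPABLE` that tries to skip Gate C or Gate D fails loudly instead of silently weakening the ladder.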

/vuln-audit-intake-batch manifest.yaml: bulk ingest with per-source rate-limit hooks.

/vuln-audit-resume-abandoned <cid>: reset the gate pointer to the first gate on which new evidence bears.

/vuln-audit-import-session <path>: three-way diff on conflicts; force gates_completed ⊆ {triage, poc_gen}; strip G_C and G_D regardless.


9. Minimum-Human-Interaction Policy

The contract. A human sees NOTHING from the pipeline until a candidate reaches Gate D pass. The orchestrator runs autonomously, ralph-loops converge mechanically, gates enforce without LLM judgment. When candidates accumulate past BATCH_THRESHOLD OR the Sunday 19:00 UTC cron fires, records land in .vr-state/notifications/*.jsonl and surface at the next session-brief. Review is inline (vr review --batch) — no external notification services.

9.1 Must-keep human touchpoints (non-negotiable)

| Touchpoint | Primitive | Rationale |
|---|---|---|
| Target scoping at /vuln-audit start | AskUserQuestion with scope/budget options | Political + business judgment |
| Batch review (Sunday 19:00 UTC) | vr review --batch TUI | Accountability anchor |
| Gate D final approval (the send button) | ExitPlanMode(approve=true) | Legal, reputation |
| Fresh-Val + DA disagreement on same HIGH finding | AskUserQuestion tie-break | Two trusted specialists disagree → human call |
| Spec-compliant-but-target-ships-it | AskUserQuestion (3 options) | RFC judgment call |
| FP-rate > 20% × 2 weeks (kill trigger) | AskUserQuestion halt | Halt accountability |
| Budget > 95% of cap | AskUserQuestion halt | Spend control |
| External reviewer onboarding | Manual registry edit | Trust relationship |

9.2 Eliminated via ralph-loop / auto

Harness iteration (harness-ralph), crash triage, candidate prioritization, fuzz management (campaign-ralph), novelty dedupe (nov-ralph), patch drafting (patch-ralph), report drafting (Report Writer one-shot), VM provisioning, stale sweep, corpus archiving, cross-repo variant hunt (variant-ralph), release pinning.

9.3 Notification channels

vr-config.json#notifications:

{
  "streams": {
    "ready_to_send_batch":    ".vr-state/notifications/review.jsonl",
    "fp_rate_alert":          ".vr-state/notifications/fp_rate.jsonl",
    "disclosure_deadline_14d":".vr-state/notifications/disclosure_deadline.jsonl",
    "budget_80pct":           ".vr-state/notifications/budget.jsonl",
    "budget_95pct":           ".vr-state/notifications/budget.jsonl",
    "validator_disagreement": ".vr-state/notifications/review.jsonl",
    "unattended_24h_summary": ".vr-state/notifications/daily.jsonl"
  }
}

Records are plain JSON lines. The session-brief hook reads recent unread entries and surfaces them at the next Claude Code session start. Status line updates every 5s — user sees progress without interrupting. No external notification services — everything is local file state.
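A minimal sketch of the append and read-unread operations on one of these streams. The record fields `ts`, `event`, and `read` are assumptions for illustration; the framework's actual record schema is not shown in this document.

```python
import json
import time
from pathlib import Path


def append_notification(stream_path: str, event: str, payload: dict) -> None:
    """Append one JSON-lines record to a notification stream, creating it if needed."""
    path = Path(stream_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Reserved fields come last so a payload cannot overwrite them.
    record = {**payload, "ts": time.time(), "event": event, "read": False}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def unread(stream_path: str) -> list[dict]:
    """Return unread records in file order; a missing stream yields an empty list."""
    path = Path(stream_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    return [r for r in records if not r.get("read", False)]
```

Append-only writes keep the streams safe to tail and easy to audit; marking records as read would be a separate rewrite step, omitted here.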

9.4 Ping thresholds

Pull human back in:

  • Unattended > 24h → daily summary record appended to .vr-state/notifications/daily.jsonl.
  • FP rate 7d > 15% → immediate append to .vr-state/notifications/fp_rate.jsonl.
  • DA and Fresh-Val verdicts differ on 2 candidates → immediate append to .vr-state/notifications/review.jsonl.
  • Budget > 80% → budget.jsonl warn; > 95% → halt campaigns + budget.jsonl halt record.
  • Any Stop hook fires on Lead session mid-work → append to .vr-state/notifications/stall.jsonl.
  • Disclosure deadline < 14d with status ≠ patched → daily append to .vr-state/notifications/disclosure_deadline.jsonl.
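The thresholds above reduce to a pure function over a status snapshot. The sketch below is hypothetical: `PipelineStatus` and its field names are invented for illustration, the disagreement threshold is read as "at least 2", and `lead_stalled` is a placeholder event name since the stall stream has no entry in vr-config.json#notifications.

```python
from dataclasses import dataclass


@dataclass
class PipelineStatus:
    hours_unattended: float
    fp_rate_7d: float               # fraction, e.g. 0.18 for 18%
    disagreeing_candidates: int     # candidates where DA and Fresh-Val differ
    budget_fraction: float          # spend divided by cap
    lead_stop_hook_fired: bool
    disclosures_due_unpatched: int  # deadline < 14d and status != patched


def pending_pings(s: PipelineStatus) -> list[tuple[str, str]]:
    """Return (notification file, event) pairs for every tripped threshold."""
    pings = []
    if s.hours_unattended > 24:
        pings.append(("daily.jsonl", "unattended_24h_summary"))
    if s.fp_rate_7d > 0.15:
        pings.append(("fp_rate.jsonl", "fp_rate_alert"))
    if s.disagreeing_candidates >= 2:
        pings.append(("review.jsonl", "validator_disagreement"))
    if s.budget_fraction > 0.95:
        pings.append(("budget.jsonl", "budget_95pct"))   # halt record
    elif s.budget_fraction > 0.80:
        pings.append(("budget.jsonl", "budget_80pct"))   # warn record
    if s.lead_stop_hook_fired:
        pings.append(("stall.jsonl", "lead_stalled"))
    if s.disclosures_due_unpatched > 0:
        pings.append(("disclosure_deadline.jsonl", "disclosure_deadline_14d"))
    return pings
```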

9.5 Batch review UX

Command: vr review --batch --since last-batch. SRS-style cards; a/r/d/k/e/? keys; auto-commit to disclosure-pipeline.json; trigger Release Pinner on approvals; schedule sends 24h apart per-vendor. Mobile variant: vr serve --mobile over Tailscale.

Hard cap: findings older than 14 days force-surface as red; cannot defer twice.


10. Autonomous Driver (Pseudocode)

/vuln-audit-autonomous --budget-usd N --hours H [--targets file] [--batch-review-cadence weekly].

Runs until: budget exhausted, hours exceeded, human-required gate hit, or explicit /vuln-audit-kill.

Orchestrator is Python, not an LLM (ADR §Appendix A).

def autonomous_tick():
    # 1. Drain event queue (gate results, crash events, timer fires)
    for ev in drain_events(".vr-state/events/"):
        dispatch_next(ev)  # enqueues next gate per state machine

    # 2. Reap completed / failed background tasks
    for tid in TaskList(status="completed"):
        advance_gate_from(tid)
    for tid in TaskList(status="failed"):
        handle_failure(tid)  # retry with backoff or DLQ after 3 attempts

    # 3. Heartbeat check — reclaim dead agents' locks
    reclaim_dead_agents(threshold_s=600)

    # 4. Gate dispatch
    for cand in candidates_ready():
        gate = cand.next_gate
        if gate in HUMAN_GATES:
            enqueue_for_batch_review(cand)
        else:
            TaskCreate(agent=agent_for(gate), input=cand.to_json(), deadline=1800)

    # 5. Ralph-loop dispatch
    for loop in active_loops():
        if loop.mechanism == "external_shell":
            continue  # running as subprocess
        if loop.mechanism == "loop_slash" and ready_for_next_iter(loop):
            trigger_loop_tick(loop)

    # 6. Human gate check — BATCH only, never one-at-a-time
    if pending_human_reviews() >= BATCH_THRESHOLD or batch_cadence_hit():
        SendMessage(to=user, summary=human_ready_summary())

    # 7. Budget / health / alerts (thresholds per §9.4: warn at 80%, halt at 95%)
    if spend_today() > 0.95 * budget_total:
        graceful_halt()
        return
    if spend_today() > 0.8 * budget_total:
        alert_user("budget 80%")
        pause_new_campaigns()

    # 8. Weekly-cron checks
    if weekly_cron_due("new-cve-rehunt"):
        RemoteTrigger("variant-hunter", last_week_nvd_delta())

    # 9. Compute next wakeup — cache-aware (never 300-1000s)
    delay = compute_delay()
    ScheduleWakeup(delaySeconds=delay,
                   prompt="<<autonomous-loop-dynamic>>",
                   reason=describe_state())

def compute_delay():
    if events_pending():              return 60
    if tasks_in_flight():             return 270   # cache-warm
    if vm_saturated_work_queued():    return 270
    if queue_empty() and no_active_loops(): return 1800  # idle, cache cold OK
    return 270  # default conservative

Cron integration (set at campaign start):

CronCreate: "0 6 * * 1"    → variant-hunter on last-week NVD delta
CronCreate: "30 5 * * *"   → calibration-analyst replays
CronCreate: "0 19 * * 0"   → batch-review prompt (Sunday 7pm UTC)
CronCreate: "*/15 * * * *" → stale-sweep watchdog
CronCreate: "0 4 * * 0"    → vr-poc-regression (all past TPs vs tip)

Monitor streams from .vr-state/events/ and fuzz ndjson; each event wakes orchestrator immediately, bypassing ScheduleWakeup.


11. Plugin Structure

plugin.json:

{
  "name": "vr-audit",
  "version": "1.0.0",
  "description": "Vulnerability research pipeline for Claude Code: six-gate verification ladder with hook-enforced evidence",
  "author": "roger",
  "commands": [
    "vuln-audit", "vuln-audit-resume", "vuln-audit-status", "vuln-audit-kill",
    "vuln-audit-gate-check", "vuln-audit-intake", "vuln-audit-intake-batch",
    "vuln-audit-resume-abandoned", "vuln-audit-autonomous",
    "vuln-audit-dashboard", "vuln-audit-calibrate", "vuln-audit-postmortem",
    "vuln-audit-import-session", "vuln-audit-scale-up"
  ],
  "agents": [
    "lead-auditor", "fresh-validator", "devils-advocate",
    "fuzzing-engineer", "crash-triage", "reproduction-agent",
    "symex-juror",
    "patch-drafter", "patch-fresh-validator", "release-pinner",
    "report-writer"
  ],
  "skills": [
    "vr-gate-schema", "vr-write-gate", "vr-atomic-state", "vr-asan-poc",
    "vr-gitnexus-protocols", "vr-harness-draft", "vr-disclosure-template",
    "vr-variant-from-cve", "vr-bisect", "vr-corpus-snapshot", "vr-vm-slot-lease",
    "vr-fp-check", "vr-counterfactual-patch", "vr-provenance-merkle",
    "vr-oracle-triangulate", "vr-fingerprint-match",
    "vr-symex-hypothesis", "vr-symex-slice", "vr-symex-harness",
    "vr-symex-solve", "vr-symcc-concolic", "vr-symex-replay"
  ],
  "hooks": ".claude/hooks/",
  "mcpServers": {
    "gitnexus":           {"command": "gitnexus", "args": ["mcp"]},
    "claude-code-ssh":    {"command": "claude-code-ssh-mcp"},
    "context7":           {"command": "npx", "args": ["-y", "@upstash/context7-mcp"]},
    "nvd-query":          {"command": "${CLAUDE_PLUGIN_ROOT}/mcp/nvd-query.py"},
    "oss-fuzz-crash":     {"command": "${CLAUDE_PLUGIN_ROOT}/mcp/oss-fuzz.py"},
    "semgrep-runner":     {"command": "${CLAUDE_PLUGIN_ROOT}/mcp/semgrep.py"},
    "codeql-runner":      {"command": "${CLAUDE_PLUGIN_ROOT}/mcp/codeql.py"},
    "gh-sec-advisory":    {"command": "${CLAUDE_PLUGIN_ROOT}/mcp/ghsa.py"},
    "disclosure-tracker": {"command": "${CLAUDE_PLUGIN_ROOT}/mcp/disclosure-tracker.py"},
    "angr-runner":  {"command": "${CLAUDE_PLUGIN_ROOT}/mcp/angr-runner.py"},
    "klee-runner":  {"command": "${CLAUDE_PLUGIN_ROOT}/mcp/klee-runner.py"}
  },
  "settings": {"statusLine": "${CLAUDE_PLUGIN_ROOT}/bin/vr-status.sh"},
  "privacy": {"shareFindings": false, "telemetryOptOut": true}
}

Directory layout:

vr-audit/
├── plugin.json
├── commands/        (slash command md files)
├── agents/          (10 agent md files w/ YAML frontmatter)
├── skills/          (22 skills: 16 core + 6 symex, each SKILL.md + helper files)
├── hooks/           (~18 hook scripts, one per event × context)
├── mcp/             (Python MCP server implementations, incl. angr-runner.py, klee-runner.py)
├── bin/             (vr-status.sh, vr-dashboard, check-*.sh, vr-variant-loop.sh)
├── templates/       (disclosure templates, patch notes templates)
├── schemas/         (JSON schemas for .vr-state/*.json — CODEOWNERS locked)
└── migrations/      (schema version migrations)

Privacy invariants baked in: .vr-state/ gitignored at plugin level; memory files never bundled; telemetry off; disclosure drafts local-only; evidence artifacts local.


12. Phased Rollout

Five phases. Each delivers a testable increment.

Phase 1 — Foundation + Single-Agent Fresh Audit (Weeks 1–2)

Goal. End-to-end /vuln-audit <target> in HIGH-finding-only mode. Single Lead Auditor + Reproduction Agent + Report Writer. One ralph-loop (invest-ralph). All 7 V1–V7 checks implemented as separate hook scripts; Gate ladder wired but tiering not yet live.

Files / agents touched.

  • agents/lead-auditor.md, agents/reproduction-agent.md, agents/report-writer.md.
  • skills/vr-write-gate, skills/vr-atomic-state, skills/vr-asan-poc, skills/vr-gitnexus-protocols, skills/vr-provenance-merkle.
  • hooks/gate-{a,b,c,d}-*.sh, hooks/pre-gate-provenance.sh, hooks/post-ssh-execute.sh, hooks/quarantine-target-instructions.sh.
  • mcp/nvd-query.py, mcp/disclosure-tracker.py.
  • schemas/candidates.json, schemas/gates.json.
  • .vr-state/ schema + atomic-write library.

Acceptance. Run /vuln-audit github.com/<small-known-vuln-target> on a historical 0-day (e.g., reintroduce CVE-2026-33523 AJP splitting). Framework produces disclosure-ready finding passing all 4 gates within 8h of wall-clock. Manual human approval demonstrated.

Expected time. ~2 weeks.

Phase 2 — Dual-Prior Validation + Devils Advocate + Full Anti-Hall Spine (Weeks 3–4)

Goal. Add V5 (dual-prior fresh validators), V6 (Devils Advocate exhaustion loop), V3 (oracle triangulation). Enable the P(FP) ≤ 0.01 target for HIGH findings.

Files / agents touched.

  • agents/fresh-validator.md (spawned twice with system_prompt_flavor), agents/devils-advocate.md, agents/patch-drafter.md, agents/patch-fresh-validator.md, agents/release-pinner.md.
  • skills/vr-counterfactual-patch, skills/vr-oracle-triangulate (with CodeQL + Semgrep + GitNexus sub-skills), skills/vr-fingerprint-match.
  • mcp/codeql-runner.py, mcp/semgrep-runner.py.
  • Hooks: pre-ralph-register-safety-net.sh, post-ralph-convergence-verify.sh.
  • bin/check-da.sh, bin/check-patch.sh, bin/check-nov.sh.
  • logs/crash_corpus.sqlite (ingested from historical CVE stack traces).

Acceptance. Replay 3 known-true-positives + 3 known-false-positives through full gate ladder; TPs pass, FPs rejected. Injected fabricated finding (synthetic ASan, fake SHA, plausible-but-wrong CVE) dies at each of V1/V2/V5/V6.

Expected time. ~2 weeks.

Phase 3 — Autonomous Driver + Ralph-Loops + Fuzzing (Weeks 5–7)

Goal. /vuln-audit-autonomous runs unattended for 72h. All 9 ralph-loops live. Fuzzing Engineer spawns and grinds. Crash Triage + min-ralph. Batch-review UX.

Files / agents touched.

  • agents/fuzzing-engineer.md, agents/crash-triage.md.
  • All 9 *-ralph loop drivers + their convergence hooks in bin/.
  • commands/vuln-audit-autonomous.md + orchestrator Python (bin/vr-orchestrator.py).
  • CronCreate setup for weekly/daily crons.
  • bin/vr-status.sh, bin/vr-dashboard, vr review --batch TUI.
  • VM-lock + vm-provisioner + systemd-scope integration.

Acceptance. Unattended 72h campaign on nginx or httpd produces ≥1 HIGH finding that passes all V1–V7 checks and is surfaced to batch review, with < $150 spend and no human intervention. Kill-review spot-check (P1 meta-process) confirms rejected candidates were truly rejected.

Expected time. ~3 weeks.

Phase 4 — Intake Pipeline + Tiering + Calibration (Weeks 8–10)

Goal. /vuln-audit-intake ingests the 15 sources (ossfuzz, syzbot, reverted commits, silent patches, relandings, imported sessions, etc.). MEDIUM/LOW tiering drops cost 60% for non-HIGH. Calibration ledger live (log-only, not routing).

Files / agents touched.

  • Intake agents: agents/intake-ossfuzz-client.md, agents/intake-syzbot-client.md, agents/intake-revert-hunter.md, agents/intake-backport-differ.md, agents/intake-relanding-detector.md, agents/intake-session-merger.md, etc.
  • commands/vuln-audit-intake.md, commands/vuln-audit-intake-batch.md, commands/vuln-audit-import-session.md, commands/vuln-audit-resume-abandoned.md.
  • schemas/candidates.json — source_type taxonomy + gate-skip matrix.
  • .vr-state/calibration.json + nightly Brier recompute job.
  • Tiering logic in orchestrator: HIGH=full stack, MEDIUM=skip G_C V5+V6, LOW=Lead-only.

Acceptance. Ingest 10 ossfuzz issues + 5 reverted-fix commits + 1 abandoned .vr-state/ session. Each produces candidate with correct gate-skip pattern; imported session forced through G_C + G_D re-validation. MEDIUM findings process at < $40 amortized cost.

Expected time. ~3 weeks.

Phase 5 — Symbolic Execution as Mechanical Juror (Weeks 11–13)

Goal. Wire Gate B.5 (V8) and Gate C.5 (V9) into the pipeline. Symex-Juror agent + symex-ralph loop live. Closes residual P(FP) gap for tractable bug classes (integer overflow, bounds, branch-feasibility, invariants, small-state parsers).

Files / agents touched.

  • agents/symex-juror.md (ephemeral Opus 4.6; two flavors: juror and counter-adversary).
  • Skills: vr-symex-hypothesis, vr-symex-slice, vr-symex-harness, vr-symex-solve, vr-symcc-concolic, vr-symex-replay.
  • Hooks: gate-b5-symex.sh, gate-c5-counter-symex.sh, post-symex-merkle.sh (enroll result into provenance).
  • Bin: bin/check-symex.sh.
  • MCP: mcp/angr-runner.py, mcp/klee-runner.py.
  • Rocky VM provisioning: install KLEE via docker pull klee/klee:3.0, build SymCC from source against clang-20, verify angr 9.2 + claripy + z3 + unicorn already present (per user VM inventory).
  • Memory: ~/.claude/memory/symex-ledger.md seeded, sync wired into SessionEnd hook.
  • Schema: schemas/symex_query.json, schemas/symex_result.json (CODEOWNERS-locked).

Acceptance. (1) Replay CVE-2022-41723 (HPACK integer) and confirm B.5 returns SAT-concrete-verified. (2) Replay VULN-009 (url-zone-bracket, fixed in Go 1.24.2) and confirm C.5 returns UNSAT_COMPLETE on release tag → finding killed as closed_patched_upstream_symex_confirmed. (3) Seed-inject a plausible-but-impossible bound claim (agent says overflow, algebra forbids); B.5 returns UNSAT_COMPLETE and auto-rejects. (4) At least one HIGH finding produced end-to-end through V1–V9 with witness.bin in artifacts.

Expected time. ~3 weeks.

Total build time. ~13 weeks to production (Phase 5 symex optional for first disclosure; hard gate from Phase 5+).


13. Anti-Hallucination Spine

Top 7 techniques, each enforced mechanically. These ship in Phase 1–2 and are non-negotiable.

  1. Provenance Merkle tree (V2, Technique 3). Every factual atom (claim, source_type, source_ref, sha256). Hook live-rehashes at every gate transition AND at disclosure send time. Hard-block on mismatch. Catches: confabulated CVE IDs, line drift after rebase, fabricated SHAs. Enforced by hooks/pre-gate-provenance.sh.

  2. Execution-Trace Physical Grounding (V1, Technique 9). Every ASan/gdb/strace artifact carries a VM-side job_id written to /var/log/vr-framework/jobs.jsonl on Rocky. Future gates verify job-id existence + artifact SHA match. Catches: fabricated stderr, synthetic ASan stack traces. Enforced by hooks/post-ssh-execute.sh.

  3. Counterfactual Diff Inversion (V4, Technique 1). Patch mechanically applied in worktree; PoC replayed; must NOT crash. Fresh agent on patched file must not see the bug. Catches: mislocated root cause, cosmetic fixes. Enforced by skills/vr-counterfactual-patch invoked during Gate C.

  4. Dual-Prior Differential Derivation (V5, Technique 5). Two Fresh Validators with opposing priors (defensive + aggressive) both derive compatible trigger inputs from raw source only. PoC stack hashes must match. Catches: shared-training-prior hallucinations, mode collapse on CVE-lookalikes. Enforced by agents/fresh-validator.md spawned twice with distinct system_prompt_flavor.

  5. External-Oracle Triangulation (V3, Technique 6). Claim translated into CodeQL + Semgrep + GitNexus queries; ≥2 must fire; each query specificity ≤10 results OR claim in top-5. Catches: claims internally consistent but not in AST. Enforced by skills/vr-oracle-triangulate.

  6. Crash Fingerprint Corpus (Technique 7). logs/crash_corpus.sqlite with three buckets: historical CVE fingerprints, prior findings, known-benign (all test-fixture strings auto-ingested). Fuzzy match on top-3 app-frame n-grams + stack depth. Catches: rediscovered patched CVEs, test-fixture-as-authority. Enforced by skills/vr-fingerprint-match.

  7. Last-Mile Release Pinning (V7, Technique 4). T-15min before disclosure send: git fetch, re-run PoC on tagged release AND HEAD-of-main, re-hash provenance atoms. Hold 24h if HEAD diverged from claim commit in ways intersecting vuln path. Catches: silently-patched-between-audit-and-disclosure, tag-vs-main race window. Enforced by agents/release-pinner.md running from Gate D hook.
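To make the first technique concrete, here is a minimal sketch of a Merkle root over provenance atoms and the live re-hash check. The atom fields follow the tuple named in item 1; everything else is an assumption for illustration: the leaf/node prefix bytes, duplicating the last node on odd levels, and the `rehash_ok` helper taking current source bytes as a dictionary.

```python
import hashlib
from typing import NamedTuple


class Atom(NamedTuple):
    claim: str
    source_type: str
    source_ref: str
    sha256: str  # hex digest of the referenced source content at capture time


def leaf_hash(atom: Atom) -> bytes:
    """Hash one atom; the 0x00 prefix separates leaves from interior nodes."""
    return hashlib.sha256(b"\x00" + "\x00".join(atom).encode()).digest()


def merkle_root(atoms: list[Atom]) -> str:
    """Fold leaf hashes pairwise into a single hex root."""
    if not atoms:
        raise ValueError("at least one atom is required")
    level = [leaf_hash(a) for a in atoms]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumed policy: duplicate the odd node
        level = [hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


def rehash_ok(atoms: list[Atom], current: dict[str, bytes], expected_root: str) -> bool:
    """Live re-hash: each source must still match its atom, and the root must match."""
    for atom in atoms:
        content = current.get(atom.source_ref)
        if content is None or hashlib.sha256(content).hexdigest() != atom.sha256:
            return False
    return merkle_root(atoms) == expected_root
```

A source line that drifts after a rebase changes the content digest, so the check fails at the next gate transition even though the recorded root itself is untouched.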

Deferred (Phase 3+ or optional):

  • Mutation-Testing Reachability Gate (Technique 10) — compute-expensive; reserve for HIGH-severity findings, path length ≤ 4, compile with -O0 -fno-inline. Catches phantom reachability claims.
  • Symbolic Juror (Technique 2): PROMOTED to Phase 5 mandatory gate (§19). Applies to tractable classes (integer, bounds, branch-feasibility, invariants, small-state parsers). Implemented as Gate B.5 (SAT-witness) + Gate C.5 (adversarial UNSAT on release tag) + Gate D patch-counter-symex. Non-tractable classes (concurrency, spec, exploitability) remain V1–V7 only; still ≥99% without symex.
  • Calibration Ledger (Technique 8) — log-only in Phase 4; don't wire into routing until 50+ findings produce data.

14. Red-Team Attack Coverage

Top 5 attacks + defenses + residual risk.

| # | Attack | Defense in this architecture | Residual risk |
|---|---|---|---|
| 7 | Synthetic ASan output (fabricated stderr) | V1 (Execution-Trace Physical Grounding): VM-side `job_id` + artifact SHA verification at every gate | None if hook is correctly installed. Risk is hook bypass; mitigated by PreToolUse on Bash blocking side-channel writes (`tee`, `dd`, heredoc into `/var/log/vr-framework/jobs.jsonl`) |
| 15 | Real-SHA-wrong-fix in `fix_diffs_read[]` | V6 (DA exhaustion) demands verbatim hunks from `git show` with `diff_line_overlap_with_finding_location` ≥ 1 line mechanically checked | Low. An attacker with push to the target repo could craft a fix that overlaps the claimed line but doesn't fix the bug; this is partially caught by V4 counterfactual inversion (patch+replay must silence crash). Residual risk exists if both V4 and V6 are fooled by same supply-chain-poisoned commit; pinned by `dep_graph_snapshot` provenance. |
| 23 | Silently patched between audit and disclosure | V7 (Release Pinner) T-15min freshness + `hooks/gate-d-send.sh` as literal last step before SMTP send + provenance live-rehash | Near-zero. 15-minute window between V7 and send is a tiny attack surface; auto-hold 24h if any divergence in vuln path. |
| 1 | Fresh Validator shares prior (shared model weights) | V5 (Dual-Prior Differential Derivation): two validators with opposing `system_prompt_flavor` must derive compatible triggers from raw source | Residual: both priors share the same underlying model. An extremely strong attractor (e.g., well-known CVE-pattern-lookalike) could fool both priors AND V6 simultaneously. Mitigated only by the 6-fold independence across V1–V7; not fully eliminated. |
| 18 | Adversarial comments / `CLAUDE.md` in target | `hooks/quarantine-target-instructions.sh` moves `CLAUDE.md`, `AGENTS.md`, `SECURITY.md`, `.cursorrules` to `.vr-state/targets/<project>/quarantine/` before agent sees target. Lead Auditor operates on AST-stripped code view | Residual: source comments (not top-level instruction files) still accessible to Lead for context. An attacker with commit access could embed "this is safe because X" to mislead. Defense-in-depth: V4/V5/V6 re-derive from AST regardless of comments; V3 queries ignore comments. Strong defense overall. |
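
The V6 overlap check for attack 15 can be made concrete. What follows is a minimal Python sketch, not the framework's actual hook. It assumes the fix diff is a single-file unified diff (as printed by `git show`) and that the finding location is a line range at the pre-fix commit. The names `old_side_ranges` and `diff_line_overlap` are hypothetical.

```python
import re

# Matches a unified-diff hunk header such as "@@ -120,7 +120,9 @@".
# Only the old-side (pre-fix) start and length are captured.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+\d+(?:,\d+)? @@")


def old_side_ranges(diff_text):
    """Yield (first_line, last_line) for each hunk on the pre-fix side."""
    for line in diff_text.splitlines():
        match = HUNK_RE.match(line)
        if not match:
            continue
        start = int(match.group(1))
        # A missing length means the hunk covers exactly one line.
        length = int(match.group(2)) if match.group(2) is not None else 1
        if length > 0:  # length 0 is a pure insertion: no old lines touched
            yield start, start + length - 1


def diff_line_overlap(diff_text, finding_first, finding_last):
    """Count pre-fix lines shared by the diff hunks and the finding range."""
    overlap = 0
    for first, last in old_side_ranges(diff_text):
        lo, hi = max(first, finding_first), min(last, finding_last)
        if lo <= hi:
            overlap += hi - lo + 1
    return overlap
```

A gate built on this would reject the claimed fix when `diff_line_overlap(...)` returns 0. Hunk extents include context lines, so a stricter variant would count only removed (`-`) lines. As the table notes, overlap alone does not show that the fix is correct.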

Honest residual risks:

  • Attack 21 (Brier gaming) partially undefended in Phase 1–3 because calibration ledger is log-only. Phase 4 stratifies Brier by claim class + novelty; still Goodhart-prone. Mitigation is non-mechanical (human review of kill decisions per P1).
  • Attack 22 (conservative agent misses real bugs) — we accept precision/recall tradeoff. Seed-bug injection (§P2 calibration day) bounds FN rate empirically.
  • Attack 31 (mutation-reach inlining) — compile with -O0 -fno-inline -fno-optimize-sibling-calls, but optimization-disabled binary ≠ production binary. We accept this: mutation gate validates the path exists in some build; production crash is validated separately by V1.
  • Attack 29 (schema/hook/CODEOWNERS supply-chain) is mitigated but not closed: .claude/ is CODEOWNERS-locked and requires two-human review for changes. CI fails if required: list shrinks or hook exit codes soften. Still assumes the maintainer's PR review process is trustworthy.
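
The "CI fails if required: list shrinks" rule for attack 29 lends itself to a small mechanical check. The sketch below is one possible shape for it, assuming the schemas are plain JSON Schema documents already loaded as Python dicts for the base and head revisions. The function names are hypothetical and the real CI job may work differently.

```python
def required_fields(schema, path="#"):
    """Collect every JSON Schema `required` entry as 'pointer:field' strings."""
    found = set()
    if isinstance(schema, dict):
        required = schema.get("required")
        # Only a list value is the `required` keyword; a dict here would be
        # a property that merely happens to be named "required".
        if isinstance(required, list):
            found.update(
                f"{path}:{name}" for name in required if isinstance(name, str)
            )
        for key, child in schema.items():
            found |= required_fields(child, f"{path}/{key}")
    elif isinstance(schema, list):
        for index, child in enumerate(schema):
            found |= required_fields(child, f"{path}/{index}")
    return found


def removed_required(base_schema, head_schema):
    """Return required entries present on the base revision but not on head."""
    return sorted(required_fields(base_schema) - required_fields(head_schema))
```

A CI step could exit non-zero whenever `removed_required(base, head)` is non-empty. Adding new required fields passes, since only removals are reported.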

15. GitNexus Usage Index

Per-agent, the 2–3 most important queries. Condensed from gitnexus-heavy-integration.md.

| Agent | Queries |
|---|---|
| Lead Auditor (variant hunt) | `detect_changes(scope=compare, base=<pre-fix>, head=<post-fix>)` for CVE fix diffs; `impact(patched_fn, upstream, depth=3)` to enumerate all callers that didn't get the analogous fix; `group_query(group=<lang>, q=<pattern>)` for cross-repo variants. |
| Lead Auditor (architecture map) | `route_map(repo)` for HTTP/RPC entrypoints; `impact(handler, downstream, depth=5)` for per-entrypoint reach set; `cypher("MATCH (f) WHERE NOT ()-[:CALLS]->(f) AND f.name =~ '(?i).*(handle\|serve\|dispatch).*' RETURN f")` for externally-reachable. |
| Lead Auditor (`vuln_density_score`) | `impact(F, upstream, depth=6)` for blast radius; `detect_changes(scope=history, fn=F)` for churn-or-silence; `group_query` for cross-repo callers (public API surface). |
| Fresh Validator | `context(name=<claimed_fn>, file=<claimed_file>, content=true)` to verify file:line exists at pinned commit; `impact(target=fn, upstream, depth=5)` for reachability witness; `cypher("MATCH path=(e)-[:CALLS*1..10]->(v {uid:'...'}) RETURN path LIMIT 3")` for shortest reach path. |
| Devils Advocate | `impact(vuln_fn, upstream, depth=6)` to find any pre-call validator; `detect_changes(scope=since, ref=<claim_commit>)` for silent upstream fix; `shape_check(struct=<input>)` for invariants enforced at construction. |
| Fuzzing Engineer | `context(name=<vuln_fn>, content=true)` + all functions on ingress→vuln path (harness source dump); `shape_check(struct=<input_struct>)` for payload constraints; `impact(target=vuln_fn, upstream, includeTests=true)` for existing tests as PoC templates. |
| Reproduction Agent | `context(name=<frame_fn>, content=true)` per ASan stack frame for physical-grounding; `cypher("MATCH path=(e)-[:CALLS*1..10]->(v {uid:'...'}) RETURN path LIMIT 1")` for concrete call chain to replicate. |
| Patch Drafter | `context(name=<vuln_fn>)` for exact source; `api_impact(symbol=<patched_fn>)` to avoid breaking downstream; `impact(patched_fn, upstream)` to verify every caller still satisfies new contract. |
| Patch Fresh-Validator | `detect_changes(scope=compare, base=HEAD, head=<patched>)` for touched functions; intersect `impact(originally_vuln_fn, upstream)` with `impact(patched_fn, upstream)`; any bypass caller = incomplete fix. |
| Release Pinner | `detect_changes(scope=compare, base=<release>, head=<tip>)` for drift between claim and release; `api_impact(symbol=<vuln_fn>)` to check rename/deletion. |
| Report Writer | `context(name=<vuln_fn>, content=true)` to embed exact source; `impact(vuln_fn, upstream, depth=3)` to cite reachable ingress path; `detect_changes(scope=history, path=<file>)` for "introduced in commit X" advisory text. |

Failure-mode fallbacks per gitnexus-heavy-integration.md §6: stale graph → gitnexus-cli analyze --force in background + block Gate C; ambiguous symbol → pass file=<path>; slow Cypher → add LIMIT + depth cap; Go-stdlib-too-big → vendored subset; MCP unavailable → grep+read fallback documented per skill.


16. Scaling Path

10 in flight (Day 1)

Single Rocky VM, single researcher, flat .vr-state/. Works. What binds first: operator fatigue (not infrastructure). TUI dashboard adequate.

50 in flight

Needs:

  • Priority queue (not FIFO) in queue.json.
  • Watchdog + heartbeat discipline enforced (no silent stalls).
  • Checkpoint/restore on fuzz campaigns (8-core Rocky sustains ~3 concurrent fuzz slots; time-slice across more candidates).
  • Batch-review mode live (50 findings/week = 10 per working day on average).
  • Calibration ledger log-only → routing: per-agent-type FP rate affects queue priority.
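
The two queue-related needs above (priority ordering instead of FIFO, and per-agent-type FP rate feeding into priority) can be sketched with a binary heap. This is an illustration only: the sort key, the `CandidateQueue` class, and the field names are assumptions, and the real `queue.json` layout is not specified here.

```python
import heapq
import itertools

# Lower rank is served first. The tier names follow the HIGH/MEDIUM/LOW tiering.
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


class CandidateQueue:
    """Priority queue: severity first, then the originating agent's FP rate."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def push(self, candidate_id, severity, agent_fp_rate):
        # Candidates from agent types with a lower measured FP rate go first.
        key = (SEVERITY_RANK[severity], agent_fp_rate, next(self._counter))
        heapq.heappush(self._heap, (key, candidate_id))

    def pop(self):
        return heapq.heappop(self._heap)[1]

    def __len__(self):
        return len(self._heap)
```

With this key, a HIGH candidate from an agent type with a 5% measured FP rate is dequeued before a HIGH candidate from one with a 30% rate, and both precede any LOW candidate regardless of arrival order.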

Breaks first. VM CPU (fuzzing contention) + human review throughput.

500 in flight

Needs full Wave 2 infrastructure:

  • vm-provisioner auto-spinning Rocky nodes (5–10 concurrent via DO API, golden-snapshot boot ~15min).
  • Shared state: git-backed .vr-state/ → Postgres or S3+DynamoDB for concurrent locks. NFS flock unreliable at scale.
  • External human validators: vr-config.json#external_reviewers with scoped access.
  • Automated disclosure deadline tracking with escalation ladder (21d → pre-CVE reserve; 30d → public warning).
  • Calibration becomes per-agent-type × per-vulnerability-class (investigator confidence on memcorr ≠ on auth-bypass).
  • disclosure-coordinator role split from validator — coordinator manages per-maintainer token bucket + CRM-lite.
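
The escalation ladder for disclosure deadlines (21d → pre-CVE reserve; 30d → public warning) reduces to a date comparison. The sketch below assumes the clock starts on the day the report was first sent to the maintainer. That start point, the action labels, and the function name are all illustrative assumptions.

```python
from datetime import date

# (days elapsed, action), checked from the latest rung down.
ESCALATION_LADDER = [
    (30, "public-warning"),
    (21, "pre-cve-reserve"),
]


def escalation_step(reported_on, today):
    """Return the escalation action currently due, or None if none is due."""
    elapsed_days = (today - reported_on).days
    for threshold_days, action in ESCALATION_LADDER:
        if elapsed_days >= threshold_days:
            return action
    return None
```

A daily tracker job could call `escalation_step(reported_on, date.today())` for every open disclosure and surface any non-`None` result to the coordinator.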

Breaks first at 500. Three simultaneously: (a) git-merge pain on .vr-state/, (b) VM budget > $50/day, (c) disclosure coordination across ~100 maintainers needs CRM-lite, not ad-hoc email.

Invariant that must not bend at any scale: every gate result has evidence pointer; every hallucination logged and fed to calibration; every candidate's full history reconstructable from state alone. That's the difference between a pipeline and a mess.


17. Open Decisions

Recommended defaults given; operator confirms at /vuln-audit start.

  1. First target? Recommend: nginx (mature fuzzer coverage per Anthropic §1.7, well-known maintainer, HTTP-adjacent — plays to your CVE-2026-33523 AJP strengths). Alternatives: httpd, curl, openssl. Avoid: Linux kernel, Chromium (GitNexus capacity).
  2. Budget cap? Recommend: $100/day hard, $1500/month soft, $250 per-finding kill trigger. Per Wave 3.5 §10, one HIGH finding amortizes to ~$103; 8 findings/week = ~$40k/year.
  3. Disclosure style? Recommend: 45-day hold for non-critical, 30-day for critical, per-vendor token bucket max 3 findings/week. Acceleration triggers: active exploitation, PoC leak, other-researcher announcement, maintainer request.
  4. Severity floor for autonomous disclosure? Recommend: HIGH only. MEDIUM + LOW require explicit human promotion via vr review --promote C-.... Reduces batch-review cognitive load.
  5. External reviewers? Recommend: defer until 50-in-flight milestone, then recruit 2–3 trusted peers with scoped external_reviewers access. Anthropic's scaling move; don't pre-optimize.
  6. Rocky VM count? Recommend: single VM until batch-review exceeds 30 findings/week; then provision second Rocky via vm-provisioner. 8 vCPU sustains ~3 concurrent fuzz slots which covers 3–5 findings/week throughput.
  7. Symex solver backend? Recommend: KLEE for C/C++ bitcode (reproducible, Docker-isolated) + angr for binary-only targets + SymCC for concolic hybrid. Rationale: KLEE is most mature on LLVM-IR and our targets are LLVM-buildable; angr fills binary-only gap; SymCC is our concolic fallback for path explosion. Revisit if Z3 licensing changes or boolector v4 beats current cvc5.
  8. Symex budget default? Recommend: 15 CPU-min per HIGH at Gate B.5, 60 CPU-min at Gate C.5, 5 CPU-min at MED. Per-finding amortized cost ≈ $1 (Phase 5 §19.12). Hard cap 3600s. Revisit if measured INCONCLUSIVE rate exceeds 40% — indicates slicing is too generous.
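
The recommended symex budgets in item 8 can be written down as a lookup with the hard cap applied last. This is a sketch under stated assumptions: the MEDIUM budget is applied to whichever gate runs (the text does not name one), LOW is assumed to get no symex time, and the function names are hypothetical.

```python
HARD_CAP_SECONDS = 3600

# CPU-minutes per gate for HIGH candidates, as recommended above.
HIGH_BUDGET_CPU_MINUTES = {"B.5": 15, "C.5": 60}
# Assumption: the 5 CPU-min MEDIUM budget applies at any symex gate.
MEDIUM_BUDGET_CPU_MINUTES = 5
INCONCLUSIVE_REVISIT_THRESHOLD = 0.40


def symex_budget_seconds(tier, gate):
    """Return the CPU-second budget for one symex run, never above the cap."""
    if tier == "HIGH":
        minutes = HIGH_BUDGET_CPU_MINUTES[gate]
    elif tier == "MEDIUM":
        minutes = MEDIUM_BUDGET_CPU_MINUTES
    else:
        return 0  # assumption: LOW candidates are not sent to symex
    return min(minutes * 60, HARD_CAP_SECONDS)


def should_revisit_budget(verdicts):
    """True when the measured INCONCLUSIVE share exceeds 40%."""
    if not verdicts:
        return False
    inconclusive = sum(1 for verdict in verdicts if verdict == "INCONCLUSIVE")
    return inconclusive / len(verdicts) > INCONCLUSIVE_REVISIT_THRESHOLD
```

The Gate C.5 budget of 60 CPU-min equals the 3600s hard cap exactly, so raising it later would have no effect unless the cap is raised too.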

18. Build Order

Atomic-commit-able steps, referencing phased rollout.

Sprint 1 (Week 1) — Plugin skeleton.

  1. plugin.json + directory structure.
  2. schemas/candidates.json + schemas/gates.json — JSON Schema files, CODEOWNERS locked.
  3. skills/vr-atomic-state — write-tmp → fsync → rename → flock library.
  4. skills/vr-write-gate — gate result library calling atomic-state.
  5. .vr-state/ layout created + .gitignore rules.
  6. .claude/settings.json with allow/ask/deny + per-agent overrides.
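
Step 3's write-tmp → fsync → rename → flock sequence is a standard POSIX pattern. The sketch below shows one way to arrange it in Python and is not the `vr-atomic-state` skill itself. It assumes a Unix host (for `fcntl`), takes the lock before writing, and uses a hypothetical sidecar `.lock` file next to the target.

```python
import fcntl
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Replace `path` with `data` so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    lock_path = path + ".lock"  # hypothetical sidecar lock file
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # serialize concurrent writers
        # The temp file must live in the same directory (same filesystem)
        # for the rename below to be atomic.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as tmp:
                json.dump(data, tmp, indent=2)
                tmp.flush()
                os.fsync(tmp.fileno())  # contents reach disk before rename
            os.replace(tmp_path, path)  # atomic swap on POSIX
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)  # never leave a stray temp file behind
            raise
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)  # persist the directory entry for the rename
        finally:
            os.close(dir_fd)
    # Closing lock_file on exit from the with-block releases the flock.
```

Readers that open `path` at any moment see either the old complete file or the new complete file. The lock only coordinates writers, and it is advisory: every writer has to go through the same function for it to help.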

Sprint 2 (Week 2): Lead Auditor + Reproduction + Gate A/B.

  7. agents/lead-auditor.md + skills/vr-gitnexus-protocols.
  8. agents/reproduction-agent.md + skills/vr-asan-poc.
  9. hooks/gate-a-scope.sh, hooks/gate-b-poc.sh, hooks/post-ssh-execute.sh (job-id writer).
  10. commands/vuln-audit.md: invoke Lead + Reproduction, run Gate A + B.
  11. mcp/nvd-query.py: CVE feed cache.
  12. Acceptance test: reproduce CVE-2026-33523 AJP splitting end-to-end in <target>/httpd-2.4.59.

Sprint 3 (Weeks 3–4): Anti-Hall Spine (V1–V7).

  13. skills/vr-provenance-merkle + hooks/pre-gate-provenance.sh.
  14. skills/vr-counterfactual-patch.
  15. agents/fresh-validator.md with system_prompt_flavor parameter, spawned twice.
  16. agents/devils-advocate.md + da-ralph convergence hook.
  17. skills/vr-oracle-triangulate + mcp/codeql-runner.py + mcp/semgrep-runner.py.
  18. skills/vr-fingerprint-match + logs/crash_corpus.sqlite seed.
  19. agents/patch-drafter.md + agents/patch-fresh-validator.md + patch-ralph.
  20. agents/release-pinner.md + hooks/gate-d-send.sh.
  21. agents/report-writer.md + skills/vr-disclosure-template.
  22. commands/vuln-audit.md end-to-end: Lead → Reproduction → Gate B → Fresh Val × 2 → DA → Patch → Patch Fresh Val → Release Pinner → Report Writer.

Sprint 4 (Weeks 5–7): Autonomous + Ralph-Loops + Fuzzing.

  23. agents/fuzzing-engineer.md + agents/crash-triage.md.
  24. All 9 *-ralph convergence hooks in bin/.
  25. bin/vr-orchestrator.py: autonomous tick loop (§10).
  26. commands/vuln-audit-autonomous.md.
  27. CronCreate setup scripts.
  28. VM-lock + vm-provisioner + systemd-scope.
  29. bin/vr-status.sh, bin/vr-dashboard.
  30. vr review --batch TUI.
  31. Acceptance test: unattended 72h campaign.

Sprint 5 (Weeks 8–10): Intake + Tiering + Calibration.

  32. Intake agents (ossfuzz, syzbot, revert-hunter, backport-differ, relanding-detector, session-merger).
  33. commands/vuln-audit-intake.md + auto-dispatcher.
  34. Gate-skip matrix in orchestrator.
  35. Tiering (HIGH/MEDIUM/LOW routing).
  36. .vr-state/calibration.json + nightly Brier recompute.
  37. Acceptance test: ingest 10 ossfuzz + 5 reverted-fix + 1 imported session.

Sprint 6 (Weeks 11–13) — Symex Gates + Juror. Build order is the Phase 5 file list from Patch 8: VM provisioning (KLEE Docker + SymCC) → schemas → 6 skills → symex-juror agent → convergence hook + 3 gate hooks → 2 MCP servers → symex-ralph wiring → ledger seed → acceptance tests (items 38–51).


Appendix A: Decision Record (Compacted)

Title. Hybrid Lead-Auditor-plus-Specialists architecture.

Status. Accepted. Supersedes Wave 1/Wave 2 multi-agent designs.

Context. Anthropic produced 500+ 0-days with one Claude Opus + unix tools + human validation. Wave 1 proposed 11 agents; Wave 2 proposed 15 more. Handoff cost erodes single-agent context-coherence advantage.

Decision.

  • One Lead Auditor (Opus, long-context) per campaign. Owns candidate generation, strategy pivots, specialist dispatch. Folds cve-analyst, architecture-mapper, novelty-checker, auditor, exploit-primitive-mapper, maintainer-profiler, invariant-extractor, coverage-gap-mapper, session-archaeologist, variant-hunter-cross-language, disclosure-timing-optimizer, dependency-reachability-agent, race-condition-hunter, crypto-auditor.
  • Nine ephemeral specialists (Fresh Validator, Devils Advocate, Fuzzing Engineer, Crash Triage, Reproduction, Patch Drafter, Patch Fresh-Validator, Release Pinner, Report Writer) — each justified by ≥1 of: narrower tools, cleaner prior, longer-running profile, persistent artifact.
  • Stateless Python/shell orchestrator (not LLM). Deterministic gate enforcement, prompt-injection-resistant.
  • Tiering: HIGH = full V1–V9 stack + P(FP) ≤ 0.01 target; MEDIUM skips V5+V6; LOW is Lead-only.

Pros. Preserves context coherence + pivot agility (single-agent's strength); gets prior-quarantine + tool-allowlist + resource-isolation + last-mile-freshness (multi-agent's strength); deterministic orchestrator resists prompt injection; agent count 10 (not 26).

Cons. Lead session-rotation cost (mitigated by archaeology_brief.md); tiering risk (MEDIUM FP rate higher, measured separately); orchestrator must fail-closed to human when confused.

Revisit when. FP rate > 10% × 2 windows; throughput < 3 HIGH/week; specialist spend > Lead spend (over-specialization); new model (Opus 5+ / 2M context) changes context-saturation math; ≥3 concurrent campaigns.
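
The tiering rule in the decision above (HIGH runs V1–V9, MEDIUM skips V5+V6, LOW is Lead-only) can be expressed as a small routing function. This is a sketch only. The function name is hypothetical, and treating "Lead-only" as an empty specialist list is an assumption about how the orchestrator would encode it.

```python
FULL_STACK = tuple(f"V{i}" for i in range(1, 10))  # V1 through V9
MEDIUM_SKIPS = {"V5", "V6"}


def validators_for_tier(tier):
    """Return the validation layers the orchestrator schedules for a tier."""
    if tier == "HIGH":
        return list(FULL_STACK)
    if tier == "MEDIUM":
        return [v for v in FULL_STACK if v not in MEDIUM_SKIPS]
    if tier == "LOW":
        return []  # Lead-only: no specialist validation layers are scheduled
    # Fail closed: an unknown tier is an error for a human, not a default.
    raise ValueError(f"unknown tier: {tier!r}")
```

Raising on an unknown tier mirrors the "fail-closed to human when confused" requirement listed under Cons.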

Appendix B: Glossary

  • Agent. A Claude session with a specific system prompt, tool allowlist, and model tier.
  • Candidate. A potential vulnerability under investigation. Lives in .vr-state/candidates.json with a C-YYYY-MMDD-NNN ID.
  • Finding. A candidate that has passed Gate D — a real vulnerability ready for disclosure.
  • Gate. A mechanical checkpoint (A, B, C, D) with shell-hook-enforced entry/exit conditions. Gates never ask an LLM "are you sure?"
  • Ralph-loop. A bounded iteration loop with mechanical convergence criterion, budget cap, and fresh-validator at exit. Nine instances in this architecture (*-ralph).
  • Fresh Validator. An ephemeral Opus agent with no access to .vr-state/, logs/, or the Lead's transcript. Derives finding conclusion from raw source only. Two instances spawned with opposing priors (V5).
  • Devils Advocate (DA). An ephemeral Opus agent whose binding mandate is to attack a finding. Runs da-ralph until new_challenges: [] × 2.
  • Oracle class. The mechanical signal class for ground-truth validation: asan-crash, panic, race, idempotency-violation, differential, reachability-mutation, logical-only.
  • Provenance atom. A (claim, source_type, source_ref, sha256) tuple attached to every factual assertion in a finding. Live-rehashed at every gate.
  • VM job-id. A cryptographic attestation written to /var/log/vr-framework/jobs.jsonl on Rocky VM whenever ssh_execute produces an artifact. Verifies output was actually generated, not fabricated.
  • GitNexus. Code-graph MCP server. Primary query substrate — every agent is a GitNexus client first, reasoner second.
  • 99% confidence. P(FP | Gate A ∧ B ∧ C ∧ D pass) ≤ 0.01 for HIGH findings, computed as product of V1–V7 independent FP probabilities with 10× correlation slack.
  • Batch review. Sunday 19:00 UTC human review session, SRS-style TUI, target 20 findings in 30 min.
  • Meta-process. A scheduled non-agent workflow: P1 (daily kill-review), P2 (calibration day), P5 (adversarial self-play), P10 (PoC regression), P11 (strategy yield dashboard), etc.
  • Symex-Juror. An ephemeral Opus 4.6 agent whose only input is a structured symex_query.json (no Lead rationale, no .vr-state, no transcript). Runs angr/KLEE/SymCC and emits symex_result.json with verdict ∈ {SAT_CONCRETE_VERIFIED, UNSAT_COMPLETE, INCONCLUSIVE, TIMEOUT}. Two flavors: juror (Gate B.5) and counter-adversary (Gate C.5, pinned to release-tag).
  • Gate B.5 / Gate C.5. Symex sub-gates. B.5 = SAT-witness required (tractable classes); C.5 = adversarial UNSAT attempt on release-tag. See §19.8.
  • Symex class. Every HIGH candidate is classified: tractable (integer, bounds, branch-feasibility, small-state parser, invariant) or N/A (concurrency, logic, heap exploitability, UI, Go stdlib-scale). Tractable gets V8+V9; N/A relies on V1–V7.
  • Witness. A concrete input produced by a SAT verdict. A witness is BINDING only after it replays on Rocky VM and produces an ASan stack hash matching Gate B's signature (Technique 9 applied to solver output).
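
The "99% confidence" entry describes a product of per-layer FP probabilities loosened by a 10× correlation slack. The arithmetic can be sketched as below. The per-layer rates in the usage note are made-up placeholders rather than measured values, and reading "slack" as a multiplier on the product is an interpretation.

```python
import math

CORRELATION_SLACK = 10.0  # loosens the bound to allow for correlated layers
TARGET_FP = 0.01


def combined_fp_bound(per_layer_fp):
    """Product of per-layer FP probabilities times the slack, capped at 1."""
    product = math.prod(per_layer_fp.values())
    return min(1.0, product * CORRELATION_SLACK)


def meets_target(per_layer_fp):
    """True when the slack-adjusted bound is within the 0.01 target."""
    return combined_fp_bound(per_layer_fp) <= TARGET_FP
```

For example, if each of V1–V7 independently let a false positive through 30% of the time, the product would be 0.3^7 ≈ 0.00022, and the slack-adjusted bound of about 0.0022 would meet the target. At 50% per layer the bound is about 0.078 and would not. Real per-layer rates would have to come from the calibration ledger.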

End of master architecture. Build from §18. Revisit when decision-record triggers fire.