A merged verdict layer for the agent-gov toolchain. GovVerdict reads canonical JSON reports from ScopeTrail, PolicyMesh, CapabilityEcho, TaskBound, SessionTrail, and related tools, dedupes their findings, and renders one review.
One PR can produce five useful reports. Five separate comments are easy to ignore. GovVerdict collapses them into a single severity-ranked verdict so reviewers see the worst cross-tool signal first.
flowchart LR
Scope["ScopeTrail<br/>config drift"] --> Gov
Policy["PolicyMesh<br/>policy contradictions"] --> Gov
Echo["CapabilityEcho<br/>code capability drift"] --> Gov
Bound["TaskBound<br/>scope creep"] --> Gov
Session["SessionTrail<br/>runtime behavior"] --> Gov
Gov[("GovVerdict<br/>merge + dedupe + render")] --> Review["One PR review<br/>terminal · markdown · JSON"]
classDef input fill:#1e293b,stroke:#334155,color:#e2e8f0
classDef engine fill:#0f172a,stroke:#1e293b,color:#e2e8f0,stroke-width:2px
classDef output fill:#0c4a6e,stroke:#0369a1,color:#e0f2fe
class Scope,Policy,Echo,Bound,Session input
class Gov engine
class Review output
See also: agent-gov-core for the shared Finding schema · agent-gov-demo for the end-to-end sample PR.
When every detector posts separately, reviewers tune out. The same widened permission can appear in ScopeTrail and PolicyMesh. A low-severity transcript oddity can sit above the critical workflow permission that actually matters. Invalid reports can disappear into CI logs.
GovVerdict exists to make the suite feel like one reviewer: read every canonical report, dedupe stable fingerprints, surface invalid inputs, roll up the aggregate rating, and render one decision-ready summary.
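As a rough illustration of the "surface invalid inputs" step, the sketch below partitions raw report files into usable reports and recorded failures. The `RawInput`, `ParsedReport`, and `InvalidReport` shapes, the `tool` field, and the `partitionReports` helper are hypothetical names chosen for this example. The envelope check is deliberately looser than real schema validation, which lives in agent-gov-core.

```typescript
// Hypothetical shapes for illustration only; the canonical Report envelope
// is defined in agent-gov-core and is richer than this.
type RawInput = { path: string; text: string };
type ParsedReport = { path: string; tool: string; findings: unknown[] };
type InvalidReport = { path: string; reason: string };

// Split inputs into valid reports and recorded failures, so that every
// input file produces exactly one observable outcome.
function partitionReports(inputs: RawInput[]): {
  valid: ParsedReport[];
  invalidReports: InvalidReport[];
} {
  const valid: ParsedReport[] = [];
  const invalidReports: InvalidReport[] = [];
  for (const { path, text } of inputs) {
    let data: unknown;
    try {
      data = JSON.parse(text);
    } catch {
      // Unparseable files are recorded instead of being skipped.
      invalidReports.push({ path, reason: "unparseable JSON" });
      continue;
    }
    const obj = data as { tool?: unknown; findings?: unknown } | null;
    if (
      obj === null ||
      typeof obj !== "object" ||
      typeof obj.tool !== "string" ||
      !Array.isArray(obj.findings)
    ) {
      // Parsed JSON that lacks the assumed envelope fields is also recorded.
      invalidReports.push({ path, reason: "not a report envelope" });
      continue;
    }
    valid.push({ path, tool: obj.tool, findings: obj.findings });
  }
  return { valid, invalidReports };
}
```

The point of returning both lists is that a broken report stays visible in the merged result rather than only in CI logs.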
| Input | Role |
|---|---|
| ScopeTrail | PR-level agent config drift. |
| PolicyMesh | Contradictory current policy surfaces. |
| CapabilityEcho | New executable capability in code, manifests, workflows, and Dockerfiles. |
| TaskBound | Stated task vs. actual diff. |
| SessionTrail | Runtime behavior from JSONL transcripts. |
| AgentPulse / future tools | Any compatible canonical Report envelope. |
# Assumes the consumer tools already wrote canonical reports to ./reports/
npx govverdict@latest review --reports "reports/*.json" --format term

For the GitHub Action wiring, see examples/agent-gov-review.yml. It runs the suite tools, collects their reports, and hands the glob to Conalh/GovVerdict@v0.2.1.
Run against the fixture reports in this repo (test/fixtures/*-report.json):
$ npx govverdict review --reports "test/fixtures/*-report.json" --format term
GovVerdict: CRITICAL
====================
Sources: 5 report(s) — capability_echo, policy_mesh, scope_trail, session_trail, task_bound
Findings: 6 unique (deduped 0, dropped below threshold 0)
[CRITICAL] 1
- capability_echo.workflow_permission_write: CI workflow grants contents: write. (.github/workflows/ci.yml:5)
[HIGH] 1
- scope_trail.permission_allow_widened: Claude permission allowlist now includes Bash(npm *). (.claude/settings.json:12)
[MEDIUM] 3
- capability_echo.high_capability_dep_added: New dependency puppeteer at version 22.0.0. (package.json:18)
- policy_mesh.mcp_command_mismatch: MCP server fileserver disagrees across .mcp.json and .claude/mcp.json. (.mcp.json:4)
- session_trail.unusual_runtime_tool: Agent invoked WebFetch 8 times — outlier for this conversation. (session.jsonl:142)
[LOW] 1
- task_bound.out_of_scope_file: Modified docs/CHANGELOG.md outside declared scope src/auth/** (docs/CHANGELOG.md:1)
--format json emits a validated MergedReport envelope; --format md produces a collapsible Markdown summary ready for $GITHUB_STEP_SUMMARY.
- Local-only. Reads JSON files off disk, writes one file or stdout. No network, no telemetry, no API keys. The only runtime dependency is `agent-gov-core`.
- Substrate, not orchestrator. Wires existing `agent-gov-core` helpers: `validateReport`, `mergeFindings`, `applyExceptions`, `generateWorkflowSummary`, `emitFindingAnnotation`, `anyAtOrAbove`.
- Dedup by fingerprint. Each consumer tool sets a stable `Finding.fingerprint`; identical findings collapse to one row with the `duplicateCollapsed` counter on the merged report.
- Never silently drop. Unreadable files, invalid envelopes, and individual malformed findings surface on `invalidReports[]` / `invalidFindings[]`.
- Exceptions with expiry. `--exceptions baseline.jsonc` suppresses active rules; expired rules re-surface as `[EXPIRED WHITELIST]` low-severity findings so baselines cannot quietly rot.
- GitHub-aware. Under `$GITHUB_ACTIONS=true`, emits `::warning` / `::error` annotations and writes `rating`, `findings-count`, `invalid-reports-count`, `merged-report-path` to `$GITHUB_OUTPUT`.
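To make the fingerprint dedup concrete, here is a minimal TypeScript sketch. It is not the `mergeFindings` implementation from agent-gov-core. The `Finding` fields other than `fingerprint` and the `dedupeByFingerprint` helper are assumptions made for this example, and keeping the first occurrence is an arbitrary choice.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Assumed minimal finding shape; only `fingerprint` is named in this README.
interface Finding {
  fingerprint: string;
  ruleId: string;
  severity: Severity;
  message: string;
}

// Collapse findings that share a fingerprint, keeping the first occurrence
// and counting how many duplicates were folded away.
function dedupeByFingerprint(findings: Finding[]): {
  unique: Finding[];
  duplicateCollapsed: number;
} {
  const seen = new Map<string, Finding>();
  let duplicateCollapsed = 0;
  for (const finding of findings) {
    if (seen.has(finding.fingerprint)) {
      duplicateCollapsed += 1;
    } else {
      seen.set(finding.fingerprint, finding);
    }
  }
  // Map iteration preserves insertion order, so first-seen order is kept.
  return { unique: Array.from(seen.values()), duplicateCollapsed };
}
```

Keeping the count alongside the unique rows is what lets repeated tool agreement stay visible after the collapse.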
- One observable outcome per input. A bad report is itself a reportable condition, not invisible noise.
- Worst finding wins. The aggregate rating follows the highest surviving severity after thresholding and exceptions.
- Dedup without erasing context. Fingerprints collapse duplicate findings while keeping counts so repeated tool agreement remains visible.
- Thin layer. Cross-tool primitives belong in `agent-gov-core`; GovVerdict is the merge/render layer.
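The "worst finding wins" rule and the expiring exceptions can be sketched together in TypeScript. Everything here is an assumption for illustration: the `ExceptionRule` shape with an ISO `expires` date, the helper names, the `"none"` rating for an empty result, and the choice that an expired rule both stops suppressing its finding and adds a low-severity marker. The real behavior is defined by `applyExceptions` in agent-gov-core.

```typescript
type Severity = "low" | "medium" | "high" | "critical";
const RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

interface Finding {
  ruleId: string;
  severity: Severity;
  message: string;
}

// Hypothetical exception shape: suppress `ruleId` until the `expires` date.
interface ExceptionRule {
  ruleId: string;
  expires: string;
}

// Active rules suppress matching findings. Expired rules suppress nothing
// and instead contribute a low-severity marker finding.
function applyExceptionsSketch(
  findings: Finding[],
  rules: ExceptionRule[],
  now: Date,
): Finding[] {
  const active = new Set<string>();
  const surviving: Finding[] = [];
  for (const rule of rules) {
    if (new Date(rule.expires).getTime() > now.getTime()) {
      active.add(rule.ruleId);
    } else {
      surviving.push({
        ruleId: rule.ruleId,
        severity: "low",
        message: `[EXPIRED WHITELIST] exception for ${rule.ruleId} expired on ${rule.expires}`,
      });
    }
  }
  for (const finding of findings) {
    if (!active.has(finding.ruleId)) surviving.push(finding);
  }
  return surviving;
}

// The aggregate rating is the highest severity among surviving findings.
function aggregateRating(findings: Finding[]): Severity | "none" {
  let worst: Severity | "none" = "none";
  for (const finding of findings) {
    if (worst === "none" || RANK[finding.severity] > RANK[worst]) {
      worst = finding.severity;
    }
  }
  return worst;
}
```

Under these assumptions, a critical finding covered by an active exception no longer drives the rating, while a stale exception shows up as its own row instead of silently continuing to hide a finding.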
CLI flags (`govverdict review --reports <glob> [options]`):
| Flag | Required | Default | Description |
|---|---|---|---|
| `--reports <glob>` | yes | - | Glob of report JSON files. `*` and `?` in basename only. A directory path expands to its `*.json` children. |
| `--threshold <sev>` | no | - | Drop findings below this severity. One of `low`, `medium`, `high`, `critical`. |
| `--exceptions <path>` | no | - | JSONC baseline. Either an array or `{ exceptions: [...] }`. |
| `--workflow-name <str>` | no | - | Propagated onto the merged report. Cross-walks to OpenTelemetry `gen_ai.workflow.name`. |
| `--format <fmt>` | no | `term` | One of `term`, `md`, `json`. The GitHub Action defaults to `md` for `$GITHUB_STEP_SUMMARY`. |
| `--output <path>` | no | stdout | Write to file instead of stdout. |
| `--fail-on <sev\|none>` | no | `none` | Exit 1 when any surviving finding meets or exceeds this severity. |
GitHub Action inputs mirror the CLI flags one-to-one; see action.yml.
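The interaction between `--threshold` and `--fail-on` can be read as two comparisons against the same severity ordering. The TypeScript sketch below illustrates that reading. The helper names and the `Finding` shape are hypothetical, and the actual CLI may order or implement these steps differently.

```typescript
type Severity = "low" | "medium" | "high" | "critical";
const RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

// Assumed minimal finding shape for this example.
interface Finding {
  ruleId: string;
  severity: Severity;
}

function meetsOrExceeds(severity: Severity, floor: Severity): boolean {
  return RANK[severity] >= RANK[floor];
}

// --threshold: keep findings at or above the floor and count the rest.
function applyThreshold(
  findings: Finding[],
  threshold?: Severity,
): { kept: Finding[]; dropped: number } {
  if (threshold === undefined) return { kept: findings, dropped: 0 };
  const floor: Severity = threshold;
  const kept = findings.filter((f) => meetsOrExceeds(f.severity, floor));
  return { kept, dropped: findings.length - kept.length };
}

// --fail-on: exit 1 when any surviving finding meets or exceeds the level.
function exitCodeFor(surviving: Finding[], failOn: Severity | "none"): 0 | 1 {
  if (failOn === "none") return 0;
  const floor: Severity = failOn;
  return surviving.some((f) => meetsOrExceeds(f.severity, floor)) ? 1 : 0;
}
```

In this reading, `--fail-on` only ever sees findings that survived `--threshold`, so a threshold stricter than the fail-on level can never hide a finding that would have failed the run.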
GovVerdict consumes the canonical Report envelope (schemaVersion: "1.0") from agent-gov-core@^1.0.0. The consumer tools all emit that envelope starting at the versions below; older releases used pre-canonical shapes and are not supported:
| Tool | Minimum version |
|---|---|
| agent-gov-core | 1.0.0 |
| ScopeTrail | v0.2.0 |
| PolicyMesh | v0.5.0 |
| CapabilityEcho | v0.2.1 |
| TaskBound | v0.7.0 |
| SessionTrail | v0.6.1 |
Local-only OSS tools that review AI-agent PRs and coding sessions for config drift, policy mismatches, and scope creep. Every tool emits the same canonical Finding schema so GovVerdict can merge them.
| Repo | What it catches |
|---|---|
| ScopeTrail | Agent config drift between PR base and head. |
| PolicyMesh | Contradictory agent instructions and config drift that make behavior non-reproducible. |
| CapabilityEcho | Capability drift introduced by code, manifests, workflows, and Dockerfiles. |
| TaskBound | Scope creep between the stated task and the actual diff. |
| SessionTrail | Risky runtime behavior in Cursor / Claude Code / Codex session transcripts. |
| AgentPulse | Live local trajectory verdicts for active agent sessions. |
| GovVerdict (this repo) | Merges JSON reports from the tools above into one deduped review. |
| agent-gov-core | Shared parsers, the canonical Finding schema, and mergeFindings. |
| agent-gov-demo | Demo sandbox with a deliberately rogue PR that fires every tool. |
See the suite end-to-end on a real PR: agent-gov-demo#1.
MIT.