# Judges Panel

Judges Panel is an open-source MCP server that provides 45 specialized judges to evaluate AI-generated code — acting as an independent quality gate regardless of which project is being reviewed.

It combines deterministic pattern matching & AST analysis (instant, offline, zero LLM calls) with LLM-powered deep-review prompts that let your AI assistant perform expert-persona analysis across all 45 domains.
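For intuition, here is a minimal sketch of what a deterministic, pattern-based check can look like. The rule shape, the `Finding` type, and the `findHardcodedSecrets` helper are illustrative assumptions for this page, not the project's actual API:

```typescript
// Illustrative sketch only -- not the Judges Panel API.
// Shows the general shape of a deterministic, offline check:
// a named rule, a regex pattern, and structured findings.

interface Finding {
  rule: string;    // which check fired
  line: number;    // 1-based line number
  message: string; // human-readable explanation
}

// Hypothetical rule: flag string literals assigned to secret-like names.
const SECRET_PATTERN =
  /\b(api[_-]?key|secret|password|token)\s*[:=]\s*["'][^"']+["']/i;

function findHardcodedSecrets(source: string): Finding[] {
  return source.split("\n").flatMap((text, i) =>
    SECRET_PATTERN.test(text)
      ? [{
          rule: "hardcoded-secret",
          line: i + 1,
          message: "Possible hardcoded credential; load it from the environment instead.",
        }]
      : []
  );
}

console.log(findHardcodedSecrets(`const apiKey = "sk-live-123";`));
// -> [ { rule: "hardcoded-secret", line: 1, message: "..." } ]
```

Checks like this run instantly and offline; the LLM-powered prompts layer deeper, judgment-based review on top.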



## Why Judges?

AI code generators (Copilot, Cursor, Claude, ChatGPT, etc.) write code fast — but they routinely produce insecure defaults, missing auth, hardcoded secrets, and poor error handling. Human reviewers catch some of this, but nobody reviews 45 dimensions consistently.

Judges doesn't replace linters — it covers the dimensions linters don't: authentication strategy, data sovereignty, cost patterns, accessibility, framework-specific anti-patterns, and architectural issues across multiple files.

## Quick Start

```bash
# Install globally
npm install -g @kevinrabun/judges-cli

# Evaluate any file
judges eval src/app.ts

# Single judge
judges eval --judge cybersecurity server.ts

# SARIF output for CI
judges eval --file app.ts --format sarif > results.sarif

# List all judges and regulatory frameworks
judges list
judges list --frameworks
```
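To wire the SARIF output into CI, a GitHub Actions job can run the CLI and upload the results with GitHub's code-scanning action. A minimal sketch, assuming the workflow layout below (only the `npm install` and `judges eval` commands come from the Quick Start above; the rest is standard Actions plumbing):

```yaml
# Hypothetical workflow -- step names and file paths are assumptions.
name: judges
on: [pull_request]

jobs:
  judges:
    runs-on: ubuntu-latest
    permissions:
      security-events: write   # required for SARIF upload
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm install -g @kevinrabun/judges-cli
      - run: judges eval --file app.ts --format sarif > results.sarif
      - uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif
```

Uploading the SARIF file this way surfaces findings as code-scanning alerts on the pull request.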

Or use it as an MCP server with any compatible client:

```json
{
  "command": "npx",
  "args": ["-y", "@kevinrabun/judges"]
}
```
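For example, a client that uses the common `mcpServers` configuration shape (Claude Desktop and many others) would embed that entry like this, where the key `judges` is just an arbitrary server label:

```json
{
  "mcpServers": {
    "judges": {
      "command": "npx",
      "args": ["-y", "@kevinrabun/judges"]
    }
  }
}
```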

## Benchmark Results

The deterministic benchmark suite (L1) tests all 45 judges across 1,048 test cases — including 191 clean-code false-positive tests.

| Metric | Result |
| --- | --- |
| Overall Grade | 🟢 A |
| Test Suite | 3,614 tests passing |

👉 View the full Benchmark Report

