Production-ready Next.js App Router application that audits large Next.js repositories for security vulnerabilities, performance issues, and code quality problems.
Built with Vercel AI SDK 5 for multi-agent reasoning, it supports OpenAI (direct or via Vercel AI Gateway) and OpenRouter (300+ models), and uses Prisma + Postgres for persistence with optional pgvector for semantic search.
- **7 Specialized AI Agents**: UI/UX, Backend, Database, Security, Performance, Lint, and Codemod agents work together
- **Flexible AI Providers**: Choose between OpenAI Direct, Vercel AI Gateway, or OpenRouter (300+ models)
- **Intelligent Sampling**: Smart file selection analyzes 200+ files in 2-4 minutes (configurable for 100% coverage)
- **Structured Outputs**: Reliable JSON responses using Zod schemas and AI SDK's `generateObject`
- **Fast Heuristics**: Quick pattern matching on ALL files before AI analysis
- **Comprehensive Reports**: Issues grouped by severity with actionable recommendations
- **Security-First**: Read-only by default, no code execution
- **GitHub Integration**: Public and private repos via tarball downloads (handles redirects)
- XSS vulnerabilities (dangerouslySetInnerHTML, unescaped input)
- CSRF protection gaps
- Hardcoded secrets and API keys
- Missing security headers (CSP, HSTS, X-Frame-Options)
- Insecure cookie configurations
- SQL injection risks
- Large bundle sizes and missing code splitting
- Render-blocking resources
- Missing streaming/suspense boundaries
- Avoidable re-renders
- Slow data fetches and serial waterfalls
- Route handlers without error boundaries
- Missing or incorrect caching strategies
- ISR/SSG misconfiguration
- Improper edge/node runtime selection
- Missing indexes on queried fields
- N+1 query patterns
- Unbounded result sets without pagination
- Connection pooling problems
- Accessibility violations (ARIA, alt text, semantic HTML)
- Layout shifts and CLS problems
- Missing loading states and skeletons
- Form usability issues
- TypeScript 'any' types
- Missing error handling
- React hooks anti-patterns
- Server/client component mixing issues
- Node 20+
- Postgres database (Neon, Vercel Postgres, or local)
- AI API key (choose one):
- OpenAI Direct (recommended for simplicity)
- Vercel AI Gateway (recommended for caching)
- OpenRouter (300+ models, flexible)
```bash
# Clone and install dependencies
pnpm i

# Set up environment variables
cp .env.example .env.local
# Configure your .env.local (see Configuration section below)

# Initialize database
pnpm db:push

# Start development server
pnpm dev
```

Open http://localhost:3000 and paste a GitHub repo URL. Public repos work without a token; for private repos, add `GITHUB_TOKEN` to `.env.local` or paste it in the UI.
Best for: Quick start, direct OpenAI API access, no proxies
```bash
# .env.local

# Required: OpenAI API Key
OPENAI_API_KEY=sk-proj-...

# Required: Database
DATABASE_URL=postgresql://user:pass@host:5432/dbname

# Optional: GitHub token for private repos
GITHUB_TOKEN=ghp_...

# Optional: Default model (if not specified, uses gpt-4o)
OPENAI_DEFAULT_MODEL=gpt-4o
# Other options: gpt-4o-mini (cheaper), gpt-4-turbo, gpt-3.5-turbo
```

Cost estimate: $0.50-1.00 per scan (200 files with gpt-4o)
In the UI: Select "OpenAI Direct" provider; the configured default model is used automatically
Best for: Production use, caching, rate limiting, cost control
```bash
# .env.local

# Required: AI Gateway key
AI_GATEWAY_API_KEY=your_gateway_key_here

# Required: Database
DATABASE_URL=postgresql://user:pass@host:5432/dbname

# Optional: GitHub token
GITHUB_TOKEN=ghp_...

# Optional: Default model
AI_GATEWAY_DEFAULT_MODEL=gpt-4o
```

Benefits:
- ✅ Built-in caching (saves on repeat scans)
- ✅ Rate limiting and quota management
- ✅ Analytics and monitoring
- ✅ Same OpenAI models, but with more control
In the UI: Select "Vercel AI Gateway" provider
Best for: Access to 300+ models (Claude, Gemini, Llama, etc.), cost optimization
```bash
# .env.local

# Required: OpenRouter API Key
OPENROUTER_API_KEY=sk-or-v1-...

# Required: Database
DATABASE_URL=postgresql://user:pass@host:5432/dbname

# Optional: GitHub token
GITHUB_TOKEN=ghp_...

# Optional: Default model
OPENROUTER_DEFAULT_MODEL=anthropic/claude-3.5-sonnet
# Popular options:
# - anthropic/claude-3.5-sonnet (best quality)
# - openai/gpt-4o (fast, reliable)
# - openai/gpt-4o-mini (cheapest, good quality)
# - google/gemini-pro-1.5 (free tier available)
# - meta-llama/llama-3.1-70b-instruct (open source)

# Optional: Set your app name/URL for OpenRouter credits
OPENROUTER_APP_NAME=nextjs-audit-app
OPENROUTER_SITE_URL=http://localhost:3000
```

In the UI: Select "OpenRouter" provider, choose from 300+ models
Cost comparison:
- Claude Sonnet 3.5: $1.50-3.00 per scan (best quality)
- GPT-4o: $0.50-1.00 per scan (balanced)
- GPT-4o-mini: $0.10-0.20 per scan (cheapest)
See `MODEL_GUIDE.md` for detailed cost analysis.
You can configure multiple providers simultaneously:
```bash
# .env.local - All three providers configured
OPENAI_API_KEY=sk-proj-...
AI_GATEWAY_API_KEY=your_gateway_key
OPENROUTER_API_KEY=sk-or-v1-...
DATABASE_URL=postgresql://...
GITHUB_TOKEN=ghp_...

# Default models for each provider
OPENAI_DEFAULT_MODEL=gpt-4o
AI_GATEWAY_DEFAULT_MODEL=gpt-4o
OPENROUTER_DEFAULT_MODEL=anthropic/claude-3.5-sonnet
```

Then switch between providers in the UI!
- Heuristics Scan (Fast): Runs on ALL files, detects patterns like `dangerouslySetInnerHTML`, hardcoded secrets, `eval()`, etc.
- Intelligent Sampling: Selects high-priority files (default: 200 files, configurable)
- AST Parsing: Uses `ts-morph` to create function/class/component-level chunks
- Parallel Agent Analysis: 6 specialized agents analyze the codebase simultaneously
- Report Generation: Issues merged, deduplicated, and grouped by severity
- Codemod Planning: Automated fix suggestions for eligible issues
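The heuristics pass above can be sketched as a simple regex scan over file contents. This is an illustrative sketch only: the rule names and patterns below are ours, not the app's actual rule set.

```typescript
// Minimal sketch of a heuristics pass: regex rules applied to every line.
// Rules shown here are illustrative examples, not the app's real rule set.
type HeuristicHit = { rule: string; line: number };

const RULES: Array<{ rule: string; pattern: RegExp }> = [
  { rule: "dangerouslySetInnerHTML", pattern: /dangerouslySetInnerHTML/ },
  { rule: "eval", pattern: /\beval\s*\(/ },
  { rule: "hardcoded-secret", pattern: /(api[_-]?key|secret)\s*[:=]\s*['"][^'"]+['"]/i },
];

function scanHeuristics(source: string): HeuristicHit[] {
  const hits: HeuristicHit[] = [];
  source.split("\n").forEach((text, i) => {
    for (const { rule, pattern } of RULES) {
      if (pattern.test(text)) hits.push({ rule, line: i + 1 });
    }
  });
  return hits;
}
```

Because this pass is pure string matching, it can run over every file in seconds; only the files it flags (plus the sampled set) go on to the slower AI stages.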
All agents live under `lib/ai/agents/*`:
- Accessibility violations (ARIA, alt text, semantic HTML)
- Layout shifts and CLS problems
- Missing loading states and skeletons
- Navigation issues
- Form usability (validation, error messages, focus management)
- Route handlers without error boundaries
- Missing or incorrect caching strategies (revalidate, cache-control)
- ISR/SSG misconfiguration
- Improper edge/node runtime selection
- Missing streaming or suspense where beneficial
- Missing indexes on frequently queried fields
- N+1 query patterns (missing includes/select)
- Unbounded result sets without pagination
- Connection pooling configuration issues
- Unsafe raw SQL usage
- XSS vulnerabilities (dangerouslySetInnerHTML, unescaped user input)
- CSRF protection missing on state-changing operations
- SSRF risks in server-side fetch/request calls
- Hardcoded secrets and API keys
- Missing security headers (CSP, HSTS, X-Frame-Options)
- Insecure cookie configuration
- Render-blocking resources (large synchronous imports)
- Heavy client bundles (missing code splitting)
- Missing streaming or suspense boundaries
- Avoidable re-renders (missing memoization)
- Slow data fetches (serial waterfalls)
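The serial-waterfall finding above has a standard fix: start independent fetches together with `Promise.all`. A minimal sketch (the `fetchUser`/`fetchPosts` names are hypothetical stand-ins for real data fetches):

```typescript
// Stand-ins for real data fetches; names are illustrative.
async function fetchUser(id: string) {
  return { id, name: "demo" };
}
async function fetchPosts(id: string) {
  return [{ userId: id, title: "hello" }];
}

// Waterfall: posts wait for user even though the two are independent,
// so total latency is the sum of both requests.
async function loadSerial(id: string) {
  const user = await fetchUser(id);
  const posts = await fetchPosts(id);
  return { user, posts };
}

// Parallel: both requests are in flight at once,
// so total latency is roughly the slower of the two.
async function loadParallel(id: string) {
  const [user, posts] = await Promise.all([fetchUser(id), fetchPosts(id)]);
  return { user, posts };
}
```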
- TypeScript 'any' types instead of proper types
- Missing error handling in async functions
- Improper React hook usage
- Server components importing client-only code
- Client components not marked with 'use client'
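For the "missing error handling" finding, one common remediation is wrapping awaited work so callers branch on a result instead of crashing. A hedged sketch (this helper is ours, not part of the app's codebase):

```typescript
// Turns a throwing async call into a [data, error] pair so route handlers
// can branch on failure instead of throwing unhandled. Illustrative only.
async function safe<T>(promise: Promise<T>): Promise<[T, null] | [null, Error]> {
  try {
    return [await promise, null];
  } catch (err) {
    return [null, err instanceof Error ? err : new Error(String(err))];
  }
}
```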
- Generates automated fix suggestions
- Creates jscodeshift/ts-morph transformation plans
- Provides executable commands for safe refactors
Orchestrates the entire scan:
- Manages agent execution (parallel where possible)
- Handles errors with fallback markdown unwrapper
- Merges and deduplicates issues
- Attaches codemod recommendations
- Tracks progress and provides real-time updates
All agents use Zod schemas (`schemas.ts`) for type-safe responses:
- `IssuesArraySchema`: Validates agent findings
- `CodemodsArraySchema`: Validates codemod suggestions
- AI SDK's `generateObject` with `mode: 'json'` ensures reliable parsing

No more JSON parsing errors! ✅
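The guard that Zod provides can be sketched dependency-free: reject anything that is not the expected object-wrapped issues array. The real agents use Zod's schemas with `generateObject`; this hand-rolled check only illustrates the idea.

```typescript
// Dependency-free sketch of the validation the agents run on model output.
// The real code uses Zod + generateObject; this shows the guard principle:
// anything not matching { issues: [...] } with valid fields is rejected.
type Issue = { title: string; severity: "critical" | "high" | "medium" | "low" | "info" };

function parseIssues(raw: string): Issue[] {
  const severities = new Set(["critical", "high", "medium", "low", "info"]);
  const data = JSON.parse(raw); // throws on malformed or markdown-wrapped output
  const issues = data?.issues;
  if (!Array.isArray(issues)) throw new Error("expected { issues: [...] }");
  for (const i of issues) {
    if (typeof i.title !== "string" || !severities.has(i.severity)) {
      throw new Error("issue failed schema check");
    }
  }
  return issues as Issue[];
}
```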
The system uses intelligent sampling to balance speed, cost, and coverage:
Default behavior (as of latest version):
- Repos with β€200 files: 100% analyzed
- Repos with >200 files: 200 files analyzed (28% for a 700-file repo)
| Coverage | Time | Cost | Issues Found |
|---|---|---|---|
| 200 files (default) | 2-4 min | $0.50 | 90-95% |
| 500 files | 8-12 min | $1.25 | 98% |
| All files (700) | 15-25 min | $2.50 | 100% |
Key insight: The first 200 files catch 90-95% of issues because intelligent sampling prioritizes:
- Files with heuristic hits (security/quality issues detected)
- Authentication and security-critical files
- API routes and server components
- Database queries and schemas
- Large files (often contain complex logic)
- Page components (user-facing code)
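The priority list above can be sketched as a scoring function. The weights and path checks below are illustrative assumptions, not the app's actual implementation:

```typescript
// Illustrative priority scoring for file sampling: higher score = picked first.
type FileInfo = { path: string; size: number; heuristicHits: number };

function priorityScore(f: FileInfo): number {
  let score = f.heuristicHits * 10;                      // heuristic hits dominate
  if (/auth|security|middleware/i.test(f.path)) score += 5; // security-critical files
  if (/app\/api\/|route\.ts$/.test(f.path)) score += 4;  // API routes
  if (/prisma|schema|\.sql$/.test(f.path)) score += 3;   // database code
  if (f.size > 10_000) score += 2;                       // large files
  if (/page\.(t|j)sx?$/.test(f.path)) score += 2;        // page components
  return score;
}

function sampleFiles(files: FileInfo[], limit: number): FileInfo[] {
  return [...files].sort((a, b) => priorityScore(b) - priorityScore(a)).slice(0, limit);
}
```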
To change the sampling limit, edit `lib/ai/agents/coordinator.ts` line 48:

```ts
// Option 1: Audit ALL files (100% coverage)
const samplingLimit = input.files.length;

// Option 2: Custom limit
const samplingLimit = 300; // Audit 300 files

// Option 3: Default (current)
const samplingLimit = input.files.length <= 200 ? input.files.length : 200;
```

See `SAMPLING_GUIDE.md` for detailed configuration options and cost/coverage trade-offs.
This audit tool can be integrated with OpenAI's Agents SDK for direct use within ChatGPT:
```bash
npm install @openai/agents-sdk
```

```ts
// agents/audit-tool.ts
import { coordinateAudit } from '@/lib/ai/agents/coordinator';

export const auditRepository = {
  name: 'audit_nextjs_repository',
  description: 'Audits a Next.js repository for security, performance, and quality issues',
  parameters: {
    type: 'object',
    properties: {
      repoUrl: {
        type: 'string',
        description: 'GitHub repository URL (e.g., https://github.com/owner/repo)'
      },
      branch: {
        type: 'string',
        description: 'Branch name (default: main)',
        default: 'main'
      }
    },
    required: ['repoUrl']
  },
  async execute({ repoUrl, branch = 'main' }) {
    const result = await coordinateAudit({
      repoUrl,
      ref: branch,
      provider: 'openai', // Use OpenAI directly
      model: 'gpt-4o'
    });
    return {
      summary: `Found ${result.issues.length} issues`,
      criticalIssues: result.issues.filter(i => i.severity === 'critical').length,
      highIssues: result.issues.filter(i => i.severity === 'high').length,
      issues: result.issues.slice(0, 10), // Top 10 issues
      fullReport: result.markdownReport
    };
  }
};
```

```ts
// app.ts
import { Agent } from '@openai/agents-sdk';
import { auditRepository } from './agents/audit-tool';

const agent = new Agent({
  model: 'gpt-4o',
  tools: [auditRepository],
  instructions: `You are a code audit assistant. When users ask to audit a repository,
use the audit_nextjs_repository tool to analyze it for security, performance, and quality issues.`
});

// Use the agent
const response = await agent.run({
  messages: [{
    role: 'user',
    content: 'Audit https://github.com/vercel/next.js for issues'
  }]
});
```

```bash
# .env.local
OPENAI_API_KEY=sk-proj-...
DATABASE_URL=postgresql://...

# Optional: For private repos
GITHUB_TOKEN=ghp_...
```

Benefits:
- ✅ Direct integration with ChatGPT/GPTs
- ✅ Conversational audit requests
- ✅ Automatic follow-up questions
- ✅ Context-aware recommendations
The app uses Prisma with Postgres to store:
- Repositories: Tracked repos and their metadata
- Scans: Audit results with timestamps
- Issues: Individual findings with severity and recommendations
```prisma
model Repository {
  id        String   @id @default(cuid())
  url       String   @unique
  owner     String
  name      String
  branch    String   @default("main")
  scans     Scan[]
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
}

model Scan {
  id           String     @id @default(cuid())
  repositoryId String
  repository   Repository @relation(fields: [repositoryId], references: [id])
  commitSha    String
  status       String     // pending, running, completed, failed
  issues       Issue[]
  createdAt    DateTime   @default(now())
  completedAt  DateTime?
}

model Issue {
  id             String   @id @default(cuid())
  scanId         String
  scan           Scan     @relation(fields: [scanId], references: [id])
  title          String
  description    String   @db.Text
  severity       String   // critical, high, medium, low, info
  recommendation String?  @db.Text
  file           String?
  line           Int?
  agentType      String   // security, performance, uiux, backend, db, lint
  createdAt      DateTime @default(now())
}
```

To enable vector similarity search for code chunks:
1. Enable the extension:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

2. Add embeddings table:

```sql
CREATE TABLE embeddings (
  id SERIAL PRIMARY KEY,
  scan_id TEXT NOT NULL,
  file TEXT NOT NULL,
  chunk_text TEXT NOT NULL,
  embedding vector(1536), -- OpenAI ada-002 dimension
  created_at TIMESTAMP DEFAULT NOW()
);

-- Create HNSW index for fast similarity search
CREATE INDEX ON embeddings USING hnsw (embedding vector_cosine_ops);
```

3. Use in queries:

```ts
// Find similar code patterns
const similar = await prisma.$queryRaw`
  SELECT file, chunk_text,
         1 - (embedding <=> ${queryEmbedding}::vector) AS similarity
  FROM embeddings
  WHERE scan_id = ${scanId}
  ORDER BY embedding <=> ${queryEmbedding}::vector
  LIMIT 10
`;
```
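The `<=>` operator in the query above is pgvector's cosine distance, so the similarity it computes is `1 - distance`. For intuition, the same metric in plain TypeScript:

```typescript
// Cosine similarity, the metric behind pgvector's vector_cosine_ops index.
// similarity = dot(a, b) / (|a| * |b|); pgvector's <=> returns 1 - similarity.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```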
The app fetches repository tarballs via GitHub REST API:
- ✅ Read-only access (no code execution)
- ✅ Handles 302 redirects automatically
- ✅ Supports public repos without authentication
- ✅ Private repos with `GITHUB_TOKEN`
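The tarball endpoint the app hits can be sketched like this. The URL shape follows GitHub's REST API (`GET /repos/{owner}/{repo}/tarball/{ref}`); the helper names are ours:

```typescript
// Builds the GitHub REST tarball URL the app downloads. GitHub answers with
// a 302 redirect to codeload.github.com, which the fetcher must follow.
function tarballUrl(owner: string, repo: string, ref = "main"): string {
  return `https://api.github.com/repos/${owner}/${repo}/tarball/${encodeURIComponent(ref)}`;
}

// The Authorization header is only needed for private repos or higher rate limits.
function githubHeaders(token?: string): Record<string, string> {
  return token ? { Authorization: `Bearer ${token}` } : {};
}
```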
For private repos or higher rate limits:
```bash
# .env.local
GITHUB_TOKEN=ghp_...
```

Or paste directly in the UI (per-scan).
Write operations (PRs, commits) are disabled by default. To enable:
```bash
# .env.local
ALLOW_WRITE=true
GITHUB_TOKEN=ghp_... # Must have repo write scope
```

Required permissions:
- `repo` scope for private repos
- `public_repo` for public repos
- Additional scopes for PR creation/commits
```bash
pnpm dev        # Start Next.js dev server (http://localhost:3000)
pnpm build      # Build for production
pnpm start      # Run production build
pnpm lint       # Run ESLint
pnpm type-check # Run TypeScript type checking
```

```bash
pnpm db:push     # Push Prisma schema to database
pnpm db:studio   # Open Prisma Studio (database GUI)
pnpm db:generate # Generate Prisma Client
pnpm db:migrate  # Create and apply migrations
```

```bash
pnpm mcp:dev # Start MCP server for Apps SDK integration
```

This project includes comprehensive documentation:
- `README.md` (this file) - Setup and usage guide
- `.env.example` - Environment variable template
- `MODEL_GUIDE.md` - AI model comparison, costs, and recommendations
  - Detailed cost analysis for GPT-4, Claude, Gemini, etc.
  - Quality vs. cost trade-offs
  - Model selection guide for different use cases
- `SAMPLING_GUIDE.md` - File sampling configuration
  - How intelligent sampling works
  - Coverage vs. speed vs. cost trade-offs
  - Configuration examples for different repo sizes
- `BUGFIXES.md` - All resolved issues and their fixes
  - GitHub 302 redirect handling
  - Provider mixing resolution
  - JSON parsing error fixes
  - Array schema validation fixes
- `ROOT_CAUSE_ANALYSIS.md` - Deep dive into JSON parsing issues
  - Why markdown wrapping occurred
  - How `mode: 'json'` solves it
  - Schema structure requirements
- `AGENTS.md` - Agent architecture and patterns
  - Vercel AI SDK 5 best practices
  - Tool calling patterns
  - Coordinator loop control
  - Context7 MCP integration
- ✅ **Read-only by default**: Never executes code from repositories
- ✅ **No arbitrary code execution**: Only parses and analyzes
- ✅ **Constrained prompts**: Agents use strict prompts to prevent injection
- ✅ **Zod validation**: All AI outputs validated with type-safe schemas
- ✅ **Secure token handling**: GitHub tokens never logged or exposed
- ✅ **HTTPS only**: All external API calls use HTTPS
- Repository code is sent to AI providers (OpenAI/OpenRouter) for analysis
- Use self-hosted models or Azure OpenAI for sensitive codebases
- Scan results stored in your Postgres database only
- No telemetry or analytics sent to third parties
Agents are instructed with:
- Direct, constrained prompts
- Explicit output format requirements (JSON schemas)
- No code execution capabilities
- Validation guardrails on all outputs
- For public repos: Use any provider
- For private repos: Consider:
- Azure OpenAI (SOC 2, HIPAA compliant)
- Self-hosted models via OpenRouter
- Vercel AI Gateway with data retention policies
- For sensitive code:
- Review privacy policies of your chosen AI provider
- Use providers with zero data retention
- Consider on-premise deployment
See `SECURITY.md` for operational security guidelines.
```bash
# Install Vercel CLI
npm i -g vercel

# Deploy
vercel

# Set environment variables in Vercel dashboard
# - OPENAI_API_KEY or OPENROUTER_API_KEY
# - DATABASE_URL
# - GITHUB_TOKEN (optional)
```

```dockerfile
# Dockerfile
FROM node:20-alpine AS base
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install
COPY . .
RUN pnpm build
EXPOSE 3000
CMD ["pnpm", "start"]
```

```bash
docker build -t nextjs-audit .
docker run -p 3000:3000 --env-file .env.local nextjs-audit
```

```bash
# Production .env
NODE_ENV=production
DATABASE_URL=postgresql://...
OPENAI_API_KEY=sk-proj-...
GITHUB_TOKEN=ghp_...

# Optional: Redis for caching
REDIS_URL=redis://...

# Optional: Observability
LANGFUSE_PUBLIC_KEY=pk-...
LANGFUSE_SECRET_KEY=sk-...
```

Fixed! Ensure you have the latest code with:
- `mode: 'json'` in `lib/ai/provider.ts`
- Object-wrapped schemas in `lib/ai/agents/schemas.ts`
Fixed! The app now follows redirects automatically with `maxRedirections: 5`.
Use OpenRouter with multiple fallback models:

```ts
models: [
  'openai/gpt-4o',            // Primary
  'anthropic/claude-3-haiku', // Fallback
  'google/gemini-pro'         // Fallback 2
]
```

Ensure your `DATABASE_URL` includes SSL for cloud databases:

```
postgresql://user:pass@host:5432/db?sslmode=require
```
- Reduce sampling limit (see `SAMPLING_GUIDE.md`)
- Use faster models (gpt-4o-mini instead of claude-3.5-sonnet)
- Enable database caching
Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit changes (`git commit -m 'Add amazing feature'`)
4. Push to branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
- Follow TypeScript strict mode
- Use Zod for all data validation
- Add tests for new agents
- Update documentation for config changes
- Keep agent prompts focused and constrained
MIT License - see LICENSE file for details.
Third-party components: Check licenses of any components you add (shadcn/ui, etc.).
Built with:
- Vercel AI SDK 5 - Agentic AI framework
- Next.js 15 - React framework
- Prisma - Database ORM
- Zod - TypeScript-first schema validation
- ts-morph - TypeScript AST manipulation
- OpenRouter - Unified LLM API
- shadcn/ui - UI components
- **Documentation**: See guides in repo root
- **Bug Reports**: Open an issue
- **Discussions**: GitHub Discussions
- **Email**: your-email@example.com
- UI configuration for sampling limits
- Support for monorepos (Nx, Turborepo)
- Custom agent creation via config files
- Automated PR creation with fixes
- CI/CD integration (GitHub Actions, GitLab CI)
- Support for other frameworks (Remix, SvelteKit, Nuxt)
- Real-time collaboration features
- Custom heuristic rules via UI
- Integration with Sentry, Datadog, etc.
Star ⭐ this repo if you find it useful!