
CodeLens — AI-Powered Code Review Assistant

CodeLens is a full-stack web application that uses OpenAI's GPT-4o to provide automated code reviews, explanations, and refactoring suggestions. Built as a production-grade AI system with structured LLM output, fault-tolerant caching, real analytics instrumentation, and a Celery task queue for scalable background processing.

Features

  • AI Code Review — Structured analysis of bugs, security vulnerabilities, performance issues, and style problems with severity ratings, line-specific feedback, and line number verification against actual code
  • Code Explanation — Plain English breakdowns of what code does, ideal for learning or onboarding
  • Refactor Suggestions — Improvement recommendations with before/after code examples
  • Review History — Save, revisit, and paginate past reviews with cursor-based pagination
  • Issue Feedback Loop — Mark which suggestions you actually applied; analytics track application rate by category
  • Redis Caching — SHA-256 content-addressable caching with 33% hit rate and 83x speedup on cache hits
  • Prompt Injection Detection — Sanitizer scans submitted code for injection patterns and prepends system guards to LLM prompts
  • Request Analytics — Middleware tracks response times, cache hit rates, score distributions, and issue category breakdowns
  • Async Task Queue — Celery workers process LLM calls in the background, API returns 202 Accepted immediately
  • 9 Languages Supported — Python, JavaScript, TypeScript, Java, Go, C++, Rust, C, C#
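The review response is structured rather than free-form text. The exact schema lives in the backend's Pydantic models; as a rough sketch of the shape (field names here are illustrative assumptions, using stdlib dataclasses instead of Pydantic), including the line-number verification step mentioned above:

```python
from dataclasses import dataclass, field

@dataclass
class Issue:
    # Hypothetical field names -- the real schema is defined in the backend.
    category: str   # e.g. "bug", "security", "performance", "style"
    severity: str   # e.g. "low", "medium", "high"
    line: int       # line number the LLM claims the issue is on
    message: str

@dataclass
class ReviewResult:
    score: int
    issues: list[Issue] = field(default_factory=list)

def verify_line_numbers(result: ReviewResult, code: str) -> ReviewResult:
    """Drop issues whose reported line number falls outside the submitted code."""
    max_line = code.count("\n") + 1
    result.issues = [i for i in result.issues if 1 <= i.line <= max_line]
    return result
```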

Tech Stack

| Layer | Technology |
| --- | --- |
| Backend | Python, FastAPI (async), LangChain |
| LLM | OpenAI GPT-4o with Pydantic structured output parsing |
| Database | PostgreSQL + async SQLAlchemy (pool: 20+5) |
| Cache | Redis (SHA-256 keys, 1-hour TTL, fail-open) |
| Task Queue | Celery with Redis broker |
| Auth | JWT + bcrypt, per-user rate limiting |
| Frontend | React, Tailwind CSS, Axios |
| DevOps | Docker Compose (5 services), GitHub Actions CI/CD |
| Testing | 68 automated tests (pytest) |

Architecture

React Frontend
    |
    | REST API (Axios + JWT)
    v
FastAPI Backend (4 async workers)
    |
    |--- JWT Auth + Rate Limiting (fail-open)
    |--- Prompt Injection Sanitizer
    |--- Cache Check (Redis, SHA-256 keys)
    |
    |--- [Sync] LLM Analysis (GPT-4o via LangChain + Pydantic)
    |--- [Async] Celery Task Queue (Redis broker)
    |
    |--- Line Number Verification
    |--- Save to PostgreSQL (history + analytics + feedback)
    |
    v
Response to Frontend (structured JSON)
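The sanitizer step in the pipeline above scans submitted code for injection phrasing without altering the code itself. A minimal sketch of that idea (the patterns here are illustrative assumptions, not the backend's actual rule set):

```python
import re

# Illustrative patterns only -- the real sanitizer has its own rule set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.IGNORECASE),
    re.compile(r"you are now", re.IGNORECASE),
]

def flag_injection(code: str) -> bool:
    """Return True if the submitted code contains prompt-injection phrasing.

    The code is never modified; a positive flag lets the backend prepend
    stricter system guards to the LLM prompt.
    """
    return any(p.search(code) for p in INJECTION_PATTERNS)
```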

Quick Start

Prerequisites

  • Docker and Docker Compose
  • OpenAI API key (optional: set MOCK_LLM=true for free testing)

Run with Docker

```bash
# Clone the repo
git clone https://github.com/shinersup/codelens.git
cd codelens

# Set your OpenAI API key (or use mock mode)
export OPENAI_API_KEY=sk-your-key-here

# Start everything (API, frontend, PostgreSQL, Redis, Celery worker)
docker compose up --build
```

  • Frontend: http://localhost:3000
  • API: http://localhost:8000
  • API docs: http://localhost:8000/docs (auto-generated by FastAPI)

Mock Mode

Set MOCK_LLM=true in backend/.env to run without OpenAI API calls. The mock service performs real rule-based static analysis (detects eval, os.system, SQL injection, bare excepts, etc.) for free, fast, deterministic testing.
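The mock service's rule-based approach can be sketched as a line-by-line regex scan. This is an illustrative reconstruction, not the backend's actual implementation; the rule table below is a hypothetical subset:

```python
import re

# Hypothetical rule table -- the real mock service in backend/ defines its own rules.
RULES = [
    (re.compile(r"\beval\s*\("), "security", "Use of eval() on dynamic input"),
    (re.compile(r"\bos\.system\s*\("), "security", "os.system() invites shell injection"),
    (re.compile(r"except\s*:"), "style", "Bare except hides errors"),
]

def mock_review(code: str) -> list[dict]:
    """Deterministic, free, rule-based scan standing in for the LLM call."""
    findings = []
    for lineno, line in enumerate(code.splitlines(), start=1):
        for pattern, category, message in RULES:
            if pattern.search(line):
                findings.append({"line": lineno, "category": category, "message": message})
    return findings
```

Because the output depends only on the input text, the same submission always produces the same findings, which keeps CI runs deterministic.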

Run Locally (Development)

```bash
# Backend
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env  # then edit with your keys
uvicorn app.main:app --reload

# Frontend (in another terminal)
cd frontend
npm install
npm run dev
```

Run Tests

```bash
cd backend
python -m pytest tests/ -v
```

68 tests covering: auth (7), schemas (8), languages (2), caching (5), endpoints (5), rate limiting (2), line verification (10), prompt injection sanitizer (7), cache logging (9), analytics (6), feedback (5), async tasks (2).

API Endpoints

| Method | Endpoint | Description | Auth |
| --- | --- | --- | --- |
| POST | /api/auth/register | Create account | No |
| POST | /api/auth/login | Get JWT token | No |
| POST | /api/review | AI code review (sync) | Yes |
| POST | /api/explain | Code explanation (sync) | Yes |
| POST | /api/refactor | Refactor suggestions (sync) | Yes |
| POST | /api/review/async | AI code review (task queue) | Yes |
| GET | /api/tasks/{task_id} | Poll async task status | Yes |
| GET | /api/history | Review history (paginated) | Yes |
| GET | /api/history/{id} | Single review detail | Yes |
| DELETE | /api/history/{id} | Delete a review | Yes |
| DELETE | /api/history | Clear all history | Yes |
| POST | /api/history/{id}/feedback | Submit issue feedback | Yes |
| GET | /api/history/{id}/feedback | Get feedback for a review | Yes |
| GET | /api/analytics | Usage metrics and stats | Yes |
| GET | /health | Health check | No |
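For the async path, the client POSTs to /api/review/async, gets a 202 Accepted with a task ID, and polls /api/tasks/{task_id} until the result is ready. A client-side polling loop might look like the sketch below; the transport is abstracted behind a `fetch_status` callable, and the state names ("pending", "done") are illustrative assumptions, not the API's documented values:

```python
import time

def poll_task(fetch_status, task_id: str, interval: float = 0.0, max_polls: int = 50):
    """Poll an async review task until it leaves the 'pending' state.

    `fetch_status` stands in for a GET /api/tasks/{task_id} call and should
    return a dict with at least a "state" key.
    """
    for _ in range(max_polls):
        status = fetch_status(task_id)
        if status["state"] != "pending":
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} still pending after {max_polls} polls")
```

In practice the interval would be a second or more, ideally with backoff, to avoid hammering the API.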

Rate Limits

  • Reviews: 20/hour per user
  • Explanations: 30/hour per user
  • Refactors: 20/hour per user
  • Fail-open design: if Redis is down, requests proceed without rate limiting
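The fail-open behavior reduces to one try/except around the Redis counter. A minimal sketch, assuming a `redis_incr` wrapper (hypothetical, standing in for an INCR-with-TTL call) that raises `ConnectionError` on an outage:

```python
def allow_request(redis_incr, key: str, limit: int) -> bool:
    """Fail-open rate check: count the request in Redis, but never block
    a user because Redis itself is unreachable."""
    try:
        count = redis_incr(key)  # hypothetical wrapper around INCR + EXPIRE
    except ConnectionError:
        return True              # fail open: Redis down -> request proceeds
    return count <= limit
```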

Key Technical Decisions

  • Fail-open caching and rate limiting — Redis failures don't block users; availability over strict enforcement
  • Pydantic structured output — Forces GPT-4o into typed JSON schemas; temperature 0.2 for consistency
  • Exact-match SHA-256 caching — A one-character change in the code is a cache miss; the design prioritizes 100% cache accuracy over a higher hit rate
  • Line number verification — Cross-checks LLM-returned line numbers against actual code
  • Prompt injection sanitization — Detects injection patterns without modifying user code
  • Mock LLM service — Real rule-based static analysis for testing and CI without API costs
  • Celery task queue — Decouples LLM calls from HTTP requests for horizontal scaling
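The exact-match property falls directly out of hashing the submitted code. A sketch of key derivation (the `mode:language:digest` layout is an assumption about how the backend namespaces its keys):

```python
import hashlib

def cache_key(language: str, mode: str, code: str) -> str:
    """Content-addressed cache key: any one-character change in the code
    yields a different SHA-256 digest, so a cache hit is always an exact match."""
    digest = hashlib.sha256(code.encode("utf-8")).hexdigest()
    return f"{mode}:{language}:{digest}"  # key layout is illustrative
```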