Enterprise-ready Retrieval Augmented Generation backend with tenant-scoped APIs, Supabase-backed storage, pgvector retrieval, and model/provider boundaries aligned to the planning docs.
- FastAPI service with /health, /ready, /metrics, tenant listing, documents, chat, conversations, feedback, and admin/ops endpoints
- Multi-provider model gateway with Groq, OpenAI, Anthropic, Ollama, and hosted embedding provider routing
- Supabase-backed pgvector + Postgres FTS retrieval with ACL-aware filtering, reranker hooks, and retrieval telemetry
- Async ingestion with Celery + Redis, signed Supabase Storage uploads, and parsers for text, Markdown, PDF, DOCX, and HTML
- Request correlation IDs, structured logging, JWT-first auth posture, rate limiting, restricted-data provider policy hooks, and redacted audit logging
- Offline evaluation scaffolding with golden dataset loading and persisted evaluation runs
- Docker Compose for API, Redis, and Prometheus; Supabase should provide database, auth, storage, and pgvector
- Supabase-aligned SQL migrations under app/storage/db/migrations
- Unit coverage for routing, config, auth, parsers, retrieval, rate limiting, and evaluation helpers
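
To make the hybrid retrieval bullet more concrete, here is a minimal sketch of merging a vector ranking with a full-text ranking while dropping chunks the caller may not see. It is an illustration, not the code in this repository: the `Chunk` type, the `fuse` function, the use of reciprocal rank fusion, and the rule that an empty `acl_group_ids` means "visible to the whole tenant" are all assumptions. In a real deployment the ACL filter would more likely be applied in SQL before results leave Postgres.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Chunk:
    """Hypothetical retrieval hit; the real schema is not shown in this README."""
    chunk_id: str
    acl_group_ids: frozenset = field(default_factory=frozenset)


def is_visible(chunk, user_group_ids):
    # Assumption: an empty ACL means the chunk is visible tenant-wide.
    return not chunk.acl_group_ids or bool(chunk.acl_group_ids & user_group_ids)


def fuse(vector_hits, fts_hits, user_group_ids, k=60, top_k=5):
    """Merge two ranked lists with reciprocal rank fusion after ACL filtering."""
    scores = {}
    chunks = {}
    for hits in (vector_hits, fts_hits):
        visible = [c for c in hits if is_visible(c, user_group_ids)]
        for rank, chunk in enumerate(visible, start=1):
            # Each list contributes 1 / (k + rank) to the chunk's fused score.
            scores[chunk.chunk_id] = scores.get(chunk.chunk_id, 0.0) + 1.0 / (k + rank)
            chunks[chunk.chunk_id] = chunk
    # Highest score first; ties are broken by chunk_id for determinism.
    ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return [chunks[cid] for cid in ranked[:top_k]]


handbook = Chunk("a")
hr_policy = Chunk("b", frozenset({"hr"}))
finance_memo = Chunk("c", frozenset({"finance"}))

results = fuse([handbook, hr_policy, finance_memo], [hr_policy], frozenset({"hr"}))
print([c.chunk_id for c in results])
```

A chunk that appears in both rankings accumulates two contributions, so it tends to rise above chunks found by only one retriever.
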
cp .env.example .env
docker compose up --build

Then open:
- API health: http://localhost:8000/health
- API readiness: http://localhost:8000/ready
- API docs: http://localhost:8000/docs
- Supabase: use a local Supabase stack or a dedicated Supabase project for database, auth, storage, and vectors
Apply the SQL migrations in app/storage/db/migrations to your Supabase/Postgres database before using the APIs.
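
This README does not say which tool applies the migrations (psql, the Supabase SQL editor, or something else), so the sketch below only works out the order in which to run them. It assumes every migration file follows the numeric-prefix pattern seen in 0005_phase56_hardening.sql; the helper names and the other file names in the example are hypothetical.

```python
import re
from pathlib import Path

# Assumed naming convention: <digits>_<description>.sql
MIGRATION_NAME = re.compile(r"^(\d+)_.+\.sql$")


def order_migration_names(names):
    """Return migration file names sorted by their numeric prefix."""
    numbered = []
    for name in names:
        match = MIGRATION_NAME.match(name)
        if match:  # Skip anything that does not look like a migration.
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


def ordered_migrations(directory="app/storage/db/migrations"):
    """List migration files on disk in the order they should be applied."""
    root = Path(directory)
    names = [path.name for path in root.glob("*.sql")]
    return [root / name for name in order_migration_names(names)]


# Hypothetical file names, apart from 0005_phase56_hardening.sql.
example = ["0005_phase56_hardening.sql", "0001_init.sql", "notes.txt", "0002_add_index.sql"]
print(order_migration_names(example))
```

Sorting on the parsed integer rather than the raw string keeps the order correct even if a prefix is not zero-padded.
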
For local development:
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
uvicorn app.main:app --reload

Ingest a document:
curl -X POST http://localhost:8000/v1/tenants/11111111-1111-1111-1111-111111111111/documents \
-H "X-API-Key: dev-api-key-change-me" \
-H "Content-Type: application/json" \
-d '{"title":"Employee Handbook","source_type":"text","text":"Your enterprise knowledge text...","metadata":{"source":"handbook"},"sensitivity":"internal","acl_group_ids":[]}'

Ask a question:
curl -X POST http://localhost:8000/v1/tenants/11111111-1111-1111-1111-111111111111/chat \
-H "X-API-Key: dev-api-key-change-me" \
-H "Content-Type: application/json" \
-d '{"question":"What does the handbook say?","model_profile":"balanced","top_k":5,"filters":{},"stream":false}'

Record feedback:
curl -X POST http://localhost:8000/v1/tenants/11111111-1111-1111-1111-111111111111/messages/<message-id>/feedback \
-H "X-API-Key: dev-api-key-change-me" \
-H "Content-Type: application/json" \
-d '{"rating":1,"comment":"Correct and useful.","categories":["helpful_sources"]}'

Usage summary:
curl -X GET "http://localhost:8000/v1/tenants/11111111-1111-1111-1111-111111111111/admin/usage?group_by=model_profile" \
-H "X-API-Key: dev-api-key-change-me"

Run an offline evaluation:
.venv/bin/python scripts/run_evaluation.py --tenant-id 11111111-1111-1111-1111-111111111111

See docs/ARCHITECTURE.md.
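
The curl requests shown earlier can also be built from Python with only the standard library. The sketch below reuses the same URL layout, headers, and chat payload; the helper names `build_request` and `send` are hypothetical, and because the response schema is not documented here, `send` only assumes the API returns a JSON body.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"
TENANT_ID = "11111111-1111-1111-1111-111111111111"
API_KEY = "dev-api-key-change-me"  # Development key from the curl examples.


def build_request(path, payload, *, base_url=BASE_URL, tenant_id=TENANT_ID, api_key=API_KEY):
    """Build a POST request for a tenant-scoped endpoint without sending it."""
    url = f"{base_url}/v1/tenants/{tenant_id}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the body, assuming the API replies with JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same body as the "Ask a question" curl example.
chat_request = build_request(
    "chat",
    {
        "question": "What does the handbook say?",
        "model_profile": "balanced",
        "top_k": 5,
        "filters": {},
        "stream": False,
    },
)
print(chat_request.get_method(), chat_request.full_url)
```

With the API running locally, `send(chat_request)` would perform the call; building and sending are kept separate so the request can be inspected or logged first.
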
- Metrics: GET /metrics
- Evaluation runner: scripts/run_evaluation.py
- Usage export: scripts/export_usage_report.py
- Dead-letter requeue: scripts/requeue_dead_letter_job.py
- Apply the latest migration 0005_phase56_hardening.sql to the target Supabase project
- Re-run the live DB-backed integration flow in an environment with network access to Supabase
- Enable OTLP tracing exporter in deployment if you want external trace collection instead of the local fallback