An implementation of a Retrieval-Augmented Generation (RAG) system with enterprise-grade compliance controls.
This project builds:
- A stable LLM baseline
- A proper ingestion pipeline
- Retrieval grounding
- Access control
- Policy enforcement
- PII protection
- Measurable evaluation
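The PII-protection goal can be illustrated with a minimal regex-based redaction sketch. Everything here (the pattern set, the `redact_pii` name) is an illustrative assumption, not this project's actual code; a production system would use a dedicated PII-detection library and locale-aware rules.

```python
import re

# Illustrative patterns only (assumed, not exhaustive): email, US SSN, US phone.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace each matched PII span with a [REDACTED:<TYPE>] placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text

print(redact_pii("Contact alice@example.com or 555-123-4567."))
# → Contact [REDACTED:EMAIL] or [REDACTED:PHONE].
```

Running redaction on the model's output (rather than only on the input) is what makes it the last stage in the pipeline below: it catches PII that leaks out of retrieved documents.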
Target architecture (components marked "planned" come in later phases):

User
↓
FastAPI
↓
Authentication Layer (planned)
↓
Policy Engine (planned)
↓
Retriever (planned)
↓
LLM (Ollama)
↓
PII Redaction (planned)
↓
Response
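The flow above can be sketched as a chain of stage functions. Every stage below is a hypothetical stub (names such as `check_auth`, `apply_policy`, and `retrieve` are assumptions for illustration, not this project's API), but the composition order matches the diagram:

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    user: str
    prompt: str
    context: list[str] = field(default_factory=list)

def check_auth(req: Request) -> Request:
    # Stub: a real system would validate a token or session here.
    if not req.user:
        raise PermissionError("unauthenticated")
    return req

def apply_policy(req: Request) -> Request:
    # Stub: reject prompts that violate a (hypothetical) policy rule.
    if "forbidden" in req.prompt.lower():
        raise ValueError("policy violation")
    return req

def retrieve(req: Request) -> Request:
    # Stub: a real retriever would query a vector store for grounding docs.
    req.context.append("stub document")
    return req

def generate(req: Request) -> str:
    # Stub standing in for the Ollama call; echoes the grounded prompt.
    return f"answer({req.prompt!r}, docs={len(req.context)})"

def redact(answer: str) -> str:
    # Stub standing in for the PII-redaction stage.
    return answer

def handle(req: Request) -> str:
    """Run a request through every stage, in the order shown in the diagram."""
    return redact(generate(retrieve(apply_policy(check_auth(req)))))

print(handle(Request(user="alice", prompt="What is RAG?")))
# → answer('What is RAG?', docs=1)
```

Keeping each stage a plain function of the request makes the planned components independently testable and easy to swap in one at a time.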
Baseline architecture:
User → FastAPI → Ollama → Response
Exit Condition: API consistently returns model responses in < 30 seconds.
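The baseline can be exercised with a minimal client against Ollama's HTTP generate endpoint, using the Requests library from the stack below. The endpoint path, default port, and `response` field follow Ollama's documented API; the model name `llama3` is an assumption, and the timeout mirrors the < 30 s exit condition:

```python
import requests

# Default local Ollama endpoint; adjust if your instance listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    """Non-streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(prompt: str, model: str = "llama3", timeout: float = 30.0) -> str:
    """POST the prompt to the local Ollama server and return the completion.

    The timeout enforces the 30-second exit condition; raises on HTTP
    errors or timeout instead of returning a partial answer.
    """
    resp = requests.post(OLLAMA_URL, json=build_payload(model, prompt), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["response"]

print(build_payload("llama3", "Say hello."))
```

With `"stream": False`, Ollama returns a single JSON object rather than a stream of chunks, which keeps the baseline endpoint simple; streaming can be added later without changing the payload builder.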
Tech stack:
- Python 3.12.3 — Core language
- FastAPI — Web framework
- Ollama — Local LLM inference
- Uvicorn — ASGI server
- Requests — HTTP client
Design principle: Everything runs locally. No external APIs or cloud dependencies.