ChatVector-AI is an open-source Retrieval-Augmented Generation (RAG) engine for ingesting, indexing, and querying unstructured documents such as PDFs and text files.
Think of it as an engine developers can use to build document-aware applications (such as research assistants, contract analysis tools, or internal knowledge systems) without having to reinvent the RAG pipeline.
⭐ Star the repo to follow progress and support the project!
ChatVector-AI provides a clean, extensible backend foundation for RAG-based document intelligence. It handles the full lifecycle of document Q&A:
- Document ingestion (PDF, text)
- Text extraction and chunking
- Vector embedding and storage
- Semantic retrieval
- LLM-powered answer generation
The goal is to offer a developer-focused RAG engine that can be embedded into other applications, tools, or products, not a polished end-user SaaS.
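To make the "text extraction and chunking" stage of the lifecycle above concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: the function name `chunk_text`, the character-based windows, and the default sizes are assumptions, not the project's actual chunking pipeline.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`.

    Hypothetical helper for illustration; not taken from the ChatVector-AI codebase.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each iteration
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.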
ChatVector-AI is designed as a production-ready backend engine, not a general-purpose framework. If you need a running, reliable API for document Q&A, this project provides a complete, opinionated solution. Here's how it compares to the approach of using a modular framework:
| Aspect | ChatVector-AI (This Project) | General AI Framework (e.g., LangChain) |
|---|---|---|
| Primary Goal | Deliver a deployable backend service for document intelligence. | Provide modular components to build a wide variety of AI applications. |
| Out-of-the-Box Experience | A fully functional FastAPI service with logging, testing, and a clean API. | A collection of tools and abstractions you must wire together and productionize. |
| Architecture | Batteries-included, opinionated engine. Get a working system for one use case. | Modular building blocks. Assemble and customize components for many use cases. |
| Best For | Developers, startups, or teams who need a document Q&A API now and want to focus on their application layer. | Developers and researchers building novel, complex AI agents or exploring multiple LLM patterns from the ground up. |
| Path to Production | Short. Configure, deploy, and integrate via API. Built-in observability and scaling patterns. | Long. Requires significant additional work on API layers, monitoring, deployment, and performance tuning. |
ChatVector-AI is designed for:
- Developers building document intelligence tools or internal knowledge systems
- Backend engineers who want a solid RAG foundation without heavy abstractions
- AI/ML practitioners experimenting with chunking, retrieval, and prompt strategies
- Open-source contributors interested in retrieval systems, embeddings, and LLM orchestration
The core RAG backend is complete and functional.
What works today:
- ✅ PDF text extraction
- ✅ Basic chunking pipeline
- ✅ Vector embeddings
- ✅ Semantic search (pgvector)
- ✅ LLM-powered answers
- ✅ Supabase integration
Backend improvements in progress:
- 🚧 Advanced chunking strategies
- 🚧 Error handling & logging
- 🚧 API rate limiting
- 🚧 Performance optimization
- 🚧 Authentication & access control
Frontend Demo: A lightweight UI for testing the backend API. Not production-ready.
- FastAPI: modern Python API framework with automatic OpenAPI docs
- Uvicorn: high-performance ASGI server
- Design goals: clarity, extensibility, and debuggability
- Google AI Studio (Gemini): LLM + embeddings
- Features: chunking, semantic retrieval, prompt construction
- Supabase: PostgreSQL backend
- pgvector: native vector similarity search
- Storage: document metadata and embeddings
- Next.js + TypeScript
- Exists solely to demonstrate backend usage
- Not production-ready
- Subject to breaking changes
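The semantic retrieval step listed in the stack above ranks stored chunk embeddings by their similarity to a query embedding. The sketch below shows that idea in memory with cosine similarity. It assumes cosine is the metric and uses tiny hand-written vectors; in the real stack the embeddings come from the model and the ranking is done by pgvector inside Postgres. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    indexed_chunks: Sequence[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k (score, chunk_text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query, vector), text) for text, vector in indexed_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings", made up for the example.
    index = [("chunk a", [1.0, 0.0]), ("chunk b", [0.0, 1.0]), ("chunk c", [1.0, 1.0])]
    for score, text in top_k([1.0, 0.0], index, k=2):
        print(f"{score:.3f}  {text}")
```

pgvector exposes cosine distance (one minus this similarity) as an operator, so the database can do the same ranking with an `ORDER BY` instead of a Python loop.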
Follow these steps to get the backend running in under 5 minutes.
- Docker & Docker Compose installed
  - Install Docker (Mac/Windows/Linux)
- Google AI Studio API Key (Get Key)
cd backend
# Create .env file
Create a .env file in /backend and paste in the following values:
APP_ENV=development
LOG_LEVEL=INFO
LOG_USE_UTC=false
GEN_AI_KEY=your_google_ai_studio_api_key_here
# Replace GEN_AI_KEY with your actual API key
# Upload validation
MAX_UPLOAD_SIZE_MB=10
Note: Make sure Docker Desktop is running (Mac/Windows) before executing this command.
Run from the project root (where docker-compose.yml is located):
docker-compose up --build
What happens:
- Postgres with pgvector starts automatically and initializes tables + vector functions
- API waits for Postgres healthcheck
- Live reload enabled for backend code
- Root: http://localhost:8000
- Swagger UI: http://localhost:8000/docs
Try endpoints:
- `/upload` - Upload a PDF and get a `document_id` and `status_endpoint`
- `/documents/{document_id}/status` - Poll upload stage/progress metadata
- `/chat` - Ask questions using the `document_id`
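The upload, poll, then chat flow described above can be sketched as a small client-side polling helper. This is a sketch under assumptions: the README does not show the status response schema, so the `status` field and the `"completed"` / `"failed"` values are hypothetical, as are the helper names. Check the Swagger UI at http://localhost:8000/docs for the real shapes.

```python
import json
import time
from typing import Any, Callable, Dict, Sequence
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # local address from the quick start


def status_url(base_url: str, document_id: str) -> str:
    """Build the status endpoint URL for a document."""
    return f"{base_url.rstrip('/')}/documents/{document_id}/status"


def fetch_status_http(document_id: str) -> Dict[str, Any]:
    """GET the status endpoint and decode the JSON body."""
    with urlopen(status_url(BASE_URL, document_id)) as response:
        return json.load(response)


def poll_until_done(
    fetch_status: Callable[[str], Dict[str, Any]],
    document_id: str,
    done_states: Sequence[str] = ("completed", "failed"),  # hypothetical values
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until its 'status' field (assumed name) is a terminal state."""
    for _ in range(max_attempts):
        payload = fetch_status(document_id)
        if payload.get("status") in done_states:
            return payload
        sleep(interval_s)
    raise TimeoutError(f"document {document_id} not finished after {max_attempts} polls")


if __name__ == "__main__":
    # Requires the backend to be running and a real document_id from /upload.
    print(poll_until_done(fetch_status_http, "your-document-id"))
```

Passing the fetch and sleep functions in as arguments keeps the loop testable without a running server.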
| Command | Purpose |
|---|---|
| `docker-compose up` | Start containers without rebuilding (normal start). |
| `docker-compose down` | Stop containers and preserve data (normal stop). |
| `docker-compose down -v` | Stop containers and delete all database data. Use to completely reset DB. |
| `docker-compose up --build` | Rebuild containers after code changes or DB reset. |
| `docker-compose logs -f api` | Follow API logs in real time. |
| `docker-compose exec db psql -U postgres` | Connect to Postgres inside Docker for manual queries. |
If you want to run scripts or the API without Docker:
# 1. Create virtual environment
python -m venv venv
source venv/bin/activate # Mac/Linux
venv\Scripts\activate # Windows
# 2. Install dependencies
pip install -r requirements.txt
# 3. Set DATABASE_URL in .env if different from Docker
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/postgres
# 4. Run scripts or start API manually
python scripts/your_script.py
uvicorn main:app --reload --host 0.0.0.0 --port 8000
Notes:
- Requires a running Postgres instance with pgvector enabled
- Only needed for local development outside Docker
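When running outside Docker, a quick sanity check of `DATABASE_URL` can catch a wrong host or port before the API starts. The sketch below only parses the URL with the standard library; it does not connect to Postgres or verify that pgvector is enabled. The function name `describe_database_url` is hypothetical.

```python
from typing import Any, Dict
from urllib.parse import urlparse


def describe_database_url(url: str) -> Dict[str, Any]:
    """Break a postgresql:// URL into the parts a connection would use."""
    parsed = urlparse(url)
    if parsed.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parsed.scheme!r}")
    return {
        "user": parsed.username,
        "host": parsed.hostname,
        "port": parsed.port or 5432,  # 5432 is the default Postgres port
        "database": parsed.path.lstrip("/"),
    }


if __name__ == "__main__":
    # Example value from the local development steps above.
    print(describe_database_url("postgresql://postgres:postgres@localhost:5432/postgres"))
```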
✅ Result
- Docker-first setup is simple, cross-platform, and fully initialized
- Optional sections give control for resets, logs, or running scripts manually
Note: The frontend serves as the project's web presence and as a testing demo, but it is not central to the engine itself.
- Node.js 18+
- npm or yarn
# 1. Navigate to frontend directory
cd frontend-demo
# 2. Install dependencies
npm install
# 3. Start development server
npm run dev
# 4. Run in browser
The frontend will run on http://localhost:3000
High-impact contribution areas:
- Ingestion & indexing pipelines
- Retrieval quality & evaluation
- Chunking strategies
- API design & refactoring
- Performance & scaling
- Documentation & examples
Frontend contributions are welcome but considered non-core.
See CONTRIBUTING.md for details.