freya-docs/chatvector-ai

ChatVector-AI

Open-Source Backend-First RAG Engine for Document Intelligence

ChatVector-AI is an open-source Retrieval-Augmented Generation (RAG) engine for ingesting, indexing, and querying unstructured documents such as PDFs and text files.

Think of it as an engine developers can use to build document-aware applications (such as research assistants, contract analysis tools, or internal knowledge systems) without having to reinvent the RAG pipeline.

⭐ Star the repo to follow progress and support the project!


🔗 Quick Links

Good First Issues | Roadmap | Quick Setup | Project Board | Dev Notes | License: MIT | Contributing Docs | Contributing Video | Discussions


📌 Table of Contents

  • What is ChatVector-AI?
  • ChatVector-AI vs Frameworks
  • Who is this for?
  • Current Status
  • Architecture Overview
    • Backend Core
    • AI Retrieval
    • Data Layer
    • Reference Frontend
  • Quick Start
    • Backend Setup
    • Frontend Demo
  • Contributing
  • License

🔎 What is ChatVector-AI?

ChatVector-AI provides a clean, extensible backend foundation for RAG-based document intelligence. It handles the full lifecycle of document Q&A:

  • Document ingestion (PDF, text)
  • Text extraction and chunking
  • Vector embedding and storage
  • Semantic retrieval
  • LLM-powered answer generation

The goal is to offer a developer-focused RAG engine that can be embedded into other applications, tools, or products. It is not a polished end-user SaaS.
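
To make the chunking and prompt-construction steps of the lifecycle above concrete, here is a minimal sketch. It is not the project's actual implementation: `chunk_text`, `build_prompt`, the character-based windowing, and the default sizes are all assumptions chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the real engine may chunk by tokens or sentences.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble retrieved chunks and the user question into one LLM prompt."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    pieces = chunk_text("abcdefghij", chunk_size=4, overlap=2)
    print(pieces)
    print(build_prompt("What letters appear?", pieces[:2]))
```

Overlapping windows are a common way to avoid cutting a sentence in half at a chunk boundary, at the cost of storing some text twice.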


ChatVector-AI vs Frameworks

ChatVector-AI is designed as a production-ready backend engine, not a general-purpose framework. If you need a running, reliable API for document Q&A, this project provides a complete, opinionated solution. Here's how it compares to the approach of using a modular framework:

| Aspect | ChatVector-AI (This Project) | General AI Framework (e.g., LangChain) |
| --- | --- | --- |
| Primary Goal | Deliver a deployable backend service for document intelligence. | Provide modular components to build a wide variety of AI applications. |
| Out-of-the-Box Experience | A fully functional FastAPI service with logging, testing, and a clean API. | A collection of tools and abstractions you must wire together and productionize. |
| Architecture | Batteries-included, opinionated engine. Get a working system for one use case. | Modular building blocks. Assemble and customize components for many use cases. |
| Best For | Developers, startups, or teams who need a document Q&A API now and want to focus on their application layer. | Developers and researchers building novel, complex AI agents or exploring multiple LLM patterns from the ground up. |
| Path to Production | Short. Configure, deploy, and integrate via API. Built-in observability and scaling patterns. | Long. Requires significant additional work on API layers, monitoring, deployment, and performance tuning. |

👥 Who is this for?

ChatVector-AI is designed for:

  • Developers building document intelligence tools or internal knowledge systems
  • Backend engineers who want a solid RAG foundation without heavy abstractions
  • AI/ML practitioners experimenting with chunking, retrieval, and prompt strategies
  • Open-source contributors interested in retrieval systems, embeddings, and LLM orchestration

🚀 Current Status

Backend MVP (Core Engine)

The core RAG backend is complete and functional.

What works today:

  • ✅ PDF text extraction
  • ✅ Basic chunking pipeline
  • ✅ Vector embeddings
  • ✅ Semantic search (pgvector)
  • ✅ LLM-powered answers
  • ✅ Supabase integration

Backend improvements in progress:

  • 🚧 Advanced chunking strategies
  • 🚧 Error handling & logging
  • 🚧 API rate limiting
  • 🚧 Performance optimization
  • 🚧 Authentication & access control

Frontend Demo: A lightweight UI for testing the backend API. Not production-ready.


🧠 Architecture Overview

Backend Layer (Core)

  • FastAPI: modern Python API framework with automatic OpenAPI docs
  • Uvicorn: high-performance ASGI server
  • Design goals: clarity, extensibility, and debuggability

AI & Retrieval Layer

  • Google AI Studio (Gemini): LLM + embeddings
  • Features: chunking, semantic retrieval, prompt construction

Data Layer

  • Supabase: PostgreSQL backend
  • pgvector: native vector similarity search
  • Storage: document metadata and embeddings
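
As a rough illustration of what vector similarity search computes, the sketch below ranks stored chunks by cosine distance, which is the quantity pgvector's `<=>` operator returns. Whether ChatVector-AI uses cosine distance rather than another pgvector operator is an assumption here, and `cosine_distance` and `top_k` are hypothetical helper names, not project functions.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity (0.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: list[float], rows: list[tuple[str, list[float]]], k: int = 3):
    """Rank (chunk_text, embedding) rows by distance to the query, closest first.

    In-memory stand-in for an ORDER BY <distance> LIMIT k query.
    """
    scored = [(cosine_distance(query, emb), chunk) for chunk, emb in rows]
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real embeddings have hundreds of dimensions.
    rows = [("about cats", [1.0, 0.0]), ("about dogs", [0.0, 1.0]), ("about pets", [1.0, 1.0])]
    for distance, chunk in top_k([1.0, 0.0], rows, k=2):
        print(f"{distance:.3f}  {chunk}")
```

In the database, the same ranking is done inside Postgres so that only the closest chunks are returned to the API.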

Reference Frontend (Non-Core)

  • Next.js + TypeScript
  • Exists solely to demonstrate backend usage
  • Not production-ready
  • Subject to breaking changes

🎯 Quick Start: Run in 5 Minutes

Backend Setup

Follow these steps to get the backend running in under 5 minutes.

Prerequisites

  • Docker & Docker Compose installed

  • Google AI Studio API Key (Get Key)

Setup .env

Create a .env file in the backend/ directory and paste in the following values:

APP_ENV=development
LOG_LEVEL=INFO
LOG_USE_UTC=false
# Replace the placeholder below with your actual Google AI Studio API key
GEN_AI_KEY=your_google_ai_studio_api_key_here

# Upload validation
MAX_UPLOAD_SIZE_MB=10
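
For readers curious how settings like these are typically consumed, here is a hedged sketch that reads the same variable names from the environment and validates them. The `Settings` class, `load_settings` function, and the fallback defaults are assumptions for illustration; the backend's real configuration code may look different.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PLACEHOLDER_KEY = "your_google_ai_studio_api_key_here"


@dataclass(frozen=True)
class Settings:
    app_env: str
    log_level: str
    log_use_utc: bool
    gen_ai_key: str
    max_upload_size_mb: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env

    key = env.get("GEN_AI_KEY", "")
    if not key or key == PLACEHOLDER_KEY:
        raise ValueError("GEN_AI_KEY is missing or still set to the placeholder")

    max_mb = int(env.get("MAX_UPLOAD_SIZE_MB", "10"))
    if max_mb <= 0:
        raise ValueError("MAX_UPLOAD_SIZE_MB must be a positive integer")

    return Settings(
        app_env=env.get("APP_ENV", "development"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        # Accept a few common spellings of "true"; anything else is False.
        log_use_utc=env.get("LOG_USE_UTC", "false").strip().lower() in {"1", "true", "yes"},
        gen_ai_key=key,
        max_upload_size_mb=max_mb,
    )


if __name__ == "__main__":
    print(load_settings({"GEN_AI_KEY": "example-key", "MAX_UPLOAD_SIZE_MB": "10"}))
```

Failing fast on a missing or placeholder key surfaces a misconfigured .env at startup rather than on the first LLM call.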

Launch Backend

Note: Make sure Docker Desktop is running (Mac/Windows) before executing this command.

Run from the project root (where docker-compose.yml is located):

docker-compose up --build

What happens:

  • Postgres with pgvector starts automatically and initializes tables + vector functions
  • API waits for Postgres healthcheck
  • Live reload enabled for backend code

Test the API

Try endpoints:

  1. /upload - Upload a PDF and get a document_id and status_endpoint
  2. /documents/{document_id}/status - Poll upload stage/progress metadata
  3. /chat - Ask questions using the document_id
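
The upload, poll, chat flow above can be sketched from the client side as follows. Only the `/documents/{document_id}/status` path comes from this README; the `status` field, its `completed` and `failed` values, the base URL, and the function names are assumptions, so check the API's OpenAPI docs for the real response shape.

```python
import json
import time
import urllib.request
from typing import Callable

# Hypothetical status values; the real API may use different names.
DONE_STATES = {"completed"}
FAILED_STATES = {"failed"}


def http_fetch_status(document_id: str, base_url: str = "http://localhost:8000") -> dict:
    """GET the status endpoint and decode its JSON body (base_url is an assumption)."""
    url = f"{base_url}/documents/{document_id}/status"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_document(
    fetch_status: Callable[[str], dict],
    document_id: str,
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the document reaches a terminal state or attempts run out."""
    for _ in range(max_attempts):
        payload = fetch_status(document_id)
        status = payload.get("status")
        if status in DONE_STATES:
            return payload
        if status in FAILED_STATES:
            raise RuntimeError(f"processing failed for document {document_id}")
        sleep(interval_s)
    raise TimeoutError(f"document {document_id} not ready after {max_attempts} polls")


if __name__ == "__main__":
    # Offline demo with a fake status source instead of a live server.
    fake = iter([{"status": "processing"}, {"status": "completed"}])
    print(wait_for_document(lambda _id: next(fake), "doc-123", sleep=lambda s: None))
```

Against a running backend, `wait_for_document(http_fetch_status, document_id)` would be called between the `/upload` and `/chat` requests. Passing the fetch function as a parameter keeps the polling logic testable without a server.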

2️⃣ Extra Docker Commands

| Command | Purpose |
| --- | --- |
| docker-compose up | Start containers without rebuilding (normal start). |
| docker-compose down | Stop containers and preserve data (normal stop). |
| docker-compose down -v | Stop containers and delete all database data. Use to completely reset DB. |
| docker-compose up --build | Rebuild containers after code changes or DB reset. |
| docker-compose logs -f api | Follow API logs in real time. |
| docker-compose exec db psql -U postgres | Connect to Postgres inside Docker for manual queries. |

3️⃣ Run Python Scripts Outside Docker (Optional / Advanced)

If you want to run scripts or the API without Docker:

# 1. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate   # Mac/Linux
venv\Scripts\activate      # Windows (run this instead of the line above)

# 2. Install dependencies
pip install -r requirements.txt

# 3. If your database differs from the Docker default, set DATABASE_URL in .env:
#    DATABASE_URL=postgresql://postgres:postgres@localhost:5432/postgres

# 4. Run scripts or start the API manually
python scripts/your_script.py
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Notes:

  • Requires a running Postgres instance with pgvector enabled
  • Only needed for local development outside Docker

✅ Result

  • Docker-first setup is simple, cross-platform, and fully initialized
  • Optional sections give control for resets, logs, or running scripts manually

Frontend Layer (Non-Core)

Note: The frontend serves as the project's web presence and as a testing demo. It is not central to the project itself.

Prerequisites

  • Node.js 18+
  • npm or yarn

Setup Instructions

# 1. Navigate to frontend directory
cd frontend-demo

# 2. Install dependencies
npm install

# 3. Start development server
npm run dev

# 4. Open in browser
# The frontend runs at http://localhost:3000

🤝 Contributing

High-impact contribution areas:

  • Ingestion & indexing pipelines
  • Retrieval quality & evaluation
  • Chunking strategies
  • API design & refactoring
  • Performance & scaling
  • Documentation & examples

Frontend contributions are welcome but considered non-core.

See CONTRIBUTING.md for details.


📄 License

License: MIT
