N3XT3R1337/ragpilot


RAGPilot is a production-grade Retrieval-Augmented Generation platform built from scratch, with no dependency on an external vector database. It implements a custom HNSW index, hybrid BM25 + vector search combined through reciprocal rank fusion, five document chunking strategies, and an evaluation framework, all written in Python and TypeScript.




Metric               Value
Endpoints            45+
Components           40+
Chunking Strategies  5
Search Modes         3
Eval Metrics         5
Lines of Code        8500+



Features

Document Processing

  • Multi-format ingestion supporting PDF, DOCX, HTML, and Markdown
  • Automatic content extraction with metadata preservation
  • Batch upload and processing pipeline
  • Document versioning and collection management

Chunking Strategies

  • Fixed-size chunking with configurable character windows
  • Sentence-aware chunking that preserves linguistic boundaries
  • Paragraph-based chunking for structurally coherent segments
  • Semantic chunking using embedding similarity thresholds
  • Recursive chunking with hierarchical separator fallback
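
As an illustration of the first and last strategies in the list above, here is a minimal sketch of fixed-size chunking with overlap and of recursive chunking with separator fallback. The function names, the default separators, and the `overlap` parameter are assumptions made for this example and are not taken from `chunking_service.py`.

```python
from typing import List, Sequence


def fixed_size_chunks(text: str, size: int = 200, overlap: int = 20) -> List[str]:
    """Cut text into character windows of `size`, each sharing `overlap` chars with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    chunks: List[str] = []
    start = 0
    step = size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):  # the window already reached the end
            break
        start += step
    return chunks


def recursive_chunks(
    text: str,
    max_chars: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),  # hypothetical hierarchy
) -> List[str]:
    """Split on the coarsest separator first and fall back to finer ones for oversized pieces."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: fall back to hard character windows.
        return fixed_size_chunks(text, max_chars, 0)
    sep, finer = separators[0], separators[1:]
    pieces = text.split(sep)
    if len(pieces) == 1:
        return recursive_chunks(text, max_chars, finer)
    chunks: List[str] = []
    buffer = ""
    for piece in pieces:
        candidate = piece if not buffer else buffer + sep + piece
        if len(candidate) <= max_chars:
            buffer = candidate  # keep merging small pieces
        else:
            if buffer:
                chunks.extend(recursive_chunks(buffer, max_chars, finer))
            buffer = piece
    if buffer:
        chunks.extend(recursive_chunks(buffer, max_chars, finer))
    return chunks
```

Merging adjacent pieces until the limit is reached keeps chunks close to `max_chars` rather than producing one chunk per paragraph or sentence.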

Vector Store

  • Custom HNSW (Hierarchical Navigable Small World) approximate nearest neighbor index
  • Exact KNN brute-force search for baseline comparison
  • Rich metadata filtering with operators: $gt, $gte, $lt, $lte, $ne, $in, $nin
  • Cosine similarity scoring with normalized vectors
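
The exact KNN baseline and the metadata operators listed above can be sketched with NumPy as follows. This is an illustration under assumptions, not the code in `vector_store.py`: the function names, the filter layout (`{"field": {"$op": value}}`), and the rule that a missing field fails every operator are all choices made for the example.

```python
from typing import Any, Dict, List, Optional, Tuple

import numpy as np

# Operator table mirroring the operators named in the feature list.
_OPS = {
    "$gt": lambda v, t: v > t,
    "$gte": lambda v, t: v >= t,
    "$lt": lambda v, t: v < t,
    "$lte": lambda v, t: v <= t,
    "$ne": lambda v, t: v != t,
    "$in": lambda v, t: v in t,
    "$nin": lambda v, t: v not in t,
}


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Return True when every condition in `flt` holds (assumed AND semantics)."""
    for field, cond in flt.items():
        value = metadata.get(field)
        if isinstance(cond, dict):
            for op, target in cond.items():
                # Assumption: a missing field never satisfies an operator.
                if value is None or not _OPS[op](value, target):
                    return False
        elif value != cond:  # a bare value means equality
            return False
    return True


def normalize(vectors: Any) -> np.ndarray:
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    arr = np.asarray(vectors, dtype=np.float64)
    norms = np.linalg.norm(arr, axis=-1, keepdims=True)
    return arr / np.where(norms == 0, 1.0, norms)


def exact_knn(
    query: Any,
    vectors: Any,
    metadata: List[Dict[str, Any]],
    k: int = 3,
    flt: Optional[Dict[str, Any]] = None,
) -> List[Tuple[int, float]]:
    """Brute-force search: score every vector, filter by metadata, keep the top k."""
    scores = normalize(vectors) @ normalize(query)
    candidates = [i for i, m in enumerate(metadata) if flt is None or matches(m, flt)]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    return [(i, float(scores[i])) for i in candidates[:k]]
```

Because the brute-force search is exact, its output can serve as ground truth when measuring how many true neighbors an approximate index such as HNSW returns.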

Hybrid Search

  • BM25 sparse retrieval (TF-IDF-style weighting with term-frequency saturation and document-length normalization)
  • Dense vector search via custom HNSW or exact KNN
  • Reciprocal Rank Fusion (RRF) for combining sparse and dense results
  • Configurable alpha blending between search modalities
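
Reciprocal Rank Fusion combines ranked lists using only rank positions, so BM25 scores and cosine similarities never need to share a scale. The sketch below shows the standard formula, where each list adds `weight / (k + rank)` to a document's score. Expressing the alpha blend as per-list weights `(1 - alpha, alpha)` is an assumption made for this example. The README does not say how RAGPilot applies alpha.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    k: int = 60,  # commonly used smoothing constant for RRF
    weights: Optional[Sequence[float]] = None,
) -> List[Tuple[str, float]]:
    """Fuse ranked lists of document ids, best first in each list."""
    if weights is None:
        weights = [1.0] * len(rankings)
    scores: Dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def hybrid_fuse(sparse: Sequence[str], dense: Sequence[str], alpha: float = 0.5):
    """Hypothetical alpha blend: alpha weights the dense list, 1 - alpha the sparse list."""
    return reciprocal_rank_fusion([sparse, dense], weights=[1.0 - alpha, alpha])
```

A document that appears near the top of both lists outranks one that is first in a single list only, which is the behavior that makes RRF a reasonable default for hybrid search.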

Re-ranking

  • Cross-encoder simulation for fine-grained relevance scoring
  • Maximal Marginal Relevance (MMR) for result diversification
  • Cohere-style re-ranking simulation
  • Configurable top-k post-rerank selection
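
Maximal Marginal Relevance picks results one at a time, trading relevance to the query against similarity to results already chosen. The following is a minimal sketch with NumPy. The name `mmr` and the parameters `lambda_mult` and `top_k` are illustrative and are not taken from `reranker.py`.

```python
from typing import Any, List

import numpy as np


def _unit(vectors: Any) -> np.ndarray:
    """Normalize vectors to unit length (zero vectors are left unchanged)."""
    arr = np.asarray(vectors, dtype=np.float64)
    norms = np.linalg.norm(arr, axis=-1, keepdims=True)
    return arr / np.where(norms == 0, 1.0, norms)


def mmr(query_vec: Any, doc_vecs: Any, top_k: int = 3, lambda_mult: float = 0.5) -> List[int]:
    """Return indices of selected documents in selection order.

    lambda_mult = 1.0 ranks purely by relevance; lower values favor diversity.
    """
    docs = _unit(doc_vecs)
    relevance = docs @ _unit(query_vec)  # cosine similarity to the query
    similarity = docs @ docs.T           # pairwise document similarity
    selected: List[int] = []
    remaining = list(range(len(docs)))

    def mmr_score(i: int) -> float:
        # Penalize the candidate by its closest already-selected neighbor.
        redundancy = max(similarity[i][j] for j in selected) if selected else 0.0
        return lambda_mult * relevance[i] - (1.0 - lambda_mult) * redundancy

    while remaining and len(selected) < top_k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With two near-duplicate chunks and one distinct chunk, a balanced `lambda_mult` tends to return one of the duplicates plus the distinct chunk instead of both duplicates.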

LLM Integration

  • Pluggable prompt templates with variable substitution
  • Streaming response generation
  • Source citation with chunk-level attribution
  • Context window management with automatic truncation
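
The template, citation, and truncation features above fit together roughly as in the sketch below. It uses Python's standard `string.Template` for variable substitution and a character budget as a stand-in for token counting. The prompt wording, the chunk dictionary keys (`source`, `text`), and the function names are assumptions for illustration only.

```python
from string import Template
from typing import Dict, List

# Hypothetical prompt template; $context and $question are substituted at call time.
PROMPT = Template(
    "Answer using only the context below.\n\n"
    "Context:\n$context\n\n"
    "Question: $question\n"
    "Cite sources with their bracketed numbers."
)


def build_context(chunks: List[Dict[str, str]], max_chars: int) -> str:
    """Number each chunk for citation and stop before the budget is exceeded."""
    lines: List[str] = []
    used = 0
    for n, chunk in enumerate(chunks, start=1):
        entry = f"[{n}] ({chunk['source']}) {chunk['text']}"
        if used + len(entry) > max_chars:
            break  # truncate: drop this chunk and everything ranked below it
        lines.append(entry)
        used += len(entry) + 1  # +1 for the joining newline
    return "\n".join(lines)


def build_prompt(question: str, chunks: List[Dict[str, str]], max_chars: int = 2000) -> str:
    """Fill the template with the truncated, numbered context and the question."""
    context = build_context(chunks, max_chars)
    return PROMPT.substitute(context=context, question=question)
```

Dropping whole chunks from the tail keeps every citation number attached to complete text. This assumes the chunks arrive ordered from most to least relevant.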

Conversation Memory

  • Sliding window memory for recent message retention
  • Summary-based compression for long conversation histories
  • Hybrid window + summary strategy
  • Per-conversation isolation with session management
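
A hybrid window + summary memory can be sketched as a bounded queue of recent messages plus a running summary of whatever falls out of the window. In the example below the class name, the `naive_summarizer` stand-in, and the `memory_for` session helper are hypothetical. A real summarizer would most likely call the LLM service rather than truncate text.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)


def naive_summarizer(previous: str, evicted: Message) -> str:
    """Stand-in for an LLM summary: append a truncated note about the evicted message."""
    role, text = evicted
    note = f"{role}: {text[:40]}"
    return f"{previous} | {note}" if previous else note


@dataclass
class HybridMemory:
    window_size: int = 4
    summarize: Callable[[str, Message], str] = naive_summarizer
    summary: str = ""
    recent: Deque[Message] = field(default_factory=deque)

    def add(self, role: str, text: str) -> None:
        """Store a message; fold anything that leaves the window into the summary."""
        self.recent.append((role, text))
        while len(self.recent) > self.window_size:
            self.summary = self.summarize(self.summary, self.recent.popleft())

    def context(self) -> List[Message]:
        """Messages to send to the model: summary first, then the recent window."""
        messages: List[Message] = []
        if self.summary:
            messages.append(("system", f"Summary of earlier turns: {self.summary}"))
        messages.extend(self.recent)
        return messages


# Per-conversation isolation: one memory object per session id.
_sessions: Dict[str, HybridMemory] = {}


def memory_for(session_id: str) -> HybridMemory:
    if session_id not in _sessions:
        _sessions[session_id] = HybridMemory()
    return _sessions[session_id]
```

Passing no summarizer-driven eviction (a very large `window_size`) yields a pure sliding window over the whole conversation, while a small window leans on the summary.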

Evaluation Framework

  • Faithfulness scoring for hallucination detection
  • Relevance measurement between queries and retrieved context
  • Correctness comparison against ground-truth answers
  • Precision and Recall metrics for retrieval quality
  • Batch evaluation runs with aggregate reporting
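
Of the metrics above, retrieval precision and recall have standard set-based definitions, sketched below together with a simple mean aggregate for a batch run. The function names and the chunk-id inputs are assumptions. The README does not specify how RAGPilot computes or aggregates these metrics, and the LLM-judged metrics (faithfulness, relevance, correctness) are not covered here.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant, both over unique ids."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def aggregate(runs: Sequence[Tuple[Sequence[str], Sequence[str]]]) -> Dict[str, float]:
    """Average precision and recall over (retrieved, relevant) pairs in a batch."""
    if not runs:
        return {"precision": 0.0, "recall": 0.0}
    pairs: List[Tuple[float, float]] = [precision_recall(r, g) for r, g in runs]
    return {
        "precision": sum(p for p, _ in pairs) / len(pairs),
        "recall": sum(r for _, r in pairs) / len(pairs),
    }
```

For example, retrieving four chunks of which two are among three relevant ones gives a precision of 0.5 and a recall of 2/3.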

Architecture

System Architecture Diagram
graph TB
    User[User] --> Frontend[Frontend - Vue.js SPA]
    Frontend --> API[API Gateway - FastAPI]

    API --> DocService[Document Service]
    API --> SearchService[Search Service]
    API --> ChatService[Chat Service]
    API --> EvalService[Eval Service]

    DocService --> Chunking[Chunking Engine]
    Chunking --> Embedding[Embedding Service]
    Embedding --> VectorStore[(Vector Store)]

    SearchService --> VectorStore
    SearchService --> BM25[BM25 Index]
    SearchService --> Reranker[Re-ranker]
    BM25 --> Reranker

    ChatService --> Memory[Conversation Memory]
    ChatService --> LLM[LLM Service]
    ChatService --> SearchService

    style User fill:#667eea,color:#fff
    style Frontend fill:#764ba2,color:#fff
    style API fill:#667eea,color:#fff
    style DocService fill:#7c3aed,color:#fff
    style SearchService fill:#7c3aed,color:#fff
    style ChatService fill:#7c3aed,color:#fff
    style EvalService fill:#7c3aed,color:#fff
    style Chunking fill:#6d28d9,color:#fff
    style Embedding fill:#6d28d9,color:#fff
    style VectorStore fill:#5b21b6,color:#fff
    style BM25 fill:#5b21b6,color:#fff
    style Reranker fill:#6d28d9,color:#fff
    style Memory fill:#6d28d9,color:#fff
    style LLM fill:#6d28d9,color:#fff

Quick Start

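# Set up environment variables (run from the root of the cloned repository)
cp .env.example .env

# Backend (terminal 1; the server keeps running in the foreground)
cd backend && pip install -r requirements.txt && uvicorn app.main:app --reload

# Frontend (terminal 2, starting again from the repository root)
cd frontend && npm install && npm run dev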

Docker Quick Start

docker-compose up --build

The frontend will be available at http://localhost:3000 and the API at http://localhost:8000.


API Overview

Group        Endpoints                                                   Methods
Documents    /documents/upload, /documents/{id}, /documents/list         POST, GET, DELETE
Collections  /collections, /collections/{id}, /collections/{id}/stats    POST, GET, PUT, DELETE
Search       /search, /search/hybrid, /search/rerank                     POST
Chat         /chat, /chat/stream, /chat/history/{id}                     POST, GET
Evaluation   /eval/run, /eval/results/{id}                               POST, GET
System       /health, /stats                                             GET
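
As a quick way to check that the API is reachable, the sketch below calls the `GET` endpoints in the System group using only the Python standard library. It assumes the API is served at http://localhost:8000 with no path prefix and that the endpoints return JSON, which is FastAPI's default but is not stated in this README. Request bodies for the `POST` endpoints are not documented here, so none are shown.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "http://localhost:8000"  # API address given in the Docker Quick Start


def api_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str, timeout: float = 5.0) -> Any:
    """Send a GET request and decode the response body as JSON (assumed format)."""
    with urllib.request.urlopen(api_url(path), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the backend running:
#   print(get_json("/health"))
#   print(get_json("/stats"))
```

FastAPI applications also serve interactive documentation at `/docs` unless it has been disabled, which is the place to look for the exact request and response schemas.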

Project Structure

ragpilot/
├── backend/
│   ├── app/
│   │   ├── main.py
│   │   ├── config.py
│   │   ├── models.py
│   │   ├── routers/
│   │   │   ├── chat.py
│   │   │   ├── collections.py
│   │   │   ├── documents.py
│   │   │   ├── eval.py
│   │   │   └── search.py
│   │   └── services/
│   │       ├── chunking_service.py
│   │       ├── document_service.py
│   │       ├── embedding_service.py
│   │       ├── llm_service.py
│   │       ├── reranker.py
│   │       └── vector_store.py
│   ├── tests/
│   │   ├── test_chunking.py
│   │   ├── test_vector_store.py
│   │   └── test_eval.py
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   ├── views/
│   │   ├── stores/
│   │   ├── services/
│   │   └── router/
│   ├── Dockerfile
│   ├── nginx.conf
│   └── package.json
├── docker-compose.yml
├── .env.example
├── .editorconfig
├── .gitignore
├── LICENSE
└── README.md

Tech Stack

Backend

Python, FastAPI, Pydantic, NumPy, SciPy, Uvicorn

Frontend

Vue.js, TypeScript, Vite, Pinia, Chart.js, Tailwind

Infrastructure

Docker, Nginx, pytest



Built with ❤️ by panaceya
