GitHub - N3XT3R1337/ragpilot: RAG platform with document ingestion, chunking strategies, vector search, re-ranking, conversational memory, and evaluation framework

A production-grade Retrieval-Augmented Generation platform built from scratch with no external vector database dependency. RAGPilot implements custom HNSW indexing, hybrid BM25+vector search with reciprocal rank fusion, multi-strategy document chunking, and a full evaluation framework, all in pure Python and TypeScript.

Metric	Value
Endpoints	45+
Components	40+
Chunking Strategies	5
Search Modes	3
Eval Metrics	5
Lines of Code	8500+

Features

Document Processing

Multi-format ingestion supporting PDF, DOCX, HTML, and Markdown
Automatic content extraction with metadata preservation
Batch upload and processing pipeline
Document versioning and collection management

Chunking Strategies

Fixed-size chunking with configurable character windows
Sentence-aware chunking that preserves linguistic boundaries
Paragraph-based chunking for structurally coherent segments
Semantic chunking using embedding similarity thresholds
Recursive chunking with hierarchical separator fallback
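
To illustrate the last strategy, here is a minimal sketch of recursive chunking with separator fallback. The function name `recursive_chunk`, the default separator order, and the `max_chars` parameter are assumptions made for this example and may not match the code in `chunking_service.py`.

```python
from typing import List, Tuple


def recursive_chunk(
    text: str,
    max_chars: int = 200,
    separators: Tuple[str, ...] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    """Split text into chunks of at most max_chars characters.

    Tries the coarsest separator first and falls back to finer ones
    only for pieces that are still too long.
    """
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard character-window split.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    buffer = ""
    for piece in text.split(sep):
        candidate = f"{buffer}{sep}{piece}" if buffer else piece
        if len(candidate) <= max_chars:
            buffer = candidate  # keep merging small pieces
            continue
        if buffer:
            chunks.append(buffer)
        if len(piece) <= max_chars:
            buffer = piece
        else:
            # The piece alone is too long: retry with finer separators.
            chunks.extend(recursive_chunk(piece, max_chars, finer))
            buffer = ""
    if buffer:
        chunks.append(buffer)
    return chunks
```

In this sketch the separator at a chunk boundary is dropped, so joining the chunks does not reproduce the input byte for byte.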

Vector Store

Custom HNSW (Hierarchical Navigable Small World) approximate nearest neighbor index
Exact KNN brute-force search for baseline comparison
Rich metadata filtering with operators: $gt, $gte, $lt, $lte, $ne, $in, $nin
Cosine similarity scoring with normalized vectors
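
The exact KNN baseline and the filter operators listed above can be sketched as follows. This is an illustration only: the filter shape (`{"field": {"$op": value}}`), the helper names, and the rule that a missing field never matches are assumptions, not the repository's actual API.

```python
import math
from typing import Any, Dict, List, Optional, Sequence, Tuple

# One predicate per operator named in the feature list.
OPS = {
    "$gt": lambda v, x: v > x,
    "$gte": lambda v, x: v >= x,
    "$lt": lambda v, x: v < x,
    "$lte": lambda v, x: v <= x,
    "$ne": lambda v, x: v != x,
    "$in": lambda v, x: v in x,
    "$nin": lambda v, x: v not in x,
}


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Return True if metadata satisfies every condition in flt."""
    for field, cond in flt.items():
        if field not in metadata:
            return False  # assumption: a missing field never matches
        value = metadata[field]
        if isinstance(cond, dict):
            if not all(OPS[op](value, operand) for op, operand in cond.items()):
                return False
        elif value != cond:  # a bare value means equality
            return False
    return True


def normalize(vec: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def exact_knn(
    query: Sequence[float],
    items: List[Tuple[str, Sequence[float], Dict[str, Any]]],
    k: int = 3,
    flt: Optional[Dict[str, Any]] = None,
) -> List[Tuple[str, float]]:
    """Brute-force search: cosine similarity is the dot product of unit vectors."""
    q = normalize(query)
    scored = []
    for item_id, vec, meta in items:
        if flt and not matches(meta, flt):
            continue
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        scored.append((item_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A brute-force scan like this is linear in the number of stored vectors, which is why it serves as a correctness baseline for an approximate index rather than as the default search path.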

Hybrid Search

BM25 sparse retrieval with TF-IDF weighting
Dense vector search via custom HNSW or exact KNN
Reciprocal Rank Fusion (RRF) for combining sparse and dense results
Configurable alpha blending between search modalities
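
As a rough sketch of how Reciprocal Rank Fusion can combine the two result lists: each document receives `1 / (k + rank)` from every ranking it appears in, and the contributions are summed. How RAGPilot applies `alpha` is not documented here, so weighting the dense list by `alpha` and the sparse list by `1 - alpha` is an assumption of this example, as are the function name and the constant `k = 60`.

```python
from typing import Dict, List


def rrf_fuse(
    sparse: List[str],
    dense: List[str],
    k: int = 60,
    alpha: float = 0.5,
) -> List[str]:
    """Fuse two ranked lists of document IDs with weighted RRF.

    alpha = 1.0 uses only the dense ranking, alpha = 0.0 only the sparse one.
    """
    scores: Dict[str, float] = {}
    for weight, ranking in ((1.0 - alpha, sparse), (alpha, dense)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because RRF works on ranks rather than raw scores, the BM25 and cosine scores do not need to be normalized to a common scale before fusion.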

Re-ranking

Cross-encoder simulation for fine-grained relevance scoring
Maximal Marginal Relevance (MMR) for result diversification
Cohere-style re-ranking simulation
Configurable top-k post-rerank selection
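
Maximal Marginal Relevance can be sketched as a greedy loop that trades relevance to the query against similarity to results already selected. The names `mmr` and `lambda_` and the dictionary-of-vectors input are assumptions for illustration; the repository's `reranker.py` may be structured differently.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr(
    query_vec: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_k: int = 3,
    lambda_: float = 0.7,
) -> List[str]:
    """Greedy MMR: lambda_ = 1.0 is pure relevance, lower values favor diversity."""
    selected: List[str] = []
    remaining = dict(candidates)

    def score(doc_id: str) -> float:
        relevance = cosine(query_vec, candidates[doc_id])
        # Penalize similarity to the closest already-selected result.
        redundancy = max(
            (cosine(candidates[doc_id], candidates[s]) for s in selected),
            default=0.0,
        )
        return lambda_ * relevance - (1.0 - lambda_) * redundancy

    while remaining and len(selected) < top_k:
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

With a near-duplicate pair in the candidate set, a lower `lambda_` lets a less relevant but different result displace the duplicate.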

LLM Integration

Pluggable prompt templates with variable substitution
Streaming response generation
Source citation with chunk-level attribution
Context window management with automatic truncation
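
Template substitution, chunk-level citation markers, and context truncation can be combined along these lines. This sketch uses the standard library's `string.Template` and a character budget; the template text, the `build_prompt` name, and the use of characters instead of tokens are all assumptions made for the example.

```python
from string import Template
from typing import List

# Hypothetical template; $context and $question are substituted at call time.
PROMPT = Template(
    "Answer using only the context below and cite sources as [n].\n\n"
    "Context:\n$context\n\n"
    "Question: $question"
)


def build_prompt(question: str, chunks: List[str], max_context_chars: int = 200) -> str:
    """Number each chunk for citation and stop adding chunks once the budget is spent."""
    kept: List[str] = []
    used = 0
    for index, chunk in enumerate(chunks, start=1):
        entry = f"[{index}] {chunk}"
        if used + len(entry) > max_context_chars:
            break  # truncate: drop this and all lower-ranked chunks
        kept.append(entry)
        used += len(entry)
    return PROMPT.substitute(context="\n".join(kept), question=question)
```

A real deployment would more likely budget in model tokens than in characters, since the context limit is defined in tokens.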

Conversation Memory

Sliding window memory for recent message retention
Summary-based compression for long conversation histories
Hybrid window + summary strategy
Per-conversation isolation with session management
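
The hybrid window + summary strategy might look like the following sketch: recent messages stay verbatim in a sliding window, and each message that falls out of the window is folded into a running summary. The class name `HybridMemory`, the `get_memory` helper, and the placeholder `naive_summarizer` (which only truncates text, where a real system would presumably call the LLM) are assumptions for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)


def naive_summarizer(previous: str, evicted: Message) -> str:
    """Placeholder summarizer: appends a truncated copy of the evicted message."""
    role, text = evicted
    return f"{previous} {role}: {text[:40]}".strip()


class HybridMemory:
    def __init__(
        self,
        window: int = 4,
        summarizer: Callable[[str, Message], str] = naive_summarizer,
    ) -> None:
        self.window = window
        self.summarizer = summarizer
        self.recent: Deque[Message] = deque()
        self.summary = ""

    def add(self, role: str, text: str) -> None:
        self.recent.append((role, text))
        # Fold messages that leave the window into the running summary.
        while len(self.recent) > self.window:
            self.summary = self.summarizer(self.summary, self.recent.popleft())

    def context(self) -> List[Message]:
        prefix = [("system", f"Summary so far: {self.summary}")] if self.summary else []
        return prefix + list(self.recent)


# Per-conversation isolation: one memory object per session ID.
_sessions: Dict[str, HybridMemory] = {}


def get_memory(session_id: str) -> HybridMemory:
    if session_id not in _sessions:
        _sessions[session_id] = HybridMemory()
    return _sessions[session_id]
```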

Evaluation Framework

Faithfulness scoring for hallucination detection
Relevance measurement between queries and retrieved context
Correctness comparison against ground-truth answers
Precision and Recall metrics for retrieval quality
Batch evaluation runs with aggregate reporting
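
For the retrieval metrics, precision is the fraction of retrieved chunks that are relevant and recall is the fraction of relevant chunks that were retrieved. A minimal sketch with a batch average follows; the function names and the `(retrieved, relevant)` input shape are assumptions, not the interface of the repository's evaluation service.

```python
from typing import Dict, Iterable, List, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Set-based precision and recall over chunk IDs."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def evaluate_batch(runs: List[Tuple[List[str], List[str]]]) -> Dict[str, float]:
    """Average precision and recall over (retrieved, relevant) pairs."""
    if not runs:
        return {"precision": 0.0, "recall": 0.0}
    scores = [precision_recall(retrieved, relevant) for retrieved, relevant in runs]
    return {
        "precision": sum(p for p, _ in scores) / len(scores),
        "recall": sum(r for _, r in scores) / len(scores),
    }
```

For example, retrieving `c1..c4` when `c2`, `c4`, and `c9` are relevant gives 2 hits, so precision is 2/4 and recall is 2/3.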

Architecture

System Architecture Diagram

graph TB
    User[User] --> Frontend[Frontend - Vue.js SPA]
    Frontend --> API[API Gateway - FastAPI]

    API --> DocService[Document Service]
    API --> SearchService[Search Service]
    API --> ChatService[Chat Service]
    API --> EvalService[Eval Service]

    DocService --> Chunking[Chunking Engine]
    Chunking --> Embedding[Embedding Service]
    Embedding --> VectorStore[(Vector Store)]

    SearchService --> VectorStore
    SearchService --> BM25[BM25 Index]
    SearchService --> Reranker[Re-ranker]
    BM25 --> Reranker

    ChatService --> Memory[Conversation Memory]
    ChatService --> LLM[LLM Service]
    ChatService --> SearchService

    style User fill:#667eea,color:#fff
    style Frontend fill:#764ba2,color:#fff
    style API fill:#667eea,color:#fff
    style DocService fill:#7c3aed,color:#fff
    style SearchService fill:#7c3aed,color:#fff
    style ChatService fill:#7c3aed,color:#fff
    style EvalService fill:#7c3aed,color:#fff
    style Chunking fill:#6d28d9,color:#fff
    style Embedding fill:#6d28d9,color:#fff
    style VectorStore fill:#5b21b6,color:#fff
    style BM25 fill:#5b21b6,color:#fff
    style Reranker fill:#6d28d9,color:#fff
    style Memory fill:#6d28d9,color:#fff
    style LLM fill:#6d28d9,color:#fff

Quick Start

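# Setup (run from the repository root after cloning)
cp .env.example .env

# Backend (terminal 1, from the repository root)
cd backend && pip install -r requirements.txt && uvicorn app.main:app --reload

# Frontend (terminal 2, from the repository root)
cd frontend && npm install && npm run dev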

Docker Quick Start

docker-compose up --build

The frontend will be available at http://localhost:3000 and the API at http://localhost:8000.

API Overview

Group	Endpoints
Documents	`/documents/upload`, `/documents/{id}`, `/documents/list`
Collections	`/collections`, `/collections/{id}`, `/collections/{id}/stats`
Search	`/search`, `/search/hybrid`, `/search/rerank`
Chat	`/chat`, `/chat/stream`, `/chat/history/{id}`
Evaluation	`/eval/run`, `/eval/results/{id}`
System	`/health`, `/stats`

Project Structure

ragpilot/
├── backend/
│   ├── app/
│   │   ├── main.py
│   │   ├── config.py
│   │   ├── models.py
│   │   ├── routers/
│   │   │   ├── chat.py
│   │   │   ├── collections.py
│   │   │   ├── documents.py
│   │   │   ├── eval.py
│   │   │   └── search.py
│   │   └── services/
│   │       ├── chunking_service.py
│   │       ├── document_service.py
│   │       ├── embedding_service.py
│   │       ├── llm_service.py
│   │       ├── reranker.py
│   │       └── vector_store.py
│   ├── tests/
│   │   ├── test_chunking.py
│   │   ├── test_vector_store.py
│   │   └── test_eval.py
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   ├── views/
│   │   ├── stores/
│   │   ├── services/
│   │   └── router/
│   ├── Dockerfile
│   ├── nginx.conf
│   └── package.json
├── docker-compose.yml
├── .env.example
├── .editorconfig
├── .gitignore
├── LICENSE
└── README.md
