MeVe Framework

Python 3.13+ · License: MIT

Multi-phase Efficient Vector Retrieval - A 5-phase RAG pipeline that optimizes context selection through progressive filtering and intelligent budgeting.

Unlike simple vector search, MeVe combines vector similarity, cross-encoder verification, BM25 fallback, MMR deduplication, and token budgeting to deliver high-quality, budget-aware context for LLMs.


🎯 Why MeVe?

Traditional RAG systems often:

  • Return irrelevant chunks despite high similarity scores
  • Waste tokens on redundant information
  • Fail silently when vector search underperforms
  • Ignore token budget constraints

MeVe solves these problems with a smart 5-phase pipeline:

✅ Quality First - Cross-encoder verification ensures relevance
✅ Adaptive Fallback - BM25 backup when vector search fails
✅ Zero Redundancy - MMR-based deduplication
✅ Budget Aware - Greedy token packing within limits
✅ Production Ready - Tested on the HotpotQA dataset


🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/nakulbh/Meve-framework.git
cd Meve-framework

# Install with uv (recommended)
uv pip install -e .

# Or with pip
pip install -e .

Basic Usage

from meve import MeVeEngine, MeVeConfig, ContextChunk
from meve.services.vector_db_client import VectorDBClient

# 1. Prepare your data
chunks = [
    ContextChunk("The Eiffel Tower is in Paris, France.", "doc1"),
    ContextChunk("Paris is the capital of France.", "doc2"),
    ContextChunk("The Louvre Museum is in Paris.", "doc3"),
]

# 2. Initialize ChromaDB (default vector database)
vector_db = VectorDBClient(
    chunks=chunks,
    collection_name="my_collection",
    is_persistent=False  # In-memory for quick testing
)

# 3. Configure the pipeline
config = MeVeConfig(
    k_init=10,           # Initial candidates from vector search
    tau_relevance=0.5,   # Relevance threshold (0-1)
    n_min=3,             # Min chunks to avoid fallback
    theta_redundancy=0.85,  # Similarity threshold for deduplication
    t_max=512            # Maximum token budget
)

# 4. Initialize engine with ChromaDB
engine = MeVeEngine(config=config, vector_db_client=vector_db)

# 5. Retrieve context
context = engine.run("Where is the Eiffel Tower?")
print(context)

ChromaDB Integration

MeVe uses ChromaDB as its default vector database. Three ways to use it:

# Option 1: In-memory (fastest, temporary)
vector_db = VectorDBClient(chunks=chunks, is_persistent=False)

# Option 2: Persistent storage (production)
vector_db = VectorDBClient(chunks=chunks, is_persistent=True)

# Option 3: Load existing collection
vector_db = VectorDBClient(
    collection_name="my_collection",
    is_persistent=True,
    load_existing=True  # No re-embedding needed!
)

Quick start: python examples/quickstart_chromadb.py
Full guide: See docs/chromadb_guide.md

Run with HotpotQA Data

# Download HotpotQA dataset
make download-data

# Run with real data (loads 50 examples by default)
make run

# Or run the basic example
make example

📊 Pipeline Architecture

Query → kNN Search → Verification → [Fallback?] → Deduplication → Budgeting → Context
         ↓              ↓              ↓              ↓              ↓
      Phase 1        Phase 2        Phase 3        Phase 4        Phase 5
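The control flow above, including the conditional fallback, can be sketched in a few lines. This is a minimal illustration of the orchestration logic only; the phase functions here are hypothetical stand-ins, not MeVe's actual API:

```python
def run_pipeline(query, knn, verify, bm25_fallback, dedupe, pack, n_min=3):
    """Chain the five phases; Phase 3 only fires when verification
    leaves fewer than n_min chunks."""
    candidates = knn(query)               # Phase 1: vector search
    verified = verify(query, candidates)  # Phase 2: cross-encoder filter
    if len(verified) < n_min:             # Phase 3: conditional BM25 fallback
        verified += bm25_fallback(query)
    unique = dedupe(verified)             # Phase 4: MMR deduplication
    return pack(unique)                   # Phase 5: token budgeting

# Toy stand-ins to exercise the control flow:
result = run_pipeline(
    "q",
    knn=lambda q: ["a", "b", "c", "c"],
    verify=lambda q, c: [x for x in c if x != "b"],  # drops "b"
    bm25_fallback=lambda q: ["d"],
    dedupe=lambda xs: list(dict.fromkeys(xs)),  # order-preserving dedup
    pack=lambda xs: xs[:2],
)
print(result)  # three chunks pass verification, so no fallback fires
```

Because three chunks survive verification here, the BM25 fallback never runs; shrink the verified set below `n_min` and it would.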

The 5 Phases

  1. Phase 1 (kNN) - Vector similarity search via ChromaDB.
     Returns the top k_init candidates using all-MiniLM-L6-v2 embeddings.

  2. Phase 2 (Verification) - Cross-encoder re-ranking.
     Uses cross-encoder/ms-marco-MiniLM-L-6-v2 to filter candidates by the tau_relevance threshold.

  3. Phase 3 (Fallback) - Conditional BM25 retrieval.
     Triggers only when fewer than n_min chunks pass verification; supplements results with lexical search.

  4. Phase 4 (Prioritization) - MMR-based deduplication.
     Removes redundant chunks using the theta_redundancy similarity threshold.

  5. Phase 5 (Budgeting) - Greedy token packing.
     Fits the highest-ranked chunks within the t_max budget using the GPT-2 tokenizer.
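The redundancy filter in Phase 4 can be sketched as a greedy pass that keeps a chunk only if it is sufficiently dissimilar to everything already kept. This is a minimal illustration, not the framework's implementation; the embeddings are hand-made stand-ins:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def deduplicate(chunks, embeddings, theta_redundancy=0.85):
    """Greedily keep chunks whose similarity to every already-kept
    chunk stays below theta_redundancy (MMR-style redundancy filter)."""
    kept, kept_vecs = [], []
    for chunk, vec in zip(chunks, embeddings):
        if all(cosine(vec, kv) < theta_redundancy for kv in kept_vecs):
            kept.append(chunk)
            kept_vecs.append(vec)
    return kept

# Two near-identical vectors and one orthogonal vector:
chunks = ["paris tower", "paris tower copy", "cheese"]
vecs = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
print(deduplicate(chunks, vecs))  # the near-duplicate is dropped
```

Raising theta_redundancy toward 1.0 keeps more near-duplicates; lowering it prunes more aggressively.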


βš™οΈ Configuration

Hyperparameters

Parameter          Type    Default  Description
k_init             int     10       Initial candidates from vector search
tau_relevance      float   0.5      Cross-encoder threshold (0-1)
n_min              int     3        Min verified chunks to skip fallback
theta_redundancy   float   0.85     Similarity threshold for deduplication
t_max              int     512      Maximum token budget

Example Configurations

# Development - Fast iteration
config = MeVeConfig(k_init=5, tau_relevance=0.3, n_min=2, t_max=256)

# Production - Quality focus
config = MeVeConfig(k_init=20, tau_relevance=0.6, n_min=5, t_max=1024)

# Tight budget - Minimal tokens
config = MeVeConfig(k_init=10, tau_relevance=0.7, n_min=2, t_max=128)
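The effect of t_max can be illustrated with a minimal greedy-packing sketch. Whitespace splitting stands in for the GPT-2 tokenizer the real Phase 5 uses, so the counts here are illustrative only:

```python
def pack_greedy(ranked_chunks, t_max):
    """Walk chunks in relevance order, adding each one whose token
    count still fits in the remaining budget (greedy packing)."""
    selected, used = [], 0
    for text in ranked_chunks:
        n_tokens = len(text.split())  # stand-in for a GPT-2 tokenizer
        if used + n_tokens <= t_max:
            selected.append(text)
            used += n_tokens
    return selected, used

chunks = [
    "The Eiffel Tower is in Paris",    # 6 tokens
    "Paris is the capital of France",  # 6 tokens
    "The Louvre Museum is in Paris",   # 6 tokens
]
ctx, used = pack_greedy(chunks, t_max=13)
print(len(ctx), used)  # the first two chunks fit; the third would exceed 13
```

Note that greedy packing may skip a large chunk and still admit a smaller, lower-ranked one that fits the remaining budget.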

πŸ› οΈ Development

Setup

# Install development dependencies
make install-dev

# Run tests
make test

# Format code
make format

# Lint code
make lint

# Clean cache
make clean

Project Structure

meve/
├── core/
│   ├── engine.py          # MeVeEngine orchestrator
│   └── models.py          # ContextChunk, MeVeConfig, Query
├── phases/
│   ├── phase1_knn.py      # Vector search
│   ├── phase2_verification.py  # Cross-encoder
│   ├── phase3_fallback.py # BM25 fallback
│   ├── phase4_prioritization.py  # MMR deduplication
│   └── phase5_budgeting.py  # Token packing
├── services/
│   └── vector_db_client.py  # ChromaDB wrapper
└── utils/

Adding Custom Phases

# meve/phases/phase6_custom.py
from typing import List

from meve import ContextChunk, MeVeConfig

def execute_phase_6(query: str, chunks: List[ContextChunk], config: MeVeConfig) -> List[ContextChunk]:
    """Your custom phase logic."""
    processed_chunks = chunks  # process chunks here
    return processed_chunks

# Update MeVeEngine.run() to call your phase
# Add parameters to MeVeConfig if needed

📚 Use Cases

  • Question Answering - Retrieve precise context for factual queries
  • Chatbots - Budget-aware context for conversational AI
  • Document Search - Hybrid vector + lexical retrieval
  • Knowledge Bases - Deduplicated, relevant snippets

🧪 Testing

# Run all tests
pytest

# Run specific test file
pytest __tests__/unit/test_engine.py

# Run with coverage
pytest --cov=meve

Test fixtures available in __tests__/fixtures/sample_data.py.



🤝 Contributing

See CONTRIBUTING.md for development guidelines.

Commit Convention: feat:, fix:, docs:, test:, refactor:, chore:


πŸ“ License

MIT License - see LICENSE for details.


Built with ❤️ for the RAG community
