**MeVe** (Multi-phase Efficient Vector Retrieval) is a 5-phase RAG pipeline that optimizes context selection through progressive filtering and intelligent token budgeting.
Unlike simple vector search, MeVe combines vector similarity, cross-encoder verification, BM25 fallback, MMR deduplication, and token budgeting to deliver high-quality, budget-aware context for LLMs.
Traditional RAG systems often:
- Return irrelevant chunks despite high similarity scores
- Waste tokens on redundant information
- Fail silently when vector search underperforms
- Ignore token budget constraints
MeVe solves these problems with a smart 5-phase pipeline:
- **Quality First** - Cross-encoder verification ensures relevance
- **Adaptive Fallback** - BM25 backup when vector search fails
- **Zero Redundancy** - MMR-based deduplication
- **Budget Aware** - Greedy token packing within limits
- **Production Ready** - Tested on the HotpotQA dataset
```bash
# Clone the repository
git clone https://github.com/nakulbh/Meve-framework.git
cd meve

# Install with uv (recommended)
uv pip install -e .

# Or with pip
pip install -e .
```

```python
from meve import MeVeEngine, MeVeConfig, ContextChunk
from meve.services.vector_db_client import VectorDBClient

# 1. Prepare your data
chunks = [
    ContextChunk("The Eiffel Tower is in Paris, France.", "doc1"),
    ContextChunk("Paris is the capital of France.", "doc2"),
    ContextChunk("The Louvre Museum is in Paris.", "doc3"),
]

# 2. Initialize ChromaDB (default vector database)
vector_db = VectorDBClient(
    chunks=chunks,
    collection_name="my_collection",
    is_persistent=False  # In-memory for quick testing
)

# 3. Configure the pipeline
config = MeVeConfig(
    k_init=10,              # Initial candidates from vector search
    tau_relevance=0.5,      # Relevance threshold (0-1)
    n_min=3,                # Min chunks to avoid fallback
    theta_redundancy=0.85,  # Similarity threshold for deduplication
    t_max=512               # Maximum token budget
)

# 4. Initialize engine with ChromaDB
engine = MeVeEngine(config=config, vector_db_client=vector_db)

# 5. Retrieve context
context = engine.run("Where is the Eiffel Tower?")
print(context)
```

MeVe uses ChromaDB as its default vector database. There are three ways to use it:
```python
# Option 1: In-memory (fastest, temporary)
vector_db = VectorDBClient(chunks=chunks, is_persistent=False)

# Option 2: Persistent storage (production)
vector_db = VectorDBClient(chunks=chunks, is_persistent=True)

# Option 3: Load existing collection
vector_db = VectorDBClient(
    collection_name="my_collection",
    is_persistent=True,
    load_existing=True  # No re-embedding needed!
)
```

Quick start: `python examples/quickstart_chromadb.py`
Full guide: see `docs/chromadb_guide.md`
```bash
# Download HotpotQA dataset
make download-data

# Run with real data (loads 50 examples by default)
make run

# Or run the basic example
make example
```

```
Query → kNN Search → Verification → [Fallback?] → Deduplication → Budgeting → Context
          Phase 1      Phase 2        Phase 3        Phase 4       Phase 5
```
- **Phase 1 (kNN)** - Vector similarity search via ChromaDB. Returns the top `k_init` candidates using `all-MiniLM-L6-v2` embeddings.
- **Phase 2 (Verification)** - Cross-encoder re-ranking. Uses `cross-encoder/ms-marco-MiniLM-L-6-v2` to filter by the `tau_relevance` threshold.
- **Phase 3 (Fallback)** - Conditional BM25 retrieval. Triggers only when `|verified| < n_min`, supplementing results with lexical search.
- **Phase 4 (Prioritization)** - MMR-based deduplication. Removes redundant chunks using the `theta_redundancy` similarity threshold.
- **Phase 5 (Budgeting)** - Greedy token packing. Fits top chunks within the `t_max` budget using the GPT-2 tokenizer.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `k_init` | int | 10 | Initial candidates from vector search |
| `tau_relevance` | float | 0.5 | Cross-encoder threshold (0-1) |
| `n_min` | int | 3 | Min verified chunks to skip fallback |
| `theta_redundancy` | float | 0.85 | Similarity threshold for deduplication |
| `t_max` | int | 512 | Maximum token budget |
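To make `theta_redundancy` concrete, here is a toy redundancy filter in the spirit of Phase 4. This is a sketch, not MeVe's code: word-overlap Jaccard stands in for the embedding similarity MeVe uses, and chunks are plain strings rather than `ContextChunk` objects:

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two chunk texts."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def deduplicate(chunks: list[str], theta: float = 0.85) -> list[str]:
    """Keep a chunk only if it is below the similarity threshold
    against every chunk already kept."""
    kept: list[str] = []
    for chunk in chunks:
        if all(jaccard(chunk, k) < theta for k in kept):
            kept.append(chunk)
    return kept

docs = [
    "The Eiffel Tower is in Paris, France.",
    "The Eiffel Tower is in Paris, France.",  # duplicate, similarity 1.0
    "Paris is the capital of France.",
]
print(deduplicate(docs))  # the exact duplicate is dropped
```

Raising `theta` keeps more near-duplicates; lowering it prunes more aggressively.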
```python
# Development - Fast iteration
config = MeVeConfig(k_init=5, tau_relevance=0.3, n_min=2, t_max=256)

# Production - Quality focus
config = MeVeConfig(k_init=20, tau_relevance=0.6, n_min=5, t_max=1024)

# Tight budget - Minimal tokens
config = MeVeConfig(k_init=10, tau_relevance=0.7, n_min=2, t_max=128)
```

```bash
# Install development dependencies
make install-dev

# Run tests
make test

# Format code
make format

# Lint code
make lint

# Clean cache
make clean
```

```
meve/
├── core/
│   ├── engine.py                  # MeVeEngine orchestrator
│   └── models.py                  # ContextChunk, MeVeConfig, Query
├── phases/
│   ├── phase1_knn.py              # Vector search
│   ├── phase2_verification.py     # Cross-encoder
│   ├── phase3_fallback.py         # BM25 fallback
│   ├── phase4_prioritization.py   # MMR deduplication
│   └── phase5_budgeting.py        # Token packing
├── services/
│   └── vector_db_client.py        # ChromaDB wrapper
└── utils/
```
```python
# meve/phases/phase6_custom.py
from typing import List

from meve import ContextChunk, MeVeConfig


def execute_phase_6(query: str, chunks: List[ContextChunk], config: MeVeConfig) -> List[ContextChunk]:
    """Your custom phase logic."""
    processed_chunks = chunks  # Process chunks here
    return processed_chunks

# Update MeVeEngine.run() to call your phase
# Add parameters to MeVeConfig if needed
```

- Question Answering - Retrieve precise context for factual queries
- Chatbots - Budget-aware context for conversational AI
- Document Search - Hybrid vector + lexical retrieval
- Knowledge Bases - Deduplicated, relevant snippets
```bash
# Run all tests
pytest

# Run specific test file
pytest __tests__/unit/test_engine.py

# Run with coverage
pytest --cov=meve
```

Test fixtures are available in `__tests__/fixtures/sample_data.py`.
See `CONTRIBUTING.md` for development guidelines.
Commit convention: `feat:`, `fix:`, `docs:`, `test:`, `refactor:`, `chore:`
MIT License - see LICENSE for details.
Built with ❤️ for the RAG community