A modular, pluggable RAG (Retrieval-Augmented Generation) system with MCP (Model Context Protocol) server support.
- Modular Architecture: Pluggable components for LLM, Embedding, Vector Store, Reranker, and Evaluator
- Multi-provider Support: OpenAI, Azure OpenAI, Ollama, DeepSeek, and more
- MCP Server Integration: Seamless integration with Copilot and Claude Desktop
- Hybrid Search: Combined dense and sparse retrieval with RRF fusion
- Multi-modal Support: Image understanding and caption generation
- Full Observability: Trace collection and web-based dashboard
- Zero External Dependencies: Local-first design, runs offline
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# On Windows:
.\.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate
# Install dependencies
pip install -e .
# Install dev dependencies (optional)
pip install -e ".[dev]"- Create local config:
cp config/settings.yaml config/settings.local.yaml- Set environment variables:
export AZURE_API_ENDPOINT="https://..."
export AZURE_API_KEY="..."
export OPENAI_API_KEY="..."# Run smoke tests
python -m compileall src
pytest tests/unit/test_smoke_imports.py
# Run all tests
pytest
# Run with coverage
pytest --cov=src --cov-report=htmlSee DEV_SPEC.md for detailed architecture documentation.
Currently implementing Phase A: Engineering skeleton and test base
- ✅ Directory structure initialized
- ⏳ Pytest setup and smoke tests
- ⏳ Configuration loading and validation
MIT