A hybrid search engine built on SQLite with SQLite AI and SQLite Vector extensions. SQLite RAG combines vector similarity search with full-text search (FTS5 extension) using Reciprocal Rank Fusion (RRF) for enhanced document retrieval.
- Hybrid Search: Combines vector embeddings with full-text search for optimal results
- SQLite-based: Built on SQLite with AI and Vector extensions for reliability and performance
- Multi-format Text Support: Processes many file formats, including PDF, DOCX, Markdown, and code files
- Recursive Character Text Splitter: Token-aware text chunking with configurable overlap
- Interactive CLI: Command-line interface with interactive REPL mode
- Flexible Configuration: Customizable embedding models, search weights, and chunking parameters
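To make the chunking feature above concrete, here is a minimal sketch of recursive splitting with overlap. It is not SQLite RAG's implementation: it measures length in characters rather than tokens, and the names `split_recursive`, `chunk_text`, `chunk_size`, and `overlap` are hypothetical.

```python
def split_recursive(text, max_len, separators=("\n\n", "\n", " ")):
    """Split text into pieces of at most max_len characters.

    Coarse separators (paragraphs) are tried before fine ones (words).
    """
    if len(text) <= max_len:
        return [text] if text else []
    for i, sep in enumerate(separators):
        if sep in text:
            pieces = []
            for part in text.split(sep):
                # Oversized parts are split again with the finer separators.
                pieces.extend(split_recursive(part, max_len, separators[i + 1:]))
            return pieces
    # No separator left: fall back to a hard cut.
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]


def chunk_text(text, chunk_size=200, overlap=40):
    """Greedily merge pieces into chunks that share `overlap` characters."""
    if not 0 <= overlap < chunk_size - 1:
        raise ValueError("overlap must be non-negative and smaller than chunk_size - 1")
    # Reserve room for the overlap tail plus one joining space.
    pieces = split_recursive(text, chunk_size - overlap - 1)
    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        chunks.append(current)
        # Start the next chunk with the tail of the previous one.
        tail = current[-overlap:] if overlap else ""
        current = f"{tail} {piece}" if tail else piece
    if current:
        chunks.append(current)
    return chunks
```

The overlap here is cut at a character boundary, so it may start mid-word. A token-aware splitter would count and cut on tokenizer output instead.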
SQLite RAG requires SQLite with extension loading support.
If you encounter extension loading issues (e.g., 'sqlite3.Connection' object has no attribute 'enable_load_extension'), follow the setup guides for macOS or Windows.
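A quick way to check whether your Python build supports extension loading is sketched below. It uses only the standard `sqlite3` module. The helper name `extension_loading_supported` is hypothetical and not part of SQLite RAG.

```python
import sqlite3


def extension_loading_supported() -> bool:
    """Return True if this Python build's sqlite3 module can load extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # The method is missing when Python was compiled without
        # loadable-extension support.
        if not hasattr(conn, "enable_load_extension"):
            return False
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("extension loading supported:", extension_loading_supported())
```

If this reports `False`, the interpreter itself lacks support, and reinstalling the package will not help.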
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install sqlite-rag
Download the default model, Embedding Gemma, from Hugging Face:
sqlite-rag download-model unsloth/embeddinggemma-300m-GGUF embeddinggemma-300M-Q8_0.gguf
SQLite RAG comes preconfigured to work with the Embedding Gemma model. When you add a document or text, it automatically creates a new database (if one does not already exist) and uses default settings, so you can get started immediately without manual setup.
# Initialize sqliterag.sqlite database and add documents
sqlite-rag add-text "Artificial intelligence (AI) enables machines to learn from data"
sqlite-rag add /path/to/documents --recursive
# Search your documents
sqlite-rag search "explain AI"
# Interactive mode
sqlite-rag
> help
> search "interactive search"
> exit
For help, run:
sqlite-rag --help
Settings are stored in the database and should be set before adding any documents.
# View available configuration options
sqlite-rag configure --help
# Point to a model at a custom path
sqlite-rag configure --model-path ./mymodels/path
# View current settings
sqlite-rag settings
To use a different database filename, use the global --database option:
# Single command with custom database
sqlite-rag --database path/to/mydb.db add-text "Let's talk about AI."
# Interactive mode with custom database
sqlite-rag --database path/to/mydb.db
You can experiment with other models from Hugging Face by downloading them with:
# Download GGUF models from Hugging Face
sqlite-rag download-model <model-repo> <filename>
SQLite RAG supports the following file formats:
- Text: .txt, .md, .mdx, .csv, .json, .xml, .yaml, .yml
- Documents: .pdf, .docx, .pptx, .xlsx
- Code: .c, .cpp, .css, .go, .h, .hpp, .html, .java, .js, .mjs, .kt, .php, .py, .rb, .rs, .swift, .ts, .tsx
- Web Frameworks: .svelte, .vue
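To preview which files in a directory match the formats listed above before adding them, a small helper like the following may be useful. It is an illustrative sketch and not how SQLite RAG selects files internally. The function name `find_supported_files` and the case-insensitive suffix matching are assumptions.

```python
from pathlib import Path

# Extensions copied from the supported formats list above.
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".mdx", ".csv", ".json", ".xml", ".yaml", ".yml",
    ".pdf", ".docx", ".pptx", ".xlsx",
    ".c", ".cpp", ".css", ".go", ".h", ".hpp", ".html", ".java", ".js",
    ".mjs", ".kt", ".php", ".py", ".rb", ".rs", ".swift", ".ts", ".tsx",
    ".svelte", ".vue",
}


def find_supported_files(root, recursive=True):
    """Return the sorted files under root whose suffix is in the supported set."""
    pattern = "**/*" if recursive else "*"
    return sorted(
        path
        for path in Path(root).glob(pattern)
        # Assumption: suffixes are compared case-insensitively.
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```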
For development, clone the repository and install with development dependencies:
# Clone the repository
git clone https://github.com/sqliteai/sqlite-rag.git
cd sqlite-rag
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install in development mode
pip install -e '.[dev]'
- Document Processing: Files are processed and split into overlapping chunks
- Embedding Generation: Text chunks are converted to vector embeddings using AI models
- Dual Indexing: Content is indexed for both vector similarity and full-text search
- Hybrid Search: Queries are processed through both search methods
- Result Fusion: Results are combined using Reciprocal Rank Fusion for optimal relevance
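The fusion step above can be sketched in a few lines. This is a generic illustration of Reciprocal Rank Fusion, not SQLite RAG's code: each document scores `weight / (k + rank)` in every ranking where it appears, and the scores are summed. The function name, the `weights` parameter, and the default `k=60` (the constant commonly used for RRF) are assumptions here.

```python
def reciprocal_rank_fusion(rankings, k=60, weights=None):
    """Fuse several ranked lists of document ids into one scored ranking.

    rankings: list of lists, each ordered from best to worst match.
    weights:  optional per-ranking multipliers (hypothetical parameter).
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # A high position (small rank) contributes a larger score.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Best fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]    # ids ranked by vector similarity
    fulltext_hits = ["b", "d", "a"]  # ids ranked by full-text search
    for doc_id, score in reciprocal_rank_fusion([vector_hits, fulltext_hits]):
        print(doc_id, round(score, 5))
```

Because only ranks are used, the two search methods do not need comparable raw scores. A document that appears near the top of both lists ("b" in this example) ends up first.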
