ProRAG is a minimal entity-graph RAG runtime for grounded question answering. It ingests text into a knowledge graph, resolves references during ingestion, retrieves graph evidence with entity-first ranking, and answers with a single LLM synthesis call.

    graph TD
        A[Raw Text or File] --> B[Sentence-Aware Ingestion]
        B --> C[Lazy Entity Resolution]
        C --> D[Fact Extraction: Relations, Attributes, Events]
        D --> E[(NetworkX MultiDiGraph)]
        E --> F[Entity + Semantic Retrieval]
        F --> G[Slot, Temporal, and Path Reranking]
        G --> H[Graph-First Answer Prompt]
        H --> I[Concise Answer]

- Adaptive ingestion: Short passages resolve entities in one pass; long files use overlapping sentence chunks with entity memory carried forward.
- Lazy context expansion: A chunk only retries with previous context when entity resolution returns unresolved mentions.
- Graph-first QA: Answers are generated from selected graph facts by default.
- Optional source snippets: Raw text snippets can be included as supporting context with `include_source_text=True`.
- Cost/quality modes: `cheap`, `balanced`, and `quality` tune token budgets and lazy retries.
- LLM response cache: Optional file-backed cache avoids repeated API calls for identical prompts.
- Offline embedding fallback: A local sentence-transformer model is preferred when present; a hash embedding fallback keeps retrieval usable without it.
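
The adaptive ingestion and lazy context expansion bullets above can be illustrated with a minimal sketch. This is not ProRAG's actual implementation: `split_sentences`, `chunk_sentences`, `resolve_lazily`, the `Resolver` signature, and the default `size` and `overlap` values are all hypothetical names and numbers chosen for illustration.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical resolver signature: text -> (resolved mentions, unresolved mentions).
Resolver = Callable[[str], Tuple[Dict[str, str], List[str]]]


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_sentences(
    sentences: List[str], size: int = 4, overlap: int = 1
) -> List[List[str]]:
    """Group sentences into windows of `size` that share `overlap` sentences."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(sentences), step):
        chunks.append(sentences[start:start + size])
        if start + size >= len(sentences):
            break  # the last window already reaches the end of the text
    return chunks


def resolve_lazily(
    chunk: List[str], previous: List[str], resolve: Resolver
) -> Tuple[Dict[str, str], List[str]]:
    """Resolve a chunk alone first; retry with prior context only if needed."""
    resolved, unresolved = resolve(" ".join(chunk))
    if unresolved and previous:
        # Lazy expansion: pay for the larger prompt only when mentions are unresolved.
        resolved, unresolved = resolve(" ".join(previous + chunk))
    return resolved, unresolved
```

The point of the lazy retry is cost: chunks whose mentions resolve on the first pass never trigger the second, larger call.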

    python -m venv .venv
    source .venv/bin/activate
    pip install -e .

For development:

    pip install -e ".[dev]"

Set a provider key before using the default Groq backend:

    cp .env.example .env
    export GROQ_API_KEY=your_key_here

On Windows PowerShell:

    python -m venv .venv
    .\.venv\Scripts\Activate.ps1
    pip install -e .
    Copy-Item .env.example .env
    $env:GROQ_API_KEY = "your_key_here"

ProRAG does not ship a local embedding model. Set `PRORAG_EMBEDDING_MODEL` to a SentenceTransformers model name or local path if you want semantic embeddings:

    export PRORAG_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

If the model is unavailable, ProRAG falls back to deterministic hash embeddings so local tests and keyword-oriented retrieval still work.

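
To make the idea of a deterministic hash embedding concrete, here is a small sketch of the general technique (feature hashing over tokens). It is an assumption about how such a fallback could work, not ProRAG's code: `hash_embed` and the default dimension of 64 are hypothetical.

```python
import hashlib
import math
import re
from typing import List


def hash_embed(text: str, dim: int = 64) -> List[float]:
    """Map text to a fixed-size vector by hashing each token into a bucket."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0       # signed hashing reduces collision bias
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # L2-normalise so that a dot product behaves like cosine similarity.
    return [v / norm for v in vec] if norm else vec
```

Because the vector depends only on the tokens, the same text always produces the same embedding with no model download. Such vectors capture token overlap rather than meaning, which matches the "keyword-oriented retrieval" caveat above.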

    from prorag import ProRAG

    rag = ProRAG(quality_mode="balanced")
    rag.ingest(
        "Christopher Nolan directed Inception. "
        "Inception was filmed in Paris and Tokyo."
    )
    result = rag.ask("Where was the film directed by Christopher Nolan filmed?")
    print(result["answer"])
    print(result["sources"])

| Mode | Use when | Behavior |
|---|---|---|
| `cheap` | Cost matters most | Shorter outputs, fewer lazy retries |
| `balanced` | Default use | Moderate token budgets and retries |
| `quality` | Recall matters most | Larger token budgets and more lazy retries |


    rag = ProRAG(quality_mode="balanced")
    rag.ingest_file("notes.txt", quality_mode="cheap")
    result = rag.ask("Who directed Inception?", quality_mode="quality")

By default, ProRAG answers from graph facts only. To include short source snippets as supporting context:

    result = rag.ask(
        "Who directed Inception?",
        include_source_text=True,
        max_source_chars=1200,
    )

The answer prompt still treats graph facts as primary evidence and snippets as support only.

Enable cache for repeated benchmark or ingestion runs:

    export PRORAG_LLM_CACHE=1
    export PRORAG_LLM_CACHE_PATH=.prorag_cache/llm_cache.json

PowerShell:

    $env:PRORAG_LLM_CACHE = "1"
    $env:PRORAG_LLM_CACHE_PATH = ".prorag_cache/llm_cache.json"

The cache key includes provider, model, token budget, system prompt, and prompt text. Caching is disabled by default.

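
As an illustration of why the cache key covers all five fields, the sketch below derives a key by hashing them together. The function name `cache_key`, its parameter names, and the choice of SHA-256 over canonical JSON are assumptions for this example; the actual key format used by ProRAG is not specified here.

```python
import hashlib
import json


def cache_key(
    provider: str, model: str, max_tokens: int, system_prompt: str, prompt: str
) -> str:
    """Build a stable key from every input that can change the LLM response."""
    payload = json.dumps(
        {
            "provider": provider,
            "model": model,
            "max_tokens": max_tokens,
            "system_prompt": system_prompt,
            "prompt": prompt,
        },
        sort_keys=True,      # canonical field order keeps the key stable
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under this scheme, changing any single field, such as the token budget when switching quality modes, yields a different key, so a cached response produced under one configuration is not reused under another.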

    prorag ingest notes.txt --graph graph.json
    prorag ingest notes.txt --graph graph.json --quality-mode cheap
    prorag ask "Who directed Inception?" --graph graph.json
    prorag ask "Who directed Inception?" --graph graph.json --quality-mode quality
    prorag ask "Who directed Inception?" --graph graph.json --include-source-text
    prorag interactive --graph graph.json
    prorag stats --graph graph.json

Global options are also accepted before the subcommand:

    prorag --graph graph.json --quality-mode balanced stats

- `ingest_file()` routes short files directly to `ingest_text()`.
- Long files are split into overlapping sentence chunks.
- Each chunk resolves entities with the current entity registry.
- If unresolved mentions appear, resolution retries with prior sentence context.
- Facts are extracted into relations, attributes, and events.
- Each fact is written with a chunk-level source ID for later grounding.
- The question is classified into a slot such as `who`, `where`, `when`, or `how_many`.
- Seed entities are detected with lexical and semantic matching.
- Candidate triples are retrieved with semantic graph traversal and keyword fallback.
- Triples are reranked by entity alignment, relation cues, temporal hints, confidence, and path connectivity.
- The answer prompt receives graph facts, plus optional supporting source snippets.
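
The slot classification and reranking steps above can be sketched as follows. Everything in this block is hypothetical: the prefix rules in `classify_slot`, the `Candidate` structure, the feature names, and the weights in `WEIGHTS` are illustrative assumptions, not values taken from ProRAG.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical prefix rules; "how many" is checked first so it is not shadowed.
SLOT_PREFIXES = [("how many", "how_many"), ("who", "who"), ("where", "where"), ("when", "when")]

# Hypothetical weights for the reranking signals named in the list above.
WEIGHTS: Dict[str, float] = {
    "entity_alignment": 0.40,
    "relation_cue": 0.25,
    "confidence": 0.15,
    "temporal_hint": 0.10,
    "path_connectivity": 0.10,
}


@dataclass
class Candidate:
    subject: str
    relation: str
    obj: str
    features: Dict[str, float] = field(default_factory=dict)  # each value in [0, 1]


def classify_slot(question: str) -> str:
    """Pick an answer slot from the leading question words."""
    q = question.strip().lower()
    for prefix, slot in SLOT_PREFIXES:
        if q.startswith(prefix):
            return slot
    return "other"


def score(candidate: Candidate) -> float:
    """Weighted sum of the signals; missing features count as zero."""
    return sum(w * candidate.features.get(name, 0.0) for name, w in WEIGHTS.items())


def rerank(candidates: List[Candidate], top_k: int = 5) -> List[Candidate]:
    """Return the top_k candidates by descending score."""
    return sorted(candidates, key=score, reverse=True)[:top_k]
```

A linear combination is the simplest way to merge heterogeneous signals. A real system might instead tune the weights per slot, for example by giving temporal hints more weight for `when` questions.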

    pytest tests -q
    python -m compileall prorag tests
    python -m ruff check .

- Graph storage is still in-memory `networkx.MultiDiGraph` with JSON persistence.
- Vector retrieval is a local linear scan, suitable for prototypes and small corpora.
- Concurrent writes are not transaction-safe.
- Relation normalization is still mostly LLM-driven and should become schema-backed for strict production deployments.
For larger deployments, move graph storage to a graph database, move semantic lookup to a vector index, and add a job queue around ingestion.
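
To show why a linear scan limits corpus size, here is a minimal sketch of brute-force cosine retrieval. The names `cosine` and `linear_scan` and the dictionary-based index are assumptions for illustration, not ProRAG internals.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def linear_scan(
    query: Sequence[float], index: Dict[str, List[float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the best top_k."""
    scored = [(key, cosine(query, vec)) for key, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Each query touches every stored vector, so cost grows linearly with the number of entries. That is acceptable for small corpora, and it is the part a vector index would replace.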
Apache License 2.0. See LICENSE.