TN-Legal-RAG is a mission-focused, private RAG (Retrieval-Augmented Generation) system designed for legal professionals and researchers. It provides grounded, citeable answers from a localized corpus including the Tennessee Code Annotated (TCA), Administrative Rules, and specialized Appellate/Supreme Court case law.
- Corpus:
98 filesindexed via Modular Chunking (python3 indexer.py) β Added Open Meetings Act (TOMA) & TPRA Willful Denial Case Law. - Architecture upgrade: WSL-to-Windows GPU Bridging established. Hybrid Re-Ranking Pipeline (Bi-Encoder + Cross-Encoder).
- Ingestion automation: Deployed
auto_ingest.py(LLM-powered unified normalizer for Justia/Lexis scraping). - Verification workflow:
./scripts/check_all.sh- Rebuilt index: Verified (Header-Aware)
- API health check: OK (120-attempt window for heavy Cross-Encoder loading)
- Smoke test: OK (TOMA & TPRA hallucination checks passed)
- API eval: 23/23 passed (FAST mode, workers=8) β 100% success rate
-
Open Meetings (TOMA) & Secret Ballots
- Output:
"No, a county commission cannot hold a closed executive session or use secret ballots. The Tennessee Open Meetings Act (TOMA) prohibits exceptions for closed meetings, and all votes must be vocal and public." - Sources:
docs/tn/code/tca-8-44-104-minutes-recorded.md,docs/tn/opinions/sc/dorrier_v_dark.md
- Output:
-
Willful Denial (TPRA)
- Output:
"If a county requires you to appear in person to request a public record, it constitutes a willful violation of the TPRA and justifies an award of attorney's fees." - Sources:
docs/tn/opinions/coa/friedmann_v_marshall_county.md
- Output:
-
Right to Farm (New Case Law)
- Output:
"Established poultry farms are protected under the Right to Farm Act from nuisance suits by residents who move in after the farm has been operating for at least one year." - Sources:
docs/tn/opinions/sc/estate_of_johnson_v_smith.md
- Output:
Evaluation Suite: 23/23 Pass Rate The system is hardened against cross-jurisdictional hallucinations and "context smearing" through modular indexing and precision re-ranking.
| Case ID | Objective | Status | Mode |
|---|---|---|---|
| tca-records-act | Public Records Custodian | β PASS | Fast |
| tpra-attorney-fees | Willful Denial & Fee Recovery (10-7-505) | β PASS | Fast |
| tpra-commercial-value-news | GIS Data Fees & Media Exemption (10-7-506) | β PASS | Fast |
| toma-secret-ballots | T.C.A. 8-44-104 Voting Requirements | β PASS | Fast |
| toma-action-nullified | T.C.A. 8-44-105 Illegal Meeting Sanctions | β PASS | Fast |
| toma-enforcement-jurisdiction | Chancery/Circuit Court Authority | β PASS | Fast |
| county-quorum | 5-5-108 "Majority" Rule | β PASS | Fast |
| right-to-farm-nuisance | Residential Encroachment (Johnson) | β PASS | Fast |
| arbitration-jurisdiction | SC Case Law (Berkeley Opinion) | β PASS | Fast |
| ag-labor-workers-comp | Landscaping Exemption (Martinez) | β PASS | Fast |
- Engine: FastAPI (backend) + Ollama (local Windows host inference)
- Vector store: ChromaDB (disk-persistent)
- Intelligence: Qwen 2.5 (1.5B Instruct) β temperature
0.0for legal determinism - Retrieval pipeline:
- Modular indexing: header-aware splitting (
##,###) prevents statutory context blending - Wide-net retrieval: semantic search pulls top
40β60candidates usingall-MiniLM-L6-v2 - Precision re-ranking:
cross-encoder/ms-marco-MiniLM-L-6-v2re-scores candidates to find logical matches that simple vectors might miss
- Modular indexing: header-aware splitting (
- Interface: modern Apple-style glassmorphism UI with integrated source citations
To maximize GPU performance, Ollama runs natively on the Windows Host while the Python API runs in WSL Ubuntu.
On Windows Host (PowerShell Admin): Ensure Ollama listens to the WSL virtual network and allow it through the firewall:
$env:OLLAMA_HOST="0.0.0.0"
New-NetFirewallRule -DisplayName "Ollama WSL Bridge" -Direction Inbound -Action Allow -Protocol TCP -LocalPort 11434 -Profile Any
ollama run qwen2.5:1.5b-instructExtract your true Windows Gateway IP and update OLLAMA_URL in rag_api.py:
WIN_IP=$(ip route show default | awk '{print $3}')
echo "Update rag_api.py with this IP: $WIN_IP"Clone and set up the Python env:
# Clone and enter
git clone git@github.com:xXBricksquadXx/TN-Legal-RAG.git
cd TN-Legal-RAG
# Create and activate venv
python3 -m venv .venv
source .venv/bin/activate
# Install deps
pip install -r requirements.txtStep A: Auto-Ingest Raw Text
Drop raw, unformatted text from Justia or Lexis into docs/raw_imports/ as .txt files. Run the automated LLM normalizer to generate perfect Markdown with Practitioner Summaries:
python3 tools/auto_ingest.pyMove the reviewed .md files from docs/staging/ to their final homes in docs/tn/code/ or docs/tn/opinions/.
python3 indexer.pyStart the hardened API and browser UI:
uvicorn rag_api:app --reloadAccess the dashboard at http://127.0.0.1:8000
Run the full suite (Rebuild Index β Health Check β Smoke Test β Evals):
./scripts/check_all.shGET /β glassmorphic user interfaceGET /healthβ API status check (wait logic included for model loading)POST /queryβ full RAG generation (LLM-powered)POST /debug_queryβ fast retrieval check; returns raw documents and sources after re-ranking
- 100% local: no legal data or queries leave your machine
- High-fidelity sources: prioritizes statutory text (TCA) over secondary interpretations
- Hallucination defense: includes a custom testing framework (
/scripts) to verify TN-specific logic; every file includes a human-proofed Practitioner Summary to anchor the LLM
- County Governance Ingest: Comprehensive Title 5 coverage
- Sunshine Law Ingest: TOMA and TPRA willful denial case law
- Precision Re-ranking: Integrated Cross-Encoder for accuracy
- Unified normalizer: Deployed
auto_ingest.pyto conform Justia/Lexis text via local LLM parsing - Containerization: Docker support for "ship-anywhere" deployment
If you find this useful:
- Watch / Star the repo to keep up with weekly TN updates
- Share the project with researchers who need local, offline legal RAG