A zero-cost, edge-first, neutrality-enforced Bible chatbot using Retrieval-Augmented Generation (RAG) to deliver exact verse quotes, original-language (Hebrew/Greek) insights, and Treasury of Scripture Knowledge (TSK) cross-references — without theological bias, modern commentary, or hallucinated content.
Live demo: https://biblelm.vercel.app
- Scripture-First & Absolute Neutrality — Every response must quote real verses with chapter:verse citations. No interpretation, application, denominational slant, political framing, or moralizing allowed. The system prompt rigidly enforces this.
- Zero-Cost Operation — Runs indefinitely on Vercel Hobby tier + Groq free tier (llama-3.1-8b-instant default; BYOK for 70B). No paid vector DBs, no heavy compute, no always-on servers.
- Speed & Reliability — Common queries answered in <1 s via bundled data + Edge Functions; rare verses fall back gracefully to external APIs.
- Original-Language Fidelity — Hebrew (OSHB) / Greek (SBLGNT) word popups with Strong's number, transliteration, and gloss — no loose paraphrasing.
Next.js (App Router) + Vercel Edge runtime. Fully stateless where possible; lightweight PostgreSQL is used only for build-time seeding of embeddings & TSK.
- Framework — Next.js 16+ (App Router, Server Actions, React Server Components)
- Styling — Tailwind CSS + shadcn/ui (Radix primitives)
- LLM Integration — Vercel AI SDK (`@ai-sdk/groq`, `streamText`, `generateText`)
- Models (Groq)
  - Default: `llama-3.1-8b-instant` (~14k TPM free tier)
  - Optional BYOK: `llama-3.1-70b-versatile`, with `llama3-8b-8192` fallback
- Retrieval — Hybrid RAG (no vector DB at runtime):
  - Direct reference parsing (`John 3:16`, `Ex 21:22-25`)
  - Groq-powered semantic verse suggestion (cheap 8B re-ranking)
  - Bundled ~1,000 high-frequency verses + full Strong's dictionary (static JSON)
  - PostgreSQL (build-time only) for seeding embeddings & TSK cross-refs
  - Fallback: public free APIs (e.g. bolls.life, helloao.org) for full translations
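The runtime lookup order — bundled index first, public API only on a miss — can be sketched as below. The function and its parameters are illustrative, not BibleLM's actual code; the fetcher is injected so the public-API shape stays abstract.

```typescript
// Bundled-index-first verse lookup with a public-API fallback (sketch).
type FetchVerse = (ref: string) => Promise<string>;

async function getVerseText(
  ref: string,
  bundledIndex: Record<string, string>,
  apiFetch: FetchVerse,
): Promise<string> {
  const hit = bundledIndex[ref];
  if (hit !== undefined) return hit; // ~1,000 high-frequency verses ship with the app
  return apiFetch(ref); // rare verses fall back to a free translation API
}
```

Because the common verses are static JSON bundled at build time, the hot path never touches the network.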
- Caching — Upstash Redis (optional) for query → verses + answer (72h TTL)
- Data — Public domain / open-license sources:
- BSB (default translation)
- Strong's Exhaustive Concordance
- OSHB (Hebrew), SBLGNT (Greek) morphology via Macula / OpenScriptures
- Treasury of Scripture Knowledge (TSK) cross-references
- Runtime — Vercel Edge Functions (`/api/chat` must stay edge-compatible)
- Client → `/api/chat` (POST, Edge) — sends the message history array (Vercel AI SDK format).
- Normalization — extracts the latest user query; handles multimodal / complex payloads.
- Reference Parsing — uses regex + a simple grammar to detect Bible refs → direct verse fetch if matched.
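The reference-parsing step can be sketched with a single regex. This simplified pattern is an assumption for illustration; a real grammar must cover every book name, abbreviation, and edge case.

```typescript
// Detect direct Bible references like "John 3:16" or "Ex 21:22-25" (sketch;
// the real grammar must handle all 66 book names and common abbreviations).
interface VerseRef {
  book: string;
  chapter: number;
  verseStart: number;
  verseEnd: number;
}

const REF_RE = /\b((?:[1-3]\s?)?[A-Za-z]+)\.?\s+(\d+):(\d+)(?:[-–](\d+))?\b/g;

function parseRefs(text: string): VerseRef[] {
  const refs: VerseRef[] = [];
  for (const m of text.matchAll(REF_RE)) {
    const [, book, chapter, start, end] = m;
    refs.push({
      book,
      chapter: Number(chapter),
      verseStart: Number(start),
      verseEnd: end ? Number(end) : Number(start), // single verse if no range
    });
  }
  return refs;
}
```

Any match short-circuits semantic retrieval and goes straight to the verse fetch.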
- Semantic Retrieval (fallback for vague queries)
  - Sends the query to Groq 8B → suggests 3–8 relevant verse refs
  - Looks up the refs in the bundled index → on a miss, hits the translation API
- Context Assembly
  - Fetches verse text (selected translation)
  - Attaches Hebrew/Greek morphology (Strong's-linked)
  - Injects TSK cross-refs (ranked by relevance)
  - Builds a rigid context block
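The assembly step can be sketched as a pure formatter; the field names and block layout here are illustrative assumptions, not the project's actual schema.

```typescript
// Shape of one assembled verse (illustrative, not BibleLM's real schema).
interface ContextVerse {
  ref: string;          // e.g. "John 3:16"
  text: string;         // verse text in the selected translation
  strongs?: string[];   // Strong's numbers attached to tagged words
  crossRefs?: string[]; // TSK cross-references, ranked by relevance
}

// Build the rigid context block handed to the model.
function buildContextBlock(verses: ContextVerse[]): string {
  return verses
    .map((v) => {
      const lines = [`[${v.ref}] ${v.text}`];
      if (v.strongs?.length) lines.push(`  Strong's: ${v.strongs.join(", ")}`);
      if (v.crossRefs?.length) lines.push(`  Cross-refs (TSK): ${v.crossRefs.join("; ")}`);
      return lines.join("\n");
    })
    .join("\n\n");
}
```

Keeping the block rigid makes it easy for the system prompt to demand verbatim quoting from it.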
- Prompt Engineering
  - System prompt (~800 tokens): enforces citation-only, bans commentary, requires exact quotes
  - Temperature = 0.1 (near-deterministic)
  - Frequency penalty = 0.5 (prevents loops)
  - Full history included (token-efficient truncation if needed)
- Inference & Streaming — `streamText` → Groq → `toUIMessageStreamResponse()`; the UI shows a typewriter effect instantly
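A minimal edge route sketch of this step, assuming AI SDK v5-style imports (`ai`, `@ai-sdk/groq`); the one-line system prompt stands in for the real ~800-token prompt, and retrieval, context assembly, and caching are omitted.

```typescript
// app/api/chat/route.ts — sketch only; retrieval and caching steps omitted.
import { streamText, convertToModelMessages, type UIMessage } from "ai";
import { groq } from "@ai-sdk/groq";

export const runtime = "edge"; // keep /api/chat edge-compatible

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    model: groq("llama-3.1-8b-instant"),
    // Stand-in for the full citation-only system prompt:
    system: "Quote Scripture verbatim with chapter:verse citations. No commentary.",
    messages: convertToModelMessages(messages),
    temperature: 0.1,      // near-deterministic
    frequencyPenalty: 0.5, // prevents loops
  });

  return result.toUIMessageStreamResponse(); // streams tokens for the typewriter UI
}
```

`streamText` returns immediately and streams in the background, so the first token reaches the client without waiting for the full completion.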
- Fallbacks
  - Model retry cascade: 8B → 70B → older 8B
  - Rate limit (429) → client-side "Lite mode" (verses + Strong's only)
Groq’s free tier is subject to TPM (tokens-per-minute) limits, so long contexts or spikes can trip rate limiting. BibleLM now falls back automatically in this order:
- Groq: `llama-3.1-8b-instant` (primary)
- Groq: `llama-3.3-70b-versatile` (secondary)
- Hugging Face Inference: `meta-llama/Meta-Llama-3.1-8B-Instruct` (reuses `HF_TOKEN`)
- Google Gemini: `gemini-2.5-flash` (optional `GEMINI_API_KEY`)
- Final: raw verses + original-language notes only
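The cascade amounts to a loop over providers that swallows failures and keeps going; a minimal sketch with injected provider functions (the names are illustrative):

```typescript
// One entry per provider in the cascade; each returns a model answer or throws.
type Provider = () => Promise<string>;

// Try each provider in order; any failure (e.g. a 429 rate limit) falls
// through to the next. The final step never calls a model at all: it
// returns raw verses + original-language notes only.
async function answerWithFallback(
  providers: Provider[],
  versesOnly: () => string,
): Promise<string> {
  for (const call of providers) {
    try {
      return await call();
    } catch {
      // provider rate-limited or down → continue down the cascade
    }
  }
  return versesOnly();
}
```

Ordering providers from cheapest to most capable keeps the happy path on the free 8B model.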
`npm run build:data` — a one-time script that:
- Downloads/parses TSV/Parquet from OpenScriptures, Macula, etc.
- Generates:
  - `strongs-dict.json` (O(1) lookups)
  - `bible-index.json` (~1,000 common verses + metadata)
  - MorphHB data split per book (~40 JSON files, pre-compressed for performance & caching)
  - OpenHebrewBible subset (clause segmentation, poetic division, BHS-WLC alignments, extended glosses) — CC BY-NC 4.0 (attribution required)
- Embedding vectors (Hugging Face free inference, stored in PG during build)
- TSK cross-ref map (verse → related verses)
- Output committed to the `data/` folder → shipped statically
→ Edge runtime stays <2 MB per function, no cold starts, no DB latency at request time.
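Since `strongs-dict.json` is keyed by Strong's number, each tooltip lookup is a single object access; a sketch with an illustrative entry shape (the real file's fields may differ):

```typescript
// Entry shape is illustrative; the real strongs-dict.json may carry more fields.
interface StrongsEntry {
  lemma: string;    // original-language word
  translit: string; // transliteration
  gloss: string;    // short English gloss
}

// Tiny inline sample standing in for the bundled dictionary.
const strongsDict: Record<string, StrongsEntry> = {
  G26: { lemma: "ἀγάπη", translit: "agapē", gloss: "love" },
};

function lookupStrongs(num: string): StrongsEntry | undefined {
  return strongsDict[num]; // O(1) per tagged word
}
```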
BibleLM uses Upstash Redis to cache full chat responses (verses + context + final answer). This dramatically reduces Groq usage and latency on repeat devotional-style questions; an 80–90% cache hit rate is expected once common queries are warmed.
Upstash's free tier includes 500K commands/month (plus 256 MB storage and one free database), which is typically enough for Hobby usage. To enable caching, create a free Upstash Redis database and copy the REST URL and token into UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN in your environment.
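A sketch of a cache-key scheme with the 72 h TTL; the key format and normalization are assumptions, not BibleLM's actual scheme. The trailing comments show where the `@upstash/redis` calls (`Redis.fromEnv()`, `redis.set(key, value, { ex })`) would plug in.

```typescript
// Normalize the query so trivially different phrasings share one cache entry
// (key format is illustrative, not BibleLM's actual scheme).
function cacheKey(query: string, translation: string): string {
  const norm = query.trim().toLowerCase().replace(/\s+/g, " ");
  return `biblelm:${translation}:${norm}`;
}

const TTL_SECONDS = 60 * 60 * 72; // 72 h, matching the cache policy above

// With @upstash/redis (reads UPSTASH_REDIS_REST_URL / UPSTASH_REDIS_REST_TOKEN):
//   const redis = Redis.fromEnv();
//   await redis.set(cacheKey(q, t), response, { ex: TTL_SECONDS });
//   const hit = await redis.get(cacheKey(q, t));
```

Including the translation in the key keeps BSB and KJV answers from colliding in the cache.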
OpenHebrewBible subset (clause segmentation, poetic division, BHS-WLC alignments, extended glosses) — processed from eliranwong/OpenHebrewBible — CC BY-NC 4.0 — attribution required.
BibleLM ships a public-domain translation toggle alongside the default BSB:
- BSB (default)
- KJV, WEB, ASV — public domain, sourced from scrollmapper CSV exports
Select the translation in the chat UI; the choice is persisted in `localStorage` and in `?trans=KJV` for shareable links.
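The resolution order can be sketched as a pure function; raw values are passed in so it stays testable outside the browser, and the helper itself is illustrative (in the app they would come from `URLSearchParams` and `localStorage`).

```typescript
const SUPPORTED = ["BSB", "KJV", "WEB", "ASV"] as const;
type Translation = (typeof SUPPORTED)[number];

// URL param wins (shareable links), then the persisted choice, then the
// BSB default; anything unrecognized falls through.
function resolveTranslation(urlParam: string | null, stored: string | null): Translation {
  for (const candidate of [urlParam, stored]) {
    const up = candidate?.toUpperCase();
    if (up && (SUPPORTED as readonly string[]).includes(up)) return up as Translation;
  }
  return "BSB";
}
```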
Greek New Testament layers (morphology, interlinear glosses, clause tagging) are built from OpenGNT sources and exposed per-verse when the reference is in the NT.
License: OpenGNT is CC BY-NC 4.0 — attribution required.
- Neutral Citation Engine — Forces exact verse quoting + refs
- Original-Language Tooltips — Click any tagged word → Strong's #, translit, gloss popup
- TSK Cross-References — Thematic links shown inline (non-intrusive)
- Translation Toggle — BSB default; KJV/WEB/ASV available
- Free-Tier Friendly — 8B default + BYOK field for 70B
- Controversy-Resistant — Designed to handle divisive topics without editorializing
Verify behavior with these:
- "What does the Bible say about abortion?" → expect Ps 139:13–16, Ex 21:22–25 (no politics)
- "What is the biblical view of homosexuality?" → Lev 18:22; 20:13; Rom 1:26–27; 1 Cor 6:9–11
- "Is divorce allowed in the Bible?" → Mal 2:16; Matt 5:31–32; 19:3–9
- "Does the Bible support slavery?" → Ex 21; Eph 6:5–9; Philemon (quotes only)
- "Can women be pastors according to Scripture?" → 1 Tim 2:11–15; Gal 3:28; Rom 16:1–7
- "Explain 1 Chronicles 4:9 in context" → rare verse → should fall back to the API fetch
```bash
# 1. Clone & install
git clone https://github.com/voidcommit-afk/BibleLM.git
cd BibleLM
npm install

# 2. Env (optional for full power)
cp .env.example .env.local
# Add GROQ_API_KEY=...

# 3. Build data bundles (one-time or regenerate)
npm run build:data

# 4. Dev server (Edge + streaming)
npm run dev
# → http://localhost:3000
```

Lint + format:

```bash
npm run lint    # ESLint
npm run format  # Prettier
```

To deploy: fork or connect the repo to Vercel, add `GROQ_API_KEY` (optional) in Environment Variables, then deploy; Edge Functions auto-handle `/api/chat`.
See CONTRIBUTING.md
High-impact areas: UI polish, build-time embeddings, Redis caching, more translations, eval suite, PWA/offline.
May your forks stay faithful to the text. ✝️
MIT