A Retrieval-Augmented Generation (RAG) chat experience built with a FastAPI backend and a React/Vite frontend. Users ask questions grounded in the project knowledge base (`RAG_doc.md`). Local sentence-transformer embeddings keep requests free and fast, while OpenAI embeddings remain available when needed.
- Full RAG pipeline: document chunking, hybrid embedding support, ChromaDB vector store, retrieval, prompt assembly, and OpenAI generation.
- Batch embedding generation with async endpoints so long indexing jobs never block the API.
- Detailed logging piped into responses so the UI can show a "thinking" trace.
- SQLite (default) or Postgres chat history storage via SQLAlchemy async sessions.
- Two-tier caching (backend LRU + frontend memoization) so repeated questions return instantly.
- Responsive chat interface that stays editable even when a response is pending.
- Automatic fallback to local embeddings whenever the configured OpenAI embedding model is unavailable.
- OpenAI-style chat surface with a bright theme, custom avatars, hover-only scrollbars, cache badges, and a Gemini-like composer pill.
- Sidebar history panel with client-side elastic-style search (token scoring + ranking) so users can instantly filter prior answers without new API calls.
- FastAPI + Uvicorn web service exposing `/chat`, `/health`, `/reindex`, and `/history`.
- `rag_service.py` loads `RAG_doc.md`, chunks it, builds embeddings (local or OpenAI), retrieves the top-`k` matches, and prompts OpenAI for grounded answers.
- `vector_store.py` wraps a persistent ChromaDB collection stored under `backend/chroma_db`.
- `database.py`/`models.py` define the async SQLAlchemy engine and the `ChatHistory` model (defaults to `sqlite+aiosqlite:///./chat_history.db`).
- `.env` controls OpenAI keys, embedding providers, DB URLs, etc. (`.env.example` documents every flag).
- React 18 app bootstrapped with Vite.
- `src/components/Chat.jsx` handles the conversation flow, optimistic user messages, the streaming-style typing indicator, and the log viewer.
- Axios is used for all HTTP calls; CSS modules keep styling isolated.
- A local cache short-circuits repeated user prompts, and the UI uses hover-only scrollbars, custom avatars, and a bright neutral palette inspired by ChatGPT.
- Python 3.10+
- Node.js 18+ (with npm)
- (Optional) OpenAI API key for generation/embeddings when not using local mode
- Git Bash / WSL / PowerShell for running scripts on Windows
```bash
cd backend
python -m venv venv
venv\Scripts\activate       # Windows
# source venv/bin/activate  # macOS/Linux
pip install -r requirements.txt
```
```bash
cp .env.example .env
```

Populate `.env`:

```env
OPENAI_API_KEY=sk-your-key
USE_LOCAL_EMBEDDINGS=true    # default; set false to force OpenAI embeddings
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_MODEL=all-MiniLM-L6-v2    # local sentence-transformer slug
DATABASE_URL=sqlite+aiosqlite:///./chat_history.db
OPENAI_MODEL=gpt-4o-mini
RESPONSE_CACHE_SIZE=20    # in-memory cache entries for repeated questions
```

Leave `USE_LOCAL_EMBEDDINGS=true` if you do not want to spend API credits. When it is `false`, make sure `OPENAI_EMBEDDING_MODEL` is one your OpenAI project can access; otherwise the backend automatically drops back to the local model.
- SQLite (default): Ready out of the box (stored at `backend/chat_history.db`). Works great for local development.
- Postgres: Set `DATABASE_URL=postgresql+asyncpg://user:pass@host:port/dbname`. If your password includes special characters (`@`, `&`, `^`, etc.), percent-encode them. Create the database manually (`CREATE DATABASE rag_chat;`) before launching the API.
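The percent-encoding can be done with the standard library; the password below is a made-up example:

```python
# Build a safe asyncpg URL from a password containing special characters.
from urllib.parse import quote_plus

password = "p@ss&word^1"         # example credential only
encoded = quote_plus(password)   # -> "p%40ss%26word%5E1"
db_url = f"postgresql+asyncpg://rag_user:{encoded}@localhost:5432/rag_chat"
print(db_url)
```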
```bash
cd frontend
npm install
```

```bash
./scripts/start.sh
```

Starts the backend on http://localhost:8000 and the Vite dev server on http://localhost:3000.
```bash
# Terminal 1
cd backend
venv\Scripts\activate
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

```bash
# Terminal 2
cd frontend
npm run dev
```

- Local (default): Uses `sentence-transformers` (`all-MiniLM-L6-v2`). No API calls or cost. Requires the Python dependencies listed in `requirements.txt`.
- OpenAI: Set `USE_LOCAL_EMBEDDINGS=false` and provide a supported `OPENAI_EMBEDDING_MODEL`. Handy when you want higher-quality embeddings. If OpenAI returns `403 model_not_found`, the backend logs the failure, switches back to the local model, and continues serving queries.
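The fallback described above can be sketched roughly like this; the function names and exception handling are assumptions, not the repo's actual code:

```python
# Choose an embedding provider, falling back to local on OpenAI failure.
import os

def embed_texts(texts, openai_embed, local_embed):
    """openai_embed / local_embed are callables: list[str] -> list[list[float]]."""
    if os.getenv("USE_LOCAL_EMBEDDINGS", "true").lower() == "true":
        return local_embed(texts)
    try:
        return openai_embed(texts)
    except Exception as exc:  # e.g. a 403 model_not_found error
        print(f"OpenAI embeddings failed ({exc}); falling back to local model")
        return local_embed(texts)
```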
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Health message. |
| GET | `/health` | Returns RAG readiness (chunks loaded, vector DB ready, API key present). |
| GET | `/history` | Latest 50 chat exchanges from the async SQL DB (the frontend adds elastic-style search locally). |
| POST | `/chat` | Runs the full RAG pipeline for the provided message. |
| POST | `/reindex` | Clears ChromaDB and rebuilds embeddings from `RAG_doc.md`. |

Requests to `/chat` expect `{ "message": "..." }` and respond with `{ "response": "...", "logs": { "steps": [...] } }`.
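A request can be issued with nothing but the standard library (assuming the backend is running on port 8000):

```python
# POST a question to /chat and decode the JSON reply.
import json
import urllib.request

def ask(message: str, base_url: str = "http://localhost:8000") -> dict:
    payload = json.dumps({"message": message}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # {"response": "...", "logs": {"steps": [...]}}
```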
- Load knowledge: Read `RAG_doc.md` and split it into overlapping ~1000-character chunks.
- Embed: Generate embeddings in batches via sentence-transformers or OpenAI.
- Persist: Save embeddings + metadata in ChromaDB for instant lookups.
- Retrieve: Embed incoming queries and fetch the top-`k` similar chunks.
- Augment prompt: Build a contextual prompt that clearly separates the context from the user question.
- Generate: Call OpenAI's chat completions API (default `gpt-4o-mini`) and log each step.
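The chunking step can be sketched as a sliding window; the exact size and overlap values used in the repo may differ:

```python
# Split text into ~1000-character chunks with a 200-character overlap so
# sentences near a boundary land in two adjacent chunks.
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```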
- Edit `RAG_doc.md` (keep it concise and factual for better retrieval).
- Restart the backend or call `POST /reindex` to wipe and rebuild the ChromaDB collection.
- Wait for the "Knowledge base initialization complete" log before sending new questions.

Tip: the UI does not reindex automatically. Use a curl command (`curl -X POST http://localhost:8000/reindex`) or add a temporary button when needed.
- `ModuleNotFoundError: sentence-transformers`: run `pip install -r backend/requirements.txt` inside the backend venv.
- OpenAI 403 / `model_not_found`: either switch `USE_LOCAL_EMBEDDINGS=true` or configure `OPENAI_EMBEDDING_MODEL` to a permitted model; the service falls back automatically but logs a warning.
- Port conflicts: stop existing services or edit the port numbers in `scripts/start.sh`.
- Frontend build errors: run `npm install` from `frontend/`, then `npm run build` or `npm run dev`.
- Cache disabled/too small: set `RESPONSE_CACHE_SIZE` in `.env` (0 disables caching; higher values improve the hit rate for repeated questions at the cost of RAM).
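The `RESPONSE_CACHE_SIZE` behavior can be modeled with a small LRU built on `OrderedDict`; this is an illustrative sketch, not the backend's actual class:

```python
# Bounded LRU response cache: evicts the least-recently-used entry once the
# limit is reached; a size of 0 disables caching entirely.
from collections import OrderedDict

class ResponseCache:
    def __init__(self, max_size: int):
        self.max_size = max_size
        self._data: OrderedDict[str, str] = OrderedDict()

    def get(self, question: str):
        if question in self._data:
            self._data.move_to_end(question)  # mark as recently used
            return self._data[question]
        return None

    def put(self, question: str, answer: str) -> None:
        if self.max_size <= 0:  # RESPONSE_CACHE_SIZE=0 disables caching
            return
        self._data[question] = answer
        self._data.move_to_end(question)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least-recently-used
```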
- `DESIGN.md` – design tokens, layout rules, and UI conventions (composer sizing, sidebar behavior, etc.).
- `ERRORS.md` – mapping between backend error strings and the friendly text shown in the UI.
- `IMPLEMENTATION_PLAN.md` – rolling backlog and open technical tasks.
- `FOLDER_STRUCTURE.md` – quick refresher on the repo layout.
Demo project for showcasing RAG patterns. Use internally or extend at your own risk.