An end-to-end conversational Retrieval-Augmented Generation (RAG) stack that keeps multi-user context, manages memory across sessions, and orchestrates seven specialized LangGraph agents. The stack now ships with a working API, Streamlit UI, uv-powered Docker images, and Gemini-powered embeddings by default.
- LangGraph orchestration with dedicated nodes for query understanding, rewriting, retrieval routing, context synthesis, memory management, summarization, and response shaping.
- Persistent memory via SQLAlchemy (SQLite by default; bring your own PostgreSQL/MySQL via `MEMORY_DB_URL`).
- Chroma vector store with Gemini `text-embedding-004` embeddings; no local `torch` download required.
- Document ingestion pipeline covering PDF (PyMuPDF), Markdown, and HTML with metadata extraction and chunk tracking.
- Full-stack experience: FastAPI backend + Streamlit UI, both dockerized and wired together through Docker Compose.
- Tracing-ready with LangSmith optional toggles per environment.
| Component | Status | Notes |
|---|---|---|
| FastAPI service | ✅ running on :8000 | Includes `/health`, `/query`, `/documents/upload`, `/sessions`, `/sessions/{id}`, `/sessions/{id}/close` |
| Streamlit UI | ✅ running on :8501 | Talks to the API over the internal Docker network |
| Docker images | ✅ docker-api + docker-ui | Built with Python 3.11-slim, uv, and cached `.env` defaults |
| Embeddings | ✅ Gemini text embeddings | Requires at least one `GEMINI_API_KEY_*` env var |
| PDF parsing | ✅ PyMuPDF installed | `langchain_community` loader now finds `pymupdf` |
```
etech_conversational_rag/
├── docker/            # Dockerfiles, compose stack, docs
├── sample_data/       # HTML, MD, PDF fixtures for ingestion demos
├── src/               # FastAPI app, LangGraph orchestration, agents, RAG, memory, UI
├── tests/             # Pytest suites for agents, graph, rag, api, memory
├── requirements.txt   # Shared runtime + dev dependencies
├── .env               # Runtime secrets (Gemini, LangSmith, etc.)
└── docs/              # System architecture and swagger files
```
Create a `.env` in the repo root (Docker copies it during the build). At minimum, supply one Gemini key; LangSmith is optional but enabled by default.
```env
# Gemini LLM + embedding keys (round-robin rotation supported)
GEMINI_API_KEY_0=AIza...
GEMINI_API_KEY_1=AIza...

# Optional LangSmith tracing
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_API_KEY=lsv2_...
LANGCHAIN_PROJECT=etech-conversational-rag-dev

# Memory + vector persistence (override for Postgres/MySQL)
MEMORY_DB_URL=sqlite:////app/data/memory/memory.db
CHROMA_PERSIST_DIR=/app/chroma_data

# UI ↔ API wiring
API_BASE_URL=http://api:8000
```

💡 Tip: To rotate keys automatically, add `GEMINI_API_KEY_{0..N}`. The built-in KeyManager handles round-robin, least-used, or random strategies.
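A minimal sketch of how round-robin rotation over `GEMINI_API_KEY_*` variables could work. The class name and env-var parsing below are illustrative, not the project's actual KeyManager implementation:

```python
import itertools
import os


class RoundRobinKeys:
    """Illustrative round-robin rotation over GEMINI_API_KEY_* env vars."""

    def __init__(self, env=os.environ):
        # Collect GEMINI_API_KEY_* entries in name order
        # (lexicographic; fine for indices 0-9 in a sketch).
        keys = sorted(
            (name, value)
            for name, value in env.items()
            if name.startswith("GEMINI_API_KEY_")
        )
        if not keys:
            raise RuntimeError("set at least one GEMINI_API_KEY_* variable")
        self._cycle = itertools.cycle(value for _, value in keys)

    def next_key(self) -> str:
        # Each call hands out the next key, spreading quota across keys.
        return next(self._cycle)
```

Each call to `next_key()` returns the next key in sequence and wraps around, so per-key quota is consumed evenly.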
1. Clone & enter the repo

   ```bash
   git clone <repo-url>
   cd etech_conversational_rag
   ```

2. Create `.env` (see snippet above). Secrets never ship in Git history.

3. Launch the stack

   ```bash
   cd docker
   docker compose up --build
   ```

   - Uses uv for dependency installs (2-3× faster than pip).
   - Named volumes persist chat memory (`memory_data`), uploads, and Chroma vectors between runs.

4. Visit the services

   | Service | URL | Purpose |
   |---|---|---|
   | Streamlit UI | http://localhost:8501 | Upload docs, create sessions, chat |
   | FastAPI | http://localhost:8000 | Base API (see `/health`, `/query`, `/documents/upload`, `/sessions`) |
   | API Docs | http://localhost:8000/docs | Swagger UI |

5. Stop everything with `Ctrl+C`, then `docker compose down` (add `-v` to wipe persisted data).
1. Create a Python 3.11 virtual env

   ```bash
   python -m venv .venv
   source .venv/bin/activate
   pip install --upgrade pip
   pip install -r requirements.txt
   ```

2. Env setup: copy `.env` or use the `python-dotenv` CLI.

3. Run the API

   ```bash
   uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --reload
   ```

4. Run the Streamlit UI (new terminal, same env)

   ```bash
   streamlit run src/ui/streamlit_app.py --server.port 8501
   ```

5. Optional: switch `MEMORY_DB_URL` to a managed Postgres instance for multi-host deployments.
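For example, an illustrative Postgres DSN for `MEMORY_DB_URL` (hostname, credentials, and database name are placeholders, not values the project ships with):

```env
MEMORY_DB_URL=postgresql+psycopg2://rag_user:rag_pass@db.internal:5432/rag_memory
```

SQLAlchemy DSNs follow the `dialect+driver://user:password@host:port/database` form, so the same variable also accepts a MySQL URL.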
- Use the Streamlit uploader or hit the API directly:

  ```bash
  curl -X POST http://localhost:8000/documents/upload \
    -F "file=@sample_data/md_files/sample_langchain_guide.md" \
    -F "document_type=markdown" \
    -F "document_name=langchain_guide" \
    -F 'metadata={"category":"docs","version":"1.0"}'
  ```
- The ingestion pipeline:
- Validates + normalizes file metadata.
- Chunks text (configurable size/overlap).
- Generates Gemini embeddings.
- Persists chunks + metadata to Chroma.
Sample fixtures live in sample_data/{md_files,pdf_files,html_files} for quick demos.
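The chunking step above can be sketched as a simple sliding window. The function name, default size, and overlap values here are illustrative, not the pipeline's actual configuration:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks where each chunk overlaps its
    predecessor by `overlap` characters, preserving context at boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # last window already reached the end of the text
    return chunks
```

Each chunk (plus its metadata) is then embedded and written to Chroma; the overlap keeps sentences that straddle a boundary retrievable from either neighbor.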
- REST (non-streaming)

  ```bash
  curl -X POST http://localhost:8000/api/v1/query \
    -H "Content-Type: application/json" \
    -d '{
      "session_id": "session-abc",
      "user_id": "user-123",
      "query": "Summarize LangGraph routing",
      "include_metadata": true
    }'
  ```
- Streaming endpoints are not yet available on this build; the Streamlit UI hides the toggle (`STREAMING_SUPPORTED=False`).
```bash
# Run the focused suites that touch recent changes
pytest tests/rag/test_embeddings.py tests/rag/test_vector_store.py

# Or run the full suite with coverage
pytest tests -v --cov=src --cov-report=term-missing
```

All tests run against lightweight SQLite + in-memory stubs, so no external services are required.
```
┌────────────────────────────────────────────┐
│        Streamlit UI (chat, uploads)        │
└───────────────┬────────────────────────────┘
                │ REST (JSON)
┌───────────────▼────────────────────────────┐
│        FastAPI Backend (src/api)           │
│  • Session + memory endpoints              │
│  • Document ingestion                      │
│  • Query API                               │
└───────────────┬────────────────────────────┘
                │ invokes LangGraph
┌───────────────▼────────────────────────────┐
│   LangGraph Orchestrator (7 nodes)         │
│   query → rewrite → route → synthesize →   │
│   memory → summarize → respond             │
└───────────────┬────────────────────────────┘
       ┌────────┴────────┬────────┐
       │                 │        │
┌──────▼──────┐  ┌───────▼──────┐ ┌───────────────┐
│Memory Store │  │Chroma Vector │ │Gemini LLM/Emb │
│(SQLite/SQL) │  │DB (persisted)│ │API (cloud)    │
└─────────────┘  └──────────────┘ └───────────────┘
```
- Memory Store: SQLAlchemy models for conversations, summaries, and long-term facts. Default DSN is SQLite; override via `MEMORY_DB_URL`.
- Vector Store: Chroma persisted under `/app/chroma_data` (Docker volume `chroma_data`).
- Gemini: Used for both embeddings and generation; swap out by editing `src/rag/embeddings.py` and `src/config/llm_config.py` if needed.
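To make the memory store concrete, here is a hedged sketch of the kind of conversation table it persists, written against the stdlib `sqlite3` for brevity. The project itself defines SQLAlchemy models, and the table and column names below are assumptions, not the real schema:

```python
import sqlite3

# Illustrative schema only; the real models are SQLAlchemy classes and
# the table/column names here are assumptions for the sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS conversations (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    user_id    TEXT NOT NULL,
    role       TEXT NOT NULL,   -- 'user' or 'assistant'
    content    TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def recent_messages(conn: sqlite3.Connection, session_id: str, limit: int = 10):
    """Fetch the most recent turns for a session, returned oldest-first."""
    rows = conn.execute(
        "SELECT role, content FROM conversations "
        "WHERE session_id = ? ORDER BY id DESC LIMIT ?",
        (session_id, limit),
    ).fetchall()
    return list(reversed(rows))
```

A query like `recent_messages` is what lets the orchestrator's memory node rebuild per-session context before summarization.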
| Symptom | Fix |
|---|---|
| `ModuleNotFoundError: pymupdf` | Already bundled in `requirements.txt`; re-run `uv pip install -r requirements.txt` or rebuild the Docker image. |
| 401 from Gemini | Ensure at least one `GEMINI_API_KEY_*` is valid and not rate-limited; keys rotate automatically but still obey quota. |
| LangSmith warnings | Set `LANGCHAIN_TRACING_V2=false` or remove `LANGCHAIN_API_KEY` in `.env`. |
| Docker build slow | Cached layers depend on `.env` + `requirements.txt`. After editing those, the uv install step reruns; otherwise it reuses the cache. |