Navanit-git/Agentic_conversational_rag

🤖 Conversational RAG System with Memory Management & LangGraph Orchestration

An end-to-end conversational Retrieval-Augmented Generation (RAG) stack that keeps multi-user context, manages memory across sessions, and orchestrates seven specialized LangGraph agents. The stack now ships with a working API, Streamlit UI, uv-powered Docker images, and Gemini-powered embeddings by default.


✨ Highlights

  • LangGraph orchestration with dedicated nodes for query understanding, rewriting, retrieval routing, context synthesis, memory management, summarization, and response shaping.
  • Persistent memory via SQLAlchemy (SQLite by default, bring your own PostgreSQL/MySQL via MEMORY_DB_URL).
  • Chroma vector store with Gemini text-embedding-004 embeddings—no local torch download required.
  • Document ingestion pipeline covering PDF (PyMuPDF), Markdown, and HTML with metadata extraction and chunk tracking.
  • Full-stack experience: FastAPI backend + Streamlit UI, both dockerized and wired together through Docker Compose.
  • Tracing-ready with LangSmith optional toggles per environment.
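The seven-node orchestration flow in the first bullet can be sketched as a chain of state-transforming steps. The node names below mirror the list above; the bodies are placeholders, since the real system wires these up as LangGraph nodes rather than plain functions:

```python
from functools import reduce


def node(name):
    """Build a placeholder node that records its name in the state's trace."""
    def step(state):
        state.setdefault("trace", []).append(name)
        return state
    return step


NODE_NAMES = (
    "query_understanding", "query_rewriting", "retrieval_routing",
    "context_synthesis", "memory_management", "summarization",
    "response_shaping",
)
PIPELINE = [node(n) for n in NODE_NAMES]


def run(state: dict) -> dict:
    # Thread the state through each node in sequence, like a linear graph.
    return reduce(lambda s, f: f(s), PIPELINE, state)
```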

✅ Current Status (Nov 2025)

| Component | Status | Notes |
| --- | --- | --- |
| FastAPI service | ✅ running on :8000 | Includes /health, /query, /documents/upload, /sessions, /sessions/{id}, /sessions/{id}/close |
| Streamlit UI | ✅ running on :8501 | Talks to the API over the internal Docker network |
| Docker images | ✅ docker-api + docker-ui | Built with Python 3.11-slim, uv, and cached .env defaults |
| Embeddings | ✅ Gemini text embeddings | Requires at least one GEMINI_API_KEY_* env var |
| PDF parsing | ✅ PyMuPDF installed | langchain_community loader now finds pymupdf |

🗂️ Repository Layout

etech_conversational_rag/
├── docker/                # Dockerfiles, compose stack, docs
├── sample_data/           # HTML, MD, PDF fixtures for ingestion demos
├── src/                   # FastAPI app, LangGraph orchestration, agents, RAG, memory, UI
├── tests/                 # Pytest suites for agents, graph, rag, api, memory
├── requirements.txt       # Shared runtime + dev dependencies
├── .env                   # Runtime secrets (Gemini, LangSmith, etc.)
└── docs/                 # System architecture and Swagger files

🔑 Environment Variables

Create a .env in the repo root (Docker copies it during the build). At minimum supply one Gemini key; LangSmith is optional but enabled by default.

# Gemini LLM + embedding keys (round-robin rotation supported)
GEMINI_API_KEY_0=AIza...
GEMINI_API_KEY_1=AIza...

# Optional LangSmith tracing
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_API_KEY=lsv2_...
LANGCHAIN_PROJECT=etech-conversational-rag-dev

# Memory + vector persistence (override for Postgres/MySQL)
MEMORY_DB_URL=sqlite:////app/data/memory/memory.db
CHROMA_PERSIST_DIR=/app/chroma_data

# UI ↔ API wiring
API_BASE_URL=http://api:8000

💡 Tip: To rotate keys automatically, add GEMINI_API_KEY_{0..N}. The built-in KeyManager handles round-robin, least-used, or random strategies.
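A minimal sketch of the round-robin strategy, assuming keys live in GEMINI_API_KEY_{0..N} environment variables. The class and method names are illustrative, not the repo's actual KeyManager API:

```python
import itertools
import os


class RoundRobinKeyManager:
    """Rotate through GEMINI_API_KEY_{0..N} env vars in order."""

    def __init__(self, prefix: str = "GEMINI_API_KEY_"):
        keys = []
        i = 0
        # Collect consecutive numbered keys until the first gap.
        while (key := os.environ.get(f"{prefix}{i}")) is not None:
            keys.append(key)
            i += 1
        if not keys:
            raise RuntimeError(f"No {prefix}* environment variables found")
        self._cycle = itertools.cycle(keys)

    def next_key(self) -> str:
        return next(self._cycle)
```

The real KeyManager also supports least-used and random strategies; a round-robin cycle is just the simplest of the three to illustrate.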


🐳 Quick Start (Docker-First)

  1. Clone & enter the repo

    git clone <repo-url>
    cd etech_conversational_rag
  2. Create .env (see snippet above). Secrets never ship in Git history.

  3. Launch the stack

    cd docker
    docker compose up --build
    • Uses uv for dependency installs (2-3× faster than pip).
    • Named volumes persist chat memory (memory_data), uploads, and Chroma vectors between runs.
  4. Visit the services

    | Service | URL | Purpose |
    | --- | --- | --- |
    | Streamlit UI | http://localhost:8501 | Upload docs, create sessions, chat |
    | FastAPI | http://localhost:8000 | Base API (see /health, /query, /documents/upload, /sessions) |
    | API Docs | http://localhost:8000/docs | Swagger UI |

  5. Stop everything with Ctrl+C, then docker compose down (add -v to wipe persisted data).

🧑‍💻 Local Development (no Docker)

  1. Python 3.11 virtual env

    python -m venv .venv
    source .venv/bin/activate
    pip install --upgrade pip
    pip install -r requirements.txt
  2. Env setup – copy .env or use python-dotenv CLI.

  3. Run the API

    uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --reload
  4. Run the Streamlit UI (new terminal, same env)

    streamlit run src/ui/streamlit_app.py --server.port 8501
  5. Optional: Switch MEMORY_DB_URL to a managed Postgres instance for multi-host deployments.


📄 Working with Documents

  • Use the Streamlit uploader or hit the API directly:
    curl -X POST http://localhost:8000/documents/upload \
      -F "file=@sample_data/md_files/sample_langchain_guide.md" \
      -F "document_type=markdown" \
      -F "document_name=langchain_guide" \
      -F 'metadata={"category":"docs","version":"1.0"}'
  • The ingestion pipeline:
    1. Validates + normalizes file metadata.
    2. Chunks text (configurable size/overlap).
    3. Generates Gemini embeddings.
    4. Persists chunks + metadata to Chroma.

Sample fixtures live in sample_data/{md_files,pdf_files,html_files} for quick demos.
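Step 2 of the pipeline (chunking with configurable size/overlap) can be sketched as a sliding character window. The defaults and function name below are illustrative, not the project's actual configuration:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-width character windows.

    Each chunk starts (size - overlap) characters after the previous one,
    so consecutive chunks share `overlap` characters of context.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```

Production chunkers usually split on token or sentence boundaries rather than raw characters, but the overlap bookkeeping is the same.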


💬 Querying the System

  • REST (non-streaming)
    curl -X POST http://localhost:8000/api/v1/query \
      -H "Content-Type: application/json" \
      -d '{
        "session_id": "session-abc",
        "user_id": "user-123",
        "query": "Summarize LangGraph routing",
        "include_metadata": true
      }'
  • Streaming endpoints are not yet available on this build; the Streamlit UI hides the toggle (STREAMING_SUPPORTED=False).
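The curl call above translates directly into a standard-library Python client; the endpoint path and payload shape are taken from that example, while the function names here are illustrative:

```python
import json
import urllib.request


def build_query_payload(session_id: str, user_id: str, query: str,
                        include_metadata: bool = True) -> bytes:
    """Serialize the request body used by the /api/v1/query endpoint."""
    return json.dumps({
        "session_id": session_id,
        "user_id": user_id,
        "query": query,
        "include_metadata": include_metadata,
    }).encode("utf-8")


def query_rag(base_url: str, session_id: str, user_id: str, query: str) -> dict:
    """POST a query and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/query",
        data=build_query_payload(session_id, user_id, query),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```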

🛠️ Testing & Quality Gates

# Run the focused suites that touch recent changes
pytest tests/rag/test_embeddings.py tests/rag/test_vector_store.py

# Or run the full suite with coverage
pytest tests -v --cov=src --cov-report=term-missing

All tests run against lightweight SQLite + in-memory stubs, so no external services are required.


🧱 Architecture Overview

┌────────────────────────────────────────────┐
│ Streamlit UI (chat, uploads)               │
└───────────────┬────────────────────────────┘
                │ REST (JSON)
┌───────────────▼────────────────────────────┐
│ FastAPI Backend (src/api)                  │
│ • Session + memory endpoints               │
│ • Document ingestion                       │
│ • Query API                                │
└───────────────┬────────────────────────────┘
                │ invokes LangGraph
┌───────────────▼────────────────────────────┐
│ LangGraph Orchestrator (7 nodes)           │
│ query → rewrite → route → synthesize →     │
│ memory → summarize → respond               │
└───────────────┬────────────────────────────┘
       ┌────────┴────────┬────────┐
       │                 │        │
┌──────▼──────┐  ┌───────▼──────┐ ┌───────────────┐
│Memory Store │  │Chroma Vector │ │Gemini LLM/Emb │
│(SQLite/SQL) │  │DB (persisted)│ │API (cloud)    │
└─────────────┘  └──────────────┘ └───────────────┘
  • Memory Store: SQLAlchemy models for conversations, summaries, and long-term facts. Default DSN is SQLite; override via MEMORY_DB_URL.
  • Vector Store: Chroma persisted under /app/chroma_data (Docker volume chroma_data).
  • Gemini: Used for both embeddings and generation; swap out by editing src/rag/embeddings.py & src/config/llm_config.py if needed.
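The shape of the memory store can be sketched with the standard-library sqlite3 module; the real project uses SQLAlchemy models and the MEMORY_DB_URL DSN, so the table and column names here are illustrative only:

```python
import sqlite3


def open_memory_db(url: str = ":memory:") -> sqlite3.Connection:
    """Open the store and ensure the conversation table exists."""
    conn = sqlite3.connect(url)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS conversation_turns (
               id INTEGER PRIMARY KEY,
               session_id TEXT,
               role TEXT,      -- "user" or "assistant"
               content TEXT
           )"""
    )
    return conn


def add_turn(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO conversation_turns (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def history(conn: sqlite3.Connection, session_id: str) -> list[tuple[str, str]]:
    """Return (role, content) pairs for one session, oldest first."""
    return conn.execute(
        "SELECT role, content FROM conversation_turns WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
```

Because both SQLite and PostgreSQL speak SQL through the same ORM layer, switching MEMORY_DB_URL is the only change needed to move this persistence off-box.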

🧰 Troubleshooting

| Symptom | Fix |
| --- | --- |
| ModuleNotFoundError: pymupdf | Already bundled in requirements.txt; re-run uv pip install -r requirements.txt or rebuild the Docker image. |
| 401 from Gemini | Ensure at least one GEMINI_API_KEY_* is valid and not rate-limited; keys rotate automatically but still obey quota. |
| LangSmith warnings | Set LANGCHAIN_TRACING_V2=false or remove LANGCHAIN_API_KEY from .env. |
| Slow Docker build | Cached layers depend on .env and requirements.txt; editing either reruns the uv install step, otherwise the cache is reused. |
