A document Q&A chatbot that lets you ask questions about your documents using two different approaches: Document Injection and Retrieval-Augmented Generation (RAG). Compare both methods side-by-side through an intuitive Gradio web interface with real-time token usage tracking.
- Dual Chat Modes – Switch between Document Injection and Full RAG to compare approaches
- Document Upload – Upload `.txt`, `.md`, and `.pdf` files, or use default documents
- Real-time Token Tracking – Monitor cumulative input/output/total token usage per session
- Vector Similarity Search – RAG mode uses HuggingFace embeddings with an in-memory vector store
- Conversation History – Multi-turn conversations with full context retention
- Shareable Link – Gradio generates a public URL for remote access
- Standalone CLI Tools – Run RAG or injection modes from the command line
flowchart TB
subgraph UI["Gradio Web UI"]
Upload["📁 Upload Documents"]
ModeSelect["🔀 Mode Selector"]
Chat["💬 Chat Interface"]
Tokens["📊 Token Stats"]
end
subgraph Injection["Document Injection Mode"]
AllDocs["All Documents"] --> SysPrompt["System Prompt"]
end
subgraph RAG["Full RAG Mode"]
Embed["HuggingFace Embeddings<br/><i>all-MiniLM-L6-v2</i>"]
VecStore["In-Memory Vector Store"]
SimSearch["Similarity Search<br/><i>k=2</i>"]
Embed --> VecStore --> SimSearch --> RAGPrompt["System Prompt"]
end
subgraph LLM["Language Model"]
OpenAI["OpenAI GPT-4o-mini"]
end
Upload --> |".txt / .md / .pdf"| Loader["Document Loader"]
Loader --> Injection
Loader --> RAG
ModeSelect --> |"Document Injection"| Injection
ModeSelect --> |"Full RAG"| RAG
SysPrompt --> OpenAI
RAGPrompt --> OpenAI
OpenAI --> Chat
OpenAI --> Tokens
| | Document Injection | Full RAG |
|---|---|---|
| Approach | Injects all document content into the system prompt | Embeds documents into vectors, retrieves only the 2 most relevant chunks |
| Pros | Simple, complete context available | Token-efficient, scales to larger document sets |
| Cons | Token-heavy, limited by context window | May miss relevant info if not in top-k results |
| Best for | Small document sets | Larger document collections |
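The two prompt-construction strategies in the table can be sketched in plain Python. This is illustrative only: the real app builds prompts with LangChain messages and dense embeddings, and the `build_injection_prompt`/`build_rag_prompt` helpers (with word-overlap scoring standing in for vector similarity) are hypothetical names, not functions from the codebase.

```python
def build_injection_prompt(documents: dict[str, str]) -> str:
    """Document Injection: every document goes into the system prompt."""
    corpus = "\n\n".join(f"## {path}\n{text}" for path, text in documents.items())
    return f"Answer using only these documents:\n\n{corpus}"


def build_rag_prompt(documents: dict[str, str], question: str, k: int = 2) -> str:
    """Full RAG (toy version): keep only the k documents that share the most
    words with the question. The real app scores with vector embeddings."""
    q_words = set(question.lower().split())
    scored = sorted(
        documents.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    corpus = "\n\n".join(f"## {path}\n{text}" for path, text in scored[:k])
    return f"Answer using only these documents:\n\n{corpus}"


docs = {
    "a.txt": "Rabbit Company sells carrots and garden tools.",
    "b.txt": "Serena Company builds solar panels.",
    "c.txt": "An unrelated note about office parking.",
}
prompt = build_rag_prompt(docs, "Who sells carrots", k=1)  # keeps only a.txt
```

The trade-off from the table falls out directly: the injection prompt grows linearly with the corpus, while the RAG prompt stays bounded by `k` regardless of how many documents are loaded.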
| Library | Version | Purpose |
|---|---|---|
| LangChain | ≥ 0.3.0 | LLM orchestration framework |
| langchain-openai | ≥ 0.2.0 | OpenAI GPT integration |
| langchain-huggingface | ≥ 0.1.0 | HuggingFace embedding models |
| langchain-community | ≥ 0.3.0 | Community document loaders |
| langchain-google-genai | ≥ 2.0.0 | Google Gemini integration (optional) |
| Gradio | ≥ 5.0.0 | Web UI framework |
| pypdf | ≥ 4.0.0 | PDF text extraction |
| sentence-transformers | ≥ 3.0.0 | Embedding model runtime |
| python-dotenv | ≥ 1.0.0 | Environment variable management |
Rag_System_LangChain/
├── app.py                          # Main application – Gradio web UI with both modes
├── full_rag_system.py              # Standalone CLI – RAG mode with directory loaders
├── context_injection.py            # Standalone CLI – Document injection mode
├── documents_injection_system.py   # Utility – Recursive document loader (.txt/.md/.pdf)
├── requirements.txt                # Python dependencies
├── common-sections-template.md     # Sample markdown template
├── .env                            # API keys (not committed)
└── docs/                           # Default document directory
    ├── Rabbit Company Sample.txt   # Sample company data
    └── Serena Company Sample.txt   # Sample company data
The primary entrypoint with the full Gradio web interface. Implements both chat modes in a single application.
| Function | Description |
|---|---|
| `load_file_content(file_path)` | Reads `.txt`, `.md`, or `.pdf` files and returns text content |
| `load_default_documents()` | Loads all documents from the `docs/` folder |
| `load_uploaded_documents(files)` | Loads documents from user-uploaded files |
| `get_documents(uploaded_files)` | Returns uploaded docs (priority) or defaults |
| `create_vector_store(documents)` | Creates an InMemoryVectorStore with HuggingFace embeddings |
| `injection_chat(message, history, documents)` | Handles Document Injection mode – all docs in the system prompt |
| `rag_chat(message, history, documents)` | Handles RAG mode – similarity search → relevant chunks only |
| `chat(message, history, mode, uploaded_files)` | Routes to the appropriate chat mode |
Command-line RAG chatbot using LangChain's PyPDFDirectoryLoader and DirectoryLoader. Loads all documents from docs/, builds a vector store, and enters an interactive Q&A loop.
Standalone utility that recursively reads all .txt, .md, and .pdf files from a folder using pypdf and pathlib. Returns a list of {filepath: content} dictionaries.
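A loader along those lines can be sketched as below. This is a hedged stand-in rather than the project's actual code: `load_documents` is a hypothetical name, and the pypdf branch is guarded so the sketch still runs when pypdf is not installed.

```python
from pathlib import Path


def load_documents(folder: str) -> list[dict[str, str]]:
    """Recursively read .txt/.md/.pdf files into {filepath: content} dicts."""
    results = []
    for path in sorted(Path(folder).rglob("*")):
        if path.suffix == ".pdf":
            try:
                from pypdf import PdfReader  # optional dependency
            except ImportError:
                continue  # skip PDFs if pypdf is unavailable
            text = "\n".join(
                page.extract_text() or "" for page in PdfReader(path).pages
            )
        elif path.suffix in {".txt", ".md"}:
            text = path.read_text(encoding="utf-8")
        else:
            continue  # ignore directories and unsupported extensions
        results.append({str(path): text})
    return results
```

Returning one small dict per file (rather than one big dict) keeps the filepath alongside each document, which is handy when labelling chunks in the system prompt.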
- Python 3.10+
- An OpenAI API key
# Clone the repository
git clone https://github.com/YOUR_USERNAME/Rag_System_LangChain.git
cd Rag_System_LangChain
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # macOS/Linux
# .venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Create environment file
cp .env.example .env
# Edit .env and add your OpenAI API key

Create a `.env` file in the project root:
OPENAI_API_KEY=sk-your-api-key-here
# Optional: for Google Gemini support
# GOOGLE_API_KEY=your-google-api-key

python app.py

This launches the Gradio interface and prints a local URL (and a public share link). Open it in your browser to:
- Select a mode – Choose between "Document Injection" or "Full RAG"
- Load documents – Upload files or click "Default" to use the `docs/` folder
- Ask questions – Type questions about your documents in the chat
- Monitor tokens – Track cumulative token usage in the stats bar below the chat
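The cumulative token stats amount to summing the usage each model response reports. Here is a minimal stdlib sketch; the `TokenTally` class and its wiring are hypothetical, though the `input_tokens`/`output_tokens`/`total_tokens` keys mirror the `usage_metadata` dict LangChain attaches to `AIMessage` responses.

```python
class TokenTally:
    """Accumulates per-session token usage across chat turns."""

    def __init__(self):
        self.input_tokens = 0
        self.output_tokens = 0

    def add(self, usage: dict) -> None:
        # `usage` mirrors LangChain's response.usage_metadata, e.g.
        # {"input_tokens": 120, "output_tokens": 45, "total_tokens": 165}
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    def stats(self) -> str:
        """Render the stats bar shown below the chat."""
        return (f"Input: {self.input_tokens} | "
                f"Output: {self.output_tokens} | Total: {self.total_tokens}")


tally = TokenTally()
tally.add({"input_tokens": 120, "output_tokens": 45, "total_tokens": 165})
tally.add({"input_tokens": 200, "output_tokens": 60, "total_tokens": 260})
```

Because the tally is cumulative, it makes the cost difference between the two modes visible within a single session: injection turns grow the input count much faster than RAG turns.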
python full_rag_system.py

Interactive command-line chatbot using RAG with vector similarity search. Loads all documents from docs/ automatically.
Place your .txt, .md, or .pdf files in the docs/ folder, or upload them directly through the web UI.
In app.py, modify the model initialization:
# Default
llm = ChatOpenAI(model='gpt-4o-mini')
# Use a different model
llm = ChatOpenAI(model='gpt-4o')        # More capable, higher cost
llm = ChatOpenAI(model='gpt-3.5-turbo') # Lower cost

In the rag_chat() function, adjust the number of retrieved chunks:
# Retrieve more context (default: k=2)
retrieved_docs = vector_store.similarity_search(message, k=4)

Uncomment the Gemini configuration in full_rag_system.py and set GOOGLE_API_KEY in your .env:
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

Contributions are welcome! Here's how to get started:
- Fork the repository
- Create a branch for your feature or fix:
git checkout -b feature/my-feature
- Make your changes and ensure they work
- Commit with a clear message:
git commit -m "Add: description of your change"
- Push and open a Pull Request:
git push origin feature/my-feature
- Add persistent vector store support (ChromaDB, FAISS)
- Implement document chunking for large files
- Add support for more file formats (`.docx`, `.csv`)
- Build a conversation export feature
- Add streaming responses
This project is licensed under the MIT License.