This project implements a Retrieval-Augmented Generation (RAG) pipeline that allows users to upload PDF files and chat with their content through a conversational interface.
It maintains chat history, reformulates follow-up questions contextually, and retrieves accurate answers using semantic search and LLM reasoning.
- 📄 Upload Multiple PDFs: Ingest and process multiple documents at once.
- 🧩 Automatic Chunking: Splits long text into overlapping chunks for better context handling.
- 🔍 Context-Aware Retrieval: Retrieves the most relevant document chunks using semantic similarity.
- 💬 Conversational Memory: Maintains session-based chat history for coherent multi-turn conversations.
- ⚙️ RAG Pipeline: Combines retriever + LLM for accurate and context-grounded answers.
- 🔑 Groq LLM Integration: Uses the `llama-3.1-8b-instant` model for fast, high-quality responses.
- 🧠 Standalone Question Rewriting: Reformulates follow-up questions into self-contained ones for improved retrieval (see the sketch after this list).
- 🎨 Streamlit UI: Simple and interactive interface for chatting with PDFs.
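The question-rewriting feature can be illustrated with a minimal sketch built on LangChain's history-aware retriever (the stack listed in the table below). The prompt wording and the `retriever` variable are assumptions for the example, not the exact contents of `app.py`:

```python
# Sketch: turn a follow-up question into a standalone one before retrieval.
# Assumes `retriever` is an existing vector-store retriever
# (e.g. vectorstore.as_retriever(), as sketched under the tech stack below).
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_groq import ChatGroq

llm = ChatGroq(model="llama-3.1-8b-instant")

# Hypothetical reformulation prompt: rewrite the question so it stands alone.
contextualize_prompt = ChatPromptTemplate.from_messages([
    ("system",
     "Given the chat history and the latest user question, rewrite the question "
     "so it can be understood without the history. Do not answer it."),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

# The wrapped retriever rewrites the query first, then fetches relevant chunks.
history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_prompt)
```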
| Component | Technology |
|---|---|
| Framework | Streamlit |
| LLM | Groq API – llama-3.1-8b-instant |
| Vector Store | Chroma |
| Embeddings | Hugging Face – all-MiniLM-L6-v2 |
| Document Loader | PyPDFLoader (LangChain) |
| Memory | ChatMessageHistory (LangChain) |
| Environment | Python 3.10+, dotenv |
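As a rough illustration of how these components fit together, the sketch below instantiates each one. The model names come straight from the table; the variable names and the `./chroma_db` path are assumptions for the example:

```python
# Sketch: instantiating the stack listed above.
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_groq import ChatGroq

# Embeddings: all-MiniLM-L6-v2 via Hugging Face (uses HF_TOKEN from the environment).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Vector store: Chroma, persisted locally (the path is an assumption).
vectorstore = Chroma(embedding_function=embeddings, persist_directory="./chroma_db")

# LLM: Groq-hosted llama-3.1-8b-instant (reads GROQ_API_KEY from the environment).
llm = ChatGroq(model="llama-3.1-8b-instant")
```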
```
📦 Conversational-RAG
│
├── app.py              # Main Streamlit app
├── requirements.txt    # Dependencies
├── .env                # Environment variables (HF_TOKEN)
└── README.md           # Project documentation
```

If you want to run your own copy of this project:
- **Fork the Repository**
  - Go to the GitHub repository: 👉 https://github.com/avanigupta06/ChatDoc-AI
  - Click the “Fork” button (top-right corner) to create your own copy.
- **Clone Your Fork**
  ```bash
  git clone https://github.com/<your-github-username>/ChatDoc-AI.git
  cd ChatDoc-AI
  ```
- **Create a Virtual Environment**
  ```bash
  python -m venv venv
  venv\Scripts\activate     # On Windows
  source venv/bin/activate  # On macOS/Linux
  ```
- **Install Dependencies**
  ```bash
  pip install -r requirements.txt
  ```
- **Set Up Environment Variables**
  Create a `.env` file in the project root and add (a loading sketch follows these steps):
  ```
  HF_TOKEN=your_huggingface_api_token
  ```
- **Run the App**
  ```bash
  streamlit run app.py
  ```
- **Open in Browser**
  Visit the local URL shown in the terminal (usually http://localhost:8501).
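For reference, here is a minimal sketch of how a `.env` file is typically loaded with `python-dotenv` (the exact code in `app.py` may differ):

```python
# Sketch: loading HF_TOKEN from .env at startup (standard python-dotenv pattern).
import os
from dotenv import load_dotenv

load_dotenv()                     # reads key=value pairs from .env into os.environ
hf_token = os.getenv("HF_TOKEN")  # token used by the Hugging Face embeddings
```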
You’ll need two keys to run this project:
| API | Purpose | Where to Get It |
|---|---|---|
| Hugging Face Token | Embeddings (`all-MiniLM-L6-v2`) | https://huggingface.co/settings/tokens |
| Groq API Key | Access to the Llama 3.1 model | https://console.groq.com/keys |
You can input the Groq API Key directly in the Streamlit interface when prompted.
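A minimal sketch of that prompt, assuming Streamlit's standard `text_input` widget (the exact label in `app.py` may differ):

```python
# Sketch: collecting the Groq API key in the Streamlit UI.
import streamlit as st
from langchain_groq import ChatGroq

groq_api_key = st.text_input("Groq API Key", type="password")
if groq_api_key:
    # Pass the key explicitly instead of relying on the GROQ_API_KEY env var.
    llm = ChatGroq(model="llama-3.1-8b-instant", groq_api_key=groq_api_key)
```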
- Upload PDFs → PDFs are parsed into text using `PyPDFLoader`.
- Chunk Text → Long text is split into 5000-character chunks with a 500-character overlap.
- Generate Embeddings → Each chunk is converted into a numerical vector using Hugging Face embeddings.
- Store in Chroma DB → The embeddings are stored for semantic retrieval.
- Ask a Question → The LLM reformulates the question if needed and retrieves relevant chunks.
- Answer Generation → Llama-3.1 uses retrieved context to generate concise, grounded answers.
- Chat Memory → Conversation history is maintained per session for contextual continuity.
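Putting the steps above together, a condensed end-to-end sketch might look like the following. It mirrors the described flow (loader, 5000/500 splitter, Chroma, question rewriting, grounded answering, per-session memory), but the variable names, prompts, and the `sample.pdf` path are illustrative assumptions rather than the exact contents of `app.py`:

```python
# Sketch of the full pipeline described above (names and prompts are assumptions).
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_groq import ChatGroq
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# 1-2. Parse a PDF and split it into 5000-character chunks with 500 overlap.
docs = PyPDFLoader("sample.pdf").load()  # "sample.pdf" is a placeholder path
chunks = RecursiveCharacterTextSplitter(chunk_size=5000, chunk_overlap=500).split_documents(docs)

# 3-4. Embed each chunk and store the vectors in Chroma for semantic retrieval.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
retriever = Chroma.from_documents(chunks, embeddings).as_retriever()

# 5. Reformulate follow-up questions into standalone ones before retrieval.
llm = ChatGroq(model="llama-3.1-8b-instant")
contextualize_prompt = ChatPromptTemplate.from_messages([
    ("system", "Rewrite the latest user question as a standalone question, using the chat history."),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_prompt)

# 6. Generate concise answers grounded in the retrieved context.
qa_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer concisely using only the retrieved context:\n\n{context}"),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
rag_chain = create_retrieval_chain(history_aware_retriever, create_stuff_documents_chain(llm, qa_prompt))

# 7. Keep chat history per session so follow-up questions stay coherent.
store = {}
def get_history(session_id):
    return store.setdefault(session_id, ChatMessageHistory())

conversational_rag = RunnableWithMessageHistory(
    rag_chain,
    get_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)

result = conversational_rag.invoke(
    {"input": "What is this document about?"},
    config={"configurable": {"session_id": "demo"}},
)
print(result["answer"])
```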
If you find this project useful, please ⭐ star this repository on GitHub!

