🧠 Retrieval-Augmented Generation (RAG) Document Assistant

An AI-powered document question-answering system built with FastAPI, LangChain, ChromaDB, and Llama 3.1 served through Groq.
Users upload files and query them in natural language. The system retrieves the most relevant text chunks, passes them to the LLM as context, and returns an answer with source citations.
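
The retrieve-then-generate step described above can be pictured as plain prompt assembly. The following is a minimal sketch, not the project's actual code: `RetrievedChunk`, `build_prompt`, and the prompt wording are hypothetical names and text chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    """Hypothetical container for one retrieved chunk (not taken from the project)."""
    text: str
    source: str   # file the chunk came from
    score: float  # similarity score reported by the vector store


def build_prompt(question: str, chunks: list[RetrievedChunk]) -> str:
    """Number each chunk so the model can cite it as [1], [2], ..."""
    context = "\n\n".join(
        f"[{i}] ({c.source}, score={c.score:.2f})\n{c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering the chunks in the prompt is one simple way to let the answer point back at specific source files.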

🚀 Features

Upload and index documents dynamically
Support for multiple file types (PDF, DOCX, PPTX, TXT, CSV, JSON, HTML, Excel)
Automatic text extraction using LangChain loaders
Chunking using RecursiveCharacterTextSplitter
Embedding generation using Sentence Transformers
Vector similarity search using ChromaDB (persistent storage)
Integration with Groq Llama 3.1 for fast inference
Returns answers with source citations and similarity scores
API accessible through Swagger UI and Postman
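
To illustrate the chunking feature, here is a simplified, standard-library sketch of the idea behind recursive character splitting: try coarse separators first and fall back to finer ones only when a piece is still too long. It is not LangChain's `RecursiveCharacterTextSplitter` and it omits chunk overlap; `recursive_split` and its defaults are assumptions made for the example.

```python
def recursive_split(text: str, chunk_size: int = 200,
                    separators: tuple[str, ...] = ("\n\n", "\n", " ")) -> list[str]:
    """Split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left: fall back to a hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate      # piece still fits in the current chunk
            continue
        if current:
            chunks.append(current)   # flush the chunk built so far
        if len(piece) <= chunk_size:
            current = piece          # start a new chunk with this piece
        else:
            # The piece alone is too long: recurse with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph breaks before spaces tends to keep related sentences in the same chunk, which usually helps retrieval quality.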

🧩 Tech Stack

Area	Technology
Framework	FastAPI
Embeddings	Sentence Transformers (all-MiniLM-L6-v2)
Vector DB	ChromaDB
LLM	Groq Llama 3.1 Instant
Document Parsing	LangChain Loaders
API Testing	Swagger UI / Postman
Language	Python
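
The embedding and vector DB rows work together at query time: the question is embedded, compared against stored chunk vectors, and the closest chunks are returned with a score. The sketch below shows that flow with a toy word-count "embedding" and cosine similarity using only the standard library. It is a stand-in for illustration and does not call Sentence Transformers or ChromaDB; `embed`, `cosine`, and `top_k` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k(question: str, chunks: list[tuple[str, str]], k: int = 2):
    """chunks holds (source, text) pairs; returns (score, source, text), best first."""
    q = embed(question)
    scored = [(cosine(q, embed(text)), source, text) for source, text in chunks]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:k]
```

A real embedding model captures meaning rather than exact word overlap, but the ranking step has the same shape.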

📂 Project Structure

📦 RAG-Assistant
 ┣ 📁 data/                # Optional base documents
 ┣ 📁 uploads/             # User uploaded files
 ┣ chatbot.py              # Core RAG logic + vector store builder
 ┣ rag_app.py              # FastAPI application
 ┣ requirements.txt        # Dependencies
 ┗ README.md               # Documentation

⚙️ Setup Instructions

1️⃣ Clone the repository

git clone https://github.com/<your-username>/<repo-name>.git
cd <repo-name>

2️⃣ Create a virtual environment

python -m venv .venv
.venv\Scripts\activate      # Windows
source .venv/bin/activate   # macOS/Linux

3️⃣ Install dependencies

pip install -r requirements.txt

4️⃣ Environment variables

Create a .env file:

GROQ_API_KEY=your_groq_key_here
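
How the application loads this file is not shown here; a package such as python-dotenv is a common choice. As a dependency-free sketch, the snippet below parses simple `NAME=value` lines and fails early when the key is missing. `read_env_file` and `require_groq_key` are hypothetical helpers, not functions from this project.

```python
def read_env_file(path: str) -> dict[str, str]:
    """Parse simple NAME=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            values[name.strip()] = value.strip().strip('"').strip("'")
    return values


def require_groq_key(values: dict[str, str]) -> str:
    """Return the Groq API key or raise a clear error if it is absent or empty."""
    key = values.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to the .env file")
    return key
```

Checking for the key at startup surfaces a configuration problem immediately instead of on the first question.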

▶️ Run the Application

uvicorn rag_app:app --reload --port 8000

Open the interactive API docs (Swagger UI):
👉 http://localhost:8000/docs

📌 API Endpoints

Method	Endpoint	Function
GET	`/`	Health check
POST	`/upload`	Upload a document to index
POST	`/ask`	Ask a question based on stored documents

🧪 Example Usage

Upload File

POST /upload (form-data)

Key	Value
file	example.pdf
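
For scripted uploads without Postman, the same form-data request can be built with the standard library. This is a sketch under assumptions: the base URL is whatever host and port the server runs on, and `build_upload_request` is a hypothetical helper. Only the `/upload` path and the `file` field name come from the table above.

```python
import urllib.request
import uuid


def build_upload_request(base_url: str, filename: str,
                         content: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST with a single part named "file"."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",                           # blank line ends the part headers
        content,
        f"--{boundary}--".encode(),    # closing boundary
        b"",
    ])
    return urllib.request.Request(
        f"{base_url}/upload",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

With the server running, passing the returned request to `urllib.request.urlopen` sends the upload.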

Ask a Question

POST /ask
{
  "question": "Explain number of factors formula."
}

The response includes:

AI answer
Retrieved context
Similarity scores
Source file references
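
A question can also be sent from a script. The sketch below assumes the server is reachable at the base URL you pass in; `build_ask_request` and `ask` are hypothetical helpers. The response is returned as a plain dict because the exact field names of the JSON response are not documented here.

```python
import json
import urllib.request


def build_ask_request(base_url: str, question: str) -> urllib.request.Request:
    """Build the JSON POST for /ask with the documented "question" field."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, question: str, timeout: float = 60.0) -> dict:
    """Send the question and decode the JSON response body."""
    request = build_ask_request(base_url, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `ask("http://localhost:8000", "Explain number of factors formula.")` returns the decoded response once documents have been uploaded.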

🛠 Future Enhancements

Frontend UI (React / Next.js / Streamlit)
OCR for handwriting & scanned documents
Authentication & multi-user workspace
Deployment with Docker + AWS EC2
Chat memory & query history

👤 Author

Madhav M S
