
🧠 Retrieval-Augmented Generation (RAG) Document Assistant

A fully functional AI-powered document question-answering system built with FastAPI, LangChain, ChromaDB, and Groq Llama 3.1.
Users upload files and query them in natural language; the system retrieves the most relevant text chunks, feeds them to the LLM, and returns grounded answers with citations.


🚀 Features

  • Upload and index documents dynamically
  • Support for multiple file types (PDF, DOCX, PPTX, TXT, CSV, JSON, HTML, Excel)
  • Automatic text extraction using LangChain loaders
  • Chunking using RecursiveCharacterTextSplitter
  • Embedding generation using Sentence Transformers
  • Vector similarity search using ChromaDB (persistent storage)
  • Integration with Groq Llama 3.1 for fast inference
  • Returns answers with source citations and similarity scores
  • API accessible through Swagger UI and Postman
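The chunking step above is handled in the project by LangChain's RecursiveCharacterTextSplitter. A dependency-free sketch of the idea — a sliding window with overlap — is shown below; the `chunk_size` and `chunk_overlap` values are illustrative, not the project's actual settings, and the real splitter additionally prefers to break on paragraph, sentence, and word boundaries:

```python
def split_text(text: str, chunk_size: int = 200, chunk_overlap: int = 50) -> list[str]:
    """Naive sliding-window splitter: a simplified stand-in for
    LangChain's RecursiveCharacterTextSplitter."""
    chunks = []
    step = chunk_size - chunk_overlap  # each window starts `step` chars after the last
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "word " * 100  # ~500 characters of dummy text
print(len(split_text(doc.strip())))
```

Overlap matters because it keeps a sentence that straddles a chunk boundary fully visible in at least one chunk, which improves retrieval recall.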

🧩 Tech Stack

| Area | Technology |
| --- | --- |
| Framework | FastAPI |
| Embeddings | Sentence Transformers (all-MiniLM-L6-v2) |
| Vector DB | ChromaDB |
| LLM | Groq Llama 3.1 Instant |
| Document Parsing | LangChain Loaders |
| API Testing | Swagger UI / Postman |
| Language | Python |
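Under the hood, the vector search that ChromaDB performs over Sentence-Transformers embeddings reduces to similarity scoring between the query vector and each stored chunk vector. A dependency-free sketch with toy 3-dimensional vectors (real all-MiniLM-L6-v2 embeddings are 384-dimensional):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 2):
    """Return (chunk_index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

# Toy embeddings; a real index would hold one vector per document chunk.
chunks = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
print(top_k([1.0, 0.05, 0.0], chunks))
```

The scores returned here are what surface in the API response as "similarity scores" alongside each cited chunk.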

📂 Project Structure

```
📦 RAG-Assistant
 ┣ 📁 data/                # Optional base documents
 ┣ 📁 uploads/             # User-uploaded files
 ┣ chatbot.py              # Core RAG logic + vector store builder
 ┣ rag_app.py              # FastAPI application
 ┣ requirements.txt        # Dependencies
 ┗ README.md               # Documentation
```

⚙️ Setup Instructions

1️⃣ Clone the repository

```bash
git clone https://github.com/<your-username>/<repo-name>.git
cd <repo-name>
```

2️⃣ Create virtual environment

```bash
python -m venv .venv
.venv\Scripts\activate    # Windows
source .venv/bin/activate # macOS/Linux
```

3️⃣ Install dependencies

```bash
pip install -r requirements.txt
```

4️⃣ Environment variables

Create a `.env` file in the project root:

```
GROQ_API_KEY=your_groq_key_here
```
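The usual way to pick this up at runtime is `python-dotenv`'s `load_dotenv()` followed by a read from `os.environ`. For simple `KEY=value` files, what that loader does boils down to this sketch:

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal stand-in for python-dotenv's load_dotenv():
    copies KEY=value lines into os.environ, skipping blanks and comments.
    Existing environment variables are not overwritten."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Usage (assumes a .env file exists in the working directory):
# load_env_file()
# api_key = os.environ["GROQ_API_KEY"]
```

In practice prefer the real `python-dotenv` package, which also handles quoting, export prefixes, and variable interpolation.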

▶️ Run the Application

```bash
uvicorn rag_app:app --reload --port 8000
```

Visit API docs:
👉 http://localhost:8000/docs


📌 API Endpoints

| Method | Endpoint | Function |
| --- | --- | --- |
| GET | / | Health check |
| POST | /upload | Upload a document to index |
| POST | /ask | Ask a question over stored documents |
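Stripped of FastAPI and the real vector store, the `/ask` flow is retrieve-then-generate. A schematic version with a stubbed retriever and LLM — the function names and prompt wording here are illustrative, not the project's actual API:

```python
def answer_question(question: str, retrieve, generate, k: int = 3) -> dict:
    """Schematic /ask handler: retrieve top-k chunks, build a grounded
    prompt, call the LLM, and return the answer with citations."""
    hits = retrieve(question, k)  # [(chunk_text, source_file, score), ...]
    context = "\n\n".join(text for text, _, _ in hits)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return {
        "answer": generate(prompt),
        "context": [text for text, _, _ in hits],
        "sources": [src for _, src, _ in hits],
        "scores": [score for _, _, score in hits],
    }

# Stubs standing in for the ChromaDB search and the Groq chat call:
fake_retrieve = lambda q, k: [("Paris is the capital of France.", "geo.txt", 0.91)]
fake_generate = lambda prompt: "Paris."
print(answer_question("What is the capital of France?", fake_retrieve, fake_generate))
```

Keeping retrieval and generation behind plain callables like this also makes the handler easy to unit-test without a live vector store or API key.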

🧪 Example Usage

Upload File

POST /upload (form-data)

| Key | Value |
| --- | --- |
| file | example.pdf |

Ask a Question

POST /ask

```json
{
  "question": "Explain number of factors formula."
}
```

Response includes:

  • AI Answer
  • Retrieved context
  • Similarity scores
  • Source file references

🛠 Future Enhancements

  • Frontend UI (React / Next.js / Streamlit)
  • OCR for handwriting & scanned documents
  • Authentication & multi-user workspace
  • Deployment with Docker + AWS EC2
  • Query chat memory & history

👤 Author

Madhav M S
