A Retrieval-Augmented Generation (RAG) chatbot built with Python, FAISS, and Groq LLMs. This system allows you to query, search, and chat with your own local documents using semantic search and context-aware response generation.
This project demonstrates a production-style RAG pipeline, focusing on clean architecture, modular design, and scalability.
Traditional LLMs do not have access to your private or local documents. RAG ChatBot bridges this gap by combining:
- Vector-based retrieval (FAISS)
- Semantic embeddings (all-MiniLM-L6-v2)
- LLM-based response generation (Groq/Llama 3)
You ask a natural language question, and the system:
- Searches your documents intelligently (Semantic Search).
- Retrieves the most relevant content chunks.
- Generates accurate, grounded answers based strictly on your data.
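The difference between keyword and semantic matching is easy to demonstrate. The snippet below is illustrative only (not code from this repo): it scores two passages against a question using the same all-MiniLM-L6-v2 model the project uses, and the semantically related passage wins even though it shares no keywords with the query:

```python
# Illustrative semantic-similarity demo; not part of the repo's source.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

query = "How do I reset my password?"
passages = [
    "Account credentials can be changed from the security settings page.",
    "Our office is closed on public holidays.",
]

# Encode the query and passages, then rank by cosine similarity.
q_vec = model.encode(query, convert_to_tensor=True)
p_vecs = model.encode(passages, convert_to_tensor=True)
scores = util.cos_sim(q_vec, p_vecs)[0]
print(max(zip(scores.tolist(), passages)))  # the credentials passage scores highest
```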
✨ Features
- 📄 Multi-Format Support: Chat with PDFs, text files, or markdown notes.
- 🔍 Semantic Intelligence: Uses vector similarity instead of simple keyword matching.
- ⚡ Lightning Fast: High-performance similarity search using FAISS.
- 🧠 Context-Aware: Answers powered by Groq's high-speed inference engine.
- 🔐 Secure: Environment-based API key management.
- 🧩 Modular: Clean separation of concerns for easy extension.
📂 Project Structure
```
RAGChatBot/
├── app.py               # Main entry point for the chatbot
├── main.py              # Optional alternative entry/test script
│
├── src/                 # Core application logic
│   ├── __init__.py
│   ├── data_loader.py   # Document loading and preprocessing
│   ├── embedding.py     # Chunking & embedding pipeline
│   ├── vectorstore.py   # FAISS vector store management
│   └── search.py        # RAG search & retrieval logic
│
├── data/                # Place your source documents here
├── faiss_store/         # Persisted FAISS index files
├── notebook/            # Jupyter notebooks for experiments
│
├── .env                 # Environment variables (local only)
├── .gitignore           # Prevents sensitive data from being pushed
├── pyproject.toml       # Project metadata
├── uv.lock              # Locked dependencies (uv)
└── README.md
```
🛠️ Installation
1️⃣ Clone the Repository
```bash
git clone https://github.com/YourUsername/RAGChatBot.git
cd RAGChatBot
```
2️⃣ Set Up Virtual Environment (using uv)
Ensure uv is installed.
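If uv is not already available, one option is to install it with pip (the standalone installer from the uv docs works too):

```bash
pip install uv
```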
```bash
uv venv

# Activate the environment:
.venv\Scripts\activate        # Windows
source .venv/bin/activate     # macOS/Linux

uv sync
```
3️⃣ Configure Environment Variables
Create a .env file in the project root:
```
GROQ_API_KEY=your_actual_groq_api_key
```
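At runtime the key is read from the environment. A minimal sketch of how that typically looks, assuming the project uses python-dotenv (the exact loading code lives in the repo's source):

```python
# Minimal sketch of loading the Groq key from .env; assumes python-dotenv.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root into the process environment
api_key = os.getenv("GROQ_API_KEY")
if not api_key:
    raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")
```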
Usage
Run the ChatBot:
```bash
python app.py
```
🔄 The Pipeline in Action
1. Load: Documents are read and cleaned from the data/ folder.
2. Embed: Text is split into chunks and converted into vector embeddings.
3. Store: Vectors are indexed in faiss_store/ for instant retrieval.
4. Query: Your question is compared against the index to find relevant text.
5. Generate: The LLM receives the question plus the retrieved text to give a factual answer.
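The same flow, condensed into one illustrative script. The repo factors this across the modules in src/, so the names and structure below are not the project's actual API; it assumes sentence-transformers, faiss-cpu, and the groq SDK are installed, and that GROQ_API_KEY is set:

```python
# Condensed, illustrative version of the pipeline; the real app factors this
# across src/data_loader.py, src/embedding.py, src/vectorstore.py, src/search.py.
import os

import faiss
from groq import Groq
from sentence_transformers import SentenceTransformer

# Load + Embed: in the real app, docs come from data/ and are chunked first.
docs = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "Groq provides high-speed inference for open LLMs such as Llama 3.",
]
model = SentenceTransformer("all-MiniLM-L6-v2")
vecs = model.encode(docs, normalize_embeddings=True)

# Store: inner product over unit vectors equals cosine similarity.
index = faiss.IndexFlatIP(vecs.shape[1])
index.add(vecs)

# Query: embed the question and fetch the most similar chunk(s).
question = "What is FAISS used for?"
q_vec = model.encode([question], normalize_embeddings=True)
_, ids = index.search(q_vec, 1)
context = "\n".join(docs[i] for i in ids[0])

# Generate: ground the LLM's answer in the retrieved context.
client = Groq(api_key=os.environ["GROQ_API_KEY"])
resp = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{
        "role": "user",
        "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
    }],
)
print(resp.choices[0].message.content)
```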
🧩 Adding New Documents
1. Add your new files (.txt, .pdf, etc.) to the data/ directory.
2. In app.py, temporarily uncomment:
```python
# store.build_from_documents(docs)
```
3. Run the app once to rebuild the index, then re-comment the line to save time on future runs.
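An alternative to editing app.py each time is to guard the rebuild behind an environment flag. A minimal sketch of that idea, meant to slot into app.py where store and docs are already defined (REBUILD_INDEX is a hypothetical variable name, not something the repo defines):

```python
import os

# Hypothetical toggle: rebuild the FAISS index only when REBUILD_INDEX=1 is set,
# so app.py no longer needs to be edited for each new batch of documents.
if os.getenv("REBUILD_INDEX") == "1":
    store.build_from_documents(docs)
```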
⚙️ Configuration
- Top-K Retrieval: Change top_k in search.py or app.py to control how much context the LLM receives.
- Chunk Size: Adjust the chunking logic in src/embedding.py to handle very long or very short documents more effectively (see the sketch after this list).
- Model Selection: Switch between Groq models (e.g., llama-3.3-70b-versatile or llama-3.1-8b-instant) depending on whether you prioritize answer quality or speed.
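For reference, a minimal fixed-size chunker with overlap looks like the following; the repo's actual logic in src/embedding.py may differ, and the names here are illustrative:

```python
# Illustrative fixed-size chunker with overlap; not the repo's actual
# implementation in src/embedding.py.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunk_size-character pieces, overlapping by overlap."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both neighboring chunks.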
🧪 Tech Stack
- Python 3.12+
- FAISS (Facebook AI Similarity Search)
- LangChain / Groq API
- HuggingFace Embeddings
- uv (package management)