RAG Document Question Answering

FastAPI · FAISS · Ollama · Streamlit

A complete Document Question Answering (DocQA) system built using Retrieval-Augmented Generation (RAG).
Users can upload documents (.txt or .pdf), ask natural language questions, and receive answers generated by a local Large Language Model (LLM) grounded in retrieved document context.


Features

  • Upload and ingest TXT and PDF documents
  • Chunking, embeddings, and vector similarity search (FAISS)
  • Local LLM inference using Ollama (no API key required)
  • FastAPI backend with Swagger API docs
  • Streamlit UI for interactive document upload and querying
  • Answers are grounded in the retrieved document context with source references
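
The grounding behaviour in the last feature above comes from the system prompt template in app/prompts.py. A minimal sketch of what such a template can look like (the wording and the {context}/{question} placeholders are assumptions, not the repository's actual prompt):

# Hypothetical grounding prompt; wording and placeholders are assumptions.
SYSTEM_PROMPT = """You are a document question-answering assistant.
Answer ONLY from the context below. If the answer is not in the context,
say you don't know, and cite the source chunks you used.

Context:
{context}

Question: {question}
Answer:"""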

Tech Stack

Backend

  • Python 3.11+
  • FastAPI, Uvicorn
  • LangChain (text splitters)
  • Sentence-Transformers (embeddings)
  • FAISS (vector store)

LLM

  • Ollama (model: phi3, easily switchable)

Frontend

  • Streamlit
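
To see how these pieces fit together, here is a minimal, self-contained sketch of the chunk → embed → index → retrieve → generate loop using the libraries listed above. It is illustrative only: the file name, chunk sizes, and prompt wording are assumptions, not the repository's actual code.

# Minimal RAG loop with the stack above (illustrative sketch only).
import requests
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer
from langchain_text_splitters import RecursiveCharacterTextSplitter

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# 1. Split the document into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(open("doc.txt", encoding="utf-8").read())

# 2. Embed the chunks and index them in FAISS.
vectors = np.asarray(embedder.encode(chunks), dtype="float32")
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)

# 3. Retrieve the chunks closest to the question.
question = "What is this document about?"
q_vec = np.asarray(embedder.encode([question]), dtype="float32")
_, ids = index.search(q_vec, 3)
context = "\n\n".join(chunks[i] for i in ids[0] if i != -1)

# 4. Ask Ollama for an answer grounded in the retrieved context.
resp = requests.post("http://localhost:11434/api/generate", json={
    "model": "phi3",
    "prompt": f"Answer using only this context:\n{context}\n\nQ: {question}",
    "stream": False,
})
print(resp.json()["response"])

IndexFlatL2 performs exact nearest-neighbour search, which is fine at demo scale; a much larger corpus would justify an approximate index.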

Project Structure

rag-docqa-demo/
├── app/
│   ├── main.py        # FastAPI routes (/ingest, /ask, /health)
│   ├── rag.py         # FAISS store, retrieval, Ollama generation
│   ├── ingest.py      # TXT/PDF loaders and chunking logic
│   ├── prompts.py     # System prompt template
│   ├── schemas.py     # API request/response schemas
│   └── settings.py    # Environment configuration
├── ui/
│   └── app.py         # Streamlit UI
├── data/
│   └── uploads/       # Uploaded files (local only)
├── requirements.txt
├── .env               # Local environment variables (not committed)
└── README.md
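
The request/response models in app/schemas.py are not reproduced here, but for a FastAPI app they are typically small Pydantic classes. A guess at a plausible shape (field names are assumptions):

# Hypothetical shape of app/schemas.py; field names are assumptions.
from pydantic import BaseModel

class AskRequest(BaseModel):
    question: str
    top_k: int = 3              # how many chunks to retrieve

class SourceChunk(BaseModel):
    text: str
    source: str                 # originating file name

class AskResponse(BaseModel):
    answer: str
    sources: list[SourceChunk]  # supports the source references feature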

Prerequisites (Windows)

  • Python 3.11+
  • Git (optional)
  • Ollama installed locally

Install Ollama (Windows)

  1. Download and install Ollama:
    https://ollama.com/download
  2. Restart PowerShell and verify installation:
ollama --version
  3. Pull a lightweight model (recommended for laptops):
ollama pull phi3
  4. Quick test:
ollama run phi3

Test it by typing a question such as: What is Retrieval-Augmented Generation?

Exit:

/bye
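
You can also confirm the Ollama server is reachable from Python before wiring up the backend. This sketch uses Ollama's standard REST endpoint on port 11434:

# Sanity-check the local Ollama server via its REST API.
import requests

r = requests.post("http://localhost:11434/api/generate", json={
    "model": "phi3",
    "prompt": "What is Retrieval-Augmented Generation?",
    "stream": False,  # return a single JSON object instead of a stream
})
r.raise_for_status()
print(r.json()["response"])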

Local Setup

1️⃣ Create and activate a virtual environment

cd D:\Projects\rag-docqa-demo  
python -m venv .venv  
.\.venv\Scripts\Activate

2️⃣ Install dependencies

pip install -r requirements.txt

3️⃣ Create .env file

Create a file named .env in the project root:

embedding_model=sentence-transformers/all-MiniLM-L6-v2
vector_store_path=./data/faiss_index
ollama_model=phi3
ollama_url=http://localhost:11434
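
These variable names line up with app/settings.py. One plausible way that module could load them is with pydantic-settings; the following is a sketch under that assumption, not necessarily the repository's exact code:

# Plausible sketch of app/settings.py using pydantic-settings.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    embedding_model: str = "sentence-transformers/all-MiniLM-L6-v2"
    vector_store_path: str = "./data/faiss_index"
    ollama_model: str = "phi3"
    ollama_url: str = "http://localhost:11434"

settings = Settings()  # values in .env override the defaults above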


Run Backend (FastAPI)

cd D:\Projects\rag-docqa-demo  
.\.venv\Scripts\Activate  
uvicorn app.main:app --reload

Swagger API docs:
http://127.0.0.1:8000/docs

Health check:
http://127.0.0.1:8000/health
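
With the backend up you can smoke-test the endpoints from Python as well. The payload and form-field names for /ingest and /ask below are assumptions inferred from the route names, not documented contracts:

# Exercise the backend endpoints (field names are assumptions).
import requests

BASE = "http://127.0.0.1:8000"

# Health check
print(requests.get(f"{BASE}/health").json())

# Ingest a document (multipart upload; "file" is a guessed field name)
with open("data/uploads/sample.txt", "rb") as f:
    print(requests.post(f"{BASE}/ingest", files={"file": f}).json())

# Ask a question about the ingested document
print(requests.post(f"{BASE}/ask",
                    json={"question": "Summarize the document."}).json())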


Run UI (Streamlit)

Open a second PowerShell window:

cd D:\Projects\rag-docqa-demo  
.\.venv\Scripts\Activate  
streamlit run ui\app.py

The UI will open at:
http://localhost:8501
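
For reference, a stripped-down version of what ui/app.py does might look like the following; the endpoint payload shapes are the same assumptions as in the backend example above:

# Minimal Streamlit client sketch (payload shapes are assumptions).
import requests
import streamlit as st

API = "http://127.0.0.1:8000"

st.title("RAG Document Question Answering")

uploaded = st.file_uploader("Upload a document", type=["txt", "pdf"])
if uploaded is not None:
    requests.post(f"{API}/ingest",
                  files={"file": (uploaded.name, uploaded.getvalue())})
    st.success(f"Ingested {uploaded.name}")

question = st.text_input("Ask a question about the document")
if question:
    answer = requests.post(f"{API}/ask",
                           json={"question": question}).json()
    st.write(answer.get("answer", answer))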


How to Use

  1. Start FastAPI backend
  2. Start Streamlit UI
  3. Upload a .txt or .pdf document
  4. Ask questions such as:
    • “Summarize the document in 3 bullet points.”
    • “What are the key requirements mentioned?”
    • “What does the document say about data privacy?”

Notes & Troubleshooting

UI shows “API not reachable”
Make sure the backend is running:

uvicorn app.main:app --reload

/ask endpoint fails or is slow
Ensure Ollama is running and the model exists:

ollama list  
ollama pull phi3

Performance tips

  • Use smaller models (such as phi3) on CPU-only machines
  • Large PDFs may take longer during ingestion

Thank You
