FastAPI · FAISS · Ollama · Streamlit
A complete Document Question Answering (DocQA) system built using Retrieval-Augmented Generation (RAG).
Users can upload documents (.txt or .pdf), ask natural language questions, and receive answers generated by a local Large Language Model (LLM) grounded in retrieved document context.
- Upload and ingest TXT and PDF documents
- Chunking, embeddings, and vector similarity search (FAISS)
- Local LLM inference using Ollama (no API key required)
- FastAPI backend with Swagger API docs
- Streamlit UI for interactive document upload and querying
- Answers are grounded in the retrieved document context with source references
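The chunking step listed above can be sketched as follows. This is a simplified illustration of fixed-size chunking with overlap; the project itself uses LangChain text splitters, and the sizes here are illustrative, not the project's actual settings:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks that overlap, so context
    spanning a chunk boundary is not lost (illustrative values)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars
    return chunks

# A 1,000-character document yields 3 chunks: [0:500], [400:900], [800:1000]
doc = "x" * 1000
print(len(chunk_text(doc)))
```

Each chunk is then embedded and stored in the FAISS index so that semantically similar passages can be retrieved at question time.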
### Backend
- Python 3.11+
- FastAPI, Uvicorn
- LangChain (text splitters)
- Sentence-Transformers (embeddings)
- FAISS (vector store)
### LLM
- Ollama (model: `phi3`, easily switchable)
### Frontend
- Streamlit
```
rag-docqa-demo/
├── app/
│   ├── main.py        # FastAPI routes (/ingest, /ask, /health)
│   ├── rag.py         # FAISS store, retrieval, Ollama generation
│   ├── ingest.py      # TXT/PDF loaders and chunking logic
│   ├── prompts.py     # System prompt template
│   ├── schemas.py     # API request/response schemas
│   └── settings.py    # Environment configuration
├── ui/
│   └── app.py         # Streamlit UI
├── data/
│   └── uploads/       # Uploaded files (local only)
├── requirements.txt
├── .env               # Local environment variables (not committed)
└── README.md
```
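To make the retrieval step in `rag.py` concrete, here is a minimal sketch of vector similarity search using cosine similarity over toy embeddings. The actual project uses Sentence-Transformers embeddings and a FAISS index, so the function names and vectors below are hypothetical:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks whose embeddings best match the query embedding."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]

# Toy 3-dimensional "embeddings" (hypothetical)
chunks = ["RAG combines retrieval and generation.",
          "FAISS performs fast similarity search.",
          "Streamlit builds the UI."]
vecs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
print(retrieve([1.0, 0.0, 0.0], vecs, chunks, k=2))
```

The retrieved chunks are then inserted into the prompt template (`prompts.py`) so the LLM's answer stays grounded in the document.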
- Python 3.11+
- Git (optional)
- Ollama installed locally
- Download and install Ollama: https://ollama.com/download
- Restart PowerShell and verify the installation:

  ```
  ollama --version
  ```

- Pull a lightweight model (recommended for laptops):

  ```
  ollama pull phi3
  ```

- Quick test:

  ```
  ollama run phi3
  ```

  Try typing: What is Retrieval-Augmented Generation?

  Exit with `/bye`.

Set up the project environment:

```
cd D:\Projects\rag-docqa-demo
python -m venv .venv
.\.venv\Scripts\Activate
pip install -r requirements.txt
```

Create a file named `.env` in the project root:
```
embedding_model=sentence-transformers/all-MiniLM-L6-v2
vector_store_path=./data/faiss_index
ollama_model=phi3
ollama_url=http://localhost:11434
```
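The `settings.py` module presumably reads these values at startup. A minimal sketch using `os.getenv` with matching defaults is shown below; the actual implementation may use pydantic or python-dotenv instead, and the variable names simply mirror the `.env` keys above:

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    # Defaults match the .env example; environment variables override them.
    embedding_model: str = os.getenv(
        "embedding_model", "sentence-transformers/all-MiniLM-L6-v2")
    vector_store_path: str = os.getenv("vector_store_path", "./data/faiss_index")
    ollama_model: str = os.getenv("ollama_model", "phi3")
    ollama_url: str = os.getenv("ollama_url", "http://localhost:11434")

settings = Settings()
```

Keeping every tunable (model name, index path, Ollama URL) in one place is what makes the model "easily switchable".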
Start the backend:

```
cd D:\Projects\rag-docqa-demo
.\.venv\Scripts\Activate
uvicorn app.main:app --reload
```

Swagger API docs: http://127.0.0.1:8000/docs

Health check: http://127.0.0.1:8000/health
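Once the backend is up, you can also query it programmatically instead of through the UI. This sketch assumes the `/ask` endpoint accepts a JSON body with a `question` field; check `schemas.py` for the actual request schema:

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000"

def build_ask_payload(question: str) -> bytes:
    """Encode the request body; the `question` field name is an assumption."""
    return json.dumps({"question": question}).encode("utf-8")

def ask(question: str) -> dict:
    """POST a question to the /ask endpoint and return the parsed response."""
    req = urllib.request.Request(
        f"{API_URL}/ask",
        data=build_ask_payload(question),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (with the backend running):
#   answer = ask("Summarize the document in 3 bullet points.")
```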
Open a second PowerShell window:

```
cd D:\Projects\rag-docqa-demo
.\.venv\Scripts\Activate
streamlit run ui\app.py
```

The UI will open at: http://localhost:8501
- Start FastAPI backend
- Start Streamlit UI
- Upload a `.txt` or `.pdf` document
- Ask questions such as:
- “Summarize the document in 3 bullet points.”
- “What are the key requirements mentioned?”
- “What does the document say about data privacy?”
**UI shows “API not reachable”**

Make sure the backend is running:

```
uvicorn app.main:app --reload
```

**`/ask` endpoint fails or is slow**

Ensure Ollama is running and the model exists:

```
ollama list
ollama pull phi3
```

### Performance tips

- Use smaller models (`phi3`) on CPU-only machines
- Large PDFs may take longer during ingestion
Thank You