A powerful document question-answering system using RAG (Retrieval Augmented Generation) technology. This application allows users to upload PDF documents and ask questions about their contents, receiving accurate answers based on the document context.
## Features

- 📚 PDF document processing and storage
- 🔍 Semantic search using vector embeddings
- 🤖 Advanced LLM-based question answering
- 💻 User-friendly web interface
- 📊 Source tracking and analysis details
- ⚡ Performance optimizations with caching
- 🛡️ Robust error handling and logging
## Prerequisites

- Python 3.8+
- Ollama installed and running locally
- Sufficient disk space for document storage and vector embeddings
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/pushpendra-tripathi/docwhisperer.git
  cd docwhisperer
  ```

- Create a virtual environment (recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Create a `.env` file (optional):
  ```env
  CHROMA_PATH=chroma
  DATA_PATH=data
  DEFAULT_MODEL=deepseek-r1:1.5b
  EMBEDDING_MODEL=snowflake-arctic-embed2
  CHUNK_SIZE=800
  CHUNK_OVERLAP=80
  TOP_K_RESULTS=20
  TEMPERATURE=0.0
  ```

## Usage

- Start the web application:

  ```bash
  streamlit run app.py
  ```

- Open your browser and navigate to the provided URL (typically http://localhost:8501).
- Use the sidebar to:
  - Upload PDF documents
  - Process uploaded documents
  - Reset the database if needed
- Ask questions about the uploaded documents in the chat interface.
## Project Structure

- `app.py`: Main Streamlit web application
- `query_data.py`: Core RAG query engine implementation
- `populate_database.py`: Document processing and database management
- `config.py`: Configuration settings
- `get_embedding_function.py`: Embedding model configuration
- `test_rag.py`: Test suite for RAG functionality
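As a rough sketch of how these pieces fit together, the core RAG query loop retrieves the most relevant chunks and assembles them into a prompt for the LLM. All names below (`retrieve_chunks`, `build_prompt`, the prompt template) are illustrative assumptions, not the project's actual API, and the LLM call itself is omitted.

```python
# Illustrative RAG query flow: rank stored chunks by a (pre-computed)
# similarity score, take the top k, and assemble a prompt. A real
# implementation would embed the question and query the vector store.
from typing import List, Tuple

PROMPT_TEMPLATE = """Answer the question based only on the following context:

{context}

---

Question: {question}
"""

def retrieve_chunks(store: List[Tuple[str, float]], k: int = 3) -> List[str]:
    """Return the k chunk texts with the highest similarity scores."""
    ranked = sorted(store, key=lambda pair: pair[1], reverse=True)
    return [text for text, _score in ranked[:k]]

def build_prompt(question: str, chunks: List[str]) -> str:
    """Join retrieved chunks into a context block and fill the template."""
    context = "\n\n---\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```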
## Configuration

The application can be configured through environment variables or the `config.py` file:

- `CHROMA_PATH`: Directory for the vector store
- `DATA_PATH`: Directory for uploaded documents
- `DEFAULT_MODEL`: Ollama model for question answering
- `EMBEDDING_MODEL`: Model for document embeddings
- `CHUNK_SIZE`: Document chunk size for processing
- `CHUNK_OVERLAP`: Overlap between consecutive chunks
- `TOP_K_RESULTS`: Number of relevant chunks to retrieve
- `TEMPERATURE`: LLM temperature setting
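A minimal sketch of how `config.py` might read these settings, assuming plain `os.getenv` lookups with the defaults shown in the `.env` example (the project's actual file may differ):

```python
# Hypothetical configuration loading: each setting falls back to a default
# when the corresponding environment variable is unset. Numeric settings
# are parsed explicitly since environment variables are always strings.
import os

CHROMA_PATH = os.getenv("CHROMA_PATH", "chroma")
DATA_PATH = os.getenv("DATA_PATH", "data")
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "deepseek-r1:1.5b")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "snowflake-arctic-embed2")
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "800"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "80"))
TOP_K_RESULTS = int(os.getenv("TOP_K_RESULTS", "20"))
TEMPERATURE = float(os.getenv("TEMPERATURE", "0.0"))
```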
## Performance

The application includes several performance optimizations:
- LRU caching for document retrieval
- Efficient document chunking
- Optimized vector similarity search
- Retries for LLM queries
- Streamlined UI updates
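The LRU-caching idea can be sketched as below; `search_db` is a hypothetical stand-in for the real vector similarity search, and the point is that repeated questions hit the cache instead of the database. Returning tuples (hashable and immutable) keeps the results cache-friendly.

```python
# Sketch of LRU caching for document retrieval. The names here are
# illustrative, not the app's actual helpers.
from functools import lru_cache

def search_db(question: str, k: int) -> tuple:
    # Placeholder for the actual vector search; returns dummy chunks here.
    return tuple(f"chunk-{i} for {question!r}" for i in range(k))

CALLS = {"count": 0}

@lru_cache(maxsize=128)
def retrieve(question: str, k: int = 20) -> tuple:
    CALLS["count"] += 1  # only incremented on a cache miss
    return search_db(question, k)
```

Note that `lru_cache` requires hashable arguments and return-value sharing across callers, which is why the sketch returns tuples rather than lists.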
## Error Handling

The application implements comprehensive error handling:
- Graceful handling of upload errors
- Retry mechanism for LLM queries
- Detailed error logging
- User-friendly error messages
- Database state validation
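The retry mechanism for LLM queries can be sketched as a small wrapper with exponential backoff and logging; `query_with_retries` and its parameters are illustrative assumptions, not the app's actual helpers.

```python
# Hypothetical retry wrapper: invoke `call()`, and on failure wait
# base_delay * 2**attempt before retrying, logging each failure.
# The last failure is re-raised so callers can surface a friendly message.
import logging
import time

logger = logging.getLogger(__name__)

def query_with_retries(call, retries: int = 3, base_delay: float = 1.0):
    for attempt in range(retries):
        try:
            return call()
        except Exception as exc:
            logger.warning(
                "LLM query failed (attempt %d/%d): %s", attempt + 1, retries, exc
            )
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```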
## Contributing

- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.

