WebMind AI is a Universal Web RAG (Retrieval-Augmented Generation) pipeline. It allows users to paste any URL, instantly scrape and index the text, and chat with an AI assistant that grounds its answers strictly in the provided website content.
- Universal Scraping: Extracts clean text from any valid web URL using BeautifulSoup4.
- Smart Chunking: Processes text efficiently for vector embedding to ensure highly relevant context retrieval.
- Vector Database: Uses Qdrant (local, in-memory mode) for fast semantic search over the indexed text chunks.
- Enterprise Security: Securely manages API keys with cloud-native fallback methods to prevent accidental exposure.
- Optimized Deployment: Utilizes Streamlit resource caching to handle heavy AI models and bypass cloud server boot timeouts.
- Frontend UI: Streamlit
- AI Engine: Google Gemini 3 Flash (via Google GenAI SDK)
- Vector DB: Qdrant (Local/Memory)
- Language: Python 3.x
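
The Smart Chunking feature listed above is not specified in detail here, so the following is only a minimal sketch of one common approach: fixed-size character windows with a small overlap, so that a sentence cut at a boundary still appears whole in one chunk. The function name `chunk_text` and the `chunk_size` and `overlap` parameters are hypothetical, and the actual app may chunk differently (for example by sentences or tokens).

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Hypothetical helper for illustration; not taken from the project source.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    # Collapse runs of whitespace so chunk sizes reflect real content.
    normalized = " ".join(text.split())

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(normalized), step):
        chunks.append(normalized[start:start + chunk_size])
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(normalized):
            break
    return chunks


if __name__ == "__main__":
    sample = "WebMind AI scrapes a page, splits it into chunks, and indexes them. " * 20
    for i, chunk in enumerate(chunk_text(sample, chunk_size=200, overlap=20)):
        print(i, len(chunk))
```

Each chunk would then be embedded and stored in the vector database. A larger overlap gives more context per chunk, but it also produces more chunks to embed.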
- Clone the repository: `git clone https://github.com/pratham21-ux/webmind-ai.git`, then `cd webmind-ai`
- Set up a virtual environment (optional but recommended): `python -m venv .venv`, then `source .venv/bin/activate` (on Windows, use `.venv\Scripts\activate`)
- Install dependencies: `pip install -r requirements.txt`
- Set up environment variables: create a `.env` file in the root directory and add your Google Gemini API key as `GEMINI_API_KEY=your_actual_api_key_here`
- Run the app: `streamlit run app.py`
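
To illustrate how the `GEMINI_API_KEY` from the setup steps might be read with a fallback, here is a minimal standard-library sketch. It checks the process environment first (where a cloud host would typically inject the key) and only then parses a local `.env` file. The function name `load_api_key` and its parameters are hypothetical. The actual app may rely on a helper library or on Streamlit's secrets management instead, so treat this as an assumption rather than the project's implementation.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(name: str = "GEMINI_API_KEY", env_path: str = ".env") -> Optional[str]:
    """Return the API key from the environment, else from a .env file, else None.

    Hypothetical helper for illustration; not taken from the project source.
    """
    # 1. Prefer a value already present in the process environment.
    value = os.environ.get(name)
    if value:
        return value

    # 2. Fall back to a simple KEY=value parse of the local .env file.
    path = Path(env_path)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, raw = line.partition("=")
            if key.strip() == name:
                # Remove optional surrounding quotes; treat empty as missing.
                return raw.strip().strip("'\"") or None
    return None


if __name__ == "__main__":
    key = load_api_key()
    # Never print the key itself; only report whether it was found.
    print("API key found" if key else "API key missing: set GEMINI_API_KEY")
```

Keeping the `.env` file out of version control (for example via `.gitignore`) is what prevents the key from being committed by accident.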
The application is currently live and deployed via Streamlit Community Cloud. Try WebMind AI Here
This project is open-source and available for educational and personal use.