This project is a custom chatbot that leverages LangChain, HuggingFace embeddings, and FAISS vector storage to provide conversational responses based on data extracted from a website. The chatbot is exposed as a RESTful API using Flask, making it easy to integrate into other applications.
- Web Scraping: Extracts data from a specified URL using LangChain's WebBaseLoader.
- Embeddings: Uses HuggingFace embeddings to convert text into vector representations.
- Vector Storage: Stores embeddings in a FAISS vector store for efficient similarity search.
- Conversational Memory: Maintains conversation history using LangChain's ConversationBufferMemory.
- RESTful API: Exposes the chatbot as a Flask API for easy integration.
- Customizable: Supports different HuggingFace models and embedding configurations.
- LangChain: For building the conversational chain and handling embeddings.
- HuggingFace Transformers: For generating embeddings and powering the chatbot's language model.
- FAISS: For efficient vector storage and similarity search.
- Flask: For creating the RESTful API.
- Python: The core programming language.
The chatbot uses HuggingFace's google/flan-t5-large as its language model. This model is instruction-tuned for text-to-text generation tasks and produces high-quality responses for conversational AI.
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name

Install the required Python packages:

pip install -r requirements.txt

- Go to HuggingFace Hub and generate an API token.
- Set the token as an environment variable:
- Windows:
set HUGGINGFACEHUB_API_TOKEN=your_api_token_here
- Linux/macOS:
export HUGGINGFACEHUB_API_TOKEN=your_api_token_here
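Once the variable is set, app.py can read it at startup. A minimal sketch (the helper name and the fail-fast check are illustrative additions, not code from this project):

```python
import os

def get_hf_token():
    # Read the HuggingFace API token from the environment; fail fast if unset
    # so the server doesn't start with a silently missing credential.
    token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
    if not token:
        raise RuntimeError(
            "HUGGINGFACEHUB_API_TOKEN is not set; export it before starting the server."
        )
    return token
```

Failing fast here gives a clearer error than letting a later HuggingFace call reject an empty token.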
Start the Flask server:
python app.py

The server will start at http://127.0.0.1:5000.
You can interact with the chatbot using curl, Postman, or any HTTP client.
Send a POST request to the /chat endpoint with a JSON payload:
curl -X POST http://127.0.0.1:5000/chat \
-H "Content-Type: application/json" \
-d '{"message": "What technical courses are available?"}'

Example response:

{
  "response": "The available technical courses include Python programming, Machine Learning, and Data Science."
}

The chatbot extracts data from a specified URL using LangChain's WebBaseLoader.
The extracted data is converted into embeddings using HuggingFace's sentence-transformers/all-mpnet-base-v2.
The embeddings are stored in a FAISS vector store for efficient retrieval.
A conversational chain is initialized using HuggingFace's google/flan-t5-large model and the FAISS vector store.
The chatbot is exposed as a RESTful API using Flask. Users can send messages to the /chat endpoint and receive responses.
Modify the model_name parameter in the create_vector_store function:
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

Replace the HuggingFace model (google/flan-t5-large) with any other supported model:
llm = HuggingFaceEndpoint(
repo_id="another-model-name",
huggingfacehub_api_token=huggingfacehub_api_token,
temperature=0.7,
max_length=512
)

Extend the Flask API by adding more routes in app.py.
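For example, a health-check route could sit alongside /chat. The route name and payload here are illustrative, not part of the project:

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical extra route: a simple health check for monitoring or load balancers.
@app.route("/health", methods=["GET"])
def health():
    return jsonify({"status": "ok"})
```

Routes like this are also easy to verify with Flask's built-in test client, without starting a real server.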
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License. See the LICENSE file for details.

