This project implements a mental support chatbot using a Retrieval-Augmented Generation (RAG) framework. The chatbot is designed to provide compassionate, thoughtful, and context-aware responses by leveraging pre-existing datasets and integrating advanced natural language processing techniques.
The chatbot:
- Uses a CSV file of psychological contexts and responses to generate insightful answers.
- Embeds data using `SentenceTransformers` and stores it in a Pinecone vector database.
- Retrieves relevant context dynamically during conversation.
- Fine-tunes an LLM to act as a compassionate, friendly personal assistant.
- Processes queries to provide meaningful, concise, and empathetic responses.
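End to end, these pieces form a standard RAG loop: embed the query, retrieve the nearest stored context, and generate a grounded response. The sketch below illustrates that flow with stand-in helpers (`embed`, `vector_search`, `generate` are illustrative names, not this project's actual API; the real system calls SentenceTransformers, Pinecone, and LLaMA 3.3 at those points):

```python
# Minimal sketch of the RAG loop with stand-ins for the real
# SentenceTransformers / Pinecone / LLaMA 3.3 components.

def embed(text: str) -> list[float]:
    # Stand-in: the real pipeline calls SentenceTransformers here.
    return [float(ord(c)) for c in text[:4]]

def vector_search(query_vec, store, top_k=1):
    # Stand-in for a Pinecone query: nearest neighbour by squared distance.
    def dist(vec):
        return sum((a - b) ** 2 for a, b in zip(query_vec, vec))
    return sorted(store, key=lambda item: dist(item["vector"]))[:top_k]

def generate(prompt: str) -> str:
    # Stand-in for the LLaMA 3.3 generation call.
    return f"(LLM response to: {prompt!r})"

def answer(query: str, store) -> str:
    # Retrieve context, build an augmented prompt, generate a reply.
    context = vector_search(embed(query), store)
    prompt = f"Context: {context[0]['text']}\nUser: {query}"
    return generate(prompt)

store = [{"text": "Breathing exercises can reduce acute anxiety.",
          "vector": embed("anxiety")}]
print(answer("anxiety", store))
```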
- Data Preprocessing:
  - Download the data from Hugging Face and drop the third column, which contains irrelevant responses.
  - Convert the data to CSV using pandas.
  - Load the dataset using LangChain's `CSVLoader`.
  - Format the data into question-answer pairs.
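The preprocessing steps above might look like the following sketch with pandas; the column names and sample rows here are assumed for illustration, not taken from the actual Hugging Face dataset:

```python
import pandas as pd

# Hypothetical raw data with three columns; in the real pipeline this
# comes from the Hugging Face dataset.
raw = pd.DataFrame({
    "Context": ["I feel anxious lately.", "I can't sleep well."],
    "Response": ["It helps to talk about it.", "A routine may help."],
    "ExtraColumn": ["irrelevant", "irrelevant"],
})

# Drop the third column, which held irrelevant responses.
df = raw.drop(columns=["ExtraColumn"])

# Save to CSV so LangChain's CSVLoader can pick it up later.
df.to_csv("mental_health.csv", index=False)

# Format rows into question-answer pairs for downstream chunking.
qa_pairs = [f"Q: {row.Context}\nA: {row.Response}" for row in df.itertuples()]
print(qa_pairs[0])
```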
- Text Chunking:
  - Use LangChain's `RecursiveCharacterTextSplitter` for chunking large texts.
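`RecursiveCharacterTextSplitter` tries a priority list of separators (paragraphs, then lines, then words) and falls back to finer ones when a piece is still too long. The pure-Python function below is a simplified illustration of that idea, not LangChain's actual implementation:

```python
def recursive_split(text, chunk_size=40, separators=("\n\n", "\n", " ", "")):
    # Simplified illustration of recursive character splitting: split on
    # the coarsest separator first, recurse with finer ones for oversized
    # pieces, then greedily merge small pieces back up to chunk_size.
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, *rest = separators
    if sep == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    pieces = []
    for piece in text.split(sep):
        if len(piece) > chunk_size:
            pieces.extend(recursive_split(piece, chunk_size, rest or separators))
        elif piece:
            pieces.append(piece)
    # Greedily merge adjacent small pieces so chunks approach chunk_size.
    chunks, current = [], ""
    for piece in pieces:
        candidate = (current + sep + piece) if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks

doc = ("First paragraph about coping strategies.\n\n"
       "Second paragraph, somewhat longer, about sleep hygiene and routines.")
for chunk in recursive_split(doc):
    print(chunk)
```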
- Embeddings:
  - Generate vector embeddings using `SentenceTransformers` (all-MiniLM-L6-v2).
- Storage:
  - Store embeddings in Pinecone for fast and scalable retrieval.
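A runnable sketch of the embed-and-store step is shown below. To keep it self-contained, a toy letter-frequency encoder stands in for `SentenceTransformer.encode` (real all-MiniLM-L6-v2 embeddings are 384-dimensional), and an in-memory class mimics a Pinecone index's upsert/query interface, so no API credentials are needed:

```python
import math

def fake_encode(text: str) -> list[float]:
    # Stand-in for SentenceTransformers: a normalized letter-frequency
    # vector. The real model returns dense semantic embeddings.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class InMemoryIndex:
    # Minimal stand-in for a Pinecone index: upsert vectors with metadata,
    # then query by cosine similarity.
    def __init__(self):
        self.vectors = {}

    def upsert(self, items):
        for item_id, vec, metadata in items:
            self.vectors[item_id] = (vec, metadata)

    def query(self, vec, top_k=1):
        def cosine(a, b):
            return sum(x * y for x, y in zip(a, b))
        ranked = sorted(self.vectors.items(),
                        key=lambda kv: cosine(vec, kv[1][0]), reverse=True)
        return [{"id": k, "metadata": v[1]} for k, v in ranked[:top_k]]

index = InMemoryIndex()
chunks = ["Talking to friends can ease stress.", "Regular sleep improves mood."]
index.upsert([(str(i), fake_encode(c), {"text": c}) for i, c in enumerate(chunks)])
print(index.query(fake_encode("sleep and mood"), top_k=1)[0]["metadata"]["text"])
```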
- Retrieval-Augmented Generation:
  - Retrieve context based on user queries.
  - Use LLaMA 3.3 for generating responses augmented with retrieved context.
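Augmentation itself can be as simple as concatenating the retrieved chunks with the user query before the generation call. The template below is an assumed example of such a prompt, not the project's exact wording:

```python
def build_prompt(context_chunks, user_query):
    # Assumed prompt template: persona instruction, retrieved context,
    # then the user's query. The real project's wording may differ.
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "You are a compassionate, friendly mental-support assistant.\n"
        "Use the following context to answer empathetically and concisely.\n\n"
        f"Context:\n{context}\n\n"
        f"User: {user_query}\nAssistant:"
    )

prompt = build_prompt(
    ["Grounding techniques can help during panic attacks."],
    "I had a panic attack at work today.",
)
print(prompt)
# In the real pipeline this prompt is sent to LLaMA 3.3 for generation.
```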
- Language Models: LLaMA 3.3
- Embeddings: SentenceTransformers (all-MiniLM-L6-v2)
- Database: Pinecone
- Frameworks: LangChain, Streamlit
- Programming Language: Python