RAG Chatbot with LangChain, Pinecone, and Hugging Face

A Retrieval-Augmented Generation (RAG) chatbot that indexes PDF documents in Pinecone and answers questions from their content, using Cohere for embeddings and reranking and Falcon-7B-Instruct (via Hugging Face Hub) for generation.

Features

PDF Document Processing: Extract and process text from PDF files
Vector Database Storage: Store document embeddings in Pinecone for efficient retrieval
Conversational Memory: Maintain chat history for contextual conversations
Advanced Retrieval: Uses Cohere's reranking for improved document relevance
Swappable LLM: Uses Falcon-7B-Instruct via Hugging Face Hub by default; other Hub models can be substituted (see Switch LLM Models)

Architecture

The system consists of two main components:

Document Indexing (indexing_script.py): Processes PDFs and stores embeddings
Chat Interface (chat_script.py): Handles user queries and generates responses

Prerequisites

Python 3.7+
Valid API keys for:
- Pinecone
- Cohere
- Hugging Face Hub

Installation

Clone the repository:

git clone <repository-url>
cd rag-chatbot

Install required packages:

pip install -r requirements.txt

Required Dependencies

langchain
pinecone-client
cohere
PyPDF2
transformers
torch
tqdm

Configuration

API Keys Setup

Replace the placeholder API keys in both scripts:

# In both files, update these variables:
pine_cone_key = "your_pinecone_api_key"
cohere_key = "your_cohere_api_key" 
huggin_face_api = "your_hugging_face_api_key"

Environment Variables (Alternative)

You can also set environment variables:

export PINECONE_API_KEY="your_pinecone_api_key"
export COHERE_API_KEY="your_cohere_api_key"
export HUGGINGFACE_API_TOKEN="your_hugging_face_api_key"
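
If you go the environment-variable route, the scripts need to read those values instead of the hard-coded placeholders. The following is a minimal sketch of one way to do that; `load_api_keys` and `REQUIRED_VARS` are hypothetical names, not part of the existing scripts.

```python
import os

# Variable names match the export commands above.
REQUIRED_VARS = ("PINECONE_API_KEY", "COHERE_API_KEY", "HUGGINGFACE_API_TOKEN")


def load_api_keys(env=None):
    """Return the required keys as a dict, or raise if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

In the scripts this could replace the placeholder assignments, for example `pine_cone_key = load_api_keys()["PINECONE_API_KEY"]`. Failing early with the list of missing names is usually easier to debug than an authentication error from one of the services.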

Usage

Step 1: Document Indexing

First, process your PDF document and create the vector database:

Place your PDF file in the project directory (default: "Attention is all you need.pdf")
Run the indexing script:

python indexing_script.py

This will:

Extract text from the PDF
Split text into chunks (500 tokens each, with a 25-token overlap)
Generate embeddings using Cohere (embed-english-light-v2.0)
Store embeddings in the Pinecone index named "chatdatabase"
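
The last two steps above (embed, then store) can be sketched roughly as follows. This is an illustration only, not the code in the indexing script: `build_records`, `batched`, and `fake_embed` are hypothetical names, the `(id, vector, metadata)` record shape is an assumption, and `fake_embed` merely stands in for the real Cohere embedding call.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Record = Tuple[str, List[float], dict]


def fake_embed(texts: Sequence[str]) -> List[List[float]]:
    """Stand-in for an embedding model: one tiny vector per text."""
    return [[float(len(text))] for text in texts]


def build_records(chunks: Sequence[str],
                  embed: Callable[[Sequence[str]], List[List[float]]]) -> List[Record]:
    """Pair each chunk with its vector and keep the raw text as metadata."""
    vectors = embed(chunks)
    return [
        ("chunk-%d" % i, vector, {"text": text})
        for i, (text, vector) in enumerate(zip(chunks, vectors))
    ]


def batched(items: Sequence, batch_size: int = 100) -> Iterator[Sequence]:
    """Yield fixed-size slices so each upload request stays small."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

Each batch would then be sent to the vector store in its own request. Storing the chunk text as metadata is what lets the chat script show the retrieved passage to the LLM later.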

Step 2: Start Chatting

Once indexing is complete, run the chat interface:

python chat_script.py

The system will initialize and you can start asking questions about your document.

Configuration Options

Text Splitting

Chunk Size: 500 tokens (adjustable in TokenTextSplitter)
Chunk Overlap: 25 tokens
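
To see how chunk size and overlap interact, here is a minimal sketch of sliding-window splitting over an already tokenized document. It is not the TokenTextSplitter implementation; `split_tokens` is a hypothetical helper and the tokens can be any list (for example, whitespace-separated words as a rough stand-in for model tokens).

```python
from typing import List, Sequence


def split_tokens(tokens: Sequence, chunk_size: int = 500,
                 chunk_overlap: int = 25) -> List[Sequence]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # each window starts this far after the last
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks
```

With the defaults, each new chunk starts 475 tokens after the previous one, so the last 25 tokens of one chunk reappear at the start of the next. The overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.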

Retrieval Settings

Top-K Results: 4 documents retrieved per query
Reranking: Cohere rerank for improved relevance
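
The two retrieval stages can be pictured with the sketch below: a cheap vector similarity search picks the top 4 candidates, then a second scoring pass reorders them. This is only an illustration under stated assumptions: cosine similarity is assumed as the index metric, and the `relevance` callable is a hypothetical stand-in for the Cohere reranker.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec: Sequence[float],
                   doc_vecs: Dict[str, Sequence[float]], k: int = 4) -> List[str]:
    """Return the ids of the k documents most similar to the query vector."""
    scored = sorted(
        ((cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()),
        reverse=True,
    )
    return [doc_id for _, doc_id in scored[:k]]


def rerank(candidates: List[str], relevance: Callable[[str], float]) -> List[str]:
    """Reorder candidates by a second, typically more accurate, relevance score."""
    return sorted(candidates, key=relevance, reverse=True)
```

The first stage only compares vectors, so it scales to a large index; the reranker looks at far fewer candidates and can afford a more expensive comparison of query and passage text.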

Memory Settings

Token Limit: 1000 tokens for conversation history
Memory Type: ConversationTokenBufferMemory
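
A token-limited buffer keeps the most recent messages and discards the oldest ones once the history exceeds the limit. The sketch below illustrates that idea only; it is not the LangChain class itself, `prune_history` is a hypothetical name, and the default whitespace word count is a rough stand-in for a real tokenizer.

```python
from typing import Callable, List, Sequence


def prune_history(messages: Sequence[str], max_token_limit: int = 1000,
                  count_tokens: Callable[[str], int] = lambda text: len(text.split())
                  ) -> List[str]:
    """Drop the oldest messages until the total token count fits the limit."""
    pruned = list(messages)
    while pruned and sum(count_tokens(m) for m in pruned) > max_token_limit:
        pruned.pop(0)  # oldest message goes first
    return pruned
```

A lower limit leaves more room in the prompt for retrieved document chunks, at the cost of the model forgetting earlier turns sooner.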

LLM Configuration

Model: tiiuae/falcon-7b-instruct
Temperature: 0.6
Max New Tokens: 2000
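
One way to keep these generation settings in a single place is a small settings object, sketched below. `GenerationSettings` is a hypothetical name, and the keyword names returned by `model_kwargs` are an assumption about what the model wrapper expects; check them against the wrapper you actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Defaults mirror the values listed above."""
    repo_id: str = "tiiuae/falcon-7b-instruct"
    temperature: float = 0.6
    max_new_tokens: int = 2000

    def __post_init__(self):
        if self.temperature <= 0:
            raise ValueError("temperature must be positive")
        if self.max_new_tokens < 1:
            raise ValueError("max_new_tokens must be at least 1")

    def model_kwargs(self) -> dict:
        """Generation parameters as a plain dict (key names are assumed)."""
        return {
            "temperature": self.temperature,
            "max_new_tokens": self.max_new_tokens,
        }
```

For example, `GenerationSettings(temperature=0.2).model_kwargs()` yields a more deterministic configuration while keeping the other defaults.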

Customization

Change PDF Document

Update the file path in the indexing script:

Your_text_data = get_pdf_text("path/to/your/document.pdf")

Modify Prompts

Customize the system prompt in chat_script.py:

template = """
Your custom prompt here...
{question}
"""
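
As an illustration of what a fuller custom prompt might look like, the sketch below fills a template with retrieved chunks, chat history, and the question using plain string formatting. The `{context}` and `{chat_history}` placeholders and the `build_prompt` helper are assumptions for this example; the chain in chat_script.py may expect different variable names.

```python
from typing import Sequence

TEMPLATE = """Answer the question using only the context below.
If the context does not contain the answer, say so.

Context:
{context}

Chat history:
{chat_history}

Question: {question}
Answer:"""


def build_prompt(question: str, context_chunks: Sequence[str],
                 chat_history: str = "") -> str:
    """Join the retrieved chunks and substitute everything into the template."""
    return TEMPLATE.format(
        context="\n\n".join(context_chunks),
        chat_history=chat_history,
        question=question,
    )
```

Telling the model to rely only on the supplied context, and to admit when the answer is missing, tends to reduce answers that are not grounded in the document.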

Switch LLM Models

Replace the Falcon model with another Hugging Face model:

repo_id = "your_preferred_model"

Pinecone Setup

Create a Pinecone account at pinecone.io
Create a new index named "chatdatabase" whose dimension matches the output size of the embedding model
Use the starter environment: 'gcp-starter'

Cohere Setup

Get an API key from Cohere
The system uses:
- embed-english-light-v2.0 for embeddings
- Default reranker for document reranking

Troubleshooting

Common Issues

API Key Errors: Ensure all API keys are valid and have sufficient quota
Index Not Found: Run the indexing script before the chat script
Memory Issues: Reduce chunk size or max_token_limit for large documents
Slow Responses: Reduce the retrieval count (Top-K) or Max New Tokens, or consider a different embedding model

Error Handling

The code includes basic error handling, but you may want to add:

API rate limit handling
Network timeout management
Input validation
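
For rate limits and transient network failures, a common approach is to retry with exponential backoff. The following is a minimal sketch, not part of the existing scripts; `call_with_retries` is a hypothetical helper, and in practice `retry_on` should be narrowed to the specific exception types the API clients raise.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 1.0,
                      retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, waiting base_delay * 2**n between failed attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Any API call can be wrapped in a zero-argument callable, for example `call_with_retries(lambda: client.some_call(args))`, where `client.some_call` is a placeholder for the real request. The `sleep` parameter exists so the delay can be replaced in tests.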

Performance Optimization

Batch Processing: Process multiple PDFs by modifying the indexing loop
Caching: Implement caching for frequently asked questions
Model Optimization: Use quantized models for faster inference
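
The caching idea can be sketched with the standard library alone. This is an illustration under assumptions: `make_cached_answerer` and `normalize` are hypothetical helpers, and `answer_fn` stands in for whatever function runs the full retrieval and generation pipeline.

```python
from functools import lru_cache
from typing import Callable


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivially different inputs match."""
    return " ".join(question.lower().split())


def make_cached_answerer(answer_fn: Callable[[str], str],
                         maxsize: int = 128) -> Callable[[str], str]:
    """Wrap answer_fn so repeated questions are served from an in-memory cache."""

    @lru_cache(maxsize=maxsize)
    def cached(normalized_question: str) -> str:
        return answer_fn(normalized_question)

    def ask(question: str) -> str:
        return cached(normalize(question))

    return ask
```

Note that a cache keyed only on the question ignores conversation history, so it suits standalone questions better than follow-ups that depend on earlier turns.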

Future Enhancements

Support for multiple document formats (Word, TXT, etc.)
Web interface using Streamlit or Chainlit
Multi-language support
Advanced conversation features
Document source citations
Conversation export functionality

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

License

[Add your license here]

Support

For questions or issues, please create an issue or contact [your-email].

Acknowledgments

Built with LangChain
Vector storage by Pinecone
Embeddings by Cohere
Language model from Hugging Face

Note: This is a development version. For production use, implement proper error handling, logging, and security measures.
