Recipe Retriever RAG 🍳

A Retrieval-Augmented Generation (RAG) system for finding recipes with natural language search. This project uses semantic search to match your queries with relevant recipes, even when they don't contain the exact words you typed.

Features

Natural Language Search: Find recipes using conversational queries like "healthy vegetarian dinner" or "quick breakfast ideas"
Semantic Understanding: Leverages modern embedding models to understand the meaning behind your queries
Text Chunking: Splits recipes into smaller pieces for more accurate semantic search
Interactive UI: User-friendly Streamlit interface for searching and viewing recipes
Flexible Data Sources: Use our sample recipe data or upload your own recipe CSV file
Detailed Recipe Display: View ingredients, directions, nutrition information, and more

Technologies Used

LangChain: Framework for building applications with language models
Sentence Transformers: For generating embeddings from recipe text
FAISS: Vector database for efficient similarity search
Streamlit: For the web-based user interface
Pandas: For data processing and manipulation

Installation

Clone this repository:

git clone https://github.com/ShivaniNR/Recipe-Retriever-RAG.git
cd Recipe-Retriever-RAG

Create a virtual environment (optional but recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install the required dependencies:
```
pip install -r requirements.txt
```

Project Structure

Recipe-Retriever-RAG/
│
├── data/                    # Data directory
│   ├── raw/                 # Raw recipe data (Food.com-recipes.csv)
│   ├── processed/           # Processed recipe data
│   └── vectorstore/         # Vector database files
│
├── src/                     # Source code
│   ├── data_preprocessing.py # Data loading and preprocessing
│   ├── embeddings.py        # Embedding generation and search
│   └── setup_script.py      # One-time setup script
│
├── app.py                   # Streamlit web application
└── requirements.txt         # Project dependencies

Implementation Details

Component Architecture

┌─────────────────┐          ┌───────────────────┐
│                 │          │                   │
│  Raw Recipe CSV ├─────────►│   DataPipeline    │
│                 │          │                   │
└─────────────────┘          └─────────┬─────────┘
                                       │
                                       │ Processed CSV
                                       ▼
┌─────────────────┐          ┌─────────────────────┐
│                 │          │                     │
│  HuggingFace    │◄─────────┤ SimpleRecipeEmbeddings
│  Model          │          │                     │
│                 │          └──────────┬──────────┘
└─────────────────┘                     │
                                        │ Vector Embeddings
                                        ▼
┌─────────────────┐          ┌─────────────────────┐         ┌─────────────┐
│                 │          │                     │         │             │
│  User Query     ├─────────►│   Streamlit App     │◄────────┤ FAISS Vector│
│                 │          │                     │         │ Store       │
└─────────────────┘          └─────────────────────┘         │             │
                                       │                     └─────────────┘
                                       │
                                       ▼
                              ┌─────────────────────┐
                              │                     │
                              │   Search Results    │
                              │                     │
                              └─────────────────────┘

Sequence Diagram

┌─────┐          ┌───────────────┐          ┌─────────────────┐          ┌──────────┐          ┌─────────┐
│Setup│          │DataPipeline   │          │RecipeEmbeddings │          │FAISS     │          │Streamlit│
└──┬──┘          └───────┬───────┘          └────────┬────────┘          └────┬─────┘          └────┬────┘
   │                     │                           │                        │                     │
   │ Run setup_script.py │                           │                        │                     │
   ├────────────────────►│                           │                        │                     │
   │                     │                           │                        │                     │
   │                     │ Load & process CSV        │                        │                     │
   │                     │◄──────────────────────────┤                        │                     │
   │                     │                           │                        │                     │
   │                     │ Return processed data     │                        │                     │
   │                     ├──────────────────────────►│                        │                     │
   │                     │                           │                        │                     │
   │                     │                           │ Create embeddings      │                     │
   │                     │                           ├───────────────────────►│                     │
   │                     │                           │                        │                     │
   │                     │                           │ Store vectors          │                     │
   │                     │                           │◄───────────────────────┤                     │
   │                     │                           │                        │                     │
   │                     │                           │                        │                     │
   │                     │                           │                        │                     │
┌──┴──┐          ┌───────┴───────┐          ┌────────┴────────┐          ┌────┴─────┐          ┌────┴────┐
│User │          │               │          │                 │          │          │          │         │
└──┬──┘          └───────────────┘          └─────────────────┘          └──────────┘          └────┬────┘
   │                                                                                                │
   │ Run Streamlit app                                                                              │
   ├─────────────────────────────────────────────────────────────────────────────────────────────►│
   │                                                                                                │
   │ Enter search query                                                                             │
   ├─────────────────────────────────────────────────────────────────────────────────────────────►│
   │                                                                                                │
   │                     ┌───────────────┐          ┌────────┬────────┐          ┌────┬─────┐     │
   │                     │               │          │RecipeEmbeddings │          │FAISS     │     │
   │                     └───────┬───────┘          └────────┬────────┘          └────┬─────┘     │
   │                             │                           │                        │           │
   │                             │                           │ Convert query to vector│           │
   │                             │                           │◄───────────────────────┤           │
   │                             │                           │                        │           │
   │                             │                           │ Search similar vectors │           │
   │                             │                           ├───────────────────────►│           │
   │                             │                           │                        │           │
   │                             │                           │ Return similar recipes │           │
   │                             │                           │◄───────────────────────┤           │
   │                             │                           │                        │           │
   │                             │                           │                        │           │
   │ Display search results                                                                        │
   │◄─────────────────────────────────────────────────────────────────────────────────────────────┤
   │                                                                                                │
┌──┴──┐          ┌───────────────┐          ┌─────────────────┐          ┌──────────┐          ┌────┴────┐
│     │          │               │          │                 │          │          │          │         │
└─────┘          └───────────────┘          └─────────────────┘          └──────────┘          └─────────┘

Core Components

1. Data Preprocessing (`src/data_preprocessing.py`)

The DataPipeline class handles the preprocessing of raw recipe data:

Efficient data loading: Reads CSV data with optimized dtypes and chunking for memory efficiency
Time parsing: Converts ISO duration formats to minutes
Ingredient processing: Combines quantities and ingredient parts
Recipe categorization: Automatically categorizes recipes by difficulty and time
Searchable text creation: Generates optimized text for semantic search
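
As a rough illustration of the time-parsing and ingredient-processing steps listed above, here is a minimal sketch. The function names `duration_to_minutes` and `combine_ingredients` are hypothetical and are not taken from `src/data_preprocessing.py`; the sketch also assumes durations only contain hours and minutes (for example `PT1H30M`).

```python
import re

# ISO 8601 durations such as "PT1H30M". Assumption: only the hour and
# minute components appear; days and seconds are not handled here.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?$")


def duration_to_minutes(value):
    """Return the duration in whole minutes, or None if it cannot be parsed."""
    if not isinstance(value, str):
        return None
    match = _DURATION.match(value.strip())
    if match is None or not any(match.groups()):
        return None
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


def combine_ingredients(quantities, parts):
    """Pair each quantity with its ingredient name, e.g. '2' + 'eggs' -> '2 eggs'."""
    return [f"{q} {p}".strip() for q, p in zip(quantities, parts)]


if __name__ == "__main__":
    print(duration_to_minutes("PT1H30M"))
    print(combine_ingredients(["2", "1 cup"], ["eggs", "flour"]))
```

Returning `None` for unparseable values, rather than raising, keeps a single malformed row from stopping a bulk preprocessing run; whether the actual pipeline does the same is not stated here.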

2. Embeddings Generation (`src/embeddings.py`)

The SimpleRecipeEmbeddings class manages the creation and search of recipe embeddings:

Model loading: Uses HuggingFace's Sentence Transformers (default: 'all-MiniLM-L6-v2')
Document preparation: Converts processed recipe data to LangChain Document objects
Vector creation: Generates embeddings for each recipe
FAISS integration: Stores embeddings in a FAISS vector database for efficient similarity search
Search functionality: Provides semantic search capabilities with metadata filtering
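
The document-preparation and metadata-filtering ideas above can be sketched without any external libraries. `RecipeDoc` below is a stand-in for a LangChain `Document` (text plus a metadata dict), and the row keys `name`, `ingredients`, and `total_minutes` are assumptions made for the example, not the project's actual column names.

```python
from dataclasses import dataclass, field


@dataclass
class RecipeDoc:
    """Stand-in for a LangChain Document: text to embed plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def to_document(row):
    """Build one document per recipe row (a dict with hypothetical keys)."""
    text = f"{row['name']}. Ingredients: {', '.join(row['ingredients'])}"
    metadata = {"name": row["name"], "total_minutes": row["total_minutes"]}
    return RecipeDoc(page_content=text, metadata=metadata)


def filter_by_metadata(docs, max_minutes):
    """Keep only documents whose total time fits within the limit."""
    return [d for d in docs if d.metadata["total_minutes"] <= max_minutes]


if __name__ == "__main__":
    rows = [
        {"name": "Oatmeal", "ingredients": ["oats", "milk"], "total_minutes": 10},
        {"name": "Beef stew", "ingredients": ["beef", "carrots"], "total_minutes": 150},
    ]
    docs = [to_document(r) for r in rows]
    for doc in filter_by_metadata(docs, max_minutes=30):
        print(doc.page_content)
```

Keeping fields such as cooking time in the metadata rather than in the embedded text lets a search narrow results by exact values while the embedding handles the fuzzy, semantic part of the match.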

3. Streamlit Application (`app.py`)

The web interface provides a user-friendly way to interact with the recipe search system:

Cached loading: Efficiently loads the vectorstore once
Search interface: Allows natural language queries
Result display: Shows recipe details including ingredients, instructions, and metadata
Suggestion buttons: Provides example queries for users to try

4. Setup Script (`src/setup_script.py`)

A one-time setup script that:

Loads and preprocesses the raw recipe data
Creates embeddings and stores them in a FAISS vector database

Usage

Set up the system:
```
python src/setup_script.py
```
Run the Streamlit app:
```
streamlit run app.py
```
Enter your recipe search query in the search box and explore the results!

How It Works

Data Processing: Recipes are loaded and cleaned by the DataPipeline class
Embedding Generation: The SimpleRecipeEmbeddings class uses a Sentence Transformer model to convert recipe text into vector embeddings
Vector Storage: Embeddings are stored in a FAISS index for efficient similarity search
Query Processing: When you enter a search query, it's converted to an embedding and compared to the recipe embeddings
Result Retrieval: The most similar recipes are retrieved and displayed based on semantic similarity
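
The query-processing and retrieval steps above can be illustrated with a toy example. This is only a sketch: the `embed` function here counts words and is a stand-in for a Sentence Transformer model (so it matches on shared words, not on meaning), and the brute-force cosine ranking stands in for the FAISS index, which may use a different distance metric in the actual project.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real model returns a dense float vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, recipes, k=2):
    """Embed the query, score every recipe, and return the k best matches."""
    query_vec = embed(query)
    ranked = sorted(recipes, key=lambda r: cosine(query_vec, embed(r)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    recipes = [
        "quick vegetarian pasta dinner",
        "chocolate chip cookies",
        "quick breakfast oatmeal",
    ]
    print(search("quick breakfast", recipes, k=1))
```

The shape of the flow is the same as in the real system: the query and the recipes are mapped into one vector space, and the closest recipe vectors are returned first.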

Future Improvements

Refine text chunking to better handle longer recipes
Implement dietary restriction filtering
Add ingredient substitution suggestions
Add user accounts and saved favorites

License

This project is open source and available under the MIT License.

Acknowledgments

Built with Streamlit
Embedding models from Sentence Transformers
Vector search with FAISS
Recipe data from Food.com dataset
