# Recipe Parser

A Python-based OCR and LLM classification system for parsing handwritten recipe images into structured data for a searchable web application.
This project processes directories of handwritten recipe images (notecards, recipe pages) and extracts structured information including:
- Recipe name
- Recipe type and subtype classification
- Ingredients list
- Instructions
- Cooking time (when present)
- Required utensils (e.g., oven, 9x13 pan, etc.)
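The extracted fields above can be modeled as a small record type. The sketch below is illustrative only — the field names mirror the list above, but the class name and defaults are assumptions, not the project's actual API:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Recipe:
    """Structured data extracted from one handwritten recipe (illustrative sketch)."""
    name: str
    type: str                                   # e.g. "Dessert"
    subtype: str                                # e.g. "Cookies"
    ingredients: list[str] = field(default_factory=list)
    instructions: list[str] = field(default_factory=list)
    cooking_time: Optional[str] = None          # only present on some cards
    utensils: list[str] = field(default_factory=list)
    source_images: list[str] = field(default_factory=list)

# Example record with only the required fields filled in
recipe = Recipe(name="Chocolate Chip Cookies", type="Dessert", subtype="Cookies")
```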
## Input Structure

The input recipe directory should be organized as:

```
Recipe Directory/
├── Recipe Type Dir/
│   ├── Recipe Name/
│   │   ├── image.jpg              (single image)
│   │   └── OR
│   │   ├── recipe-name (A).jpg    (front of notecard)
│   │   └── recipe-name (B).jpg    (back of notecard)
```

## Project Structure

```
recipe-parser/
├── src/
│   ├── ocr/              # OCR processing modules
│   ├── llm/              # LLM classification and parsing
│   ├── pipeline/         # End-to-end processing pipeline
│   └── utils/            # Helper utilities
├── output/               # Processed recipe data (JSON)
├── tests/                # Unit tests
├── scripts/              # Executable scripts
├── requirements.txt      # Python dependencies
└── .env.example          # Environment variable template
```

## Prerequisites

- Python 3.10+
- Tesseract OCR installed on your system
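Before running the pipeline, you can confirm the Tesseract binary is actually on your `PATH`. A minimal standard-library check (the helper name is illustrative):

```python
import shutil
import subprocess

def tesseract_available() -> bool:
    """Return True if the `tesseract` binary can be found on PATH."""
    return shutil.which("tesseract") is not None

if tesseract_available():
    # Some Tesseract builds print the version to stderr rather than stdout,
    # so check both streams.
    out = subprocess.run(["tesseract", "--version"],
                         capture_output=True, text=True)
    lines = (out.stdout or out.stderr).splitlines()
    print(lines[0] if lines else "tesseract found")
else:
    print("Tesseract not found - install it first (e.g. `brew install tesseract`)")
```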
## Installation

1. Install Tesseract OCR:

   ```bash
   brew install tesseract
   ```

2. Install Python dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Configure environment variables:

   ```bash
   cp .env.example .env
   ```

   Then edit `.env` with your API keys.

## Usage

Process a single recipe directory:

```bash
python scripts/process_recipes.py --input /path/to/recipe/dir --output ./output
```

Process a nested recipe directory recursively:

```bash
python scripts/process_recipes.py --input /path/to/main/recipe/dir --recursive --output ./output
```

## Output

Each recipe is exported as a JSON file with the following structure:
```json
{
  "name": "Chocolate Chip Cookies",
  "type": "Dessert",
  "subtype": "Cookies",
  "ingredients": [...],
  "instructions": [...],
  "cooking_time": "25 minutes",
  "utensils": ["oven", "baking sheet", "mixing bowl"],
  "source_images": ["path/to/image.jpg"]
}
```

## Development

Run the test suite:

```bash
pytest tests/
```

Format and lint:

```bash
black src/ tests/ scripts/
ruff check src/ tests/ scripts/
```

## License

MIT