This project uses GitHub Actions to automatically generate JSON recipe data, compatible with cooking modules, from the Markdown files in the dishes/ directory. The optimized parser achieves a 100% parsing success rate, extracting complete data from all 324 recipes.
- 📊 324 Recipes: Complete recipe collection covering 10 categories
- 🎯 100% Parsing Success Rate: All recipes successfully extract steps and ingredients information
- 🔧 Multi-format Support: Compatible with dash (-), asterisk (*), and plus (+) list formats
- ⚡ Real-time Updates: Automatically regenerates JSON data when recipes are modified
.github/
  workflows/
    generate-recipes.yml      # GitHub Action workflow
scripts/
  generate_recipes.py         # Python parsing script
  test_compatibility.py       # Compatibility testing script
all_recipes.json              # Generated recipe data file (324 recipes)
- Push to the main/master branch with modifications to `dishes/**/*.md` files
- Manual trigger (`workflow_dispatch`)
- Pull Request containing modifications to `dishes/**/*.md`
- Checkout code
- Setup Python environment
- Run parsing script to generate JSON
- Check for changes
- Auto-commit and push if changes detected
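The "check for changes" step above can be approximated in Python by comparing the freshly serialized JSON with the file already on disk. This is a minimal sketch under stated assumptions, not the workflow's actual implementation (which may well rely on git to detect changes); `write_if_changed` is a hypothetical helper that does not exist in the project's scripts.

```python
import json
from pathlib import Path


def write_if_changed(recipes: list, output_path: Path) -> bool:
    """Write recipes as UTF-8 JSON only when the content differs.

    Returns True if the file was (re)written, False if it was already
    up to date. Hypothetical helper for illustration only.
    """
    # ensure_ascii=False keeps Chinese text readable in the output file.
    new_text = json.dumps(recipes, ensure_ascii=False, indent=2) + "\n"
    if output_path.exists() and output_path.read_text(encoding="utf-8") == new_text:
        return False
    output_path.write_text(new_text, encoding="utf-8")
    return True
```

Returning a boolean lets a caller decide whether a commit is needed at all, which mirrors the "auto-commit and push if changes detected" step.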
The generated JSON is fully compatible with the domobot cooking module and contains the following fields:
{
  "id": "category/dish_name",               // Unique identifier
  "name": "Recipe Name",                    // Recipe title
  "description": "Recipe description",      // Description from MD file
  "source_path": "dishes/path/file.md",     // Source file path
  "category": "Category",                   // Chinese category mapped from directory
  "difficulty": 3,                          // Difficulty level (1-7 stars)
  "servings": 2,                            // Serving size (people)
  "tags": ["Tag1", "Tag2"],                 // Auto-generated tags
  "ingredients": [                          // Ingredients list
    {
      "name": "Ingredient Name",
      "quantity": 100,                      // Amount (optional)
      "unit": "g",                          // Unit (optional)
      "text_quantity": "- Ingredient 100g", // Original text
      "notes": null                         // Notes (optional)
    }
  ],
  "steps": [                                // Cooking steps
    {
      "step": 1,
      "description": "Step description"
    }
  ]
}

| Directory | Chinese Category | Recipe Count |
|---|---|---|
| aquatic | 水产 | 24 |
| breakfast | 早餐 | 21 |
| condiment | 调料 | 9 |
| dessert | 甜品 | 18 |
| drink | 饮品 | 21 |
| meat_dish | 荤菜 | 97 |
| semi-finished | 半成品加工 | 10 |
| soup | 汤羹 | 22 |
| staple | 主食 | 48 |
| vegetable_dish | 素菜 | 54 |
Total: 324 recipes
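The per-category counts in the table above can be recomputed from recipe records. The sketch below copies the directory-to-category mapping from the table and assumes, as an illustration only, that the first directory under dishes/ in `source_path` is the category directory; `category_for` is a hypothetical helper, not a function from the project.

```python
from collections import Counter
from pathlib import PurePosixPath

# Directory-to-category mapping, copied from the table above.
CATEGORY_MAP = {
    "aquatic": "水产",
    "breakfast": "早餐",
    "condiment": "调料",
    "dessert": "甜品",
    "drink": "饮品",
    "meat_dish": "荤菜",
    "semi-finished": "半成品加工",
    "soup": "汤羹",
    "staple": "主食",
    "vegetable_dish": "素菜",
}


def category_for(source_path: str) -> str:
    """Map 'dishes/<directory>/...' to its Chinese category name.

    Assumption: the component right after 'dishes' is the category
    directory. Unknown directories fall back to the directory name.
    """
    directory = PurePosixPath(source_path).parts[1]
    return CATEGORY_MAP.get(directory, directory)


# Made-up records, just to show the counting step.
sample = [
    {"name": "A", "source_path": "dishes/soup/a.md"},
    {"name": "B", "source_path": "dishes/soup/b.md"},
    {"name": "C", "source_path": "dishes/drink/c.md"},
]
counts = Counter(category_for(r["source_path"]) for r in sample)
print(counts)
```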
The parser is optimized to support multiple list marker formats:
# Recipe Name
Brief description text.
Estimated cooking difficulty: ★★★
## Essential Ingredients and Tools
- Ingredient1 # Dash format
* Ingredient2 # Asterisk format
+ Ingredient3 # Plus format
## Calculation
Determine how many servings to make before cooking. One serving feeds 2 people.
Total amount:
- Ingredient1: 100g * servings
* Ingredient2: 50ml * servings
+ Ingredient3: 2 pieces * servings
## Instructions
- Step 1 description # Dash format
* Step 2 description # Asterisk format
+ Step 3 description # Plus format
1. Step 4 description # Numbered format

The parser includes comprehensive error handling mechanisms:
- Outputs detailed debugging information on parsing failures
- Skips invalid or corrupted Markdown files
- Reports parsing success statistics
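The three behaviors listed above (report, skip, summarize) can be sketched as a single loop. This is an illustrative outline rather than the project's code: `parse_all` is a hypothetical name, and `parse_one` stands in for the real parser, which is not reproduced here.

```python
from pathlib import Path


def parse_all(paths, parse_one):
    """Parse each file, skipping failures and collecting statistics.

    `parse_one` is any callable that takes Markdown text and returns a
    dict; it is a stand-in for the project's actual parser.
    """
    recipes, failures = [], []
    for path in paths:
        try:
            text = Path(path).read_text(encoding="utf-8")
            recipes.append(parse_one(text))
        except (OSError, UnicodeDecodeError, ValueError) as exc:
            # Report the problem and move on instead of aborting the run.
            print(f"WARNING: skipping {path}: {exc}")
            failures.append(str(path))
    total = len(recipes) + len(failures)
    rate = 100.0 * len(recipes) / total if total else 0.0
    print(f"Parsed {len(recipes)}/{total} files ({rate:.1f}%)")
    return recipes, failures
```

Keeping the list of failed paths, rather than only a count, makes it easier to find the file that needs fixing.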
# No additional dependencies required, Python standard library only
cd HowToCook-master
python scripts/generate_recipes.py
python scripts/test_compatibility.py

The script outputs:
- Total number of processed recipes (324)
- Recipe count statistics by category
- Warning messages during parsing
- Parsing success rate (100%)
- Encoding Issues: Ensure all MD files use UTF-8 encoding
- File Names: Avoid special characters in file names
- Format Consistency: Maintain Markdown format consistency for proper parsing
- Auto-commit: Modifying files in dishes directory triggers automatic JSON regeneration
- List Formats: Supports `-`, `*`, and `+` list markers; a unified format is not required
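For the encoding note above, a quick way to find offending files is to try decoding each one as UTF-8. The helper below is a small sketch; `find_non_utf8` is a hypothetical name and is not part of the project's scripts.

```python
from pathlib import Path


def find_non_utf8(root):
    """Return Markdown files under `root` that do not decode as UTF-8."""
    bad = []
    for path in sorted(Path(root).rglob("*.md")):
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path)
    return bad
```

Calling `find_non_utf8("dishes")` from the repository root would list any files worth re-saving before the parser runs.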
Generated JSON format is fully compatible with existing cooking module code:
- ✅ Supports category search (`recipe.get("category")`)
- ✅ Supports name search (`recipe.get("name")`)
- ✅ Supports ingredient search (`ingredient.get("name")`)
- ✅ Supports tag search (`recipe.get("tags")`)
- ✅ Supports difficulty display (`recipe.get("difficulty")`)
- ✅ Supports serving information (`recipe.get("servings")`)
- ✅ Supports detailed ingredient and step display
- ✅ Supports random recommendation feature
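The lookups listed above map naturally onto small filter functions over the generated records. The following is a sketch of how a consumer might use the fields, not the cooking module's actual code; every function name here is hypothetical.

```python
import random


def search_by_category(recipes, category):
    """Recipes whose 'category' field equals the given category."""
    return [r for r in recipes if r.get("category") == category]


def search_by_name(recipes, keyword):
    """Recipes whose name contains the keyword."""
    return [r for r in recipes if keyword in (r.get("name") or "")]


def search_by_ingredient(recipes, keyword):
    """Recipes with at least one ingredient name containing the keyword."""
    return [
        r for r in recipes
        if any(keyword in (ing.get("name") or "") for ing in r.get("ingredients", []))
    ]


def search_by_tag(recipes, tag):
    """Recipes carrying the given tag."""
    return [r for r in recipes if tag in r.get("tags", [])]


def random_recipe(recipes, rng=random):
    """Pick one recipe at random, or None when the list is empty."""
    return rng.choice(recipes) if recipes else None
```

Using `.get()` with defaults keeps the helpers tolerant of records where an optional field is missing.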
Add new mappings to the CATEGORY_MAP dictionary in scripts/generate_recipes.py:
CATEGORY_MAP = {
'new_category': 'New Category Name',
# ... other categories
}

Main parsing method locations:
- parse_steps(): Step parsing logic, supports multiple list formats
- parse_ingredients(): Ingredient parsing logic
- parse_difficulty(): Difficulty level parsing
- parse_servings(): Serving information parsing
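To make the responsibilities of these methods concrete, here is a minimal sketch of step and difficulty parsing against the template shown earlier. It is an illustrative reimplementation, not the code in scripts/generate_recipes.py; the `_sketch` suffix marks both functions as hypothetical, and the regular expression is an assumption about what counts as a list item.

```python
import re

# A list item starts with -, *, + or a number followed by a dot,
# then whitespace, then the item text.
LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+(.*\S)\s*$")


def parse_steps_sketch(section: str) -> list:
    """Turn list lines (-, *, +, or '1.') into numbered step dicts."""
    steps = []
    for line in section.splitlines():
        match = LIST_ITEM.match(line)
        if match:
            steps.append({"step": len(steps) + 1, "description": match.group(1)})
    return steps


def parse_difficulty_sketch(text: str) -> int:
    """Count the stars on the first line containing one; 0 if none."""
    for line in text.splitlines():
        if "★" in line:
            return line.count("★")
    return 0
```

Requiring whitespace after the marker keeps lines such as `**bold text**` from being mistaken for asterisk list items.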
Key regular expressions in current parser:
# Operations section matching (fixed)
operations_match = re.search(r'## 操作\s*\n(.*?)(?=\n##|\n$)', content, re.DOTALL)
# Support multiple list formats
if (line.startswith('-') or line.startswith('*') or line.startswith('+')) and len(line) > 2:

# Run complete tests
python scripts/test_compatibility.py
# Example output:
# ✅ Parsing success rate: 100.0% (324/324)
# ✅ All recipes have complete step information
# ✅ JSON format validation passed

- Step parsing failure: Check that a supported list format is used (`-`, `*`, `+`, `1.`)
- Incomplete ingredient parsing: Confirm that the "## 必备原料和工具" (Essential Ingredients and Tools) and "## 计算" (Calculation) sections are properly formatted
- Encoding errors: Ensure files are saved with UTF-8 encoding
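When steps come back empty, it can help to run the operations-section pattern quoted earlier against the file's contents in isolation. The sketch below reuses that pattern verbatim; `extract_operations` is a hypothetical wrapper, and the sample text is made up for illustration.

```python
import re

# Pattern copied from the parser excerpt in this document.
OPERATIONS_RE = re.compile(r'## 操作\s*\n(.*?)(?=\n##|\n$)', re.DOTALL)


def extract_operations(content: str):
    """Return the body of the '## 操作' section, or None if not found."""
    match = OPERATIONS_RE.search(content)
    return match.group(1) if match else None


# Made-up sample: an operations section followed by another section.
sample = "# 示例\n\n## 操作\n\n- 洗菜\n* 切菜\n\n## 附加内容\n\n无\n"
print(extract_operations(sample))
```

One behavior follows from the pattern itself: the lookahead needs either a following `##` heading or a final newline, so a file whose last section is "## 操作" and which lacks a trailing newline yields no match. Ending such a file with a newline avoids that case.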
Development Environment: Python 3.x
Dependencies: Python standard library only (re, json, pathlib, os)
Parsing Engine: Custom regular expression parser
Data Format: UTF-8 encoded JSON
Test Coverage: 324 recipes, 10 categories
🤖 This automation system was designed and developed with assistance from Claude Code and has been continuously optimized to reach a 100% parsing success rate.