A refactored product description generator designed to demonstrate how AI-assisted systems can be structured for reliability, maintainability, and operational use.
This project focuses on transforming a working but monolithic LLM script into a modular, testable pipeline with explicit error handling, validation, and reproducible outputs.
The goal is not to build a complex AI system, but to show how AI generation should be engineered when used in real content or product workflows.
Many AI-assisted scripts work in isolation but fail when integrated into real workflows because:
- errors occur silently
- outputs are inconsistent or untraceable
- logic is tightly coupled and difficult to modify
- API usage becomes expensive during iteration
- generated content cannot be audited or validated
This project addresses those problems by refactoring an automated product description generator into a structured pipeline that:
- separates responsibilities across modules
- fails explicitly and shows where errors occur
- validates both inputs and outputs
- avoids unnecessary regeneration through caching
- provides clear execution feedback through run summaries
The result is a small but production-minded example of an AI content generation workflow.
The pipeline performs the following steps:
- Loads structured product data from JSON
- Validates product schema using Pydantic
- Builds structured prompts for the LLM
- Generates product descriptions via API
- Applies quality checks to generated output
- Caches results to avoid repeated API calls
- Produces structured output with audit metadata
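The load-and-validate steps can be sketched as follows. The field names (`product_id`, `name`, `features`) are illustrative assumptions, not the project's actual schema, which lives in `src/app/models.py`:

```python
import json
from pathlib import Path

from pydantic import BaseModel, ValidationError

class Product(BaseModel):
    # Hypothetical fields; the real schema lives in src/app/models.py
    product_id: str
    name: str
    features: list[str] = []

def load_products(path: str) -> list[Product]:
    """Load product records from JSON and validate them, failing loudly."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    products = []
    for i, item in enumerate(raw):
        try:
            products.append(Product(**item))
        except ValidationError as exc:
            # Re-raise with the record's position so the failure is traceable
            raise ValueError(f"Product at index {i} failed validation: {exc}") from exc
    return products
```

Validating at the boundary means every record downstream of this function is known-good, so the prompt builder and quality checks never have to defend against malformed input.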
Each generated result includes:
- product metadata
- generated description
- quality validation status
- prompt version used
- cache status (API vs cached result)
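A result record carrying that audit metadata could be assembled as a plain dict (all key names here are illustrative, not the project's exact output format):

```python
from datetime import datetime, timezone

def build_result(product_id: str, name: str, description: str,
                 quality_passed: bool, prompt_version: str,
                 from_cache: bool) -> dict:
    """Bundle a generated description with the metadata needed to audit it."""
    return {
        "product_id": product_id,
        "name": name,
        "description": description,
        "quality_passed": quality_passed,
        "prompt_version": prompt_version,
        # Record whether this came from a live API call or the cache
        "source": "cache" if from_cache else "api",
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
```

Keeping the prompt version and source alongside each description is what makes later review possible: any output can be traced back to the prompt that produced it.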
The focus of this refactoring exercise is reliability rather than model capability.
In real marketing or product environments:
- content generation is iterative
- prompts evolve over time
- outputs must be traceable
- failures must be diagnosable
- regeneration must be controlled to manage cost
This project demonstrates how to apply software engineering principles to AI-assisted workflows so they can be maintained and extended safely. The refactored structure provides:
- Clear separation of concerns
- Easier testing and debugging
- Explicit error reporting (WHAT / WHERE / WHY)
- Reusable helper functions
- Deterministic caching reduces API cost
- Generated content can be quality-checked automatically
- Outputs remain reproducible across iterations
- Prompt changes can be versioned safely
- Items requiring manual review are identifiable
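Deterministic caching can be implemented by keying on a hash of the product data plus the prompt version, so an unchanged input never triggers a new API call, while a prompt bump invalidates the cache. This is a sketch under those assumptions, not the project's exact cache layer:

```python
import hashlib
import json

def cache_key(product: dict, prompt_version: str) -> str:
    """Stable key: identical product + prompt version always hash the same."""
    payload = json.dumps(product, sort_keys=True) + prompt_version
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def get_or_generate(product: dict, prompt_version: str,
                    cache: dict, generate) -> tuple[str, bool]:
    """Return (description, from_cache); call `generate` only on a cache miss."""
    key = cache_key(product, prompt_version)
    if key in cache:
        return cache[key], True
    description = generate(product)
    cache[key] = description
    return description, False
```

Because `json.dumps(..., sort_keys=True)` produces a canonical serialization, re-running the pipeline on unchanged data costs zero API calls, which is what keeps iteration cheap.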
---
## Project Structure

```
src/app/
├── client.py    # LLM API wrapper
├── io.py        # File loading and saving
├── models.py    # Product schema validation
├── prompts.py   # Prompt construction
├── quality.py   # Output quality checks
└── pipeline.py  # Orchestration logic
```
The main pipeline coordinates modules without embedding business logic directly in the entry point.
---
## Tech Stack
- Python 3.10+
- OpenAI API
- Pydantic (data validation)
- JSON-based input/output
- Modular Python architecture
Frameworks are intentionally avoided to keep the system transparent and easy to inspect.
---
## How to Run
### 1. Install dependencies
```bash
pip install -r requirements.txt
```

### 2. Create a `.env` file

```
OPENAI_API_KEY=your_api_key_here
```

### 3. Run the pipeline

```bash
python main.py
```
Example output:

```
Validated: 2/2 products
OK (CACHE): P001 - Wireless Bluetooth Headphones
OK (CACHE): P002 - Laptop Stand

Run summary
- total_products: 2
- generated_via_api: 0
- loaded_from_cache: 2
- quality_passed: 2
- quality_failed: 0
- errors: 0
```
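The run summary can be accumulated with a simple counter as results flow through the pipeline. This sketch assumes result dicts with `from_cache` and `quality_passed` flags; the key names mirror the example output above:

```python
from collections import Counter

def summarize(results: list[dict]) -> Counter:
    """Tally pipeline outcomes for the end-of-run report."""
    summary = Counter(total_products=len(results))
    for r in results:
        # Each result counts toward exactly one source and one quality bucket
        summary["loaded_from_cache" if r["from_cache"] else "generated_via_api"] += 1
        summary["quality_passed" if r["quality_passed"] else "quality_failed"] += 1
    return summary
```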
---
## Error Handling

The project includes test inputs to demonstrate explicit error handling:

```bash
INPUT_JSON=data/does_not_exist.json python main.py
INPUT_JSON=data/malformed.json python main.py
INPUT_JSON=data/invalid_products.json python main.py
```

These cases demonstrate:

- `FileNotFoundError` reporting
- JSON parsing errors with location details
- Schema validation errors with invalid fields listed
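Explicit WHAT / WHERE / WHY reporting can be achieved by catching narrow exceptions at the load boundary and re-raising with context. The message format below is a hypothetical sketch, not the project's exact output:

```python
import json
from pathlib import Path

def load_json_or_explain(path: str) -> object:
    """Fail loudly: report WHAT failed, WHERE, and WHY instead of crashing silently."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        raise SystemExit(
            f"WHAT: input file missing\n"
            f"WHERE: {path}\n"
            f"WHY: path does not exist or is not readable"
        )
    except json.JSONDecodeError as exc:
        # JSONDecodeError carries line/column info, so the error is locatable
        raise SystemExit(
            f"WHAT: invalid JSON\n"
            f"WHERE: {path}, line {exc.lineno}, column {exc.colno}\n"
            f"WHY: {exc.msg}"
        )
```

Catching only the exceptions the function can meaningfully explain, and letting anything else propagate, keeps unexpected failures visible rather than swallowed.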
---
## Design Principles Demonstrated

- Do not fail silently
- Separate generation from validation
- Keep prompts versioned and traceable
- Make AI outputs auditable
- Prefer simple, modular components over complex abstractions
---
## Scope

This project is intentionally limited in scope. It is not intended to be a full production system, but a clear example of how AI-assisted generation can be refactored into a maintainable and testable structure.
---

**April Atkinson**
*AI Go-To-Market & Commercial Systems Consultant*