An intelligent image captioning system powered by BLIP and FastAPI that automatically generates:
- Alt-text for accessibility and SEO
- Social media captions optimized for different platforms
- SEO metadata for improved search visibility
- Batch processing for large inventories (up to 50 images at once)
This is a free, open-source service with usage limits to prevent abuse:
- 10 requests/minute • 100 requests/hour • 500 requests/day
- 1,000 images/day per IP • 50 images max per batch
- See RATE_LIMITS.md for complete details
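To stay under the 50-image batch cap, a client can split a larger inventory into batches before uploading. The following is a minimal sketch that assumes the inventory is a plain list of file paths; `chunk_images` is a hypothetical helper and not part of this project.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 50  # per-batch cap taken from the limits listed above


def chunk_images(paths: Sequence[str], batch_size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(paths), batch_size):
        yield list(paths[start:start + batch_size])


if __name__ == "__main__":
    # Hypothetical inventory of 120 file names, split into batches of 50, 50, and 20.
    inventory = [f"img_{i:03d}.jpg" for i in range(120)]
    print([len(batch) for batch in chunk_images(inventory)])
```

Each batch upload still counts as a request, so a long inventory also needs pacing against the per-minute and per-day limits.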
- BLIP (Bootstrapping Language-Image Pre-training) for advanced image analysis
- FastAPI for high-performance parallel processing with rate limiting
- Next.js frontend with modern UI and bulk upload
- Batch processing system for large-scale inventories
- IP-based rate limiting to ensure fair usage
Automatic generation of descriptive, keyword-rich alt-text for product images, improving both accessibility and SEO.
Example: "Red A-line cotton dress with floral pattern"
Key Features:
- Multiple length variations (short, medium, standard)
- SEO score calculation
- Keyword integration
- Accessibility-focused descriptions
Engaging, platform-optimized captions with relevant hashtags. According to research, 71% of online shoppers consider product images essential in purchasing decisions.
Supported Platforms:
- Instagram (up to 2,200 characters)
- Twitter (up to 280 characters)
- Facebook (up to 63,206 characters)
- LinkedIn (up to 3,000 characters)
- Pinterest (up to 500 characters)
Key Features:
- Platform-specific optimization
- Intelligent hashtag generation
- Call-to-action integration
- Engagement score prediction
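The character limits above can be enforced on the client side as well. This is a rough sketch, not the project's own logic: `fit_caption` is a hypothetical helper, and it counts Python string characters, which may differ from how a given platform counts emoji or links.

```python
# Limits copied from the supported-platforms list above.
PLATFORM_LIMITS = {
    "instagram": 2200,
    "twitter": 280,
    "facebook": 63206,
    "linkedin": 3000,
    "pinterest": 500,
}


def fit_caption(caption: str, platform: str) -> str:
    """Trim `caption` to the platform's limit, cutting at a word boundary when possible."""
    limit = PLATFORM_LIMITS[platform.lower()]
    if len(caption) <= limit:
        return caption
    # Reserve one character for the ellipsis, then back up to the last space.
    cut = caption[: limit - 1]
    if " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip() + "…"
```

Cutting at the last space avoids ending a caption mid-word; text with no spaces falls back to a hard cut.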
Keyword integration for improved search visibility and organic traffic generation through intelligent content optimization.
Key Features:
- SEO-friendly filename generation
- Title tag optimization
- Meta description generation
- Open Graph tags for social sharing
- Schema.org markup
- Keyword analysis and recommendations
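SEO-friendly filenames are usually lowercase words joined by hyphens. The sketch below shows one way to build such a name from product details; `seo_filename` is a hypothetical function, and the part ordering is an assumption rather than the project's actual rule.

```python
import re
import unicodedata


def seo_filename(*parts: str, extension: str = "jpg") -> str:
    """Join descriptive parts into a lowercase, hyphen-separated filename."""
    text = " ".join(p for p in parts if p)
    # Fold accented characters to ASCII and drop anything that cannot be folded.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Remove apostrophes so "Women's" becomes "womens" instead of "women-s".
    text = text.lower().replace("'", "")
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    return f"{slug}.{extension.lstrip('.')}"
```

For example, `seo_filename("Women's Fashion", "Cotton Dress", "Red")` returns `womens-fashion-cotton-dress-red.jpg`.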
Process up to 50 images simultaneously with real-time progress tracking and export capabilities.
Key Features:
- Parallel processing with configurable concurrency (default: 5 concurrent tasks)
- Real-time progress updates with polling
- CSV and JSON export options
- Success/failure tracking per image
- Estimated completion time
- Automatic retry logic
- Rate limit compliance (max 50 images per batch)
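Bounded parallelism with per-image success tracking and retries can be sketched with an `asyncio.Semaphore`. This is an illustration under assumptions, not the contents of `batch_processor.py`: `process_batch`, the `worker` callable, the retry count, and the backoff delay are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def process_batch(
    paths: List[str],
    worker: Callable[[str], Awaitable[dict]],
    concurrency: int = 5,
    retries: int = 2,
) -> Dict[str, dict]:
    """Run `worker` over `paths`, with at most `concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def run_one(path: str) -> dict:
        async with semaphore:
            for attempt in range(retries + 1):
                try:
                    return {"success": True, "result": await worker(path)}
                except Exception as exc:
                    if attempt == retries:
                        # Out of attempts: record the failure instead of raising.
                        return {"success": False, "error": str(exc)}
                    # Short linear backoff before the next attempt.
                    await asyncio.sleep(0.1 * (attempt + 1))

    outcomes = await asyncio.gather(*(run_one(p) for p in paths))
    return dict(zip(paths, outcomes))
```

Catching the exception inside `run_one` means one unreadable image is reported as a failure without cancelling the rest of the batch.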
- Python 3.8 or higher
- pip package manager
- Clone the repository:
git clone https://github.com/maidulcu/dynamicweblab-ai-image-caption.git
cd dynamicweblab-ai-image-caption
- Create a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Configure environment variables:
cp .env.example .env
# Edit .env with your settings
- Run the backend:
python main.py
The API will be available at http://localhost:8000
- Run the Next.js frontend (optional):
cd frontend
npm install
npm run dev
The frontend will be available at http://localhost:3000
Legacy HTML Interface (http://localhost:8000):
- Upload an image
- Enter product details (optional)
- Add SEO keywords (optional)
- Generate complete package with one click
Next.js Frontend (http://localhost:3000):
- Home Page: Feature overview and navigation
- Single Upload (/single): Process individual images with full product details
- Batch Upload (/batch): Upload up to 50 images per batch with drag-and-drop, real-time progress tracking, and CSV/JSON export
POST /api/v1/generate/complete
Parameters:
- image (file): Image file
- keywords (string, optional): Comma-separated keywords
- product_name (string, optional): Product name
- product_category (string, optional): Product category
- product_brand (string, optional): Brand name
- product_material (string, optional): Material type
- platforms (string, optional): Comma-separated platforms (default: instagram,facebook,twitter)
Example:
curl -X POST "http://localhost:8000/api/v1/generate/complete" \
-F "image=@product.jpg" \
-F "product_name=Cotton Dress" \
-F "product_category=Women's Fashion" \
-F "keywords=dress,fashion,summer"
Response:
{
  "success": true,
  "alt_text": {
    "standard": "Red A-line cotton dress with floral pattern",
    "short": "Red cotton dress",
    "medium": "Red A-line cotton dress",
    "seo_score": 85
  },
  "social_captions": {
    "instagram": {
      "caption": "Obsessed with this red cotton dress! 💕\n\nShop now! Link in bio\n\n#fashion #dress #summerstyle",
      "engagement_score": 75,
      "hashtag_count": 3
    }
  },
  "seo_metadata": {
    "filename": "womens-fashion-cotton-dress-red.jpg",
    "title": "Cotton Dress - Women's Fashion",
    "meta_description": "Red A-line cotton dress with floral pattern. Made from Cotton. Shop now.",
    "seo_score": 90
  }
}
POST /api/v1/generate/alt-text
POST /api/v1/generate/social-caption
Additional Parameters:
- platform (required): instagram, twitter, facebook, linkedin, or pinterest
- brand_voice (optional): casual_engaging, concise_witty, professional
- custom_message (optional): Custom message to include
POST /api/v1/generate/seo-metadata
POST /api/v1/analyze
Returns base image analysis including detected colors, elements, and dimensions.
GET /api/v1/platforms
Returns the list of supported social media platforms.
GET /api/v1/health
POST /api/v1/batch/upload
Parameters:
- images (files): List of image files (max 50 per batch)
- keywords (string, optional): Comma-separated keywords
- platforms (string): Comma-separated platforms (default: instagram,facebook,twitter)
Response:
{
  "success": true,
  "batch_id": "uuid-here",
  "total_images": 25,
  "status": "processing"
}
GET /api/v1/batch/status/{batch_id}
Response:
{
  "batch_id": "uuid",
  "status": "processing",
  "progress": {
    "total": 25,
    "processed": 15,
    "successful": 14,
    "failed": 1,
    "percentage": 60,
    "estimated_completion": "30s"
  }
}
GET /api/v1/batch/results/{batch_id}
Returns complete results for all processed images.
GET /api/v1/batch/export/{batch_id}?format=csv
Downloads the results as a CSV or JSON file.
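A client typically polls the batch status endpoint until processing ends, then fetches or exports the results. The sketch below keeps the HTTP call out of the loop: `fetch_status` is a hypothetical callable that returns the parsed JSON from `GET /api/v1/batch/status/{batch_id}`. Only the `"processing"` status is documented here, so treating any other value as final is an assumption.

```python
import time
from typing import Callable


def wait_for_batch(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `fetch_status` until the batch leaves the "processing" state."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        # Assumption: every status other than "processing" is terminal.
        if status.get("status") != "processing":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("batch did not finish before the timeout")
        sleep(interval)
```

Passing `sleep` as a parameter makes the loop easy to exercise in tests without real delays. Each poll counts as a request, so the interval should be chosen with the per-minute limit in mind.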
GET /api/v1/rate-limit/status
Returns your current rate limit quota and usage.
This is a free service with IP-based rate limiting:
| Limit Type | Free Tier |
|---|---|
| Requests per minute | 10 |
| Requests per hour | 100 |
| Requests per day | 500 |
| Images per day | 1,000 |
| Max batch size | 50 |
Rate limit headers are included in all responses:
X-RateLimit-Remaining-Minute: 8
X-RateLimit-Remaining-Hour: 95
X-RateLimit-Remaining-Day: 487
X-RateLimit-Images-Remaining: 950
See RATE_LIMITS.md for complete documentation on rate limits, best practices, and handling rate limit errors.
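The rate limit headers can be read before sending the next request. The following sketch parses the four headers shown above into integers; `remaining_quota` and `can_send` are hypothetical helpers, and treating a missing header as "no restriction" is an assumption.

```python
from typing import Dict, Mapping

# Header names as shown in the example above, mapped to short keys.
HEADER_KEYS = {
    "X-RateLimit-Remaining-Minute": "minute",
    "X-RateLimit-Remaining-Hour": "hour",
    "X-RateLimit-Remaining-Day": "day",
    "X-RateLimit-Images-Remaining": "images",
}


def remaining_quota(headers: Mapping[str, str]) -> Dict[str, int]:
    """Extract the remaining-quota counters, ignoring missing or malformed values."""
    lowered = {name.lower(): value for name, value in headers.items()}
    quota = {}
    for header, key in HEADER_KEYS.items():
        value = lowered.get(header.lower())
        if value is not None and value.strip().isdigit():
            quota[key] = int(value)
    return quota


def can_send(headers: Mapping[str, str], images: int = 1) -> bool:
    """Return True if another request carrying `images` images fits the reported quota."""
    quota = remaining_quota(headers)
    request_counters = [quota[k] for k in ("minute", "hour", "day") if k in quota]
    if any(count <= 0 for count in request_counters):
        return False
    return quota.get("images", images) >= images
```

Header names are matched case-insensitively because HTTP clients differ in how they normalize them.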
Interactive API documentation is available at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Backend: FastAPI (Python) with async support
- AI/ML:
- HuggingFace Transformers
- BLIP (Bootstrapping Language-Image Pre-training)
- PyTorch (with GPU acceleration)
- Image Processing: PIL, OpenCV
- Frontend:
- Next.js 14 (App Router) with TypeScript
- Tailwind CSS for styling
- react-dropzone for drag-and-drop
- Legacy HTML/CSS/JavaScript interface
- Batch Processing: Asyncio with semaphore-based concurrency control
dynamicweblab-ai-image-caption/
├── core/
│   ├── __init__.py
│   ├── image_analyzer.py            # Core image analysis engine (BLIP)
│   ├── alt_text_generator.py        # Alt-text generation
│   ├── social_caption_generator.py  # Social media captions
│   ├── seo_optimizer.py             # SEO metadata optimization
│   └── batch_processor.py           # Batch processing with progress tracking
├── api/
│   ├── __init__.py
│   └── routes.py                    # API endpoints (single + batch)
├── frontend/                        # Next.js frontend
│   ├── app/
│   │   ├── page.tsx                 # Home page
│   │   ├── single/
│   │   │   └── page.tsx             # Single image upload
│   │   └── batch/
│   │       └── page.tsx             # Bulk upload with progress
│   ├── package.json
│   └── README.md
├── static/
│   └── index.html                   # Legacy web interface
├── uploads/                         # Temporary upload directory
├── batch_results/                   # Batch export directory
├── config.py                        # Configuration management
├── main.py                          # FastAPI application
├── requirements.txt                 # Python dependencies
├── .env.example                     # Environment variables template
├── Dockerfile                       # Docker configuration
├── docker-compose.yml               # Docker Compose setup
└── README.md                        # This file
Edit .env file to customize settings:
# API Configuration
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=True
# Model Configuration
CAPTION_MODEL=Salesforce/blip-image-captioning-large
USE_CLAUDE_API=False
# SEO Configuration
DEFAULT_KEYWORDS=product,shop,buy,online,quality
MAX_ALT_TEXT_LENGTH=125
# Social Media Configuration
INSTAGRAM_MAX_LENGTH=2200
TWITTER_MAX_LENGTH=280
FACEBOOK_MAX_LENGTH=63206
- Processing Time: 2-5 seconds per image (depending on hardware)
- GPU Acceleration: Automatically uses CUDA if available
- Batch Processing:
- Up to 50 images per batch (free tier limit)
- 5 concurrent processing tasks (configurable)
- Real-time progress tracking every 2 seconds
- Estimated completion time calculation
- ~2-5 minutes for 50 images
- Rate Limiting: IP-based with minimal overhead
- Concurrent Requests: FastAPI async support
- Scalability: Horizontal scaling with load balancers
- Keep alt-text between 50 and 125 characters for optimal SEO
- Include primary keyword naturally
- Describe the image accurately
- Avoid "image of" or "picture of" prefixes
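The alt-text guidelines above are simple enough to check automatically. This is a minimal sketch of such a check, assuming the 50 to 125 character range and a single primary keyword; `check_alt_text` is a hypothetical helper, and "describes the image accurately" still needs a human reviewer.

```python
import re
from typing import List

# Matches leading "image of", "an image of", "picture of", "a picture of".
_REDUNDANT_PREFIX = re.compile(r"^\s*(an?\s+)?(image|picture)\s+of\b", re.IGNORECASE)


def check_alt_text(alt_text: str, keyword: str = "", min_len: int = 50, max_len: int = 125) -> List[str]:
    """Return a list of guideline violations; an empty list means the text passes."""
    problems = []
    length = len(alt_text.strip())
    if not min_len <= length <= max_len:
        problems.append(f"length {length} is outside {min_len}-{max_len} characters")
    if _REDUNDANT_PREFIX.match(alt_text):
        problems.append('starts with a redundant "image of" style prefix')
    if keyword and keyword.lower() not in alt_text.lower():
        problems.append(f'primary keyword "{keyword}" is missing')
    return problems
```

Returning every violation at once, rather than stopping at the first, lets a bulk review list all fixes needed for each image.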
- Use platform-specific optimal lengths
- Include 1-3 relevant hashtags (varies by platform)
- Add clear call-to-action
- Match your brand voice
- Use descriptive filenames with hyphens
- Include target keywords naturally
- Write compelling meta descriptions
- Implement Schema.org markup
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
This project is licensed under the MIT License.
For issues and questions:
- GitHub Issues: Create an issue
- Email: support@dynamicweblab.com
- BLIP model by Salesforce Research
- HuggingFace Transformers library
- FastAPI framework
- Batch processing API
- Multiple language support
- Custom model training
- Integration with e-commerce platforms
- Advanced analytics dashboard
- A/B testing for captions
- Brand voice customization
- Image editing suggestions
Built with ❤️ by DynamicWebLab