AI Image Caption Generator

An intelligent image captioning system powered by BLIP and FastAPI that automatically generates:

Alt-text for accessibility and SEO
Social media captions optimized for different platforms
SEO metadata for improved search visibility
Batch processing for large inventories (up to 50 images at once)

🆓 Free Service with Rate Limits

This is a free, open-source service with usage limits to prevent abuse:

10 requests/minute • 100 requests/hour • 500 requests/day
1,000 images/day per IP • 50 images max per batch
See RATE_LIMITS.md for complete details

Key Technologies

BLIP (Bootstrapping Language-Image Pre-training) for advanced image analysis
FastAPI for high-performance parallel processing with rate limiting
Next.js frontend with modern UI and bulk upload
Batch processing system for large-scale inventories
IP-based rate limiting to ensure fair usage

Features

1. Alt-Text Generation

Automatic generation of descriptive, keyword-rich alt-text for product images, improving both accessibility and SEO.

Example: "Red A-line cotton dress with floral pattern"

Key Features:

Multiple length variations (short, medium, standard)
SEO score calculation
Keyword integration
Accessibility-focused descriptions

2. Social Media Captions

Engaging, platform-optimized captions with relevant hashtags. According to research, 71% of online shoppers consider product images essential in purchasing decisions.

Supported Platforms:

Instagram (up to 2,200 characters)
Twitter (up to 280 characters)
Facebook (up to 63,206 characters)
LinkedIn (up to 3,000 characters)
Pinterest (up to 500 characters)

Key Features:

Platform-specific optimization
Intelligent hashtag generation
Call-to-action integration
Engagement score prediction
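
The character limits listed under Supported Platforms can be enforced on the client before a caption is posted. The following is a minimal sketch, not the project's own code: the `fit_caption` helper and the `PLATFORM_LIMITS` table are hypothetical names, and only the numeric limits come from this README.

```python
# Character limits as listed in this README; the helper itself is illustrative.
PLATFORM_LIMITS = {
    "instagram": 2200,
    "twitter": 280,
    "facebook": 63206,
    "linkedin": 3000,
    "pinterest": 500,
}


def fit_caption(caption: str, platform: str) -> str:
    """Trim a caption to the platform's limit, ending with an ellipsis if cut."""
    limit = PLATFORM_LIMITS[platform.lower()]
    if len(caption) <= limit:
        return caption
    # Reserve one character for the ellipsis and drop any trailing space.
    return caption[: limit - 1].rstrip() + "…"
```

An unknown platform name raises `KeyError`, which keeps typos from silently bypassing the limit.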

3. SEO Optimization

Generates keyword-aware metadata (filenames, titles, descriptions, and structured data) to improve search visibility and organic traffic.

Key Features:

SEO-friendly filename generation
Title tag optimization
Meta description generation
Open Graph tags for social sharing
Schema.org markup
Keyword analysis and recommendations
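
As an illustration of SEO-friendly filename generation, the sketch below joins descriptive parts into a lowercase, hyphen-separated name. It is an assumption about how such a step could look, not the logic in seo_optimizer.py; the `seo_filename` function is hypothetical.

```python
import re


def seo_filename(*parts: str, extension: str = "jpg") -> str:
    """Join descriptive parts into a lowercase, hyphen-separated filename."""
    slugs = []
    for part in parts:
        text = part.lower().replace("'", "")  # "Women's" -> "womens"
        # Collapse every run of non-alphanumeric characters into one hyphen.
        text = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
        if text:
            slugs.append(text)
    return "-".join(slugs) + "." + extension


name = seo_filename("Women's Fashion", "Cotton Dress", "Red")
```

With category, product name, and color as inputs, `name` is `womens-fashion-cotton-dress-red.jpg`.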

4. Batch Processing

Process up to 50 images simultaneously with real-time progress tracking and export capabilities.

Key Features:

Parallel processing with configurable concurrency (default: 5 concurrent tasks)
Real-time progress updates with polling
CSV and JSON export options
Success/failure tracking per image
Estimated completion time
Automatic retry logic
Rate limit compliance (max 50 images per batch)
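
Semaphore-based concurrency with per-image success/failure tracking can be sketched as follows. This is a simplified illustration, not the contents of batch_processor.py: `process_batch` and `fake_caption` are hypothetical names, and the worker is a stand-in for the real captioning call.

```python
import asyncio


async def process_batch(items, worker, max_concurrency=5):
    """Run worker(item) for every item, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with semaphore:
            try:
                return {"item": item, "ok": True, "result": await worker(item)}
            except Exception as exc:  # record the failure, keep the batch going
                return {"item": item, "ok": False, "error": str(exc)}

    # gather returns results in the same order as the input items.
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_caption(name):
    """Stand-in for the real captioning call (hypothetical)."""
    await asyncio.sleep(0.01)
    if name.endswith(".txt"):
        raise ValueError("not an image")
    return f"caption for {name}"


results = asyncio.run(process_batch(["a.jpg", "b.txt", "c.png"], fake_caption))
```

Catching the exception inside the guarded task means one bad file is reported as a failure instead of cancelling the rest of the batch.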

Installation

Prerequisites

Python 3.8 or higher
pip package manager
Node.js and npm (only for the optional Next.js frontend)

Setup

Clone the repository:

git clone https://github.com/maidulcu/dynamicweblab-ai-image-caption.git
cd dynamicweblab-ai-image-caption

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Configure environment variables:

cp .env.example .env
# Edit .env with your settings

Run the backend:

python main.py

The API will be available at http://localhost:8000

Run the Next.js frontend (optional):

cd frontend
npm install
npm run dev

The frontend will be available at http://localhost:3000

Usage

Web Interfaces

Legacy HTML Interface (http://localhost:8000):

Upload an image
Enter product details (optional)
Add SEO keywords (optional)
Generate complete package with one click

Next.js Frontend (http://localhost:3000):

Home Page: Feature overview and navigation
Single Upload (/single): Process individual images with full product details
Batch Upload (/batch): Upload up to 50 images with drag-and-drop, real-time progress tracking, and CSV/JSON export

API Endpoints

1. Generate Complete Package

POST /api/v1/generate/complete

Parameters:

image (file): Image file
keywords (string, optional): Comma-separated keywords
product_name (string, optional): Product name
product_category (string, optional): Product category
product_brand (string, optional): Brand name
product_material (string, optional): Material type
platforms (string, optional): Comma-separated platforms (default: instagram,facebook,twitter)

Example:

curl -X POST "http://localhost:8000/api/v1/generate/complete" \
  -F "image=@product.jpg" \
  -F "product_name=Cotton Dress" \
  -F "product_category=Women's Fashion" \
  -F "keywords=dress,fashion,summer"

Response:

{
  "success": true,
  "alt_text": {
    "standard": "Red A-line cotton dress with floral pattern",
    "short": "Red cotton dress",
    "medium": "Red A-line cotton dress",
    "seo_score": 85
  },
  "social_captions": {
    "instagram": {
      "caption": "Obsessed with this red cotton dress! 💕\n\nShop now! Link in bio\n\n#fashion #dress #summerstyle",
      "engagement_score": 75,
      "hashtag_count": 3
    }
  },
  "seo_metadata": {
    "filename": "womens-fashion-cotton-dress-red.jpg",
    "title": "Cotton Dress - Women's Fashion",
    "meta_description": "Red A-line cotton dress with floral pattern. Made from Cotton. Shop now.",
    "seo_score": 90
  }
}
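
One way to consume this response is to turn it into an HTML image tag. The sketch below works on a trimmed copy of the sample response; the `img_tag` helper and the `/images/` base path are assumptions made for illustration.

```python
import html
import json

# Trimmed copy of the sample response shown above.
sample = json.loads("""
{
  "success": true,
  "alt_text": {"standard": "Red A-line cotton dress with floral pattern",
               "short": "Red cotton dress"},
  "seo_metadata": {"filename": "womens-fashion-cotton-dress-red.jpg"}
}
""")


def img_tag(response: dict, base_url: str = "/images/") -> str:
    """Build an <img> tag from the alt_text and seo_metadata fields."""
    if not response.get("success"):
        raise ValueError("generation failed")
    src = base_url + response["seo_metadata"]["filename"]
    alt = response["alt_text"]["standard"]
    # Escape both attributes so quotes in generated text cannot break the markup.
    return f'<img src="{html.escape(src)}" alt="{html.escape(alt)}">'


tag = img_tag(sample)
```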

2. Generate Alt-Text Only

POST /api/v1/generate/alt-text

3. Generate Social Caption

POST /api/v1/generate/social-caption

Additional Parameters:

platform (required): instagram, twitter, facebook, linkedin, or pinterest
brand_voice (optional): casual_engaging, concise_witty, professional
custom_message (optional): Custom message to include

4. Generate SEO Metadata

POST /api/v1/generate/seo-metadata

5. Analyze Image

POST /api/v1/analyze

Returns base image analysis including detected colors, elements, and dimensions.

6. List Platforms

GET /api/v1/platforms

Returns list of supported social media platforms.

7. Health Check

GET /api/v1/health

8. Batch Upload (Process Multiple Images)

POST /api/v1/batch/upload

Parameters:

images (files): List of image files (max 50)
keywords (string, optional): Comma-separated keywords
platforms (string, optional): Comma-separated platforms (default: instagram,facebook,twitter)

Response:

{
  "success": true,
  "batch_id": "uuid-here",
  "total_images": 25,
  "status": "processing"
}
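
Inventories larger than the batch limit have to be split across several uploads. A minimal sketch of that split, assuming the 50-image limit stated in this README (`chunk_images` is a hypothetical helper, not part of the API):

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 50  # free-tier batch limit from this README


def chunk_images(paths: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` paths."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


batches = list(chunk_images([f"img_{i}.jpg" for i in range(120)]))
```

Here 120 paths become three batches of 50, 50, and 20 images. Each upload still counts against the per-minute and per-day limits.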

9. Check Batch Status

GET /api/v1/batch/status/{batch_id}

Response:

{
  "batch_id": "uuid",
  "status": "processing",
  "progress": {
    "total": 25,
    "processed": 15,
    "successful": 14,
    "failed": 1,
    "percentage": 60,
    "estimated_completion": "30s"
  }
}
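
A client can poll this endpoint until the batch is no longer processing. The sketch below injects the status call so it runs without a server. `wait_for_batch` is a hypothetical helper, and the "completed" status in the simulated data is an assumption; this README only documents "processing".

```python
import time
from typing import Callable


def wait_for_batch(fetch_status: Callable[[], dict],
                   interval: float = 2.0,
                   timeout: float = 600.0,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the batch leaves the "processing" state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status["status"] != "processing":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("batch did not finish in time")
        sleep(interval)


# Simulated responses; a real client would GET /api/v1/batch/status/{batch_id}.
responses = iter([
    {"status": "processing", "progress": {"percentage": 60}},
    {"status": "completed", "progress": {"percentage": 100}},  # assumed status name
])
final = wait_for_batch(lambda: next(responses), sleep=lambda seconds: None)
```

The 2-second default interval mirrors the progress update interval mentioned under Performance. Every poll is a request, so a longer interval helps stay within the per-minute limit.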

10. Get Batch Results

GET /api/v1/batch/results/{batch_id}

Returns complete results for all processed images.

11. Export Batch Results

GET /api/v1/batch/export/{batch_id}?format=csv

Downloads the results as a CSV or JSON file, depending on the format parameter.

12. Check Rate Limit Status

GET /api/v1/rate-limit/status

Returns your current rate limit quota and usage.

Rate Limits

This is a free service with IP-based rate limiting:

Limit Type	Free Tier
Requests per minute	10
Requests per hour	100
Requests per day	500
Images per day	1,000
Max batch size	50

Rate limit headers are included in all responses:

X-RateLimit-Remaining-Minute: 8
X-RateLimit-Remaining-Hour: 95
X-RateLimit-Remaining-Day: 487
X-RateLimit-Images-Remaining: 950

See RATE_LIMITS.md for complete documentation on rate limits, best practices, and handling rate limit errors.
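
The rate limit headers can be read on the client to decide whether another request is worth sending. A minimal sketch, using the header names shown above; `remaining_quota` and `can_send` are hypothetical helpers, and treating a missing header as "unknown" is an assumption.

```python
from typing import Mapping

HEADER_NAMES = {
    "minute": "X-RateLimit-Remaining-Minute",
    "hour": "X-RateLimit-Remaining-Hour",
    "day": "X-RateLimit-Remaining-Day",
    "images": "X-RateLimit-Images-Remaining",
}


def remaining_quota(headers: Mapping[str, str]) -> dict:
    """Extract the remaining quota counters; missing headers map to None."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    quota = {}
    for key, header in HEADER_NAMES.items():
        raw = lowered.get(header.lower())
        quota[key] = int(raw) if raw is not None else None
    return quota


def can_send(headers: Mapping[str, str], images: int = 1) -> bool:
    """True if every known counter leaves room for one more request."""
    quota = remaining_quota(headers)
    requests_ok = all(quota[k] is None or quota[k] > 0
                      for k in ("minute", "hour", "day"))
    images_ok = quota["images"] is None or quota["images"] >= images
    return requests_ok and images_ok


example = {
    "X-RateLimit-Remaining-Minute": "8",
    "X-RateLimit-Remaining-Hour": "95",
    "X-RateLimit-Remaining-Day": "487",
    "X-RateLimit-Images-Remaining": "950",
}
quota = remaining_quota(example)
```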

API Documentation

Interactive API documentation is available at:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Technology Stack

Backend: FastAPI (Python) with async support
AI/ML:
- HuggingFace Transformers
- BLIP (Bootstrapping Language-Image Pre-training)
- PyTorch (with GPU acceleration)
Image Processing: PIL, OpenCV
Frontend:
- Next.js 14 (App Router) with TypeScript
- Tailwind CSS for styling
- react-dropzone for drag-and-drop
- Legacy HTML/CSS/JavaScript interface
Batch Processing: Asyncio with semaphore-based concurrency control

Project Structure

dynamicweblab-ai-image-caption/
├── core/
│   ├── __init__.py
│   ├── image_analyzer.py          # Core image analysis engine (BLIP)
│   ├── alt_text_generator.py      # Alt-text generation
│   ├── social_caption_generator.py # Social media captions
│   ├── seo_optimizer.py           # SEO metadata optimization
│   └── batch_processor.py         # Batch processing with progress tracking
├── api/
│   ├── __init__.py
│   └── routes.py                  # API endpoints (single + batch)
├── frontend/                      # Next.js frontend
│   ├── app/
│   │   ├── page.tsx              # Home page
│   │   ├── single/
│   │   │   └── page.tsx          # Single image upload
│   │   └── batch/
│   │       └── page.tsx          # Bulk upload with progress
│   ├── package.json
│   └── README.md
├── static/
│   └── index.html                 # Legacy web interface
├── uploads/                       # Temporary upload directory
├── batch_results/                 # Batch export directory
├── config.py                      # Configuration management
├── main.py                        # FastAPI application
├── requirements.txt               # Python dependencies
├── .env.example                   # Environment variables template
├── Dockerfile                     # Docker configuration
├── docker-compose.yml             # Docker Compose setup
└── README.md                      # This file

Configuration

Edit the .env file to customize settings:

# API Configuration
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=True

# Model Configuration
CAPTION_MODEL=Salesforce/blip-image-captioning-large
USE_CLAUDE_API=False

# SEO Configuration
DEFAULT_KEYWORDS=product,shop,buy,online,quality
MAX_ALT_TEXT_LENGTH=125

# Social Media Configuration
INSTAGRAM_MAX_LENGTH=2200
TWITTER_MAX_LENGTH=280
FACEBOOK_MAX_LENGTH=63206
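
For reference, variables like these could be read in Python as sketched below. This is not the project's config.py: `env_bool` and `load_settings` are hypothetical, and the fallback values are assumptions chosen to match the template above, except that DEBUG falls back to off.

```python
import os


def env_bool(name: str, default: bool) -> bool:
    """Interpret common truthy strings; fall back to the default when unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings() -> dict:
    """Collect settings using the variable names from the template above."""
    return {
        "api_host": os.environ.get("API_HOST", "0.0.0.0"),
        "api_port": int(os.environ.get("API_PORT", "8000")),
        "debug": env_bool("DEBUG", False),
        "caption_model": os.environ.get(
            "CAPTION_MODEL", "Salesforce/blip-image-captioning-large"),
        "max_alt_text_length": int(os.environ.get("MAX_ALT_TEXT_LENGTH", "125")),
        "twitter_max_length": int(os.environ.get("TWITTER_MAX_LENGTH", "280")),
    }
```

Parsing the numeric values with `int` makes a malformed entry fail at startup instead of at the first request.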

Performance

Processing Time: 2-5 seconds per image (depending on hardware)
GPU Acceleration: Automatically uses CUDA if available
Batch Processing:
- Up to 50 images per batch (free tier limit)
- 5 concurrent processing tasks (configurable)
- Real-time progress tracking every 2 seconds
- Estimated completion time calculation
- ~2-5 minutes for 50 images
Rate Limiting: IP-based with minimal overhead
Concurrent Requests: FastAPI async support
Scalability: Horizontal scaling with load balancers

Best Practices

Alt-Text

Keep between 50-125 characters for optimal SEO
Include primary keyword naturally
Describe the image accurately
Avoid "image of" or "picture of" prefixes
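
The alt-text guidelines above can be checked mechanically. A minimal sketch, assuming the 50-125 character range and the two discouraged prefixes from this list; `check_alt_text` is a hypothetical helper and does not judge whether the description is accurate.

```python
def check_alt_text(text: str, keyword: str = "") -> list:
    """Return a list of readable issues; an empty list means none were found."""
    issues = []
    stripped = text.strip()
    if not 50 <= len(stripped) <= 125:
        issues.append(f"length {len(stripped)} is outside 50-125 characters")
    if stripped.lower().startswith(("image of", "picture of")):
        issues.append('starts with "image of" or "picture of"')
    if keyword and keyword.lower() not in stripped.lower():
        issues.append(f'primary keyword "{keyword}" is missing')
    return issues


ok = check_alt_text(
    "Red A-line cotton dress with floral pattern on a white background", "dress")
bad = check_alt_text("Image of a dress")
```

Here `ok` is empty, while `bad` reports both the short length and the discouraged prefix.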

Social Media Captions

Use platform-specific optimal lengths
Include 1-3 relevant hashtags (varies by platform)
Add a clear call-to-action
Match your brand voice

SEO Metadata

Use descriptive filenames with hyphens
Include target keywords naturally
Write compelling meta descriptions
Implement Schema.org markup

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Make your changes
Submit a pull request

License

This project is licensed under the MIT License.

Support

For issues and questions:

GitHub Issues: Create an issue
Email: support@dynamicweblab.com

Acknowledgments

BLIP model by Salesforce Research
HuggingFace Transformers library
FastAPI framework

Roadmap

Built with ❤️ by DynamicWebLab
