Your AI-Powered Social Media Caption Assistant
SnapScribe is a powerful AI-based caption generator designed for content creators, influencers, brands, and marketers. Upload an image, take a photo directly from your device camera, or type a prompt, and let SnapScribe craft scroll-stopping, tone-appropriate captions tailored for Instagram, Twitter / X, Facebook, and LinkedIn.
SnapScribe is a personal project built to explore integrating Generative AI into real-world web applications. It combines file handling, camera capture, API rate-limiting, prompt engineering, image processing, and AI model invocation to deliver a practical, production-ready caption-generation tool.
The frontend is built with React and served directly from the Express backend, so no separate static host is needed.
Live App: gc-snapscribe
Source Code: GitHub Repository
- Smart Image Upload: Drag and drop, browse, or take a live photo. Max 5 MB per image.
- In-App Camera: Capture directly from your front or back camera with a live viewfinder. Preview, retake, or accept before use.
- Text Prompt Input: Enter a topic, keyword, or scene description to guide the caption.
- Tone Selection: Choose from `Fun`, `Romantic`, `Aesthetic`, `Sassy`, `Professional`, `Inspirational`, `Witty`, `Chill`, `Luxury`, `Dark Humor`, and `Nostalgic`.
- Platform Awareness: Select your target platform (Instagram, Twitter / X, Facebook, LinkedIn) so the AI adapts the caption's style, length, and tone accordingly.
- Single, High-Quality Captions: One unique, emotionally resonant caption per generation, with no noise and no filler.
- Ready-to-Copy Output: Each caption card shows its platform tag and tone, with a one-click copy button.
- Rate Limiting: Built-in protection against misuse.
- Secure & Fast: Helmet, CORS, and timeout handling included.
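The 5 MB upload cap listed above can be checked on the client before any request is sent. The following is a minimal sketch, not the project's actual code: `validateImage` and `MAX_IMAGE_BYTES` are hypothetical names, and reading "5 MB" as 5 × 1024 × 1024 bytes is an assumption.

```javascript
// Hypothetical helper: checks a file-like object ({ type, size }) against the 5 MB cap.
// Assumption: "5 MB" means 5 * 1024 * 1024 bytes; the project may count differently.
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

function validateImage(file) {
  // Reject anything that does not declare an image MIME type.
  if (!file || typeof file.type !== "string" || !file.type.startsWith("image/")) {
    return { ok: false, reason: "Only image files are supported." };
  }
  // Reject files larger than the cap.
  if (file.size > MAX_IMAGE_BYTES) {
    return { ok: false, reason: "Image exceeds the 5 MB limit." };
  }
  return { ok: true, reason: null };
}

// Works with a browser File object or any plain object of the same shape.
console.log(validateImage({ type: "image/png", size: 1024 }));
console.log(validateImage({ type: "image/jpeg", size: 6 * 1024 * 1024 }));
```

A check like this only improves feedback in the UI; the server would still need its own limit, since client-side checks can be bypassed.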
- Upload or capture an image (optional): drag and drop, browse files, or use the in-app camera.
- Select a tone that matches your post's vibe.
- Choose your platform: Instagram, Twitter / X, Facebook, or LinkedIn.
- Enter a text prompt (optional): describe the scene, mood, or intent.
- The AI processes your inputs and returns a scroll-stopping caption of roughly 180 characters or fewer.
- Copy and post.
When both an image and a prompt are provided, the AI combines them into a single, context-aware caption rather than treating them separately.
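One way to picture how tone, platform, prompt, and image come together is a small prompt builder. This is a hedged sketch only: the project lists a `buildPrompt` function in `server/utils/utils.js`, but its real signature and wording are not shown here, so the parameters, the instruction text, and the `MAX_CAPTION_CHARS` constant below are all assumptions.

```javascript
// Hypothetical sketch of a prompt builder. The real buildPrompt may take
// different arguments and phrase its instructions differently.
const MAX_CAPTION_CHARS = 180; // taken from the "~180 characters" target

function buildPrompt({ tone, platform, prompt = "", hasImage = false }) {
  const text = prompt.trim();
  // Assumption: at least one of image or text prompt must be present.
  if (!hasImage && text === "") {
    throw new Error("Provide an image, a text prompt, or both.");
  }

  const lines = [
    `Write one ${tone} caption for ${platform}.`,
    `Keep it within about ${MAX_CAPTION_CHARS} characters.`,
  ];

  if (hasImage && text) {
    // Both inputs: ask for one caption that merges them.
    lines.push(`Blend the attached image with this context into a single caption: "${text}".`);
  } else if (hasImage) {
    lines.push("Base the caption on the attached image.");
  } else {
    lines.push(`Base the caption on this description: "${text}".`);
  }

  return lines.join("\n");
}

console.log(
  buildPrompt({ tone: "Romantic", platform: "Instagram", prompt: "golden hour", hasImage: true })
);
```

Keeping the three input cases in one function makes it easy to see that the combined case produces a single instruction rather than two separate requests.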
- Opens a full-screen camera modal with a live viewfinder.
- Toggle between front camera (selfie) and back camera on supported devices.
- Hit the shutter button to capture, then accept the photo to use it or retake if needed.
- Captured photos are automatically resized and passed to the AI alongside your prompt.
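The resizing step can be illustrated with the dimension math alone. This is a sketch under stated assumptions: `fitWithin` is a hypothetical helper, and the 1024-pixel default is a placeholder, not a value taken from the project's `ImageResolution.js`.

```javascript
// Hypothetical helper: scale dimensions so the longer side is at most maxSide,
// preserving aspect ratio. The 1024 px default is an assumption.
function fitWithin(width, height, maxSide = 1024) {
  const longest = Math.max(width, height);
  if (longest <= maxSide) {
    return { width, height }; // already small enough; never upscale
  }
  const scale = maxSide / longest;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

console.log(fitWithin(4000, 3000)); // { width: 1024, height: 768 }
console.log(fitWithin(800, 600));   // { width: 800, height: 600 }
```

In a browser, dimensions like these could be used to size a canvas before drawing the captured frame onto it and exporting a smaller image.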
- Influencers curating their brand identity
- Creators posting daily lifestyle content
- Marketers running campaigns across platforms
- Individuals breaking through creative block
| Input Type | Prompt / Image Description | Tone | Platform | Output Caption |
|---|---|---|---|---|
| Image | Cozy coffee shop corner | Aesthetic | | "Where lattes meet lazy afternoons" |
| Text | Monday Motivation | Inspirational | | "You were not born to be mediocre. Rise." |
| Both | Selfie at beach, "golden hour" | Romantic | | "You + me + sunsets = everything I need." |
| Camera | Live photo at a rooftop | Witty | Twitter / X | "Sky's the limit, until rent's due. #RooftopLife" |
Backend
- Node.js + Express: API server and static file host for the React frontend
- Mistral AI (`mistral-small-latest`) via `@langchain/mistralai`: primary caption generation model
- Google Gemini (`gemini-2.5-flash-lite`) via `@langchain/google-genai`: fallback / image-capable model
- Multer: multipart file upload handling
- Helmet & CORS: security middleware
- Rate Limiting: request throttling
- Timeout handling: reliability under load
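The primary/fallback arrangement and the timeout handling listed above can be sketched in plain JavaScript. This is one possible reading, not the project's implementation: `withTimeout`, `generateWithFallback`, and the 15-second default are hypothetical, and the `primary` and `fallback` arguments stand in for whatever functions actually call Mistral and Gemini.

```javascript
// Hypothetical sketch: race a promise against a timer.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// `primary` and `fallback` are stand-ins for functions that call a model
// and resolve to a caption string. The 15000 ms default is an assumption.
async function generateWithFallback(primary, fallback, input, ms = 15000) {
  try {
    return await withTimeout(primary(input), ms);
  } catch (err) {
    // Primary failed or was too slow: try the fallback once.
    return await withTimeout(fallback(input), ms);
  }
}

// Usage with fake models standing in for real API calls.
const failing = async () => { throw new Error("primary unavailable"); };
const working = async (input) => `Caption for: ${input}`;

generateWithFallback(failing, working, "rooftop at sunset")
  .then((caption) => console.log(caption)); // Caption for: rooftop at sunset
```

Note that racing against a timer stops the caller from waiting but does not cancel the underlying request; real cancellation would need support from the client library in use.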
Frontend
- React + Vite: UI framework and build tool
- TanStack Query: async state and mutation management
- react-dropzone: drag-and-drop image uploads
- MediaDevices API: in-browser camera access (front & back)
- shadcn/ui: accessible select components
- react-toastify: toast notifications
- Outfit + Playfair Display: typography
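Front and back camera selection with the MediaDevices API usually comes down to the `facingMode` constraint passed to `getUserMedia`. The sketch below shows that idea; `cameraConstraints` and `startCamera` are hypothetical names, and the project's `CameraModal.jsx` may structure this differently.

```javascript
// Hypothetical helper: build getUserMedia constraints for the chosen camera.
// "user" is the front (selfie) camera, "environment" is the back camera.
function cameraConstraints(useFrontCamera) {
  return {
    audio: false,
    video: { facingMode: useFrontCamera ? "user" : "environment" },
  };
}

// Browser-only: request a stream and attach it to a <video> element.
async function startCamera(videoElement, useFrontCamera = true) {
  if (typeof navigator === "undefined" || !navigator.mediaDevices?.getUserMedia) {
    throw new Error("Camera access is not supported in this environment.");
  }
  const stream = await navigator.mediaDevices.getUserMedia(
    cameraConstraints(useFrontCamera)
  );
  videoElement.srcObject = stream;
  await videoElement.play();
  // Stop the camera later with: stream.getTracks().forEach((t) => t.stop())
  return stream;
}

console.log(cameraConstraints(false));
```

Because `facingMode` is given as a plain value rather than `{ exact: ... }`, the browser treats it as a preference, so devices with a single camera still return a stream.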
    SnapScribe/
    ├── client/                          # React frontend (built output served by Express)
    │   └── src/
    │       ├── api/
    │       │   └── generatePrompt.js
    │       ├── components/
    │       │   ├── App.jsx
    │       │   ├── Logo.jsx
    │       │   ├── CaptionUploader.jsx
    │       │   ├── FileUploader.jsx
    │       │   ├── CameraModal.jsx      # New: in-app camera
    │       │   ├── PlatformSelector.jsx # New: platform buttons
    │       │   ├── Selector.jsx
    │       │   ├── Input.jsx
    │       │   └── CaptionLogs.jsx
    │       ├── hooks/
    │       │   └── useGenerateCaption.js
    │       └── utils/
    │           └── ImageResolution.js
    ├── server/
    │   ├── services/
    │   │   └── ai.service.js            # Mistral + Gemini model logic
    │   ├── utils/
    │   │   ├── utils.js                 # buildPrompt
    │   │   └── constants.js
    │   └── server.js                    # Serves API + built React frontend
    ├── .env
    ├── package.json
    └── README.md
Clone the repository:

    git clone https://github.com/gc-MayankPun/AI-Caption-Generator.git
    cd AI-Caption-Generator

Install dependencies:

    # Backend
    npm install
    # Frontend
    cd client && npm install && cd ..

Create a .env file in the root:

    MISTRAL_API_KEY=your_mistral_api_key
    GEMINI_API_KEY=your_gemini_api_key
    PORT=3000

Build the frontend:

    cd client && npm run build && cd ..

The Express server serves the built React app from `client/dist`, so no separate deployment is needed.

Start the server:

    npm start

Visit http://localhost:3000 to use the app.
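The three `.env` values can be validated once at startup so that a missing key fails fast instead of surfacing on the first caption request. This is an illustrative sketch: `loadConfig` is a hypothetical function, and how the project actually reads its environment is not shown in this README.

```javascript
// Hypothetical config loader. The variable names match the .env example,
// but the validation itself is an assumption about how a server might start.
function loadConfig(env = process.env) {
  const missing = ["MISTRAL_API_KEY", "GEMINI_API_KEY"].filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  // Fall back to port 3000 when PORT is not set.
  const port = Number.parseInt(env.PORT ?? "3000", 10);
  if (Number.isNaN(port)) {
    throw new Error(`PORT must be a number, got "${env.PORT}"`);
  }
  return {
    mistralApiKey: env.MISTRAL_API_KEY,
    geminiApiKey: env.GEMINI_API_KEY,
    port,
  };
}

// Example with an explicit object instead of process.env.
console.log(loadConfig({ MISTRAL_API_KEY: "m", GEMINI_API_KEY: "g" }).port); // 3000
```

Passing the environment in as an argument keeps the function easy to test without touching real credentials.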
- Caption History & Save Feature
- Tone Suggestions via AI
- Auto-style formatting per platform
- Performance dashboard
- Bulk caption generator for creators
MIT. See LICENSE for details.