AgentCut

Turn a text prompt into a fully produced video. 6 AI agents collaborate through a structured pipeline -- from creative direction to final cut.

How It Works

User Prompt
    |
    v
[Director Agent] -- Analyzes creative vision, plans shots
    |
    v
[Script Agent] -- Writes production script with video prompts, narration, subtitles
    |
    +------------------+------------------+
    |                  |                  |
    v                  v                  v
[Visual Agent]    [Voice Agent]     [Music Agent]
 Hailuo Video      Speech TTS        Music Gen
    |                  |                  |
    +------------------+------------------+
    |
    v
[Editor Agent] -- ffmpeg: concat, mix audio, burn subtitles
    |
    v
Final MP4

Visual, Voice, and Music agents run in parallel after the Script is ready. The Editor waits for all three, then composites the final video.

Features

Multi-agent SOP -- 6 specialized agents, each with a single responsibility, orchestrated in a production pipeline
Parallel execution -- Video, voice, and music generation run concurrently to minimize total wait time
Real-time progress -- SSE streaming shows every agent's status with completed/active/pending card states and elapsed timer
One prompt, full video -- From text description to a complete MP4 with narration, subtitles, and background music
~$0.50 per video -- Typical cost for a 3-shot video using MiniMax APIs
Style presets -- 5 one-click preset templates (Tech Product, Travel Vlog, Social Media Ad, Explainer, Brand Story)
Script preview -- Collapsible preview of the production script after the Script agent completes
Production summary -- Total time, shot count, agent count, and estimated cost displayed after video completion
Job isolation -- Each job gets a separate output directory, enabling safe concurrent production
Auto-cleanup -- Jobs older than 1 hour are automatically cleaned up (memory + disk)
Graceful error handling -- Per-stage error propagation with structured logging and frontend error display

Tech Stack

Component	Technology
LLM	MiniMax M1 (Director + Script agents)
Video	MiniMax Hailuo 2.3 (1080P, 6s clips)
Voice	MiniMax Speech-2.6-HD (TTS with emotion)
Music	MiniMax Music-2.0 (instrumental BGM)
Composition	ffmpeg (concat, audio mix, subtitles)
Backend	Python FastAPI + SSE streaming
Frontend	HTML + Tailwind CSS

Quick Start

Prerequisites

Python 3.10+
ffmpeg (brew install ffmpeg or apt install ffmpeg)
MiniMax API key

Setup

git clone https://github.com/calderbuild/agentcut.git
cd agentcut

pip install -r backend/requirements.txt

cp .env.example .env
# Edit .env and add your MiniMax API key

Run

python -m backend.main

Open http://localhost:8000

Docker

cp .env.example .env
# Edit .env and add your MiniMax API key

docker compose up

Usage

Enter a video description (e.g. "A sunset over Tokyo, cinematic aerial view")
Choose duration and shot count
Click Start Production
Watch the 6 agents work in real-time
Download the final MP4

API

Endpoint	Method	Description
`/api/create`	POST	Start a production job
`/api/stream/{job_id}`	GET	SSE event stream for progress
`/api/status/{job_id}`	GET	Poll job status
`/api/download/{job_id}`	GET	Download final video
`/api/health`	GET	Health check

Create a video

curl -X POST http://localhost:8000/api/create \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A sunset over Tokyo", "duration": 18, "num_shots": 3}'

Cost

Each video generation uses MiniMax API credits:

LLM calls (Director + Script): ~$0.01
Video generation (3 shots): ~$0.30-0.60
Voice TTS (3 shots): ~$0.01
Music generation: ~$0.05

Estimated total per video: ~$0.40-0.70

Troubleshooting

ffmpeg subtitle filter not available: If subtitles are not burned into the video, your ffmpeg build lacks libass. Install with brew install ffmpeg (macOS, usually includes it) or apt install ffmpeg libass-dev (Linux). AgentCut will still produce videos without subtitles if the filter is missing.

API key errors: Make sure .env contains a valid MINIMAX_API_KEY. The server will refuse to start if the key is missing.

Video generation timeout: Hailuo video generation can take 2-5 minutes per shot. The pipeline polls every 10 seconds with a 5-minute timeout per shot.

Music generation fails: Music generation is non-blocking. If it fails, the pipeline continues and the Music card shows an amber "Skipped" state. The final video is produced without background music.

LLM returns invalid JSON: Director and Script agents automatically retry once with an explicit "return valid JSON" instruction if the first response fails to parse.

Concurrent jobs: Each job writes to its own backend/output/job_{id}/ directory. Multiple simultaneous productions will not conflict.

Project Structure

backend/
  agents/
    director.py   -- Creative director: prompt -> shot outline
    script.py     -- Scriptwriter: outline -> production script
    visual.py     -- Visual artist: script -> video clips (parallel)
    voice.py      -- Voice artist: script -> narration audio
    music.py      -- Composer: style+mood -> background music
    editor.py     -- Editor: ffmpeg composition
  pipeline.py     -- Agent orchestration (parallel execution)
  main.py         -- FastAPI server + SSE streaming
  config.py       -- Configuration + validation
frontend/
  index.html      -- Single-page app with real-time pipeline UI

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
backend		backend
docs		docs
frontend		frontend
.env.example		.env.example
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AgentCut

How It Works

Features

Tech Stack

Quick Start

Prerequisites

Setup

Run

Docker

Usage

API

Create a video

Cost

Troubleshooting

Project Structure

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AgentCut

How It Works

Features

Tech Stack

Quick Start

Prerequisites

Setup

Run

Docker

Usage

API

Create a video

Cost

Troubleshooting

Project Structure

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages