All workflows accept an optional credentials object for runtime credential injection. This is inherited from the base MuxAIOptions interface and is not repeated for each workflow below.
Analyzes a Mux video or audio asset and returns AI-generated metadata.
Parameters:
- `assetId` (string) - Mux asset ID (video or audio-only)
- `options` (optional) - Configuration options
Options:
- `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
- `tone?: 'neutral' | 'playful' | 'professional'` - Analysis tone (default: 'neutral')
- `model?: string` - AI model to use (defaults: `gpt-5.1`, `claude-sonnet-4-5`, or `gemini-3-flash-preview`)
- `languageCode?: string` - Language code for transcript track selection (e.g., 'en', 'fr'). When omitted, prefers English if available.
- `outputLanguageCode?: string` - BCP 47 language code (e.g., 'en', 'fr', 'ja') for the generated title, description, and tags. When omitted or set to `'auto'`, auto-detects from the selected transcript track's language. Falls back to unconstrained (LLM decides) if no language metadata is available.
- `includeTranscript?: boolean` - Include transcript in analysis (default: true)
- `cleanTranscript?: boolean` - Remove VTT timestamps and formatting from transcript (default: true)
- `imageSubmissionMode?: 'url' | 'base64'` - How to submit storyboard to AI providers (default: 'url')
- `imageDownloadOptions?: object` - Options for image download when using base64 mode
- `timeout?: number` - Request timeout in milliseconds (default: 10000)
- `retries?: number` - Maximum retry attempts (default: 3)
- `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
- `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
- `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
- `promptOverrides?: object` - Override specific sections of the prompt for custom use cases
  - `task?: string` - Override the main task instruction
  - `title?: string` - Override title generation guidance
  - `description?: string` - Override description generation guidance
  - `keywords?: string` - Override keywords generation guidance
  - `qualityGuidelines?: string` - Override quality guidelines
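A sketch of a typical call (the asset ID is a placeholder; the import path matches the workflows entry point used elsewhere in this document):

```typescript
import { getSummaryAndTags } from "@mux/ai/workflows";

// Generate professional-tone metadata in French, with fewer retries
const result = await getSummaryAndTags("your-asset-id", {
  provider: "anthropic",
  tone: "professional",
  outputLanguageCode: "fr",
  retries: 2,
});

console.log(result.title);       // Short title
console.log(result.description); // Detailed description
console.log(result.tags);        // Up to 10 relevant keywords
```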
Returns:
interface SummaryAndTagsResult {
assetId: string;
title: string; // Short title
description: string; // Detailed description
tags: string[]; // Up to 10 relevant keywords
storyboardUrl?: string; // Video storyboard URL (undefined for audio-only assets)
usage?: TokenUsage; // Token usage from the AI provider
transcriptText?: string; // Raw transcript text (when includeTranscript is true)
}

Analyzes a Mux asset for inappropriate content using OpenAI's Moderation API or Hive's Moderation API.
- For video assets, this moderates storyboard thumbnails (image moderation).
- For audio-only assets, this moderates the underlying transcript text (text moderation).
Parameters:
- `assetId` (string) - Mux asset ID (video or audio-only)
- `options` (optional) - Configuration options
Options:
- `provider?: 'openai' | 'hive'` - Moderation provider (default: 'openai')
- `model?: string` - OpenAI moderation model to use (default: `omni-moderation-latest`)
- `languageCode?: string` - Transcript language code when moderating audio-only assets (optional)
- `thresholds?: { sexual?: number; violence?: number }` - Custom thresholds (default: `{ sexual: 0.7, violence: 0.8 }`)
- `thumbnailInterval?: number` - Seconds between thumbnails for long videos (default: 10)
- `thumbnailWidth?: number` - Thumbnail width in pixels (default: 640)
- `maxSamples?: number` - Maximum number of thumbnails to sample. Acts as a cap: if `thumbnailInterval` produces fewer samples than this limit, the interval is respected; otherwise samples are evenly distributed with the first and last frames pinned. (default: unlimited)
- `maxConcurrent?: number` - Maximum concurrent API requests (default: 5)
- `imageSubmissionMode?: 'url' | 'base64'` - How to submit images to AI providers (default: 'url')
- `imageDownloadOptions?: object` - Options for image download when using base64 mode
- `timeout?: number` - Request timeout in milliseconds (default: 10000)
- `retries?: number` - Maximum retry attempts (default: 3)
- `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
- `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
- `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
Hive note (audio-only): transcript moderation submits `text_data` and requires a Hive Text Moderation project/API key. If you use a Visual Moderation key, Hive will reject the request (see Hive Text Moderation docs).
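The relationship between the returned max scores and the thresholds can be sketched as a pure comparison. This is an illustration of the documented semantics only; whether the boundary uses `>=` or `>` is an assumption here, so rely on the returned `exceedsThreshold` field in practice:

```typescript
type Scores = { sexual: number; violence: number };

// Content is flagged when any max score meets or exceeds its threshold
// (boundary behavior assumed for illustration).
function flagContent(maxScores: Scores, thresholds: Scores): boolean {
  return (
    maxScores.sexual >= thresholds.sexual ||
    maxScores.violence >= thresholds.violence
  );
}
```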
Returns:
{
assetId: string;
mode: 'thumbnails' | 'transcript';
isAudioOnly: boolean;
thumbnailScores: Array<{ // Individual thumbnail results
url: string;
time?: number; // Time in seconds of the thumbnail within the video
sexual: number; // 0-1 score
violence: number; // 0-1 score
error: boolean;
errorMessage?: string;
}>;
maxScores: { // Highest scores across all thumbnails (or transcript chunks for audio-only)
sexual: number;
violence: number;
};
coverage: {
requestedSampleCount: number;
successfulSampleCount: number;
failedSampleCount: number;
sampleCoverage: number; // 0-1 fraction of requested samples that succeeded
isPartial: boolean; // true when some samples failed but the workflow still returned a result
isLowConfidence: boolean; // true when coverage is thin and thresholds should be interpreted cautiously
};
exceedsThreshold: boolean; // true if content should be flagged
thresholds: { // Threshold values used
sexual: number;
violence: number;
};
usage?: TokenUsage; // Workflow usage metadata
}

Analyzes video frames to detect burned-in captions (hardcoded subtitles) that are permanently embedded in the video image.
Parameters:
- `assetId` (string) - Mux video asset ID
- `options` (optional) - Configuration options
Options:
- `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
- `model?: string` - AI model to use (defaults: `gpt-5.1`, `claude-sonnet-4-5`, or `gemini-3-flash-preview`)
- `imageSubmissionMode?: 'url' | 'base64'` - How to submit storyboard to AI providers (default: 'url')
- `imageDownloadOptions?: object` - Options for image download when using base64 mode
- `timeout?: number` - Request timeout in milliseconds (default: 10000)
- `retries?: number` - Maximum retry attempts (default: 3)
- `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
- `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
- `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
- `promptOverrides?: object` - Override specific sections of the detection prompt
  - `task?: string` - Override the main analysis task instruction
  - `analysisSteps?: string` - Override the step-by-step analysis procedure
  - `positiveIndicators?: string` - Override criteria for classifying text as captions
  - `negativeIndicators?: string` - Override criteria for ruling out captions
Returns:
{
assetId: string;
hasBurnedInCaptions: boolean; // Whether burned-in captions were detected
confidence: number; // Confidence score (0.0-1.0)
detectedLanguage: string | null; // Language of detected captions, or null
storyboardUrl: string; // URL to analyzed storyboard
usage?: TokenUsage; // Token usage from the AI provider
}

Detection Logic:
- Analyzes video storyboard frames to identify text overlays
- Distinguishes between actual captions and marketing/end-card text
- Text appearing only in the final 1-2 frames is classified as marketing copy
- Caption text must appear across multiple frames throughout the timeline
- Optimized prompts minimize false positives
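A sketch of consuming the detection result (the interface mirrors the Returns section above; the confidence cutoff and the `shouldSkipAutoCaptions` helper are illustrative, not part of the package):

```typescript
interface BurnedInCaptionsResult {
  assetId: string;
  hasBurnedInCaptions: boolean;
  confidence: number; // 0.0-1.0
  detectedLanguage: string | null;
}

// Illustrative rule: only act on a positive detection above a confidence
// cutoff, e.g. to avoid generating duplicate caption tracks.
function shouldSkipAutoCaptions(
  result: BurnedInCaptionsResult,
  minConfidence = 0.8 // assumed cutoff; tune for your content
): boolean {
  return result.hasBurnedInCaptions && result.confidence >= minConfidence;
}
```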
Answer questions about asset content by analyzing storyboard frames and optional transcripts. For audio-only assets, this workflow analyzes transcript text only. By default, answers are "yes"/"no", but you can override the allowed responses.
Parameters:
- `assetId` (string) - Mux asset ID (video or audio-only)
- `questions` (array) - Array of question objects
  - Each question object must have a `question` field (string)
  - Each question may optionally include `answerOptions?: string[]` (defaults to `["yes", "no"]`)
  - Example: `[{ question: "What is the production quality?", answerOptions: ["amateur", "semi-pro", "professional"] }]`
- `options` (optional) - Configuration options
Options:
- `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
- `model?: string` - AI model to use (defaults: `gpt-5.1`, `claude-sonnet-4-5`, or `gemini-3-flash-preview`)
- `languageCode?: string` - Language code for transcript track selection (e.g., 'en', 'fr'). When omitted, prefers English if available.
- `includeTranscript?: boolean` - Include transcript in analysis (default: true, required for audio-only assets)
- `cleanTranscript?: boolean` - Remove VTT timestamps and formatting from transcript (default: true)
- `imageSubmissionMode?: 'url' | 'base64'` - How to submit storyboard to AI providers (default: 'url')
- `imageDownloadOptions?: object` - Options for image download when using base64 mode
- `timeout?: number` - Request timeout in milliseconds (default: 10000)
- `retries?: number` - Maximum retry attempts (default: 3)
- `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
- `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
- `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
- `storyboardWidth?: number` - Storyboard resolution in pixels (default: 640)
Returns:
interface AskQuestionsResult {
assetId: string;
answers: Array<{
question: string; // The original question
answer: string | null; // Answer from allowed options (null when skipped)
confidence: number; // Confidence score (0.0-1.0)
reasoning: string; // AI's explanation based on observable evidence or why the question was skipped
skipped: boolean; // True when the question was not answerable from the asset content
}>;
storyboardUrl?: string; // URL to analyzed storyboard (undefined for audio-only assets)
usage?: TokenUsage; // Token usage from the AI provider
transcriptText?: string; // Raw transcript (when includeTranscript is true)
}

Examples:
// Single question
const result = await askQuestions("asset-id", [
{ question: "Does this video contain cooking?" }
]);
console.log(result.answers[0].answer); // "yes" or "no" by default
console.log(result.answers[0].confidence); // 0.95
console.log(result.answers[0].reasoning); // "A chef prepares ingredients..."
// Multiple questions (efficient single API call)
const result = await askQuestions("asset-id", [
{ question: "Does this video contain people?" },
{ question: "Is this video in color?" },
{ question: "Does this video contain violence?" }
]);
// Without transcript (visual-only analysis)
const result = await askQuestions("asset-id", questions, {
includeTranscript: false
});
// Per-question answer options — mix yes/no with classification scales
const result = await askQuestions("asset-id", [
{ question: "Does this contain cooking?" }, // answer options default to yes/no
{ question: "What is the production quality?", answerOptions: ["amateur", "semi-pro", "professional"] },
{ question: "What is the primary content type?", answerOptions: ["tutorial", "entertainment", "news", "advertisement"] },
{ question: "What is the overall sentiment?", answerOptions: ["positive", "neutral", "negative"] },
]);

Tips for Effective Questions:
- Be specific and focused on observable evidence
- Frame questions positively (prefer "Is X present?" over "Is X not present?")
- Avoid ambiguous or subjective questions
- Questions should have clear answers that map to your allowed options
- The AI prioritizes visual evidence when transcript and visuals conflict
Generate AI-powered insights explaining viewer engagement patterns by analyzing hotspot data, heatmap statistics, visual frames, and transcripts.
Parameters:
- `assetId` (string) - Mux asset ID
- `options` (optional) - Configuration options
Options:
- `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
- `model?: string` - AI model to use (defaults: `gpt-5.1`, `claude-sonnet-4-5`, or `gemini-3-flash-preview`)
- `hotspotLimit?: number` - Number of engagement moments to analyze per direction (default: 5, range: 1-10). Note: the actual moment count may be up to 2x this value since both peaks and valleys are fetched.
- `timeframe?: string` - Engagement data timeframe (default: '7:days'). Examples: `'60:minutes'`, `'24:hours'`, `'7:days'`, `'30:days'`
- `skipShots?: boolean` - Skip shots integration and use thumbnails instead (default: false). Recommended for latency-sensitive use cases.
Returns:
interface EngagementInsightsResult {
assetId: string;
momentInsights: Array<{
startMs: number; // Start time in milliseconds
endMs: number; // End time in milliseconds
timestamp: string; // Human-readable timestamp (e.g., "2:15")
engagementScore: number; // Normalized score (0.0-1.0)
insight: string; // Explanation of engagement pattern
}>;
overallInsight: {
summary: string; // Overall engagement summary
trends: string[]; // Key trends identified
};
usage?: { // Token usage statistics
inputTokens: number;
outputTokens: number;
totalTokens: number;
};
}

Examples:
// Basic usage - informational insights
const result = await generateEngagementInsights("asset-id");
result.momentInsights.forEach(m => {
console.log(`${m.timestamp}: ${m.insight}`);
});
// Custom timeframe
const result = await generateEngagementInsights("asset-id", {
timeframe: "30:days",
hotspotLimit: 5,
});
console.log(result.overallInsight.summary);
console.log("Trends:", result.overallInsight.trends);
// Low-latency mode (skip shots polling)
const result = await generateEngagementInsights("asset-id", {
skipShots: true,
});

Requirements:
- Newer or low-view videos may not have sufficient engagement data
- Works with both video and audio-only assets (audio-only skips visual analysis)
Use Cases:
- Content optimization based on viewer behavior
- Understanding what drives re-watching and engagement
- Identifying pacing issues and drop-off points
- A/B testing video variations
- Providing engagement feedback to content creators
Translates existing captions from one language to another and optionally adds them as a new track to the Mux asset. The source language is inferred from the track's metadata.
Parameters:
- `assetId` (string) - Mux asset ID (video or audio-only)
- `trackId` (string) - ID of the source caption track to translate
- `toLanguageCode` (string) - Target language code (e.g., 'es', 'fr', 'de')
- `options` - Configuration options
Options:
- `provider: 'openai' | 'anthropic' | 'google'` - AI provider (required)
- `model?: string` - Model to use (defaults to the provider's chat model if omitted)
- `uploadToMux?: boolean` - Whether to upload the translated track to Mux (default: true)
- `s3Endpoint?: string` - S3-compatible storage endpoint
- `s3Region?: string` - S3 region (default: 'auto')
- `s3Bucket?: string` - S3 bucket name
- `storageAdapter?: StorageAdapter` - Optional adapter with `putObject` and `createPresignedGetUrl` methods
- `s3SignedUrlExpirySeconds?: number` - Expiry duration in seconds for S3 presigned GET URLs (default: 86400 / 24 hours)
- `chunking?: object` - Optional VTT-aware chunking controls for large caption translations
  - `enabled?: boolean` - Set to `false` to translate all cues in a single structured request (default: `true`)
  - `minimumAssetDurationSeconds?: number` - Prefer a single request until the asset is at least this long (default: `1800`)
  - `targetChunkDurationSeconds?: number` - Soft target for chunk duration once chunking starts (default: `1800`)
  - `maxConcurrentTranslations?: number` - Maximum number of concurrent translation requests when chunking (default: `4`)
  - `maxCuesPerChunk?: number` - Hard cap on cues included in a single AI translation chunk (default: `80`)
  - `maxCueTextTokensPerChunk?: number` - Approximate cap on cue text tokens included in a single AI translation chunk (default: `2000`)
Returns:
interface TranslationResult {
assetId: string;
trackId: string; // Source track ID
sourceLanguageCode: string; // Inferred from track metadata
targetLanguageCode: string;
sourceLanguage: LanguageCodePair; // { iso639_1: string; iso639_3: string }
targetLanguage: LanguageCodePair; // { iso639_1: string; iso639_3: string }
originalVtt: string; // Original VTT content
translatedVtt: string; // Translated VTT content
uploadedTrackId?: string; // Mux track ID (if uploaded)
presignedUrl?: string; // S3 presigned URL (default expiry: 24 hours)
usage?: TokenUsage; // Token usage from the AI provider
}

Supported Languages:
All ISO 639-1 language codes are automatically supported using Intl.DisplayNames. Examples: Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Japanese (ja), Korean (ko), Chinese (zh), Russian (ru), Arabic (ar), Hindi (hi), Thai (th), Swahili (sw), and many more.
Chunking Behavior:
- Chunking is enabled by default for `translateCaptions`
- Shorter assets are translated in a single request until `minimumAssetDurationSeconds` is reached
- When chunking is active, requests stay aligned to VTT cues and the final VTT is rebuilt locally
- Chunk size is bounded by both cue count and an approximate cue text token budget
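A sketch of a translation call (the IDs are placeholders; the `translateCaptions` export and option names come from this section):

```typescript
import { translateCaptions } from "@mux/ai/workflows";

// Translate an existing caption track to Spanish; source language is
// inferred from the track's metadata
const result = await translateCaptions("your-asset-id", "source-track-id", "es", {
  provider: "openai", // required for this workflow
  chunking: {
    targetChunkDurationSeconds: 900, // smaller chunks than the 1800s default
  },
});

console.log(result.sourceLanguage.iso639_1); // e.g. "en", inferred from metadata
console.log(result.uploadedTrackId);         // present when uploadToMux is true
```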
Edits a caption track using LLM-powered profanity censorship, static find/replace, or both. Optionally uploads the edited track to Mux.
Parameters:
- `assetId` (string) - Mux asset ID (video or audio-only)
- `trackId` (string) - ID of the caption track to edit
- `options` - Configuration options
Options:
- `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (required when `autoCensorProfanity` is set)
- `model?: string` - Model to use (defaults to the provider's chat model if omitted)
- `autoCensorProfanity?: object` - LLM-powered profanity censorship (optional)
  - `mode?: 'blank' | 'remove' | 'mask'` - Replacement strategy (default: 'blank')
    - `'blank'`: `shit` → `[____]` (bracketed underscores matching word length)
    - `'remove'`: word removed entirely
    - `'mask'`: `shit` → `????` (question marks matching word length)
  - `alwaysCensor?: string[]` - Words to always censor regardless of LLM output
  - `neverCensor?: string[]` - Words to never censor even if the LLM flags them (takes precedence over `alwaysCensor`)
- `replacements?: Array<{ find: string; replace: string }>` - Static find/replace pairs (optional, no LLM needed)
- `uploadToMux?: boolean` - Whether to upload the edited track to Mux (default: true)
- `deleteOriginalTrack?: boolean` - Whether to delete the original track after uploading the edited one (default: true)
- `s3Endpoint?: string` - S3-compatible storage endpoint
- `s3Region?: string` - S3 region (default: 'auto')
- `s3Bucket?: string` - S3 bucket name
- `trackNameSuffix?: string` - Suffix appended to the original track name in parentheses (default: 'edited', e.g. "Subtitles (edited)")
- `storageAdapter?: StorageAdapter` - Optional adapter with `putObject` and `createPresignedGetUrl` methods
- `s3SignedUrlExpirySeconds?: number` - Expiry duration in seconds for S3 presigned GET URLs (default: 86400 / 24 hours)
At least one of `autoCensorProfanity` or `replacements` must be provided.
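A sketch combining both operations. The export name `editCaptions` is an assumption based on the `EditCaptionsResult` interface below; check your package's exports before using it:

```typescript
// NOTE: `editCaptions` is an assumed export name, not confirmed by this document.
import { editCaptions } from "@mux/ai/workflows";

const result = await editCaptions("your-asset-id", "caption-track-id", {
  provider: "openai", // required because autoCensorProfanity is set
  autoCensorProfanity: {
    mode: "mask",               // e.g. shit → ????
    neverCensor: ["damn"],      // kept even if the LLM flags it
  },
  replacements: [{ find: "Acme Corp", replace: "ACME" }],
});

console.log(result.totalReplacementCount); // replacements across both operations
```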
Returns:
interface ReplacementRecord {
cueStartTime: number; // Start time of the cue where the replacement occurred (seconds)
before: string; // Original word/phrase
after: string; // Replacement text
}
interface EditCaptionsResult {
assetId: string;
trackId: string;
originalVtt: string; // Original VTT content
editedVtt: string; // Edited VTT content
totalReplacementCount: number; // Total replacements across all operations
autoCensorProfanity?: { // Present when autoCensorProfanity was used
replacements: ReplacementRecord[]; // Each censored word with cue timing
};
replacements?: { // Present when replacements were used
replacements: ReplacementRecord[]; // Each static replacement with cue timing
};
uploadedTrackId?: string; // Mux track ID (if uploaded)
presignedUrl?: string; // S3 presigned URL (default expiry: 24 hours)
usage?: TokenUsage; // Token usage (only present if LLM was used)
}

Generates AI-powered chapter markers by analyzing video or audio transcripts. Creates logical chapter breaks based on topic changes and content transitions.
Parameters:
- `assetId` (string) - Mux asset ID (video or audio-only)
- `options` (optional) - Configuration options
Options:
- `languageCode?: string` - Language code for captions (e.g., 'en', 'es', 'fr'). When omitted, prefers English if available.
- `outputLanguageCode?: string` - BCP 47 language code (e.g., 'en', 'fr', 'ja') for the generated chapter titles. When omitted or set to `'auto'`, auto-detects from the selected transcript track's language. Falls back to unconstrained (LLM decides) if no language metadata is available.
- `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
- `model?: string` - AI model to use (defaults: `gpt-5.1`, `claude-sonnet-4-5`, or `gemini-3-flash-preview`)
- `promptOverrides?: object` - Override specific sections of the chaptering prompt
  - `task?: string` - Override the main task instruction
  - `outputFormat?: string` - Override the expected output format description
  - `chapterGuidelines?: string` - Override chapter count and formatting guidelines
  - `titleGuidelines?: string` - Override chapter title style guidelines
- `minChaptersPerHour?: number` - Minimum chapters to generate per hour of content (default: 3)
- `maxChaptersPerHour?: number` - Maximum chapters to generate per hour of content (default: 8)
Returns:
{
assetId: string;
languageCode?: string; // Resolved from input or track metadata
chapters: Array<{
startTime: number; // Chapter start time in seconds
title: string; // Descriptive chapter title
}>;
usage?: TokenUsage; // Token usage from the AI provider
}

Requirements:
- Asset must have a ready caption/transcript track
- When `languageCode` is omitted, prefers an English track if available
- Uses existing auto-generated or uploaded captions/transcripts
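A sketch of a call. The export name `generateChapters` is an assumption, not confirmed by this document; the options and result shape come from the sections above:

```typescript
// NOTE: `generateChapters` is an assumed export name; check your package's exports.
import { generateChapters } from "@mux/ai/workflows";

const result = await generateChapters("your-asset-id", {
  minChaptersPerHour: 4,
  maxChaptersPerHour: 6,
});

// result.chapters is already in the { startTime, title } shape Mux Player expects
console.log(result.chapters);
```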
Example Output:
// Perfect format for Mux Player
player.addChapters([
{ startTime: 0, title: "Introduction and Setup" },
{ startTime: 45, title: "Main Content Discussion" },
{ startTime: 120, title: "Conclusion" }
]);

Creates AI-dubbed audio tracks from existing media content using ElevenLabs voice cloning and translation. Uses the default audio track on your asset. Source language is auto-detected unless `fromLanguageCode` is provided.
Parameters:
- `assetId` (string) - Mux asset ID (video or audio-only; must have an `audio.m4a` static rendition)
- `toLanguageCode` (string) - Target language code (e.g., 'es', 'fr', 'de')
- `options` (optional) - Configuration options
Options:
- `provider?: 'elevenlabs'` - AI provider (default: 'elevenlabs')
- `fromLanguageCode?: string` - Optional source language code passed to ElevenLabs `source_lang` (ISO 639-1 or ISO 639-3, default: auto-detect)
- `numSpeakers?: number` - Number of speakers (default: 0 for auto-detect)
- `uploadToMux?: boolean` - Whether to upload the dubbed track to Mux (default: true)
- `s3Endpoint?: string` - S3-compatible storage endpoint
- `s3Region?: string` - S3 region (default: 'auto')
- `s3Bucket?: string` - S3 bucket name
- `storageAdapter?: StorageAdapter` - Optional adapter with `putObject` and `createPresignedGetUrl` methods
- `s3SignedUrlExpirySeconds?: number` - Expiry duration in seconds for S3 presigned GET URLs (default: 86400 / 24 hours)
Returns:
interface TranslateAudioResult {
assetId: string;
targetLanguageCode: string;
targetLanguage: LanguageCodePair; // { iso639_1: string; iso639_3: string }
dubbingId: string; // ElevenLabs dubbing job ID
uploadedTrackId?: string; // Mux audio track ID (if uploaded)
presignedUrl?: string; // S3 presigned URL (default expiry: 24 hours)
usage?: TokenUsage; // Workflow usage metadata
}

Requirements:
- Asset must have an `audio.m4a` static rendition (auto-requested if missing)
- ElevenLabs API key with Creator plan or higher
- S3-compatible storage for Mux ingestion
Supported Languages:
ElevenLabs supports 32+ languages with automatic language name detection via Intl.DisplayNames. Supported languages include English, Spanish, French, German, Italian, Portuguese, Polish, Japanese, Korean, Chinese, Russian, Arabic, Hindi, Thai, and many more. Track names are automatically generated (e.g., "Polish (auto-dubbed)").
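A sketch of a dubbing call (the asset ID is a placeholder; the `translateAudio` export and option names come from this section):

```typescript
import { translateAudio } from "@mux/ai/workflows";

// Dub the default audio track into Polish
const result = await translateAudio("your-asset-id", "pl", {
  numSpeakers: 2,    // skip speaker auto-detection when the count is known
  uploadToMux: true, // ingest the dubbed track back onto the asset
});

console.log(result.dubbingId);       // ElevenLabs dubbing job ID
console.log(result.uploadedTrackId); // e.g. a track named "Polish (auto-dubbed)"
```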
Generate vector embeddings for transcript chunks from video or audio assets for semantic search.
Deprecated:
`generateVideoEmbeddings` is deprecated. Use `generateEmbeddings` instead.
Parameters:
- `assetId` (string) - Mux asset ID (video or audio-only)
- `options` (optional) - Configuration options
Options:
- `provider?: 'openai' | 'google'` - Embedding provider (default: 'openai')
- `model?: string` - Model to use (defaults: `text-embedding-3-small` for OpenAI, `gemini-embedding-001` for Google)
- `chunkingStrategy?: object` - How to chunk the transcript
  - `type: 'token' | 'vtt'` - Chunking method
  - `maxTokens?: number` - Maximum tokens per chunk (default: 500)
  - `overlap?: number` - Token overlap between chunks (for `type: 'token'`, default: 100)
  - `overlapCues?: number` - VTT cue overlap between chunks (for `type: 'vtt'`, default: 2)
- `languageCode?: string` - Language code for transcript track selection. When omitted, prefers English if available.
- `batchSize?: number` - Maximum number of chunks to process concurrently (default: 5)
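For semantic search, the stored chunk embeddings are typically compared against a query embedding. A cosine-similarity helper is enough to sketch the idea; the helper below is illustrative and not part of the package:

```typescript
// Cosine similarity between two embedding vectors (assumes equal length)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank transcript chunks against a query embedding, highest similarity first
function rankChunks(
  queryEmbedding: number[],
  chunks: Array<{ chunkId: string; embedding: number[] }>
): Array<{ chunkId: string; score: number }> {
  return chunks
    .map((c) => ({
      chunkId: c.chunkId,
      score: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .sort((x, y) => y.score - x.score);
}
```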
Returns:
{
assetId: string;
chunks: Array<{
chunkId: string;
embedding: number[]; // Vector embedding
metadata: {
startTime?: number; // Chunk start time in seconds
endTime?: number; // Chunk end time in seconds
tokenCount: number;
};
}>;
averagedEmbedding: number[]; // Single embedding for entire transcript
provider: string;
model: string;
metadata: {
totalChunks: number;
totalTokens: number;
chunkingStrategy: string;
embeddingDimensions: number;
generatedAt: string;
};
usage?: TokenUsage; // Workflow usage metadata
}

Customize specific sections of the summarization prompt for different use cases like SEO, social media, or technical analysis. See the Prompt Customization guide for a full overview of the prompt builder pattern.
Tip: Before adding overrides, read through the default summarization prompt template in `src/workflows/summarization.ts` (the `summarizationPromptBuilder` config) so that you have clear context on what each section does and what you're changing.
import { getSummaryAndTags } from "@mux/ai/workflows";
// SEO-optimized metadata
const seoResult = await getSummaryAndTags(assetId, {
tone: "professional",
promptOverrides: {
task: "Generate SEO-optimized metadata that maximizes discoverability.",
title: "Create a search-optimized title (50-60 chars) with primary keyword front-loaded.",
keywords: "Focus on high search volume terms and long-tail keywords.",
},
});
// Social media optimized for engagement
const socialResult = await getSummaryAndTags(assetId, {
promptOverrides: {
title: "Create a scroll-stopping headline using emotional triggers or curiosity gaps.",
description: "Write shareable copy that creates FOMO and works without watching the video.",
keywords: "Generate hashtag-ready keywords for trending and niche community tags.",
},
});
// Technical/production analysis
const technicalResult = await getSummaryAndTags(assetId, {
tone: "professional",
promptOverrides: {
task: "Analyze cinematography, lighting, and production techniques.",
title: "Describe the production style or filmmaking technique.",
description: "Provide a technical breakdown of camera work, lighting, and editing.",
keywords: "Use industry-standard production terminology.",
},
});

Available override sections:
| Section | Description |
|---|---|
| `task` | Main instruction for what to analyze |
| `title` | Guidance for generating the title |
| `description` | Guidance for generating the description |
| `keywords` | Guidance for generating keywords/tags |
| `qualityGuidelines` | General quality instructions |
Each override can be a simple string (replaces the section content) or a full `PromptSection` object for advanced control over XML tag names and attributes.
Returned by all workflows in the `usage` field:
interface TokenUsage {
inputTokens?: number; // Tokens in the input prompt
outputTokens?: number; // Tokens generated in the output
totalTokens?: number; // Total tokens consumed
reasoningTokens?: number; // Chain-of-thought reasoning tokens
cachedInputTokens?: number; // Input tokens served from cache
metadata?: {
assetDurationSeconds?: number;
thumbnailCount?: number;
};
}

Returned by `translateCaptions` and `translateAudio`:
interface LanguageCodePair {
iso639_1: string; // Two-letter code (e.g., "en", "es") — use for Mux/browser players
iso639_3: string; // Three-letter code (e.g., "eng", "spa") — use for ElevenLabs
}