Summary
Teach users how to add video understanding capabilities to their agents using frame extraction and vision models. This section covers the batch-oriented architecture and sets clear expectations about maturity and limitations.
Course Section Outline
- Video processing architecture — FFmpeg frame extraction piped to a vision model
- When video processing is appropriate (batch analysis) and when it is not yet ready (real-time streaming)
- Configuring the VideoPreprocessor service and its extraction parameters
- Frame extraction strategies — uniform sampling, keyframe detection, scene change
- Integrating video content with the file upload endpoint and storage in MinIO
- Maturity expectations and current limitations — processing latency, model accuracy on frames
- Cost implications — each frame is a vision model call
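To make the extraction strategies and the per-frame cost concrete, here is a minimal Python sketch. It does not show how the VideoPreprocessor service is implemented. The function names (`build_ffmpeg_command`, `estimate_frame_count`), the output file pattern, and the default values are hypothetical and chosen only for illustration. The FFmpeg filters themselves (`fps`, and `select` with `pict_type` or `scene`) are standard FFmpeg options.

```python
import math
from pathlib import Path


def build_ffmpeg_command(video_path, out_dir, strategy="uniform",
                         fps=1.0, scene_threshold=0.3):
    """Build an FFmpeg argument list for one extraction strategy.

    Hypothetical helper for illustration; the defaults are assumptions,
    not values taken from the VideoPreprocessor service.
    """
    if strategy == "uniform":
        # Uniform sampling: emit a fixed number of frames per second.
        video_filter = f"fps={fps}"
        extra = []
    elif strategy == "keyframe":
        # Keyframe detection: keep only intra-coded (I) frames.
        video_filter = "select='eq(pict_type,I)'"
        # Variable frame rate output so dropped frames are not duplicated.
        # Newer FFmpeg builds prefer "-fps_mode vfr" over "-vsync vfr".
        extra = ["-vsync", "vfr"]
    elif strategy == "scene":
        # Scene change: keep frames whose scene score exceeds the threshold.
        video_filter = f"select='gt(scene,{scene_threshold})'"
        extra = ["-vsync", "vfr"]
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    output_pattern = str(Path(out_dir) / "frame_%04d.jpg")
    return ["ffmpeg", "-hide_banner", "-i", str(video_path),
            "-vf", video_filter, *extra, output_pattern]


def estimate_frame_count(duration_seconds, fps):
    """Approximate frames produced by uniform sampling.

    Each frame is one vision model call, so this is also a rough
    estimate of the number of calls.
    """
    return math.ceil(duration_seconds * fps)


if __name__ == "__main__":
    cmd = build_ffmpeg_command("clip.mp4", "frames", strategy="uniform", fps=1.0)
    print(" ".join(cmd))
    print("estimated vision calls:", estimate_frame_count(30, 1.0))
```

The argument list can be passed to `subprocess.run(cmd, check=True)` on a machine with FFmpeg installed. Under these assumptions, a 30-second clip sampled uniformly at 1 frame per second yields roughly 30 frames, and therefore roughly 30 vision model calls. Keyframe and scene-change extraction produce a content-dependent number of frames, so their cost cannot be estimated from duration alone.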
Lab Exercise
Upload a short video clip (under 30 seconds) through the file upload endpoint. Observe the frame extraction process. Ask the agent to describe what happens in the video and verify it synthesizes information across multiple extracted frames.
Companion Issues
Companion issues filed on fips-agents/agent-template, fips-agents/gateway-template, fips-agents/ui-template, and fips-agents/fips-agents-cli.
Size
S-M