
fix(render): preserve source fps/sample-rate + accurate frame seeking to prevent A/V drift #31

Open
daniellegurgel wants to merge 1 commit into browser-use:main from daniellegurgel:main


@daniellegurgel daniellegurgel commented May 10, 2026

Summary

Two related fixes to prevent A/V drift when concatenating many segments. Discovered while editing a 3h37min source with 525 cuts, where lip sync drifted progressively further off over the timeline.

1. Stop forcing -r 24 and -ar 48000 in extract_segment()

The hardcoded -r 24 was downsampling source FPS (e.g. 30fps → 24fps), dropping ~20% of frames per segment. The hardcoded -ar 48000 was upsampling audio (44.1kHz → 48kHz). Both transformations introduced cumulative timing errors over hundreds of concatenated segments.

By removing the hardcoded values, ffmpeg passes through the source rates intact — no resampling or framerate conversion. Same fix applied to both apply_loudnorm_two_pass() calls (-ar 48000 removed there as well).
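The rate figures above can be checked with a little arithmetic. This is a minimal sketch, assuming constant-rate streams; the function names are hypothetical and are not part of helpers/render.py.

```python
from fractions import Fraction


def frames_dropped_fraction(src_fps, dst_fps):
    """Fraction of source frames discarded when a constant-rate
    stream is converted down from src_fps to dst_fps."""
    src, dst = Fraction(src_fps), Fraction(dst_fps)
    if dst >= src:
        return Fraction(0)  # no frames are dropped when the rate is kept or raised
    return 1 - dst / src


def resample_ratio(src_rate, dst_rate):
    """Output samples produced per input sample when resampling audio."""
    return Fraction(dst_rate, src_rate)


print(frames_dropped_fraction(30, 24))  # 1/5, i.e. the ~20% quoted above
print(resample_ratio(44100, 48000))     # 160/147, a non-integer ratio
```

Since 160/147 is not an integer ratio, a resampled segment generally cannot keep exactly the same sample-to-time mapping as the source, which is consistent with the small per-segment errors described here.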

2. Hybrid seek (-ss before AND after -i) in extract_segment()

Before:

-ss seg_start -i source -t duration

Fast seek but inexact (jumps to nearest keyframe before seg_start). With many cuts, sub-frame errors accumulate.

After:

-ss (seg_start - 1.0) -i source -ss 1.0 -t duration

Fast seek to 1s before, then accurate (frame-by-frame) seek for the last 1s. Frame-exact start for every segment.

Per-segment overhead: ~1s of decode. For 525 cuts, ~9 min of extra processing — acceptable trade for zero drift.
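For reference, the hybrid seek arguments and the overhead estimate can be sketched as below. The two seek values mirror the `seek_pre` / `seek_fine` computation in the diff; the function names, the `preroll` parameter, and the file name are assumptions for illustration, and the overhead figure assumes the pre-roll decodes at roughly real-time speed.

```python
def hybrid_seek_args(source, seg_start, duration, preroll=1.0):
    """Build the seek-related ffmpeg arguments for one segment.

    The first -ss (before -i) is a fast input seek to `preroll` seconds
    before the cut; the second -ss (after -i) decodes and discards the
    remaining gap so the output starts at seg_start.
    """
    seek_pre = max(0.0, seg_start - preroll)  # clamp so we never seek before 0
    seek_fine = seg_start - seek_pre          # usually == preroll, smaller near the start
    return [
        "-ss", f"{seek_pre:.3f}",
        "-i", source,
        "-ss", f"{seek_fine:.3f}",
        "-t", f"{duration:.3f}",
    ]


def extra_decode_minutes(n_cuts, preroll=1.0):
    """Rough extra processing time, assuming 1 s of pre-roll costs ~1 s."""
    return n_cuts * preroll / 60.0


print(hybrid_seek_args("source.mp4", 12.5, 3.0))
print(extra_decode_minutes(525))  # 8.75
```

Codec and output arguments are omitted on purpose; only the seek portion of the command is shown.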

Test plan

  • Validated on a 3h37min source with 525 cuts (Portuguese-language educational content, 30fps, 44.1kHz source). Lip sync stable throughout the 2h output.
  • No regressions in normal use cases (smaller cut counts).

Notes

This was found while building a content production pipeline using video-use. The drift was very subtle on small edits but devastating on long-form (>1h) edits with many cuts.

🤖 Generated with Claude Code


Summary by cubic

Prevents A/V drift by preserving source frame rate and sample rate, and by using accurate frame seeking for segment extraction. This keeps lip sync stable on long edits with many cuts, with a small per-segment decode cost.

  • Bug Fixes
    • Stop forcing -r 24 and -ar 48000 in extract_segment(); removed -ar 48000 in apply_loudnorm_two_pass(). ffmpeg now passes through source FPS/sample-rate unchanged.
    • Use hybrid seek in extract_segment(): -ss to max(0, start-1s) before -i, then -ss for fine seek. Produces frame-exact starts and avoids cumulative timing errors (~1s extra decode per segment).

Written for commit 9ed65c8. Summary will update on new commits.

Two related fixes to prevent A/V drift when concatenating many segments
(observed with 525-segment edits where lip sync went out of sync over time):

1. Stop forcing -r 24 and -ar 48000 in extract_segment()
   Original source fps (e.g. 30fps) was being downsampled to 24fps,
   dropping ~20% of frames. Audio was being upsampled from 44.1kHz to
   48kHz. Both transformations introduced cumulative timing errors over
   hundreds of concatenated segments. By removing the hardcoded values,
   ffmpeg passes through the source rates intact.
   Same fix also applied to apply_loudnorm_two_pass() (-ar 48000 removed).

2. Hybrid seek (-ss before AND after -i) in extract_segment()
   Previous: -ss seg_start -i source -t duration
     Fast seek but inexact (jumps to nearest keyframe before seg_start).
     With 525 cuts, sub-frame errors accumulate.
   New: -ss (seg_start - 1.0) -i source -ss 1.0 -t duration
     Fast seek to 1s before, then accurate (frame-by-frame) seek
     for the last 1s. Frame-exact start for every segment.
     Per-segment overhead: ~1s of decode. For 525 cuts that's ~9 min
     extra processing — acceptable trade for zero drift.

Validated on a 3h37min source with 525 cuts. Lip sync now stable
throughout the full output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 1 file

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="helpers/render.py">

<violation number="1" location="helpers/render.py:189">
P1: Removing fixed output FPS/sample-rate from per-segment encodes makes the concat inputs non-uniform, which can break the concat demuxer’s `-c copy` requirements on mixed-source EDLs.</violation>
</file>


Comment thread helpers/render.py
seek_pre = max(0.0, seg_start - 1.0)
seek_fine = seg_start - seek_pre # usually 1.0 (smaller if seg_start < 1)

cmd = [

@cubic-dev-ai cubic-dev-ai Bot May 10, 2026


P1: Removing fixed output FPS/sample-rate from per-segment encodes makes the concat inputs non-uniform, which can break the concat demuxer’s -c copy requirements on mixed-source EDLs.
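One way to surface this concern before concatenation is to probe each extracted segment and compare rates. The following is a minimal sketch, not the project's code: all function names are hypothetical, and it assumes `ffprobe` is on PATH and that each segment has at most one video and one audio stream of interest.

```python
import json
import subprocess


def rates_from_probe(probe):
    """Pick (r_frame_rate, sample_rate) out of parsed ffprobe JSON."""
    fps = rate = None
    for stream in probe.get("streams", []):
        kind = stream.get("codec_type")
        if kind == "video" and fps is None:
            fps = stream.get("r_frame_rate")   # string such as "30/1"
        elif kind == "audio" and rate is None:
            rate = stream.get("sample_rate")   # string such as "44100"
    return fps, rate


def probe_rates(path):
    """Run ffprobe on one file and return its (fps, sample_rate) pair."""
    out = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "stream=codec_type,r_frame_rate,sample_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return rates_from_probe(json.loads(out))


def mismatched_segments(rates_by_path):
    """Return paths whose rates differ from the first segment's rates."""
    items = list(rates_by_path.items())
    if not items:
        return []
    reference = items[0][1]
    return [path for path, rates in items if rates != reference]


example = {
    "seg_000.mp4": ("30/1", "44100"),
    "seg_001.mp4": ("24/1", "48000"),  # e.g. cut from a different source
}
print(mismatched_segments(example))  # ['seg_001.mp4']
```

If the returned list is non-empty, the segments would likely need to be normalized to a common rate (or concatenated with re-encoding) rather than stream-copied.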

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At helpers/render.py, line 189:

<comment>Removing fixed output FPS/sample-rate from per-segment encodes makes the concat inputs non-uniform, which can break the concat demuxer’s `-c copy` requirements on mixed-source EDLs.</comment>

<file context>
@@ -178,16 +178,25 @@ def extract_segment(
+    seek_pre = max(0.0, seg_start - 1.0)
+    seek_fine = seg_start - seek_pre  # usually 1.0 (smaller if seg_start < 1)
+
     cmd = [
         "ffmpeg", "-y",
-        "-ss", f"{seg_start:.3f}",
</file context>
