perf(recorder): wrap MJPEG frames in MKV with timestamps#41191
Conversation
Previously the recorder piped MJPEG via image2pipe and repeated the last frame ~25x/sec on the Node side to fake a constant frame rate. Instead, wrap each MJPEG frame in a minimal streaming Matroska container with an explicit timestamp and let ffmpeg handle CFR frame duplication. This cuts the ffmpeg process CPU usage significantly on the common e2e case.
|
Let's not merge this before the version cut, it's risky and i'd like to soak it on CI for a while. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Test results for "tests others"3 flaky19950 passed, 672 skipped Merge workflow run. |
Test results for "tests 2"3 fatal errors, not part of any test 45 flaky93391 passed, 4206 skipped Merge workflow run. |
When several screencast frames map to the same constant-frame-rate output slot (e.g. fast bursts before dispose), we kept the first frame and only updated _lastFrame for later ones without ever piping them. The final real frame was then emitted once at timestamp+addTime, an isolated trailing cluster that ffmpeg's "-r 25" CFR conversion drops - leaving the whole video showing only the first (stale) frame. Coalesce to the most recent frame per slot instead, emitting it once the slot is complete, and at stop emit the last real frame at its own slot plus the >=1s hold. Preserves the per-slot dedup while making the last encoded frame reflect the latest pixels.
|
@yury-s please take a look at the approach, I wonder if you think it's worth the risk. |
Test results for "MCP"7289 passed, 1122 skipped Merge workflow run. |
Test results for "tests 1"2 flaky39547 passed, 743 skipped Merge workflow run. |
| // "-avioflags direct" reduces general buffering. | ||
| // We wrap each incoming MJPEG frame into a minimal Matroska stream (see ./ebml.ts) with an | ||
| // explicit timestamp, and let ffmpeg read frame timing from that stream. | ||
| // "-f matroska -i pipe:0" forces input to be read from standard input as Matroska. |
There was a problem hiding this comment.
do we compile ffmpeg with matroska supprot? I thought we included only minimal required tool set.
| }); | ||
| this._process = launchedProcess; | ||
| this._gracefullyClose = gracefullyClose; | ||
| launchedProcess.stdin!.write(writeHeader(w, h)); |
There was a problem hiding this comment.
if we want to allow providing custom ffmpeg args, this will break
| if (error) | ||
| debugLogger.log('browser', `ffmpeg failed to write: ${String(error)}`); | ||
| }); | ||
| private _emitFrame(frame: Buffer, timestampMs: number) { |
There was a problem hiding this comment.
_emitFrame drops write backpressure. The old _sendFrame awaited stdin.write(frame, cb) and chained via _lastWritePromise, throttling the producer when ffmpeg/disk fell behind. This is now fire-and-forget, so if ffmpeg can't keep up (busy page + overbooked CPU — the video_stress.js case), frames buffer unbounded in stdin instead of applying backpressure. Was peak Node RSS measured under the noise workload? If this is intentional, worth a comment saying so.
|
|
||
| // Encodes a value as an EBML variable-length size integer (vint): the leading bits select | ||
| // the byte length and are followed by the big-endian value. | ||
| function vint(value: number): Buffer { |
There was a problem hiding this comment.
I guess the way we ensure correctness of this code is to eyeball the resulting videos?
The video recorder piped MJPEG to ffmpeg via
-f image2pipeand, to fake a constant frame rate, repeated the previous JPEG on the Node side ~25×/sec — which ffmpeg then had to decode and discard.This PR instead wraps each MJPEG frame in a minimal streaming Matroska (MKV) container (new
ebml.tswriter) with an explicit timestamp and lets ffmpeg handle constant-frame-rate duplication itself.Result: −43% ffmpeg CPU on static / low-animation pages — the common e2e case. (Peak ffmpeg RSS also drops ~33% there.) The gain shrinks as pages get busier, since then most frames are genuinely distinct and have to be encoded either way.
Full measurements
Measured the ffmpeg process before/after across three page types: static (mostly still, the common e2e case), moderate (some animation), and noise (worst case: full-viewport random noise every frame).
ffmpeg CPU time:
ffmpeg peak RSS:
The Node.js side is roughly neutral: slightly lower CPU on static pages (~27ms → ~24ms), within noise on busy pages. The JPEG buffer is written by reference —
writeClusterHeader()returns only the ~15-25 byte cluster/block wrapper and never copies the frame — so we avoid the buffer-copy overhead a mux would otherwise add.Notes:
ebml.tsemits only the subset of Matroska needed for a single live MJPEG track.-avioflags directis intentionally dropped: the Matroska demuxer seeks while parsing the header, and direct mode routes those seeks to the non-seekablepipe:0instead of AVIO's buffer, which breaks header parsing.