Good bye ffmpeg #24

Closed
dignifiedquire wants to merge 285 commits into main from good-bye-ffmpeg

@dignifiedquire
Contributor

No description provided.

- Add VaapiEncoder using cros-codecs stateless H.264 encoder with VAAPI backend
- Feature flag: vaapi (linux-only, forwarded through iroh-live)
- RGBA/BGRA -> I420 -> NV12 -> VA surface -> encode -> Annex B -> length-prefixed
- 5 ignored HW tests (encode_basic, roundtrip, keyframe_interval, timestamps, config)
- Add --codec vaapi-h264 to publish.rs and rooms.rs examples
- Add linux-vaapi job to hw-accel CI workflow
- Fix pre-existing AudioBackend::new() calls in examples

The async VTB callback was reading the shared frame_count to compute
timestamps, which is racy — two callbacks could fire before the main
thread increments the counter, producing duplicate timestamps.

Instead, read the presentation timestamp directly from the
CMSampleBuffer, which VTB sets from the CMTime we pass to encode_frame.

- Implement VideoFrame trait for Nv12Frame to use new_vaapi() constructor
- Fix import paths (EncoderConfig, VA constants, Display, Image)
- Remove SurfacePool/PooledSurface usage (private in cros-libva)
- Use Image::create_from with tuple args instead of Resolution
- Fix Display::open() unwrapping (returns Option<Rc<Display>>)

TODO: run on self-hosted runners with real GPU hardware.
GitHub-hosted runners lack VideoToolbox/VAAPI support,
so HW tests are allowed to fail. Compile-check and clippy
jobs remain the real gatekeepers.

Comment thread .claude/settings.local.json Outdated

- Add mono↔stereo channel conversion in OpusAudioDecoder. The decoder
  now converts between source and target channel counts after decoding
  and resampling. This fixes garbled audio when encoding mono (mic) and
  decoding to stereo (speakers).

- Disable DTX and FEC in the Opus encoder. These features cause
  high-pitched artifacts after speech without proper decoder-side
  comfort noise handling (deferred to Phase 3).

- Add cross-channel pipeline tests verifying mono→stereo and
  stereo→mono roundtrips with energy preservation checks.
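
The mono↔stereo conversion described above could look roughly like the following. This is a minimal sketch, not the actual OpusAudioDecoder code: the function name `convert_channels` and the interleaved `f32` layout are assumptions made for illustration.

```rust
/// Converts interleaved f32 samples between mono and stereo.
/// Hypothetical helper: any other channel combination is passed through unchanged.
pub fn convert_channels(input: &[f32], from: usize, to: usize) -> Vec<f32> {
    match (from, to) {
        // Mono -> stereo: duplicate each sample into the left and right slot.
        (1, 2) => input.iter().flat_map(|&s| [s, s]).collect(),
        // Stereo -> mono: average each left/right pair.
        (2, 1) => input
            .chunks_exact(2)
            .map(|pair| (pair[0] + pair[1]) * 0.5)
            .collect(),
        // Same layout or unsupported layout: return a copy as-is.
        _ => input.to_vec(),
    }
}
```

Averaging on the way down (instead of summing) keeps the mono signal inside the original amplitude range, which is one way to avoid clipping when both channels carry the same content.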
- Add AudioOutputInfo struct and list_audio_outputs() function
- Accept output_device parameter in AudioBackend::new()
- On Linux, default to pipewire for output (matching input behavior)
- Update all call sites to pass None for output device

When switching audio input devices, the publisher previously created a new
AudioBackend internally while the old one still held the CPAL input stream
open. On macOS this caused two CPAL instances competing for the same mic,
leading to underflows and eventually no audio.

Add set_audio_backend() so the caller can replace the backend before
set_audio() runs, ensuring only one AudioBackend exists at a time.

Switch from the dav1d C library to rav1d (pure Rust port) for AV1
decoding. This eliminates the last non-system dynamic dependency
(libdav1d), enabling fully static builds on macOS and Linux.

- Replace dav1d crate with rav1d git dep (memorysafety/rav1d)
- Add minimal safe wrapper (rav1d_safe.rs) over rav1d's C API
- Rename dav1d_dec.rs to av1_dec.rs
- All 98 tests pass, 0 clippy warnings

rav1d compiles from source, so libdav1d-dev and meson are no longer
needed. Keep nasm for x86 SIMD assembly.

Replace the C++ FFI webrtc-audio-processing library with sonora, a pure
Rust port. Key improvements:

- Use sonora's deinterleaved f32 API (process_capture_f32/process_render_f32)
  to eliminate interleave/deinterleave overhead in the audio pipeline
- Per-channel ring buffers in firewheel nodes instead of interleaved buffers
- Dynamic sample rate from audio device instead of hardcoded 48kHz
- Dynamic frame size computation (sample_rate / 100) for 10ms chunks
- Direct set_stream_delay_ms() call instead of config-based approach

Depends on sonora fork with MonoVad Send bound fix.
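
The 10 ms frame sizing and the per-channel layout from the list above can be illustrated with plain Rust. This sketch does not call sonora; `frames_per_10ms` and `deinterleave` are hypothetical helper names, and tightly interleaved `f32` input is an assumption.

```rust
/// Number of frames (samples per channel) in one 10 ms chunk.
pub fn frames_per_10ms(sample_rate_hz: u32) -> usize {
    (sample_rate_hz / 100) as usize
}

/// Splits interleaved samples into one buffer per channel.
/// Trailing samples that do not form a complete frame are dropped.
pub fn deinterleave(interleaved: &[f32], channels: usize) -> Vec<Vec<f32>> {
    if channels == 0 {
        return Vec::new();
    }
    let mut planes: Vec<Vec<f32>> = (0..channels).map(|_| Vec::new()).collect();
    for frame in interleaved.chunks_exact(channels) {
        for (channel, &sample) in frame.iter().enumerate() {
            planes[channel].push(sample);
        }
    }
    planes
}
```

Deriving the chunk size from the device rate means a 44.1 kHz device yields 441-frame chunks, where a hardcoded 48 kHz assumption would have expected 480.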
…ic reconfiguration

- Replace single sample_rate_hz with separate capture_config/render_config
  (sonora::StreamConfig) on AecProcessorConfig
- Use process_*_f32_with_config() to pass StreamConfig per call, allowing
  sonora to handle sample rate changes without rebuilding the processor
- Reconfigure frame size and buffers dynamically in new_stream() when the
  audio device changes sample rate
- AecProcessorConfig::new(sample_rate, capture_channels, render_channels)
  constructor for cleaner API

Frando and others added 27 commits March 16, 2026 21:27

Reduce demo code from ~1257 to ~710 lines by extracting reusable Android
building blocks into moq-media-android: camera source (CameraFrameSource /
SharedCameraSource), EGL extension wrappers (HardwareBuffer→EGLImage→GL),
and generic JNI handle helpers (Arc<Mutex<T>> ↔ i64).
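
For readers unfamiliar with the Arc<Mutex<T>> ↔ i64 pattern, here is a minimal sketch using only the standard library. The names `borrow_handle` and `take_handle` appear in this PR, but the signatures below and the `into_handle` helper are assumptions, not the moq-media-android API.

```rust
use std::sync::{Arc, Mutex};

/// Leaks an Arc into an opaque i64 that can be stored in a Java `long`.
pub fn into_handle<T>(value: Arc<Mutex<T>>) -> i64 {
    Arc::into_raw(value) as i64
}

/// Returns a new Arc for the handle while leaving the handle valid.
///
/// Safety: `handle` must come from `into_handle::<T>` and must not have
/// been passed to `take_handle` yet.
pub unsafe fn borrow_handle<T>(handle: i64) -> Arc<Mutex<T>> {
    let ptr = handle as *const Mutex<T>;
    unsafe {
        // Bump the count first so the handle keeps its own reference.
        Arc::increment_strong_count(ptr);
        Arc::from_raw(ptr)
    }
}

/// Reclaims ownership of the handle; the handle is invalid afterwards.
///
/// Safety: same requirements as `borrow_handle`, and it may be called
/// at most once per handle.
pub unsafe fn take_handle<T>(handle: i64) -> Arc<Mutex<T>> {
    unsafe { Arc::from_raw(handle as *const Mutex<T>) }
}
```

The Java/Kotlin side only ever sees the integer. Zeroing that integer after `take_handle` is the caller's responsibility, since a stale handle would point at freed memory.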

Logcat stays in the demo as a separate module — filter string and tag are
app-specific choices, not SDK policy.

Demo cleanup: rename underscore-prefixed fields with #[allow(dead_code,
reason)], consolidate JNI helpers (read_jstring, borrow_handle,
take_handle), simplify getStatusLine (remove RefCell hack), use .and_then
chains and .inspect_err().ok() for cleaner error handling.

Call example: extract subscribe_call helper to deduplicate dial/accept,
proper dead_code annotation for remote_audio.

Update PLANS.md and platforms.md with tested Android status. Remove
OVERNIGHT.md from tracking. Add SDK extraction and demo review findings
to REVIEW.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eviations

Remove the backward-compat img() alias on VideoFrame — all 28 call sites
migrated to the existing rgba_image() method. No reason to keep the alias
since there are no external consumers yet.

Rename abbreviated identifiers for clarity:
- prev → previous_capture in PublishCaptureController (what config it tracks)
- desc → description in VTB decoder (matches the field it destructures)
- Ok::<_, n0_error::AnyError>(...) → n0_error::Ok(...) in call example

Fix ~20 doc comments to follow RFC 1574 style:
- Remove leading "A"/"An" articles (e.g. "A video decoder that..." →
  "Video decoder that...")
- Replace "This is..."/"This function..." with verb-first phrasing
- Add missing doc on PacketSource::read() trait method

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch all moq-*/hang/web-transport-* deps to git (kixelated/moq main,
n0-computer/web-transport-iroh main) and iroh-smol-kv to iroh-097 branch,
declared once in [workspace.dependencies]. This replaces the old
[patch.crates-io] approach.

iroh 0.97 breaking changes: Endpoint::builder(presets::N0), PathInfo
methods now return Option, requested_track() no longer returns Option.

The new moq-lite publishes an empty initial catalog, so subscribers see
no renditions until the publisher sets video/audio. Add ready(),
video_ready(), audio_ready() async methods to RemoteBroadcast that watch
the catalog until the requested media type appears. Update all 21
pipeline_integration tests to use these — all pass.

Gate android-demo/rust behind cfg(target_os = "android") so workspace
builds succeed on non-Android hosts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The relay bridges iroh P2P publishers to browser viewers (and vice versa)
through moq-relay. Browsers connect via WebTransport using @moq/watch and
@moq/publish web components; CLI tools connect via iroh's QUIC transport.
Both sides share a common moq-lite origin, so a broadcast published from
either side is visible to the other.

The relay binary (`iroh-live-relay`) runs two servers: a QUIC server
(noq + iroh backends via moq-relay) for media transport, and an HTTP
server for static files and the TLS fingerprint endpoint. The web app
is built with Vite and embedded at compile time via `include_dir`.

Changes:
- New `iroh-live-relay` crate with `RelayServer` (persistent data dir,
  stable iroh key), clap CLI, moq-relay QUIC accept loop, and static
  file serving
- Web app in `iroh-live-relay/web/` using @moq/watch and @moq/publish
- `iroh-moq`: add `origin_producer()`, `origin_consumer()` accessors
  and doc comments on `session_connect`/`session_accept`
- `rusty-codecs`: add blinking yellow marker to test pattern (centered
  25% area square, 15-frame on/off cycle) for E2E video verification
- `publish` example: add `--test-source`, `--relay`, `--name` flags
  for relay publishing with deterministic video
- New `subscribe_test` example for CLI-side frame reception testing
- Playwright E2E test infrastructure with relay fixture (tempdir,
  port 0), iroh-to-browser test (video content verification via
  blinking marker detection), and browser-to-iroh test
- Mark V4L2 camera tests as `#[ignore]` (need exclusive device access)
- Workspace deps: add moq-relay and moq-native (path deps to local
  moq checkout on feat/relay-lib branch)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tickets now carry an optional list of relay URLs so clients can reach
publishers through a relay when direct P2P connectivity is unavailable.
The field is backward-compatible: empty relay lists are omitted during
serialization, and deserialization defaults to an empty list when the
field is absent.

Both ticket types gain a `with_relay_urls` builder method.
`CallTicket::into_live_ticket` propagates relay URLs to the LiveTicket.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The subscribe module and standalone audio decode pipeline had no test
coverage. These gaps made it hard to refactor with confidence.

T1 — subscribe.rs coverage (8 tests):
- RemoteBroadcast: name accessor, empty broadcast state, catalog
  propagation after video publish
- VideoTrack: rendition name, decoder name, initial current_frame
- AudioTrack: rendition name, pause/resume handle cycle

T2 — AudioDecoderPipeline standalone tests (2 tests):
- Roundtrip: sine → Opus encode → pipe → Opus decode → verify non-silent
  RMS energy in captured output
- Shutdown: encoder dropped → source closes → decoder pipeline detects
  stop via stopped() future

T3 (render.rs) and T5 (PublishCaptureController) are deferred: T3 needs
GPU hardware, T5 needs capture backend mocking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The I420 (planar YUV 4:2:0) branch of rgba_image() was an
unimplemented!() stub (RC1 in REVIEW.md). V4L2 decoders can produce
I420 frames, so this path needs to work for software rendering.

Uses the existing yuv420_to_rgba_from_slices() conversion with BT.601
limited-range matrix, matching the NV12 path. Feature-gated behind
h264 or av1 (same as NV12) since the conversion depends on yuvutils-rs.
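
As an illustration of what a BT.601 limited-range I420 to RGBA conversion computes, here is a scalar sketch. It is not the yuvutils-rs implementation used by the commit: the function names are hypothetical, the coefficients are the commonly quoted rounded BT.601 values, and planes are assumed to be tightly packed (stride equal to width).

```rust
/// Converts one BT.601 limited-range YUV sample to 8-bit RGB.
pub fn yuv_to_rgb_bt601_limited(y: u8, u: u8, v: u8) -> [u8; 3] {
    let c = f32::from(y) - 16.0; // luma offset for limited range
    let d = f32::from(u) - 128.0; // Cb centered on zero
    let e = f32::from(v) - 128.0; // Cr centered on zero
    let clamp = |x: f32| x.round().max(0.0).min(255.0) as u8;
    [
        clamp(1.164 * c + 1.596 * e),
        clamp(1.164 * c - 0.392 * d - 0.813 * e),
        clamp(1.164 * c + 2.017 * d),
    ]
}

/// Converts tightly packed I420 planes to RGBA (alpha fixed at 255).
pub fn i420_to_rgba(y: &[u8], u: &[u8], v: &[u8], width: usize, height: usize) -> Vec<u8> {
    // Chroma planes are subsampled by two in both directions.
    let chroma_stride = (width + 1) / 2;
    let mut out = Vec::with_capacity(width * height * 4);
    for row in 0..height {
        for col in 0..width {
            let luma = y[row * width + col];
            let chroma = (row / 2) * chroma_stride + col / 2;
            let [r, g, b] = yuv_to_rgb_bt601_limited(luma, u[chroma], v[chroma]);
            out.extend_from_slice(&[r, g, b, 255]);
        }
    }
    out
}
```

The only difference from an NV12 path in this sketch would be how the chroma samples are fetched: I420 keeps U and V in separate planes, while NV12 interleaves them in one.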

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Several public APIs in rusty-codecs panicked on errors that can occur in
production. These are risky because they crash the process instead of
letting callers handle the failure.

RC2: VideoCodec::best_available() returns Option instead of panicking
when no codec feature is enabled. All callers updated — binaries use
.expect(), the controller propagates via Result.

RC3: alloc_va_dma_frame() returns Result instead of four .expect()
calls on VAAPI surface creation/export. The FramePool closure retains
.expect() with a comment, since the pool API requires infallible
allocation.

RC4: WgpuVideoRenderer::render() returns Result instead of panicking
on GPU frame download failure. Callers (egui, dioxus, wgpu example)
updated to handle the error.

RC5: Six .unwrap() calls on output_texture/nv12_planes in render.rs
replaced with .context()? to propagate instead of panicking when the
renderer is used before initialization.

RC6: VTB encoder's build_force_keyframe_props() and
build_source_image_attrs() return Result instead of .expect() on FFI
dictionary creation.

RC7: Added invariant comment on take_owned() explaining why the
Borrowed branch is unreachable (alloc() always creates Owned buffers).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ON15: Kotlin MainActivity.onDisconnect() now cancels the render loop
coroutine before zeroing sessionHandle and calling disconnect(). The
previous ordering created a window where the render loop could read a
freed handle. onDestroy() follows the same order.

ON16: Added performance warning and TODO to CameraHelper.yuvToRgba()
documenting the pixel-by-pixel CPU conversion bottleneck. The main
camera path already pushes NV12 planes directly, bypassing this method.
A proper fix needs libyuv or a GPU shader.

ON17: Removed the dangling `startPublish` external declaration from
IrohBridge.kt. No JNI implementation existed for it — dial() handles
publish setup atomically, eliminating the TOCTOU window.

ON18: PipeWire thread join() in Drop now uses a 2-second polling
timeout. If the PipeWire main loop stalls, the thread is detached with
a warning instead of blocking the caller indefinitely. Applied to both
screen and camera capturers.
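
A polling join with a deadline, as described for ON18, can be sketched with the standard library alone. The function name, the 10 ms poll interval, and the use of `eprintln!` in place of a tracing warning are assumptions for illustration.

```rust
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Waits for the thread to finish, polling until `timeout` elapses.
/// Returns true if the thread finished without panicking in time.
/// On timeout the handle is dropped, which detaches the thread.
pub fn join_with_timeout(handle: JoinHandle<()>, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while !handle.is_finished() {
        if Instant::now() >= deadline {
            eprintln!("thread did not stop within {:?}; detaching", timeout);
            return false;
        }
        thread::sleep(Duration::from_millis(10));
    }
    // The thread has finished, so this join returns immediately.
    handle.join().is_ok()
}
```

Calling this from `Drop` with a 2-second timeout bounds how long the caller can be blocked, at the cost of leaving a stalled thread running in the background.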

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The secret key loading pattern (read IROH_SECRET env var, or generate
and print) was duplicated across publish.rs and rooms.rs. Moving it to
the library makes it available to all examples and applications without
copy-paste.

The library version uses tracing::info instead of println for
consistency with the rest of the codebase.

(phase 7 of current prompt)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Checks off items completed in this session: subscribe/pipeline tests,
I420 conversion, error propagation in codecs and renderer, invariant
comment on take_owned().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Playwright e2e tests had several issues preventing them from passing:

- Relay startup output: tracing goes to stdout with ANSI codes, but the
  test fixture only listened on stderr. Added plain println! lines for
  machine-readable output (iroh endpoint, http port, quic port) and
  updated the fixture to listen on both streams.

- Port binding: the relay logged the CLI --bind arg ([::]:0) instead of
  the actual bound port. Now uses server.local_addr() for the real port.

- WebTransport fingerprint flow: moq-lite's dev mode fetches the TLS
  fingerprint from the same origin it connects to via WebTransport. With
  separate HTTP and QUIC ports this broke. Bind HTTP (TCP) to the same
  port as QUIC (UDP) so browsers can fetch the fingerprint and connect
  via WebTransport on the same origin.

- Video preset: test used "P360" but strum serialization is "360p".

- Canvas content detection: replaced static timeout + CSS dimension
  check with polling for actual non-black canvas pixels, eliminating
  flaky yellow detection failures from pipeline latency.

- subscribe_test: added retry logic (5 attempts, 1s backoff) for
  Live::subscribe — the publisher may not have announced the catalog
  yet when the subscriber connects through the relay.
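
The retry behaviour in the last item (a fixed number of attempts with a constant backoff) can be sketched as a small generic helper. This is a synchronous stand-in: the real subscribe_test presumably awaits Live::subscribe, and the `retry` name and signature are hypothetical.

```rust
use std::thread;
use std::time::Duration;

/// Runs `op` until it succeeds or `attempts` tries have been made,
/// sleeping `backoff` between tries. Returns the last error on failure.
pub fn retry<T, E>(
    attempts: u32,
    backoff: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error.
            Err(err) if attempt >= attempts => return Err(err),
            Err(_) => {
                attempt += 1;
                thread::sleep(backoff);
            }
        }
    }
}
```

With `retry(5, Duration::from_secs(1), ...)` the worst case is five calls and four sleeps, so a subscriber gives the publisher about four seconds to announce its catalog.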

Added relay_bridge.rs integration tests covering all four transport
bridging directions (noq↔noq, iroh↔iroh, noq→iroh, iroh→noq). These
proved the relay bridging works correctly and isolated the original
browser-to-iroh failure to a catalog format mismatch in the test, not
a transport issue.

Updated plans/relay-browser.md Step 7 with detailed ACME cert
provisioning design using instant-acme, HTTP-01 challenges via the
existing axum server, shared certs between HTTPS and QUIC, and
automatic renewal.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The DRM render loop was blocking the tokio runtime thread, starving
the async packet ingestion and decode pipeline — only the initial
buffered frames rendered, then fps dropped to 0.

Move rendering to a dedicated OS thread that receives frames via a
bounded tokio channel. The async frame pump runs on tokio, calling
next_frame().await, keeping the runtime free for network I/O and
decode. The render thread blocks on channel recv, converts to RGBA,
uploads the texture, and flips via set_crtc.
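
The shape of that split (a producer feeding a dedicated render thread through a bounded channel) is sketched below. It swaps in `std::sync::mpsc::sync_channel` for the bounded tokio channel the commit describes, and the `Frame` type and `spawn_render_thread` function are hypothetical.

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

/// Hypothetical decoded frame handed to the renderer.
pub struct Frame {
    pub index: u64,
    pub rgba: Vec<u8>,
}

/// Spawns a dedicated OS thread that drains frames from a bounded channel.
/// The join handle yields the number of frames the thread processed.
pub fn spawn_render_thread(capacity: usize) -> (SyncSender<Frame>, JoinHandle<u64>) {
    let (sender, receiver) = sync_channel::<Frame>(capacity);
    let handle = thread::spawn(move || {
        let mut presented = 0u64;
        // Blocking here only stalls the render thread, never the producer's runtime.
        // The loop ends once every sender has been dropped.
        for frame in receiver {
            // Placeholder for: convert to RGBA, upload texture, flip.
            let _ = (frame.index, frame.rgba.len());
            presented += 1;
        }
        presented
    });
    (sender, handle)
}
```

The bounded capacity provides backpressure: if rendering falls behind, the producer blocks (or, with `try_send`, can drop frames) instead of buffering without limit.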

Other fixes:
- Audio: use NullAudioBackend on Pi (no PipeWire/PulseAudio) instead
  of AudioBackend::default() which panicked on missing audio devices
- DRM init: acquire DRM master + VT graphics mode before set_crtc
- DRM flip: use set_crtc with active mode+connector (page_flip
  returned EBUSY without event loop integration)
- GBM surface: force LINEAR modifier for vc4 scanout compatibility
- Framebuffer: use add_planar_framebuffer (drmModeAddFB2) instead of
  legacy add_framebuffer which vc4 rejected
- fb-demo: add test pattern command for display testing without network

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v4l2r's Encoder abstraction had a bug where enqueue_capture_buffers()
failed immediately with NoFreeBuffer on bcm2835-codec (Pi Zero 2),
despite buffers being properly allocated. The CAPTURE queue never had
buffers queued, so the encoder accepted input frames but never produced
encoded output.

Replace v4l2r's stateful encoder with raw V4L2 ioctls matching ffmpeg's
exact h264_v4l2m2m sequence (verified via strace). Key fixes:

- Correct V4L2 buffer type constants: VIDEO_OUTPUT_MPLANE=10 (not 8),
  VIDEO_CAPTURE_MPLANE=9
- Correct V4L2 control IDs using CODEC_BASE=0x00990900
- Correct struct alignment: 4-byte padding in v4l2_format (208 bytes),
  8-byte v4l2_plane.m union on 64-bit
- G_FMT before S_FMT: bcm2835-codec rejects zeroed format structs
- Signal encoder init before waiting for first frame (was a deadlock)
- Queue first OUTPUT buffer before STREAMON (matches ffmpeg)
- recv_timeout-based encode loop to keep draining device between frames
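
The constants and layout facts from the list can be written down as follows. The numeric values mirror the Linux UAPI headers as far as I know them; the Rust type names are hypothetical, and the 208-byte size assumes a 64-bit target where the format union is pointer-aligned.

```rust
// Multi-planar buffer types (enum v4l2_buf_type).
pub const V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE: u32 = 9;
pub const V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: u32 = 10;

// Codec control IDs are offsets from the codec class base.
pub const V4L2_CTRL_CLASS_CODEC: u32 = 0x0099_0000;
pub const V4L2_CID_CODEC_BASE: u32 = V4L2_CTRL_CLASS_CODEC | 0x900;
pub const V4L2_CID_MPEG_VIDEO_BITRATE: u32 = V4L2_CID_CODEC_BASE + 207;

/// Stand-in for the `fmt` union of `struct v4l2_format`: 200 raw bytes,
/// pointer-aligned because some union members contain pointers.
#[repr(C)]
pub union V4l2FormatPayload {
    pub raw_data: [u8; 200],
    pub align: usize,
}

/// Layout-compatible sketch of `struct v4l2_format`.
/// On 64-bit targets the compiler inserts 4 bytes of padding after `type_`.
#[repr(C)]
pub struct V4l2Format {
    pub type_: u32,
    pub fmt: V4l2FormatPayload,
}
```

Checking `std::mem::size_of::<V4l2Format>()` against 208 in a unit test is a cheap way to catch the alignment mistake before it turns into an ioctl failure on the device.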

Verified on Pi Zero 2 W: produces H.264 keyframes (~12KB) and P-frames
(~1KB) from rpicam-vid YUV420 input at 720p30.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FFmpeg provides a single interface to all major hardware video
accelerators — the native rusty-codecs backends (cros-codecs VAAPI,
v4l2r) are excellent but don't yet cover every platform FFmpeg does
(NVENC, QSV, AMF, Raspberry Pi V4L2 M2M). Adding FFmpeg as an
optional backend lets us reach those platforms immediately while
keeping the native backends as the preferred path where they exist.

Encoder (FfmpegH264Encoder):
- Probes HW backends at init: V4L2 M2M (ARM first), VAAPI, NVENC,
  QSV, AMF, with libx264 software fallback.
- VAAPI path includes full device/frames context setup and NV12
  surface upload.
- Handles NAL format correctly per backend: libx264 produces
  length-prefixed (annexb=0), HW encoders produce Annex B natively.
  Converts to caller's requested format (Annex B or avcC).
- avcC extradata built from Annex B SPS/PPS for HW encoders, taken
  directly from libx264 extradata for software.
- Accepts all FrameData variants (Packed RGBA/BGRA, I420, NV12, GPU)
  with sws_scale conversion to each backend's expected input format.
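
To make the NAL format handling concrete, here is a sketch of converting an Annex B byte stream into 4-byte length-prefixed NAL units. It is not the FfmpegH264Encoder code; the function names are hypothetical and the 4-byte big-endian length is an assumption (it is the usual avcC choice).

```rust
/// Removes trailing zero bytes (padding before the next start code).
fn trim_trailing_zeros(nal: &[u8]) -> &[u8] {
    let mut end = nal.len();
    while end > 0 && nal[end - 1] == 0 {
        end -= 1;
    }
    &nal[..end]
}

/// Splits an Annex B stream into NAL payloads with start codes removed.
/// Handles both 3-byte (00 00 01) and 4-byte (00 00 00 01) start codes.
pub fn split_annex_b(data: &[u8]) -> Vec<&[u8]> {
    let mut nals = Vec::new();
    let mut start: Option<usize> = None;
    let mut i = 0;
    while i + 3 <= data.len() {
        if data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1 {
            if let Some(s) = start {
                let nal = trim_trailing_zeros(&data[s..i]);
                if !nal.is_empty() {
                    nals.push(nal);
                }
            }
            start = Some(i + 3);
            i += 3;
        } else {
            i += 1;
        }
    }
    if let Some(s) = start {
        let nal = trim_trailing_zeros(&data[s..]);
        if !nal.is_empty() {
            nals.push(nal);
        }
    }
    nals
}

/// Re-emits every NAL unit with a 4-byte big-endian length prefix.
pub fn annex_b_to_length_prefixed(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len());
    for nal in split_annex_b(data) {
        out.extend_from_slice(&(nal.len() as u32).to_be_bytes());
        out.extend_from_slice(nal);
    }
    out
}
```

Trimming trailing zeros is safe for well-formed streams because a NAL unit never ends in a zero byte; the extra zero of a 4-byte start code is absorbed the same way.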

Decoder (FfmpegVideoDecoder):
- Probes V4L2 M2M by name on ARM Linux (Raspberry Pi, Rockchip).
- Sets up VAAPI hwaccel on x86 Linux via hw_device_ctx on the generic
  h264 decoder — decoded VAAPI surfaces are transferred to software
  via av_hwframe_transfer_data before sws_scale to RGBA/BGRA.
- Falls back to generic software h264 decoder.
- Viewport downscaling via sws_scale.

Integration:
- Feature-gated behind `ffmpeg` (off by default), depends on
  ffmpeg-next 8 / ffmpeg-sys-next 8 (system FFmpeg, no static linking).
- FfmpegH264 variant added to VideoCodec enum with full dispatch in
  codec.rs and dynamic.rs.
- DynamicVideoDecoder tries FFmpeg before openh264 software when both
  features are enabled; serves as sole H.264 decoder when only
  `ffmpeg` is enabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tickets are the link between the Pi Zero publisher (QR on e-paper) and
the Android viewer. The old format (name@base32) was opaque binary,
incompatible with Android intents or URL handling.

New format: `iroh-live:<base64url(postcard(EndpointAddr))>/<name>`

Uses the same postcard serialization for EndpointAddr (preserving all
addressing info — ID, relay URLs, IP addresses) but encodes with
base64url instead of base32, and wraps it in an iroh-live: URI with
the broadcast name as a readable path segment. The old name@base32
format is still accepted for backward compatibility.
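
The outer structure of the new ticket format can be taken apart without any decoding, as in the sketch below. It only splits the URI and checks the alphabet; base64url decoding and postcard deserialization of EndpointAddr are left out, and the `split_ticket` name and the tolerance for `=` padding are assumptions.

```rust
/// Splits `iroh-live:<base64url>/<name>` into (encoded address, name).
/// Returns None for anything that does not match that shape, including
/// the legacy `name@base32` form, which would need its own parser.
pub fn split_ticket(ticket: &str) -> Option<(&str, &str)> {
    let rest = ticket.strip_prefix("iroh-live:")?;
    // The base64url alphabet contains no '/', so the first slash ends the address.
    let (encoded_addr, name) = rest.split_once('/')?;
    if encoded_addr.is_empty() || name.is_empty() {
        return None;
    }
    let is_base64url =
        |b: u8| b.is_ascii_alphanumeric() || b == b'-' || b == b'_' || b == b'=';
    if !encoded_addr.bytes().all(is_base64url) {
        return None;
    }
    Some((encoded_addr, name))
}
```

Because base64url never produces `/`, `+`, or whitespace, the whole ticket stays a single URL-safe token, which is what makes it usable in QR codes and Android intent filters.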

CallTicket now delegates serialization to LiveTicket (broadcast name
fixed to "call"), eliminating duplicate format code.

Android changes:
- Add ZXing barcode scanner (zxing-android-embedded 4.3.0)
- Add "Scan" button that launches QR scanner, auto-connects on decode
- Add iroh-live: intent filter so the app opens from any QR scanner
  or link containing an iroh-live: URI
- Handle incoming iroh-live: intents in onCreate

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Publish at 360p instead of 720p — better suited for the Pi Zero 2's
limited bandwidth and encode throughput.

Add --encoder CLI arg to publish: hardware (V4L2, default), software
(openh264), or ffmpeg. Add ffmpeg option to --decoder on watch.

Wire FfmpegH264Encoder into moq-media's VideoRenditions::add() and add
ffmpeg feature passthrough from moq-media to rusty-codecs.

The ffmpeg feature cannot be cross-compiled: ffmpeg-sys-next's build
script compiles+runs a host-side feature-check binary using target
include paths from pkg-config. The host compiler cannot parse aarch64
headers, and version differences (host 8.x vs Pi 5.x) would produce
wrong feature detection even if it could. build.sh uses default features
(V4L2 hardware encoder) for cross-compilation; ffmpeg can be built
natively on the Pi.

build.sh improvements: cross-compiler wrapper for C build scripts
(CC_aarch64_unknown_linux_gnu), arch-specific include path in sysroot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Connects to a relay's iroh endpoint and publishes the broadcast there
in addition to P2P, so browser and non-P2P clients can subscribe
through the relay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidates demo apps under demos/. Updates workspace member path,
Cargo.toml path dependencies, README paths, and Makefile.toml references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Runs on push to main/android and on PRs:
- check + clippy (workspace, all targets)
- cargo fmt --check
- cargo test --workspace + relay bridge tests (serial)
- Playwright e2e: pre-builds relay + examples, installs Chromium,
  runs browser tests with --workers=1, uploads artifacts on failure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All in-repo crates (iroh-live, iroh-moq, moq-media, rusty-codecs,
rusty-capture, moq-media-egui, moq-media-android) are now declared as
workspace dependencies in the root Cargo.toml. Member crates reference
them with `{ workspace = true }` instead of relative paths.

moq-relay and moq-native switch from local path deps (../moq/rs/...) to
git deps pointing at Frando/moq feat/relay-lib branch. This lets CI
build without cloning a sibling repo.

Merged hw-accel.yml into ci.yml — single workflow with check/test/clippy/
fmt as the gate job, then e2e-browser and hw-accel as dependent jobs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Frando
Member

Frando commented Mar 17, 2026

See #31

@Frando Frando closed this Mar 17, 2026
@Frando Frando deleted the good-bye-ffmpeg branch April 1, 2026 11:56

3 participants