RTSPWriter rgb stream produces gray output due to missing -pix_fmt yuv420p in FFmpeg command #632

@h4kyu

Description

When using RTSPWriter with the rgb annotator, the RTSP stream connects successfully and MediaMTX receives a publisher, but the stream content is gray (i.e., no scene is visible). FFmpeg and VLC both show gray output.

Root cause

In rtsp.py, the output -pix_fmt yuv420p is commented out:

"-preset", "ll",
# '-pix_fmt', 'yuv420p',   <-- never applied

Isaac Sim's RGB annotator outputs rgba frames. Without an explicit output pixel format, FFmpeg attempts an implicit conversion from rgba to a format hevc_nvenc can accept. This conversion runs at ~8 fps instead of the required 30 fps, causing the pipe buffer to fill and the pipe to break. The stream appears to publish (MediaMTX shows it online) but carries no valid frame content.

Evidence

FFmpeg stats confirming the bottleneck:

Stream #0:0: Video: hevc (Main), rgba(progressive), 1280x720
frame=3083 fps=8.1 q=7.0 drop=2977 speed=0.27x

2977 out of 3083 frames dropped. FFmpeg running at 0.27x realtime.
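
The two figures can be reproduced from the stats line itself. This is a minimal sketch: `parse_ffmpeg_stats` is a hypothetical helper (not part of FFmpeg or Isaac Sim), and the 30 fps target is the required rate stated above.

```python
import re


def parse_ffmpeg_stats(line: str) -> dict:
    """Parse key=value pairs from an FFmpeg progress line into floats."""
    stats = {}
    for key, value in re.findall(r"(\w+)=\s*([\d.]+)", line):
        stats[key] = float(value)
    return stats


line = "frame=3083 fps=8.1 q=7.0 drop=2977 speed=0.27x"
stats = parse_ffmpeg_stats(line)

target_fps = 30.0  # required stream rate from the report
drop_ratio = stats["drop"] / stats["frame"]
speed_from_fps = stats["fps"] / target_fps

print(f"dropped: {drop_ratio:.1%}")            # dropped: 96.6%
print(f"fps / target: {speed_from_fps:.2f}x")  # fps / target: 0.27x
```

The ratio of achieved to target frame rate (8.1 / 30) matches the 0.27x speed FFmpeg reports.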

Additional symptoms:

  • HEVC reference frame warnings: Could not find ref with POC -16
  • Isaac Sim GUI error: 'NoneType' object has no attribute 'pipe'
  • Stream dies with i/o timeout in MediaMTX after ~20-110 seconds

Fix

See linked PR. Three changes to the FFmpeg command in RTSPCamera.__init__:

  1. Uncomment -pix_fmt yuv420p on the output side
  2. Change deprecated -preset ll to -preset p1
  3. Replace deprecated -vsync cfr with -fps_mode vfr and remove the conflicting -r flag

After the fix:

Stream #0:0: Video: hevc (Main), yuv420p(progressive), 1280x720
fps=25-30   speed=1x   drop=~0

Isaac Sim version

5.1.0

Operating System (OS)

Ubuntu 24.04.3 LTS

GPU Name

NVIDIA L40S

GPU Driver and CUDA versions

Driver 580.126.09, CUDA 13.0

Logs

No response

Additional information

See below for detailed information regarding replicating the issue and the fix:

| Component | Details |
| --- | --- |
| Host | AWS EC2 NVIDIA Isaac Sim Development Workstation AMI (Ubuntu 24, x86_64) |
| Instance type | g6e.4xlarge |
| Isaac Sim path | /opt/IsaacSim |
| omni.replicator.core | 1.12.27+107.3.3 |
| isaacsim.replicator.agent.core | 0.7.28+107.3.3 |
| Python | cp311 (Isaac Sim bundled) |
| FFmpeg | 6.1.1-3ubuntu5 (system) |
| RTSP server | MediaMTX v1.18.2 |
| Camera prim | /World/RTSP_Camera_01 |
| Stream path | RTSPWriter_World_RTSP_Camera_01_rgb |
| GPU | NVIDIA L40S (hevc_nvenc, h264_nvenc, av1_nvenc all available) |

    Motivation

    Stream a camera from an Isaac Sim stage over RTSP from an AWS EC2 instance and view the live stream in VLC on a local machine.

    A camera prim at /World/RTSP_Camera_01 is attached to a Replicator render product. RTSPWriter (isaacsim.replicator.agent.core) captures RGB frames from that render product on each simulation step, pipes the raw pixel data into a spawned FFmpeg subprocess which encodes it as HEVC, and publishes the encoded stream to a MediaMTX RTSP server listening on TCP port 8554 on the same EC2 instance. The stream is then viewable in VLC on a local machine over the public EC2 IP. The expected stream URLs for this setup are

    # EC2-local
    rtsp://127.0.0.1:8554/RTSPWriter_World_RTSP_Camera_01_rgb
    
    # Remote VLC
    rtsp://<EC2_PUBLIC_IP>:8554/RTSPWriter_World_RTSP_Camera_01_rgb
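
The stream path appears to be the base URL followed by the camera prim path with slashes replaced by underscores, plus the annotator name. The sketch below reproduces the URLs above under that assumption; the rule is inferred from the observed URLs, not taken from the RTSPWriter source, and `stream_url` is a hypothetical helper.

```python
def stream_url(rtsp_base: str, camera_path: str, annotator: str) -> str:
    """Join base URL, flattened camera path, and annotator name."""
    # "/World/RTSP_Camera_01" -> "_World_RTSP_Camera_01"
    flattened = camera_path.replace("/", "_")
    return f"{rtsp_base}{flattened}_{annotator}"


url = stream_url("rtsp://127.0.0.1:8554/RTSPWriter", "/World/RTSP_Camera_01", "rgb")
print(url)  # rtsp://127.0.0.1:8554/RTSPWriter_World_RTSP_Camera_01_rgb
```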
    

    To set up the RTSPWriter and begin feeding frames into the pipe, the following was run in the Isaac Sim Script Editor:

    # enable extensions
    from isaacsim.core.utils.extensions import enable_extension
    enable_extension("omni.replicator.core")
    enable_extension("isaacsim.replicator.agent.core")
    enable_extension("isaacsim.replicator.writers")
    import isaacsim.replicator.agent.core.data_generation.writers.rtsp
    print("[INFO] Extensions enabled.")
    
    import omni.usd
    import omni.replicator.core as rep
    from pxr import UsdGeom
    
    stage = omni.usd.get_context().get_stage()
    camera_path = "/World/RTSP_Camera_01"
    camera_prim = stage.GetPrimAtPath(camera_path)
    
    if not camera_prim.IsValid():
        raise RuntimeError(f"Camera prim does not exist: {camera_path}")
    if not camera_prim.IsA(UsdGeom.Camera):
        raise RuntimeError(f"Prim exists but is not a USD Camera: {camera_path}")
    
    resolution = (1280, 720)
    rtsp_base = "rtsp://127.0.0.1:8554/RTSPWriter"
    
    render_product = rep.create.render_product(camera_path, resolution=resolution)
    
    # setup RTSPWriter
    writer = rep.WriterRegistry.get("RTSPWriter")
    writer.initialize(rtsp_stream_url=rtsp_base, rtsp_rgb=True, device=0)
    writer.attach([render_product])
    
    rep.orchestrator.set_capture_on_play(True)

    The stream was grey and the pipe appeared to be failing.

    Tracing Root Cause

    This section outlines how I got to the root cause and rules out other possible issues. Skip to the following section if not interested.

    a. camera and render product

    Before diagnosing the streaming path it was important to rule out that the camera itself was misconfigured, misplaced, or not seeing the scene.

    A BasicWriter test was run in the Script Editor to write RGB PNG frames to disk instead of streaming via RTSP:

    import omni.replicator.core as rep
    
    camera_path = "/World/RTSP_Camera_01"
    resolution = (1280, 720)
    
    render_product = rep.create.render_product(camera_path, resolution=resolution)
    
    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="/tmp/basic_out", rgb=True)
    writer.attach([render_product])
    
    rep.orchestrator.set_capture_on_play(True)

    Generated PNGs showed a valid camera view of the scene.

    This ruled out:

    • Camera prim path
    • Camera placement and clipping
    • Replicator render product validity

    b. MediaMTX and network

    To diagnose whether the problem was in the publishing (RTSPWriter/FFmpeg) or the receiving (MediaMTX, EC2 networking, VLC), I used FFmpeg directly to pull a frame from the stream to test the full path from MediaMTX to the consumer:

    ffmpeg -rtsp_transport tcp \
      -i rtsp://127.0.0.1:8554/RTSPWriter_World_RTSP_Camera_01_rgb \
      -frames:v 1 /tmp/rtsp_frame.jpg

    FFmpeg successfully connected, detected an HEVC stream, and wrote a valid, albeit grey, JPEG. This confirmed:

    • MediaMTX is receiving a publisher
    • RTSP path naming is correct
    • EC2 port 8554 is reachable
    • FFmpeg can decode the stream

    The grey JPEG suggested an issue with a broken or stalled encoding pipe.

    c. encoder availability

    The hevc_nvenc encoder was tested in isolation with no issues:

    ffmpeg -y -f lavfi -i testsrc=size=640x360:rate=30 \
      -t 3 -c:v hevc_nvenc -f null -
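
A lighter complementary check is to confirm that the FFmpeg build lists the encoder at all. The sketch below parses the output of `ffmpeg -hide_banner -encoders`; the helper names are hypothetical, and the sample line is only illustrative of the listing format. It shows that the encoder is compiled in, not that the GPU can actually encode, which is what the test above covers.

```python
import shutil
import subprocess


def has_encoder(encoders_output: str, name: str) -> bool:
    """Return True if `name` is listed as an encoder in `ffmpeg -encoders` output."""
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows have the form: "<flags> <encoder name> <description...>"
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False


def ffmpeg_has_encoder(name: str) -> bool:
    """Query the local ffmpeg binary, if one is on PATH."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        return False
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=False,
    )
    return has_encoder(result.stdout, name)


# Illustrative listing line (assumed format), used instead of a live ffmpeg call.
sample = " V....D hevc_nvenc           NVIDIA NVENC hevc encoder (codec hevc)\n"
print(has_encoder(sample, "hevc_nvenc"))
```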

    d. RTSPWriter source code analysis

    The remaining culprit was the RTSPWriter itself, specifically how it built the FFmpeg command and managed the pipe lifecycle. Indeed, the Isaac Sim logs show an error when running the simulation:

    /Render/PostProcess/SDGPipeline/Replicator_RTSPWriterWriter_01: [/Render/PostProcess/SDGPipeline] Assertion raised in compute - 'NoneType' object has no attribute 'pipe'
    

    RTSPWriter source located at:

    /opt/IsaacSim/extscache/isaacsim.replicator.agent.core-0.7.28+107.3.3/
      isaacsim/replicator/agent/core/data_generation/writers/rtsp.py
    
    • The FFmpeg command is built in RTSPCamera.__init__ and the subprocess is spawned once in _config_stream_cameras(), with the resulting pipe stored in self.pipe.
    • On each frame, write() looks up the camera in self.cameras, retrieves the pipe, and calls anno_streamer.pipe.stdin.write(data.tobytes()).
      • Critically, this is done with no None check on anno_streamer. If anything in that chain is broken, it crashes with 'NoneType' object has no attribute 'pipe'.
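
The missing guard can be illustrated outside Isaac Sim. This is a sketch only: `Streamer` and `safe_write` are hypothetical stand-ins, not the RTSPWriter API, and a small Python child process that drains stdin plays the role of FFmpeg.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class Streamer:
    """Hypothetical stand-in for the per-camera streamer object."""
    pipe: Optional[subprocess.Popen]


def safe_write(streamer: Optional[Streamer], payload: bytes) -> str:
    """Write payload to the encoder's stdin, or report why that is not possible."""
    if streamer is None:
        return "no streamer"
    if streamer.pipe is None:
        return "no pipe"
    exit_code = streamer.pipe.poll()
    if exit_code is not None:
        return f"exited with code {exit_code}"
    try:
        streamer.pipe.stdin.write(payload)
        streamer.pipe.stdin.flush()
    except BrokenPipeError:
        return "broken pipe"
    return "ok"


# A child that just drains stdin stands in for FFmpeg.
child = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"],
    stdin=subprocess.PIPE,
)
print(safe_write(None, b"frame"))                  # no streamer
print(safe_write(Streamer(pipe=None), b"frame"))   # no pipe
print(safe_write(Streamer(pipe=child), b"frame"))  # ok
child.stdin.close()
child.wait()
print(safe_write(Streamer(pipe=child), b"frame"))  # exited with code 0
```

Returning a status instead of raising keeps the per-frame callback alive, so the failure can be logged once rather than surfacing as an assertion in the render graph.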

    e. runtime diagnostics via Monkey-Patching

    To further diagnose, I wanted to observe exactly what values were flowing through the writer at write time to determine when exactly the pipe died.

    Since the source file is in a read-only mount under /opt/IsaacSim/extscache, runtime patches were applied via the Isaac Sim Script Editor without modifying any files.

    Note: Isaac Sim's Python runtime persists across Script Editor runs within the same session. Applying multiple conflicting patches causes unpredictable behavior (e.g., self.cameras alternating between empty and populated on every frame). Restart Isaac Sim between patch iterations to get a clean slate.
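
One way to reduce the risk of stacked patches is to make each patch idempotent, so that re-running a script replaces the previous wrapper instead of wrapping it again. The following is a sketch of that idea in plain Python; `patch_once`, `unpatch`, and the `Writer` class are hypothetical and were not used in the investigation described here.

```python
import functools


def patch_once(cls, method_name, make_wrapper):
    """Wrap cls.method_name, always starting from the pristine original."""
    current = getattr(cls, method_name)
    # If a previous run already patched this method, unwrap it first.
    original = getattr(current, "_patch_original", current)
    wrapper = functools.wraps(original)(make_wrapper(original))
    wrapper._patch_original = original
    setattr(cls, method_name, wrapper)


def unpatch(cls, method_name):
    """Restore the original method if it was patched by patch_once."""
    current = getattr(cls, method_name)
    setattr(cls, method_name, getattr(current, "_patch_original", current))


class Writer:  # hypothetical stand-in for the real writer class
    def write(self, data):
        return f"wrote {data}"


calls = []


def logging_wrapper(original):
    def wrapped(self, data):
        calls.append(data)
        return original(self, data)
    return wrapped


# Running the same patch twice must not stack two wrappers.
patch_once(Writer, "write", logging_wrapper)
patch_once(Writer, "write", logging_wrapper)
print(Writer().write("frame"), calls)  # wrote frame ['frame']
unpatch(Writer, "write")
```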

    1. check for camera lookup key mismatch

    fetch_anno_streamer() looks up the camera in self.cameras using a string key derived from the live data dict at write time. That key has to exactly match the key used to store the camera at attach time, which comes from generate_stream_camera_data(). Otherwise the lookup would return None on every single frame, causing the crash. This was worth checking before looking at the pipe itself.

    import isaacsim.replicator.agent.core.data_generation.writers.rtsp as rtsp_mod
    
    _original = rtsp_mod.RTSPWriter.stream_each_camera
    
    def _patched(self, annotator_dict):
        camera_path = str(annotator_dict["camera"])
        print(f"[DEBUG] write() key:    '{camera_path}'")
        print(f"[DEBUG] cameras keys:   {list(self.cameras.keys())}")
        print(f"[DEBUG] key match:      {camera_path in self.cameras}")
        _original(self, annotator_dict)
    
    rtsp_mod.RTSPWriter.stream_each_camera = _patched

    Output:

    [DEBUG] write() key:    '/World/RTSP_Camera_01'
    [DEBUG] cameras keys:   ['/World/RTSP_Camera_01']
    [DEBUG] key match:      True
    

    Keys matched. Camera lookup key mismatch is not the issue.

    2. pipe health check per frame

    With the lookup confirmed working, the focus shifted to the pipe object itself. Either anno_streamer.pipe could be None (meaning _config_stream_cameras never ran), or the pipe process could have already exited by the time write() tried to use it. This patch checked these conditions per frame and also logged the frame data shape and dtype to confirm Isaac was actually delivering valid pixel data.

    import isaacsim.replicator.agent.core.data_generation.writers.rtsp as rtsp_mod
    
    _original_rgb = rtsp_mod.RTSPWriter.rgb_push_stream_frame
    _frame_count = [0]
    
    def _patched_rgb(self, anno_streamer, annotator_data):
        _frame_count[0] += 1
        if anno_streamer is None:
            print(f"[DEBUG] frame {_frame_count[0]}: anno_streamer is None")
            return
        if anno_streamer.pipe is None:
            print(f"[DEBUG] frame {_frame_count[0]}: pipe is None")
            print(f"[DEBUG] command was: {anno_streamer.command}")
            return
        exit_code = anno_streamer.pipe.poll()
        if exit_code is not None:
            print(f"[DEBUG] frame {_frame_count[0]}: pipe process exited with code {exit_code}")
            return
        data = annotator_data.get("data", None)
        if data is None:
            print(f"[DEBUG] frame {_frame_count[0]}: annotator_data has no 'data' key")
            return
        print(f"[DEBUG] frame {_frame_count[0]}: pipe alive, data shape {data.shape}, dtype {data.dtype}")
        _original_rgb(self, anno_streamer, annotator_data)
    
    rtsp_mod.RTSPWriter.rgb_push_stream_frame = _patched_rgb
    print("[DEBUG] RGB frame patch applied.")

    Output:

    [DEBUG] frame 1: pipe alive, data shape (720, 1280, 4), dtype uint8
    [DEBUG] frame 2: pipe alive, data shape (720, 1280, 4), dtype uint8
    [DEBUG] frame 3: pipe alive, data shape (720, 1280, 4), dtype uint8
    [DEBUG] frame 4: pipe alive, data shape (720, 1280, 4), dtype uint8
    [DEBUG] frame 5: pipe process exited with code 224
    [DEBUG] frame 6: pipe process exited with code 224
    ...
    

    The pipe died after a few successful writes.

    Root Causes

    a. wrong output pixel format

    Isaac Sim's RGB annotator outputs frames in rgba format. hevc_nvenc needs to encode those frames into a format that HEVC supports for streaming, and [HEVC's Main profile](https://www.telestream.net/pdfs/whitepapers/HEVC.pdf) only supports 4:2:0 chroma format (yuv420p). It does not support rgba.

    The FFmpeg command built by RTSPWriter had -pix_fmt yuv420p commented out on the output side, and used the deprecated preset ll:

    File: rtsp.py, lines 377–379

    "-preset",
    "ll",
    # '-pix_fmt', 'yuv420p',   <-- commented out, never applied

    As a result, FFmpeg received rgba frames from Isaac and tried to encode them directly with hevc_nvenc. The Isaac Sim console confirmed this:

    Stream #0:0: Video: hevc (Main), rgba(progressive), 1280x720
    frame=3083 fps=8.1 q=7.0 drop=2977 speed=0.27x
    

    FFmpeg performed an implicit conversion that ran at 8.1 fps instead of the required 30 fps. I suspect the encoding pipe backed up, filled the buffer, and eventually broke.
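
A back-of-the-envelope calculation shows why a backed-up pipe is plausible. The frame size and rates come from the logs above; the 64 KiB pipe capacity is an assumption (the usual Linux default), and the model ignores any blocking or buffering inside the writer itself.

```python
WIDTH, HEIGHT = 1280, 720
BYTES_PER_PIXEL = 4            # rgba, one uint8 per channel
PRODUCER_FPS = 30.0            # rate the stream is expected to run at
CONSUMER_FPS = 8.1             # rate FFmpeg reported
PIPE_CAPACITY = 64 * 1024      # assumption: default Linux pipe buffer size

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
produced_per_s = frame_bytes * PRODUCER_FPS
consumed_per_s = frame_bytes * CONSUMER_FPS
backlog_per_s = produced_per_s - consumed_per_s

print(f"frame size:        {frame_bytes} bytes")
print(f"input rate needed: {produced_per_s / 1e6:.1f} MB/s")
print(f"backlog growth:    {backlog_per_s / 1e6:.1f} MB/s")
print(f"frames that fit in the pipe buffer: {PIPE_CAPACITY / frame_bytes:.3f}")
```

Under these assumptions a single rgba frame is far larger than the pipe buffer, so the producer can only make progress as fast as FFmpeg drains stdin.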

    b. deprecated and contradictory FFmpeg flags

    After changing the preset from the deprecated ll to p1 and -vsync cfr to -fps_mode vfr, the command still contained -r 30 (which implies CFR) alongside -fps_mode vfr (which is VFR). These are mutually exclusive, and FFmpeg reported an error in the Isaac Sim logs:

    One of -r/-fpsmax was specified together a non-CFR -vsync/-fps_mode. This is contradictory.
    Error opening output file rtsp://127.0.0.1:8554/RTSPWriter_World_RTSP_Camera_01_rgb.
    Error opening output files: Invalid argument
    [DEBUG] frame 2: pipe exited with code 234
    

    FFmpeg refused to start at all, giving exit code 234 on every frame from frame 2 onward.
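
This kind of conflict can be caught before the subprocess is spawned. The check below is a simplified sketch: `has_rate_conflict` is a hypothetical helper, and it only flags the vfr and passthrough modes rather than reproducing FFmpeg's full validation.

```python
NON_CFR_MODES = {"vfr", "passthrough"}


def output_options(command: list[str]) -> list[str]:
    """Return the arguments that follow the last input (-i <source>)."""
    last_input = max((i for i, arg in enumerate(command) if arg == "-i"), default=-2)
    return command[last_input + 2:]


def has_rate_conflict(command: list[str]) -> bool:
    """True if an output -r/-fpsmax is combined with a non-CFR -fps_mode/-vsync."""
    opts = output_options(command)
    has_rate = "-r" in opts or "-fpsmax" in opts
    for mode_flag in ("-fps_mode", "-vsync"):
        if mode_flag in opts:
            idx = opts.index(mode_flag) + 1
            if idx < len(opts) and opts[idx] in NON_CFR_MODES and has_rate:
                return True
    return False


base = ["ffmpeg", "-f", "rawvideo", "-pix_fmt", "rgba", "-s", "1280x720", "-i", "-",
        "-c:v", "hevc_nvenc", "-preset", "p1"]
tail = ["-fps_mode", "vfr", "-f", "rtsp",
        "rtsp://127.0.0.1:8554/RTSPWriter_World_RTSP_Camera_01_rgb"]

print(has_rate_conflict(base + ["-r", "30"] + tail))  # True
print(has_rate_conflict(base + tail))                 # False
```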

    Fix

    file to edit: rtsp.py

    /opt/IsaacSim/extscache/isaacsim.replicator.agent.core-0.7.28+107.3.3/
      isaacsim/replicator/agent/core/data_generation/writers/rtsp.py
    

    Note: This file is in a read-only mount under /opt/IsaacSim/extscache. You need sudo to overwrite it.

    Back up the original first (the working copy in step 1 is made from this backup):

    cp /opt/IsaacSim/extscache/isaacsim.replicator.agent.core-0.7.28+107.3.3/isaacsim/replicator/agent/core/data_generation/writers/rtsp.py \
       /home/ubuntu/rtsp.py.bak

    1. Create a working copy:

    cp /home/ubuntu/rtsp.py.bak /home/ubuntu/rtsp_patched.py

    2. Apply all four changes using this Python script:

    Save as /home/ubuntu/apply_patch.py:

    from pathlib import Path

    path = Path('/home/ubuntu/rtsp_patched.py')
    c = path.read_text()

    # preset ll -> p1  (ll is deprecated in FFmpeg 6.x)
    c = c.replace('"ll"', '"p1"')

    # uncomment -pix_fmt yuv420p on the output side
    c = c.replace(
        "            # '-pix_fmt', 'yuv420p',",
        '            "-pix_fmt",\n            "yuv420p",'
    )

    # replace deprecated -vsync cfr with -fps_mode vfr
    c = c.replace('"-vsync",', '"-fps_mode",')
    c = c.replace('"cfr",', '"vfr",')

    # remove -r flag (conflicts with -fps_mode vfr)
    c = c.replace('            "-r",\n            str(self.fps),\n', '')

    path.write_text(c)
    print('Patch applied.')

    Run it:

    python3 /home/ubuntu/apply_patch.py
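
Note that str.replace does nothing when the target string is absent and replaces every occurrence when it appears more than once, so the script can report success without having changed what was intended. A stricter variant is sketched below; `replace_exactly_once` is a hypothetical helper, and whether each target occurs exactly once in the real rtsp.py has not been checked here.

```python
def replace_exactly_once(text: str, old: str, new: str) -> str:
    """Replace `old` with `new`, failing loudly unless it occurs exactly once."""
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected 1 occurrence of {old!r}, found {count}")
    return text.replace(old, new)


# Small stand-in for the relevant part of the file.
snippet = '"-preset",\n"ll",\n"-vsync",\n"cfr",\n'
patched = replace_exactly_once(snippet, '"ll"', '"p1"')
patched = replace_exactly_once(patched, '"-vsync",', '"-fps_mode",')
patched = replace_exactly_once(patched, '"cfr",', '"vfr",')
print(patched)
```

Running the strict version a second time on an already patched file raises ValueError, which makes an accidental double application visible.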

    3. Verify the changes:

    sed -n '370,405p' /home/ubuntu/rtsp_patched.py

    Expected output around the relevant block:

    "-c:v",
    vcodec,
    "-preset",
    "p1",
    "-pix_fmt",
    "yuv420p",
    "-maxrate:v",
    f"{self.bitrate}k",
    "-bufsize:v",
    "64M",
    "-fps_mode",
    "vfr",
    "-f",
    "rtsp",

    4. Copy the patched file over the original:

    sudo cp /home/ubuntu/rtsp_patched.py \
      /opt/IsaacSim/extscache/isaacsim.replicator.agent.core-0.7.28+107.3.3/isaacsim/replicator/agent/core/data_generation/writers/rtsp.py

    summary of changes

    | Change | Before | After | Line |
    | --- | --- | --- | --- |
    | Preset | "-preset", "ll" | "-preset", "p1" | 378 |
    | Output pix_fmt | # '-pix_fmt', 'yuv420p' (commented out) | "-pix_fmt", "yuv420p" (active) | 379 |
    | fps_mode | "-vsync", "cfr" | "-fps_mode", "vfr" | 392 |
    | Remove -r flag | "-r", str(self.fps) | (removed) | 394 |
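
Put together, the patched command has roughly the following shape. This sketch uses only the flags quoted in this report; the real command in rtsp.py contains further arguments not shown here, `build_ffmpeg_command` is a hypothetical helper, and the bitrate value is a placeholder.

```python
def build_ffmpeg_command(url: str, bitrate_k: int, width: int = 1280,
                         height: int = 720, vcodec: str = "hevc_nvenc") -> list[str]:
    """Assemble an FFmpeg argument list with the four changes applied."""
    return [
        "ffmpeg",
        # input side: raw rgba frames arriving on stdin
        "-f", "rawvideo", "-pix_fmt", "rgba", "-s", f"{width}x{height}", "-i", "-",
        # output side
        "-c:v", vcodec,
        "-preset", "p1",              # was "ll"
        "-pix_fmt", "yuv420p",        # was commented out
        "-maxrate:v", f"{bitrate_k}k",
        "-bufsize:v", "64M",
        "-fps_mode", "vfr",           # was "-vsync", "cfr"; "-r" removed
        "-f", "rtsp", url,
    ]


cmd = build_ffmpeg_command(
    "rtsp://127.0.0.1:8554/RTSPWriter_World_RTSP_Camera_01_rgb",
    bitrate_k=8000,  # placeholder value, not taken from the source
)
print(" ".join(cmd))
```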

    Working Setup Script

    Run this single combined script in the Isaac Sim Script Editor after each restart. Save it as a persistent tab.

    # enable extensions
    from isaacsim.core.utils.extensions import enable_extension
    enable_extension("omni.replicator.core")
    enable_extension("isaacsim.replicator.agent.core")
    enable_extension("isaacsim.replicator.writers")
    import isaacsim.replicator.agent.core.data_generation.writers.rtsp
    print("[INFO] Extensions enabled.")
    
    # setup
    import omni.usd
    import omni.replicator.core as rep
    from pxr import UsdGeom
    
    stage = omni.usd.get_context().get_stage()
    camera_path = "/World/RTSP_Camera_01"
    camera_prim = stage.GetPrimAtPath(camera_path)
    
    if not camera_prim.IsValid():
        raise RuntimeError(f"Camera prim does not exist: {camera_path}")
    if not camera_prim.IsA(UsdGeom.Camera):
        raise RuntimeError(f"Prim exists but is not a USD Camera: {camera_path}")
    
    resolution = (1280, 720)
    rtsp_base = "rtsp://127.0.0.1:8554/RTSPWriter"
    
    render_product = rep.create.render_product(camera_path, resolution=resolution)
    
    writer = rep.WriterRegistry.get("RTSPWriter")
    writer.initialize(rtsp_stream_url=rtsp_base, rtsp_rgb=True, device=0)
    
    writer.attach([render_product])
    rep.orchestrator.set_capture_on_play(True)

    Note: RTSPWriter requires explicit extension loading and a direct import of its module to register itself.

    order of operations each session

    1. Verify MediaMTX is running: ss -tlnp | grep 8554
    2. If not running: mediamtx &
    3. Open Isaac Sim, load stage
    4. Open Script Editor, run the combined script above
    5. Press Play
    6. Open VLC → rtsp://<EC2_PUBLIC_IP>:8554/RTSPWriter_World_RTSP_Camera_01_rgb
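
Step 1 can also be checked from Python, for example at the top of the setup script. This is a sketch with a hypothetical helper name; it only confirms that something accepts TCP connections on the port, not that the listener is MediaMTX.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if not is_listening("127.0.0.1", 8554):
    print("[WARN] nothing is listening on 127.0.0.1:8554; start MediaMTX first")
```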

    Secondary Issues (Open)

    a. stop/play cycling breaks the stream

    When the Isaac Sim timeline is stopped and Play is pressed again, the stream does not recover automatically.

    on_final_frame() is called on Stop, which kills the FFmpeg pipe and clears self.cameras. When Play resumes, attach() and _config_stream_cameras() are not called again, so self.cameras stays empty. All subsequent write() calls find anno_streamer = None.

    Current workaround: Re-run the setup script after each Stop before pressing Play again.

    Proper fix: Register a timeline event listener that calls writer.attach() again when play resumes, or patch on_final_frame() to preserve the camera configuration for reconnection.
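
The second option (preserving the camera configuration across Stop) can be modelled in isolation. The class below is a toy model only: `StreamSession`, its methods, and `fake_spawn` are hypothetical and do not correspond to the RTSPWriter API or to any Isaac Sim timeline callback.

```python
class StreamSession:
    """Toy model of a writer that can recover after Stop."""

    def __init__(self, spawn):
        self._spawn = spawn    # callable that starts one encoder per camera
        self._configs = {}     # camera path -> config, kept across Stop
        self.cameras = {}      # camera path -> live encoder handle

    def attach(self, camera_path, config):
        self._configs[camera_path] = config
        self.cameras[camera_path] = self._spawn(camera_path, config)

    def on_stop(self):
        # Tear down live encoders but keep the configuration.
        self.cameras.clear()

    def on_play(self):
        # Re-create any encoder that a previous Stop tore down.
        for camera_path, config in self._configs.items():
            if camera_path not in self.cameras:
                self.cameras[camera_path] = self._spawn(camera_path, config)


spawned = []


def fake_spawn(camera_path, config):
    """Stand-in for launching FFmpeg; records the call and returns a handle."""
    spawned.append(camera_path)
    return f"encoder:{camera_path}"


session = StreamSession(fake_spawn)
session.attach("/World/RTSP_Camera_01", {"fps": 30})
session.on_stop()
session.on_play()
print(session.cameras)
```

The point of the model is that Stop discards only the live handles, so Play has enough information to rebuild them without the setup script being re-run.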

    b. frame dropping due to Isaac render rate exceeding stream rate

    Even with /app/runLoops/main/rateLimitFrequency set to 30, Isaac still delivers frames to the pipe at a higher rate:

    frame 1900: pipe alive, shape (720, 1280, 4), dtype uint8  drop=1075  speed=1x
    

    ~57% of frames are dropped. Isaac's render loop and physics loop run at different rates. When physics hasn't stepped but the render loop has, a duplicate frame is produced. FFmpeg detects and drops duplicates in VFR mode. This does not cause instability but reduces encoding efficiency.

    c. incorrect hwaccel flags for raw pipe input

    For NVENC annotators, RTSPWriter adds decode-side GPU memory flags to the FFmpeg command (rtsp.py, around line 347):

    self.command += ["-hwaccel", "cuda", "-hwaccel_output_format", "cuda", "-hwaccel_device", str(self.device)]

    such that the FFmpeg command becomes:

    # the three -hwaccel* flags on the first option line are the problem
    ffmpeg \
      -hwaccel cuda -hwaccel_output_format cuda -hwaccel_device 0 \
      -f rawvideo -pix_fmt rgba -s 1280x720 -i - \
      -c:v hevc_nvenc -preset ll \
      ...

    [Official FFmpeg docs](https://ffmpeg.org/ffmpeg.html) indicate these flags are intended for decoder GPU memory management. When the input is a raw pipe from stdin, there is no decoder. They did not cause a crash on the tested EC2 instance, but when tested standalone with hevc_nvenc and a raw pipe, FFmpeg selected the non-standard gbrp pixel format instead of yuv420p:

    Stream #0:0: Video: hevc (Rext), gbrp(pc, gbr/unknown/unknown, progressive), 640x360
    

    After adding -pix_fmt yuv420p on the output side, the encoder correctly produced hevc (Main) yuv420p regardless. The hwaccel flags are now harmless but still wrong.
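
If these flags were to be removed for raw pipe input, a filter over the argument list would be enough. This is a sketch of a possible cleanup, not one of the changes described in the fix above; `strip_hwaccel` is a hypothetical helper.

```python
HWACCEL_FLAGS = {"-hwaccel", "-hwaccel_output_format", "-hwaccel_device"}


def strip_hwaccel(command: list[str]) -> list[str]:
    """Drop the decode-side hwaccel flags and the value that follows each one."""
    cleaned = []
    skip_value = False
    for arg in command:
        if skip_value:
            skip_value = False
            continue
        if arg in HWACCEL_FLAGS:
            skip_value = True  # each of these flags takes exactly one value
            continue
        cleaned.append(arg)
    return cleaned


command = ["ffmpeg",
           "-hwaccel", "cuda", "-hwaccel_output_format", "cuda", "-hwaccel_device", "0",
           "-f", "rawvideo", "-pix_fmt", "rgba", "-s", "1280x720", "-i", "-",
           "-c:v", "hevc_nvenc"]
print(strip_hwaccel(command))
```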

    Ruled Out

    | Component | Evidence |
    | --- | --- |
    | Camera prim path | BasicWriter PNG showed valid camera view |
    | Camera placement / clipping | BasicWriter PNG showed expected scene content |
    | Cesium scene visibility | BasicWriter PNG showed Cesium terrain correctly |
    | Replicator render product | BasicWriter confirmed valid render product |
    | RGB annotator | BasicWriter wrote correct rgba uint8 data |
    | NVENC encoder availability | hevc_nvenc test ran at 10x speed, 90 frames, no errors |
    | MediaMTX server | FFmpeg connected locally; stream published correctly |
    | RTSP path naming | Correct path confirmed in MediaMTX logs |
    | EC2 networking / port 8554 | FFmpeg captured frame locally; VLC connected remotely |
    | Camera lookup key mismatch | Patch confirmed both keys = /World/RTSP_Camera_01, match: True |
    | hwaccel flags (this system) | FFmpeg accepted the command with hwaccel flags and published stream |

    Maintenance Notes

    • The patched rtsp.py will be overwritten if Isaac Sim updates isaacsim.replicator.agent.core. Keep /home/ubuntu/rtsp_patched.py as a reference.
    • After any Isaac Sim extension update, re-check the new rtsp.py for the same four issues (preset, pix_fmt, fps_mode, -r flag) and re-apply the patch using apply_patch.py.
    • The setup script must be re-run after every Isaac Sim restart. Python runtime state does not persist across restarts.
    • MediaMTX must be running before the setup script is executed. Verify with ss -tlnp | grep 8554. Start with mediamtx & if not running.
    • VLC startup latency of 3–5 seconds is normal: RTSP handshake + waiting for the first HEVC keyframe.
    • Do not run multiple versions of the setup script in the same Isaac Sim session without restarting. Stale patches and writer registrations accumulate in the Python runtime and produce confusing behavior.
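
The re-check after an extension update can be partly automated by scanning the new rtsp.py for the four patterns. This is a sketch: `find_known_issues` is a hypothetical helper, and the patterns are based on the snippets quoted in this report, so they may need adjusting if the file's formatting changes.

```python
import re

CHECKS = {
    "deprecated preset ll": re.compile(r'"-preset",\s*"ll"'),
    "output pix_fmt commented out": re.compile(r"#\s*'-pix_fmt',\s*'yuv420p'"),
    "deprecated -vsync": re.compile(r'"-vsync"'),
    "-r flag present": re.compile(r'"-r",'),
}


def find_known_issues(source: str) -> list[str]:
    """Return the names of the known issues whose pattern appears in source."""
    return [name for name, pattern in CHECKS.items() if pattern.search(source)]


# Stand-ins for the relevant part of the file before and after patching.
original = (
    '"-preset",\n"ll",\n'
    "# '-pix_fmt', 'yuv420p',\n"
    '"-vsync",\n"cfr",\n"-r",\nstr(self.fps),\n'
)
fixed = '"-preset",\n"p1",\n"-pix_fmt",\n"yuv420p",\n"-fps_mode",\n"vfr",\n'

print(find_known_issues(original))
print(find_known_issues(fixed))  # []
```

In practice the source text would be read from the installed rtsp.py, for example with pathlib.Path(...).read_text().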
