Add ST2110 source operator for SMPTE ST 2110-20 video ingestion#1447
bhashemian merged 12 commits into nvidia-holoscan:main from
Conversation
Greptile Summary

This PR adds a new ST2110 source operator. The implementation is well-structured, and the prior review round addressed several significant issues (removed dead
Confidence Score: 3/5
Important Files Changed
Sequence Diagram

sequenceDiagram
participant Net as Network (UDP Multicast)
participant Sock as Linux Socket
participant Drain as drain_socket()
participant Parse as parse_and_copy_packet()
participant PinnedMem as Pinned Memory (triple-buffer)
participant Emit as emit_next_frame()
participant GPU as GPU Buffers
participant Raw as raw_output port
participant RGBA as rgba_output port
participant NV12 as nv12_output port
Note over Net,Sock: PeriodicCondition triggers compute() at framerate×10
Sock->>Drain: recvmmsg() batch (up to 256 pkts)
Drain->>Parse: parse_and_copy_packet(pkt_data, pkt_size)
Parse->>Parse: Parse RTP header (seq, timestamp, marker)
Parse->>Parse: Parse ST2110 header (line_num, line_offset)
Parse->>PinnedMem: memcpy payload → frame_buffer[assembling_idx]
alt Marker bit set OR new timestamp
Parse->>PinnedMem: frame_complete = true
end
Note over Emit,GPU: Phase 2: emit oldest complete frame
Emit->>PinnedMem: find_next_frame_to_emit() (RTP timestamp order)
Emit->>GPU: cudaMemcpyAsync(gpu_raw_buffer, pinned_memory)
Emit->>PinnedMem: memset(0) + mark_frame_emitted()
Emit->>Raw: emit_raw_frame() → GXF Tensor on raw_output
alt enable_rgba_output && YCbCr-4:2:2-10bit
Emit->>GPU: convert_ycbcr422_10bit_to_rgba CUDA kernel
Emit->>RGBA: VideoBuffer + Tensor on rgba_output
end
alt enable_nv12_output && YCbCr-4:2:2-10bit
Emit->>GPU: convert_ycbcr422_10bit_to_nv12 CUDA kernel
Emit->>NV12: VideoBuffer + Tensor on nv12_output
end
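For context on the "Parse RTP header (seq, timestamp, marker)" step in the diagram above, here is a minimal sketch of the RFC 3550 fixed-header fields involved. The struct and function names are illustrative only, not the operator's actual API.

```cpp
#include <cstddef>
#include <cstdint>

// The three RTP fixed-header fields the diagram mentions (RFC 3550).
struct RtpHeader {
  uint16_t seq;        // sequence number, big-endian on the wire
  uint32_t timestamp;  // media timestamp, big-endian on the wire
  bool marker;         // M bit: last packet of a frame for ST 2110-20
};

// Returns false if the buffer is too small for the 12-byte fixed header.
bool parse_rtp_header(const uint8_t* pkt, size_t len, RtpHeader* out) {
  if (len < 12) return false;
  out->marker = (pkt[1] & 0x80) != 0;                       // M bit
  out->seq = (uint16_t(pkt[2]) << 8) | pkt[3];              // bytes 2-3
  out->timestamp = (uint32_t(pkt[4]) << 24) | (uint32_t(pkt[5]) << 16) |
                   (uint32_t(pkt[6]) << 8) | pkt[7];        // bytes 4-7
  return true;
}
```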
Last reviewed commit: efffc68
Adds st2110_source operator and st2110_demo application for receiving SMPTE ST 2110-20 uncompressed video over multicast UDP using standard Linux sockets with CUDA pinned memory and GPU-accelerated color conversion.

Features:
- Triple-buffer architecture with recvmmsg() batch receive
- CUDA kernels for YCbCr-4:2:2-10bit to RGBA/NV12 conversion
- Python bindings via pybind11
- Multiple output modes (raw, RGBA, NV12)
- Holohub CLI integration with run modes

Based on PR nvidia-holoscan#1346 by @Ollie-cj and @xrlabs-cj with review feedback from @bhashemian addressed:
- Removed separate shell scripts in favor of holohub CLI modes
- Added in-class member initializers for FrameBuffer struct
- Fixed CMAKE_PREFIX_PATH in BUILD.md
- Removed holoviz from operator dependencies (SDK built-in)
- Consistent multicast address across config files
- Clean rebase on upstream/main (no unrelated file changes)

Signed-off-by: Mikael Brudfors <mbrudfors@nvidia.com>
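As background for the YCbCr-4:2:2-10bit conversion kernels listed above, here is a plain-C++ sketch of the ST 2110-20 / RFC 4175 pgroup layout: 5 octets carry two pixels as CB, Y0, CR, Y1, 10 bits each, packed big-endian. This is an illustration of the packing, not the kernel's actual code.

```cpp
#include <cstdint>

// Unpack one 5-byte pgroup into its four 10-bit components.
// Bit layout (MSB first): CB(10) Y0(10) CR(10) Y1(10) = 40 bits.
void unpack_ycbcr422_10bit(const uint8_t p[5],
                           uint16_t* cb, uint16_t* y0,
                           uint16_t* cr, uint16_t* y1) {
  // Assemble the 40 bits into one integer, big-endian byte order.
  const uint64_t bits = (uint64_t(p[0]) << 32) | (uint64_t(p[1]) << 24) |
                        (uint64_t(p[2]) << 16) | (uint64_t(p[3]) << 8) | p[4];
  *cb = uint16_t((bits >> 30) & 0x3FF);  // shared chroma (blue)
  *y0 = uint16_t((bits >> 20) & 0x3FF);  // luma, first pixel
  *cr = uint16_t((bits >> 10) & 0x3FF);  // shared chroma (red)
  *y1 = uint16_t(bits & 0x3FF);          // luma, second pixel
}
```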
a525824 to af574b4
@bhashemian All comments addressed and all tests passed, please review
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough

Adds a new ST2110Source operator (C++/CUDA) with Python bindings, CUDA kernels, docs, metadata, and build entries, plus a new st2110_demo application (Python) with configs, README, and top-level app registration.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 11
🧹 Nitpick comments (2)
operators/st2110_source/python/st2110_source_pydoc.hpp (1)
51-100: Document `periodic_condition` and `tensor_allocator` too.
`operators/st2110_source/st2110_source.cpp` registers both parameters in `setup()`, but they're absent from this public doc block. That leaves the Python API docs incomplete for users who need custom scheduling or allocator injection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@operators/st2110_source/python/st2110_source_pydoc.hpp` around lines 51-100: The pydoc is missing entries for the two parameters registered in setup(): add documented Parameter sections for periodic_condition (type: callable or Scheduling::PeriodicCondition/optional; explain it accepts a callable used to drive operator scheduling or an object implementing the periodic condition interface and state the default behavior) and tensor_allocator (type: TensorAllocator-like object or callable/optional; explain it injects a custom allocator for output tensors and note the default allocator used), referencing the exact symbols periodic_condition and tensor_allocator and the setup() registration so readers can locate where they're wired; keep wording and style consistent with the existing parameter docs (type, short description, and default value).

operators/st2110_source/metadata.json (1)
29-34: Make the hardware requirement match the advertised support matrix.
`platforms` advertises both `x86_64` and `aarch64`, and `operators/st2110_source/BUILD.md` also documents an x86_64/ConnectX path, but this metadata currently reads as Thor-only. A NIC-focused requirement here would keep the catalog metadata aligned with the rest of the PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@operators/st2110_source/metadata.json` around lines 29 - 34, The hardware requirement in the metadata's "requirements" -> "hardware" entry currently names "NVIDIA Thor AGX with MGBE" which is too specific; update the "name" and "description" fields to a NIC-focused requirement (e.g., "Network interface with multicast support" or "NIC with multicast support (e.g., Thor AGX or Mellanox ConnectX)"), so it isn't Thor-only, and ensure this hardware text aligns with the advertised "platforms" (x86_64 and aarch64) and the operator docs (BUILD.md); edit the "name" and "description" keys under "requirements" -> "hardware" accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@applications/st2110_demo/CMakeLists.txt`:
- Around line 16-17: Add a lightweight CTest smoke test for the new project
st2110_demo by enabling CTest (call enable_testing()) and using
find_package(Python3 COMPONENTS Interpreter) in CMakeLists to locate the
interpreter; then add_test (e.g., NAME st2110_demo_smoke) that runs the
package's CLI Python entrypoint (import the module used by the app's CLI or run
its __main__ import) and asserts the expected config selection/exit code. Ensure
the test uses add_test and references the discovered Python3_EXECUTABLE so CI
runs a non-hardware import/config smoke check for st2110_demo.
In `@applications/st2110_demo/metadata.json`:
- Around line 35-42: Add a top-level "dependencies" entry to this application's
metadata.json mirroring the operator prerequisite declared in "requirements":
create a "dependencies" object (or array if your manifest uses that shape) that
lists the operator "st2110_source" with version "1.0" so schema validation and
automation can resolve the operator; update the same file's metadata fields if
missing to ensure the manifest includes required fields such as name, authors,
language, version, changelog, holoscan_sdk, platforms, tags, ranking, and the
new dependencies entry.
In `@applications/st2110_demo/st2110_demo_app.py`:
- Around line 1-15: This module (st2110_demo_app.py) is executable but lacks a
shebang; add a Python shebang as the very first line (e.g. use /usr/bin/env
python3) so the script runs correctly when executed directly, ensure it is the
first byte in the file and keep the file executable permission.
In `@operators/st2110_source/CMakeLists.txt`:
- Around line 18-42: The CMakeLists currently links the imported target
CUDA::cuda_driver without ensuring it exists; add a call to
find_package(CUDAToolkit REQUIRED) before you call target_link_libraries so the
CUDA::cuda_driver target is created. Update the project(...) scope where CUDA is
declared and insert find_package(CUDAToolkit REQUIRED) prior to linking the
st2110_source target (referencing the st2110_source target and the
target_link_libraries invocation) so standalone builds (Option B) will find the
CUDA driver target.
In `@operators/st2110_source/metadata.json`:
- Around line 1-39: The metadata.json for operator "ST2110SourceOp" is missing
the required dependencies field; add a top-level "dependencies" key inside the
"operator" object per the operator.schema.json (e.g., either an empty array []
if there are no runtime dependencies or a list of dependency objects/strings
used by this operator), ensuring the new "dependencies" entry follows the same
structure as other operators in the repo and includes any package names/versions
or hardware/software requirements as appropriate.
In `@operators/st2110_source/README.md`:
- Around line 184-192: The README's permanent sysctl example shows raw
"net.core.rmem_max=..." lines inside a bash code block which will error if
copy-pasted; update the block so it is executable or clearly a config snippet:
either replace the two lines with a tee/echo command that appends to
/etc/sysctl.conf (e.g. echo "net.core.rmem_max=..." | sudo tee -a
/etc/sysctl.conf) or move the two lines into a plain-text/config code block and
keep the temporary commands in the bash block; reference the exact keys
net.core.rmem_max and net.core.rmem_default and the target file /etc/sysctl.conf
when making the change.
In `@operators/st2110_source/st2110_kernels.cu`:
- Around line 40-46: Untrusted packet-derived indices (line_num, line_offset,
payload_len from line_numbers/line_offsets/line_lengths) are used to compute
out_ptr and can cause out-of-bounds GPU writes; either validate on the host
before kernel launch that line_num < frame_height and (uint32_t)line_offset +
(uint32_t)payload_len <= line_stride for every pkt_idx, or extend the kernel API
to accept frame_height and line_stride and perform those checks inside the
kernel (skip the write or clamp/return on failure) before computing out_ptr and
performing any write from in_pkt to out; update callers that prepare
line_numbers/line_offsets/line_lengths accordingly so invalid packets are
rejected or handled safely.
- Around line 125-126: The wrappers st2110_frame_reassembly(),
st2110_simple_reorder(), convert_ycbcr422_10bit_to_rgba(), and
convert_ycbcr422_10bit_to_nv12() currently launch kernels and return void;
change them to detect and propagate kernel launch errors by either returning
cudaError_t or calling cudaGetLastError() immediately after the <<<...>>> launch
and returning that error (or converting to your internal error type). Locate the
kernel invocations (st2110_frame_reassembly_kernel<<<...>>>(...),
st2110_simple_reorder_kernel<<<...>>>(...),
convert_ycbcr422_10bit_to_rgba_kernel<<<...>>>(...),
convert_ycbcr422_10bit_to_nv12_kernel<<<...>>>(...)) and add a post-launch check
that captures cudaGetLastError() and returns or logs it so callers can handle
failures synchronously instead of discovering them at cudaStreamSynchronize().
In `@operators/st2110_source/st2110_source.hpp`:
- Around line 127-133: The inline comment on the FrameBuffer member
std::atomic<bool> frame_complete is too long and exceeds the 100-character
limit; shorten the comment or place it on the previous line as a separate
comment. Edit the FrameBuffer struct (symbol: FrameBuffer) and update the
declaration std::atomic<bool> frame_complete{false}; by moving its explanatory
comment ("Frame reassembly complete (atomic for thread safety)") to its own line
above the declaration or compressing it to a shorter phrase such as "Frame
reassembly complete" so no line exceeds 100 characters.
- Around line 194-197: The header declares cudaStream_t (and GPU buffer pointers
like gpu_raw_buffer_, gpu_rgba_buffer_, gpu_nv12_buffer_) but doesn't include
the CUDA runtime types; add `#include` <cuda_runtime.h> to the top of
st2110_source.hpp so cudaStream_t is defined when this header is parsed (remove
reliance on st2110_kernels.cuh being included only in the .cpp).
---
Nitpick comments:
In `@operators/st2110_source/metadata.json`:
- Around line 29-34: The hardware requirement in the metadata's "requirements"
-> "hardware" entry currently names "NVIDIA Thor AGX with MGBE" which is too
specific; update the "name" and "description" fields to a NIC-focused
requirement (e.g., "Network interface with multicast support" or "NIC with
multicast support (e.g., Thor AGX or Mellanox ConnectX)"), so it isn't
Thor-only, and ensure this hardware text aligns with the advertised "platforms"
(x86_64 and aarch64) and the operator docs (BUILD.md); edit the "name" and
"description" keys under "requirements" -> "hardware" accordingly.
In `@operators/st2110_source/python/st2110_source_pydoc.hpp`:
- Around line 51-100: The pydoc is missing entries for the two parameters
registered in setup(): add documented Parameter sections for periodic_condition
(type: callable or Scheduling::PeriodicCondition/optional; explain it accepts a
callable used to drive operator scheduling or an object implementing the
periodic condition interface and state the default behavior) and
tensor_allocator (type: TensorAllocator-like object or callable/optional;
explain it injects a custom allocator for output tensors and note the default
allocator used), referencing the exact symbols periodic_condition and
tensor_allocator and the setup() registration so readers can locate where
they’re wired; keep wording and style consistent with the existing parameter
docs (type, short description, and default value).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 51ed7103-ce79-47ba-b409-492e29d60981
⛔ Files ignored due to path filters (2)
- applications/st2110_demo/st2110_demo.jpg is excluded by !**/*.jpg
- operators/st2110_source/st2110_source.jpg is excluded by !**/*.jpg
📒 Files selected for processing (21)
- applications/CMakeLists.txt
- applications/st2110_demo/CMakeLists.txt
- applications/st2110_demo/README.md
- applications/st2110_demo/metadata.json
- applications/st2110_demo/st2110_demo_app.py
- applications/st2110_demo/st2110_demo_config.yaml
- applications/st2110_demo/st2110_demo_config_nv12_only.yaml
- applications/st2110_demo/st2110_demo_config_rgba_only.yaml
- operators/CMakeLists.txt
- operators/st2110_source/BUILD.md
- operators/st2110_source/CMakeLists.txt
- operators/st2110_source/README.md
- operators/st2110_source/metadata.json
- operators/st2110_source/python/CMakeLists.txt
- operators/st2110_source/python/__init__.py
- operators/st2110_source/python/st2110_source.cpp
- operators/st2110_source/python/st2110_source_pydoc.hpp
- operators/st2110_source/st2110_kernels.cu
- operators/st2110_source/st2110_kernels.cuh
- operators/st2110_source/st2110_source.cpp
- operators/st2110_source/st2110_source.hpp
Actionable comments posted: 6
♻️ Duplicate comments (6)
operators/st2110_source/st2110_kernels.cu (3)
39-57: ⚠️ Potential issue | 🔴 Critical: Validate packet-derived coordinates before the device write.
`line_numbers`, `line_offsets`, and `line_lengths` are consumed as trusted offsets here. A malformed descriptor can move `out_ptr` past the frame buffer and turn one bad packet into an out-of-bounds global write. Pass frame-height/line-bytes into the kernel, or reject invalid descriptors in the host wrapper before launch.
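A host-side sketch of the validation this review round asks for, covering both the packet-descriptor bounds check here and the zero/odd frame-shape check flagged later for the conversion kernels. All names are assumed from the review text rather than taken from the actual sources.

```cpp
#include <cstdint>

// Reject any packet descriptor that would index outside the frame buffer.
// Widening to uint32_t before the add prevents 16-bit wraparound.
bool descriptor_in_bounds(uint16_t line_num, uint16_t line_offset,
                          uint16_t payload_len,
                          uint32_t frame_height, uint32_t line_stride) {
  if (line_num >= frame_height) return false;
  return static_cast<uint32_t>(line_offset) +
         static_cast<uint32_t>(payload_len) <= line_stride;
}

// Shape check for the 4:2:2 conversion paths: both need even width
// (2-pixel groups); NV12 also needs even height (2x2 chroma blocks).
bool valid_conversion_shape(uint32_t width, uint32_t height, bool nv12) {
  if (width == 0 || height == 0) return false;
  if (width % 2 != 0) return false;
  if (nv12 && height % 2 != 0) return false;
  return true;
}
```

Running these checks in the host wrapper before every launch keeps the kernels free of per-packet branching while still guaranteeing no out-of-bounds write.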
113-126: ⚠️ Potential issue | 🟠 Major: Surface CUDA launch failures at the wrapper boundary.
All four wrappers enqueue kernels and return without checking `cudaGetLastError()`. Invalid launch configs or bad device pointers will only surface later on some unrelated synchronize call, which makes frame-level error handling much harder.

Also applies to: 129-140, 317-333, 336-352
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@operators/st2110_source/st2110_kernels.cu` around lines 113 - 126, The wrapper functions (e.g., st2110_frame_reassembly which launches st2110_frame_reassembly_kernel) currently enqueue kernels and return without checking the launch status; after each <<<...>>> kernel launch add a check for the CUDA launch error (use cudaPeekAtLastError() or cudaGetLastError()), capture the cudaError_t, and handle/propagate it instead of silently returning (adjust the wrapper signature to return cudaError_t or log+abort consistently). Apply the same pattern to the other wrapper functions that launch kernels (the ones covering the other ranges referenced) so any invalid launch config or bad device pointer is detected and reported immediately at the wrapper boundary.
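A hedged sketch of the pattern this prompt describes. The wrapper name and kernel signature are assumed from the review, not taken from the actual sources.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical kernel declaration; the real one lives in st2110_kernels.cu.
__global__ void st2110_simple_reorder_kernel(const uint8_t* in, uint8_t* out,
                                             int num_pkts);

// Wrapper that propagates the launch status instead of returning void.
cudaError_t st2110_simple_reorder(const uint8_t* in, uint8_t* out,
                                  int num_pkts, cudaStream_t stream) {
  const int block = 256;
  const int grid = (num_pkts + block - 1) / block;
  st2110_simple_reorder_kernel<<<grid, block, 0, stream>>>(in, out, num_pkts);
  // cudaPeekAtLastError() reports launch-configuration errors immediately,
  // without clearing the sticky error state; the caller can then decide
  // whether to drop the frame or abort.
  return cudaPeekAtLastError();
}
```

Callers then check the returned `cudaError_t` per frame instead of discovering failures at a later `cudaStreamSynchronize()`.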
317-352: ⚠️ Potential issue | 🔴 Critical: Reject zero/odd frame shapes before launching the conversion kernels.
These kernels assume 2-pixel groups, and the NV12 path also writes 2x2 blocks. `width == 0` / `height == 0` yields an invalid 0-block launch, odd `width` misindexes the second pixel, and odd `height` truncates the last NV12 row.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@operators/st2110_source/st2110_kernels.cu` around lines 317-352: Before launching the CUDA kernels, validate the frame shape: in both convert_ycbcr422_10bit_to_rgba and convert_ycbcr422_10bit_to_nv12 ensure width and height are non-zero and width is even; additionally ensure height is even for the NV12 path. If the checks fail, return early (or propagate an error) and do not call the kernel launch (convert_ycbcr422_10bit_to_rgba_kernel / convert_ycbcr422_10bit_to_nv12_kernel); this prevents 0-block launches and misindexed/partial pixel writes for odd dimensions.

operators/st2110_source/st2110_source.hpp (1)
21-32: ⚠️ Potential issue | 🔴 Critical: Include the CUDA runtime types in the public header.
This header declares `cudaStream_t` on Line 198 but does not include `<cuda_runtime.h>`, so any translation unit that includes `st2110_source.hpp` before another CUDA header will see an undefined type.

Suggested fix

```diff
 #include <array>
 #include <atomic>
+#include <cuda_runtime.h>
 #include <memory>
 #include <string>
 #include <vector>
```

Also applies to: 194-198
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@operators/st2110_source/st2110_source.hpp` around lines 21-32: The header st2110_source.hpp declares/uses the CUDA runtime type cudaStream_t (around the ST2110 source declarations) but doesn't include <cuda_runtime.h>; add `#include <cuda_runtime.h>` to the public header (st2110_source.hpp) so any translation unit that includes this header has the CUDA runtime types available; locate the cudaStream_t usage and add the include near the other standard includes at the top of the file to resolve the undefined type.

applications/st2110_demo/st2110_demo_app.py (1)
1-1: ⚠️ Potential issue | 🟡 Minor: Add a shebang if this file remains executable.
The file is marked executable, so invoking it directly will not dispatch to Python correctly without a shebang.
Suggested fix

```diff
+#!/usr/bin/env python3
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026, XRlabs. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/st2110_demo/st2110_demo_app.py` at line 1: This file is marked executable but lacks a shebang; add a proper shebang (e.g. using /usr/bin/env python3) as the very first line of st2110_demo_app.py so invoking the script directly runs under Python, and if the file should not be executable instead remove the executable bit; ensure the shebang is the first line and commit that change.

operators/st2110_source/metadata.json (1)
27-29: ⚠️ Potential issue | 🟠 Major: Add the required `dependencies` field.
This operator object still omits `dependencies`, which is expected by the operator metadata schema used in this repo.

Suggested fix

```diff
   "tags": ["Networking and Distributed Computing", "Video Capture", "ST 2110", "Broadcast"],
   "ranking": 1,
+  "dependencies": [],
   "requirements": {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@operators/st2110_source/metadata.json` around lines 27 - 29, The operator metadata is missing the required "dependencies" field; update the operator object in metadata.json (near the "tags", "ranking", and "requirements" keys) to include a "dependencies" property (e.g., an array, even if empty) so the metadata conforms to the repo's operator schema; ensure the key is named exactly "dependencies" and placed at the same object level as "requirements" and "ranking".
🧹 Nitpick comments (1)
operators/st2110_source/BUILD.md (1)
7-7: Consider fixing markdown formatting for better readability.

The static analysis tool flagged a few formatting inconsistencies:
- Lines 7, 12: Headings should have a blank line below them
- Lines 108, 119, 127: Code blocks should be surrounded by blank lines
These are minor style issues that improve document readability and markdown linting compliance.
📝 Proposed formatting fixes
For heading spacing (lines 7, 12):
```diff
 ### Hardware Requirements
+
 - NVIDIA GPU (any CUDA-capable GPU)
```

```diff
 ### Software Requirements
+
 - Ubuntu 22.04 or later
```

For code block spacing (line 108):

````diff
 **Solution**: Install CMake 3.24+
+
 ```bash
````

For code block spacing (line 119):

````diff
 **Solution**: Add CUDA to PATH
+
 ```bash
````

For code block spacing (line 127):

````diff
 **Solution**: Verify Holoscan SDK installation
+
 ```bash
````

Also applies to: 12-12, 108-108, 119-119, 127-127
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@operators/st2110_source/BUILD.md` at line 7, Add missing blank lines to satisfy Markdown lint rules: insert a blank line after the headings (e.g., the "Hardware Requirements" heading and the following top-level heading) so there is one empty line between each heading and the subsequent paragraph, and ensure each fenced code block around the examples at the locations flagged (the bash blocks around the snippets currently at lines shown by the linter) is preceded and followed by a blank line so there is an empty line before the opening ```bash and after the closing ```; update BUILD.md accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@operators/st2110_source/BUILD.md`:
- Around line 15-16: The BUILD.md CUDA requirement is inconsistent with
metadata.json and CMakeLists.txt; inspect the operator source for CUDA API usage
(look for cuMemHostAlloc, cudaMemcpyAsync, cudaStreamCreate, cudaStreamDestroy,
cudaMallocHost, etc.) to confirm the minimum CUDA that supports all used APIs;
if the code only uses APIs available since CUDA 11.x, update BUILD.md to "CUDA
Toolkit 11.x or later" and add a matching software requirement entry in
metadata.json listing CUDA Toolkit 11.x (with tested versions if available); if
CUDA 12.x is actually required, add CUDA Toolkit 12.x to metadata.json's
software requirements and add a brief rationale in BUILD.md naming the specific
APIs or features that necessitate CUDA 12, and also add a CUDA version
constraint to CMakeLists.txt (e.g., set a minimum CUDA version) so all three
places (BUILD.md, metadata.json, CMakeLists.txt) consistently reflect the
validated minimum.
In `@operators/st2110_source/metadata.json`:
- Around line 10-12: Update the metadata.json to use full semantic versioning:
change the "version" value from "1.0" to "1.0.0" and rename the corresponding
key in the "changelog" object from "1.0" to "1.0.0" (keeping the existing
changelog text); ensure any future metadata entries follow major.minor.patch
format for both the "version" field and changelog keys.
In `@operators/st2110_source/st2110_source.cpp`:
- Around line 716-742: find_next_frame_to_emit is leaving stale completed
buffers (older than expected_timestamp) pinned forever; update its loop to
detect when a completed buffer has diff < 0 (using the same int32_t wraparound
comparison of ts - expected_timestamp) and clear/reset that buffer (e.g., reset
frame_buffers_[i].frame_complete and related metadata) so late frames are
dropped and the slot is freed, and increment frames_dropped_ (or otherwise
account for the dropped stale frame) when you discard it; keep the existing
logic that selects newer frames (newest_idx) and computing frames_skipped using
timestamp_delta.
- Around line 210-212: Validate that rtp_header_size_ is not larger than
max_packet_size_.get() and that incoming packet lengths are at least
rtp_header_size_ + sizeof(ST2110Header) before computing payload_size_ or
reading headers; update initialize() to check and reject/adjust invalid
rtp_header_size_ (referencing payload_size_, max_packet_size_, rtp_header_size_,
ST2110Header) and update parse_and_copy_packet() to verify packet.size() (or
remaining_size) >= rtp_header_size_.get() + sizeof(ST2110Header) before
subtracting or casting, and return/log an error when the validation fails to
avoid underflow and out-of-bounds reads.
- Around line 893-899: The tensors created with wrapMemory(...) are aliasing the
single reusable device buffers (gpu_raw_buffer_, gpu_rgba_buffer_,
gpu_nv12_buffer_) which get overwritten on the next compute(), causing
downstream consumers (rgba_output, nv12_output) to see torn/duplicated frames;
fix by ensuring each emitted Entity owns its own device memory: either allocate
a new device buffer per emitted tensor and copy the current contents into it
before calling wrapMemory, or change the wrapMemory deleter to capture and hold
a shared ownership (e.g. a shared_ptr or GXF object) of the backing GPU buffer
so the buffer is not reused until the tensor is destroyed; locate uses of
wrapMemory in compute(), and update the calls that reference gpu_raw_buffer_,
gpu_rgba_buffer_, and gpu_nv12_buffer_ (also at the other noted occurrences) to
perform one of these approaches.
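Two of the st2110_source.cpp items above, the wraparound-safe timestamp comparison in find_next_frame_to_emit and the pre-parse packet-length check, can be sketched in plain C++. The names and the ST 2110 header size are assumptions drawn from the review text, not the actual sources.

```cpp
#include <cstddef>
#include <cstdint>

// Wraparound-safe RTP timestamp comparison: interpreting the unsigned
// difference as int32_t keeps ordering correct across the 2^32 wrap.
int32_t rtp_timestamp_diff(uint32_t ts, uint32_t expected) {
  return static_cast<int32_t>(ts - expected);
}

// A completed frame older than the expected timestamp should be dropped.
bool is_stale(uint32_t ts, uint32_t expected) {
  return rtp_timestamp_diff(ts, expected) < 0;
}

// Hypothetical ST 2110 payload-header size, for illustration only.
constexpr size_t kSt2110HeaderSize = 8;

// Prove the packet holds both headers before subtracting; checking first
// avoids size_t underflow. Returns the payload size, or -1 on a short packet.
ptrdiff_t safe_payload_size(size_t pkt_size, size_t rtp_header_size) {
  const size_t min_size = rtp_header_size + kSt2110HeaderSize;
  if (pkt_size < min_size) return -1;
  return static_cast<ptrdiff_t>(pkt_size - min_size);
}
```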
---
Duplicate comments:
In `@applications/st2110_demo/st2110_demo_app.py`:
- Line 1: This file is marked executable but lacks a shebang; add a proper
shebang (e.g. using /usr/bin/env python3) as the very first line of
st2110_demo_app.py so invoking the script directly runs under Python, and if the
file should not be executable instead remove the executable bit — ensure the
shebang is the first line and commit that change.
In `@operators/st2110_source/metadata.json`:
- Around line 27-29: The operator metadata is missing the required
"dependencies" field; update the operator object in metadata.json (near the
"tags", "ranking", and "requirements" keys) to include a "dependencies" property
(e.g., an array, even if empty) so the metadata conforms to the repo's operator
schema; ensure the key is named exactly "dependencies" and placed at the same
object level as "requirements" and "ranking".
In `@operators/st2110_source/st2110_kernels.cu`:
- Around line 113-126: The wrapper functions (e.g., st2110_frame_reassembly
which launches st2110_frame_reassembly_kernel) currently enqueue kernels and
return without checking the launch status; after each <<<...>>> kernel launch
add a check for the CUDA launch error (use cudaPeekAtLastError() or
cudaGetLastError()), capture the cudaError_t, and handle/propagate it instead of
silently returning (adjust the wrapper signature to return cudaError_t or
log+abort consistently). Apply the same pattern to the other wrapper functions
that launch kernels (the ones covering the other ranges referenced) so any
invalid launch config or bad device pointer is detected and reported immediately
at the wrapper boundary.
- Around line 317-352: Before launching the CUDA kernels, validate the frame
shape: in both convert_ycbcr422_10bit_to_rgba and convert_ycbcr422_10bit_to_nv12
ensure width and height are non‑zero and width is even; additionally ensure
height is even for the NV12 path. If the checks fail, return early (or propagate
an error) and do not call the kernel launch
(convert_ycbcr422_10bit_to_rgba_kernel / convert_ycbcr422_10bit_to_nv12_kernel);
this prevents 0-block launches and misindexed/partial pixel writes for odd
dimensions.
In `@operators/st2110_source/st2110_source.hpp`:
- Around line 21-32: The header st2110_source.hpp declares/uses the CUDA runtime
type cudaStream_t (around the ST2110 source declarations) but doesn't include
<cuda_runtime.h>; add `#include` <cuda_runtime.h> to the public header
(st2110_source.hpp) so any translation unit that includes this header has the
CUDA runtime types available; locate the cudaStream_t usage and add the include
near the other standard includes at the top of the file to resolve the undefined
type.
---
Nitpick comments:
In `@operators/st2110_source/BUILD.md`:
- Line 7: Add missing blank lines to satisfy Markdown lint rules: insert a blank
line after the headings (e.g., the "Hardware Requirements" heading and the
following top-level heading) so there is one empty line between each heading and
the subsequent paragraph, and ensure each fenced code block around the examples
at the locations flagged (the bash blocks around the snippets currently at lines
shown by the linter) is preceded and followed by a blank line so there is an
empty line before the opening ```bash and after the closing ```; update BUILD.md
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fcc5c30f-d8d8-4846-9322-bb7bc8669409
⛔ Files ignored due to path filters (2)
- applications/st2110_demo/st2110_demo.jpg is excluded by !**/*.jpg
- operators/st2110_source/st2110_source.jpg is excluded by !**/*.jpg
📒 Files selected for processing (21)
- applications/CMakeLists.txt
- applications/st2110_demo/CMakeLists.txt
- applications/st2110_demo/README.md
- applications/st2110_demo/metadata.json
- applications/st2110_demo/st2110_demo_app.py
- applications/st2110_demo/st2110_demo_config.yaml
- applications/st2110_demo/st2110_demo_config_nv12_only.yaml
- applications/st2110_demo/st2110_demo_config_rgba_only.yaml
- operators/CMakeLists.txt
- operators/st2110_source/BUILD.md
- operators/st2110_source/CMakeLists.txt
- operators/st2110_source/README.md
- operators/st2110_source/metadata.json
- operators/st2110_source/python/CMakeLists.txt
- operators/st2110_source/python/__init__.py
- operators/st2110_source/python/st2110_source.cpp
- operators/st2110_source/python/st2110_source_pydoc.hpp
- operators/st2110_source/st2110_kernels.cu
- operators/st2110_source/st2110_kernels.cuh
- operators/st2110_source/st2110_source.cpp
- operators/st2110_source/st2110_source.hpp
✅ Files skipped from review due to trivial changes (1)
- operators/st2110_source/README.md
🚧 Files skipped from review as they are similar to previous changes (12)
- applications/st2110_demo/metadata.json
- applications/st2110_demo/st2110_demo_config_nv12_only.yaml
- applications/CMakeLists.txt
- applications/st2110_demo/README.md
- applications/st2110_demo/st2110_demo_config.yaml
- applications/st2110_demo/CMakeLists.txt
- applications/st2110_demo/st2110_demo_config_rgba_only.yaml
- operators/CMakeLists.txt
- operators/st2110_source/python/__init__.py
- operators/st2110_source/CMakeLists.txt
- operators/st2110_source/python/CMakeLists.txt
- operators/st2110_source/st2110_kernels.cuh
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@applications/st2110_demo/README.md`:
- Around line 233-236: Replace the hard-coded relative link to the operator
README for the "ST2110 Source Operator" with the HoloHub glossary term reference
used across READMEs; locate the line containing the link text "ST2110 Source
Operator" and change its URL target from
"../../operators/st2110_source/README.md" to the appropriate glossary term
reference (the standardized glossary key for the st2110_source operator) so the
README uses the shared glossary term per holohub conventions.
- Around line 42-62: The README currently uses a hardcoded path
"st2110_demo_config.yaml" with its application path; replace that direct path
reference with the project's glossary term for the HoloHub config location (the
canonical config file location token defined in the project's README) so the doc
uses the glossary for HoloHub-specific locations; also update the same hardcoded
reference in the GStreamer receiver section (the paragraph that references the
st2110_demo_config.yaml file) to use the same glossary term so both occurrences
are consistent.
- Around line 35-40: Update the build command in the README to target the demo
application instead of the operator: replace the `./holohub build st2110_source`
invocation with `./holohub build st2110_demo` (keeping the
`--configure-args='-DHOLOHUB_BUILD_PYTHON=ON'` flag) so the demo app
(st2110_demo) is built and its operator dependencies are pulled in
automatically.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c33b87ea-9eec-42e7-afed-4db345374291
📒 Files selected for processing (2)
- applications/st2110_demo/README.md
- operators/st2110_source/BUILD.md
🚧 Files skipped from review as they are similar to previous changes (1)
- operators/st2110_source/BUILD.md
Signed-off-by: Mikael Brudfors <mbrudfors@nvidia.com>
Signed-off-by: Mikael Brudfors <mbrudfors@nvidia.com>
Actionable comments posted: 1
♻️ Duplicate comments (1)
operators/st2110_source/README.md (1)
184-192: ⚠️ Potential issue | 🟡 Minor

Make the permanent `sysctl` example executable or change the fence type. Lines 189-191 are still raw `/etc/sysctl.conf` entries inside a `bash` block, so that snippet fails if someone copy-pastes it as shown.

📝 Suggested doc fix

```diff
 # Permanent (add to /etc/sysctl.conf)
-net.core.rmem_max=268435456
-net.core.rmem_default=268435456
+echo "net.core.rmem_max=268435456" | sudo tee -a /etc/sysctl.conf
+echo "net.core.rmem_default=268435456" | sudo tee -a /etc/sysctl.conf
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@operators/st2110_source/README.md` around lines 184 - 192, The README shows permanent sysctl entries in a bash code block which is non-executable; update the snippet so it is actually runnable or change the fence type: either replace the fenced bash block with an ini/plain block (remove the bash header) showing the two lines "net.core.rmem_max=268435456" and "net.core.rmem_default=268435456", or keep the bash fence and provide executable commands to append them (e.g. use echo 'net.core.rmem_max=268435456' | sudo tee -a /etc/sysctl.conf && echo 'net.core.rmem_default=268435456' | sudo tee -a /etc/sysctl.conf && sudo sysctl -p) so readers can copy-paste; adjust the README entry around the permanent settings to reference the chosen change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@operators/st2110_source/README.md`:
- Around line 57-70: Add documentation entries for the missing public parameters
in the README table: document header_size (type uint16_t, purpose: L2–L4 header
byte size used to compute payload offset, default 42), rtp_header_size (type
uint16_t, purpose: RTP header length to strip/interpret, include actual default
from st2110_source.cpp), and enable_reorder_kernel (type bool, purpose:
enable/disable packet reordering kernel, include actual default). For each
parameter add Type, Description, and Default columns consistent with the
existing table and update the README usage/requirements section with a short
note about when to change header_size and enable_reorder_kernel.
---
Duplicate comments:
In `@operators/st2110_source/README.md`:
- Around line 184-192: The README shows permanent sysctl entries in a bash code
block which is non-executable; update the snippet so it is actually runnable or
change the fence type: either replace the fenced bash block with an ini/plain
block (remove the bash header) showing the two lines
"net.core.rmem_max=268435456" and "net.core.rmem_default=268435456", or keep the
bash fence and provide executable commands to append them (e.g. use echo
'net.core.rmem_max=268435456' | sudo tee -a /etc/sysctl.conf && echo
'net.core.rmem_default=268435456' | sudo tee -a /etc/sysctl.conf && sudo sysctl
-p) so readers can copy-paste; adjust the README entry around the permanent
settings to reference the chosen change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6cc8ddc4-00db-4771-8e61-88e6bb315ab7
⛔ Files ignored due to path filters (1)
- `applications/st2110_demo/st2110_demo.jpg` is excluded by `!**/*.jpg`
📒 Files selected for processing (2)
- applications/st2110_demo/README.md
- operators/st2110_source/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
- applications/st2110_demo/README.md
Signed-off-by: B Hashemian <3968947+bhashemian@users.noreply.github.com>
Signed-off-by: B Hashemian <3968947+bhashemian@users.noreply.github.com>
- Fix hardcoded cudaSetDevice(0): use cudaGetDevice() to respect current GPU context
- Add early return on RGBA/NV12 format mismatch instead of logging and continuing
- Remove unused header_size_ parameter from operator, bindings, docs, and configs
- Remove dead enable_reorder_kernel_ parameter (CUDA reorder kernels never invoked)
- Fix build command in README: ./holohub build st2110_demo
- Add rp_filter troubleshooting tip to demo README

Signed-off-by: Mikael Brudfors <mbrudfors@nvidia.com>
for more information, see https://pre-commit.ci
bhashemian left a comment
LGTM! Thanks for this contribution!
```cpp
 * - 5 bytes for 2 pixels (pgroup)
 * - Layout: Cb0 Y0 Cr0 Y1 Cb0[1:0]Y0[1:0]Cr0[1:0]Y1[1:0] (last byte has LSBs)
 * - 10-bit values packed as: high 8 bits in first 4 bytes, low 2 bits in 5th byte
 *
 * Output format: RGBA 8-bit (4 bytes per pixel)
 *
 * ITU-R BT.709 conversion:
 * R = 1.164 * (Y - 64) + 1.793 * (Cr - 512)
 * G = 1.164 * (Y - 64) - 0.213 * (Cb - 512) - 0.533 * (Cr - 512)
 * B = 1.164 * (Y - 64) + 2.112 * (Cb - 512)
 *
 * Each thread processes 2 pixels (1 pgroup).
```
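For illustration, the 5-byte pgroup layout described in the docstring above can be unpacked in a few lines of Python. This is a sketch, not the PR's CUDA kernel; the function names are hypothetical, and the byte layout assumed is the one the comment states (high 8 bits of Cb0, Y0, Cr0, Y1 in bytes 0-3, low 2 bits of each packed MSB-first into byte 4):

```python
def unpack_pgroup(pg: bytes):
    """Unpack one 5-byte ST 2110-20 YCbCr-4:2:2-10bit pgroup into
    four 10-bit values (Cb0, Y0, Cr0, Y1)."""
    lsb = pg[4]
    # Each 10-bit value = (high byte << 2) | its 2-bit field from byte 4.
    cb0 = (pg[0] << 2) | ((lsb >> 6) & 0x3)
    y0 = (pg[1] << 2) | ((lsb >> 4) & 0x3)
    cr0 = (pg[2] << 2) | ((lsb >> 2) & 0x3)
    y1 = (pg[3] << 2) | (lsb & 0x3)
    return cb0, y0, cr0, y1


def pack_pgroup(cb0: int, y0: int, cr0: int, y1: int) -> bytes:
    """Inverse of unpack_pgroup, useful for round-trip checks."""
    lsb = ((cb0 & 0x3) << 6) | ((y0 & 0x3) << 4) | ((cr0 & 0x3) << 2) | (y1 & 0x3)
    return bytes([cb0 >> 2, y0 >> 2, cr0 >> 2, y1 >> 2, lsb])
```

A round trip through `pack_pgroup`/`unpack_pgroup` is an easy sanity check when modifying the conversion kernels.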
Color conversion coefficients in documentation don't match actual implementation
The docstring for convert_ycbcr422_10bit_to_rgba_kernel (and mirrored in st2110_kernels.cuh) states:
```cpp
R = 1.164 * (Y - 64) + 1.793 * (Cr - 512)
G = 1.164 * (Y - 64) - 0.213 * (Cb - 512) - 0.533 * (Cr - 512)
B = 1.164 * (Y - 64) + 2.112 * (Cb - 512)
```
These are legacy limited-range coefficients in 10-bit absolute units, and none of them match the actual implementation, which correctly uses normalized BT.709 coefficients:
```cpp
float R = Y_norm + 1.5748f * Cr_norm;
float G = Y_norm - 0.1873f * Cb_norm - 0.4681f * Cr_norm;
float B = Y_norm + 1.8556f * Cb_norm;
```

(where 1.5748 = 2*(1-Kr) for Kr = 0.2126, the correct BT.709 luma coefficient for red).
The documented formula's 1.793 coefficient doesn't correspond to any standard BT.709, BT.2020, or SMPTE 240M definition. A developer modifying the kernel in the future could be misled into "correcting" the implementation to match the wrong documented formula.
Consider updating the comments to describe the normalized form actually used:
```cpp
// BT.709 normalized:
// R' = Y'_norm + 1.5748 * Cr'_norm
// G' = Y'_norm - 0.1873 * Cb'_norm - 0.4681 * Cr'_norm
// B' = Y'_norm + 1.8556 * Cb'_norm
// where Y'_norm in [0,1] and Cb'/Cr'_norm in [-0.5, +0.5]
```
The same stale formula is repeated in st2110_kernels.cuh's docblock for convert_ycbcr422_10bit_to_rgba.
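The normalized coefficients the reviewer cites can be rederived from the BT.709 luma weights with the standard Y'CbCr-to-R'G'B' inverse. A short Python check (an illustration based on the formulas quoted above, not code from the PR) confirms they round to the constants used in the kernel:

```python
# BT.709 luma weights (Rec. ITU-R BT.709).
Kr, Kb = 0.2126, 0.0722
Kg = 1.0 - Kr - Kb  # 0.7152

# Standard inverse for normalized Cb'/Cr' in [-0.5, +0.5]:
#   R' = Y' + 2*(1-Kr) * Cr'
#   G' = Y' - 2*Kb*(1-Kb)/Kg * Cb' - 2*Kr*(1-Kr)/Kg * Cr'
#   B' = Y' + 2*(1-Kb) * Cb'
cr_to_r = 2 * (1 - Kr)
cb_to_g = 2 * Kb * (1 - Kb) / Kg
cr_to_g = 2 * Kr * (1 - Kr) / Kg
cb_to_b = 2 * (1 - Kb)

print(round(cr_to_r, 4), round(cb_to_g, 4), round(cr_to_g, 4), round(cb_to_b, 4))
# → 1.5748 0.1873 0.4681 1.8556
```

This also makes clear why the documented 1.793 coefficient is wrong for BT.709: 2*(1-Kr) would equal 1.793 only for Kr ≈ 0.1035, which matches no standard colorimetry.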
```cpp
// Parse RTP header
const RTPHeader* rtp = reinterpret_cast<const RTPHeader*>(rtp_start);
uint16_t sequence_number = ntohs(rtp->sequence_number);
uint32_t timestamp = ntohl(rtp->timestamp);
bool marker = (rtp->marker_payload_type & 0x80) != 0;

// Skip to ST2110 header (after RTP header)
const uint8_t* st2110_start = rtp_start + rtp_header_size_.get();
remaining_size -= rtp_header_size_.get();
```
RTP CC and extension fields not validated — variable-length RTP headers silently misparse ST2110 offset

The ST2110 header start is computed by advancing exactly `rtp_header_size_.get()` bytes past the RTP header:

```cpp
const uint8_t* st2110_start = rtp_start + rtp_header_size_.get();
```

This assumes the RTP header is always the fixed 12-byte minimum. However, per RFC 3550 the RTP header has two variable-length sections:

- CSRC list — when the `CC` field (low 4 bits of byte 0) is non-zero, `CC × 4` additional bytes follow the fixed header.
- Header extension — when the `X` bit (bit 4 of byte 0) is set, a 4-byte extension header plus `extension_length × 4` bytes follow the CSRC list.

If any ST 2110 sender (or a software test tool like GStreamer's `rtpvrawpay`) emits packets with CC > 0 or X = 1, the byte currently interpreted as `st2110->extended_sequence_number` will actually be CSRC or extension data, producing corrupted `line_number`, `line_offset`, and `line_length` values for every packet.
Consider adding a basic validation at the top of `parse_and_copy_packet()`:

```cpp
uint8_t cc = rtp->version_padding_extension & 0x0F;
bool has_extension = (rtp->version_padding_extension & 0x10) != 0;
size_t rtp_actual_size = rtp_header_size_.get() + cc * 4;
if (has_extension && packet_size > rtp_actual_size + 4) {
  const uint8_t* ext_header = rtp_start + rtp_actual_size;
  uint16_t ext_len = ntohs(*reinterpret_cast<const uint16_t*>(ext_header + 2));
  rtp_actual_size += 4 + ext_len * 4;
}
// Use rtp_actual_size instead of rtp_header_size_.get() as offset
```

At a minimum, warn when CC or X are non-zero so that users of GStreamer or other software senders are alerted.
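The variable-length header arithmetic the reviewer describes can be sketched in Python. The field layout follows RFC 3550 (byte 0 carries V, P, X, and the 4-bit CC; the extension header holds a 16-bit length in 32-bit words); the function name is hypothetical and this is an illustration, not the operator's code:

```python
import struct

RTP_FIXED_HEADER_SIZE = 12  # version/flags, payload type, seq, timestamp, SSRC


def rtp_header_size(packet: bytes) -> int:
    """Return the full RTP header length, including CSRC list and extension.

    Per RFC 3550: CC (low 4 bits of byte 0) CSRC identifiers of 4 bytes
    each follow the fixed header; if the X bit (0x10 in byte 0) is set,
    a 4-byte extension header plus ext_len 32-bit words follow that.
    """
    first = packet[0]
    cc = first & 0x0F
    has_extension = (first & 0x10) != 0
    size = RTP_FIXED_HEADER_SIZE + cc * 4
    if has_extension:
        # Extension header: 16-bit profile id, then 16-bit length in words.
        (ext_len,) = struct.unpack_from("!H", packet, size + 2)
        size += 4 + ext_len * 4
    return size
```

A packet with CC = 1 and X = 1 (extension length 2 words) yields 12 + 4 + 4 + 8 = 28 bytes, so a parser that always skips 12 bytes would read 16 bytes into the wrong place before reaching the ST2110 payload header.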
…ia-holoscan#1447)

## Summary

Adds `st2110_source` operator and `st2110_demo` application for receiving SMPTE ST 2110-20 uncompressed video over multicast UDP.

- **st2110_source operator**: Receives ST 2110-20 RTP packets via Linux sockets, reassembles frames using triple-buffer with CUDA pinned memory, and provides GPU-accelerated YCbCr-4:2:2 to RGBA/NV12 conversion
- **st2110_demo application**: Python demo with Holoviz visualization, multiple run modes (default/rgba/nv12) via holohub CLI
- **Python bindings** via pybind11

This is a clean rebase of PR nvidia-holoscan#1346 by @Ollie-cj and @xrlabs-cj onto current `main`, with review feedback from @bhashemian addressed:

- Renamed `st2110_test` to `st2110_demo` throughout
- Replaced separate shell scripts with holohub CLI run modes in `metadata.json`
- Simplified documentation to use `./holohub run st2110_demo`
- Removed DPDK references from documentation (code uses Linux sockets)
- Added in-class member initializers for `FrameBuffer` struct
- Fixed `CMAKE_PREFIX_PATH` in BUILD.md
- Removed `holoviz` from operator dependencies (SDK built-in)
- Fixed operator metadata first tag to approved category
- Added GStreamer-based software testing instructions to README
- Consistent multicast address across config files

## Test plan

- [x] Builds in Holoscan v3.11.0 container (`nvcr.io/nvidia/clara-holoscan/holoscan:v3.11.0-cuda12-dgpu`) on aarch64 (IGX Orin)
- [x] Operator initializes correctly (stream format, socket creation, CUDA pinned memory allocation)
- [x] Multicast UDP reception verified (packets received from GStreamer sender over ConnectX-7)
- [x] RTP/ST2110 header parsing verified (line numbers, offsets extracted correctly)
- [x] Python bindings import successfully
- [x] `git diff upstream/main --stat` shows only st2110 files (no 630-file noise from original PR)
- [ ] End-to-end frame display with professional ST 2110 source (requires Blackmagic or similar hardware)

## Summary by CodeRabbit

- **New Features**
  - ST2110 Source operator for multicast ST 2110-20 ingest with raw, RGBA, and NV12 outputs and Python bindings.
  - New st2110_demo application with configurable run modes (default/rgba/nv12) and visualizer options.
- **Documentation**
  - Added comprehensive READMEs, build guide, metadata, and multiple sample YAML configs (quick start, network/video parameters, troubleshooting).

---------

Signed-off-by: Mikael Brudfors <mbrudfors@nvidia.com>
Signed-off-by: B Hashemian <3968947+bhashemian@users.noreply.github.com>
Co-authored-by: Bruce Hashemian <3968947+bhashemian@users.noreply.github.com>
Summary

Adds `st2110_source` operator and `st2110_demo` application for receiving SMPTE ST 2110-20 uncompressed video over multicast UDP.

This is a clean rebase of PR #1346 by @Ollie-cj and @xrlabs-cj onto current `main`, with review feedback from @bhashemian addressed:

- Renamed `st2110_test` to `st2110_demo` throughout
- Replaced separate shell scripts with holohub CLI run modes in `metadata.json`
- Simplified documentation to use `./holohub run st2110_demo`
- Added in-class member initializers for `FrameBuffer` struct
- Fixed `CMAKE_PREFIX_PATH` in BUILD.md
- Removed `holoviz` from operator dependencies (SDK built-in)

Test plan

- Builds in Holoscan v3.11.0 container (`nvcr.io/nvidia/clara-holoscan/holoscan:v3.11.0-cuda12-dgpu`) on aarch64 (IGX Orin)
- `git diff upstream/main --stat` shows only st2110 files (no 630-file noise from original PR)

Summary by CodeRabbit

New Features

Documentation