Add custom bt601 full range cuda kernel and fix color conversion#323
Add custom bt601 full range cuda kernel and fix color conversion#323
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a CUDA full-range BT.601 NV12→RGB converter, exposes a Changes
Sequence Diagram(s)sequenceDiagram
participant RTP as RTP Source
participant Op as NvStreamDecoderOp
participant Decoder as HW Decoder
participant CUDA as nv12_to_rgb_fullrange
participant NPP as NPP_NV12ToRGB
RTP->>Op: RTP packet/frame
Op->>Decoder: submit packet / decode
Decoder-->>Op: decoded NV12 frame + full_range_flag
Op->>Op: resolve range (force_full_range or detected)
alt full-range
Op->>CUDA: launch nv12_to_rgb_fullrange (cudaStream)
CUDA-->>Op: RGB frame
else limited-range
Op->>NPP: call nppiNV12ToRGB_... (Ctx)
NPP-->>Op: RGB frame
end
Op-->>RTP: emit RGB tensor/frame
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp`:
- Around line 225-230: The custom CUDA kernel path that calls
nv12_to_rgb_fullrange_bt601 lacks post-launch error checking; after calling
nv12_to_rgb_fullrange_bt601 add a cudaGetLastError() check (e.g., cudaError_t
err = cudaGetLastError()) and handle non-success by logging/returning the same
error flow used by the NPP path (mirror its processLogger/error return behavior)
so launch failures are caught and the function returns early on error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: bbfbb1b0-a798-44dd-8626-ad12f9dcaa6b
📒 Files selected for processing (2)
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cppexamples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op_py.cpp
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Adds full-range BT.601 NV12→RGB conversion support to the camera_streamer NVDEC-based decoder to fix color issues with OAK-D VPU-encoded streams, and exposes an override to force full-range handling when bitstream metadata is missing.
Changes:
- Add
force_full_rangeparameter toNvStreamDecoderOp(C++ + pybind) and wire it from the teleop subgraph. - Implement a custom CUDA kernel for full-range BT.601 NV12→RGB conversion and use it when full-range is selected.
- Switch the limited-range conversion path to use NPP’s BT.709 CSC variant.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| examples/camera_streamer/teleop_camera_subgraph.py | Forces full-range decode for OAK-D streams (currently for all OAK-D). |
| examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op_py.cpp | Exposes force_full_range in Python bindings and docs. |
| examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.hpp | Adds operator parameters/state for range selection. |
| examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp | Adds range detection/override and switches NV12→RGB implementation paths. |
| examples/camera_streamer/operators/nv_stream_decoder/nv12_to_rgb.cuh | Declares the custom CUDA conversion API. |
| examples/camera_streamer/operators/nv_stream_decoder/nv12_to_rgb.cu | Implements the full-range BT.601 NV12→RGB CUDA kernel. |
| examples/camera_streamer/operators/nv_stream_decoder/CMakeLists.txt | Builds the new CUDA source into the decoder library. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
Outdated
Show resolved
Hide resolved
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
Outdated
Show resolved
Hide resolved
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
Outdated
Show resolved
Hide resolved
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ff4b44e6-fbd1-4b52-8317-325306809639
📒 Files selected for processing (1)
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
Show resolved
Hide resolved
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
Show resolved
Hide resolved
There was a problem hiding this comment.
♻️ Duplicate comments (1)
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp (1)
185-197: 🧹 Nitpick | 🔵 TrivialMinor optimization opportunity (previously noted).
When
force_full_range_is true,decoder_->GetVideoFormatInfo()is still called even though its result doesn't affectuse_full_range_. This could be skipped:if (force_full_range_.get()) { use_full_range_ = true; HOLOSCAN_LOG_INFO("NV12->RGB color range: full (force_full_range=true)"); } else { auto fmt = decoder_->GetVideoFormatInfo(); int bitstream_flag = fmt.video_signal_description.video_full_range_flag; use_full_range_ = (bitstream_flag != 0); HOLOSCAN_LOG_INFO("NV12->RGB color range: {} (bitstream flag={})", use_full_range_ ? "full" : "limited", bitstream_flag); }That said, the current approach provides useful diagnostic info by always logging the bitstream flag, which can help debug encoder misconfigurations. If the
GetVideoFormatInfo()call is lightweight, keeping it for diagnostics is reasonable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp` around lines 185 - 197, When determining use_full_range_ inside the range_detected_ block, avoid calling decoder_->GetVideoFormatInfo() when force_full_range_.get() is true since that call doesn't affect the result; implement a conditional: if force_full_range_.get() set use_full_range_ = true and log that fact (using HOLOSCAN_LOG_INFO with force_full_range_.get()), else call decoder_->GetVideoFormatInfo(), read fmt.video_signal_description.video_full_range_flag, set use_full_range_ = (bitstream_flag != 0) and log the bitstream flag; keep the range_detected_ = true behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp`:
- Around line 185-197: When determining use_full_range_ inside the
range_detected_ block, avoid calling decoder_->GetVideoFormatInfo() when
force_full_range_.get() is true since that call doesn't affect the result;
implement a conditional: if force_full_range_.get() set use_full_range_ = true
and log that fact (using HOLOSCAN_LOG_INFO with force_full_range_.get()), else
call decoder_->GetVideoFormatInfo(), read
fmt.video_signal_description.video_full_range_flag, set use_full_range_ =
(bitstream_flag != 0) and log the bitstream flag; keep the range_detected_ =
true behavior unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2669718a-96da-42e8-9ac9-e7c5fbac6086
📒 Files selected for processing (1)
examples/camera_streamer/operators/nv_stream_decoder/nv_stream_decoder_op.cpp
Nvenc color conversion is CSC so for custom encoding we need csc nv12->rgb
On the other hand, Oak-d uses bt601 fullrange color conversion ondevice for rgb->nv12 which is not supported by nppi. This adds a custom cuda kernel to do the correct color conversion for oakd. Alternative is to use 2 nppi methods (go to yuv420 then rgb) which is not as efficient as this.
Summary by CodeRabbit