Skip to content
Closed
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 45 additions & 2 deletions src/torchcodec/_core/CUDACommon.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -57,8 +57,7 @@ void initializeCudaContextWithPytorch(const torch::Device& device) {
// Color space and color range are independent concepts, so we can have a BT.709
// with full range, and another one with limited range. Same for BT.601.
//
// In the first version of this note we'll focus on the full color range. It
// will later be updated to account for the limited range.
// First, we'll consider the conversion in the full color range.
//
// Color conversion matrix
// -----------------------
Expand Down Expand Up @@ -110,6 +109,50 @@ void initializeCudaContextWithPytorch(const torch::Device& device) {
//
// Which matches https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.709_conversion
//
//
// Next, lets consider encoding RGB -> YUV in the limited color range.
// Y is in the range [16-235] and U,V are in [16-240].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this might be slightly confusing because the first part of the note above started with

Conventionally Y is in [0, 1] range, and U and V are in [-0.5, 0.5]

And that's also how the matrices below are defined. I think we can get around this by saying something like:

In limited range, the [0, 255] range is mapped into [16-235] for Y, and into [16-240] for U and V.

// The reduced range provides margins for errors during processing.
// https://en.wikipedia.org/wiki/YCbCr#Y%E2%80%B2PbPr_to_Y%E2%80%B2CbCr
//
// To encode RGB -> YUV in limited range, we start with the full range conversion
// matrix derived above, then scale it to compress into the limited ranges:
// - RGB [0,255] -> Y [16,235]: compress by (235-16)/255 ≈ 219/255
// - RGB [0,255] -> U,V [16,240]: compress by (240-16)/255 ≈ 224/255
Comment on lines +120 to +121
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to be "≈" or should it be "=" ?

//
// ```py
// import torch
// kr, kg, kb = 0.2126, 0.7152, 0.0722 # BT.709 luma coefficients
// u_scale = 2 * (1 - kb)
// v_scale = 2 * (1 - kr)
//
// rgb_to_yuv_full = torch.tensor([
// [kr, kg, kb],
// [-kr/u_scale, -kg/u_scale, (1-kb)/u_scale],
// [(1-kr)/v_scale, -kg/v_scale, -kb/v_scale]
// ])
//
// full_to_limited_y_scale = 219.0 / 255.0
// full_to_limited_uv_scale = 224.0 / 255.0
//
// rgb_to_yuv_limited = rgb_to_yuv_full * torch.tensor([
// [full_to_limited_y_scale],
// [full_to_limited_uv_scale],
// [full_to_limited_uv_scale]
// ])
//
// print("RGB->YUV matrix (Limited Range BT.709):")
// print(rgb_to_yuv_limited)
// ```
//
// This yields:
// tensor([[ 0.1826, 0.6142, 0.0620],
// [-0.1006, -0.3386, 0.4392],
// [ 0.4392, -0.3989, -0.0403]])
Comment on lines +148 to +151
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, could we find an authoritative source from which we can validate that these values are correct? Something similar to what we have for the full range:

// Which matches https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.709_conversion

It doesn't have to be for 709 specifically, but validating it for one limited range config would be great.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to find a source that gets our exact values. There is one section on that wiki page that gets close, but results in different values due to an optimization: https://en.wikipedia.org/wiki/YCbCr#Approximate_8-bit_matrices_for_BT.601

//
// Which is the matrix we store in CudaDeviceInterface.
// TODO: land PR that adds this matrix in CudaDeviceInterface
//
// Color conversion in NPP
// -----------------------
// https://docs.nvidia.com/cuda/npp/image_color_conversion.html.
Expand Down
Loading