[Bug]: Low PSNR (~8 dB in best case/~25 dB in worst case below cuda) on NV12->RGB color conversion output #1954

@dvrogozh

Which component impacted?

Video Processing

Is it regression? Good in old configuration?

None

What happened?

On:

  • Battlemage G21 (0xe20b)
  • Ubuntu 25.04, kernel 6.14.0-28-generic
  • Driver stack installed from the Kobuk team PPA repository per the instructions at https://dgpu-docs.intel.com/driver/client/overview.html, versions:
    • intel-media-va-driver-non-free: 25.3.2-0ubuntu1~25.04~ppa2
  • FFmpeg n6.1.2

Usage scenario:

  • Video decoding (say, h264 with NV12 output) + color space conversion to RGB24. The reason behind RGB24 is that this is the format used by default in AI training, fine-tuning and inference. Specifically, it is the format used by torchcodec (https://github.com/pytorch/torchcodec).
  • As Intel GPUs currently do not support RGB24, we convert to one of the supported RGB32 color formats and then just copy the relevant data. The issue described below does not actually depend on that, but we give examples in RGB24 as this format allows comparing results with CUDA.
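The RGB32-to-RGB24 copy mentioned above can be sketched as follows. This is only an illustration, not the torchcodec code: the function name `rgba_to_rgb24` is hypothetical, and it assumes a tightly packed buffer (no row padding) with bytes in R, G, B, A order.

```python
def rgba_to_rgb24(rgba: bytes) -> bytes:
    """Drop every 4th byte (alpha) from a tightly packed RGBA buffer.

    Assumption: byte order is R, G, B, A and rows carry no padding.
    A real surface download may have a row pitch larger than width * 4,
    in which case each row has to be cropped first.
    """
    if len(rgba) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4")
    rgb = bytearray(len(rgba) // 4 * 3)
    rgb[0::3] = rgba[0::4]  # R
    rgb[1::3] = rgba[1::4]  # G
    rgb[2::3] = rgba[2::4]  # B
    return bytes(rgb)


# Two pixels: (1, 2, 3, alpha=255) and (4, 5, 6, alpha=0)
print(list(rgba_to_rgb24(bytes([1, 2, 3, 255, 4, 5, 6, 0]))))
```

The copy only discards alpha, so it cannot by itself change the color values being compared.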

Results:

  • Average PSNRs compared to the CPU-processed reference:

| Backend | Pipeline | Avg. PSNR (dB) |
|---------|----------|----------------|
| CUDA | torchcodec (ffmpeg-nvdec + self-written NPP color conversion) | 50.872619 |
| QSV | ffmpeg (dec + scale_qsv) | 41.695772 |
| VAAPI | torchcodec PR-558 (ffmpeg-vaapi dec + self-written VAAPI color conversion) | 41.695772 |
| VAAPI | ffmpeg-vaapi (dec + scale_vaapi) or torchcodec PR-832 | 24.701474 |

Expectation:

  1. Intel pipeline quality to be on par with CUDA (as of now the best-case scenario is 8 dB behind)
  2. ffmpeg-vaapi pipeline quality to be on par with ffmpeg-qsv (as of now 16 dB behind)

What's the usage scenario when you are seeing the problem?

Video Analytics

What impacted?

No response

Debug Information

Analysis:

  • Self-written VAAPI color conversion sets VAProcPipelineParameterBuffer::surface_color_standard to VAProcColorStandardBT709; all other fields of VAProcPipelineParameterBuffer (except width/height) are zeroed
  • As an experiment, setting VAProcPipelineParameterBuffer::output_color_standard to VAProcColorStandardBT601 does not change the output stream
  • I did not check how ffmpeg-qsv and libvpl set these parameters, but ffmpeg-qsv quality fully matches the self-written VAAPI color conversion pipeline
  • ffmpeg-vaapi scale_vaapi filter sets VAProcColorStandardExplicit for both VAProcPipelineParameterBuffer::surface_color_standard and VAProcPipelineParameterBuffer::output_color_standard + respective input_color_properties and output_color_properties. These settings seem to be correct on the ffmpeg side, but give significantly lower quality
  • As an experiment, changing VAProcPipelineParameterBuffer::surface_color_standard or VAProcPipelineParameterBuffer::output_color_standard from VAProcColorStandardExplicit to a named standard (VAProcColorStandardBT709 or VAProcColorStandardBT601) "fixes" the scale_vaapi filter: its output then matches the quality level of ffmpeg-qsv and the self-written VAAPI conversion

Detailed results:

python3 dump.py -i test/resources/nasa_13013.mp4 -o nasa_13013_torchcodec_cpu.rgb -d cpu -s 0:390
python3 dump.py -i test/resources/nasa_13013.mp4 -o nasa_13013_torchcodec_cuda.rgb -d cuda:0 -s 0:390
python3 dump.py -i test/resources/nasa_13013.mp4 -o nasa_13013_torchcodec_vaapi.rgb -d xpu:0 -s 0:390

ffmpeg -i test/resources/nasa_13013.mp4 -vf "scale=480:270:sws_flags=bilinear,format=rgb24" -y nasa_13013_ffmpeg_cpu.rgb
ffmpeg -i test/resources/nasa_13013.mp4 -vf "format=rgb24" -y nasa_13013_ffmpeg_cpu2.rgb
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD129 -hwaccel_output_format vaapi -i test/resources/nasa_13013.mp4 -vf "scale_vaapi=480:270:format=rgba,hwdownload,format=rgba,format=rgb24" -y nasa_13013_ffmpeg_vaapi.rgb
ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD129 -c:v h264_qsv -i test/resources/nasa_13013.mp4 -vf "scale_qsv=480:270:format=rgb32,hwdownload,format=rgb32,format=rgb24" -y nasa_13013_ffmpeg_qsv.rgb
$ sha1sum *.rgb
7d307c4cfcf2680e413c943646894499aa641b2e  nasa_13013_ffmpeg_cpu.rgb                       # n6.1.2
7d307c4cfcf2680e413c943646894499aa641b2e  nasa_13013_ffmpeg_cpu2.rgb                      # n6.1.2
6dcf7083da51717e06902b81dcef11adaa7ae071  nasa_13013_torchcodec_cuda.rgb
718e35799ebfdd78d91914b474748cbb3095636b  nasa_13013_ffmpeg_qsv.rgb                       # n6.1.2
718e35799ebfdd78d91914b474748cbb3095636b  nasa_13013_ffmpeg_vaapi_in=BT709_out=BT709.rgb  # patched ffmpeg-vaapi
718e35799ebfdd78d91914b474748cbb3095636b  nasa_13013_ffmpeg_vaapi_out=BT601.rgb           # patched ffmpeg-vaapi
718e35799ebfdd78d91914b474748cbb3095636b  nasa_13013_ffmpeg_vaapi_out=BT709.rgb           # patched ffmpeg-vaapi
7d307c4cfcf2680e413c943646894499aa641b2e  nasa_13013_torchcodec_cpu.rgb
2e036b96d4b10501f3b3b30b9d3e3313214fc6bc  nasa_13013_ffmpeg_vaapi.rgb                     # n6.1.2
2e036b96d4b10501f3b3b30b9d3e3313214fc6bc  nasa_13013_torchcodec_ffmpeg_vaapi_filters.rgb
718e35799ebfdd78d91914b474748cbb3095636b  nasa_13013_torchcodec_vaapi.rgb
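
The checksums above fall into a few groups of bit-identical outputs. A small helper like the one below (a sketch; the name `group_by_sha1` is hypothetical) makes that grouping explicit for the `*.rgb` dumps in the current directory.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_sha1(paths):
    """Map SHA-1 hex digest -> sorted list of file names with that digest."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha1()
        with open(path, "rb") as f:
            # Read in 1 MiB chunks so large raw dumps do not need to fit in memory.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        groups[digest.hexdigest()].append(Path(path).name)
    return {d: sorted(names) for d, names in groups.items()}


if __name__ == "__main__":
    for digest, names in group_by_sha1(sorted(Path(".").glob("*.rgb"))).items():
        print(digest[:12], ", ".join(names))
```

Files that land in the same group are byte-for-byte identical, so one PSNR measurement per group is enough.
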
  • PSNR values:
ffmpeg -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_torchcodec_cuda.rgb -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_ffmpeg_cpu.rgb -filter_complex "psnr" -f null /dev/null
[Parsed_psnr_0 @ 0x561f70f6d2c0] PSNR r:53.017530 g:48.658710 b:52.270208 average:50.872619 min:49.593809 max:51.807851

ffmpeg -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_torchcodec_ffmpeg_vaapi_filters.rgb -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_ffmpeg_cpu.rgb -filter_complex "psnr" -f null /dev/null
[Parsed_psnr_0 @ 0x5588df31d180] PSNR r:24.489627 g:24.480980 b:25.169051 average:24.701474 min:24.309702 max:26.582759

ffmpeg -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_ffmpeg_vaapi.rgb -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_ffmpeg_cpu.rgb -filter_complex "psnr" -f null /dev/null
[Parsed_psnr_0 @ 0x64513338d8c0] PSNR r:24.489627 g:24.480980 b:25.169051 average:24.701474 min:24.309702 max:26.582759

ffmpeg -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_torchcodec_vaapi.rgb -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_ffmpeg_cpu.rgb -filter_complex "psnr" -f null /dev/null
[Parsed_psnr_0 @ 0x56d294e56940] PSNR r:40.279087 g:44.622594 b:41.263727 average:41.695772 min:39.498336 max:43.352288

ffmpeg -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_ffmpeg_qsv.rgb -f rawvideo -pix_fmt rgb24 -s:v 480x270 -i nasa_13013_ffmpeg_cpu.rgb -filter_complex "psnr" -f null /dev/null
[Parsed_psnr_0 @ 0x61a5d080b940] PSNR r:40.279087 g:44.622594 b:41.263727 average:41.695772 min:39.498336 max:43.352288
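
The `average` figure in these logs is not the arithmetic mean of the r/g/b values. The numbers are consistent with averaging the per-channel mean squared errors and converting back to dB, which the sketch below reproduces. This assumes 8-bit samples (peak 255) and equally sized planes; the helper names are hypothetical.

```python
import math

PEAK = 255.0  # assumption: 8-bit samples


def psnr_to_mse(psnr_db):
    """Invert PSNR = 10 * log10(PEAK^2 / MSE)."""
    return PEAK ** 2 / 10 ** (psnr_db / 10.0)


def mse_to_psnr(mse):
    return 10.0 * math.log10(PEAK ** 2 / mse)


def average_psnr(channel_psnrs):
    """Average in the MSE domain, then convert back to dB."""
    mses = [psnr_to_mse(p) for p in channel_psnrs]
    return mse_to_psnr(sum(mses) / len(mses))


# Per-channel r, g, b values copied from the logs above.
logs = {
    "cuda": (53.017530, 48.658710, 52.270208),
    "qsv / self-written vaapi": (40.279087, 44.622594, 41.263727),
    "scale_vaapi": (24.489627, 24.480980, 25.169051),
}
for name, channels in logs.items():
    print(f"{name}: {average_psnr(channels):.3f} dB")
```

Because the average is taken over MSE, the worst channel dominates: the QSV average sits close to the r and b values rather than the higher g value.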

Do you want to contribute a patch to fix the issue?

None
