Significant speedup for YUV/NV12/P010 and related codecs #1374

Merged: awawa-dev merged 2 commits into master from vectorized_decoders on Dec 19, 2025

awawa-dev (Owner) commented Dec 18, 2025

Changes:

  • Optimized video processing: vectorized decoders for YUV/NV12/P010 and related formats
  • Logging: replaced the old C-style format strings with modern C++20 formatting for compile-time safety

All times are in microseconds (1 s = 1,000,000 us).
Input image size: 1080p.
Raspberry Pi 5 and Intel N100 performance is compared below.
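
To put the microsecond figures in context, a per-frame time can be converted into an upper bound on frames per second. This is only a sketch: `framesPerSecond` is a hypothetical helper, and it assumes decoding is the only cost per frame, which the benchmark description does not state.

```cpp
#include <cassert>

// 1 s = 1,000,000 us, as stated in the PR description.
constexpr double kMicrosPerSecond = 1'000'000.0;

// Hypothetical helper: upper bound on frames per second if the given
// average decode time were the only work done for each frame.
inline double framesPerSecond(double avgMicrosPerFrame)
{
    assert(avgMicrosPerFrame > 0.0);
    return kMicrosPerSecond / avgMicrosPerFrame;
}
```

For example, the N100 NV12 full-size averages in the table below (1970.98 us old, 1470.01 us new) correspond to roughly 507 and 680 frames per second under that assumption.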

Note

Offline benchmark: Old HyperHDR v21 vs New vectorized codecs HyperHDR v22

Platform: Windows 11 Version 23H2 (Intel(R) N100)
System: winnt 10.0.22631

File                   Output size  ToneMap  Old avg [us]  Old median [us]  New avg [us]  New median [us]  Gain [%]
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_nv12.bin    1920x1080F   OFF      1970.98       1894             1470.01       1407             25.42%
                       1920x1080F   ON       1947.75       1893             1435.28       1410.5           26.31%
                       960x540 Q    OFF      676.35        636              449.06        437              33.61%
                       960x540 Q    ON       681.36        638              442.23        438              35.10%
GeoMean: 30.24%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_p010.bin    1920x1080F   OFF      2604.06       2513             1868.22       1802             28.26%
                       1920x1080F   ON       3270.05       3204             2327.37       2330             28.83%
                       960x540 Q    OFF      830.34        794              671.74        654.5            19.10%
                       960x540 Q    ON       963.99        919              888.84        877              7.80%
GeoMean: 21.44%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_yuyv.bin    1920x1080F   OFF      1976.71       1900             1665.35       1619             15.75%
                       1920x1080F   ON       2078.43       1997             1954.68       1892             5.95%
                       960x540 Q    OFF      774.33        724.5            486.3         459              37.20%
                       960x540 Q    ON       803.4         729              489.3         463              39.10%
GeoMean: 25.80%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_uyvy.bin    1920x1080F   OFF      1980.41       1901             1805.78       1714             8.82%
                       1920x1080F   ON       2037.86       1904             1674.16       1661.5           17.85%
                       960x540 Q    OFF      822.2         722              589.88        579              28.26%
                       960x540 Q    ON       792.72        724.5            584.34        574              26.29%
GeoMean: 20.66%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_yuv420.bin  1920x1080F   OFF      1922.69       1862             1849.97       1776             3.78%
                       1920x1080F   ON       1876.33       1822             1828.52       1775             2.55%
                       960x540 Q    OFF      697.95        642              602.43        568              13.69%
                       960x540 Q    ON       736.11        648              572.89        563              22.17%
GeoMean: 10.91%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_rgb24.bin   1920x1080F   OFF      2860.67       2714             2091.8        1944             26.88%
                       1920x1080F   ON       5105.05       4973             3420.98       3343.5           32.99%
                       960x540 Q    OFF      867.96        752              462.39        444              46.73%
                       960x540 Q    ON       1379.05       1305             1024.26       964              25.73%
GeoMean: 33.64%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_xrgb32.bin  1920x1080F   OFF      2888          2773.5           2164.83       2026.5           25.04%
                       1920x1080F   ON       5179.32       5021.5           3602.41       3550             30.45%
                       960x540 Q    OFF      731.17        698              534.42        517              26.91%
                       960x540 Q    ON       1442.56       1351             1061.06       1033.5           26.45%
GeoMean: 33.64% is reported per file above; this file's GeoMean: 27.24%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
TotalMean: 24.58%
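
The description does not define the Gain and GeoMean columns. The figures above appear consistent with Gain = (old avg - new avg) / old avg, and with GeoMean = 1 - (geometric mean of the new/old ratios). The sketch below encodes that reading; the `Sample` struct and both function names are hypothetical, and the formulas are an inference from the numbers, not taken from the benchmark tool.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One table row: average decode time before and after, in microseconds.
struct Sample
{
    double oldAvgUs;
    double newAvgUs;
};

// Assumed definition of the "Gain [%]" column: relative time saved.
inline double gainPercent(const Sample& s)
{
    assert(s.oldAvgUs > 0.0);
    return (s.oldAvgUs - s.newAvgUs) / s.oldAvgUs * 100.0;
}

// Assumed definition of "GeoMean": one minus the geometric mean of the
// new/old time ratios, computed in log space for numerical stability.
inline double geoMeanGainPercent(const std::vector<Sample>& rows)
{
    assert(!rows.empty());
    double logSum = 0.0;
    for (const Sample& s : rows)
        logSum += std::log(s.newAvgUs / s.oldAvgUs);
    const double meanRatio = std::exp(logSum / static_cast<double>(rows.size()));
    return (1.0 - meanRatio) * 100.0;
}
```

Fed the four N100 NV12 rows, this gives about 25.42% for the first row and about 30.24% for the group, matching the table. TotalMean also looks consistent with the same geometric mean taken over all 28 rows, rather than with an arithmetic mean of the per-file figures.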

Note

Offline benchmark: Old HyperHDR v21 vs New vectorized codecs HyperHDR v22

Platform: Debian GNU/Linux 12 (bookworm) (Raspberry Pi 5 Model B Rev 1.0)
System: linux 6.12.47+rpt-rpi-2712

File                   Output size  ToneMap  Old avg [us]  Old median [us]  New avg [us]  New median [us]  Gain [%]
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_nv12.bin    1920x1080F   OFF      3105.29       2726             2029.13       2005             34.66%
                       1920x1080F   ON       2724.64       2723             2016.58       2016             25.99%
                       960x540 Q    OFF      1330.27       1081             719.59        716              45.91%
                       960x540 Q    ON       1061.39       1057.5           720.21        720              32.14%
GeoMean: 35.09%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_p010.bin    1920x1080F   OFF      3470.04       3471.5           2405.77       2359             30.67%
                       1920x1080F   ON       3799.88       3798             3127.05       3127             17.71%
                       960x540 Q    OFF      1339.44       1151.5           1002.22       1002             25.18%
                       960x540 Q    ON       1594.51       1593             1248.58       1247.5           21.70%
GeoMean: 23.96%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_yuyv.bin    1920x1080F   OFF      2859.73       2478             2226.15       2200             22.16%
                       1920x1080F   ON       2473.02       2473             2221.93       2222             10.15%
                       960x540 Q    OFF      1182.6        1171             799.27        706              32.41%
                       960x540 Q    ON       968.19        962              717.86        717              25.86%
GeoMean: 23.06%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_uyvy.bin    1920x1080F   OFF      3111.92       3086             2360.99       2222             24.13%
                       1920x1080F   ON       2593.34       2592             2235.67       2235             13.79%
                       960x540 Q    OFF      1361.4        1357.5           991.23        926              27.19%
                       960x540 Q    ON       1096.19       1091             924.2         923              15.69%
GeoMean: 20.40%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_yuv420.bin  1920x1080F   OFF      3243.83       2886             2473.87       2444.5           23.74%
                       1920x1080F   ON       2860.28       2860             2457.51       2458             14.08%
                       960x540 Q    OFF      1278.62       1270.5           979.36        891              23.40%
                       960x540 Q    ON       1018.22       1016             906.11        905.5            11.01%
GeoMean: 18.25%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_rgb24.bin   1920x1080F   OFF      1212.97       1176             1164.5        1058             4.00%
                       1920x1080F   ON       5924.63       5921             5967.26       5960             -0.72%
                       960x540 Q    OFF      932.24        936              911.46        752              2.23%
                       960x540 Q    ON       1692.55       1685.5           1711.16       1708.5           -1.10%
GeoMean: 1.12%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
test_image_xrgb32.bin  1920x1080F   OFF      1502.59       1472.5           1389.83       1317             7.50%
                       1920x1080F   ON       5884.07       5872.5           5871.03       5873.5           0.22%
                       960x540 Q    OFF      593.8         590              561.38        584              5.46%
                       960x540 Q    ON       1721.2        1717.5           1715.52       1715.5           0.33%
GeoMean: 3.43%
---------------------  -----------  -------  ------------  ---------------  ------------  ---------------  --------
TotalMean: 18.64%

Log: replace old C-style format with modern C++20 for compile-time safety
awawa-dev merged commit eaa8715 into master on Dec 19, 2025
20 checks passed
awawa-dev deleted the vectorized_decoders branch on December 19, 2025 19:23