Additional AVX optimizations with build and vmaf accuracy fixes #1452

Merged
kylophone merged 8 commits into Netflix:master from vaibhavk2:avx-optz
Mar 20, 2026

@vaibhavk2 vaibhavk2 commented Dec 3, 2025

This PR replaces and supersedes PR #1439, which is now stale.
It carries forward the original AVX2/AVX512 work, fixes accuracy issues, and resolves build problems so the AVX paths are ready to land.

Summary

  • Introduces AVX2 and AVX512 implementations for key libvmaf functions to improve performance on modern x86 CPUs.
  • Preserves scalar behavior and fixes previously observed accuracy drift between scalar and AVX implementations.
  • Cleans up includes / pragmas and fixes picture.h path issues seen when building with AVX optimizations enabled.
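
The idea of adding AVX2 and AVX512 variants while preserving scalar behavior can be sketched as a function-pointer dispatch. This is only an illustration: the flag names, `sum_fn`, and `select_sum` are hypothetical and are not libvmaf's actual symbols, and the "SIMD" variants are stand-ins that call the scalar code so the sketch stays portable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU feature bits; NOT libvmaf's real flag names. */
enum {
    CPU_FLAG_AVX2   = 1u << 0,
    CPU_FLAG_AVX512 = 1u << 1
};

typedef int64_t (*sum_fn)(const int32_t *src, size_t n);

/* Reference scalar implementation: every variant must reproduce its output. */
static int64_t sum_scalar(const int32_t *src, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += src[i];
    return acc;
}

/* Stand-ins for the vectorized variants. A real build would use intrinsics
 * here; they forward to the scalar code so this sketch compiles anywhere. */
static int64_t sum_avx2(const int32_t *src, size_t n)
{
    return sum_scalar(src, n);
}

static int64_t sum_avx512(const int32_t *src, size_t n)
{
    return sum_scalar(src, n);
}

/* Pick the widest variant the CPU reports, falling back to scalar. */
static sum_fn select_sum(unsigned cpu_flags)
{
    if (cpu_flags & CPU_FLAG_AVX512)
        return sum_avx512;
    if (cpu_flags & CPU_FLAG_AVX2)
        return sum_avx2;
    return sum_scalar;
}
```

Because every variant shares one signature, the scalar function doubles as the reference that each optimized path can be checked against.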

Details

The series includes the following logical changes:

  1. Intel AVX optimizations
  2. Build fix for picture.h
  3. VMAF score accuracy fix for AVX2 and AVX512

Testing

On an Intel Xeon test system (GNR, Ubuntu 24.04):

  • Built libvmaf with multiple configurations:
    • scalar baseline
    • AVX2
    • AVX512
  • Verified successful build for all configurations.
  • Ran VMAF on sample YUV pairs with vmaf_v0.6.1.json and checked:
    • AVX2 and AVX512 VMAF scores match the scalar baseline
    • Performance (AVX2 and AVX512) improves over scalar
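
The score check above could be automated along the following lines. This is a minimal sketch, assuming the per-frame scores from each build have already been loaded into arrays; `scores_match` and its `tol` parameter are hypothetical, and the PR does not say whether "match" means bit-exact or within a tolerance (passing `tol = 0.0` demands exact equality).

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every per-frame score in `simd` is within `tol` of the
 * corresponding score in `scalar`. On a mismatch, the index of the first
 * offending frame is stored in *first_bad (if first_bad is non-NULL). */
static bool scores_match(const double *scalar, const double *simd,
                         size_t n_frames, double tol, size_t *first_bad)
{
    for (size_t i = 0; i < n_frames; i++) {
        double diff = scalar[i] - simd[i];
        if (diff < 0.0)
            diff = -diff;
        /* Written as !(diff <= tol) so that a NaN score counts as a mismatch. */
        if (!(diff <= tol)) {
            if (first_bad)
                *first_bad = i;
            return false;
        }
    }
    return true;
}
```

Reporting the first mismatching frame makes it easier to bisect a drift to a specific input, which matters when a discrepancy only shows up on certain clips.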

Supersedes

  • This PR replaces "Intel additional AVX optimizations" (#1439) and includes:
    • Original AVX optimization work by Francois Hannebicq.
    • Accuracy fixes and follow-up changes by Christopher Bird and Vaibhav Shankar.

Once this is merged, PR #1439 can be closed in favor of this updated, tested series.

fhannebi and others added 5 commits December 4, 2025 05:05
    Resolve the path for picture.h file while compiling with AVX
    optimization patches.

Signed-off-by: Vaibhav Shankar <vaibhav.shankar@intel.com>

…ss identified accuracy discrepancies

Signed-off-by: Christopher Bird <christopher.a.bird@intel.com>

This change fixes accuracy mismatches in the AVX-512 implementation
of adm_decouple_512. The issue was reproducible only with narrow width
video files from Netflix’s

Test:
- Verified AVX-512 output matches scalar implementation

Signed-off-by: Vaibhav Shankar <vaibhav.shankar@intel.com>

Resolves accuracy mismatches in the AVX2 implementation of
adm_dwt2_8_avx2. The issue was reproducible only with specific
narrow width video files.

Test:
- Verified AVX2 output (accuracy) matches the scalar implementation

Signed-off-by: Vaibhav Shankar <vaibhav.shankar@intel.com>
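
The two narrow-width commits above do not state the root cause of the mismatches. One common way a vectorized loop can diverge from scalar code only on narrow inputs is in how it handles a row width that is smaller than, or not a multiple of, the vector lane count. The sketch below is a generic illustration of that pattern and of a width-sweep check, not the actual fix: `LANES`, `scale_row_scalar`, `scale_row_blocked`, and `rows_agree` are hypothetical, and a plain inner loop stands in for a real vector instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 8      /* assumed lane count: eight 32-bit elements, as in AVX2 */
#define MAX_WIDTH 64 /* largest row this sketch will test */

/* Scalar reference: one element per iteration. */
static void scale_row_scalar(const uint32_t *src, uint32_t *dst, size_t width)
{
    for (size_t i = 0; i < width; i++)
        dst[i] = (src[i] * 3u + 2u) >> 2;
}

/* Blocked version: full groups of LANES elements, then a scalar tail. */
static void scale_row_blocked(const uint32_t *src, uint32_t *dst, size_t width)
{
    size_t vec_end = width - (width % LANES);
    size_t i = 0;

    for (; i < vec_end; i += LANES) {
        /* Stands in for a single vector operation on LANES elements. */
        for (size_t l = 0; l < LANES; l++)
            dst[i + l] = (src[i + l] * 3u + 2u) >> 2;
    }
    /* Tail: covers width < LANES and any leftover elements. */
    for (; i < width; i++)
        dst[i] = (src[i] * 3u + 2u) >> 2;
}

/* Returns 1 if both versions produce identical output for a row of `width`
 * elements, 0 otherwise (including when width exceeds MAX_WIDTH). */
static int rows_agree(size_t width)
{
    uint32_t src[MAX_WIDTH], ref[MAX_WIDTH], out[MAX_WIDTH];

    if (width > MAX_WIDTH)
        return 0;
    for (size_t i = 0; i < width; i++)
        src[i] = (uint32_t)(i * 37u + 11u);

    scale_row_scalar(src, ref, width);
    scale_row_blocked(src, out, width);
    return memcmp(ref, out, width * sizeof(ref[0])) == 0;
}
```

Sweeping `rows_agree` over widths such as 1, 7, 8, 9, 15, 16, and 17 exercises the empty-body, exact-multiple, and leftover cases, which ordinary wide test clips may never reach.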
Update AVX-related code to use data types compatible with both Linux and Windows.

Tested on Windows (MSYS2) and Linux:
- Build completes successfully
- All tests pass

Signed-off-by: Vaibhav Shankar <vaibhav.shankar@intel.com>
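
The commit above does not list which types were changed. As a general illustration of why type choice matters across the two platforms: `long` is 32 bits on 64-bit Windows but 64 bits on 64-bit Linux, so code that accumulates into a `long` can overflow on one and not the other. A minimal sketch using fixed-width types from `<stdint.h>` follows; `dot_i32` and `format_i64` are hypothetical helpers.

```c
#include <inttypes.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* int64_t has the same width everywhere, unlike long. */
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t must be 64 bits");

/* Accumulates 32-bit products into a 64-bit sum. Casting one operand to
 * int64_t before multiplying keeps the product from overflowing 32 bits. */
static int64_t dot_i32(const int32_t *a, const int32_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int64_t)a[i] * b[i];
    return acc;
}

/* Formats a 64-bit value portably: PRId64 expands to the right conversion
 * specifier for the platform, avoiding a hard-coded %ld or %lld. */
static int format_i64(char *buf, size_t size, int64_t value)
{
    return snprintf(buf, size, "%" PRId64, value);
}
```

The same reasoning applies to loop counters and buffer sizes, where `size_t` or `ptrdiff_t` is usually a safer choice than `long`.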
@kylophone kylophone merged commit 44a9254 into Netflix:master Mar 20, 2026
9 checks passed