Sq8 dist functions L2 [MOD-13169] #877
- Implemented inner product and cosine distance functions for SQ8-to-SQ8 vectors in SVE, NEON, and AVX512 architectures.
- Added corresponding distance function selection logic in IP_space.cpp and function headers in IP_space.h.
- Created benchmarks for SQ8-to-SQ8 distance functions to evaluate performance across different architectures.
- Developed unit tests to validate the correctness of the new distance functions against expected results.
- Ensured compatibility with existing optimization features for various CPU architectures.
…mproving performance
… SQ8-to-SQ8 calculations
… NEON and AVX512 headers
- Implemented NEON, SVE, and AVX512F optimized functions for calculating L2 squared distance between SQ8 (scalar quantized 8-bit) vectors.
- Introduced helper functions for processing vector elements using NEON and SVE intrinsics.
- Updated L2_space.cpp and L2_space.h to include new distance function for SQ8-to-SQ8.
- Enhanced AVX512F, NEON, and SVE function selectors to choose the appropriate implementation based on CPU features.
- Added unit tests to validate the correctness of the new L2 squared distance functions.
- Updated benchmark tests to include performance measurements for the new implementations.
…ocumentation accordingly
…tance assertion tolerance
…om/RedisAI/VectorSimilarity into dorer-sq8-dist-functions-l2
… using AVX512 VNNI; add benchmarks and tests for new functionality
…VE, and AVX512; add corresponding selection functions and update tests for consistency.
…update benchmarks and tests for new functionality
- Updated distance function declarations in IP_space.h to clarify that SQ8-to-SQ8 functions use precomputed sum/norm.
- Removed precomputed distance function implementations for AVX512F, NEON, and SVE architectures from their respective source files.
- Adjusted benchmark tests to remove references to precomputed distance functions and ensure they utilize the updated quantization methods.
- Modified utility functions to support the creation of SQ8 quantized vectors with precomputed sum and norm.
- Updated unit tests to reflect changes in the quantization process and removed tests specifically for precomputed distance functions.
…nsistency
- Updated include paths in AVX512F_BW_VL_VNNI.cpp to reflect new naming conventions.
- Modified unit tests in test_spaces.cpp to streamline vector initialization and quantization processes.
- Replaced repetitive code with utility functions for populating and quantizing vectors.
- Enhanced assertions in tests to ensure optimized distance functions are correctly chosen and validated.
- Removed unnecessary parameters from utility functions to simplify their interfaces.
- Improved test coverage for edge cases, including zero and constant vectors, ensuring accuracy across various scenarios.
…om/RedisAI/VectorSimilarity into dorer-sq8-dist-functions-l2
…om/RedisAI/VectorSimilarity into dorer-sq8-dist-functions-l2
… ARM architecture
…om/RedisAI/VectorSimilarity into dorer-sq8-dist-functions-l2
…to dorer-sq8-dist-functions-l2
Pull request overview
This PR implements L2 squared distance functions for SQ8-to-SQ8 vector comparisons (where both vectors are scalar quantized to 8-bit integers). The implementation leverages the mathematical identity ||x - y||² = ||x||² + ||y||² - 2*IP(x, y) to efficiently compute L2 distance by reusing existing inner product implementations.
Key changes:
- Added SQ8-to-SQ8 L2 distance functions with SIMD optimizations for multiple architectures (SVE, SVE2, NEON, NEON_DOTPROD, AVX512)
- Refactored SQ8_SQ8_InnerProduct to extract a common implementation function that returns raw inner product values, enabling reuse for L2 calculations
- Added comprehensive unit tests and edge case tests for the new L2 functionality
- Updated benchmarks to include L2 distance measurements for SQ8-to-SQ8 operations
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/VecSim/spaces/L2_space.h | Added declaration for L2_SQ8_SQ8_GetDistFunc dispatcher |
| src/VecSim/spaces/L2_space.cpp | Implemented L2_SQ8_SQ8_GetDistFunc to select appropriate L2 implementation based on CPU features |
| src/VecSim/spaces/L2/L2.h | Added SQ8_SQ8_L2Sqr function declaration |
| src/VecSim/spaces/L2/L2.cpp | Implemented base SQ8_SQ8_L2Sqr using the L2 identity formula |
| src/VecSim/spaces/L2/L2_SVE_SQ8_SQ8.h | Added SVE-optimized L2 implementation |
| src/VecSim/spaces/L2/L2_NEON_SQ8_SQ8.h | Added NEON-optimized L2 implementation |
| src/VecSim/spaces/L2/L2_NEON_DOTPROD_SQ8_SQ8.h | Added NEON DOTPROD-optimized L2 implementation |
| src/VecSim/spaces/L2/L2_AVX512F_BW_VL_VNNI_SQ8_SQ8.h | Added AVX512-optimized L2 implementation |
| src/VecSim/spaces/functions/SVE.h | Added Choose_SQ8_SQ8_L2_implementation_SVE declaration |
| src/VecSim/spaces/functions/SVE.cpp | Implemented SVE L2 chooser function |
| src/VecSim/spaces/functions/SVE2.h | Added Choose_SQ8_SQ8_L2_implementation_SVE2 declaration |
| src/VecSim/spaces/functions/SVE2.cpp | Implemented SVE2 L2 chooser function |
| src/VecSim/spaces/functions/NEON.h | Added Choose_SQ8_SQ8_L2_implementation_NEON declaration |
| src/VecSim/spaces/functions/NEON.cpp | Implemented NEON L2 chooser function |
| src/VecSim/spaces/functions/NEON_DOTPROD.h | Added Choose_SQ8_SQ8_L2_implementation_NEON_DOTPROD declaration |
| src/VecSim/spaces/functions/NEON_DOTPROD.cpp | Implemented NEON_DOTPROD L2 chooser function |
| src/VecSim/spaces/functions/AVX512F_BW_VL_VNNI.h | Added Choose_SQ8_SQ8_L2_implementation_AVX512F_BW_VL_VNNI declaration |
| src/VecSim/spaces/functions/AVX512F_BW_VL_VNNI.cpp | Implemented AVX512 L2 chooser function |
| src/VecSim/spaces/IP/IP.h | Added SQ8_SQ8_InnerProduct_Impl declaration for shared implementation |
| src/VecSim/spaces/IP/IP.cpp | Refactored SQ8_SQ8_InnerProduct to extract common implementation returning raw inner product |
| tests/utils/tests_utils.h | Added SQ8_SQ8_NotOptimized_L2Sqr helper for testing non-optimized L2 calculation |
| tests/unit/test_spaces.cpp | Added comprehensive L2 tests including optimization tests and edge cases (self-distance, symmetry, zero/constant vectors, extreme values) |
| tests/benchmark/spaces_benchmarks/bm_spaces_sq8_sq8.cpp | Updated benchmarks to include L2 measurements alongside existing IP benchmarks |
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.
Codecov Report
✅ All modified and coverable lines are covered by tests.

| | main | #877 | +/- |
|---|---|---|---|
| Coverage | 97.05% | 97.06% | +0.01% |
| Files | 127 | 128 | +1 |
| Lines | 7560 | 7586 | +26 |
| Hits | 7337 | 7363 | +26 |
| Misses | 223 | 223 | |
// Get precomputed sum of squares from both vectors
// Layout: [uint8_t values (dim)] [min_val] [delta] [sum] [sum_of_squares]
const float sum_sq_1 = *reinterpret_cast<const float *>(pVect1 + dimension + 3 * sizeof(float));
We should have a macro or enum for the metadata indexes instead of hardcoding them.
I was just thinking about that.
Maybe I will add it in the renaming PR.
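One way the enum suggested in this thread could look is sketched below. The names `Sq8MetadataIndex` and `read_sq8_metadata` are hypothetical and do not come from the PR; the field order is taken from the layout comment in the diff, and each metadata field is assumed to be a `float` stored directly after the `dim` quantized bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical index names for the trailing metadata floats.
// Layout (from the diff comment): [uint8_t values (dim)] [min_val] [delta] [sum] [sum_of_squares]
enum Sq8MetadataIndex : std::size_t {
    SQ8_MIN_VAL = 0,
    SQ8_DELTA = 1,
    SQ8_SUM = 2,
    SQ8_SUM_OF_SQUARES = 3,
    SQ8_METADATA_COUNT = 4
};

// Reads one metadata float that follows the `dim` quantized bytes.
// memcpy is used here so the read does not depend on the float being aligned.
inline float read_sq8_metadata(const uint8_t *vec, std::size_t dim, Sq8MetadataIndex idx) {
    float value;
    std::memcpy(&value, vec + dim + idx * sizeof(float), sizeof(float));
    return value;
}
```

Under these assumptions, the hardcoded `3 * sizeof(float)` in the snippet above would become `read_sq8_metadata(pVect1, dimension, SQ8_SUM_OF_SQUARES)`, so a later change to the layout only touches the enum.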
Describe the changes in the pull request
Added support for SQ8-to-SQ8 L2 spaces.
Intel:
AVX512_F_BW_VL_VNNI - this feature set is needed to support uint8 and int8 operations.
ARM:
SVE
NEON
The cosine functions assume that the vectors are normalized and therefore do not divide by the norm.
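To illustrate why the division can be skipped, here is a small sketch on plain float vectors. The function names are hypothetical, the vectors are assumed to be non-zero, and the convention distance = 1 - similarity is an assumption made for this example rather than something stated in the PR.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Plain dot product (assumes equal dimensions).
float dot(const std::vector<float> &a, const std::vector<float> &b) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// Returns a unit-length copy of v (assumes v is not the zero vector).
std::vector<float> normalized(std::vector<float> v) {
    const float norm = std::sqrt(dot(v, v));
    for (float &x : v) {
        x /= norm;
    }
    return v;
}

// General cosine distance: divides by both norms on every call.
float cosine_distance_full(const std::vector<float> &a, const std::vector<float> &b) {
    return 1.0f - dot(a, b) / (std::sqrt(dot(a, a)) * std::sqrt(dot(b, b)));
}

// Shortcut for inputs that are already unit length: both norms are 1,
// so the division disappears and only the inner product remains.
float cosine_distance_prenormalized(const std::vector<float> &a, const std::vector<float> &b) {
    return 1.0f - dot(a, b);
}
```

If the inputs are not actually normalized, the shortcut returns a different value from the full formula, so the assumption has to be enforced by whatever prepares the vectors.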
Which issues this PR fixes
Main objects this PR modified
Mark if applicable