Commit f611074

Update _posts/2025-03-27-Boost-OpenSearch-VectorSearch-Performance-With-Intel-AVX512.md

Signed-off-by: Nathan Bower <[email protected]>
1 parent: 6ab7027

1 file changed (+2, -1 lines)

_posts/2025-03-27-Boost-OpenSearch-VectorSearch-Performance-With-Intel-AVX512.md

Lines changed: 2 additions & 1 deletion
```diff
@@ -34,7 +34,8 @@ The techniques used in vector search are computationally expensive, and Intel AV
 - In compiler optimizations, such as auto-vectorization
 
 The corresponding optimized assembly instructions are generated when the accelerator is correctly utilized. AVX2 generates instructions using YMM registers, and AVX-512 generates instructions using ZMM registers. Performance is enhanced by allowing ZMM registers to handle 32 double-precision and 64 single-precision floating-point operations per clock cycle within 512-bit vectors. Additionally, these registers can process eight 64-bit and sixteen 32-bit integers. With up to two 512-bit fused multiply-add (FMA) units, AVX-512 effectively doubles the width of data registers, the number of registers, and the width of FMA units compared to Intel AVX2 YMM registers. Beyond these improvements, Intel AVX-512 offers increased parallelism, which leads to faster data processing and improved performance in compute-intensive applications such as scientific simulations, analytics, and machine learning. It also provides enhanced support for complex number calculations and accelerates tasks like cryptography and data compression. Furthermore, AVX-512 includes new instructions that improve the efficiency of certain algorithms, reduce power consumption, and optimize resource utilization, making it a powerful tool for modern computing needs.
-By registering double width to 512 bits, the use of ZMM registers instead of YMM registers can potentially double the data throughput and computational power. When the AVX-512 extension is detected, the Faiss distance and scalar quantizer functions process 16 vectors per loop compared to 8 vectors per loop for the AVX2 extension.
+By registering double width to 512 bits, the use of ZMM registers instead of YMM registers can potentially double the data throughput and computational power. When the AVX-512 extension is detected, the Faiss distance and scalar quantizer functions process 16 vectors per loop compared to 8 vectors per loop for the AVX2 extension.
+
 Thus, in vector search using k-nearest neighbors (k-NN), index build times and vector search performance can be enhanced with the use of these new hardware extensions.
 
 ## The hot spot in OpenSearch vector search
```
