Hi,
I’m currently working on performance optimization of Milvus on ARM architectures, and since Milvus internally leverages Knowhere, which in turn uses HNSWlib for the HNSW index, I’ve been analyzing the end-to-end execution flow.
While going through the source code (particularly space_l2.h and space_ip.h), I noticed that:
- HNSWlib currently does not contain any ARM-specific vectorized kernels (e.g., NEON or SVE).
- On ARM, the distance computation functions (for L2 and Inner Product) therefore fall back to scalar C++ loops and rely purely on compiler auto-vectorization.
- This leads to suboptimal performance on ARM compared to x86 builds, which can use hand-written SSE/AVX kernels; other ecosystems (e.g., FAISS) sometimes include architecture-tuned kernels as well.
My questions:
- Is there any plan to include NEON or SVE-optimized kernels for ARM platforms (especially for the distance functions inside space_l2.h and space_ip.h)?
- If not yet planned, would the maintainers be open to a contribution/PR that introduces such ARM-specific optimizations (behind compile-time flags like __ARM_NEON / __ARM_FEATURE_SVE)?
- Are there any ongoing discussions or design considerations around this that I can align with before starting the implementation?
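
To make the proposal concrete, here is a minimal sketch of the kind of kernel I have in mind for the squared L2 distance. It is only an illustration under some assumptions: the function names `L2SqrScalarSketch` and `L2SqrNeonSketch` are hypothetical and not part of HNSWlib, the signature is simplified to plain `float` pointers rather than the library's actual distance-function interface, and the NEON path is restricted to AArch64 because it uses `vaddvq_f32`. On any other target the guard falls through to the scalar loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

#if defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h>
#endif

// Scalar reference implementation (hypothetical name, for comparison only).
static float L2SqrScalarSketch(const float* a, const float* b, size_t dim) {
    float sum = 0.0f;
    for (size_t i = 0; i < dim; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Hypothetical NEON kernel: processes 4 floats per iteration, then
// handles the remaining (dim % 4) elements with a scalar tail loop.
static float L2SqrNeonSketch(const float* a, const float* b, size_t dim) {
    assert(a != nullptr && b != nullptr);
#if defined(__ARM_NEON) && defined(__aarch64__)
    float32x4_t acc = vdupq_n_f32(0.0f);
    size_t i = 0;
    for (; i + 4 <= dim; i += 4) {
        const float32x4_t va = vld1q_f32(a + i);
        const float32x4_t vb = vld1q_f32(b + i);
        const float32x4_t d = vsubq_f32(va, vb);
        acc = vfmaq_f32(acc, d, d);  // acc += d * d (fused multiply-add)
    }
    float sum = vaddvq_f32(acc);     // horizontal add of the 4 lanes
    for (; i < dim; ++i) {           // scalar tail
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
#else
    // Not an AArch64 NEON target: fall back to the scalar loop.
    return L2SqrScalarSketch(a, b, dim);
#endif
}
```

An inner-product kernel would follow the same pattern, with `vfmaq_f32(acc, va, vb)` in place of the subtract-and-square step. An SVE variant would need a separate path guarded by `__ARM_FEATURE_SVE`, since its vector length is not fixed at compile time.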