Commit 90d5f9a

ENH: Convert arithm_fp from C universal intrinsics to C++ using Highway (#17)
* ENH, SIMD: Initial implementation of Highway wrapper

  A thin wrapper over Google's Highway SIMD library to simplify its interface.
  This commit provides the implementation of that wrapper, consisting of:

  - simd.hpp: Main header defining the SIMD namespaces and configuration
  - simd.inc.hpp: Template header included multiple times with different namespaces

  The wrapper eliminates Highway's class tags by:

  - Using lane types directly which can be deduced from arguments
  - Leveraging namespaces (np::simd and np::simd128) for different register widths

  A README is included to guide usage and document design decisions.

* SIMD: Update wrapper with improved docs and type support

  - Fix hardware/platform terminology in documentation for clarity
  - Add support for long double in template specializations
  - Add kMaxLanes constant to expose maximum vector width information
  - Follows clang formatting style for consistency with NumPy codebase.

* SIMD: Improve isolation and constexpr handling in wrapper

  - Add anonymous namespace around implementation to ensure each translation unit gets its own constants based on local flags
  - Use HWY_LANES_CONSTEXPR for Lanes function to ensure proper constexpr evaluation across platforms

* Update Highway submodule to latest master

* SIMD: Fix compile error by using MaxLanes instead of Lanes for array size

  Replace hn::Lanes(f64) with hn::MaxLanes(f64) when defining the index array size to fix error C2131: "expression did not evaluate to a constant".

  This error occurs because Lanes() isn't always constexpr compatible, especially with scalable vector extensions. MaxLanes() provides a compile-time constant value suitable for static array allocation and should be used with non-scalable SIMD extensions when defining fixed-size arrays.

* Convert arithm_fp to highway.

---------

Co-authored-by: Sayed Adel <seiko@imavr.com>
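
The reasoning behind the MaxLanes fix can be sketched without Highway itself. The names below (`kMaxVectorBytes`, `MaxLanesSketch`, `LanesSketch`, `SumLaneIndices`) are hypothetical stand-ins and not part of Highway or the NumPy wrapper; the 32-byte bound is an assumed AVX2-sized register. The sketch only illustrates the general rule that an array bound must be a compile-time constant, so the buffer is sized by the constexpr upper bound while the loop uses the run-time lane count.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical upper bound on the register width in bytes (assumed
// AVX2-sized here). This is an illustration, not a Highway constant.
constexpr std::size_t kMaxVectorBytes = 32;

// Compile-time upper bound on the lane count, playing the role that
// MaxLanes() plays in the commit message.
template <typename T>
constexpr std::size_t MaxLanesSketch() {
    return kMaxVectorBytes / sizeof(T);
}

// Lane count known only at run time, playing the role of a Lanes() that
// is not usable in a constant expression. Deliberately not constexpr.
template <typename T>
std::size_t LanesSketch(std::size_t runtime_vector_bytes) {
    return runtime_vector_bytes / sizeof(T);
}

// Fills an index buffer 0, 1, 2, ... for the active lanes and returns
// the sum of those indices.
inline std::size_t SumLaneIndices(std::size_t runtime_vector_bytes) {
    // Sizing this array with LanesSketch<double>(...) would be rejected,
    // because an array bound must be a constant expression.
    std::array<std::size_t, MaxLanesSketch<double>()> idx{};

    const std::size_t n = LanesSketch<double>(runtime_vector_bytes);
    assert(n <= idx.size());  // active lanes never exceed the bound

    std::size_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        idx[i] = i;
        sum += idx[i];
    }
    return sum;
}
```

Calling `SumLaneIndices(32)` uses all four `double` lanes of the assumed bound, while a smaller run-time width such as 16 bytes touches only the first two slots of the same fixed-size buffer.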
1 parent a25e0a2 commit 90d5f9a

3 files changed: +617 -653 lines changed
numpy/_core/meson.build

Lines changed: 2 additions & 1 deletion
@@ -902,13 +902,14 @@ umath_gen_headers = [
 foreach gen_mtargets : [
   [
     'loops_arithm_fp.dispatch.h',
-    src_file.process('src/umath/loops_arithm_fp.dispatch.c.src'),
+    src_file.process('src/umath/loops_arithm_fp.dispatch.cpp.src'),
     [
       [AVX2, FMA3], SSE2,
       ASIMD, NEON,
       VSX3, VSX2,
       VXE, VX,
       LSX,
+      RVV,
     ]
   ],
   [
