-
Notifications
You must be signed in to change notification settings - Fork 278
Enable compiling arm/neon with MSVC for windows on arm64 #612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Can you rebase your PR on the master branch please? This would make the diff easier to read (and would solve the conflicts). |
8e7a023
to
a82ed95
Compare
Yes sure |
The change might be a bit invasive. Let me know if you need help with the review. Another approach to support MSVC would be to wrap the vector types in a custom class so that all types like float32x4_t and int32x4_t can be distinguished by the dispatcher but it would require too many conversions from the wrapper class to the native class before passing onto intrinsic functions etc which can be done with the user-defined operator but I guess performance cost will be there. |
We cannot do such a change while not setting up CI for that platform + arch combination. i'll try to prepare that. |
#else | ||
#define XSIMD_WITH_NEON64 0 | ||
#endif | ||
#elif defined(_MSC_VER) && defined(_M_ARM64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a specific section for MSVC workarounds at the bottom. Also, from MSVC docs it is implied that NEON support is always available.
#elif defined(_MSC_VER) && defined(_M_ARM64) | |
#elif defined(_MSC_VER) | |
# if defined(_M_ARM64) | |
#define XSIMD_WITH_NEON64 1 | |
#else | |
#define XSIMD_WITH_NEON 1 | |
#endif |
This patch enables building arm/neon with MSVC compiler for windows on arm64 target and contains following changes,
Replace the dispatcher mechanism and use an explicit function selection using scalar types. This is required as MSVC intrinsics uses the same underlying type for multiple neon vector types and function selection using target vector type causes the wrong function to be called.
Add a function to convert
Initializer_list<batch<T>>
to neon vector type as there are no constructors provided for the same operation in MSVC.NEON/NEON64 identification using MSVC specific flags
Add a
_
to intrinsics wrapper functions. MSVC defines some intrinsics using macros and without the prefix the wrapper function names get replaced by the pre-processor.