Fix GCC 16 LTO build failure with multi-target dispatch #225
|
Hi @stratakis, thanks for reporting the issue and patching it up! Happy to merge this as soon as the CI passes. I am assuming NumPy needs an update after merging this one? |
Yes, NumPy bundles it, so it will need an update. |
|
Closing and reopening to trigger the CI runs. Not sure why they didn't find a runner. |
|
Also, something I'd like to point out is that this file has mixed Windows and Linux line endings (CRLF and LF). Working on Linux, it was a bit of a pain, as Git by default converts line endings on changes. Not sure if it's important, but if Linux is the main target, you might want to consider converting everything to LF? |
Yes, Linux is the main target and I do want to get that fixed in a separate PR. Feel free to add a commit to this PR, if you would like. Also, please rebase when #226 is merged in a few minutes. |
|
Please rebase to run the CI. |
Remove always_inline from STDSortComparator. This function is passed to std::sort, and GCC 16 rejects forced inlining when the caller and callee have mismatched target attributes during LTO. This occurs when the library is used via the header-only static include with multiple dispatch targets (e.g. AVX-512 and baseline) linked together, as NumPy does. Standalone builds are unaffected since they use -march=skylake-avx512 for the entire translation unit.
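To illustrate the situation described in the commit message, here is a minimal sketch of a comparator that is handed to std::sort from two functions compiled for different targets. All names (`SKETCH_TARGET_AVX2`, `sketch_less`, `sketch_sort_avx2`, `sketch_sort_baseline`, `sketch_sort`) are hypothetical and are not taken from the library; AVX2 stands in for AVX-512 only so the sketch runs on more machines. The comparator is plain `inline`, which mirrors the idea of the fix: the compiler may still inline it, but it is no longer required to, so a target mismatch between caller and callee should not turn into a hard error.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical macros for this sketch; the real library uses its own.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_HAS_X86_TARGETS 1
#define SKETCH_TARGET_AVX2 __attribute__((target("avx2")))
#else
#define SKETCH_TARGET_AVX2
#endif

// Comparator passed to std::sort. Before a fix of this kind it would carry
// __attribute__((always_inline)); here it is only `inline`, so the compiler
// may fall back to an ordinary call when inlining is not possible.
template <typename T>
inline bool sketch_less(const T &a, const T &b)
{
    return a < b;
}

// Dispatch target compiled with AVX2 enabled for this function only.
SKETCH_TARGET_AVX2
void sketch_sort_avx2(std::vector<int32_t> &v)
{
    std::sort(v.begin(), v.end(), sketch_less<int32_t>);
}

// Baseline target compiled with the default flags of the translation unit.
void sketch_sort_baseline(std::vector<int32_t> &v)
{
    std::sort(v.begin(), v.end(), sketch_less<int32_t>);
}

// Simplified runtime dispatch: use the AVX2 variant only if the CPU has it.
void sketch_sort(std::vector<int32_t> &v)
{
#if defined(SKETCH_HAS_X86_TARGETS)
    if (__builtin_cpu_supports("avx2")) {
        sketch_sort_avx2(v);
        return;
    }
#endif
    sketch_sort_baseline(v);
}
```

Calling `sketch_sort` on a `std::vector<int32_t>` sorts it in ascending order through whichever variant the CPU supports. This sketch is not a reproducer for the GCC 16 failure; it only shows the shape of the code, where one comparator is reachable from functions with different target attributes and both variants end up linked into the same binary.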
cb42dd6 to 4273453
|
Thanks @stratakis! |
|
Fixes: #224