[SYCL][NATIVECPU] Materialize floating point atomics in LLVM pass #15888
What about fetch_sub? I can see the following in SYCL spec for atomic_ref:
Floating fetch_sub(Floating operand,
memory_order order = default_read_modify_write_order,
memory_scope scope = default_scope) const noexcept;
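For readers unfamiliar with the semantics behind that signature, here is a minimal sketch in standard C++, not the DPC++ implementation. It uses `std::atomic<float>` as a stand-in for `sycl::atomic_ref`, and both helper names (`sketch_fetch_add`, `sketch_fetch_sub`) are hypothetical. It assumes one possible approach: expressing `fetch_sub` as a `fetch_add` of the negated operand, with `fetch_add` emulated by a compare-exchange loop.

```cpp
#include <atomic>

// Hypothetical helper: emulate a floating-point fetch_add with a CAS loop.
// Returns the value held before the addition, like atomic_ref::fetch_add.
inline float sketch_fetch_add(std::atomic<float> &obj, float operand,
                              std::memory_order order = std::memory_order_seq_cst) {
  float expected = obj.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak reloads `expected` with the current value,
  // so the sum is recomputed on every retry.
  while (!obj.compare_exchange_weak(expected, expected + operand, order,
                                    std::memory_order_relaxed)) {
  }
  return expected;
}

// Hypothetical helper: fetch_sub expressed as fetch_add of the negated operand.
inline float sketch_fetch_sub(std::atomic<float> &obj, float operand,
                              std::memory_order order = std::memory_order_seq_cst) {
  return sketch_fetch_add(obj, -operand, order);
}
```

Calling `sketch_fetch_sub(x, 2.5f)` on an atomic holding `10.0f` returns the old value and leaves `7.5f` stored.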
Thanks @maarquitos14, it appears to be implemented as
LGTM.
Would it be possible to add an e2e test to also check that we get the expected result?
  if (DeviceTriple.isNVPTX() || DeviceTriple.isAMDGPU() ||
      (DeviceTriple.isSPIR() &&
-      DeviceSubArch != llvm::Triple::SPIRSubArch_fpga))
+      DeviceSubArch != llvm::Triple::SPIRSubArch_fpga) ||
Can you add a FE test? I assume there is already a FE test checking the existence of this macro for other targets. You can just add a RUN there.
Added it, thank you
There are already |
FE LGTM. Thanks!
@intel/llvm-gatekeepers, this looks ready to merge, thank you
This PR sets `SYCL_USE_NATIVE_FP_ATOMICS` when compiling for Native CPU, and provides an implementation for said atomics via an LLVM pass that defines them through `atomicrmw` instructions.
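To illustrate what a macro like this typically gates, here is a hedged sketch in standard C++, not code from this PR. The helper name `float_fetch_add` is hypothetical, and `std::atomic<float>` stands in for the SYCL atomic types. When `SYCL_USE_NATIVE_FP_ATOMICS` is defined (and the standard library advertises floating-point atomics via `__cpp_lib_atomic_float`), the helper issues a single read-modify-write and leaves lowering to the compiler; otherwise it falls back to a compare-exchange loop.

```cpp
#include <atomic>

// Hypothetical helper showing a macro-gated choice between a native
// floating-point atomic add and a compare-exchange emulation.
inline float float_fetch_add(std::atomic<float> &obj, float operand,
                             std::memory_order order = std::memory_order_seq_cst) {
#if defined(SYCL_USE_NATIVE_FP_ATOMICS) && defined(__cpp_lib_atomic_float)
  // Native path: one read-modify-write operation (C++20 std::atomic<float>).
  return obj.fetch_add(operand, order);
#else
  // Fallback path: retry until no other thread changed the value in between.
  float expected = obj.load(std::memory_order_relaxed);
  while (!obj.compare_exchange_weak(expected, expected + operand, order,
                                    std::memory_order_relaxed)) {
  }
  return expected;
#endif
}
```

Both branches return the value held before the addition, so callers see the same result whichever path is compiled in.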