Skip to content
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion libc/shared/rpc_util.h
Original file line number Diff line number Diff line change
Expand Up @@ -157,7 +157,7 @@ RPC_ATTRS void sleep_briefly() {
asm("nanosleep.u32 64;" ::: "memory");
#elif defined(__AMDGPU__) && defined(RPC_TARGET_IS_GPU)
__builtin_amdgcn_s_sleep(2);
#elif __has_builtin(__builtin_ia32_pause)
#elif __has_builtin(__builtin_ia32_pause) && !defined(__SPIRV__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? This seems like a massive bug that needs to be fixed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me debug more into it, will reply here when I have a better answer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is showing up it means that the x64 builtins are being emitted, which is very bad. I'm going to guess something got crossed while defining the target info.

Copy link
Member Author

@sarnex sarnex Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I debugged into it quickly before and found the problem was that it seems there's some map in the preprocessor between builtins and features those builtins require, and that feature map was empty and when it's empty __has_builtin returns true, but I don't actually know what that feature map repesents or why it's empty, will do a real investigation on this which I should have done to begin with :P

Copy link
Member Author

@sarnex sarnex Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turned out to be pretty annoying, I have a draft PR here that attempts to fix it, will mark it ready for review once CI passes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's okay to have this target-check workaround until we resolve the discussion on the __has_builtin PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way you have it not just implicitly assumes the host is x64. If we're dedicated to working around that then you'd just need to put an empty ifdef at the top.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, I just pushed a commit that hopefully fixes what you meant. If you feel strongly about not doing the workaround I'm happy to wait, but since IMO it's relatively simple and self-contained I was thinking we could not block the PR with that.

__builtin_ia32_pause();
#elif __has_builtin(__builtin_arm_isb)
__builtin_arm_isb(0xf);
Expand Down
3 changes: 3 additions & 0 deletions offload/DeviceRTL/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -264,6 +264,9 @@ compileDeviceRTLLibrary(amdgpu amdgcn-amd-amdhsa -Xclang -mcode-object-version=n
add_custom_target(omptarget.devicertl.nvptx)
compileDeviceRTLLibrary(nvptx nvptx64-nvidia-cuda --cuda-feature=+ptx63)

add_custom_target(omptarget.devicertl.spirv64)
compileDeviceRTLLibrary(spirv64 spirv64)

# Archive all the object files generated above into a static library
add_library(omptarget.devicertl STATIC)
set_target_properties(omptarget.devicertl PROPERTIES
Expand Down
Loading