@sarnex sarnex commented Sep 9, 2025

It seems that for whatever reason we must:

  1. Declare aux builtins when compiling for an offload device
    and
  2. Define the aux builtin target macros when compiling for an offload device.

In cpuid.h we try to define __cpuidex if it is not already defined. Given the above, when the aux-triple is X86 the function is already declared as a builtin in the compiler, and we can't rely on the X86 macros being undefined during the device compile, so the definition in the header conflicts with the builtin.

Previously a workaround was added for NVPTX in #152556; extend it to the other offloading targets.
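The guard described above can be sketched in isolation. This is a minimal illustration, not the actual contents of cpuid.h: `sketch_cpuidex` and `SKETCH_HAS_FALLBACK` are hypothetical names, and the function body is a placeholder for the real forwarding call to `__cpuid_count`. Only the three device macros come from the PR.

```c
/* Sketch only: sketch_cpuidex and SKETCH_HAS_FALLBACK are hypothetical
 * names used for illustration; they are not part of cpuid.h. */

/* The fallback is provided only when none of the offload device macros
 * checked in this PR is defined. */
#if !defined(__NVPTX__) && !defined(__AMDGPU__) && !defined(__SPIRV__)
#define SKETCH_HAS_FALLBACK 1
static inline void sketch_cpuidex(int info[4], int leaf, int subleaf) {
  /* Placeholder body (assumption): the real header forwards to
   * __cpuid_count here. Echoing the inputs keeps the sketch portable. */
  info[0] = leaf;
  info[1] = subleaf;
  info[2] = 0;
  info[3] = 0;
}
#else
/* Device compile: define nothing, so the header cannot clash with a
 * builtin that the compiler declared for the aux triple. */
#define SKETCH_HAS_FALLBACK 0
#endif
```

On an ordinary host compile none of the device macros is defined, so the fallback is emitted; on an NVPTX, AMDGPU, or SPIR-V device compile the whole definition is skipped.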

 // builtin. Given __has_builtin does not detect builtins on aux triples, we need
 // to explicitly check for some offloading cases.
-#ifndef __NVPTX__
+#if !defined(__NVPTX__) && !defined(__AMDGPU__) && !defined(__SPIRV__)

Another option is to add a TARGET_OFFLOADING macro or something similar and check that instead. Let me know what you prefer.
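A rough sketch of what that alternative might look like, assuming the macro would be derived from the same three device macros. `SKETCH_TARGET_OFFLOADING` and `sketch_is_offload_device` are hypothetical names; no such macro exists in Clang as far as this PR shows, and where it would be defined (compiler or header) is left open.

```c
/* Hypothetical: a single macro summarizing "this is an offload device
 * compile", so headers test one name instead of listing every target. */
#if defined(__NVPTX__) || defined(__AMDGPU__) || defined(__SPIRV__)
#define SKETCH_TARGET_OFFLOADING 1
#else
#define SKETCH_TARGET_OFFLOADING 0
#endif

/* Small helper (also hypothetical) that exposes the macro's value. */
static inline int sketch_is_offload_device(void) {
  return SKETCH_TARGET_OFFLOADING;
}
```

With this shape, adding a new offloading target would mean updating one macro definition rather than every guard that lists the targets explicitly.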

@sarnex sarnex marked this pull request as ready for review September 10, 2025 14:49
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics labels Sep 10, 2025

llvmbot commented Sep 10, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: Nick Sarnie (sarnex)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/157741.diff

2 Files Affected:

  • (modified) clang/lib/Headers/cpuid.h (+1-1)
  • (modified) clang/test/Headers/__cpuidex_conflict.c (+3)
diff --git a/clang/lib/Headers/cpuid.h b/clang/lib/Headers/cpuid.h
index ce8c79e77dc18..45700c635831d 100644
--- a/clang/lib/Headers/cpuid.h
+++ b/clang/lib/Headers/cpuid.h
@@ -348,7 +348,7 @@ static __inline int __get_cpuid_count (unsigned int __leaf,
 // In some cases, offloading will set the host as the aux triple and define the
 // builtin. Given __has_builtin does not detect builtins on aux triples, we need
 // to explicitly check for some offloading cases.
-#ifndef __NVPTX__
+#if !defined(__NVPTX__) && !defined(__AMDGPU__) && !defined(__SPIRV__)
 static __inline void __cpuidex(int __cpu_info[4], int __leaf, int __subleaf) {
   __cpuid_count(__leaf, __subleaf, __cpu_info[0], __cpu_info[1], __cpu_info[2],
                 __cpu_info[3]);
diff --git a/clang/test/Headers/__cpuidex_conflict.c b/clang/test/Headers/__cpuidex_conflict.c
index 67f2a0cf908e5..a928aa895c44d 100644
--- a/clang/test/Headers/__cpuidex_conflict.c
+++ b/clang/test/Headers/__cpuidex_conflict.c
@@ -6,6 +6,9 @@
 // Ensure that we do not run into conflicts when offloading.
 // RUN: %clang_cc1 %s -DIS_STATIC=static -ffreestanding -fopenmp -fopenmp-is-target-device -aux-triple x86_64-unknown-linux-gnu
 // RUN: %clang_cc1 -DIS_STATIC="" -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fcuda-is-device -x cuda %s -o -
+// RUN: %clang_cc1 -DIS_STATIC="" -triple amdgcn-amd-amdhsa -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fcuda-is-device -x cuda %s -o -
+// RUN: %clang_cc1 -DIS_STATIC="" -triple spirv64 -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fcuda-is-device -x cuda %s -o -
+// RUN: %clang_cc1 -DIS_STATIC="" -triple spirv64 -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fsycl-is-device %s -o -
 
 typedef __SIZE_TYPE__ size_t;
 

@boomanaiden154 boomanaiden154 left a comment

Seems reasonable enough to me. I might try a more principled fix in clang/lib/Basic/Builtins.cpp again at some point, but that needs some more thinking and this works well for now.


sarnex commented Sep 10, 2025

Thanks Aiden. Yeah, I think the root issue here is just how we handle builtins for offloading targets; what we are doing seems very unexpected to me. If I make it do what I expect, I see <20 lit failures, so it may be possible to rework it so that it makes sense, but that requires further investigation.

@sarnex sarnex merged commit 6ff97d0 into llvm:main Sep 10, 2025
15 checks passed