-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code #91478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-clang-driver Author: Fabian Ritter (ritter-x2a) ChangesThe This patch redefines them during host compilation to issue a deprecation warning if the macros are used in host code. Their value during host compilation in actual HIP code as well as in preprocessing directives stays 64 as before. Macro uses in preprocessing directives are not diagnosed. Macro uses in device code are not affected. In a later step, after a deprecation period, we can easily adjust this implementation so that macro uses in host code cause hard errors instead of warnings. Considered Alternatives:
Implementation Rationale:
Implements SWDEV-449015. Full diff: https://github.com/llvm/llvm-project/pull/91478.diff 5 Files Affected:
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index 07965b487ea79..587aa19349d89 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -550,6 +550,17 @@ void RocmInstallationDetector::AddHIPIncludeArgs(const ArgList &DriverArgs,
CC1Args.push_back(DriverArgs.MakeArgString(P));
}
+ {
+ // This header implements diagnostics for problematic uses of
+ // device-specific macros. Since these diagnostics should be issued even
+ // when GPU headers are not included, this header is included separately.
+ SmallString<128> P(D.ResourceDir);
+ llvm::sys::path::append(P, "include");
+ CC1Args.push_back("-internal-isystem");
+ CC1Args.push_back(DriverArgs.MakeArgString(P));
+ CC1Args.append({"-include", "__clang_hip_device_macro_guards.h"});
+ }
+
const auto HandleHipStdPar = [=, &DriverArgs, &CC1Args]() {
StringRef Inc = getIncludePath();
auto &FS = D.getVFS();
diff --git a/clang/lib/Headers/CMakeLists.txt b/clang/lib/Headers/CMakeLists.txt
index 5f02c71f6ca51..31f1a73fee66a 100644
--- a/clang/lib/Headers/CMakeLists.txt
+++ b/clang/lib/Headers/CMakeLists.txt
@@ -79,6 +79,7 @@ set(hip_files
__clang_hip_math.h
__clang_hip_stdlib.h
__clang_hip_runtime_wrapper.h
+ __clang_hip_device_macro_guards.h
)
set(hlsl_h
diff --git a/clang/lib/Headers/__clang_hip_device_macro_guards.h b/clang/lib/Headers/__clang_hip_device_macro_guards.h
new file mode 100644
index 0000000000000..42782c9bb08a7
--- /dev/null
+++ b/clang/lib/Headers/__clang_hip_device_macro_guards.h
@@ -0,0 +1,55 @@
+/*===---- __clang_hip_device_macro_guards.h - guards for HIP device macros -===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ *===-----------------------------------------------------------------------===
+ */
+
+/*
+ * WARNING: This header is intended to be directly -include'd by
+ * the compiler and is not supposed to be included by users.
+ *
+ */
+
+#ifndef __CLANG_HIP_DEVICE_MACRO_GUARDS_H__
+#define __CLANG_HIP_DEVICE_MACRO_GUARDS_H__
+
+#if __HIP__
+#if !defined(__HIP_DEVICE_COMPILE__)
+// The __AMDGCN_WAVEFRONT_SIZE macros cannot hold meaningful values during host
+// compilation as devices are not initialized when the macros are defined and
+// there may indeed be devices with differing wavefront sizes in the same
+// system. This code issues diagnostics when the macros are used in host code.
+
+#undef __AMDGCN_WAVEFRONT_SIZE
+#undef __AMDGCN_WAVEFRONT_SIZE__
+
+// Reference __hip_device_macro_guard in a way that is legal in preprocessor
+// directives and does not affect the value so that appropriate diagnostics are
+// issued. Function calls, casts, or the comma operator would make the macro
+// illegal for use in preprocessor directives.
+#define __AMDGCN_WAVEFRONT_SIZE (!__hip_device_macro_guard ? 64 : 64)
+#define __AMDGCN_WAVEFRONT_SIZE__ (!__hip_device_macro_guard ? 64 : 64)
+
+// This function is referenced by the macro in device functions during host
+// compilation, it SHOULD NOT cause a diagnostic.
+__attribute__((device)) static constexpr int __hip_device_macro_guard(void) {
+ return -1;
+}
+
+// This function is referenced by the macro in host functions during host
+// compilation, it SHOULD cause a diagnostic.
+__attribute__((
+ host, deprecated("The __AMDGCN_WAVEFRONT_SIZE macros do not correspond "
+ "to the device(s) when used in host code and may only "
+ "be used in device code."))) static constexpr int
+__hip_device_macro_guard(void) {
+ return -1;
+}
+// TODO Change "deprecated" to "unavailable" to cause hard errors instead of
+// warnings.
+#endif
+#endif // __HIP__
+#endif // __CLANG_HIP_DEVICE_MACRO_GUARDS_H__
diff --git a/clang/test/Driver/hip-wavefront-size-host-diagnostics.hip b/clang/test/Driver/hip-wavefront-size-host-diagnostics.hip
new file mode 100644
index 0000000000000..e0ee44cdc2986
--- /dev/null
+++ b/clang/test/Driver/hip-wavefront-size-host-diagnostics.hip
@@ -0,0 +1,52 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang -xhip --offload-arch=gfx1030 --offload-host-only -pedantic -nogpuinc -nogpulib -nobuiltininc -nostdinc -fsyntax-only -Xclang -verify=onhost %s
+// RUN: %clang -xhip --offload-arch=gfx1030 --offload-device-only -pedantic -nogpuinc -nogpulib -nobuiltininc -nostdinc -fsyntax-only -Xclang -verify=ondevice %s
+
+// ondevice-no-diagnostics
+
+#define WRAPPED __AMDGCN_WAVEFRONT_SIZE__
+
+__attribute__((host, device)) void use(int, const char*);
+
+template<int N> __attribute__((host, device)) int templatify(int x) {
+ return x + N;
+}
+
+// no warning expected
+#if defined(__HIP_DEVICE_COMPILE__) && (__AMDGCN_WAVEFRONT_SIZE__ == 64) && (__AMDGCN_WAVEFRONT_SIZE == 64)
+int foo(void);
+#endif
+
+// no warning expected
+__attribute__((device)) int device_var = __AMDGCN_WAVEFRONT_SIZE__;
+
+__attribute__((device))
+void device_fun() {
+ // no warnings expected
+ use(__AMDGCN_WAVEFRONT_SIZE, "device function");
+ use(__AMDGCN_WAVEFRONT_SIZE__, "device function");
+ use(WRAPPED, "device function");
+ use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "device function");
+}
+
+// warning expected
+int host_var = __AMDGCN_WAVEFRONT_SIZE__; // onhost-warning {{'__hip_device_macro_guard' is deprecated: The __AMDGCN_WAVEFRONT_SIZE macros do not correspond to the device(s) when used in host code and may only be used in device code.}}
+
+__attribute__((host))
+void host_fun() {
+ // warnings expected
+ use(__AMDGCN_WAVEFRONT_SIZE, "host function"); // onhost-warning {{'__hip_device_macro_guard' is deprecated: The __AMDGCN_WAVEFRONT_SIZE macros do not correspond to the device(s) when used in host code and may only be used in device code.}}
+ use(__AMDGCN_WAVEFRONT_SIZE__, "host function"); // onhost-warning {{'__hip_device_macro_guard' is deprecated: The __AMDGCN_WAVEFRONT_SIZE macros do not correspond to the device(s) when used in host code and may only be used in device code.}}
+ use(WRAPPED, "host function"); // onhost-warning {{'__hip_device_macro_guard' is deprecated: The __AMDGCN_WAVEFRONT_SIZE macros do not correspond to the device(s) when used in host code and may only be used in device code.}}
+ use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "host function"); // onhost-warning {{'__hip_device_macro_guard' is deprecated: The __AMDGCN_WAVEFRONT_SIZE macros do not correspond to the device(s) when used in host code and may only be used in device code.}}
+}
+
+__attribute((host, device))
+void host_device_fun() {
+ // warnings expected
+ use(__AMDGCN_WAVEFRONT_SIZE__, "host device function"); // onhost-warning {{'__hip_device_macro_guard' is deprecated: The __AMDGCN_WAVEFRONT_SIZE macros do not correspond to the device(s) when used in host code and may only be used in device code.}}
+ use(WRAPPED, "host device function"); // onhost-warning {{'__hip_device_macro_guard' is deprecated: The __AMDGCN_WAVEFRONT_SIZE macros do not correspond to the device(s) when used in host code and may only be used in device code.}}
+ use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "host device function"); // onhost-warning {{'__hip_device_macro_guard' is deprecated: The __AMDGCN_WAVEFRONT_SIZE macros do not correspond to the device(s) when used in host code and may only be used in device code.}}
+}
+
+// onhost-note@__clang_hip_device_macro_guards.h:45 0+ {{'__hip_device_macro_guard' has been explicitly marked deprecated here}}
diff --git a/clang/test/Preprocessor/predefined-arch-macros.c b/clang/test/Preprocessor/predefined-arch-macros.c
index ca51f2fc22c51..ee3e26f203964 100644
--- a/clang/test/Preprocessor/predefined-arch-macros.c
+++ b/clang/test/Preprocessor/predefined-arch-macros.c
@@ -4340,7 +4340,6 @@
// RUN: %clang -x hip -E -dM %s -o - 2>&1 --offload-host-only -nogpulib \
// RUN: -nogpuinc --offload-arch=gfx803 -target x86_64-unknown-linux \
// RUN: | FileCheck -match-full-lines %s -check-prefixes=CHECK_HIP_HOST
-// CHECK_HIP_HOST: #define __AMDGCN_WAVEFRONT_SIZE__ 64
// CHECK_HIP_HOST: #define __AMDGPU__ 1
// CHECK_HIP_HOST: #define __AMD__ 1
|
AlexVlx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine to me (I do hate that macro being defined on the host though, so I am biased). Thanks!
| __attribute__(( | ||
| host, deprecated("The __AMDGCN_WAVEFRONT_SIZE macros do not correspond " | ||
| "to the device(s) when used in host code and may only " | ||
| "be used in device code."))) static constexpr int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I saw some junk trying to support pre-C++11 HIP, is that a concern here?
Is this macro defined in OpenMP? If so can we do the same thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re pre-C++11 HIP: I think we can just drop the constexpr from both variants of the guard function; since the guard function is only referenced and never called, the macros would still work as constant expressions.
re OpenMP: As far as I can see in experiments, the macros are not defined during OpenMP's host compilation. This is therefore not an issue for OpenMP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-C++11 HIP shouldn't be a concern anymore.
scchan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…specifiers
So far, the resolution of host/device overloads for functions in HIP/CUDA
operates as if in a host-device context for code outside of function bodies,
e.g., in expressions that are part of template arguments in top-level
declarations. This means that, if separate host and device overloads are
declared, the device overload is used in the device compilation phase and the
host overload is used in the host compilation phase.
This patch changes overload resolution in such cases to prefer overloads that
match the target of the declaration in which they occur. For example:
__device__ constexpr int get_n() { return 64; }
__host__ constexpr int get_n() { return -1; }
__device__ std::enable_if<(get_n() > 32)>::type foo() { }
Before, this code would not compile, because get_n resolved to the host
overload during host compilation, causing an error. With this patch, the call
to get_n in the declaration of the device function foo resolves to the device
overload in host and device compilation.
If attributes that affect the declaration's target occur after a call with
target-dependent overload resolution, a warning is issued. This is realized by
registering the Kinds of relevant attributes in the CUDATargetContext when they
are parsed.
This is an alternative to PR llvm#93546, which is required for PR llvm#91478.
…ithout relying on target-dependent overload resolution The __AMDGCN_WAVEFRONT_SIZE and __AMDGCN_WAVEFRONT_SIZE__ macros in HIP can only provide meaningful values during device compilation. They are currently usable in host code, but only contain the default value of 64, independent of the target device(s). This patch checks for numeric literals in clearly identifiable host code if they are the result of expanding the wavefront-size macros and issues a diagnostic if that's the case. A alternative PR, llvm#91478, relied on constexpr functions with host and device overloads (where the host overload is marked as deprecated) to diagnose uses of these macros in host code. A problem with this approach are uses of the macros outside of function bodies, e.g., in template arguments of return types, or default template arguments of functions. In these cases, calls to functions with target overloads are resolved to the host variant during host compilation and to the device variant during device compilation - independently of the target of the function they belong to. Therefore, using the wavefront size macros in such cases leads to diagnostics during host compilation with llvm#91478, even if they are only associated to a device function. PR llvm#93546 is a proposal to suppress these spurious diagnostics. PR llvm#103031 is a proposal to change the behavior of target-dependent overload resolution outside of function bodies to use the target attributes that occur before the overloaded call to select the overload candidate. In contrast to llvm#91478, this PR will not diagnose uses of the wavefront-size macros outside of function bodies or initializers of global host variables. Implements SWDEV-449015.
The __AMDGCN_WAVEFRONT_SIZE and __AMDGCN_WAVEFRONT_SIZE__ macros in HIP can only provide meaningful values during device compilation. They are currently usable in host code, but only contain the default value of 64, independent of the target device(s). This patch redefines them during host compilation to issue a deprecation warning if the macros are used in host code. Their value during host compilation in actual HIP code as well as in preprocessing directives stays 64 as before. Macro uses in preprocessing directives are not diagnosed. Macro uses in device code are not affected. Implements SWDEV-449015.
8743f8a to
4145231
Compare
|
Closing this PR in favor of a more comprehensive treatment of the AMDGCN_WAVEFRONT_SIZE situation. |
The
__AMDGCN_WAVEFRONT_SIZEand__AMDGCN_WAVEFRONT_SIZE__macros in HIP can only provide meaningful values during device compilation. They are currently usable in host code, but only contain the default value of 64, independent of the target device(s).This patch redefines them during host compilation to issue a deprecation warning if the macros are used in host code. Their value during host compilation in actual HIP code as well as in preprocessing directives stays 64 as before. Macro uses in preprocessing directives are not diagnosed. Macro uses in device code are not affected.
In a later step, after a deprecation period, we can easily adjust this implementation so that macro uses in host code cause hard errors instead of warnings.
Considered Alternatives:
Introducing a specialized diagnostic during clang's semantic analysis:
This is technically possible and allows for cleaner diagnostics, but requires HIP-specific special case handling in clang's very general
Sema::ActOnNumericConstant(...)method, since these macros appear as integer literals during parsing/semantic analysis where we know if we are in a host function. In comparison, this PR introduces less complexity to code that isindependent from HIP.
See also the previous rejected proposal, which eliminates the macros for host compilation: [clang][AMDGPU] Don't define feature macros on host code #83558
Implementation Rationale:
-nogpuinc,-nobuiltininc, and/or-nostdincCLI flags are provided, enabling consistent diagnostics with any combination of these flags. I am open to suggestions for better solutions.AMDGCN_WAVEFRONT_SIZEmacros are commonly used in preprocessing directives for conditional includes. The defined expression is carefully crafted to not break this use case:Implements SWDEV-449015.