Closed as not planned
Description
#include <hip/hip_runtime.h>
#include <stdio.h>
#ifdef __HIP_DEVICE_COMPILE__
#define HD __host__ __device__
#else
#define HD
#endif
HD void foo() {
    printf("execute foo\n");
}
__global__ void kernel() {
    foo();
}
int main() {
    kernel<<<1,1>>>();
    hipDeviceSynchronize();
}

This code sample fails to compile with clang when compiling for the host target:
hip_macro.cpp:19:5: error: no matching function for call to 'foo'
foo();
^~~
hip_macro.cpp:10:9: note: candidate function not viable: call to __host__ function from __global__ function
HD void foo(){

The equivalent CUDA sample, using __CUDA_ARCH__:
#include <stdio.h>
#ifdef __CUDA_ARCH__
#define HD __host__ __device__
#else
#define HD
#endif
HD void foo() {
    printf("execute foo\n");
}
__global__ void kernel() {
    foo();
}
int main() {
    kernel<<<1,1>>>();
    cudaDeviceSynchronize();
}

This code sample compiles fine with nvcc.
Why does clang parse the device function again when compiling for the host target, and then report an error for a device function whose device code has already been generated? That does not make sense.
Real world problem:
Some header files contain something like:
#ifdef __CUDA_ARCH__
#define HD __host__ __device__
#else
#define HD
#endif

When these header files are included in a CUDA source file, clang cannot handle them correctly. I know that guarding on __CUDACC__ instead works, but __CUDA_ARCH__ should work too. This problem is caused by the design of the compiler, not by the language standard.
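For reference, here is a minimal sketch of the __CUDACC__-based guard mentioned above. It assumes that __CUDACC__ (or __HIPCC__ for HIP) is defined in both the host and the device pass, so HD expands the same way in each. The int return value of foo is hypothetical and only there so the sketch can be checked on the host; the original foo returns void.

```cpp
#include <stdio.h>

// Assumption: __CUDACC__ / __HIPCC__ are defined for the whole CUDA/HIP
// compilation (host pass and device pass), unlike __CUDA_ARCH__, which is
// only defined while device code is being compiled.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define HD __host__ __device__
#else
#define HD  // plain C++ compiler: the attributes do not exist
#endif

// foo is now __host__ __device__ in both passes, so a call from a
// __global__ function is accepted in the host pass as well.
// The return value is hypothetical, added only for checking.
HD inline int foo() {
    printf("execute foo\n");
    return 0;
}

#if defined(__CUDACC__) || defined(__HIPCC__)
__global__ void kernel() {
    foo();
}
#endif
```

With this guard the same header also compiles as plain C++, where HD expands to nothing and foo is an ordinary host function.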