Skip to content

Conversation

@tcgu-amd
Copy link

Addresses ROCm/ROCm#5253.

The builtin expects a vector _Float16 of length 2, but clang does not emit errors when the wrong argument types is supplied (e.g. half2). This causes the compilation to pass but error during LTO.

This fix the issue by adding sema checks to the builtins.

@github-actions
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 11, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 11, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Tim Gu (tcgu-amd)

Changes

Addresses ROCm/ROCm#5253.

The builtin expects a vector _Float16 of length 2, but clang does not emit errors when the wrong argument types is supplied (e.g. half2). This causes the compilation to pass but error during LTO.

This fix the issue by adding sema checks to the builtins.


Full diff: https://github.com/llvm/llvm-project/pull/158145.diff

2 Files Affected:

  • (modified) clang/include/clang/Sema/SemaAMDGPU.h (+2)
  • (modified) clang/lib/Sema/SemaAMDGPU.cpp (+32)
diff --git a/clang/include/clang/Sema/SemaAMDGPU.h b/clang/include/clang/Sema/SemaAMDGPU.h
index bac812a9d4fcf..9ca4418349fff 100644
--- a/clang/include/clang/Sema/SemaAMDGPU.h
+++ b/clang/include/clang/Sema/SemaAMDGPU.h
@@ -31,6 +31,8 @@ class SemaAMDGPU : public SemaBase {
   bool checkMovDPPFunctionCall(CallExpr *TheCall, unsigned NumArgs,
                                unsigned NumDataArgs);
 
+  bool checkAMDGCNAtomicFaddV2F16Type(CallExpr *TheCall);
+
   /// Create an AMDGPUWavesPerEUAttr attribute.
   AMDGPUFlatWorkGroupSizeAttr *
   CreateAMDGPUFlatWorkGroupSizeAttr(const AttributeCommonInfo &CI, Expr *Min,
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index baba503239e9f..c0f17f185982e 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -109,6 +109,10 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned BuiltinID,
   case AMDGPU::BI__builtin_amdgcn_cooperative_atomic_store_16x8B:
   case AMDGPU::BI__builtin_amdgcn_cooperative_atomic_store_8x16B:
     return checkCoopAtomicFunctionCall(TheCall, /*IsStore=*/true);
+  case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16:
+  case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16:
+  case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2f16:
+    return checkAMDGCNAtomicFaddV2F16Type(TheCall);
   default:
     return false;
   }
@@ -436,4 +440,32 @@ void SemaAMDGPU::handleAMDGPUMaxNumWorkGroupsAttr(Decl *D,
   addAMDGPUMaxNumWorkGroupsAttr(D, AL, AL.getArgAsExpr(0), YExpr, ZExpr);
 }
 
+
+bool SemaAMDGPU::checkAMDGCNAtomicFaddV2F16Type(CallExpr *TheCall) {
+  // Check that the pointer argument is a pointer to v2f16
+
+  Expr *Arg = TheCall->getArg(1);
+  QualType ArgType = Arg->getType();
+
+  // Check if it's a vector type
+  if (!ArgType->isVectorType()) {
+    Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
+        << "expected _Float16 vector of length 2" << ArgType
+        << Arg->getSourceRange();
+    return true;
+  }
+
+  const VectorType *VT = ArgType->getAs<VectorType>();
+
+  // Check element type (should be _Float16) and vector length (should be 2)
+  QualType ElementType = VT->getElementType();
+  if (!ElementType->isFloat16Type() || VT->getNumElements() != 2) {
+    Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
+        << "expected _Float16 vector of length 2" << ArgType
+        << Arg->getSourceRange();
+    return true;
+  }
+
+  return false;
+}
 } // namespace clang

@llvmbot
Copy link
Member

llvmbot commented Sep 11, 2025

@llvm/pr-subscribers-clang

Author: Tim Gu (tcgu-amd)

Changes

Addresses ROCm/ROCm#5253.

The builtin expects a vector _Float16 of length 2, but clang does not emit errors when the wrong argument types is supplied (e.g. half2). This causes the compilation to pass but error during LTO.

This fix the issue by adding sema checks to the builtins.


Full diff: https://github.com/llvm/llvm-project/pull/158145.diff

2 Files Affected:

  • (modified) clang/include/clang/Sema/SemaAMDGPU.h (+2)
  • (modified) clang/lib/Sema/SemaAMDGPU.cpp (+32)
diff --git a/clang/include/clang/Sema/SemaAMDGPU.h b/clang/include/clang/Sema/SemaAMDGPU.h
index bac812a9d4fcf..9ca4418349fff 100644
--- a/clang/include/clang/Sema/SemaAMDGPU.h
+++ b/clang/include/clang/Sema/SemaAMDGPU.h
@@ -31,6 +31,8 @@ class SemaAMDGPU : public SemaBase {
   bool checkMovDPPFunctionCall(CallExpr *TheCall, unsigned NumArgs,
                                unsigned NumDataArgs);
 
+  bool checkAMDGCNAtomicFaddV2F16Type(CallExpr *TheCall);
+
   /// Create an AMDGPUWavesPerEUAttr attribute.
   AMDGPUFlatWorkGroupSizeAttr *
   CreateAMDGPUFlatWorkGroupSizeAttr(const AttributeCommonInfo &CI, Expr *Min,
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index baba503239e9f..c0f17f185982e 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -109,6 +109,10 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned BuiltinID,
   case AMDGPU::BI__builtin_amdgcn_cooperative_atomic_store_16x8B:
   case AMDGPU::BI__builtin_amdgcn_cooperative_atomic_store_8x16B:
     return checkCoopAtomicFunctionCall(TheCall, /*IsStore=*/true);
+  case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16:
+  case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16:
+  case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2f16:
+    return checkAMDGCNAtomicFaddV2F16Type(TheCall);
   default:
     return false;
   }
@@ -436,4 +440,32 @@ void SemaAMDGPU::handleAMDGPUMaxNumWorkGroupsAttr(Decl *D,
   addAMDGPUMaxNumWorkGroupsAttr(D, AL, AL.getArgAsExpr(0), YExpr, ZExpr);
 }
 
+
+bool SemaAMDGPU::checkAMDGCNAtomicFaddV2F16Type(CallExpr *TheCall) {
+  // Check that the pointer argument is a pointer to v2f16
+
+  Expr *Arg = TheCall->getArg(1);
+  QualType ArgType = Arg->getType();
+
+  // Check if it's a vector type
+  if (!ArgType->isVectorType()) {
+    Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
+        << "expected _Float16 vector of length 2" << ArgType
+        << Arg->getSourceRange();
+    return true;
+  }
+
+  const VectorType *VT = ArgType->getAs<VectorType>();
+
+  // Check element type (should be _Float16) and vector length (should be 2)
+  QualType ElementType = VT->getElementType();
+  if (!ElementType->isFloat16Type() || VT->getNumElements() != 2) {
+    Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
+        << "expected _Float16 vector of length 2" << ArgType
+        << Arg->getSourceRange();
+    return true;
+  }
+
+  return false;
+}
 } // namespace clang

@shiltian
Copy link
Contributor

I'm not sure what this PR is trying to fix, but we do have type enforcement in clang/include/clang/Basic/BuiltinsAMDGPU.def?

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing tests


// Check element type (should be _Float16) and vector length (should be 2)
QualType ElementType = VT->getElementType();
if (!ElementType->isFloat16Type() || VT->getNumElements() != 2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The builtin already knows what type it is. You should not need to add custom type checking

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment! That's what I thought as well, but for some reason without explicit checking it here clang does not seem to catch the type mismatch. I am new to clang development so I don't know if that's expected.

// Check if it's a vector type
if (!ArgType->isVectorType()) {
Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
<< "expected _Float16 vector of length 2" << ArgType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't require a custom message

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn't seem to be a "expect vector" message so I resorted to this. What do you suggest I use instead? Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus no diagnostic messages should be hardcoded strings like this.

You shouldn't need to do anything for the type checking. All of these builtins are marked with "t" for custom typechecking, but I don't see why. Can you just remove that from the builtin definition?

@tcgu-amd tcgu-amd force-pushed the users/tcgu/builtin-sema branch from 2192052 to 2790a7a Compare September 18, 2025 17:03
@shiltian
Copy link
Contributor

no test?

TARGET_BUILTIN(__builtin_amdgcn_global_atomic_fadd_v2bf16, "V2sV2s*1V2s", "t", "atomic-global-pk-add-bf16-inst")
TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_v2bf16, "V2sV2s*3V2s", "t", "atomic-ds-pk-add-16-insts")
TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_v2f16, "V2hV2h*3V2h", "t", "atomic-ds-pk-add-16-insts")
TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_v2f16, "V2xV2x*3V2x", "n", "atomic-ds-pk-add-16-insts")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also make sure to test this with all of these similar intrinsics, the "t"s are all suspicious (I'm guessing if it was for anything, it was hacking around the address space of the pointer across the languages)

typedef _Float16 v2f16 __attribute__((ext_vector_type(2)));
typedef float v2f32 __attribute__((ext_vector_type(2)));
typedef _Float16 v4f16 __attribute__((ext_vector_type(4)));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also test some of the cases with the type embedded in a structure like was crashing

Copy link
Author

@tcgu-amd tcgu-amd Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the type was from hip though. How do I add a type from hip_fp16 to a clang test? Do I need to put the test in one of the hip/opencl folders? Thanks!

__builtin_amdgcn_ds_atomic_fadd_v2f16(ptr_v2f16, val_v4f16); // expected-error{{passing 'v4f16'}}
__builtin_amdgcn_ds_atomic_fadd_v2f16(ptr_v2f16); // expected-error{{too few arguments to function call}}
__builtin_amdgcn_ds_atomic_fadd_v2f16(ptr_v2f16, val, val); // expected-error{{too many arguments to function call}}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bf16 vector cases and f32 cases are also marked with t, and probably broken in the same way

@@ -0,0 +1,46 @@
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -verify %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you verify we have test coverage using OpenCL plus one of the C/C++/HIP variants

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants