-
Notifications
You must be signed in to change notification settings - Fork 15k
[AMDGPU] Add sema check for global_atomic_fadd_v2f16 builtin #158145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
@llvm/pr-subscribers-backend-amdgpu Author: Tim Gu (tcgu-amd) ChangesAddresses ROCm/ROCm#5253. The builtin expects a vector _Float16 of length 2, but clang does not emit errors when the wrong argument types is supplied (e.g. half2). This causes the compilation to pass but error during LTO. This fix the issue by adding sema checks to the builtins. Full diff: https://github.com/llvm/llvm-project/pull/158145.diff 2 Files Affected:
diff --git a/clang/include/clang/Sema/SemaAMDGPU.h b/clang/include/clang/Sema/SemaAMDGPU.h
index bac812a9d4fcf..9ca4418349fff 100644
--- a/clang/include/clang/Sema/SemaAMDGPU.h
+++ b/clang/include/clang/Sema/SemaAMDGPU.h
@@ -31,6 +31,8 @@ class SemaAMDGPU : public SemaBase {
bool checkMovDPPFunctionCall(CallExpr *TheCall, unsigned NumArgs,
unsigned NumDataArgs);
+ bool checkAMDGCNAtomicFaddV2F16Type(CallExpr *TheCall);
+
/// Create an AMDGPUWavesPerEUAttr attribute.
AMDGPUFlatWorkGroupSizeAttr *
CreateAMDGPUFlatWorkGroupSizeAttr(const AttributeCommonInfo &CI, Expr *Min,
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index baba503239e9f..c0f17f185982e 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -109,6 +109,10 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned BuiltinID,
case AMDGPU::BI__builtin_amdgcn_cooperative_atomic_store_16x8B:
case AMDGPU::BI__builtin_amdgcn_cooperative_atomic_store_8x16B:
return checkCoopAtomicFunctionCall(TheCall, /*IsStore=*/true);
+ case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16:
+ case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16:
+ case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2f16:
+ return checkAMDGCNAtomicFaddV2F16Type(TheCall);
default:
return false;
}
@@ -436,4 +440,32 @@ void SemaAMDGPU::handleAMDGPUMaxNumWorkGroupsAttr(Decl *D,
addAMDGPUMaxNumWorkGroupsAttr(D, AL, AL.getArgAsExpr(0), YExpr, ZExpr);
}
+
+bool SemaAMDGPU::checkAMDGCNAtomicFaddV2F16Type(CallExpr *TheCall) {
+ // Check that the pointer argument is a pointer to v2f16
+
+ Expr *Arg = TheCall->getArg(1);
+ QualType ArgType = Arg->getType();
+
+ // Check if it's a vector type
+ if (!ArgType->isVectorType()) {
+ Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
+ << "expected _Float16 vector of length 2" << ArgType
+ << Arg->getSourceRange();
+ return true;
+ }
+
+ const VectorType *VT = ArgType->getAs<VectorType>();
+
+ // Check element type (should be _Float16) and vector length (should be 2)
+ QualType ElementType = VT->getElementType();
+ if (!ElementType->isFloat16Type() || VT->getNumElements() != 2) {
+ Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
+ << "expected _Float16 vector of length 2" << ArgType
+ << Arg->getSourceRange();
+ return true;
+ }
+
+ return false;
+}
} // namespace clang
|
|
@llvm/pr-subscribers-clang Author: Tim Gu (tcgu-amd) ChangesAddresses ROCm/ROCm#5253. The builtin expects a vector _Float16 of length 2, but clang does not emit errors when the wrong argument types is supplied (e.g. half2). This causes the compilation to pass but error during LTO. This fix the issue by adding sema checks to the builtins. Full diff: https://github.com/llvm/llvm-project/pull/158145.diff 2 Files Affected:
diff --git a/clang/include/clang/Sema/SemaAMDGPU.h b/clang/include/clang/Sema/SemaAMDGPU.h
index bac812a9d4fcf..9ca4418349fff 100644
--- a/clang/include/clang/Sema/SemaAMDGPU.h
+++ b/clang/include/clang/Sema/SemaAMDGPU.h
@@ -31,6 +31,8 @@ class SemaAMDGPU : public SemaBase {
bool checkMovDPPFunctionCall(CallExpr *TheCall, unsigned NumArgs,
unsigned NumDataArgs);
+ bool checkAMDGCNAtomicFaddV2F16Type(CallExpr *TheCall);
+
/// Create an AMDGPUWavesPerEUAttr attribute.
AMDGPUFlatWorkGroupSizeAttr *
CreateAMDGPUFlatWorkGroupSizeAttr(const AttributeCommonInfo &CI, Expr *Min,
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index baba503239e9f..c0f17f185982e 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -109,6 +109,10 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned BuiltinID,
case AMDGPU::BI__builtin_amdgcn_cooperative_atomic_store_16x8B:
case AMDGPU::BI__builtin_amdgcn_cooperative_atomic_store_8x16B:
return checkCoopAtomicFunctionCall(TheCall, /*IsStore=*/true);
+ case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16:
+ case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16:
+ case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2f16:
+ return checkAMDGCNAtomicFaddV2F16Type(TheCall);
default:
return false;
}
@@ -436,4 +440,32 @@ void SemaAMDGPU::handleAMDGPUMaxNumWorkGroupsAttr(Decl *D,
addAMDGPUMaxNumWorkGroupsAttr(D, AL, AL.getArgAsExpr(0), YExpr, ZExpr);
}
+
+bool SemaAMDGPU::checkAMDGCNAtomicFaddV2F16Type(CallExpr *TheCall) {
+ // Check that the pointer argument is a pointer to v2f16
+
+ Expr *Arg = TheCall->getArg(1);
+ QualType ArgType = Arg->getType();
+
+ // Check if it's a vector type
+ if (!ArgType->isVectorType()) {
+ Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
+ << "expected _Float16 vector of length 2" << ArgType
+ << Arg->getSourceRange();
+ return true;
+ }
+
+ const VectorType *VT = ArgType->getAs<VectorType>();
+
+ // Check element type (should be _Float16) and vector length (should be 2)
+ QualType ElementType = VT->getElementType();
+ if (!ElementType->isFloat16Type() || VT->getNumElements() != 2) {
+ Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types)
+ << "expected _Float16 vector of length 2" << ArgType
+ << Arg->getSourceRange();
+ return true;
+ }
+
+ return false;
+}
} // namespace clang
|
|
I'm not sure what this PR is trying to fix, but we do have type enforcement in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing tests
clang/lib/Sema/SemaAMDGPU.cpp
Outdated
|
|
||
| // Check element type (should be _Float16) and vector length (should be 2) | ||
| QualType ElementType = VT->getElementType(); | ||
| if (!ElementType->isFloat16Type() || VT->getNumElements() != 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The builtin already knows what type it is. You should not need to add custom type checking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment! That's what I thought as well, but for some reason without explicit checking it here clang does not seem to catch the type mismatch. I am new to clang development so I don't know if that's expected.
clang/lib/Sema/SemaAMDGPU.cpp
Outdated
| // Check if it's a vector type | ||
| if (!ArgType->isVectorType()) { | ||
| Diag(Arg->getBeginLoc(), diag::err_typecheck_call_different_arg_types) | ||
| << "expected _Float16 vector of length 2" << ArgType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't require a custom message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There doesn't seem to be a "expect vector" message so I resorted to this. What do you suggest I use instead? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plus no diagnostic messages should be hardcoded strings like this.
You shouldn't need to do anything for the type checking. All of these builtins are marked with "t" for custom typechecking, but I don't see why. Can you just remove that from the builtin definition?
2192052 to
2790a7a
Compare
|
no test? |
| TARGET_BUILTIN(__builtin_amdgcn_global_atomic_fadd_v2bf16, "V2sV2s*1V2s", "t", "atomic-global-pk-add-bf16-inst") | ||
| TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_v2bf16, "V2sV2s*3V2s", "t", "atomic-ds-pk-add-16-insts") | ||
| TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_v2f16, "V2hV2h*3V2h", "t", "atomic-ds-pk-add-16-insts") | ||
| TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_v2f16, "V2xV2x*3V2x", "n", "atomic-ds-pk-add-16-insts") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also make sure to test this with all of these similar intrinsics, the "t"s are all suspicious (I'm guessing if it was for anything, it was hacking around the address space of the pointer across the languages)
| typedef _Float16 v2f16 __attribute__((ext_vector_type(2))); | ||
| typedef float v2f32 __attribute__((ext_vector_type(2))); | ||
| typedef _Float16 v4f16 __attribute__((ext_vector_type(4))); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also test some of the cases with the type embedded in a structure like was crashing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the type was from hip though. How do I add a type from hip_fp16 to a clang test? Do I need to put the test in one of the hip/opencl folders? Thanks!
| __builtin_amdgcn_ds_atomic_fadd_v2f16(ptr_v2f16, val_v4f16); // expected-error{{passing 'v4f16'}} | ||
| __builtin_amdgcn_ds_atomic_fadd_v2f16(ptr_v2f16); // expected-error{{too few arguments to function call}} | ||
| __builtin_amdgcn_ds_atomic_fadd_v2f16(ptr_v2f16, val, val); // expected-error{{too many arguments to function call}} | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bf16 vector cases and f32 cases are also marked with t, and probably broken in the same way
| @@ -0,0 +1,46 @@ | |||
| // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -verify %s | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you verify we have test coverage using OpenCL plus one of the C/C++/HIP variants
Addresses ROCm/ROCm#5253.
The builtin expects a vector _Float16 of length 2, but clang does not emit errors when the wrong argument types is supplied (e.g. half2). This causes the compilation to pass but error during LTO.
This fix the issue by adding sema checks to the builtins.