[AMDGPU] Allow intrinsics llvm.amdgcn.buffer.wbinvl1 etc for GFX940+ #111078
Commit 5927c67 obsoletes instructions buffer_wbinvl1, buffer_wbinvl1_vol, etc. for GFX940+. To prevent causing problems for existing apps that use related intrinsics such as llvm.amdgcn.buffer.wbinvl1, this patch allows use of those intrinsics for GFX940+.
@llvm/pr-subscribers-backend-amdgpu
Author: Jun Wang (jwanggit86)
Full diff: https://github.com/llvm/llvm-project/pull/111078.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 8073aca7f197fb..aa6aafd87f506c 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -2348,6 +2348,18 @@ def : GCNPat <
(V_RCP_IFLAG_F32_e32 (V_CVT_F32_U32_e32 $src0))))
>;
+let SubtargetPredicate = isGFX940Plus in {
+def : GCNPat <
+ (int_amdgcn_buffer_wbinvl1),
+ (BUFFER_INV)
+>;
+
+def : GCNPat <
+ (int_amdgcn_buffer_wbinvl1_vol),
+ (BUFFER_INV)
+>;
+}
+
//===----------------------------------------------------------------------===//
// VOP3 Patterns
//===----------------------------------------------------------------------===//
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.buffer.wbinvl1.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.buffer.wbinvl1.ll
new file mode 100644
index 00000000000000..87990f46a8fc42
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.buffer.wbinvl1.ll
@@ -0,0 +1,72 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -verify-machineinstrs < %s | FileCheck %s
+
+; Allow intrinsics llvm.amdgcn.buffer.wbinvl1 and llvm.amdgcn.buffer.wbinvl1.vol to be
+; lowered to the instruction buffer_inv for GFX940+.
+;
+declare void @llvm.amdgcn.buffer.wbinvl1()
+declare void @llvm.amdgcn.buffer.wbinvl1.vol()
+
+define void @test_wbinvl1_gfx908() #0 {
+; CHECK-LABEL: test_wbinvl1_gfx908:
+; CHECK: ; %bb.0:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT: buffer_wbinvl1
+; CHECK-NEXT: s_setpc_b64 s[30:31]
+ call void @llvm.amdgcn.buffer.wbinvl1()
+ ret void
+}
+
+define void @test_wbinvl1_vol_gfx908() #0 {
+; CHECK-LABEL: test_wbinvl1_vol_gfx908:
+; CHECK: ; %bb.0:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT: buffer_wbinvl1_vol
+; CHECK-NEXT: s_setpc_b64 s[30:31]
+ call void @llvm.amdgcn.buffer.wbinvl1.vol()
+ ret void
+}
+
+define void @test_wbinvl1_gfx90a() #1 {
+; CHECK-LABEL: test_wbinvl1_gfx90a:
+; CHECK: ; %bb.0:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT: buffer_wbinvl1
+; CHECK-NEXT: s_setpc_b64 s[30:31]
+ call void @llvm.amdgcn.buffer.wbinvl1()
+ ret void
+}
+
+define void @test_wbinvl1_vol_gfx90a() #1 {
+; CHECK-LABEL: test_wbinvl1_vol_gfx90a:
+; CHECK: ; %bb.0:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT: buffer_wbinvl1_vol
+; CHECK-NEXT: s_setpc_b64 s[30:31]
+ call void @llvm.amdgcn.buffer.wbinvl1.vol()
+ ret void
+}
+
+define void @test_wbinvl1_gfx940() #2 {
+; CHECK-LABEL: test_wbinvl1_gfx940:
+; CHECK: ; %bb.0:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT: buffer_inv
+; CHECK-NEXT: s_setpc_b64 s[30:31]
+ call void @llvm.amdgcn.buffer.wbinvl1()
+ ret void
+}
+
+define void @test_wbinvl1_vol_gfx940() #2 {
+; CHECK-LABEL: test_wbinvl1_vol_gfx940:
+; CHECK: ; %bb.0:
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT: buffer_inv
+; CHECK-NEXT: s_setpc_b64 s[30:31]
+ call void @llvm.amdgcn.buffer.wbinvl1.vol()
+ ret void
+}
+
+attributes #0 = { nounwind "target-cpu"="gfx908" }
+attributes #1 = { nounwind "target-cpu"="gfx90a" }
+attributes #2 = { nounwind "target-cpu"="gfx940" }
arsenm left a comment:
This is the opposite of what we should do. The intrinsics should fail to select. Clang should enforce that the corresponding builtin is available for the subtarget.
Are you suggesting fixing clang to disallow those intrinsics for gfx940?
Yes
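For readers wondering what the reviewer's direction could mean for application code, here is a minimal sketch, not part of this PR. It assumes that `__builtin_amdgcn_buffer_wbinvl1` is the Clang builtin behind `llvm.amdgcn.buffer.wbinvl1`, that Clang would reject it on subtargets lacking the instruction, and that `__has_builtin` reflects that restriction. The helper name `try_buffer_wbinvl1` is hypothetical.

```cpp
// Older compilers may not provide __has_builtin; treat every builtin as absent.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper (not part of this PR): emits the legacy L1
// write-back/invalidate only when the compiler reports the builtin as
// available for the current target. Returns true if the builtin was emitted.
// Assumption: in HIP or OpenCL device code this would also need the usual
// device qualifier, which is omitted here so the sketch builds on a host.
inline bool try_buffer_wbinvl1() {
#if __has_builtin(__builtin_amdgcn_buffer_wbinvl1)
  __builtin_amdgcn_buffer_wbinvl1();
  return true;
#else
  // Builtin unknown or rejected for this target (for example a host build).
  return false;
#endif
}
```

A caller could branch on the return value and take a different cache-maintenance path when the builtin is unavailable. Under the assumptions above, this keeps existing sources compiling without silently remapping the intrinsic to `buffer_inv`.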