
Commit a0af7b8

AMDGPU: llvm.amdgcn.inverse.ballot needs to be convergent (#155725)
It is only defined for uniform inputs (instruction selection inserts v_readfirstlane as necessary).
1 parent deb851c commit a0af7b8


2 files changed: +6 -2 lines changed

llvm/include/llvm/IR/IntrinsicsAMDGPU.td

Lines changed: 5 additions & 1 deletion
@@ -2434,9 +2434,13 @@ def int_amdgcn_ballot :
   Intrinsic<[llvm_anyint_ty], [llvm_i1_ty],
             [IntrNoMem, IntrConvergent, IntrWillReturn, IntrNoCallback, IntrNoFree]>;
 
+// Inverse of ballot: return the bit corresponding to the current lane from the
+// given mask.
+//
+// This is only defined for dynamically uniform masks and therefore convergent.
 def int_amdgcn_inverse_ballot :
   Intrinsic<[llvm_i1_ty], [llvm_anyint_ty],
-            [IntrNoMem, IntrWillReturn, IntrNoCallback, IntrNoFree]>;
+            [IntrNoMem, IntrConvergent, IntrWillReturn, IntrNoCallback, IntrNoFree]>;
 
 // Lowers to S_BITREPLICATE_B64_B32.
 // The argument must be uniform; otherwise, the result is undefined.
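
The comment added in this hunk describes inverse ballot as returning the current lane's bit from a mask that must be the same in every lane. A minimal lane-level model in Python may make that concrete. It is only a sketch: the function names, the 4-lane wave, and the choice to raise an error on a non-uniform mask are illustrative assumptions, not the backend's behavior (the commit message says instruction selection inserts v_readfirstlane as necessary).

```python
from typing import List


def ballot(preds: List[bool]) -> int:
    """Pack one boolean per lane into a bitmask (bit i is lane i)."""
    mask = 0
    for lane, pred in enumerate(preds):
        if pred:
            mask |= 1 << lane
    return mask


def inverse_ballot(mask: int, lane: int) -> bool:
    """Return the bit of `mask` that corresponds to `lane`."""
    return bool((mask >> lane) & 1)


def inverse_ballot_wave(masks: List[int]) -> List[bool]:
    """Apply inverse_ballot in every lane; masks[i] is the mask seen by lane i.

    Assumption for this model: a non-uniform mask is reported as an error,
    standing in for "the result is not defined".
    """
    if len(set(masks)) != 1:
        raise ValueError("mask is not uniform across lanes")
    return [inverse_ballot(masks[0], lane) for lane in range(len(masks))]


# Hypothetical 4-lane wave, kept small for readability.
preds = [True, False, True, True]
mask = ballot(preds)
roundtrip = inverse_ballot_wave([mask] * len(preds))
```

In this model the per-lane results are meaningful only because every lane passes the same mask, which is the cross-lane dependence that the IntrConvergent attribute records.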

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.inverse.ballot.i32.ll

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 ; RUN: not llc -mtriple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize64 -global-isel=1 < %s 2>&1 | FileCheck -check-prefix=GISEL-ERR %s
 ; RUN: not --crash llc -mtriple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize64 -global-isel=0 < %s 2>&1 | FileCheck -check-prefix=SDAG-ERR %s
 
-; GISEL-ERR: LLVM ERROR: cannot select: {{.*}} = G_INTRINSIC intrinsic(@llvm.amdgcn.inverse.ballot)
+; GISEL-ERR: LLVM ERROR: cannot select: {{.*}} = G_INTRINSIC_CONVERGENT intrinsic(@llvm.amdgcn.inverse.ballot)
 ; SDAG-ERR: LLVM ERROR: Cannot select: intrinsic %llvm.amdgcn.inverse.ballot
 
 declare i1 @llvm.amdgcn.inverse.ballot(i32)
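
The GISEL-ERR line changes because GlobalISel picks its generic intrinsic opcode from the intrinsic's attributes. The Python sketch below models that choice. The helper name and its string return value are hypothetical and this is not LLVM's code, though the four opcode names it can produce are the generic opcodes GlobalISel defines.

```python
def generic_intrinsic_opcode(has_side_effects: bool, is_convergent: bool) -> str:
    """Model of how the generic opcode name follows from two properties."""
    name = "G_INTRINSIC"
    if is_convergent:
        name += "_CONVERGENT"
    if has_side_effects:
        name += "_W_SIDE_EFFECTS"
    return name


# inverse.ballot is IntrNoMem, so it has no side effects in this model;
# only the convergent flag differs before and after the commit.
before = generic_intrinsic_opcode(has_side_effects=False, is_convergent=False)
after = generic_intrinsic_opcode(has_side_effects=False, is_convergent=True)
```

Under these assumptions, `before` and `after` correspond to the removed and added GISEL-ERR strings in the hunk above.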
