-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[X86] Fix X86 conditional load/store optimization for non-constant operands #163353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-x86 Author: None (azwolski) ChangesThis PR fixes a bug in combineX86CloadCstore where an optimization was being applied too broadly, causing incorrect code generation. Without any assumptions about // res, flags2 = sub 0, (and (xor X, -1), Y)
// cload/cstore ..., cond_ne, flag2
// ->
// res, flags2 = sub 0, (and X, Y)
// cload/cstore ..., cond_e, flag2 We can restrict the optimization to most important case, so only apply when Basline test: #163354 Full diff: https://github.com/llvm/llvm-project/pull/163353.diff 2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index eea84a2841764..dabd898819e26 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -58332,11 +58332,12 @@ static SDValue combineX86CloadCstore(SDNode *N, SelectionDAG &DAG) {
} else if (Op1.getOpcode() == ISD::AND && Sub.getValue(0).use_empty()) {
SDValue Src = Op1;
SDValue Op10 = Op1.getOperand(0);
- if (Op10.getOpcode() == ISD::XOR && isAllOnesConstant(Op10.getOperand(1))) {
- // res, flags2 = sub 0, (and (xor X, -1), Y)
+ if (Op10.getOpcode() == ISD::XOR && isAllOnesConstant(Op10.getOperand(1)) &&
+ llvm::isOneConstant(Op1.getOperand(1))) {
+ // res, flags2 = sub 0, (and (xor X, -1), 1)
// cload/cstore ..., cond_ne, flag2
// ->
- // res, flags2 = sub 0, (and X, Y)
+ // res, flags2 = sub 0, (and X, 1)
// cload/cstore ..., cond_e, flag2
Src = DAG.getNode(ISD::AND, DL, Op1.getValueType(), Op10.getOperand(0),
Op1.getOperand(1));
diff --git a/llvm/test/CodeGen/X86/apx/cf.ll b/llvm/test/CodeGen/X86/apx/cf.ll
index b2651e91134ee..4c689d9d0a7a9 100644
--- a/llvm/test/CodeGen/X86/apx/cf.ll
+++ b/llvm/test/CodeGen/X86/apx/cf.ll
@@ -230,6 +230,25 @@ entry:
ret void
}
+define void @and_cond(i32 %a, i1 %b) {
+; CHECK-LABEL: and_cond:
+; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: testl %edi, %edi
+; CHECK-NEXT: setg %al
+; CHECK-NEXT: notb %sil
+; CHECK-NEXT: xorl %ecx, %ecx
+; CHECK-NEXT: testb %al, %sil
+; CHECK-NEXT: cfcmovnel %ecx, 0
+; CHECK-NEXT: retq
+entry:
+ %0 = icmp sgt i32 %a, 0
+ %1 = xor i1 %b, true
+ %3 = and i1 %1, %0
+ %4 = insertelement <1 x i1> zeroinitializer, i1 %3, i64 0
+ call void @llvm.masked.store.v1i32.p0(<1 x i32> zeroinitializer, ptr null, i32 1, <1 x i1> %4)
+ ret void
+}
+
define i64 @redundant_test(i64 %num, ptr %p1, i64 %in) {
; CHECK-LABEL: redundant_test:
; CHECK: # %bb.0:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
…ug (#163354) This PR adds a baseline test that exposes a bug in the current `combineX86CloadCstore` optimization. The generated assembly demonstrates incorrect behavior when the optimization is applied without proper constraints. Without any assumptions about `X` this transformation is only valid when `Y` is a non zero power of two/single-bit mask. ```cpp // res, flags2 = sub 0, (and (xor X, -1), Y) // cload/cstore ..., cond_ne, flag2 // -> // res, flags2 = sub 0, (and X, Y) // cload/cstore ..., cond_e, flag2 ``` In the provided test case, the value in `%al` is unknown at compile time. If `%al` contains `0`, the optimization cannot be applied, because `(and (xor X, -1), 0)` is not equal to `(and X, 0)`. Fix: #163353
…imization bug (#163354) This PR adds a baseline test that exposes a bug in the current `combineX86CloadCstore` optimization. The generated assembly demonstrates incorrect behavior when the optimization is applied without proper constraints. Without any assumptions about `X` this transformation is only valid when `Y` is a non zero power of two/single-bit mask. ```cpp // res, flags2 = sub 0, (and (xor X, -1), Y) // cload/cstore ..., cond_ne, flag2 // -> // res, flags2 = sub 0, (and X, Y) // cload/cstore ..., cond_e, flag2 ``` In the provided test case, the value in `%al` is unknown at compile time. If `%al` contains `0`, the optimization cannot be applied, because `(and (xor X, -1), 0)` is not equal to `(and X, 0)`. Fix: llvm/llvm-project#163353
This PR fixes a bug in combineX86CloadCstore where an optimization was being applied too broadly, causing incorrect code generation.
Without any assumptions about
X
this transformation is only valid whenY
is a non zero power of two/single-bit mask.We can restrict the optimization to most important case, so only apply when
llvm::isOneConstant(Op1.getOperand(1))
. It might be not trivial to find code that creates a SelectionDag with other values ofY
.Basline test: #163354