Skip to content

Conversation

azwolski
Copy link
Contributor

@azwolski azwolski commented Oct 14, 2025

This PR adds a baseline test that exposes a bug in the current combineX86CloadCstore optimization. The generated assembly demonstrates incorrect behavior when the optimization is applied without proper constraints.

Without any assumptions about X this transformation is only valid when Y is a non zero power of two/single-bit mask.

      // res, flags2 = sub 0, (and (xor X, -1), Y)
      // cload/cstore ..., cond_ne, flag2
      // ->
      // res, flags2 = sub 0, (and X, Y)
      // cload/cstore ..., cond_e, flag2

In the provided test case, the value in %al is unknown at compile time. If %al contains 0, the optimization cannot be applied, because (and (xor X, -1), 0) is not equal to (and X, 0).

Fix: #163353

@llvmbot
Copy link
Member

llvmbot commented Oct 14, 2025

@llvm/pr-subscribers-backend-x86

Author: None (azwolski)

Changes

This PR adds a baseline test that exposes a bug in the current combineX86CloadCstore optimization. The generated assembly demonstrates incorrect behavior when the optimization is applied without proper constraints.

Without any assumptions about X this transformation is only valid when Y is a non zero power of two/single-bit mask.

      // res, flags2 = sub 0, (and (xor X, -1), Y)
      // cload/cstore ..., cond_ne, flag2
      // ->
      // res, flags2 = sub 0, (and X, Y)
      // cload/cstore ..., cond_e, flag2

In the provided test case, the value in %al is unknown at compile time. If %al contains 0, the optimization cannot be applied, because (and (xor X, -1), 0) is not equal to ~(and X, 0).

Fix: #163353


Full diff: https://github.com/llvm/llvm-project/pull/163354.diff

1 Files Affected:

  • (modified) llvm/test/CodeGen/X86/apx/cf.ll (+18)
diff --git a/llvm/test/CodeGen/X86/apx/cf.ll b/llvm/test/CodeGen/X86/apx/cf.ll
index b2651e91134ee..4cce2226a91fb 100644
--- a/llvm/test/CodeGen/X86/apx/cf.ll
+++ b/llvm/test/CodeGen/X86/apx/cf.ll
@@ -230,6 +230,24 @@ entry:
   ret void
 }
 
+define void @and_cond(i32 %a, i1 %b) {
+; CHECK-LABEL: and_cond:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    testl %edi, %edi
+; CHECK-NEXT:    setg %al
+; CHECK-NEXT:    xorl %ecx, %ecx
+; CHECK-NEXT:    testb %al, %sil
+; CHECK-NEXT:    cfcmovel %ecx, 0
+; CHECK-NEXT:    retq
+entry:
+  %0 = icmp sgt i32 %a, 0
+  %1 = xor i1 %b, true
+  %3 = and i1 %1, %0
+  %4 = insertelement <1 x i1> zeroinitializer, i1 %3, i64 0
+  call void @llvm.masked.store.v1i32.p0(<1 x i32> zeroinitializer, ptr null, i32 1, <1 x i1> %4)
+  ret void
+}
+
 define i64 @redundant_test(i64 %num, ptr %p1, i64 %in) {
 ; CHECK-LABEL: redundant_test:
 ; CHECK:       # %bb.0:

Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - cheers!

@RKSimon RKSimon enabled auto-merge (squash) October 15, 2025 10:41
@RKSimon RKSimon disabled auto-merge October 15, 2025 10:42
@RKSimon RKSimon enabled auto-merge (squash) October 15, 2025 10:42
@RKSimon RKSimon merged commit fcd7b8d into llvm:main Oct 15, 2025
13 of 14 checks passed
RKSimon pushed a commit that referenced this pull request Oct 15, 2025
…erands (#163353)

This PR fixes a bug in combineX86CloadCstore where an optimization was
being applied too broadly, causing incorrect code generation.

Without any assumptions about `X` this transformation is only valid when
`Y` is a non zero power of two/single-bit mask.
```cpp
      // res, flags2 = sub 0, (and (xor X, -1), Y)
      // cload/cstore ..., cond_ne, flag2
      // ->
      // res, flags2 = sub 0, (and X, Y)
      // cload/cstore ..., cond_e, flag2
```

We can restrict the optimization to most important case, so only apply
when `llvm::isOneConstant(Op1.getOperand(1))`. It might be not trivial
to find code that creates a SelectionDag with other values of `Y`.

Basline test: #163354
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 15, 2025
…constant operands (#163353)

This PR fixes a bug in combineX86CloadCstore where an optimization was
being applied too broadly, causing incorrect code generation.

Without any assumptions about `X` this transformation is only valid when
`Y` is a non zero power of two/single-bit mask.
```cpp
      // res, flags2 = sub 0, (and (xor X, -1), Y)
      // cload/cstore ..., cond_ne, flag2
      // ->
      // res, flags2 = sub 0, (and X, Y)
      // cload/cstore ..., cond_e, flag2
```

We can restrict the optimization to most important case, so only apply
when `llvm::isOneConstant(Op1.getOperand(1))`. It might be not trivial
to find code that creates a SelectionDag with other values of `Y`.

Basline test: llvm/llvm-project#163354
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants