[PowerPC] Optimize not equal compares against zero vectors #150422

Himadhith · 2025-07-24T13:56:40Z

This patch handles the special case of not-equal comparisons against zero vectors. When comparing vector operands for inequality, current code generation emits vcmpequh (Vector Compare Equal Unsigned Halfword) followed by a negation with xxlnor (VSX Vector Logical NOR, XX3-form).

For this special case, instead of using vcmpequh and then negating the result, we can use vcmpgtuh (Vector Compare Greater Than Unsigned Halfword) directly.

The negation is avoided because an unsigned halfword is never less than 0: the greater-than compare against 0 is false only for 0, which is exactly when the not-equal compare is false.
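The equivalence described above can be checked lane by lane with a small model. This is only a sketch: `V8U16`, `notEqualViaEqualThenNot`, `greaterThanUnsigned` and `sameLanes` are hypothetical names introduced for illustration, and the functions model the compare semantics in plain C++ rather than the actual instructions.

```cpp
#include <cstddef>
#include <cstdint>

// Eight unsigned 16-bit lanes, modeling one 128-bit vector of halfwords.
struct V8U16 {
  std::uint16_t lane[8];
};

constexpr std::uint16_t kAllOnes = 0xFFFF;
constexpr std::uint16_t kAllZeros = 0x0000;

// Models compare-equal followed by a bitwise NOT: all-ones where a != b.
constexpr V8U16 notEqualViaEqualThenNot(const V8U16 &a, const V8U16 &b) {
  V8U16 r{};
  for (std::size_t i = 0; i < 8; ++i) {
    const std::uint16_t eq = (a.lane[i] == b.lane[i]) ? kAllOnes : kAllZeros;
    r.lane[i] = static_cast<std::uint16_t>(~eq);
  }
  return r;
}

// Models an unsigned greater-than compare: all-ones where a > b.
constexpr V8U16 greaterThanUnsigned(const V8U16 &a, const V8U16 &b) {
  V8U16 r{};
  for (std::size_t i = 0; i < 8; ++i)
    r.lane[i] = (a.lane[i] > b.lane[i]) ? kAllOnes : kAllZeros;
  return r;
}

// Lane-wise equality of two result masks.
constexpr bool sameLanes(const V8U16 &a, const V8U16 &b) {
  for (std::size_t i = 0; i < 8; ++i)
    if (a.lane[i] != b.lane[i])
      return false;
  return true;
}

constexpr V8U16 kZero{};
constexpr V8U16 kOnes{{1, 1, 1, 1, 1, 1, 1, 1}};
// Includes 0x8000 and 0xFFFF, which a signed compare would treat as negative.
constexpr V8U16 kSample{{0, 1, 2, 0, 0x8000, 0xFFFF, 0, 7}};

// k != 0 gives the same mask as k > 0 (unsigned).
static_assert(sameLanes(notEqualViaEqualThenNot(kSample, kZero),
                        greaterThanUnsigned(kSample, kZero)),
              "k != 0 must match k > 0");
// 0 != k gives the same mask as 0 < k, i.e. k > 0 with the operands swapped.
static_assert(sameLanes(notEqualViaEqualThenNot(kZero, kSample),
                        greaterThanUnsigned(kSample, kZero)),
              "0 != k must match 0 < k");
// Against a non-zero vector the two compares are not interchangeable.
static_assert(!sameLanes(notEqualViaEqualThenNot(kZero, kOnes),
                         greaterThanUnsigned(kZero, kOnes)),
              "the identity only holds when one operand is zero");
```

The sample deliberately contains lanes with the top bit set, since the identity depends on the compare being unsigned.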

llvmbot · 2025-07-24T13:57:10Z

@llvm/pr-subscribers-backend-powerpc

Author: None (Himadhith)

Full diff: https://github.com/llvm/llvm-project/pull/150422.diff

2 Files Affected:

(modified) llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp (+8-1)
(modified) llvm/test/CodeGen/PowerPC/check-zero-vector.ll (+27-33)

diff --git a/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp b/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
index 415164fc9e2cb..9d4289d241cd1 100644
--- a/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
@@ -4569,7 +4569,14 @@ bool PPCDAGToDAGISel::trySETCC(SDNode *N) {
   if (!IsStrict && LHS.getValueType().isVector()) {
     if (Subtarget->hasSPE())
       return false;
-
+    // Check if RHS or LHS vector operands are 0 and change SETNE to either
+    // SETUGT or SETULT.
+    if (CC == ISD::SETNE) {
+      if (ISD::isBuildVectorAllZeros(RHS.getNode()))
+        CC = ISD::SETUGT;
+      else if (ISD::isBuildVectorAllZeros(LHS.getNode()))
+        CC = ISD::SETULT;
+    }
     EVT VecVT = LHS.getValueType();
     bool Swap, Negate;
     unsigned int VCmpInst =
diff --git a/llvm/test/CodeGen/PowerPC/check-zero-vector.ll b/llvm/test/CodeGen/PowerPC/check-zero-vector.ll
index 59173e22edf26..27fe863f01438 100644
--- a/llvm/test/CodeGen/PowerPC/check-zero-vector.ll
+++ b/llvm/test/CodeGen/PowerPC/check-zero-vector.ll
@@ -19,23 +19,21 @@ define i32 @test_Greater_than(ptr %colauths, i32 signext %ncols) {
 ; POWERPC_64LE:  .LBB0_6: # %vector.body
 ; POWERPC_64LE-NEXT:    #
 ; POWERPC_64LE-NEXT:    lxv [[R1:[0-9]+]], -64(4)
-; POWERPC_64LE-NEXT:    vcmpequh [[R2:[0-9]+]], [[R2]], [[R3:[0-9]+]]
-; POWERPC_64LE-NEXT:    xxlnor [[R1]], [[R1]], [[R1]]
-; POWERPC_64LE-NEXT:    vmrghh [[R4:[0-9]+]], [[R2]], [[R2]]
-; POWERPC_64LE-NEXT:    vmrglh [[R2]], [[R2]], [[R2]]
-; POWERPC_64LE-NEXT:    xxland [[R5:[0-9]+]], [[R5]], [[R6:[0-9]+]]
-; POWERPC_64LE-NEXT:    xxland [[R1]], [[R1]], [[R6]]
-; POWERPC_64LE-NEXT:    vadduwm [[R7:[0-9]+]], [[R7]], [[R4]]
+; POWERPC_64LE-NEXT:    vcmpgtuh [[R2:[0-9]+]], [[R2]], [[R3:[0-9]+]]
+; POWERPC_64LE-NEXT:    vmrglh [[R4:[0-9]+]], [[R2]], [[R2]]
+; POWERPC_64LE-NEXT:    vmrghh [[R2]], [[R2]], [[R2]]
+; POWERPC_64LE-NEXT:    xxland [[R1]], [[R1]], [[R5:[0-9]+]]
+; POWERPC_64LE-NEXT:    xxland [[R6:[0-9]+]], [[R6]], [[R5]]
+; POWERPC_64LE-NEXT:    vadduwm [[R7:[0-9]+]], [[R7]], [[R2]]
 ; POWERPC_64LE:  .LBB0_10: # %vec.epilog.vector.body
 ; POWERPC_64LE-NEXT:    #
 ; POWERPC_64LE-NEXT:    lxv [[R8:[0-9]+]], 0(4)
 ; POWERPC_64LE-NEXT:    addi 4, 4, 16
-; POWERPC_64LE-NEXT:    vcmpequh [[R9:[0-9]+]], [[R9]], [[R10:[0-9]+]]
-; POWERPC_64LE-NEXT:    xxlnor [[R8]], [[R8]], [[R8]]
+; POWERPC_64LE-NEXT:    vcmpgtuh [[R9:[0-9]+]], [[R9]], [[R10:[0-9]+]]
 ; POWERPC_64LE-NEXT:    vmrglh [[R11:[0-9]+]], [[R9]], [[R9]]
 ; POWERPC_64LE-NEXT:    vmrghh [[R9]], [[R9]], [[R9]]
-; POWERPC_64LE-NEXT:    xxland [[R12:[0-9]+]], [[R12]], [[R6]]
-; POWERPC_64LE-NEXT:    xxland [[R8]], [[R8]], [[R6]]
+; POWERPC_64LE-NEXT:    xxland [[R12:[0-9]+]], [[R12]], [[R5]]
+; POWERPC_64LE-NEXT:    xxland [[R8]], [[R8]], [[R5]]
 ; POWERPC_64LE-NEXT:    vadduwm [[R7]], [[R7]], [[R9]]
 ; POWERPC_64LE-NEXT:    vadduwm [[R3]], [[R3]], [[R11]]
 ; POWERPC_64LE-NEXT:    bdnz .LBB0_10
@@ -45,19 +43,17 @@ define i32 @test_Greater_than(ptr %colauths, i32 signext %ncols) {
 ; POWERPC_64:  L..BB0_6: # %vector.body
 ; POWERPC_64-NEXT:    #
 ; POWERPC_64-NEXT:    lxv [[R1:[0-9]+]], -64(4)
-; POWERPC_64-NEXT:    vcmpequh [[R2:[0-9]+]], [[R2]], [[R3:[0-9]+]]
-; POWERPC_64-NEXT:    xxlnor [[R1]], [[R1]], [[R1]]
-; POWERPC_64-NEXT:    vmrglh [[R4:[0-9]+]], [[R2]], [[R2]]
-; POWERPC_64-NEXT:    vmrghh [[R2]], [[R2]], [[R2]]
+; POWERPC_64-NEXT:    vcmpgtuh [[R2:[0-9]+]], [[R18:[0-9]+]], [[R3:[0-9]+]]
+; POWERPC_64-NEXT:    vmrghh [[R4:[0-9]+]], [[R2]], [[R2]]
+; POWERPC_64-NEXT:    vmrglh [[R2]], [[R2]], [[R2]]
 ; POWERPC_64-NEXT:    xxland [[R5:[0-9]+]], [[R5]], [[R6:[0-9]+]]
 ; POWERPC_64-NEXT:    xxland [[R1]], [[R1]], [[R6]]
-; POWERPC_64-NEXT:    vadduwm [[R7:[0-9]+]], [[R7]], [[R4]]
+; POWERPC_64-NEXT:    vadduwm [[R7:[0-9]+]], [[R7]], [[R2]]
 ; POWERPC_64:  L..BB0_10: # %vec.epilog.vector.body
 ; POWERPC_64-NEXT:    #
 ; POWERPC_64-NEXT:    lxv [[R8:[0-9]+]], 0(4)
 ; POWERPC_64-NEXT:    addi 4, 4, 16
-; POWERPC_64-NEXT:    vcmpequh [[R9:[0-9]+]], [[R9]], [[R10:[0-9]+]]
-; POWERPC_64-NEXT:    xxlnor [[R8]], [[R8]], [[R8]]
+; POWERPC_64-NEXT:    vcmpgtuh [[R9:[0-9]+]], [[R19:[0-9]+]], [[R10:[0-9]+]]
 ; POWERPC_64-NEXT:    vmrghh [[R11:[0-9]+]], [[R9]], [[R9]]
 ; POWERPC_64-NEXT:    vmrglh [[R9]], [[R9]], [[R9]]
 ; POWERPC_64-NEXT:    xxland [[R12:[0-9]+]], [[R12]], [[R6]]
@@ -70,28 +66,26 @@ define i32 @test_Greater_than(ptr %colauths, i32 signext %ncols) {
 ; POWERPC_32-LABEL: test_Greater_than:
 ; POWERPC_32:  L..BB0_7: # %vector.body
 ; POWERPC_32-NEXT:    #
-; POWERPC_32-NEXT:    lxv [[R1:[0-9]+]], 0(10)
-; POWERPC_32-NEXT:    addic [[R13:[0-9]+]], [[R13]], 64
-; POWERPC_32-NEXT:    addze [[R14:[0-9]+]], [[R14]]
-; POWERPC_32-NEXT:    xor [[R15:[0-9]+]], [[R13]], [[R16:[0-9]+]]
-; POWERPC_32-NEXT:    or. [[R15]], [[R15]], [[R14]]
-; POWERPC_32-NEXT:    vcmpequh [[R2:[0-9]+]], [[R2]], [[R3:[0-9]+]]
-; POWERPC_32-NEXT:    xxlnor [[R1]], [[R1]], [[R1]]
+; POWERPC_32-NEXT:    lxv [[R1:[0-9]+]], 0([[R13:[0-9]+]])
+; POWERPC_32-NEXT:    addic [[R14:[0-9]+]], [[R14]], 64
+; POWERPC_32-NEXT:    addze [[R7:[0-9]+]], [[R7]]
+; POWERPC_32-NEXT:    xor [[R15:[0-9]+]], [[R14]], [[R16:[0-9]+]]
+; POWERPC_32-NEXT:    or. [[R15]], [[R15]], [[R7]]
+; POWERPC_32-NEXT:    vcmpgtuh [[R2:[0-9]+]], [[R2]], [[R3:[0-9]+]]
 ; POWERPC_32-NEXT:    vmrglh [[R4:[0-9]+]], [[R2]], [[R2]]
 ; POWERPC_32-NEXT:    vmrghh [[R2]], [[R2]], [[R2]]
 ; POWERPC_32-NEXT:    xxland [[R5:[0-9]+]], [[R5]], [[R6:[0-9]+]]
 ; POWERPC_32-NEXT:    xxland [[R1]], [[R1]], [[R6]]
-; POWERPC_32-NEXT:    vadduwm [[R7:[0-9]+]], [[R7]], [[R4]]
+; POWERPC_32-NEXT:    vadduwm [[R7]], [[R7]], [[R4]]
 ; POWERPC_32:  L..BB0_11: # %vec.epilog.vector.body
 ; POWERPC_32-NEXT:    #
-; POWERPC_32-NEXT:    slwi [[R14]], [[R13]], 1
-; POWERPC_32-NEXT:    addic [[R13]], [[R13]], 8
+; POWERPC_32-NEXT:    slwi [[R7]], [[R14]], 1
+; POWERPC_32-NEXT:    addic [[R14]], [[R14]], 8
 ; POWERPC_32-NEXT:    addze [[R17:[0-9]+]], [[R17]]
-; POWERPC_32-NEXT:    lxvx [[R8:[0-9]+]], [[R18:[0-9]+]], [[R14]]
-; POWERPC_32-NEXT:    xor [[R14]], [[R13]], [[R16]]
-; POWERPC_32-NEXT:    or. [[R14]], [[R14]], [[R17]]
-; POWERPC_32-NEXT:    vcmpequh [[R9:[0-9]+]], [[R9]], [[R3]]
-; POWERPC_32-NEXT:    xxlnor [[R8]], [[R8]], [[R8]]
+; POWERPC_32-NEXT:    lxvx [[R8:[0-9]+]], [[R18:[0-9]+]], [[R7]]
+; POWERPC_32-NEXT:    xor [[R7]], [[R14]], [[R16]]
+; POWERPC_32-NEXT:    or. [[R7]], [[R7]], [[R17]]
+; POWERPC_32-NEXT:    vcmpgtuh [[R9:[0-9]+]], [[R9]], [[R3]]
 ; POWERPC_32-NEXT:    vmrghh [[R11:[0-9]+]], [[R9]], [[R9]]
 ; POWERPC_32-NEXT:    vmrglh [[R9]], [[R9]], [[R9]]
 ; POWERPC_32-NEXT:    xxland [[R12:[0-9]+]], [[R12]], [[R6]]

Himadhith · 2025-07-24T14:04:27Z

@AditiRM @tonykuttai

Himadhith · 2025-07-31T05:58:27Z

NFC PR related to this patch: [NFC][PowerPC] Add test case for lockdown of vector compare greater than support for Zero vector comparisons.

Himadhith · 2025-08-07T04:57:12Z

NFC PR related to this patch: [NFC][PowerPC] Cleaning up test file and removing redundant front-end test

tonykuttai · 2025-08-11T09:18:11Z

llvm/test/CodeGen/PowerPC/recipest.ll

Are these equivalent instructions?

Yes, Vector Compare Equal Floating-Point VC-form (vcmpeqfp) is replaced with Vector Compare Greater Than or Equal Floating-Point VC-form (vcmpgefp).

This looks a bit suspicious to me. This patch is supposed to replace not-equal compares (vcmpeqfp + vnot) against zero vectors with either a greater-than or a less-than instruction. However, here it replaces a vcmpeqfp directly with vcmpgefp. Maybe this is due to a missing one-use check for SETNE?

I will do some investigation around this and post the findings here.

tonykuttai

LGTM with nit.

Himadhith · 2025-08-18T13:27:52Z

Gentle ping @amy-kwan @RolandF77 @AditiRM @lei137

lei137 · 2025-08-29T14:56:46Z

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp

Suggested change

-    // Optimise 'Not equal to zero-vector' comparisons using 'Greater than or
-    // less than' operators. Example: Consider k to be any non-zero positive
-    // value.
-    // for k != 0, change SETNE to SETUGT (k > 0)
-    // for 0 != k, change SETNE to SETULT (0 < k)
+    // Optimize 'Not equal to zero-vector' comparisons to 'Greater than or
+    // less than' operators.
+    // Example: Consider k to be any non-zero positive value
+    // * for k != 0, change SETNE to SETUGT (k > 0)
+    // * for 0 != k, change SETNE to SETULT (0 < k)
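The rule in this comment can be modeled with a small stand-alone sketch. The `CondCode` enum and the helpers `rewriteNotEqualZero`, `evaluate` and `rewritePreservesResult` are hypothetical stand-ins written for illustration; they mirror the shape of the `trySETCC` hunk but are not LLVM's `ISD::CondCode` or any LLVM API.

```cpp
#include <cstdint>

// Hypothetical stand-in for the few condition codes involved here.
enum class CondCode { SETEQ, SETNE, SETUGT, SETULT };

// Mirrors the rewrite: only SETNE is touched, and only when one operand is
// known to be the zero vector.
constexpr CondCode rewriteNotEqualZero(CondCode cc, bool lhsIsZero,
                                       bool rhsIsZero) {
  if (cc != CondCode::SETNE)
    return cc;
  if (rhsIsZero)
    return CondCode::SETUGT; // k != 0  becomes  k > 0
  if (lhsIsZero)
    return CondCode::SETULT; // 0 != k  becomes  0 < k
  return cc;
}

// Evaluates a condition code on a single unsigned halfword lane.
constexpr bool evaluate(CondCode cc, std::uint16_t lhs, std::uint16_t rhs) {
  switch (cc) {
  case CondCode::SETEQ:
    return lhs == rhs;
  case CondCode::SETNE:
    return lhs != rhs;
  case CondCode::SETUGT:
    return lhs > rhs;
  case CondCode::SETULT:
    return lhs < rhs;
  }
  return false;
}

// True when the rewritten condition gives the same answer as SETNE.
constexpr bool rewritePreservesResult(std::uint16_t lhs, std::uint16_t rhs) {
  return evaluate(CondCode::SETNE, lhs, rhs) ==
         evaluate(rewriteNotEqualZero(CondCode::SETNE, lhs == 0, rhs == 0),
                  lhs, rhs);
}

static_assert(rewritePreservesResult(5, 0), "k != 0 with k > 0");
static_assert(rewritePreservesResult(0, 5), "0 != k with 0 < k");
static_assert(rewritePreservesResult(0, 0), "both operands zero");
static_assert(rewritePreservesResult(0x8000, 0), "top bit set, still unsigned");
static_assert(rewritePreservesResult(3, 4), "no zero operand, no rewrite");
```

In this model the zero check is done on a single lane value; the patch itself checks for an all-zeros build vector with `ISD::isBuildVectorAllZeros`.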

lei137 · 2025-08-29T14:58:16Z

llvm/test/CodeGen/PowerPC/check-zero-vector.ll

nit: continuation of a RUN line should always be indented. Same comment for all the run line changes below.

Suggested change

-; RUN: < %s | FileCheck %s --check-prefix=POWERPC_64LE
+; RUN:   < %s | FileCheck %s --check-prefix=POWERPC_64LE

Will do it once the NFC patch is merged.

lei137 · 2025-08-29T15:02:31Z

llvm/test/CodeGen/PowerPC/check-zero-vector.ll

Suggested change

-; Optimised version using vcmpgtuh.
+; Optimize zero-vector `vcmpequh` compares followed by negate to `vcmpgtuh`.

lei137 · 2025-08-29T15:04:34Z

llvm/test/CodeGen/PowerPC/pr61315.ll

Please remove local_unnamed_addr #0. Since I don't see any attributes defined here, it's likely not needed for this behaviour?

Yes that makes sense, removing local_unnamed_addr #0.

lei137 · 2025-08-29T15:26:23Z

llvm/test/CodeGen/PowerPC/vector-popcnt-128-ult-ugt.ll

I think the relationships here would be easier to see if these tests generated full register names, so that it's obvious 2 and 34 are the same register. Please do an NFC update for these tests to add the options -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr to the RUN lines.

done: [NFC][PowerPC] adding the arguments for register names and VSR to VR

This patch would also need a rebase to include the register naming changes.

I did not realize the other files also did not have the options -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr. I will do another NFC to include those files as well. The NFC patch above took care of only llvm/test/CodeGen/PowerPC/check-zero-vector.ll.
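The remark that 2 and 34 are the same register follows from the VSX register layout, in which the 32 vector registers overlay the upper half of the 64 vector-scalar registers. A minimal sketch of that numbering, using hypothetical helper names:

```cpp
// VR0..VR31 occupy the same storage as VSR32..VSR63.
constexpr unsigned kFirstVrBackedVsr = 32;
constexpr unsigned kNumVsrs = 64;

// VSR number that aliases the given vector register (vr must be below 32).
constexpr unsigned vrToVsr(unsigned vr) { return vr + kFirstVrBackedVsr; }

// True when the given VSR number is backed by a vector register.
constexpr bool isVrBackedVsr(unsigned vsr) {
  return vsr >= kFirstVrBackedVsr && vsr < kNumVsrs;
}

// Vector register aliased by a VR-backed VSR; only meaningful when
// isVrBackedVsr(vsr) is true.
constexpr unsigned vsrToVr(unsigned vsr) { return vsr - kFirstVrBackedVsr; }

// The case from the review: VR 2 and VSR 34 name the same register.
static_assert(vrToVsr(2) == 34, "VR2 aliases VSR34");
static_assert(isVrBackedVsr(34) && vsrToVr(34) == 2, "VSR34 aliases VR2");
static_assert(!isVrBackedVsr(1), "VSR0..VSR31 are not vector registers");
```

Without full register names, a VMX instruction prints the register as 2 while a VSX instruction on the same register prints 34, which is what makes the unannotated check lines hard to follow.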

lei137

Can we please update the title of this PR to be more accurate?
I think this PR does more than just replace the vector compare equal with a greater-than compare instruction.
Maybe something like this is better:

[PowerPC] Optimize not equal compares against zero vectors

Himadhith · 2025-10-13T08:55:36Z

NFC patch to handle code coverage of floating-point vectors: [NFC][PowerPC] Lockdown instructions for floating point comparison with zero-vector

tonykuttai

LGTM. Thanks for addressing the comments.

llvmbot added the backend:PowerPC label Jul 24, 2025

Himadhith force-pushed the check_zero_vec branch 3 times, most recently from 84e4923 to 24a2fb1 Compare August 6, 2025 14:59

tonykuttai requested review from RolandF77, amy-kwan, lei137 and tonykuttai August 8, 2025 03:52

Himadhith changed the title ~~[PowerPC] vector compare greater than support~~ [PowerPC] replace vector compare equal to with vector compare greater than Aug 8, 2025

tonykuttai reviewed Aug 11, 2025

Himadhith requested a review from tonykuttai August 11, 2025 13:47

Himadhith force-pushed the check_zero_vec branch 2 times, most recently from a312c23 to b07da25 Compare August 12, 2025 05:01

tonykuttai approved these changes Aug 18, 2025

Himadhith force-pushed the check_zero_vec branch from b07da25 to 3bfec50 Compare August 18, 2025 13:26

Himadhith requested a review from tonykuttai August 18, 2025 13:26

tonykuttai approved these changes Aug 28, 2025

Himadhith force-pushed the check_zero_vec branch 4 times, most recently from c9800cd to 2d6ecf5 Compare August 29, 2025 04:58

lei137 reviewed Aug 29, 2025

Himadhith changed the title ~~[PowerPC] replace vector compare equal to with vector compare greater than~~ [PowerPC] Optimize not equal compares against zero vectors Aug 30, 2025

Himadhith force-pushed the check_zero_vec branch 2 times, most recently from 6cc2136 to fa30f2d Compare September 2, 2025 11:40

Himadhith mentioned this pull request Sep 5, 2025

Request Commit Access For Himadhith #157016

Closed

Himadhith force-pushed the check_zero_vec branch from fa30f2d to 8bea7a0 Compare September 5, 2025 16:08

Himadhith force-pushed the check_zero_vec branch from ebde840 to de422ad Compare September 24, 2025 07:59

himadhith added 4 commits October 14, 2025 03:36

[PowerPC] vector compare greater than support for Zero vector compari…

163473f

…sons vector compare greater than support for Zero vector comparisons review changes

Changed the patch to work only with Integer vector types

b5c2ce7

Changed the patch to work only with Integer vector types

64c2768

Reusing EVT VecVT

319ed77

Himadhith force-pushed the check_zero_vec branch from 1fbe31b to 319ed77 Compare October 14, 2025 05:17

Merge branch 'main' into check_zero_vec

6023d4a

tonykuttai approved these changes Oct 23, 2025

tonykuttai requested a review from lei137 October 23, 2025 07:13
