[PowerPC] Replace vspltisw+vadduwm instructions with xxleqv+vsubuwm for adding the vector {1, 1, 1, 1} #160882

Himadhith · 2025-09-26T13:34:23Z

This patch optimizes vector additions whose second operand is a vector of 1s. Generating a vector of -1s (using xxleqv) is cheaper than generating a vector of 1s (using vspltisw), so the addition is rewritten as a subtraction of a vector of -1s. The affected vector types are:
v2i64: A + vector {1, 1}
v4i32: A + vector {1, 1, 1, 1}
v8i16: A + vector {1, 1, 1, 1, 1, 1, 1, 1}
v16i8: A + vector {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}

The optimized version replaces vspltisw (4 cycles) with xxleqv (2 cycles) using the following identity:
A - (-1) = A + 1.
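
The identity holds lane by lane because the vector add and subtract wrap modulo the element width. A minimal sketch for the v4i32 case, assuming a plain array stands in for the vector register; the helper names (`addLanes`, `subLanes`, `addOneViaAdd`, `addOneViaSub`) are hypothetical and only model the lane-wise behaviour of vadduwm and vsubuwm, not the instructions themselves:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Four 32-bit lanes, standing in for a v4i32 register (illustrative only).
using V4 = std::array<std::uint32_t, 4>;

// Lane-wise modular add, modelling what vadduwm computes.
V4 addLanes(const V4 &a, const V4 &b) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint32_t>(a[i] + b[i]); // wraps modulo 2^32
  return r;
}

// Lane-wise modular subtract, modelling what vsubuwm computes.
V4 subLanes(const V4 &a, const V4 &b) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint32_t>(a[i] - b[i]); // wraps modulo 2^32
  return r;
}

// Original form: A + {1, 1, 1, 1}.
V4 addOneViaAdd(const V4 &a) { return addLanes(a, V4{1u, 1u, 1u, 1u}); }

// Rewritten form: A - {-1, -1, -1, -1}, where -1 is all bits set in a lane.
V4 addOneViaSub(const V4 &a) {
  const V4 minusOnes{0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu};
  return subLanes(a, minusOnes);
}
```

Both helpers return the same lanes for any input, including the wrap-around lane 0xFFFFFFFF, which becomes 0 in either form.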

llvmbot · 2025-09-26T13:35:03Z

@llvm/pr-subscribers-backend-powerpc

Author: None (Himadhith)

Changes

This patch leverages the fact that generating a vector of -1s is cheaper than generating a vector of 1s to optimize the current implementation of A + vector {1, 1, 1, 1}.

In this optimized version we replace vspltisw (4 cycles) with xxleqv (2 cycles) using the following identity:
A - (-1) = A + 1.

Full diff: https://github.com/llvm/llvm-project/pull/160882.diff

1 Files Affected:

(modified) llvm/lib/Target/PowerPC/PPCInstrVSX.td (+4)

diff --git a/llvm/lib/Target/PowerPC/PPCInstrVSX.td b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
index 4e5165bfcda55..dc850d2470cfd 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrVSX.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
@@ -3627,6 +3627,10 @@ def : Pat<(v4i32 (build_vector immSExt5NonZero:$A, immSExt5NonZero:$A,
                                immSExt5NonZero:$A, immSExt5NonZero:$A)),
           (v4i32 (VSPLTISW imm:$A))>;
 
+// Optimise for vector of 1s addition operation
+def : Pat<(add v4i32:$A, (build_vector (i32 1), (i32 1), (i32 1), (i32 1))),
+          (VSUBUWM $A, (v4i32 (COPY_TO_REGCLASS (XXLEQVOnes), VSRC)))>;
+
 // Splat loads.
 def : Pat<(v8i16 (PPCldsplat ForceXForm:$A)),
           (v8i16 (VSPLTHs 3, (MTVSRWZ (LHZX ForceXForm:$A))))>;

lei137 · 2025-09-26T17:56:55Z

I'm guessing this is not ready to be reviewed, as it needs https://github.com/llvm/llvm-project/pull/160476/files to land first in order to show the difference.

Himadhith · 2025-09-26T18:19:42Z

> I'm guessing this is not ready to be reviewed, as it needs https://github.com/llvm/llvm-project/pull/160476/files to land first in order to show the difference.

Yes, as soon as the NFC patch gets merged I will rebase, and the file should then reflect the changes. Should I keep this as a draft until then?

tonykuttai · 2025-10-06T05:27:24Z

llvm/lib/Target/PowerPC/PPCInstrVSX.td

          (v4i32 (VSPLTISW imm:$A))>;

+// Optimize for vector of 1s addition operation
+def : Pat<(add v4i32:$A, (build_vector (i32 1), (i32 1), (i32 1), (i32 1))),


Does this work only for v4i32 vector types? Why not v2i64, v8i16 and v16i8 types?

I tried to add patterns for the other three types, which are not present. For the v2i64 type, TableGen pattern matching did not work, because the following instructions are generated:

vspltisw 3, 1
vupklsw 3, 3
vaddudm 2, 2, 3

This sequence is difficult to replace gracefully with TableGen patterns, so I am opting for the DAG combiner method to handle this case in the backend instead.

tonykuttai · 2025-10-06T05:27:58Z

llvm/test/CodeGen/PowerPC/vector-all-ones.ll

-; This pattern is expected to be optimized in a future patch by using `xxleqv` to generate vector of -1s
-; followed by subtraction operation.
+; Optimized version of vector addition with {1,1,1,1} by replacing `vspltisw + vadduwm` with 'xxleqv + vsubuwm'
 define dso_local noundef <4 x i32> @test1(<4 x i32> %a) {


Same as the comment above. Can we support the v2i64, v8i16 and v16i8 types as well?

I will add an NFC patch shortly to address the other three types.

github-actions · 2025-10-13T12:41:41Z

✅ With the latest revision this PR passed the C/C++ code formatter.

tonykuttai

Please modify the description to reflect that:

The ADD operation is substituted with SUB.
The build vector of all 1s on the RHS is replaced with a build vector of all -1s.

tonykuttai · 2025-10-17T04:28:55Z

llvm/test/CodeGen/PowerPC/vec_add_sub_doubleword.ll

 ; NOVSX-NEXT:    addi 3, 3, .LCPI1_0@toc@l
 ; NOVSX-NEXT:    lvx 3, 0, 3
-; NOVSX-NEXT:    vaddudm 2, 2, 3
+; NOVSX-NEXT:    vsubudm 2, 2, 3


Please investigate why this got affected.

This was because the code did not check for the VSX attribute. Adding the hasVSX() check fixed this.
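
The shape of such a guarded rewrite can be sketched in a self-contained toy model. This is not the code from the patch and does not use the LLVM SelectionDAG API: `ToyNode`, `Opcode`, `isSplatOfOne` and `combineAddOfOnes` are hypothetical names, and the `hasVSX` parameter merely stands in for the subtarget query. The sketch assumes the constant vector is always the right-hand operand.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

enum class Opcode { Add, Sub };

// Toy stand-in for a vector binary node whose RHS is a constant build vector.
struct ToyNode {
  Opcode opcode;
  std::vector<std::int64_t> rhsConstants;
};

// True when every element of the constant vector is exactly 1.
bool isSplatOfOne(const std::vector<std::int64_t> &elts) {
  if (elts.empty())
    return false;
  for (std::int64_t e : elts)
    if (e != 1)
      return false;
  return true;
}

// Rewrites "A + splat(1)" into "A - splat(-1)". Returns std::nullopt when the
// rewrite does not apply, so the original node is left untouched.
std::optional<ToyNode> combineAddOfOnes(const ToyNode &n, bool hasVSX) {
  // Without VSX there is no xxleqv, so the -1 vector would not be cheap.
  if (!hasVSX)
    return std::nullopt;
  if (n.opcode != Opcode::Add || !isSplatOfOne(n.rhsConstants))
    return std::nullopt;
  return ToyNode{Opcode::Sub,
                 std::vector<std::int64_t>(n.rhsConstants.size(), -1)};
}
```

With `hasVSX` set to false the function returns nothing, which corresponds to the NOVSX check lines keeping vaddudm.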

tonykuttai · 2025-10-17T04:35:03Z

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

+  // Check if RHS is BUILD_VECTOR
+  // To satisfy commutative property a+b = b+a
+  if (RHS.getOpcode() != ISD::BUILD_VECTOR)
+    std::swap(LHS, RHS);


The BUILD_VECTOR has to be on the RHS, so we don't need the swap here.

tonykuttai

Thanks for addressing the comments. LGTM

Himadhith · 2025-10-17T05:47:47Z

This patch does not handle the v1i128 vector type, because the code generated for that type does not contain the vspltisw instruction:

# %bb.0:                                # %entry
        addis 3, 2, .LCPI4_0@toc@ha
        addi 3, 3, .LCPI4_0@toc@l
        lxvd2x 0, 0, 3
        xxswapd 35, 0
        vadduqm 2, 2, 3
        blr
        .long   0
        .quad   0

Himadhith requested review from AditiRM, RolandF77, amy-kwan, kamaub, lei137 and tonykuttai September 26, 2025 13:34

llvmbot added the backend:PowerPC label Sep 26, 2025

Himadhith force-pushed the himadhith/xxleqv_vec branch from 6018f73 to 40edcce Compare September 26, 2025 13:35

Himadhith mentioned this pull request Sep 26, 2025

[NFC][PowerPC] Lockdown instructions of vspltisw for addition of vector of 1s #160476

Merged

Himadhith force-pushed the himadhith/xxleqv_vec branch 4 times, most recently from 8079d5c to da6de91 Compare September 26, 2025 16:32

Himadhith force-pushed the himadhith/xxleqv_vec branch from da6de91 to 5de66e2 Compare September 27, 2025 04:01

tonykuttai requested changes Oct 6, 2025

Himadhith force-pushed the himadhith/xxleqv_vec branch 5 times, most recently from 1221560 to c27a492 Compare October 16, 2025 05:42

himadhith added 2 commits October 16, 2025 13:22

[PowerPC] Replace vspltisw instruction with xxleqv as generation of vector of -1s is cheaper than vector of 1s

cf81591

DAG combiner method as tablegen does not work with v2i64s

73bd0ed

Himadhith force-pushed the himadhith/xxleqv_vec branch 3 times, most recently from 2619e1d to d74869b Compare October 16, 2025 18:17

update checks for affected files

d74869b

tonykuttai requested changes Oct 17, 2025

Himadhith force-pushed the himadhith/xxleqv_vec branch from e50a0d6 to 67a8060 Compare October 17, 2025 05:41

tonykuttai approved these changes Oct 17, 2025

Himadhith force-pushed the himadhith/xxleqv_vec branch 2 times, most recently from 6f3cb1d to 432d6e0 Compare October 17, 2025 05:46

addressing review comments

432d6e0
