@lei137 lei137 commented Mar 5, 2025

Update to use REG_SEQUENCE when possible.

This patch only updates the td patterns to use REG_SEQUENCE in place of INSERT_SUBREG in cases where doing so does not produce
nested REG_SEQUENCEs. This seems to show some improvement in code gen for llvm/test/CodeGen/PowerPC/mmaplus-intrinsics.ll.

Fixes part of #125502
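
As a rough illustration only (not part of the patch), the rewrite can be modeled as a toy DAG transformation in Python. The class names `ImplicitDef`, `InsertSubreg`, `RegSequence` and the helper `flatten` are hypothetical and merely mirror the TableGen node names; in the patch itself the td patterns are edited by hand. The sketch assumes the chain is rooted at an IMPLICIT_DEF and that each subregister index is written at most once.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class ImplicitDef:
    """Toy stand-in for IMPLICIT_DEF: an undefined wide register."""


@dataclass(frozen=True)
class InsertSubreg:
    """Toy stand-in for (INSERT_SUBREG base, value, index)."""
    base: "Node"
    value: str
    index: str


@dataclass(frozen=True)
class RegSequence:
    """Toy stand-in for (REG_SEQUENCE reg_class, value0, index0, ...)."""
    reg_class: str
    parts: Tuple[Tuple[str, str], ...]


Node = Union[ImplicitDef, InsertSubreg, RegSequence]


def flatten(node: Node, reg_class: str) -> Node:
    """Collapse an INSERT_SUBREG chain rooted at IMPLICIT_DEF into one
    REG_SEQUENCE. Any other shape is returned unchanged."""
    parts = []
    cur = node
    while isinstance(cur, InsertSubreg):
        parts.append((cur.value, cur.index))
        cur = cur.base
    if not parts or not isinstance(cur, ImplicitDef):
        return node  # not the simple pattern this sketch handles
    indices = [index for _, index in parts]
    if len(set(indices)) != len(indices):
        return node  # an index is overwritten; leave the chain alone
    parts.reverse()  # innermost insert becomes the first operand pair
    return RegSequence(reg_class, tuple(parts))


# The old Concats.VecsToVecPair0 shape from PPCInstrP10.td.
old = InsertSubreg(
    InsertSubreg(ImplicitDef(), "$vs0", "sub_vsx1"), "$vs1", "sub_vsx0")
new = flatten(old, "VSRpRC")
print(new)
```

With these inputs the result carries the same operand order as the new `(REG_SEQUENCE VSRpRC, $vs0, sub_vsx1, $vs1, sub_vsx0)` pattern in the diff. A chain whose root is not IMPLICIT_DEF is left as is, loosely reflecting the patch's choice to avoid cases that would produce nested REG_SEQUENCEs.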

@lei137 lei137 linked an issue Mar 6, 2025 that may be closed by this pull request
@lei137 lei137 marked this pull request as ready for review March 6, 2025 14:31
@lei137 lei137 requested a review from arsenm March 6, 2025 14:32
@lei137 lei137 requested a review from nemanjai March 6, 2025 14:32

llvmbot commented Mar 6, 2025

@llvm/pr-subscribers-backend-powerpc

Author: Lei Huang (lei137)

Changes

Update to use REG_SEQUENCE when possible.

This patch only updates the td patterns to use REG_SEQUENCE in place of INSERT_SUBREG in cases where doing so does not produce
nested REG_SEQUENCEs. This seems to show some improvement in code gen for llvm/test/CodeGen/PowerPC/mmaplus-intrinsics.ll.

Fixes part of #125502


Patch is 79.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/129941.diff

13 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCInstrMMA.td (+12-13)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrP10.td (+2-8)
  • (modified) llvm/test/CodeGen/PowerPC/bfloat16-outer-product.ll (+32-32)
  • (modified) llvm/test/CodeGen/PowerPC/mma-acc-copy-hints.ll (+6-6)
  • (modified) llvm/test/CodeGen/PowerPC/mma-acc-memops.ll (+28-28)
  • (modified) llvm/test/CodeGen/PowerPC/mma-acc-spill.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/mma-integer-based-outer-product.ll (+16-16)
  • (modified) llvm/test/CodeGen/PowerPC/mma-intrinsics.ll (+16-16)
  • (modified) llvm/test/CodeGen/PowerPC/mma-outer-product.ll (+160-160)
  • (modified) llvm/test/CodeGen/PowerPC/mmaplus-intrinsics.ll (+12-24)
  • (modified) llvm/test/CodeGen/PowerPC/paired-vector-intrinsics.ll (+8-8)
  • (modified) llvm/test/CodeGen/PowerPC/ppc64-acc-regalloc-bugfix.ll (+1-1)
  • (modified) llvm/test/CodeGen/PowerPC/ppc64-acc-regalloc.ll (+48-46)
diff --git a/llvm/lib/Target/PowerPC/PPCInstrMMA.td b/llvm/lib/Target/PowerPC/PPCInstrMMA.td
index 161d4d3c492f3..c40d3996dd181 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrMMA.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrMMA.td
@@ -1055,19 +1055,18 @@ let Predicates = [MMA, PrefixInstrs, IsISAFuture] in {
 }
 
 def ConcatsMMA {
-  dag VecsToVecPair0 =
-    (v256i1 (INSERT_SUBREG
-      (INSERT_SUBREG (IMPLICIT_DEF), $vs0, sub_vsx1),
-      $vs1, sub_vsx0));
-  dag VecsToVecPair1 =
-    (v256i1 (INSERT_SUBREG
-      (INSERT_SUBREG (IMPLICIT_DEF), $vs2, sub_vsx1),
-      $vs3, sub_vsx0));
-  dag VecsToVecQuad =
-    (BUILD_UACC (INSERT_SUBREG
-                  (INSERT_SUBREG (v512i1 (IMPLICIT_DEF)),
-                                 (KILL_PAIR VecsToVecPair0), sub_pair0),
-                  (KILL_PAIR VecsToVecPair1), sub_pair1));
+   dag VecsToVecPair0 =
+          (v256i1 (INSERT_SUBREG
+                    (INSERT_SUBREG (IMPLICIT_DEF), $vs0, sub_vsx1),
+                    $vs1, sub_vsx0));
+   dag VecsToVecPair1 =
+          (v256i1 (INSERT_SUBREG
+                    (INSERT_SUBREG (IMPLICIT_DEF), $vs2, sub_vsx1),
+                    $vs3, sub_vsx0));
+  dag VecsToVecQuad = (BUILD_UACC
+          (v512i1 (REG_SEQUENCE UACCRC,
+                                (KILL_PAIR VecsToVecPair0), sub_pair0,
+                                (KILL_PAIR VecsToVecPair1), sub_pair1)));
 }
 
 def Extracts {
diff --git a/llvm/lib/Target/PowerPC/PPCInstrP10.td b/llvm/lib/Target/PowerPC/PPCInstrP10.td
index 19247c1f3fe6d..39a1ab0d388a7 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrP10.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrP10.td
@@ -1139,17 +1139,11 @@ class MMIRR_XX3Form_XYP4_XAB6<bits<6> opcode, bits<8> xo, dag OOL, dag IOL,
   let Inst{63} = 0;
 }
 
-
-
 def Concats {
   dag VecsToVecPair0 =
-    (v256i1 (INSERT_SUBREG
-      (INSERT_SUBREG (IMPLICIT_DEF), $vs0, sub_vsx1),
-      $vs1, sub_vsx0));
+    (v256i1 (REG_SEQUENCE VSRpRC, $vs0, sub_vsx1, $vs1, sub_vsx0));
   dag VecsToVecPair1 =
-    (v256i1 (INSERT_SUBREG
-      (INSERT_SUBREG (IMPLICIT_DEF), $vs2, sub_vsx1),
-      $vs3, sub_vsx0));
+    (v256i1 (REG_SEQUENCE VSRpRC, $vs2, sub_vsx1, $vs3, sub_vsx0));
 }
 
 let Predicates = [PairedVectorMemops] in {
diff --git a/llvm/test/CodeGen/PowerPC/bfloat16-outer-product.ll b/llvm/test/CodeGen/PowerPC/bfloat16-outer-product.ll
index 881e563ec915a..2a3ae4f6a5007 100644
--- a/llvm/test/CodeGen/PowerPC/bfloat16-outer-product.ll
+++ b/llvm/test/CodeGen/PowerPC/bfloat16-outer-product.ll
@@ -70,10 +70,10 @@ declare <512 x i1> @llvm.ppc.mma.pmxvbf16ger2(<16 x i8>, <16 x i8>, i32, i32, i3
 define dso_local void @test52(ptr nocapture readonly %vqp, ptr nocapture readnone %vpp, <16 x i8> %vc, ptr nocapture %resp) {
 ; CHECK-LABEL: test52:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    lxv vs1, 32(r3)
-; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    lxv vs3, 0(r3)
 ; CHECK-NEXT:    lxv vs2, 16(r3)
+; CHECK-NEXT:    lxv vs1, 32(r3)
+; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    xxmtacc acc0
 ; CHECK-NEXT:    xvbf16ger2pp acc0, v2, v2
 ; CHECK-NEXT:    xxmfacc acc0
@@ -85,10 +85,10 @@ define dso_local void @test52(ptr nocapture readonly %vqp, ptr nocapture readnon
 ;
 ; CHECK-BE-LABEL: test52:
 ; CHECK-BE:       # %bb.0: # %entry
-; CHECK-BE-NEXT:    lxv vs1, 16(r3)
-; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    lxv vs3, 48(r3)
 ; CHECK-BE-NEXT:    lxv vs2, 32(r3)
+; CHECK-BE-NEXT:    lxv vs1, 16(r3)
+; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    xxmtacc acc0
 ; CHECK-BE-NEXT:    xvbf16ger2pp acc0, v2, v2
 ; CHECK-BE-NEXT:    xxmfacc acc0
@@ -111,10 +111,10 @@ declare <512 x i1> @llvm.ppc.mma.xvbf16ger2pp(<512 x i1>, <16 x i8>, <16 x i8>)
 define dso_local void @test53(ptr nocapture readonly %vqp, ptr nocapture readnone %vpp, <16 x i8> %vc, ptr nocapture %resp) {
 ; CHECK-LABEL: test53:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    lxv vs1, 32(r3)
-; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    lxv vs3, 0(r3)
 ; CHECK-NEXT:    lxv vs2, 16(r3)
+; CHECK-NEXT:    lxv vs1, 32(r3)
+; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    xxmtacc acc0
 ; CHECK-NEXT:    xvbf16ger2pn acc0, v2, v2
 ; CHECK-NEXT:    xxmfacc acc0
@@ -126,10 +126,10 @@ define dso_local void @test53(ptr nocapture readonly %vqp, ptr nocapture readnon
 ;
 ; CHECK-BE-LABEL: test53:
 ; CHECK-BE:       # %bb.0: # %entry
-; CHECK-BE-NEXT:    lxv vs1, 16(r3)
-; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    lxv vs3, 48(r3)
 ; CHECK-BE-NEXT:    lxv vs2, 32(r3)
+; CHECK-BE-NEXT:    lxv vs1, 16(r3)
+; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    xxmtacc acc0
 ; CHECK-BE-NEXT:    xvbf16ger2pn acc0, v2, v2
 ; CHECK-BE-NEXT:    xxmfacc acc0
@@ -152,10 +152,10 @@ declare <512 x i1> @llvm.ppc.mma.xvbf16ger2pn(<512 x i1>, <16 x i8>, <16 x i8>)
 define dso_local void @test54(ptr nocapture readonly %vqp, ptr nocapture readnone %vpp, <16 x i8> %vc, ptr nocapture %resp) {
 ; CHECK-LABEL: test54:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    lxv vs1, 32(r3)
-; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    lxv vs3, 0(r3)
 ; CHECK-NEXT:    lxv vs2, 16(r3)
+; CHECK-NEXT:    lxv vs1, 32(r3)
+; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    xxmtacc acc0
 ; CHECK-NEXT:    xvbf16ger2np acc0, v2, v2
 ; CHECK-NEXT:    xxmfacc acc0
@@ -167,10 +167,10 @@ define dso_local void @test54(ptr nocapture readonly %vqp, ptr nocapture readnon
 ;
 ; CHECK-BE-LABEL: test54:
 ; CHECK-BE:       # %bb.0: # %entry
-; CHECK-BE-NEXT:    lxv vs1, 16(r3)
-; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    lxv vs3, 48(r3)
 ; CHECK-BE-NEXT:    lxv vs2, 32(r3)
+; CHECK-BE-NEXT:    lxv vs1, 16(r3)
+; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    xxmtacc acc0
 ; CHECK-BE-NEXT:    xvbf16ger2np acc0, v2, v2
 ; CHECK-BE-NEXT:    xxmfacc acc0
@@ -193,10 +193,10 @@ declare <512 x i1> @llvm.ppc.mma.xvbf16ger2np(<512 x i1>, <16 x i8>, <16 x i8>)
 define dso_local void @test55(ptr nocapture readonly %vqp, ptr nocapture readnone %vpp, <16 x i8> %vc, ptr nocapture %resp) {
 ; CHECK-LABEL: test55:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    lxv vs1, 32(r3)
-; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    lxv vs3, 0(r3)
 ; CHECK-NEXT:    lxv vs2, 16(r3)
+; CHECK-NEXT:    lxv vs1, 32(r3)
+; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    xxmtacc acc0
 ; CHECK-NEXT:    xvbf16ger2nn acc0, v2, v2
 ; CHECK-NEXT:    xxmfacc acc0
@@ -208,10 +208,10 @@ define dso_local void @test55(ptr nocapture readonly %vqp, ptr nocapture readnon
 ;
 ; CHECK-BE-LABEL: test55:
 ; CHECK-BE:       # %bb.0: # %entry
-; CHECK-BE-NEXT:    lxv vs1, 16(r3)
-; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    lxv vs3, 48(r3)
 ; CHECK-BE-NEXT:    lxv vs2, 32(r3)
+; CHECK-BE-NEXT:    lxv vs1, 16(r3)
+; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    xxmtacc acc0
 ; CHECK-BE-NEXT:    xvbf16ger2nn acc0, v2, v2
 ; CHECK-BE-NEXT:    xxmfacc acc0
@@ -234,10 +234,10 @@ declare <512 x i1> @llvm.ppc.mma.xvbf16ger2nn(<512 x i1>, <16 x i8>, <16 x i8>)
 define dso_local void @test56(ptr nocapture readonly %vqp, ptr nocapture readnone %vpp, <16 x i8> %vc, ptr nocapture %resp) {
 ; CHECK-LABEL: test56:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    lxv vs1, 32(r3)
-; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    lxv vs3, 0(r3)
 ; CHECK-NEXT:    lxv vs2, 16(r3)
+; CHECK-NEXT:    lxv vs1, 32(r3)
+; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    xxmtacc acc0
 ; CHECK-NEXT:    pmxvbf16ger2pp acc0, v2, v2, 0, 0, 0
 ; CHECK-NEXT:    xxmfacc acc0
@@ -249,10 +249,10 @@ define dso_local void @test56(ptr nocapture readonly %vqp, ptr nocapture readnon
 ;
 ; CHECK-BE-LABEL: test56:
 ; CHECK-BE:       # %bb.0: # %entry
-; CHECK-BE-NEXT:    lxv vs1, 16(r3)
-; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    lxv vs3, 48(r3)
 ; CHECK-BE-NEXT:    lxv vs2, 32(r3)
+; CHECK-BE-NEXT:    lxv vs1, 16(r3)
+; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    xxmtacc acc0
 ; CHECK-BE-NEXT:    pmxvbf16ger2pp acc0, v2, v2, 0, 0, 0
 ; CHECK-BE-NEXT:    xxmfacc acc0
@@ -275,10 +275,10 @@ declare <512 x i1> @llvm.ppc.mma.pmxvbf16ger2pp(<512 x i1>, <16 x i8>, <16 x i8>
 define dso_local void @test57(ptr nocapture readonly %vqp, ptr nocapture readnone %vpp, <16 x i8> %vc, ptr nocapture %resp) {
 ; CHECK-LABEL: test57:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    lxv vs1, 32(r3)
-; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    lxv vs3, 0(r3)
 ; CHECK-NEXT:    lxv vs2, 16(r3)
+; CHECK-NEXT:    lxv vs1, 32(r3)
+; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    xxmtacc acc0
 ; CHECK-NEXT:    pmxvbf16ger2pn acc0, v2, v2, 0, 0, 0
 ; CHECK-NEXT:    xxmfacc acc0
@@ -290,10 +290,10 @@ define dso_local void @test57(ptr nocapture readonly %vqp, ptr nocapture readnon
 ;
 ; CHECK-BE-LABEL: test57:
 ; CHECK-BE:       # %bb.0: # %entry
-; CHECK-BE-NEXT:    lxv vs1, 16(r3)
-; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    lxv vs3, 48(r3)
 ; CHECK-BE-NEXT:    lxv vs2, 32(r3)
+; CHECK-BE-NEXT:    lxv vs1, 16(r3)
+; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    xxmtacc acc0
 ; CHECK-BE-NEXT:    pmxvbf16ger2pn acc0, v2, v2, 0, 0, 0
 ; CHECK-BE-NEXT:    xxmfacc acc0
@@ -316,10 +316,10 @@ declare <512 x i1> @llvm.ppc.mma.pmxvbf16ger2pn(<512 x i1>, <16 x i8>, <16 x i8>
 define dso_local void @test58(ptr nocapture readonly %vqp, ptr nocapture readnone %vpp, <16 x i8> %vc, ptr nocapture %resp) {
 ; CHECK-LABEL: test58:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    lxv vs1, 32(r3)
-; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    lxv vs3, 0(r3)
 ; CHECK-NEXT:    lxv vs2, 16(r3)
+; CHECK-NEXT:    lxv vs1, 32(r3)
+; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    xxmtacc acc0
 ; CHECK-NEXT:    pmxvbf16ger2np acc0, v2, v2, 0, 0, 0
 ; CHECK-NEXT:    xxmfacc acc0
@@ -331,10 +331,10 @@ define dso_local void @test58(ptr nocapture readonly %vqp, ptr nocapture readnon
 ;
 ; CHECK-BE-LABEL: test58:
 ; CHECK-BE:       # %bb.0: # %entry
-; CHECK-BE-NEXT:    lxv vs1, 16(r3)
-; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    lxv vs3, 48(r3)
 ; CHECK-BE-NEXT:    lxv vs2, 32(r3)
+; CHECK-BE-NEXT:    lxv vs1, 16(r3)
+; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    xxmtacc acc0
 ; CHECK-BE-NEXT:    pmxvbf16ger2np acc0, v2, v2, 0, 0, 0
 ; CHECK-BE-NEXT:    xxmfacc acc0
@@ -357,10 +357,10 @@ declare <512 x i1> @llvm.ppc.mma.pmxvbf16ger2np(<512 x i1>, <16 x i8>, <16 x i8>
 define dso_local void @test59(ptr nocapture readonly %vqp, ptr nocapture readnone %vpp, <16 x i8> %vc, ptr nocapture %resp) {
 ; CHECK-LABEL: test59:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    lxv vs1, 32(r3)
-; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    lxv vs3, 0(r3)
 ; CHECK-NEXT:    lxv vs2, 16(r3)
+; CHECK-NEXT:    lxv vs1, 32(r3)
+; CHECK-NEXT:    lxv vs0, 48(r3)
 ; CHECK-NEXT:    xxmtacc acc0
 ; CHECK-NEXT:    pmxvbf16ger2nn acc0, v2, v2, 0, 0, 0
 ; CHECK-NEXT:    xxmfacc acc0
@@ -372,10 +372,10 @@ define dso_local void @test59(ptr nocapture readonly %vqp, ptr nocapture readnon
 ;
 ; CHECK-BE-LABEL: test59:
 ; CHECK-BE:       # %bb.0: # %entry
-; CHECK-BE-NEXT:    lxv vs1, 16(r3)
-; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    lxv vs3, 48(r3)
 ; CHECK-BE-NEXT:    lxv vs2, 32(r3)
+; CHECK-BE-NEXT:    lxv vs1, 16(r3)
+; CHECK-BE-NEXT:    lxv vs0, 0(r3)
 ; CHECK-BE-NEXT:    xxmtacc acc0
 ; CHECK-BE-NEXT:    pmxvbf16ger2nn acc0, v2, v2, 0, 0, 0
 ; CHECK-BE-NEXT:    xxmfacc acc0
diff --git a/llvm/test/CodeGen/PowerPC/mma-acc-copy-hints.ll b/llvm/test/CodeGen/PowerPC/mma-acc-copy-hints.ll
index 5decd9a639af8..7e2f744ac1d71 100644
--- a/llvm/test/CodeGen/PowerPC/mma-acc-copy-hints.ll
+++ b/llvm/test/CodeGen/PowerPC/mma-acc-copy-hints.ll
@@ -28,9 +28,9 @@ define void @testMultiply(ptr nocapture noundef readonly %a, ptr nocapture nound
 ; CHECK-NEXT:    bl _Z15buildVectorPairPu13__vector_pairDv16_hS0_@notoc
 ; CHECK-NEXT:    xxsetaccz acc1
 ; CHECK-NEXT:    xvf32gerpp acc1, v31, v30
-; CHECK-NEXT:    lxv v3, 32(r1)
 ; CHECK-NEXT:    lxv vs0, 48(r1)
-; CHECK-NEXT:    xvf32gerpp acc1, v3, vs0
+; CHECK-NEXT:    lxv vs1, 32(r1)
+; CHECK-NEXT:    xvf32gerpp acc1, vs1, vs0
 ; CHECK-NEXT:    lxv v31, -48(r30) # 16-byte Folded Reload
 ; CHECK-NEXT:    lxv v30, -64(r30) # 16-byte Folded Reload
 ; CHECK-NEXT:    xxmfacc acc1
@@ -71,16 +71,16 @@ define void @testMultiply(ptr nocapture noundef readonly %a, ptr nocapture nound
 ; CHECK-BE-NEXT:    nop
 ; CHECK-BE-NEXT:    xxsetaccz acc1
 ; CHECK-BE-NEXT:    xvf32gerpp acc1, v31, v30
-; CHECK-BE-NEXT:    lxv v3, 144(r1)
 ; CHECK-BE-NEXT:    lxv vs0, 128(r1)
-; CHECK-BE-NEXT:    xvf32gerpp acc1, vs0, v3
+; CHECK-BE-NEXT:    lxv vs1, 144(r1)
+; CHECK-BE-NEXT:    xvf32gerpp acc1, vs0, vs1
 ; CHECK-BE-NEXT:    lxv v31, -48(r30) # 16-byte Folded Reload
 ; CHECK-BE-NEXT:    lxv v30, -64(r30) # 16-byte Folded Reload
 ; CHECK-BE-NEXT:    xxmfacc acc1
-; CHECK-BE-NEXT:    xxlor vs1, vs6, vs6
-; CHECK-BE-NEXT:    xxlor vs0, vs7, vs7
 ; CHECK-BE-NEXT:    xxlor vs3, vs4, vs4
 ; CHECK-BE-NEXT:    xxlor vs2, vs5, vs5
+; CHECK-BE-NEXT:    xxlor vs1, vs6, vs6
+; CHECK-BE-NEXT:    xxlor vs0, vs7, vs7
 ; CHECK-BE-NEXT:    stxv vs0, 0(r29)
 ; CHECK-BE-NEXT:    pstxv vs1, 8(r29), 0
 ; CHECK-BE-NEXT:    stxv vs2, 16(r29)
diff --git a/llvm/test/CodeGen/PowerPC/mma-acc-memops.ll b/llvm/test/CodeGen/PowerPC/mma-acc-memops.ll
index 31ddc619d9762..059d60a9608f8 100644
--- a/llvm/test/CodeGen/PowerPC/mma-acc-memops.ll
+++ b/llvm/test/CodeGen/PowerPC/mma-acc-memops.ll
@@ -26,10 +26,10 @@
 define dso_local void @testLdSt(i64 %SrcIdx, i64 %DstIdx) {
 ; LE-PAIRED-LABEL: testLdSt:
 ; LE-PAIRED:       # %bb.0: # %entry
-; LE-PAIRED-NEXT:    plxv vs1, f@PCREL+96(0), 1
-; LE-PAIRED-NEXT:    plxv vs0, f@PCREL+112(0), 1
 ; LE-PAIRED-NEXT:    plxv vs3, f@PCREL+64(0), 1
 ; LE-PAIRED-NEXT:    plxv vs2, f@PCREL+80(0), 1
+; LE-PAIRED-NEXT:    plxv vs1, f@PCREL+96(0), 1
+; LE-PAIRED-NEXT:    plxv vs0, f@PCREL+112(0), 1
 ; LE-PAIRED-NEXT:    pstxv vs0, f@PCREL+176(0), 1
 ; LE-PAIRED-NEXT:    pstxv vs1, f@PCREL+160(0), 1
 ; LE-PAIRED-NEXT:    pstxv vs2, f@PCREL+144(0), 1
@@ -40,10 +40,10 @@ define dso_local void @testLdSt(i64 %SrcIdx, i64 %DstIdx) {
 ; BE-PAIRED:       # %bb.0: # %entry
 ; BE-PAIRED-NEXT:    addis r3, r2, f@toc@ha
 ; BE-PAIRED-NEXT:    addi r3, r3, f@toc@l
-; BE-PAIRED-NEXT:    lxv vs1, 80(r3)
-; BE-PAIRED-NEXT:    lxv vs0, 64(r3)
 ; BE-PAIRED-NEXT:    lxv vs3, 112(r3)
 ; BE-PAIRED-NEXT:    lxv vs2, 96(r3)
+; BE-PAIRED-NEXT:    lxv vs1, 80(r3)
+; BE-PAIRED-NEXT:    lxv vs0, 64(r3)
 ; BE-PAIRED-NEXT:    stxv vs1, 144(r3)
 ; BE-PAIRED-NEXT:    stxv vs0, 128(r3)
 ; BE-PAIRED-NEXT:    stxv vs3, 176(r3)
@@ -135,12 +135,12 @@ define dso_local void @testXLdSt(i64 %SrcIdx, i64 %DstIdx) {
 ; LE-PAIRED-NEXT:    paddi r5, 0, f@PCREL, 1
 ; LE-PAIRED-NEXT:    sldi r3, r3, 6
 ; LE-PAIRED-NEXT:    add r6, r5, r3
-; LE-PAIRED-NEXT:    lxv vs1, 32(r6)
-; LE-PAIRED-NEXT:    lxv vs0, 48(r6)
 ; LE-PAIRED-NEXT:    lxvx vs3, r5, r3
-; LE-PAIRED-NEXT:    lxv vs2, 16(r6)
 ; LE-PAIRED-NEXT:    sldi r3, r4, 6
 ; LE-PAIRED-NEXT:    add r4, r5, r3
+; LE-PAIRED-NEXT:    lxv vs2, 16(r6)
+; LE-PAIRED-NEXT:    lxv vs1, 32(r6)
+; LE-PAIRED-NEXT:    lxv vs0, 48(r6)
 ; LE-PAIRED-NEXT:    stxvx vs3, r5, r3
 ; LE-PAIRED-NEXT:    stxv vs0, 48(r4)
 ; LE-PAIRED-NEXT:    stxv vs1, 32(r4)
@@ -153,12 +153,12 @@ define dso_local void @testXLdSt(i64 %SrcIdx, i64 %DstIdx) {
 ; BE-PAIRED-NEXT:    addi r5, r5, f@toc@l
 ; BE-PAIRED-NEXT:    sldi r3, r3, 6
 ; BE-PAIRED-NEXT:    add r6, r5, r3
+; BE-PAIRED-NEXT:    lxv vs3, 48(r6)
+; BE-PAIRED-NEXT:    lxv vs2, 32(r6)
 ; BE-PAIRED-NEXT:    lxvx vs0, r5, r3
+; BE-PAIRED-NEXT:    lxv vs1, 16(r6)
 ; BE-PAIRED-NEXT:    sldi r3, r4, 6
 ; BE-PAIRED-NEXT:    add r4, r5, r3
-; BE-PAIRED-NEXT:    lxv vs1, 16(r6)
-; BE-PAIRED-NEXT:    lxv vs3, 48(r6)
-; BE-PAIRED-NEXT:    lxv vs2, 32(r6)
 ; BE-PAIRED-NEXT:    stxvx vs0, r5, r3
 ; BE-PAIRED-NEXT:    stxv vs1, 16(r4)
 ; BE-PAIRED-NEXT:    stxv vs3, 48(r4)
@@ -253,10 +253,10 @@ entry:
 define dso_local void @testUnalignedLdSt() {
 ; LE-PAIRED-LABEL: testUnalignedLdSt:
 ; LE-PAIRED:       # %bb.0: # %entry
-; LE-PAIRED-NEXT:    plxv vs1, f@PCREL+43(0), 1
-; LE-PAIRED-NEXT:    plxv vs0, f@PCREL+59(0), 1
 ; LE-PAIRED-NEXT:    plxv vs3, f@PCREL+11(0), 1
 ; LE-PAIRED-NEXT:    plxv vs2, f@PCREL+27(0), 1
+; LE-PAIRED-NEXT:    plxv vs1, f@PCREL+43(0), 1
+; LE-PAIRED-NEXT:    plxv vs0, f@PCREL+59(0), 1
 ; LE-PAIRED-NEXT:    pstxv vs0, f@PCREL+67(0), 1
 ; LE-PAIRED-NEXT:    pstxv vs1, f@PCREL+51(0), 1
 ; LE-PAIRED-NEXT:    pstxv vs2, f@PCREL+35(0), 1
@@ -267,10 +267,10 @@ define dso_local void @testUnalignedLdSt() {
 ; BE-PAIRED:       # %bb.0: # %entry
 ; BE-PAIRED-NEXT:    addis r3, r2, f@toc@ha
 ; BE-PAIRED-NEXT:    addi r3, r3, f@toc@l
-; BE-PAIRED-NEXT:    plxv vs1, 27(r3), 0
-; BE-PAIRED-NEXT:    plxv vs0, 11(r3), 0
 ; BE-PAIRED-NEXT:    plxv vs3, 59(r3), 0
 ; BE-PAIRED-NEXT:    plxv vs2, 43(r3), 0
+; BE-PAIRED-NEXT:    plxv vs1, 27(r3), 0
+; BE-PAIRED-NEXT:    plxv vs0, 11(r3), 0
 ; BE-PAIRED-NEXT:    pstxv vs1, 35(r3), 0
 ; BE-PAIRED-NEXT:    pstxv vs0, 19(r3), 0
 ; BE-PAIRED-NEXT:    pstxv vs3, 67(r3), 0
@@ -375,19 +375,19 @@ entry:
 define dso_local void @testLdStPair(i64 %SrcIdx, i64 %DstIdx) {
 ; LE-PAIRED-LABEL: testLdStPair:
 ; LE-PAIRED:       # %bb.0: # %entry
-; LE-PAIRED-NEXT:    plxv v3, g@PCREL+32(0), 1
 ; LE-PAIRED-NEXT:    plxv vs0, g@PCREL+48(0), 1
+; LE-PAIRED-NEXT:    plxv vs1, g@PCREL+32(0), 1
 ; LE-PAIRED-NEXT:    pstxv vs0, g@PCREL+80(0), 1
-; LE-PAIRED-NEXT:    pstxv v3, g@PCREL+64(0), 1
+; LE-PAIRED-NEXT:    pstxv vs1, g@PCREL+64(0), 1
 ; LE-PAIRED-NEXT:    blr
 ;
 ; BE-PAIRED-LABEL: testLdStPair:
 ; BE-PAIRED:       # %bb.0: # %entry
 ; BE-PAIRED-NEXT:    addis r3, r2, g@toc@ha
 ; BE-PAIRED-NEXT:    addi r3, r3, g@toc@l
-; BE-PAIRED-NEXT:    lxv v3, 48(r3)
 ; BE-PAIRED-NEXT:    lxv vs0, 32(r3)
-; BE-PAIRED-NEXT:    stxv v3, 80(r3)
+; BE-PAIRED-NEXT:    lxv vs1, 48(r3)
+; BE-PAIRED-NEXT:    stxv vs1, 80(r3)
 ; BE-PAIRED-NEXT:    stxv vs0, 64(r3)
 ; BE-PAIRED-NEXT:    blr
 ;
@@ -452,12 +452,12 @@ define dso_local void @testXLdStPair(i64 %SrcIdx, i64 %DstIdx) {
 ; LE-PAIRED-NEXT:    sldi r3, r3, 5
 ; LE-PAIRED-NEXT:    paddi r5, 0, g@PCREL, 1
 ; LE-PAIRED-NEXT:    add r6, r5, r3
-; LE-PAIRED-NEXT:    lxvx v3, r5, r3
+; LE-PAIRED-NEXT:    lxvx vs0, r5, r3
 ; LE-PAIRED-NEXT:    sldi r3, r4, 5
 ; LE-PAIRED-NEXT:    add r4, r5, r3
-; LE-PAIRED-NEXT:    lxv vs0, 16(r6)
-; LE-PAIRED-NEXT:    stxvx v3, r5, r3
-; LE-PAIRED-NEXT:    stxv vs0, 16(r4)
+; LE-PAIRED-NEXT:    lxv vs1, 16(r6)
+; LE-PAIRED-NEXT:    stxvx vs0, r5, r3
+; LE-PAIRED-NEXT:    stxv vs1, 16(r4)
 ; LE-PAIRED-NEXT:    blr
 ;
 ; BE-PAIRED-LABEL: testXLdStPair:
@@ -469,9 +469,9 @@ define dso_local void @testXLdStPair(i64 %SrcIdx, i64 %DstIdx) {
 ; BE-PAIRED-NEXT:    lxvx vs0, r5, r3
 ; BE-PAIRED-NEXT:    sldi r3, r4, 5
 ; BE-PAIRED-NEXT:    add r4, r5, r3
-; BE-PAIRED-NEXT:    lxv v3, 16(r6)
+; BE-PAIRED-NEXT:    lxv vs1, 16(r6)
 ; BE-PAIRED-NEXT:    stxvx vs0, r5, r3
-; BE-PAIRED-NEXT:    stxv v3, 16(r4)
+; BE-PAIRED-NEXT:    stxv vs1, 16(r4)
 ; BE-PAIRED-NEXT:    blr
 ;
 ; LE-PWR9-LABEL: testXLdStPair:
@@ -542,19 +542,19 @@ entry:
 define dso_local void @testUnalignedLdStPair() {
 ; LE-PAIRED-LABEL: testUnalignedLdStPair:
 ; LE-PAIRED:       # %bb.0: # %entry
-; LE-PAIRED-NEXT:    plxv v3, g@PCREL+11(0), 1
 ; LE-PAIRED-NEXT:    plxv vs0, g@PCREL+27(0), 1
+; LE-PAIRED-NEXT:    plxv vs1, g@PCREL+11(0), 1
 ; LE-PAIRED-NEXT:    pstxv vs0, g@PCREL+35(0), 1
-; LE-PAIRED-NEXT:    pstxv v3, g@PCREL+19(0), 1
+; LE-PAIRED-NEXT:    pstxv vs1, g@PCREL+19(0), 1
 ; LE-PAIRED-NEXT:    blr
 ;
 ; BE-PAIRED-LABEL: testUnalignedLdStPair:
 ; BE-PAIRED:       # %bb.0: # %entry
 ; BE-PAIRED-NEXT:    addis r3, r2, g@toc@ha
 ; BE-PAIRED-NEXT:    addi r3, r3, g@toc@l
-; BE-PAIRED-NEXT:    plxv v3, 27(r3), 0
 ; BE-PAIRED-NEXT:    plxv vs0, 11(r3), 0
-; BE-PAIRED-NEXT:    pstxv v3, 35(r3), 0
+; BE-PAIRED-NEXT:    plxv vs1, 27(r3), 0
+; BE-PAIRED-NEXT:    pstxv vs1, 35(r3), 0
 ; BE-PAIRED-NEXT:    pstxv vs0, 19(r3), 0
 ; BE-PAIRED-NEXT:    blr
 ;
diff --git a/llvm/test/CodeGen/PowerPC/mma-acc-spill.ll b/llvm/test/CodeGen/PowerPC/mma-acc-spill.ll
index 681f81d74794d..abc65bed5bf6c 100644
--- a/llvm/test/CodeGen/PowerPC/mma-acc-spill.ll
+++ b/llvm/test/CodeGen/PowerPC/mma-acc-spill.ll
@@ -38,9 +38,9 @@ define void @intrinsics1(<16 x i8> %vc1, <16 x i8> %vc2, <16 x i8> %vc3, <16 x i
 ; CHECK-NEXT:    stxv v31, 144(r1) # 16-byte Folded Spill
 ; CHECK-NEXT:    vmr v31, v5
 ; CHECK-NEX...
[truncated]

@lei137 lei137 requested review from amy-kwan and diggerlin March 6, 2025 14:32
@lei137 lei137 requested a review from RolandF77 March 6, 2025 14:36
@lei137 lei137 force-pushed the lei/useREG_SEQUENCE branch from afd06e6 to fd2ad96 Compare March 18, 2025 17:33
@lei137 lei137 merged commit dbc7665 into llvm:main Mar 18, 2025
6 of 9 checks passed

llvm-ci commented Mar 18, 2025

LLVM Buildbot has detected a new failure on builder llvm-nvptx-nvidia-ubuntu running on as-builder-7 while building llvm at step 2 "checkout".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/14922

Here is the relevant piece of the build log for the reference
Step 2 (checkout) failure: update (failure)
git version 2.34.1
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

@lei137 lei137 deleted the lei/useREG_SEQUENCE branch September 5, 2025 15:43
Linked issue: PPC should use REG_SEQUENCE instead of sequences of INSERT_SUBREG

4 participants