@preames preames commented Mar 18, 2025

This implements initial code generation support for a subset of the xrivosvizip extension. Specifically, this adds support for vzipeven, vzipodd, and vzip2a, but not vzip2b, vunzip2a, or vunzip2b. The others will follow in separate patches.

One review note: The zipeven/zipodd matchers were recently rewritten to better match upstream style, so careful review there would be appreciated. The matchers don't yet support type coercion to wider types. This will be done in a future patch.
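
For anyone reviewing the matchers, the lane patterns they look for can be modelled on plain vectors. This is a rough sketch inferred from the `isZipEven`/`isZipOdd` conditions and the test output in this patch, not from the extension specification. The helper names `zipEven`, `zipOdd`, and `zip2a` are hypothetical, and `zip2a` is assumed to interleave the low halves of its operands.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Assumed zipeven model: even result lanes take a[i], odd lanes take b[i - 1],
// so the result is a0, b0, a2, b2, ...
Vec zipEven(const Vec &a, const Vec &b) {
  assert(a.size() == b.size());
  Vec r(a.size());
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (i % 2 == 0) ? a[i] : b[i - 1];
  return r;
}

// Assumed zipodd model: even result lanes take a[i + 1], odd lanes take b[i],
// so the result is a1, b1, a3, b3, ... (requires an even element count).
Vec zipOdd(const Vec &a, const Vec &b) {
  assert(a.size() == b.size() && a.size() % 2 == 0);
  Vec r(a.size());
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (i % 2 == 0) ? a[i + 1] : b[i];
  return r;
}

// Assumed zip2a model: interleave the low halves, a0, b0, a1, b1, ...
Vec zip2a(const Vec &a, const Vec &b) {
  assert(a.size() == b.size());
  Vec r(a.size());
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (i % 2 == 0) ? a[i / 2] : b[i / 2];
  return r;
}
```

With `a = {0, 1, 2, 3}` and `b = {10, 11, 12, 13}`, this model gives `{0, 10, 2, 12}` for `zipEven`, `{1, 11, 3, 13}` for `zipOdd`, and `{0, 10, 1, 11}` for `zip2a`.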

This implements initial code generation support for the xrivosvizip
extension.  A couple of things to note:
* The zipeven/zipodd matchers were recently rewritten to better match
  upstream style, so careful review there would be appreciated.
* The zipeven/zipodd cases don't yet support type coercion.  This will be
  done in a future patch.
* I subsetted the unzip2a/b support in a way which makes it functional,
  but far from optimal.  A further change will reintroduce some of
  the complexity once it's easy to test and show incremental change.
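
To make the "functional, but far from optimal" unzip2a/b lowering concrete, here is a small element-level model of what the patch does for a factor-2 deinterleave: unzip each source on its own (with an undefined second operand), keep the low half of each result, and concatenate. It is only a sketch. `unzip2`, `deinterleave2`, and the `Undef` sentinel are hypothetical names, and the behaviour of the instruction on the high half is deliberately left unmodelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;
constexpr int Undef = -1; // stand-in for lanes whose value is not relied upon

// Model of unzip2a (index 0) or unzip2b (index 1) on a single source:
// the low half receives every second lane starting at `index`.
Vec unzip2(const Vec &v, unsigned index) {
  assert(index < 2 && v.size() % 2 == 0);
  Vec r(v.size(), Undef);
  for (std::size_t i = 0; i < v.size() / 2; ++i)
    r[i] = v[2 * i + index];
  return r;
}

// Mirrors the shape of the lowering in the patch: two unzips, two
// low-half extracts, then one concatenation.
Vec deinterleave2(const Vec &v1, const Vec &v2, unsigned index) {
  assert(v1.size() == v2.size());
  const auto half = static_cast<std::ptrdiff_t>(v1.size() / 2);
  const Vec lo = unzip2(v1, index);
  const Vec hi = unzip2(v2, index);
  Vec r(lo.begin(), lo.begin() + half);
  r.insert(r.end(), hi.begin(), hi.begin() + half);
  return r;
}
```

The cost is visible in the model: each shuffle needs two unzip operations plus the extract and concat steps, which is the overhead the follow-up change is meant to reduce.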

llvmbot commented Mar 18, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Philip Reames (preames)

Changes

This implements initial code generation support for the xrivosvizip extension. A couple of things to note:

  • The zipeven/zipodd matchers were recently rewritten to better match upstream style, so careful review there would be appreciated. The matchers don't yet support type coercion. This will be done in a future patch.
  • I subsetted the unzip2a/b support in a way which makes it functional, but far from optimal. A further change will reintroduce some of the complexity once it's easy to test and show incremental change.
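
As a reading aid for the zipeven/zipodd matchers, the following is a simplified standalone classifier over a shuffle mask. It assumes the usual shufflevector encoding (`-1` for undef, `[0, N)` for the first operand, `[N, 2N)` for the second) and a fixed operand order, whereas the matchers in the patch work from `SrcInfo` and accept either source order. `ZipKind` and `classifyZip` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

enum class ZipKind { None, Even, Odd };

// Classifies a two-source shuffle mask of N lanes.
//   zipeven pattern: lane i takes first[i]     (i even) or second[i - 1] (i odd)
//   zipodd  pattern: lane i takes first[i + 1] (i even) or second[i]     (i odd)
// Undef lanes (negative entries) match either pattern.
ZipKind classifyZip(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  bool Even = true, Odd = true, SawDefined = false;
  for (int i = 0; i < N; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue;
    SawDefined = true;
    const int ExpectEven = (i % 2 == 0) ? i : N + i - 1;
    const int ExpectOdd = (i % 2 == 0) ? i + 1 : N + i;
    Even = Even && (M == ExpectEven);
    Odd = Odd && (M == ExpectOdd);
  }
  if (!SawDefined)
    return ZipKind::None; // an all-undef mask carries no information
  if (Even)
    return ZipKind::Even;
  if (Odd)
    return ZipKind::Odd;
  return ZipKind::None;
}
```

For four lanes, `{0, 4, 2, 6}` classifies as `Even`, `{1, 5, 3, 7}` as `Odd`, and the identity mask `{0, 1, 2, 3}` as `None`.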

Patch is 57.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/131933.diff

6 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+121-5)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.h (+9-1)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXRivos.td (+42)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave2.ll (+160-74)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int-interleave.ll (+310)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-zipeven-zipodd.ll (+140)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 27a4bbce1f5fc..db9535b1a081a 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4553,8 +4553,10 @@ static SDValue getSingleShuffleSrc(MVT VT, SDValue V1, SDValue V2) {
 /// way through the source.
 static bool isInterleaveShuffle(ArrayRef<int> Mask, MVT VT, int &EvenSrc,
                                 int &OddSrc, const RISCVSubtarget &Subtarget) {
-  // We need to be able to widen elements to the next larger integer type.
-  if (VT.getScalarSizeInBits() >= Subtarget.getELen())
+  // We need to be able to widen elements to the next larger integer type or
+  // use the zip2a instruction at e64.
+  if (VT.getScalarSizeInBits() >= Subtarget.getELen() &&
+      !Subtarget.hasVendorXRivosVizip())
     return false;
 
   int Size = Mask.size();
@@ -4611,6 +4613,43 @@ static bool isElementRotate(std::array<std::pair<int, int>, 2> &SrcInfo,
          SrcInfo[1].second - SrcInfo[0].second == (int)NumElts;
 }
 
+static bool isAlternating(std::array<std::pair<int, int>, 2> &SrcInfo,
+                          ArrayRef<int> Mask, bool &Polarity) {
+  int NumElts = Mask.size();
+  bool NonUndefFound = false;
+  for (unsigned i = 0; i != Mask.size(); ++i) {
+    int M = Mask[i];
+    if (M < 0)
+        continue;
+    int Src = M >= (int)NumElts;
+    int Diff = (int)i - (M % NumElts);
+    bool C = Src == SrcInfo[1].first && Diff == SrcInfo[1].second;
+    if (!NonUndefFound) {
+      NonUndefFound = true;
+      Polarity = (C == i % 2);
+      continue;
+    }
+    if ((Polarity && C != i % 2) || (!Polarity && C == i % 2))
+      return false;
+  }
+  return true;
+}
+
+static bool isZipEven(std::array<std::pair<int, int>, 2> &SrcInfo,
+                      ArrayRef<int> Mask) {
+  bool Polarity;
+  return SrcInfo[0].second == 0 && SrcInfo[1].second == 1 &&
+    isAlternating(SrcInfo, Mask, Polarity) && Polarity;
+;
+}
+
+static bool isZipOdd(std::array<std::pair<int, int>, 2> &SrcInfo,
+                     ArrayRef<int> Mask) {
+  bool Polarity;
+  return SrcInfo[0].second == 0 && SrcInfo[1].second == -1 &&
+    isAlternating(SrcInfo, Mask, Polarity) && !Polarity;
+}
+
 // Lower a deinterleave shuffle to SRL and TRUNC.  Factor must be
 // 2, 4, 8 and the integer type Factor-times larger than VT's
 // element type must be a legal element type.
@@ -4870,6 +4909,36 @@ static bool isSpreadMask(ArrayRef<int> Mask, unsigned Factor, unsigned &Index) {
   return true;
 }
 
+static SDValue lowerVIZIP(unsigned Opc, SDValue Op0, SDValue Op1,
+                          const SDLoc &DL, SelectionDAG &DAG,
+                          const RISCVSubtarget &Subtarget) {
+  assert(RISCVISD::RI_VZIPEVEN_VL == Opc || RISCVISD::RI_VZIPODD_VL == Opc ||
+         RISCVISD::RI_VZIP2A_VL == Opc || RISCVISD::RI_VZIP2B_VL == Opc ||
+         RISCVISD::RI_VUNZIP2A_VL == Opc || RISCVISD::RI_VUNZIP2B_VL == Opc);
+  assert(Op0.getSimpleValueType() == Op1.getSimpleValueType());
+
+  MVT VT = Op0.getSimpleValueType();
+  MVT IntVT = VT.changeVectorElementTypeToInteger();
+  Op0 = DAG.getBitcast(IntVT, Op0);
+  Op1 = DAG.getBitcast(IntVT, Op1);
+
+  MVT ContainerVT = IntVT;
+  if (VT.isFixedLengthVector()) {
+    ContainerVT = getContainerForFixedLengthVector(DAG, IntVT, Subtarget);
+    Op0 = convertToScalableVector(ContainerVT, Op0, DAG, Subtarget);
+    Op1 = convertToScalableVector(ContainerVT, Op1, DAG, Subtarget);
+  }
+
+  auto [Mask, VL] = getDefaultVLOps(IntVT, ContainerVT, DL, DAG, Subtarget);
+  SDValue Passthru = DAG.getUNDEF(ContainerVT);
+  SDValue Res =
+    DAG.getNode(Opc, DL, ContainerVT, Op0, Op1, Passthru, Mask, VL);
+  if (IntVT.isFixedLengthVector())
+    Res = convertFromScalableVector(IntVT, Res, DAG, Subtarget);
+  Res = DAG.getBitcast(VT, Res);
+  return Res;
+}
+
 // Given a vector a, b, c, d return a vector Factor times longer
 // with Factor-1 undef's between elements. Ex:
 //   a, undef, b, undef, c, undef, d, undef (Factor=2, Index=0)
@@ -5384,6 +5453,7 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
   SDLoc DL(Op);
   MVT XLenVT = Subtarget.getXLenVT();
   MVT VT = Op.getSimpleValueType();
+  EVT ElemVT = VT.getVectorElementType();
   unsigned NumElts = VT.getVectorNumElements();
   ShuffleVectorSDNode *SVN = cast<ShuffleVectorSDNode>(Op.getNode());
 
@@ -5556,6 +5626,25 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
     }
   }
 
+  // If this is an e64 deinterleave(2) (possibly with two distinct sources)
+  // match to the vunzip2a/vunzip2b.
+  unsigned Index = 0;
+  if (Subtarget.hasVendorXRivosVizip() && ElemVT == MVT::i64 &&
+      ShuffleVectorInst::isDeInterleaveMaskOfFactor(Mask, 2, Index) &&
+      1 < count_if(Mask, [](int Idx) { return Idx != -1; })) {
+    MVT HalfVT = VT.getHalfNumVectorElementsVT();
+    unsigned Opc = Index == 0 ?
+        RISCVISD::RI_VUNZIP2A_VL : RISCVISD::RI_VUNZIP2B_VL;
+    V1 = lowerVIZIP(Opc, V1, DAG.getUNDEF(VT), DL, DAG, Subtarget);
+    V2 = lowerVIZIP(Opc, V2, DAG.getUNDEF(VT), DL, DAG, Subtarget);
+
+    V1 = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, HalfVT, V1,
+                     DAG.getVectorIdxConstant(0, DL));
+    V2 = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, HalfVT, V2,
+                     DAG.getVectorIdxConstant(0, DL));
+    return DAG.getNode(ISD::CONCAT_VECTORS, DL, VT, V1, V2);
+  }
+
   if (SDValue V =
           lowerVECTOR_SHUFFLEAsVSlideup(DL, VT, V1, V2, Mask, Subtarget, DAG))
     return V;
@@ -5596,6 +5685,15 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
                          DAG.getVectorIdxConstant(OddSrc % Size, DL));
     }
 
+    // Prefer vzip2a if available.
+    // TODO: Extend to matching zip2b if EvenSrc and OddSrc allow.
+    if (Subtarget.hasVendorXRivosVizip())  {
+      EvenV = DAG.getNode(ISD::INSERT_SUBVECTOR, DL, VT, DAG.getUNDEF(VT),
+                          EvenV, DAG.getVectorIdxConstant(0, DL));
+      OddV = DAG.getNode(ISD::INSERT_SUBVECTOR, DL, VT, DAG.getUNDEF(VT),
+                          OddV, DAG.getVectorIdxConstant(0, DL));
+      return lowerVIZIP(RISCVISD::RI_VZIP2A_VL, EvenV, OddV, DL, DAG, Subtarget);
+    }
     return getWideningInterleave(EvenV, OddV, DL, DAG, Subtarget);
   }
 
@@ -5647,6 +5745,17 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
       return convertFromScalableVector(VT, Res, DAG, Subtarget);
     }
 
+    if (Subtarget.hasVendorXRivosVizip() && isZipEven(SrcInfo, Mask)) {
+      SDValue Src1 = SrcInfo[0].first == 0 ? V1 : V2;
+      SDValue Src2 = SrcInfo[1].first == 0 ? V1 : V2;
+      return lowerVIZIP(RISCVISD::RI_VZIPEVEN_VL, Src1, Src2, DL, DAG, Subtarget);
+    }
+    if (Subtarget.hasVendorXRivosVizip() && isZipOdd(SrcInfo, Mask)) {
+      SDValue Src1 = SrcInfo[1].first == 0 ? V1 : V2;
+      SDValue Src2 = SrcInfo[0].first == 0 ? V1 : V2;
+      return lowerVIZIP(RISCVISD::RI_VZIPODD_VL, Src1, Src2, DL, DAG, Subtarget);
+    }
+
     // Build the mask.  Note that vslideup unconditionally preserves elements
     // below the slide amount in the destination, and thus those elements are
     // undefined in the mask.  If the mask ends up all true (or undef), it
@@ -6710,7 +6819,7 @@ static bool hasPassthruOp(unsigned Opcode) {
          Opcode <= RISCVISD::LAST_STRICTFP_OPCODE &&
          "not a RISC-V target specific op");
   static_assert(
-      RISCVISD::LAST_VL_VECTOR_OP - RISCVISD::FIRST_VL_VECTOR_OP == 127 &&
+      RISCVISD::LAST_VL_VECTOR_OP - RISCVISD::FIRST_VL_VECTOR_OP == 133 &&
       RISCVISD::LAST_STRICTFP_OPCODE - RISCVISD::FIRST_STRICTFP_OPCODE == 21 &&
       "adding target specific op should update this function");
   if (Opcode >= RISCVISD::ADD_VL && Opcode <= RISCVISD::VFMAX_VL)
@@ -6734,12 +6843,13 @@ static bool hasMaskOp(unsigned Opcode) {
          Opcode <= RISCVISD::LAST_STRICTFP_OPCODE &&
          "not a RISC-V target specific op");
   static_assert(
-      RISCVISD::LAST_VL_VECTOR_OP - RISCVISD::FIRST_VL_VECTOR_OP == 127 &&
+      RISCVISD::LAST_VL_VECTOR_OP - RISCVISD::FIRST_VL_VECTOR_OP == 133 &&
       RISCVISD::LAST_STRICTFP_OPCODE - RISCVISD::FIRST_STRICTFP_OPCODE == 21 &&
       "adding target specific op should update this function");
   if (Opcode >= RISCVISD::TRUNCATE_VECTOR_VL && Opcode <= RISCVISD::SETCC_VL)
     return true;
-  if (Opcode >= RISCVISD::VRGATHER_VX_VL && Opcode <= RISCVISD::VFIRST_VL)
+  if (Opcode >= RISCVISD::VRGATHER_VX_VL &&
+      Opcode <= RISCVISD::LAST_VL_VECTOR_OP)
     return true;
   if (Opcode >= RISCVISD::STRICT_FADD_VL &&
       Opcode <= RISCVISD::STRICT_VFROUND_NOEXCEPT_VL)
@@ -21758,6 +21868,12 @@ const char *RISCVTargetLowering::getTargetNodeName(unsigned Opcode) const {
   NODE_NAME_CASE(VZEXT_VL)
   NODE_NAME_CASE(VCPOP_VL)
   NODE_NAME_CASE(VFIRST_VL)
+  NODE_NAME_CASE(RI_VZIPEVEN_VL)
+  NODE_NAME_CASE(RI_VZIPODD_VL)
+  NODE_NAME_CASE(RI_VZIP2A_VL)
+  NODE_NAME_CASE(RI_VZIP2B_VL)
+  NODE_NAME_CASE(RI_VUNZIP2A_VL)
+  NODE_NAME_CASE(RI_VUNZIP2B_VL)
   NODE_NAME_CASE(READ_CSR)
   NODE_NAME_CASE(WRITE_CSR)
   NODE_NAME_CASE(SWAP_CSR)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index ffbc14a29006c..b271bc68427e9 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -403,7 +403,15 @@ enum NodeType : unsigned {
   //  vfirst.m with additional mask and VL operands.
   VFIRST_VL,
 
-  LAST_VL_VECTOR_OP = VFIRST_VL,
+  // XRivosVizip
+  RI_VZIPEVEN_VL,
+  RI_VZIPODD_VL,
+  RI_VZIP2A_VL,
+  RI_VZIP2B_VL,
+  RI_VUNZIP2A_VL,
+  RI_VUNZIP2B_VL,
+
+  LAST_VL_VECTOR_OP = RI_VUNZIP2B_VL,
 
   // Read VLENB CSR
   READ_VLENB,
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXRivos.td b/llvm/lib/Target/RISCV/RISCVInstrInfoXRivos.td
index 78c4ed6f00412..395fd917bfe42 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXRivos.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXRivos.td
@@ -67,6 +67,46 @@ defm RI_VUNZIP2A_V : VALU_IV_V<"ri.vunzip2a", 0b001000>;
 defm RI_VUNZIP2B_V : VALU_IV_V<"ri.vunzip2b", 0b011000>;
 }
 
+// These are modeled after the int binop VL nodes
+def ri_vzipeven_vl : SDNode<"RISCVISD::RI_VZIPEVEN_VL", SDT_RISCVIntBinOp_VL>;
+def ri_vzipodd_vl : SDNode<"RISCVISD::RI_VZIPODD_VL", SDT_RISCVIntBinOp_VL>;
+def ri_vzip2a_vl : SDNode<"RISCVISD::RI_VZIP2A_VL", SDT_RISCVIntBinOp_VL>;
+def ri_vunzip2a_vl : SDNode<"RISCVISD::RI_VUNZIP2A_VL", SDT_RISCVIntBinOp_VL>;
+def ri_vunzip2b_vl : SDNode<"RISCVISD::RI_VUNZIP2B_VL", SDT_RISCVIntBinOp_VL>;
+
+multiclass RIVPseudoVALU_VV {
+  foreach m = MxList in {
+    defvar mx = m.MX;
+    defm "" : VPseudoBinaryV_VV<m, Commutable=0>;
+  }
+}
+
+let Predicates = [HasVendorXRivosVizip],
+  Constraints = "@earlyclobber $rd, $rd = $passthru" in {
+defm PseudoRI_VZIPEVEN   : RIVPseudoVALU_VV;
+defm PseudoRI_VZIPODD   : RIVPseudoVALU_VV;
+defm PseudoRI_VZIP2A   : RIVPseudoVALU_VV;
+defm PseudoRI_VUNZIP2A   : RIVPseudoVALU_VV;
+defm PseudoRI_VUNZIP2B   : RIVPseudoVALU_VV;
+}
+
+multiclass RIVPatBinaryVL_VV<SDPatternOperator vop, string instruction_name,
+                              list<VTypeInfo> vtilist = AllIntegerVectors,
+                              bit isSEWAware = 0> {
+  foreach vti = vtilist in
+    let Predicates = GetVTypePredicates<vti>.Predicates in
+      def : VPatBinaryVL_V<vop, instruction_name, "VV",
+                           vti.Vector, vti.Vector, vti.Vector, vti.Mask,
+                           vti.Log2SEW, vti.LMul, vti.RegClass, vti.RegClass,
+                           vti.RegClass, isSEWAware>;
+}
+
+defm : RIVPatBinaryVL_VV<ri_vzipeven_vl, "PseudoRI_VZIPEVEN">;
+defm : RIVPatBinaryVL_VV<ri_vzipodd_vl, "PseudoRI_VZIPODD">;
+defm : RIVPatBinaryVL_VV<ri_vzip2a_vl, "PseudoRI_VZIP2A">;
+defm : RIVPatBinaryVL_VV<ri_vunzip2a_vl, "PseudoRI_VUNZIP2A">;
+defm : RIVPatBinaryVL_VV<ri_vunzip2b_vl, "PseudoRI_VUNZIP2B">;
+
 //===----------------------------------------------------------------------===//
 // XRivosVisni
 //===----------------------------------------------------------------------===//
@@ -87,3 +127,5 @@ def RI_VEXTRACT : CustomRivosXVI<0b010111, OPMVV, (outs GPR:$rd),
                                 (ins VR:$vs2, uimm5:$imm),
                                 "ri.vextract.x.v", "$rd, $vs2, $imm">;
 }
+
+
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave2.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave2.ll
index 9279e0a4d3a6c..2165c6025f7e7 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave2.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave2.ll
@@ -1,10 +1,13 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=riscv64 -mattr=+v,+zvfh,+zvl256b \
 ; RUN:   -lower-interleaved-accesses=false -verify-machineinstrs \
-; RUN:   | FileCheck %s --check-prefixes=CHECK,V
+; RUN:   | FileCheck %s --check-prefixes=CHECK,V,V-NOZIP
 ; RUN: llc < %s -mtriple=riscv64 -mattr=+f,+zve32f,+zvfh,+zvl256b \
 ; RUN:   -lower-interleaved-accesses=false -verify-machineinstrs \
 ; RUN:   | FileCheck %s --check-prefixes=CHECK,ZVE32F
+; RUN: llc < %s -mtriple=riscv64 -mattr=+v,+zvfh,+zvl256b,+experimental-xrivosvizip \
+; RUN:   -lower-interleaved-accesses=false -verify-machineinstrs \
+; RUN:   | FileCheck %s --check-prefixes=CHECK,V,ZIP
 
 define void @vnsrl_0_i8(ptr %in, ptr %out) {
 ; CHECK-LABEL: vnsrl_0_i8:
@@ -247,15 +250,15 @@ entry:
 }
 
 define void @vnsrl_0_i64(ptr %in, ptr %out) {
-; V-LABEL: vnsrl_0_i64:
-; V:       # %bb.0: # %entry
-; V-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
-; V-NEXT:    vle64.v v8, (a0)
-; V-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
-; V-NEXT:    vslidedown.vi v9, v8, 2
-; V-NEXT:    vslideup.vi v8, v9, 1
-; V-NEXT:    vse64.v v8, (a1)
-; V-NEXT:    ret
+; V-NOZIP-LABEL: vnsrl_0_i64:
+; V-NOZIP:       # %bb.0: # %entry
+; V-NOZIP-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
+; V-NOZIP-NEXT:    vle64.v v8, (a0)
+; V-NOZIP-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
+; V-NOZIP-NEXT:    vslidedown.vi v9, v8, 2
+; V-NOZIP-NEXT:    vslideup.vi v8, v9, 1
+; V-NOZIP-NEXT:    vse64.v v8, (a1)
+; V-NOZIP-NEXT:    ret
 ;
 ; ZVE32F-LABEL: vnsrl_0_i64:
 ; ZVE32F:       # %bb.0: # %entry
@@ -264,6 +267,18 @@ define void @vnsrl_0_i64(ptr %in, ptr %out) {
 ; ZVE32F-NEXT:    sd a2, 0(a1)
 ; ZVE32F-NEXT:    sd a0, 8(a1)
 ; ZVE32F-NEXT:    ret
+;
+; ZIP-LABEL: vnsrl_0_i64:
+; ZIP:       # %bb.0: # %entry
+; ZIP-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
+; ZIP-NEXT:    vle64.v v8, (a0)
+; ZIP-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
+; ZIP-NEXT:    ri.vunzip2a.vv v10, v8, v9
+; ZIP-NEXT:    vslidedown.vi v8, v8, 2
+; ZIP-NEXT:    ri.vunzip2a.vv v11, v8, v9
+; ZIP-NEXT:    vslideup.vi v10, v11, 1
+; ZIP-NEXT:    vse64.v v10, (a1)
+; ZIP-NEXT:    ret
 entry:
   %0 = load <4 x i64>, ptr %in, align 8
   %shuffle.i5 = shufflevector <4 x i64> %0, <4 x i64> poison, <2 x i32> <i32 0, i32 2>
@@ -272,16 +287,16 @@ entry:
 }
 
 define void @vnsrl_64_i64(ptr %in, ptr %out) {
-; V-LABEL: vnsrl_64_i64:
-; V:       # %bb.0: # %entry
-; V-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
-; V-NEXT:    vle64.v v8, (a0)
-; V-NEXT:    vmv.v.i v0, 1
-; V-NEXT:    vsetivli zero, 2, e64, m1, ta, mu
-; V-NEXT:    vslidedown.vi v9, v8, 2
-; V-NEXT:    vslidedown.vi v9, v8, 1, v0.t
-; V-NEXT:    vse64.v v9, (a1)
-; V-NEXT:    ret
+; V-NOZIP-LABEL: vnsrl_64_i64:
+; V-NOZIP:       # %bb.0: # %entry
+; V-NOZIP-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
+; V-NOZIP-NEXT:    vle64.v v8, (a0)
+; V-NOZIP-NEXT:    vmv.v.i v0, 1
+; V-NOZIP-NEXT:    vsetivli zero, 2, e64, m1, ta, mu
+; V-NOZIP-NEXT:    vslidedown.vi v9, v8, 2
+; V-NOZIP-NEXT:    vslidedown.vi v9, v8, 1, v0.t
+; V-NOZIP-NEXT:    vse64.v v9, (a1)
+; V-NOZIP-NEXT:    ret
 ;
 ; ZVE32F-LABEL: vnsrl_64_i64:
 ; ZVE32F:       # %bb.0: # %entry
@@ -290,6 +305,18 @@ define void @vnsrl_64_i64(ptr %in, ptr %out) {
 ; ZVE32F-NEXT:    sd a2, 0(a1)
 ; ZVE32F-NEXT:    sd a0, 8(a1)
 ; ZVE32F-NEXT:    ret
+;
+; ZIP-LABEL: vnsrl_64_i64:
+; ZIP:       # %bb.0: # %entry
+; ZIP-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
+; ZIP-NEXT:    vle64.v v8, (a0)
+; ZIP-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
+; ZIP-NEXT:    ri.vunzip2b.vv v10, v8, v9
+; ZIP-NEXT:    vslidedown.vi v8, v8, 2
+; ZIP-NEXT:    ri.vunzip2b.vv v11, v8, v9
+; ZIP-NEXT:    vslideup.vi v10, v11, 1
+; ZIP-NEXT:    vse64.v v10, (a1)
+; ZIP-NEXT:    ret
 entry:
   %0 = load <4 x i64>, ptr %in, align 8
   %shuffle.i5 = shufflevector <4 x i64> %0, <4 x i64> poison, <2 x i32> <i32 1, i32 3>
@@ -323,16 +350,16 @@ entry:
 }
 
 define void @vnsrl_64_double(ptr %in, ptr %out) {
-; V-LABEL: vnsrl_64_double:
-; V:       # %bb.0: # %entry
-; V-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
-; V-NEXT:    vle64.v v8, (a0)
-; V-NEXT:    vmv.v.i v0, 1
-; V-NEXT:    vsetivli zero, 2, e64, m1, ta, mu
-; V-NEXT:    vslidedown.vi v9, v8, 2
-; V-NEXT:    vslidedown.vi v9, v8, 1, v0.t
-; V-NEXT:    vse64.v v9, (a1)
-; V-NEXT:    ret
+; V-NOZIP-LABEL: vnsrl_64_double:
+; V-NOZIP:       # %bb.0: # %entry
+; V-NOZIP-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
+; V-NOZIP-NEXT:    vle64.v v8, (a0)
+; V-NOZIP-NEXT:    vmv.v.i v0, 1
+; V-NOZIP-NEXT:    vsetivli zero, 2, e64, m1, ta, mu
+; V-NOZIP-NEXT:    vslidedown.vi v9, v8, 2
+; V-NOZIP-NEXT:    vslidedown.vi v9, v8, 1, v0.t
+; V-NOZIP-NEXT:    vse64.v v9, (a1)
+; V-NOZIP-NEXT:    ret
 ;
 ; ZVE32F-LABEL: vnsrl_64_double:
 ; ZVE32F:       # %bb.0: # %entry
@@ -341,6 +368,16 @@ define void @vnsrl_64_double(ptr %in, ptr %out) {
 ; ZVE32F-NEXT:    sd a2, 0(a1)
 ; ZVE32F-NEXT:    sd a0, 8(a1)
 ; ZVE32F-NEXT:    ret
+;
+; ZIP-LABEL: vnsrl_64_double:
+; ZIP:       # %bb.0: # %entry
+; ZIP-NEXT:    vsetivli zero, 4, e64, m1, ta, ma
+; ZIP-NEXT:    vle64.v v8, (a0)
+; ZIP-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
+; ZIP-NEXT:    vslidedown.vi v9, v8, 2
+; ZIP-NEXT:    ri.vzipodd.vv v10, v8, v9
+; ZIP-NEXT:    vse64.v v10, (a1)
+; ZIP-NEXT:    ret
 entry:
   %0 = load <4 x double>, ptr %in, align 8
   %shuffle.i5 = shufflevector <4 x double> %0, <4 x double> poison, <2 x i32> <i32 1, i32 3>
@@ -802,15 +839,15 @@ entry:
 }
 
 define void @vnsrl_32_i32_two_source(ptr %in0, ptr %in1, ptr %out) {
-; V-LABEL: vnsrl_32_i32_two_source:
-; V:       # %bb.0: # %entry
-; V-NEXT:    vsetivli zero, 2, e32, mf2, ta, mu
-; V-NEXT:    vle32.v v8, (a0)
-; V-NEXT:    vle32.v v9, (a1)
-; V-NEXT:    vmv.v.i v0, 1
-; V-NEXT:    vslidedown.vi v9, v8, 1, v0.t
-; V-NEXT:    vse32.v v9, (a2)
-; V-NEXT:    ret
+; V-NOZIP-LABEL: vnsrl_32_i32_two_source:
+; V-NOZIP:       # %bb.0: # %entry
+; V-NOZIP-NEXT:    vsetivli zero, 2, e32, mf2, ta, mu
+; V-NOZIP-NEXT:    vle32.v v8, (a0)
+; V-NOZIP-NEXT:    vle32.v v9, (a1)
+; V-NOZIP-NEXT:    vmv.v.i v0, 1
+; V-NOZIP-NEXT:    vslidedown.vi v9, v8, 1, v0.t
+; V-NOZIP-NEXT:    vse32.v v9, (a2)
+; V-NOZIP-NEXT:    ret
 ;
 ; ZVE32F-LABEL: vnsrl_32_i32_two_source:
 ; ZVE32F:       # %bb.0: # %entry
@@ -821,6 +858,15 @@ define void @vnsrl_32_i32_two_source(ptr %in0, ptr %in1, ptr %out) {
 ; ZVE32F-NEXT:    vslidedown.vi v9, v8, 1, v0.t
 ; ZVE32F-NEXT:    vse32.v v9, (a2)
 ; ZVE32F-NEXT:    ret
+;
+; ZIP-LABEL: vnsrl_32_i32_two_source:
+; ZIP:       # %bb.0: # %entry
+; ZIP-NEXT:    vsetivli zero, 2, e32, mf2, ta, ma
+; ZIP-NEXT:    vle32.v v8, (a0)
+; ZIP-NEXT:    vle32.v v9, (a1)
+; ZIP-NEXT:    ri.vzipodd.vv v10, v8, v9
+; ZIP-NEXT:    vse32.v v10, (a2)
+; ZIP-NEXT:    ret
 entry:
   %0 = load <2 x i32>, ptr %in0, align 4
   %1 = load <2 x i32>, ptr %in1, align 4
@@ -856,15 +902,15 @@ entry:
 }
 
 define void @vnsrl_32_float_two_source(ptr %in0, ptr %in1, ptr %out) {
-; V-LABEL: vnsrl_32_float_two_source:
-; V:       # %bb.0: # %entry
-; V-NEXT:    vsetivli zero, 2, e32, mf2, ta, mu
-; V-NEXT:    vle32.v v8, (a0)
-; V-NEXT:    vle32.v v9, (a1)
-; V-NEXT:    vmv.v.i v0, 1
-; V-NEXT:    vslidedown.vi v9, v8, 1, v0.t
-; V-NEXT:    vse32.v v9, (a2)
-; V-NEXT:    ret
+; V-NOZIP-LABEL: vnsrl_32_float_two_source:
+; V-NOZIP:       # %bb.0: # %entry
+; V-NOZIP-NEXT:    vsetivli zero, 2, e32, mf2, ta, mu
+; V-NOZIP-NEXT:    vle32.v v8, (a0)
+; V-NOZIP-NEXT:    vle32.v v9, (a1)
+; V-NOZIP-NEXT:    vmv.v.i v0, 1
+; V-NOZIP-NEXT:    vslidedown.vi v9, v8, 1, v0.t
+; V-NOZIP-NEXT:    vse32.v v9, (a2)
+; V-NOZIP-NEXT:    ret
 ;
 ; ZVE32F-LABEL: vnsrl_32_float_two_source:
 ; ZVE32F:       # %bb.0: # %entry
@@ -875,6 +921,15 @@ define void @vnsrl_32_float_two_s...
[truncated]


github-actions bot commented Mar 18, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@mshockwave mshockwave left a comment

We probably can use this for llvm.vector.(de)interleave2 and other power-of-two factors as well. But that can be a follow-up patch.

@mshockwave mshockwave left a comment

LGTM

@topperc topperc left a comment

LGTM

@preames preames merged commit f8ee58a into llvm:main Mar 29, 2025
6 of 11 checks passed
@preames preames deleted the pr-riscv-xrivosvizip-codegen branch March 29, 2025 22:26

llvm-ci commented Mar 29, 2025

LLVM Buildbot has detected a new failure on builder ppc64le-lld-multistage-test running on ppc64le-lld-multistage-test while building llvm at step 12 "build-stage2-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/168/builds/10306

Here is the relevant piece of the build log for reference:
Step 12 (build-stage2-unified-tree) failure: build (failure)
...
344.786 [852/109/5554] Building CXX object tools/clang/unittests/Analysis/FlowSensitive/CMakeFiles/ClangAnalysisFlowSensitiveTests.dir/MatchSwitchTest.cpp.o
344.795 [852/108/5555] Building CXX object lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/GISel/RISCVPostLegalizerCombiner.cpp.o
344.813 [852/107/5556] Building CXX object lib/Target/PowerPC/CMakeFiles/LLVMPowerPCCodeGen.dir/PPCISelLowering.cpp.o
344.831 [851/107/5557] Linking CXX static library lib/libLLVMPowerPCCodeGen.a
345.019 [850/107/5558] Building CXX object tools/clang/lib/ASTMatchers/CMakeFiles/obj.clangASTMatchers.dir/ASTMatchFinder.cpp.o
345.023 [850/106/5559] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/ObjectFilePCHContainerWriter.cpp.o
345.077 [850/105/5560] Linking CXX executable unittests/Target/PowerPC/PowerPCTests
345.121 [850/104/5561] Building CXX object tools/clang/unittests/AST/ByteCode/CMakeFiles/InterpTests.dir/Descriptor.cpp.o
345.267 [850/103/5562] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/ASTImporterLookupTable.cpp.o
345.514 [850/102/5563] Building CXX object lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o
FAILED: lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o 
ccache /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/install/stage1/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage2/lib/Target/RISCV -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/lib/Target/RISCV -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage2/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o -MF lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o.d -o lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVISelLowering.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:4629:26: error: comparison of integers of different signs: 'unsigned int' and 'int' [-Werror,-Wsign-compare]
 4629 |   for (unsigned i = 0; i != NumElts; ++i) {
      |                        ~ ^  ~~~~~~~
1 error generated.
345.602 [850/101/5564] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/CGHLSLRuntime.cpp.o
345.605 [850/100/5565] Building CXX object lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64FastISel.cpp.o
345.656 [850/99/5566] Building CXX object tools/clang/unittests/CodeGen/CMakeFiles/ClangCodeGenTests.dir/TBAAMetadataTest.cpp.o
345.689 [850/98/5567] Building CXX object lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/GISel/AArch64LegalizerInfo.cpp.o
345.697 [850/97/5568] Building CXX object tools/clang/unittests/StaticAnalyzer/CMakeFiles/StaticAnalysisTests.dir/FalsePositiveRefutationBRVisitorTest.cpp.o
345.806 [850/96/5569] Building CXX object tools/clang/unittests/Analysis/FlowSensitive/CMakeFiles/ClangAnalysisFlowSensitiveTests.dir/LoggerTest.cpp.o
345.829 [850/95/5570] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/ASTImporter.cpp.o
345.916 [850/94/5571] Building CXX object lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVTargetMachine.cpp.o
345.996 [850/93/5572] Building CXX object tools/clang/lib/Frontend/CMakeFiles/obj.clangFrontend.dir/ASTConsumers.cpp.o
346.016 [850/92/5573] Building CXX object tools/clang/unittests/Serialization/CMakeFiles/SerializationTests.dir/NoCommentsTest.cpp.o
346.126 [850/91/5574] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/TargetBuiltins/ARM.cpp.o
346.149 [850/90/5575] Building CXX object tools/clang/unittests/Analysis/FlowSensitive/CMakeFiles/ClangAnalysisFlowSensitiveTests.dir/ChromiumCheckModelTest.cpp.o
346.200 [850/89/5576] Building CXX object tools/clang/unittests/Sema/CMakeFiles/SemaTests.dir/SemaNoloadLookupTest.cpp.o
346.240 [850/88/5577] Building CXX object tools/clang/unittests/Analysis/FlowSensitive/CMakeFiles/ClangAnalysisFlowSensitiveTests.dir/CachedConstAccessorsLatticeTest.cpp.o
346.537 [850/87/5578] Building CXX object tools/clang/lib/Serialization/CMakeFiles/obj.clangSerialization.dir/ASTReaderDecl.cpp.o
346.622 [850/86/5579] Building CXX object tools/clang/unittests/Analysis/CMakeFiles/ClangAnalysisTests.dir/IntervalPartitionTest.cpp.o
346.671 [850/85/5580] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/ParentMapContext.cpp.o
346.879 [850/84/5581] Building CXX object tools/clang/unittests/Analysis/FlowSensitive/CMakeFiles/ClangAnalysisFlowSensitiveTests.dir/WatchedLiteralsSolverTest.cpp.o
347.148 [850/83/5582] Building CXX object tools/clang/unittests/StaticAnalyzer/CMakeFiles/StaticAnalysisTests.dir/CallDescriptionTest.cpp.o
347.306 [850/82/5583] Building CXX object lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/RISCVTargetTransformInfo.cpp.o
347.514 [850/81/5584] Building CXX object tools/clang/unittests/Analysis/FlowSensitive/CMakeFiles/ClangAnalysisFlowSensitiveTests.dir/TestingSupport.cpp.o
347.671 [850/80/5585] Building CXX object tools/clang/unittests/ASTMatchers/Dynamic/CMakeFiles/DynamicASTMatchersTests.dir/RegistryTest.cpp.o
347.703 [850/79/5586] Building AMDGPUGenRegisterInfo.inc...
347.752 [850/78/5587] Building CXX object lib/Transforms/Vectorize/CMakeFiles/LLVMVectorize.dir/SLPVectorizer.cpp.o
347.933 [850/77/5588] Building CXX object tools/clang/unittests/ASTMatchers/CMakeFiles/ASTMatchersTests.dir/ASTMatchersInternalTest.cpp.o
348.053 [850/76/5589] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86CodeGenPassBuilder.cpp.o
348.118 [850/75/5590] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/CGStmtOpenMP.cpp.o
348.488 [850/74/5591] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/TextNodeDumper.cpp.o
348.506 [850/73/5592] Building CXX object tools/clang/unittests/Analysis/FlowSensitive/CMakeFiles/ClangAnalysisFlowSensitiveTests.dir/TestingSupportTest.cpp.o
348.656 [850/72/5593] Building CXX object lib/Target/RISCV/CMakeFiles/LLVMRISCVCodeGen.dir/GISel/RISCVPreLegalizerCombiner.cpp.o
348.928 [850/71/5594] Building CXX object unittests/tools/llvm-exegesis/CMakeFiles/LLVMExegesisTests.dir/X86/SnippetFileTest.cpp.o
349.061 [850/70/5595] Building CXX object unittests/tools/llvm-exegesis/CMakeFiles/LLVMExegesisTests.dir/X86/TargetTest.cpp.o
349.624 [850/69/5596] Building CXX object tools/clang/unittests/Format/CMakeFiles/FormatTests.dir/ConfigParseTest.cpp.o
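
The diagnostic in the log comes from comparing an `unsigned` induction variable against an `int` bound, which this builder rejects under `-Werror,-Wsign-compare`. A minimal sketch of the pattern and one conventional way to avoid it (giving both sides of the comparison the same type) is below. It is an illustration with a hypothetical function name, `countDefined`, and is not necessarily the fix that landed.

```cpp
#include <vector>

// Counts the defined (non-negative) entries of a shuffle mask.
// Writing `int NumElts = Mask.size();` and then `i != NumElts` with an
// unsigned `i` would trigger -Wsign-compare; keeping both unsigned avoids it.
int countDefined(const std::vector<int> &Mask) {
  const unsigned NumElts = static_cast<unsigned>(Mask.size());
  int Count = 0;
  for (unsigned i = 0; i != NumElts; ++i)
    if (Mask[i] >= 0)
      ++Count;
  return Count;
}
```

Any arithmetic that still needs a signed value can cast at the point of use, as the patch already does with `(int)i` and `(int)NumElts`.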
