4vtomat commented Jun 6, 2025

Stacked on #143068.

llvmbot commented Jun 6, 2025

@llvm/pr-subscribers-llvm-ir

Author: Brandon Wu (4vtomat)

Changes
  • [RISCV] Add XSfmm pseudo instruction and vset* insertion support
  • [RISCV] Support XSfmm LLVM IR and CodeGen

Patch is 114.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143069.diff

40 Files Affected:

  • (modified) llvm/include/llvm/IR/IntrinsicsRISCVXsf.td (+95)
  • (modified) llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp (+4)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h (+65-1)
  • (modified) llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp (+6)
  • (modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+182)
  • (modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h (+1)
  • (modified) llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp (+201-9)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrFormats.td (+19)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+3)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td (+13-9)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXSf.td (+6-3)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXSfmm.td (+189)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrPredicates.td (+32-1)
  • (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp (+4)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive-O0-ATM-ATK.ll (+18)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive-xsfmm-vset-insert.mir (+523)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_e4m3_e4m3.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_e4m3_e5m2.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_e5m2_e4m3.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_e5m2_e5m2.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_f_f.ll (+52)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_s_s.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_s_u.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_u_s.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_mm_u_u.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vlte16.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vlte32.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vlte64.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vlte8.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vsettk.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vsettm.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vsettnt.ll (+72)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vste16.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vste32.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vste64.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vste8.ll (+23)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vtdiscard.ll (+22)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vtmv_t_v.ll (+114)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vtmv_v_t.ll (+114)
  • (added) llvm/test/CodeGen/RISCV/rvv/sifive_sf_vtzero_t.ll (+24)
diff --git a/llvm/include/llvm/IR/IntrinsicsRISCVXsf.td b/llvm/include/llvm/IR/IntrinsicsRISCVXsf.td
index bf20080229aa4..1942423ae5bea 100644
--- a/llvm/include/llvm/IR/IntrinsicsRISCVXsf.td
+++ b/llvm/include/llvm/IR/IntrinsicsRISCVXsf.td
@@ -180,4 +180,99 @@ let TargetPrefix = "riscv" in {
   // XSfvfnrclipxfqf
   defm int_riscv_sf_vfnrclip_x_f_qf : RISCVSFCustomVFNRCLIP;
   defm int_riscv_sf_vfnrclip_xu_f_qf : RISCVSFCustomVFNRCLIP;
+
+  // XSfmm
+  // Output: (output_len)
+  // Input: (input_len, vsew, twiden)
+  class RISCVSFVSet
+      : DefaultAttrsIntrinsic<[llvm_anyint_ty],
+                              [LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>],
+                              [ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>, IntrNoMem]>;
+
+  // Input: (tss, base, tn)
+  // IntrReadMem, IntrHasSideEffects does not work for pattern matching.
+  class RISCVSFTileLoad
+      : DefaultAttrsIntrinsic<[],
+                              [llvm_anyint_ty, llvm_ptr_ty, LLVMMatchType<0>],
+                              [NoCapture<ArgIndex<1>>]>,
+        RISCVVIntrinsic;
+
+  // Input: (tss, base, tn)
+  class RISCVSFTileStore
+      : DefaultAttrsIntrinsic<[],
+                              [llvm_anyint_ty, llvm_ptr_ty, LLVMMatchType<0>],
+                              [NoCapture<ArgIndex<1>>, IntrWriteMem,
+                               IntrHasSideEffects]>,
+        RISCVVIntrinsic;
+
+  // Output: ()
+  // Input: (mtd, mat1, mat2, tm, tn, tk, twiden)
+  class RISCVSFCustomMatMul<bit is_float = false>
+      : DefaultAttrsIntrinsic<[], [llvm_anyint_ty, llvm_anyvector_ty,
+                                   !if(is_float, LLVMMatchType<1>,
+                                                 llvm_anyvector_ty),
+                                   LLVMMatchType<0>, LLVMMatchType<0>,
+                                   LLVMMatchType<0>, LLVMMatchType<0>],
+                              [IntrNoMem, IntrHasSideEffects,
+                               ImmArg<ArgIndex<0>>, ImmArg<ArgIndex<6>>]>,
+        RISCVVIntrinsic;
+
+  def int_riscv_sf_vsettnt  : RISCVSFVSet;
+  def int_riscv_sf_vsettm   : RISCVSFVSet;
+  def int_riscv_sf_vsettk   : RISCVSFVSet;
+
+  def int_riscv_sf_vlte8    : RISCVSFTileLoad;
+  def int_riscv_sf_vlte16   : RISCVSFTileLoad;
+  def int_riscv_sf_vlte32   : RISCVSFTileLoad;
+  def int_riscv_sf_vlte64   : RISCVSFTileLoad;
+  def int_riscv_sf_vste8    : RISCVSFTileStore;
+  def int_riscv_sf_vste16   : RISCVSFTileStore;
+  def int_riscv_sf_vste32   : RISCVSFTileStore;
+  def int_riscv_sf_vste64   : RISCVSFTileStore;
+
+  // Output: (vd)
+  // Input: (tss, tn)
+  def int_riscv_sf_vtmv_v_t
+      : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
+                              [llvm_anyint_ty, LLVMMatchType<1>],
+                              [IntrNoMem, IntrHasSideEffects]>,
+      RISCVVIntrinsic {
+    let VLOperand = 2;
+  }
+  // Output: ()
+  // Input: (tss, vs2, tn)
+  def int_riscv_sf_vtmv_t_v
+      : DefaultAttrsIntrinsic<[], [LLVMMatchType<1>, llvm_anyvector_ty,
+                                   llvm_anyint_ty], [IntrNoMem, IntrHasSideEffects]>,
+      RISCVVIntrinsic {
+    let VLOperand = 2;
+  }
+
+  foreach a = ["u", "s"] in {
+    foreach b = ["u", "s"] in {
+      def int_riscv_sf_mm_ # a # _ # b   : RISCVSFCustomMatMul;
+    }
+  }
+
+  def int_riscv_sf_mm_f_f : RISCVSFCustomMatMul<true>;
+  foreach e1 = [5, 4] in
+    foreach e2 = [5, 4] in
+      def int_riscv_sf_mm_e # e1 # m # !sub(7, e1) # _e # e2 # m # !sub(7, e2)
+          : RISCVSFCustomMatMul<true>;
+
+  // Output: ()
+  // Input: (mtd)
+  def int_riscv_sf_vtzero_t
+      : DefaultAttrsIntrinsic<[],
+                              [llvm_anyint_ty, LLVMMatchType<0>,LLVMMatchType<0>,
+                               LLVMMatchType<0>, LLVMMatchType<0>],
+                              [ImmArg<ArgIndex<0>>, ImmArg<ArgIndex<3>>,
+                               ImmArg<ArgIndex<4>>, IntrNoMem, IntrHasSideEffects]>,
+        RISCVVIntrinsic;
+
+  // Output: ()
+  // Input: ()
+  def int_riscv_sf_vtdiscard
+      : DefaultAttrsIntrinsic<[], [], [IntrNoMem, IntrHasSideEffects]>,
+        RISCVVIntrinsic;
 } // TargetPrefix = "riscv"
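The nested `foreach` loops in the hunk above define the matrix-multiply intrinsic records by string concatenation, with the FP8 mantissa width computed as `!sub(7, e)`. As a rough illustration only, the C++ sketch below enumerates the record names those loops would produce. The helper `xsfmmMatMulRecordNames` is hypothetical and is not part of the patch.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical helper (not in the patch): mirrors the TableGen loops in
// IntrinsicsRISCVXsf.td and returns the record names they would define.
std::vector<std::string> xsfmmMatMulRecordNames() {
  std::vector<std::string> Names;

  // foreach a = ["u", "s"] in foreach b = ["u", "s"]
  for (const char *A : {"u", "s"})
    for (const char *B : {"u", "s"})
      Names.push_back(std::string("int_riscv_sf_mm_") + A + "_" + B);

  // def int_riscv_sf_mm_f_f
  Names.push_back("int_riscv_sf_mm_f_f");

  // foreach e1 = [5, 4] in foreach e2 = [5, 4]
  // The mantissa width is !sub(7, e), so e5 pairs with m2 and e4 with m3.
  for (int E1 : {5, 4})
    for (int E2 : {5, 4})
      Names.push_back("int_riscv_sf_mm_e" + std::to_string(E1) + "m" +
                      std::to_string(7 - E1) + "_e" + std::to_string(E2) +
                      "m" + std::to_string(7 - E2));

  assert(Names.size() == 9 && "4 integer + 1 float + 4 FP8 variants");
  return Names;
}
```

The nine names correspond to the nine `Intrinsic::riscv_sf_mm_*` cases handled in the RISCVISelDAGToDAG.cpp part of the patch.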
diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 1f434beca5388..8a18221832ecb 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -1616,6 +1616,10 @@ bool RISCVAsmParser::matchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode,
                                       "operand must be a valid system register "
                                       "name or an integer in the range");
   }
+  case Match_InvalidXSfmmVType: {
+    SMLoc ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();
+    return generateXSfmmVTypeError(ErrorLoc);
+  }
   case Match_InvalidVTypeI: {
     SMLoc ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();
     return generateVTypeError(ErrorLoc);
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
index 6ef94fb5e93da..e470d51c6c5fa 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
@@ -138,6 +138,25 @@ enum {
   // 3 -> SEW * 4
   DestEEWShift = ElementsDependOnMaskShift + 1,
   DestEEWMask = 3ULL << DestEEWShift,
+
+  // 0 -> Don't care about altfmt bit in VTYPE.
+  // 1 -> Is not altfmt.
+  // 2 -> Is altfmt(BF16).
+  AltFmtTypeShift = DestEEWShift + 2,
+  AltFmtTypeMask = 3ULL << AltFmtTypeShift,
+
+  IsWidenShift = AltFmtTypeShift + 2,
+  IsWidenMask = 1ULL << IsWidenShift,
+
+  // XSfmmbase
+  HasTWidenOpShift = IsWidenShift + 1,
+  HasTWidenOpMask = 1ULL << HasTWidenOpShift,
+
+  HasTMOpShift = HasTWidenOpShift + 1,
+  HasTMOpMask = 1ULL << HasTMOpShift,
+
+  HasTKOpShift = HasTMOpShift + 1,
+  HasTKOpMask = 1ULL << HasTKOpShift,
 };
 
 // Helper functions to read TSFlags.
@@ -179,6 +198,11 @@ static inline bool hasRoundModeOp(uint64_t TSFlags) {
   return TSFlags & HasRoundModeOpMask;
 }
 
+enum class AltFmtType { DontCare, NotAltFmt, AltFmt };
+static inline AltFmtType getAltFmtType(uint64_t TSFlags) {
+  return static_cast<AltFmtType>((TSFlags & AltFmtTypeMask) >> AltFmtTypeShift);
+}
+
 /// \returns true if this instruction uses vxrm
 static inline bool usesVXRM(uint64_t TSFlags) { return TSFlags & UsesVXRMMask; }
 
@@ -194,11 +218,47 @@ static inline bool elementsDependOnMask(uint64_t TSFlags) {
   return TSFlags & ElementsDependOnMaskMask;
 }
 
+// XSfmmbase
+static inline bool hasTWidenOp(uint64_t TSFlags) {
+  return TSFlags & HasTWidenOpMask;
+}
+
+static inline bool hasTMOp(uint64_t TSFlags) { return TSFlags & HasTMOpMask; }
+
+static inline bool hasTKOp(uint64_t TSFlags) { return TSFlags & HasTKOpMask; }
+
+static inline unsigned getTNOpNum(const MCInstrDesc &Desc) {
+  const uint64_t TSFlags = Desc.TSFlags;
+  assert(hasTWidenOp(TSFlags) && hasVLOp(TSFlags));
+  unsigned Offset = 3;
+  if (hasTKOp(TSFlags))
+    Offset = 4;
+  return Desc.getNumOperands() - Offset;
+}
+
+static inline unsigned getTMOpNum(const MCInstrDesc &Desc) {
+  const uint64_t TSFlags = Desc.TSFlags;
+  assert(hasTWidenOp(TSFlags) && hasTMOp(TSFlags));
+  if (hasTKOp(TSFlags))
+    return Desc.getNumOperands() - 5;
+  // vtzero.t
+  return Desc.getNumOperands() - 4;
+}
+
+static inline unsigned getTKOpNum(const MCInstrDesc &Desc) {
+  const uint64_t TSFlags = Desc.TSFlags;
+  assert(hasTWidenOp(TSFlags) && hasTKOp(TSFlags));
+  return Desc.getNumOperands() - 3;
+}
+
 static inline unsigned getVLOpNum(const MCInstrDesc &Desc) {
   const uint64_t TSFlags = Desc.TSFlags;
   // This method is only called if we expect to have a VL operand, and all
   // instructions with VL also have SEW.
   assert(hasSEWOp(TSFlags) && hasVLOp(TSFlags));
+  // In Xsfmmbase, TN is alias for VL, so here we use the same TSFlags bit.
+  if (hasTWidenOp(TSFlags))
+    return getTNOpNum(Desc);
   unsigned Offset = 2;
   if (hasVecPolicyOp(TSFlags))
     Offset = 3;
@@ -216,7 +276,7 @@ static inline unsigned getSEWOpNum(const MCInstrDesc &Desc) {
   const uint64_t TSFlags = Desc.TSFlags;
   assert(hasSEWOp(TSFlags));
   unsigned Offset = 1;
-  if (hasVecPolicyOp(TSFlags))
+  if (hasVecPolicyOp(TSFlags) || hasTWidenOp(TSFlags))
     Offset = 2;
   return Desc.getNumOperands() - Offset;
 }
@@ -233,6 +293,9 @@ static inline int getFRMOpNum(const MCInstrDesc &Desc) {
   if (!hasRoundModeOp(TSFlags) || usesVXRM(TSFlags))
     return -1;
 
+  if (hasTWidenOp(TSFlags) && hasTMOp(TSFlags))
+    return getTMOpNum(Desc) - 1;
+
   // The operand order
   // --------------------------------------
   // | n-1 (if any)   | n-2  | n-3 | n-4 |
@@ -375,6 +438,7 @@ enum OperandType : unsigned {
   // instructions to represent a value that be passed as AVL to either vsetvli
   // or vsetivli.
   OPERAND_AVL,
+  OPERAND_XSFMM_VTYPE,
 };
 } // namespace RISCVOp
 
diff --git a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
index 72f1596d79a02..534c330463085 100644
--- a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
+++ b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
@@ -1100,6 +1100,12 @@ static bool lowerRISCVVMachineInstrToMCInst(const MachineInstr *MI,
     --NumOps;
   if (RISCVII::hasRoundModeOp(TSFlags))
     --NumOps;
+  if (RISCVII::hasTWidenOp(TSFlags))
+    --NumOps;
+  if (RISCVII::hasTMOp(TSFlags))
+    --NumOps;
+  if (RISCVII::hasTKOp(TSFlags))
+    --NumOps;
 
   bool hasVLOutput = RISCV::isFaultFirstLoad(*MI);
   for (unsigned OpNo = 0; OpNo != NumOps; ++OpNo) {
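The `--NumOps` adjustments above and the `getTNOpNum`/`getTMOpNum`/`getTKOpNum` helpers in RISCVBaseInfo.h both rely on the XSfmm pseudos carrying their tile-shape operands at the end of the operand list. The sketch below models that index arithmetic on a simplified stand-in for `MCInstrDesc`. The struct `PseudoDesc` and the function names are illustrative assumptions, and placing `twiden` last is inferred from `getSEWOpNum` using offset 2 when a TWiden operand is present.

```cpp
#include <cassert>

// Simplified stand-in for MCInstrDesc plus the relevant TSFlags bits.
// Field names are illustrative, not the ones used in LLVM.
struct PseudoDesc {
  unsigned NumOperands; // explicit operands of the pseudo
  bool HasTM;           // models HasTMOp
  bool HasTK;           // models HasTKOp
};

// Assumed trailing layout: [..., (frm), (tm), tn, (tk), sew, twiden]
unsigned twidenIdx(const PseudoDesc &D) { return D.NumOperands - 1; }
unsigned sewIdx(const PseudoDesc &D) { return D.NumOperands - 2; }

// Mirrors getTNOpNum: TN sits one slot further back when TK is present.
unsigned tnIdx(const PseudoDesc &D) {
  return D.NumOperands - (D.HasTK ? 4 : 3);
}

// Mirrors getTMOpNum: offset 5 with TK (matmul), offset 4 without (vtzero.t).
unsigned tmIdx(const PseudoDesc &D) {
  assert(D.HasTM && "pseudo has no TM operand");
  return D.NumOperands - (D.HasTK ? 5 : 4);
}

// Mirrors getTKOpNum.
unsigned tkIdx(const PseudoDesc &D) {
  assert(D.HasTK && "pseudo has no TK operand");
  return D.NumOperands - 3;
}
```

For a matmul pseudo with operands `(tile, op1, op2, frm, tm, tn, tk, sew, twiden)` this gives `tm = 4`, `tn = 5`, `tk = 6`, which matches the order in which the ISel code in this patch appends them. The rounding-mode operand then sits at `tmIdx - 1`, as in the `getFRMOpNum` change.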
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 494d6ed03292a..d44eee21e0f9e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -522,6 +522,43 @@ void RISCVDAGToDAGISel::selectVSETVLI(SDNode *Node) {
               CurDAG->getMachineNode(Opcode, DL, XLenVT, VLOperand, VTypeIOp));
 }
 
+void RISCVDAGToDAGISel::selectXSfmmVSET(SDNode *Node) {
+  if (!Subtarget->hasVendorXSfmmbase())
+    return;
+
+  assert(Node->getOpcode() == ISD::INTRINSIC_WO_CHAIN && "Unexpected opcode");
+
+  SDLoc DL(Node);
+  MVT XLenVT = Subtarget->getXLenVT();
+
+  unsigned IntNo = Node->getConstantOperandVal(0);
+
+  assert((IntNo == Intrinsic::riscv_sf_vsettnt ||
+          IntNo == Intrinsic::riscv_sf_vsettm ||
+          IntNo == Intrinsic::riscv_sf_vsettk) &&
+         "Unexpected XSfmm vset intrinsic");
+
+  unsigned SEW = RISCVVType::decodeVSEW(Node->getConstantOperandVal(2));
+  unsigned Widen = RISCVVType::decodeTWiden(Node->getConstantOperandVal(3));
+  unsigned PseudoOpCode =
+      IntNo == Intrinsic::riscv_sf_vsettnt  ? RISCV::PseudoSF_VSETTNT
+      : IntNo == Intrinsic::riscv_sf_vsettm ? RISCV::PseudoSF_VSETTM
+                                            : RISCV::PseudoSF_VSETTK;
+
+  unsigned VTypeI = RISCVVType::encodeXSfmmVType(SEW, Widen, 0);
+  SDValue VTypeIOp = CurDAG->getTargetConstant(VTypeI, DL, XLenVT);
+  SDValue Log2SEW = CurDAG->getTargetConstant(Log2_32(SEW), DL, XLenVT);
+  SDValue TWiden = CurDAG->getTargetConstant(Widen, DL, XLenVT);
+
+  if (IntNo == Intrinsic::riscv_sf_vsettnt)
+    ReplaceNode(Node, CurDAG->getMachineNode(PseudoOpCode, DL, XLenVT,
+                                             Node->getOperand(1), VTypeIOp));
+  else
+    ReplaceNode(Node,
+                CurDAG->getMachineNode(PseudoOpCode, DL, XLenVT,
+                                       Node->getOperand(1), Log2SEW, TWiden));
+}
+
 bool RISCVDAGToDAGISel::tryShrinkShlLogicImm(SDNode *Node) {
   MVT VT = Node->getSimpleValueType(0);
   unsigned Opcode = Node->getOpcode();
@@ -777,6 +814,11 @@ bool RISCVDAGToDAGISel::tryIndexedLoad(SDNode *Node) {
   return true;
 }
 
+static Register getTileReg(uint64_t TileNum) {
+  assert(TileNum <= 15 && "Invalid tile number");
+  return RISCV::T0 + TileNum;
+}
+
 void RISCVDAGToDAGISel::selectSF_VC_X_SE(SDNode *Node) {
   if (!Subtarget->hasVInstructions())
     return;
@@ -1955,6 +1997,10 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
     case Intrinsic::riscv_vsetvli:
     case Intrinsic::riscv_vsetvlimax:
       return selectVSETVLI(Node);
+    case Intrinsic::riscv_sf_vsettnt:
+    case Intrinsic::riscv_sf_vsettm:
+    case Intrinsic::riscv_sf_vsettk:
+      return selectXSfmmVSET(Node);
     }
     break;
   }
@@ -2352,6 +2398,142 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
     case Intrinsic::riscv_sf_vc_i_se:
       selectSF_VC_X_SE(Node);
       return;
+    case Intrinsic::riscv_sf_vlte8:
+    case Intrinsic::riscv_sf_vlte16:
+    case Intrinsic::riscv_sf_vlte32:
+    case Intrinsic::riscv_sf_vlte64: {
+      unsigned Log2SEW;
+      unsigned PseudoInst;
+      switch (IntNo) {
+      case Intrinsic::riscv_sf_vlte8:
+        PseudoInst = RISCV::PseudoSF_VLTE8;
+        Log2SEW = 3;
+        break;
+      case Intrinsic::riscv_sf_vlte16:
+        PseudoInst = RISCV::PseudoSF_VLTE16;
+        Log2SEW = 4;
+        break;
+      case Intrinsic::riscv_sf_vlte32:
+        PseudoInst = RISCV::PseudoSF_VLTE32;
+        Log2SEW = 5;
+        break;
+      case Intrinsic::riscv_sf_vlte64:
+        PseudoInst = RISCV::PseudoSF_VLTE64;
+        Log2SEW = 6;
+        break;
+      }
+
+      SDValue SEWOp = CurDAG->getTargetConstant(Log2SEW, DL, XLenVT);
+      SDValue TWidenOp = CurDAG->getTargetConstant(1, DL, XLenVT);
+      SmallVector<SDValue, 7> Operands = {Node->getOperand(2),
+                                          Node->getOperand(3),
+                                          Node->getOperand(4),
+                                          SEWOp,
+                                          TWidenOp,
+                                          Node->getOperand(0)};
+
+      MachineSDNode *TileLoad =
+          CurDAG->getMachineNode(PseudoInst, DL, Node->getVTList(), Operands);
+      if (auto *MemOp = dyn_cast<MemSDNode>(Node))
+        CurDAG->setNodeMemRefs(TileLoad, {MemOp->getMemOperand()});
+
+      ReplaceNode(Node, TileLoad);
+      return;
+    }
+    case Intrinsic::riscv_sf_mm_s_s:
+    case Intrinsic::riscv_sf_mm_s_u:
+    case Intrinsic::riscv_sf_mm_u_s:
+    case Intrinsic::riscv_sf_mm_u_u:
+    case Intrinsic::riscv_sf_mm_e5m2_e5m2:
+    case Intrinsic::riscv_sf_mm_e5m2_e4m3:
+    case Intrinsic::riscv_sf_mm_e4m3_e5m2:
+    case Intrinsic::riscv_sf_mm_e4m3_e4m3:
+    case Intrinsic::riscv_sf_mm_f_f: {
+      bool HasFRM = false;
+      unsigned PseudoInst;
+      switch (IntNo) {
+      case Intrinsic::riscv_sf_mm_s_s:
+        PseudoInst = RISCV::PseudoSF_MM_S_S;
+        break;
+      case Intrinsic::riscv_sf_mm_s_u:
+        PseudoInst = RISCV::PseudoSF_MM_S_U;
+        break;
+      case Intrinsic::riscv_sf_mm_u_s:
+        PseudoInst = RISCV::PseudoSF_MM_U_S;
+        break;
+      case Intrinsic::riscv_sf_mm_u_u:
+        PseudoInst = RISCV::PseudoSF_MM_U_U;
+        break;
+      case Intrinsic::riscv_sf_mm_e5m2_e5m2:
+        PseudoInst = RISCV::PseudoSF_MM_E5M2_E5M2;
+        HasFRM = true;
+        break;
+      case Intrinsic::riscv_sf_mm_e5m2_e4m3:
+        PseudoInst = RISCV::PseudoSF_MM_E5M2_E4M3;
+        HasFRM = true;
+        break;
+      case Intrinsic::riscv_sf_mm_e4m3_e5m2:
+        PseudoInst = RISCV::PseudoSF_MM_E4M3_E5M2;
+        HasFRM = true;
+        break;
+      case Intrinsic::riscv_sf_mm_e4m3_e4m3:
+        PseudoInst = RISCV::PseudoSF_MM_E4M3_E4M3;
+        HasFRM = true;
+        break;
+      case Intrinsic::riscv_sf_mm_f_f:
+        if (Node->getOperand(3).getValueType().getScalarType() == MVT::bf16)
+          PseudoInst = RISCV::PseudoSF_MM_F_F_ALT;
+        else
+          PseudoInst = RISCV::PseudoSF_MM_F_F;
+        HasFRM = true;
+        break;
+      }
+      uint64_t TileNum = Node->getConstantOperandVal(2);
+      SDValue Op1 = Node->getOperand(3);
+      SDValue Op2 = Node->getOperand(4);
+      MVT VT = Op1->getSimpleValueType(0);
+      unsigned Log2SEW = Log2_32(VT.getScalarSizeInBits());
+      SDValue TmOp = Node->getOperand(5);
+      SDValue TnOp = Node->getOperand(6);
+      SDValue TkOp = Node->getOperand(7);
+      SDValue TWidenOp = Node->getOperand(8);
+      SDValue Chain = Node->getOperand(0);
+
+      // sf.mm.f.f with sew=32, twiden=2 is invalid
+      if (IntNo == Intrinsic::riscv_sf_mm_f_f && Log2SEW == 5 &&
+          TWidenOp->getAsZExtVal() == 2)
+        report_fatal_error("sf.mm.f.f doesn't support (sew=32, twiden=2)");
+
+      SmallVector<SDValue, 10> Operands(
+          {CurDAG->getRegister(getTileReg(TileNum), XLenVT), Op1, Op2});
+      if (HasFRM)
+        Operands.push_back(
+            CurDAG->getTargetConstant(RISCVFPRndMode::DYN, DL, XLenVT));
+      Operands.append({TmOp, TnOp, TkOp,
+                       CurDAG->getTargetConstant(Log2SEW, DL, XLenVT), TWidenOp,
+                       Chain});
+
+      auto *NewNode =
+          CurDAG->getMachineNode(PseudoInst, DL, Node->getVTList(), Operands);
+
+      ReplaceNode(Node, NewNode);
+      return;
+    }
+    case Intrinsic::riscv_sf_vtzero_t: {
+      uint64_t TileNum = Node->getConstantOperandVal(2);
+      SDValue Tm = Node->getOperand(3);
+      SDValue Tn = Node->getOperand(4);
+      SDValue Log2SEW = Node->getOperand(5);
+      SDValue TWiden = Node->getOperand(6);
+      SDValue Chain = Node->getOperand(0);
+      auto *NewNode = CurDAG->getMachineNode(
+          RISCV::PseudoSF_VTZERO_T, DL, Node->getVTList(),
+          {CurDAG->getRegister(getTileReg(TileNum), XLenVT), Tm, Tn, Log2SEW,
+           TWiden, Chain});
+
+      ReplaceNode(Node, NewNode);
+      return;
+    }
     }
     break;
   }
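The `riscv_sf_mm_f_f` case above makes two decisions from the operand type: it picks the `_ALT` pseudo when the element type is bf16, and it rejects the combination of `sew=32` with `twiden=2`. The following is a minimal standalone sketch of that decision, assuming only the three element types listed in `Elem`. The types `Elem` and `MMFFChoice` and the function `chooseMMFF` are hypothetical names, and an empty result stands in for the `report_fatal_error` call in the patch.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Illustrative element kinds; the real code inspects an MVT.
enum class Elem { F16, BF16, F32 };

struct MMFFChoice {
  std::string Pseudo; // name of the selected pseudo instruction
  unsigned Log2SEW;   // log2 of the element width in bits
};

unsigned scalarBits(Elem E) { return E == Elem::F32 ? 32 : 16; }

// Integer log2 for powers of two, standing in for llvm::Log2_32.
unsigned log2u(unsigned V) {
  unsigned L = 0;
  while (V >>= 1)
    ++L;
  return L;
}

// Returns std::nullopt for the one combination the patch rejects.
// All other combinations are accepted here without further validation.
std::optional<MMFFChoice> chooseMMFF(Elem E, unsigned TWiden) {
  unsigned Log2SEW = log2u(scalarBits(E));
  if (Log2SEW == 5 && TWiden == 2)
    return std::nullopt; // sf.mm.f.f doesn't support (sew=32, twiden=2)
  return MMFFChoice{
      E == Elem::BF16 ? "PseudoSF_MM_F_F_ALT" : "PseudoSF_MM_F_F", Log2SEW};
}
```

Under these assumptions, `chooseMMFF(Elem::BF16, 2)` selects `PseudoSF_MM_F_F_ALT` with `Log2SEW == 4`, while `chooseMMFF(Elem::F32, 2)` yields no selection.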
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
index f199c2031b9a9..ce40075ff6d7b 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
@@ -164,6 +164,7 @@ class RISCVDAGToDAGISel : public SelectionDAGISel {
   void selectVSXSEG(SDNode *Node, unsigned NF, bool IsMasked, bool IsOrdered);
 
   void selectVSETVLI(SDNode *Node);
+  void selectXSfmmVSET(SDNode *Node);
 
   void selectSF_VC_X_SE(SDNode *Node);
 
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index e1c69c69a99ca..88d1178eba9b1 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -164,10 +164,13 @@ struct DemandedFields {
   // If this is true, we demand that VTYPE is set to some legal state, i.e. that
   // vill is unset.
   bool VILL = false;
+  bool UseTWiden = false;
+  bool UseAltFmt = false;
 
   // Return true if any part of VTYPE was used
   bool usedVTYPE() const {
-    return SEW || LMUL || SEWLMULRatio || TailPolicy || MaskPolicy || VILL;
+    return SEW || LMUL || SEWLMULRatio || TailPolicy || MaskPolicy || VILL ||
+           UseTWiden || UseAltFmt;
   }
 
   // Return true if any property of VL was used
@@ -183,6 +186,8 @@ struct DemandedFields {
     TailPolicy = true;
     MaskPolicy = true;
     VILL = true;
+    UseTWiden = true;
+    UseAltFmt = true;
   }
 
   // Mark all VL properties as demanded
@@ -208,6 +213,8 @@ struct DemandedFields {
     TailPolicy |= B.TailPolicy;
     MaskPolicy |= B.MaskPolicy;
     VILL |= B.VILL;
+    UseAltFmt |= B.UseAltFmt;
+    UseTWiden |= B.UseTWiden;
   }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
@@ -255,6 +262,8 @@ struct DemandedFields {
     OS << "TailPolicy=" << TailPolicy << ", ";
     OS << "MaskPolicy=" << MaskPolicy << ", ";
     OS << "VILL=" << VILL;
+    OS << ", UseAltFmt=" << UseAltFmt << ", ";
+    OS << "UseTWiden=" << UseTWiden;
     OS << "}";
   }
 #endif
@@ -324,6 +333,15 @@ static bool areCompatibleVTYPEs(uint64_t CurVType, uint64_t NewVType,
   if (Used.MaskPolicy && RISCVVType::isMaskAgnostic(CurVType) !=
                              RISCVVType::isMaskAgnostic(NewVType))
     return false;
+  if (Used.UseTWiden && (RISCVVType::hasXSfmmWiden(CurVType) !=
+                             RISCVVType::hasXSfmmWiden(NewVType) ||
+                         (RISCVVType::hasXSfmmWiden(CurVType) &&
+                          RISCVVType::getXSfmmWiden(CurVType) !=
+                              RISCVVType::getXSfmmWiden(NewVType))))
+    return false;
+  if (Used.UseAltFmt &&
+      RISCVVType::isAltFmt(CurVType) != RISCVVType::isAltFmt(NewVType))
+    return fals...
[truncated]
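
The last visible hunk extends `areCompatibleVTYPEs` with two checks gated on the new `UseTWiden` and `UseAltFmt` demanded fields. The sketch below restates that logic on a simplified model of the VTYPE state. `VTypeModel`, `Demanded`, and `compatibleXSfmmFields` are hypothetical names, and because the diff is cut off mid-statement, the altfmt branch returning `false` is an assumption.

```cpp
#include <cassert>

// Simplified model of the XSfmm-related VTYPE state (illustrative only).
struct VTypeModel {
  bool HasWiden;  // models RISCVVType::hasXSfmmWiden
  unsigned Widen; // models RISCVVType::getXSfmmWiden
  bool AltFmt;    // models RISCVVType::isAltFmt
};

// The two demanded-field bits added to DemandedFields by the patch.
struct Demanded {
  bool UseTWiden;
  bool UseAltFmt;
};

bool compatibleXSfmmFields(const VTypeModel &Cur, const VTypeModel &New,
                           const Demanded &Used) {
  // TWiden is compared only if demanded: presence must agree, and when
  // present the widen factors must be equal.
  if (Used.UseTWiden &&
      (Cur.HasWiden != New.HasWiden ||
       (Cur.HasWiden && Cur.Widen != New.Widen)))
    return false;
  // Assumption: the truncated statement rejects mismatched altfmt bits.
  if (Used.UseAltFmt && Cur.AltFmt != New.AltFmt)
    return false;
  return true;
}
```

With neither field demanded, any two states are compatible in this model, which reflects the purpose of the demanded-fields approach: a new vset is only needed when a following instruction actually reads the field that differs.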

llvmbot commented Jun 6, 2025

@llvm/pr-subscribers-backend-risc-v

+  assert(hasTWidenOp(TSFlags) && hasTMOp(TSFlags));
+  if (hasTKOp(TSFlags))
+    return Desc.getNumOperands() - 5;
+  // vtzero.t
+  return Desc.getNumOperands() - 4;
+}
+
+static inline unsigned getTKOpNum(const MCInstrDesc &Desc) {
+  const uint64_t TSFlags = Desc.TSFlags;
+  assert(hasTWidenOp(TSFlags) && hasTKOp(TSFlags));
+  return Desc.getNumOperands() - 3;
+}
+
 static inline unsigned getVLOpNum(const MCInstrDesc &Desc) {
   const uint64_t TSFlags = Desc.TSFlags;
   // This method is only called if we expect to have a VL operand, and all
   // instructions with VL also have SEW.
   assert(hasSEWOp(TSFlags) && hasVLOp(TSFlags));
+  // In Xsfmmbase, TN is alias for VL, so here we use the same TSFlags bit.
+  if (hasTWidenOp(TSFlags))
+    return getTNOpNum(Desc);
   unsigned Offset = 2;
   if (hasVecPolicyOp(TSFlags))
     Offset = 3;
@@ -216,7 +276,7 @@ static inline unsigned getSEWOpNum(const MCInstrDesc &Desc) {
   const uint64_t TSFlags = Desc.TSFlags;
   assert(hasSEWOp(TSFlags));
   unsigned Offset = 1;
-  if (hasVecPolicyOp(TSFlags))
+  if (hasVecPolicyOp(TSFlags) || hasTWidenOp(TSFlags))
     Offset = 2;
   return Desc.getNumOperands() - Offset;
 }
@@ -233,6 +293,9 @@ static inline int getFRMOpNum(const MCInstrDesc &Desc) {
   if (!hasRoundModeOp(TSFlags) || usesVXRM(TSFlags))
     return -1;
 
+  if (hasTWidenOp(TSFlags) && hasTMOp(TSFlags))
+    return getTMOpNum(Desc) - 1;
+
   // The operand order
   // --------------------------------------
   // | n-1 (if any)   | n-2  | n-3 | n-4 |
@@ -375,6 +438,7 @@ enum OperandType : unsigned {
   // instructions to represent a value that be passed as AVL to either vsetvli
   // or vsetivli.
   OPERAND_AVL,
+  OPERAND_XSFMM_VTYPE,
 };
 } // namespace RISCVOp
 
diff --git a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
index 72f1596d79a02..534c330463085 100644
--- a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
+++ b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
@@ -1100,6 +1100,12 @@ static bool lowerRISCVVMachineInstrToMCInst(const MachineInstr *MI,
     --NumOps;
   if (RISCVII::hasRoundModeOp(TSFlags))
     --NumOps;
+  if (RISCVII::hasTWidenOp(TSFlags))
+    --NumOps;
+  if (RISCVII::hasTMOp(TSFlags))
+    --NumOps;
+  if (RISCVII::hasTKOp(TSFlags))
+    --NumOps;
 
   bool hasVLOutput = RISCV::isFaultFirstLoad(*MI);
   for (unsigned OpNo = 0; OpNo != NumOps; ++OpNo) {
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 494d6ed03292a..d44eee21e0f9e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -522,6 +522,43 @@ void RISCVDAGToDAGISel::selectVSETVLI(SDNode *Node) {
               CurDAG->getMachineNode(Opcode, DL, XLenVT, VLOperand, VTypeIOp));
 }
 
+void RISCVDAGToDAGISel::selectXSfmmVSET(SDNode *Node) {
+  if (!Subtarget->hasVendorXSfmmbase())
+    return;
+
+  assert(Node->getOpcode() == ISD::INTRINSIC_WO_CHAIN && "Unexpected opcode");
+
+  SDLoc DL(Node);
+  MVT XLenVT = Subtarget->getXLenVT();
+
+  unsigned IntNo = Node->getConstantOperandVal(0);
+
+  assert((IntNo == Intrinsic::riscv_sf_vsettnt ||
+          IntNo == Intrinsic::riscv_sf_vsettm ||
+          IntNo == Intrinsic::riscv_sf_vsettk) &&
+         "Unexpected XSfmm vset intrinsic");
+
+  unsigned SEW = RISCVVType::decodeVSEW(Node->getConstantOperandVal(2));
+  unsigned Widen = RISCVVType::decodeTWiden(Node->getConstantOperandVal(3));
+  unsigned PseudoOpCode =
+      IntNo == Intrinsic::riscv_sf_vsettnt  ? RISCV::PseudoSF_VSETTNT
+      : IntNo == Intrinsic::riscv_sf_vsettm ? RISCV::PseudoSF_VSETTM
+                                            : RISCV::PseudoSF_VSETTK;
+
+  unsigned VTypeI = RISCVVType::encodeXSfmmVType(SEW, Widen, 0);
+  SDValue VTypeIOp = CurDAG->getTargetConstant(VTypeI, DL, XLenVT);
+  SDValue Log2SEW = CurDAG->getTargetConstant(Log2_32(SEW), DL, XLenVT);
+  SDValue TWiden = CurDAG->getTargetConstant(Widen, DL, XLenVT);
+
+  if (IntNo == Intrinsic::riscv_sf_vsettnt)
+    ReplaceNode(Node, CurDAG->getMachineNode(PseudoOpCode, DL, XLenVT,
+                                             Node->getOperand(1), VTypeIOp));
+  else
+    ReplaceNode(Node,
+                CurDAG->getMachineNode(PseudoOpCode, DL, XLenVT,
+                                       Node->getOperand(1), Log2SEW, TWiden));
+}
+
 bool RISCVDAGToDAGISel::tryShrinkShlLogicImm(SDNode *Node) {
   MVT VT = Node->getSimpleValueType(0);
   unsigned Opcode = Node->getOpcode();
@@ -777,6 +814,11 @@ bool RISCVDAGToDAGISel::tryIndexedLoad(SDNode *Node) {
   return true;
 }
 
+static Register getTileReg(uint64_t TileNum) {
+  assert(TileNum <= 15 && "Invalid tile number");
+  return RISCV::T0 + TileNum;
+}
+
 void RISCVDAGToDAGISel::selectSF_VC_X_SE(SDNode *Node) {
   if (!Subtarget->hasVInstructions())
     return;
@@ -1955,6 +1997,10 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
     case Intrinsic::riscv_vsetvli:
     case Intrinsic::riscv_vsetvlimax:
       return selectVSETVLI(Node);
+    case Intrinsic::riscv_sf_vsettnt:
+    case Intrinsic::riscv_sf_vsettm:
+    case Intrinsic::riscv_sf_vsettk:
+      return selectXSfmmVSET(Node);
     }
     break;
   }
@@ -2352,6 +2398,142 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
     case Intrinsic::riscv_sf_vc_i_se:
       selectSF_VC_X_SE(Node);
       return;
+    case Intrinsic::riscv_sf_vlte8:
+    case Intrinsic::riscv_sf_vlte16:
+    case Intrinsic::riscv_sf_vlte32:
+    case Intrinsic::riscv_sf_vlte64: {
+      unsigned Log2SEW;
+      unsigned PseudoInst;
+      switch (IntNo) {
+      case Intrinsic::riscv_sf_vlte8:
+        PseudoInst = RISCV::PseudoSF_VLTE8;
+        Log2SEW = 3;
+        break;
+      case Intrinsic::riscv_sf_vlte16:
+        PseudoInst = RISCV::PseudoSF_VLTE16;
+        Log2SEW = 4;
+        break;
+      case Intrinsic::riscv_sf_vlte32:
+        PseudoInst = RISCV::PseudoSF_VLTE32;
+        Log2SEW = 5;
+        break;
+      case Intrinsic::riscv_sf_vlte64:
+        PseudoInst = RISCV::PseudoSF_VLTE64;
+        Log2SEW = 6;
+        break;
+      }
+
+      SDValue SEWOp = CurDAG->getTargetConstant(Log2SEW, DL, XLenVT);
+      SDValue TWidenOp = CurDAG->getTargetConstant(1, DL, XLenVT);
+      SmallVector<SDValue, 7> Operands = {Node->getOperand(2),
+                                          Node->getOperand(3),
+                                          Node->getOperand(4),
+                                          SEWOp,
+                                          TWidenOp,
+                                          Node->getOperand(0)};
+
+      MachineSDNode *TileLoad =
+          CurDAG->getMachineNode(PseudoInst, DL, Node->getVTList(), Operands);
+      if (auto *MemOp = dyn_cast<MemSDNode>(Node))
+        CurDAG->setNodeMemRefs(TileLoad, {MemOp->getMemOperand()});
+
+      ReplaceNode(Node, TileLoad);
+      return;
+    }
+    case Intrinsic::riscv_sf_mm_s_s:
+    case Intrinsic::riscv_sf_mm_s_u:
+    case Intrinsic::riscv_sf_mm_u_s:
+    case Intrinsic::riscv_sf_mm_u_u:
+    case Intrinsic::riscv_sf_mm_e5m2_e5m2:
+    case Intrinsic::riscv_sf_mm_e5m2_e4m3:
+    case Intrinsic::riscv_sf_mm_e4m3_e5m2:
+    case Intrinsic::riscv_sf_mm_e4m3_e4m3:
+    case Intrinsic::riscv_sf_mm_f_f: {
+      bool HasFRM = false;
+      unsigned PseudoInst;
+      switch (IntNo) {
+      case Intrinsic::riscv_sf_mm_s_s:
+        PseudoInst = RISCV::PseudoSF_MM_S_S;
+        break;
+      case Intrinsic::riscv_sf_mm_s_u:
+        PseudoInst = RISCV::PseudoSF_MM_S_U;
+        break;
+      case Intrinsic::riscv_sf_mm_u_s:
+        PseudoInst = RISCV::PseudoSF_MM_U_S;
+        break;
+      case Intrinsic::riscv_sf_mm_u_u:
+        PseudoInst = RISCV::PseudoSF_MM_U_U;
+        break;
+      case Intrinsic::riscv_sf_mm_e5m2_e5m2:
+        PseudoInst = RISCV::PseudoSF_MM_E5M2_E5M2;
+        HasFRM = true;
+        break;
+      case Intrinsic::riscv_sf_mm_e5m2_e4m3:
+        PseudoInst = RISCV::PseudoSF_MM_E5M2_E4M3;
+        HasFRM = true;
+        break;
+      case Intrinsic::riscv_sf_mm_e4m3_e5m2:
+        PseudoInst = RISCV::PseudoSF_MM_E4M3_E5M2;
+        HasFRM = true;
+        break;
+      case Intrinsic::riscv_sf_mm_e4m3_e4m3:
+        PseudoInst = RISCV::PseudoSF_MM_E4M3_E4M3;
+        HasFRM = true;
+        break;
+      case Intrinsic::riscv_sf_mm_f_f:
+        if (Node->getOperand(3).getValueType().getScalarType() == MVT::bf16)
+          PseudoInst = RISCV::PseudoSF_MM_F_F_ALT;
+        else
+          PseudoInst = RISCV::PseudoSF_MM_F_F;
+        HasFRM = true;
+        break;
+      }
+      uint64_t TileNum = Node->getConstantOperandVal(2);
+      SDValue Op1 = Node->getOperand(3);
+      SDValue Op2 = Node->getOperand(4);
+      MVT VT = Op1->getSimpleValueType(0);
+      unsigned Log2SEW = Log2_32(VT.getScalarSizeInBits());
+      SDValue TmOp = Node->getOperand(5);
+      SDValue TnOp = Node->getOperand(6);
+      SDValue TkOp = Node->getOperand(7);
+      SDValue TWidenOp = Node->getOperand(8);
+      SDValue Chain = Node->getOperand(0);
+
+      // sf.mm.f.f with sew=32, twiden=2 is invalid
+      if (IntNo == Intrinsic::riscv_sf_mm_f_f && Log2SEW == 5 &&
+          TWidenOp->getAsZExtVal() == 2)
+        report_fatal_error("sf.mm.f.f doesn't support (sew=32, twiden=2)");
+
+      SmallVector<SDValue, 10> Operands(
+          {CurDAG->getRegister(getTileReg(TileNum), XLenVT), Op1, Op2});
+      if (HasFRM)
+        Operands.push_back(
+            CurDAG->getTargetConstant(RISCVFPRndMode::DYN, DL, XLenVT));
+      Operands.append({TmOp, TnOp, TkOp,
+                       CurDAG->getTargetConstant(Log2SEW, DL, XLenVT), TWidenOp,
+                       Chain});
+
+      auto *NewNode =
+          CurDAG->getMachineNode(PseudoInst, DL, Node->getVTList(), Operands);
+
+      ReplaceNode(Node, NewNode);
+      return;
+    }
+    case Intrinsic::riscv_sf_vtzero_t: {
+      uint64_t TileNum = Node->getConstantOperandVal(2);
+      SDValue Tm = Node->getOperand(3);
+      SDValue Tn = Node->getOperand(4);
+      SDValue Log2SEW = Node->getOperand(5);
+      SDValue TWiden = Node->getOperand(6);
+      SDValue Chain = Node->getOperand(0);
+      auto *NewNode = CurDAG->getMachineNode(
+          RISCV::PseudoSF_VTZERO_T, DL, Node->getVTList(),
+          {CurDAG->getRegister(getTileReg(TileNum), XLenVT), Tm, Tn, Log2SEW,
+           TWiden, Chain});
+
+      ReplaceNode(Node, NewNode);
+      return;
+    }
     }
     break;
   }
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
index f199c2031b9a9..ce40075ff6d7b 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.h
@@ -164,6 +164,7 @@ class RISCVDAGToDAGISel : public SelectionDAGISel {
   void selectVSXSEG(SDNode *Node, unsigned NF, bool IsMasked, bool IsOrdered);
 
   void selectVSETVLI(SDNode *Node);
+  void selectXSfmmVSET(SDNode *Node);
 
   void selectSF_VC_X_SE(SDNode *Node);
 
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index e1c69c69a99ca..88d1178eba9b1 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -164,10 +164,13 @@ struct DemandedFields {
   // If this is true, we demand that VTYPE is set to some legal state, i.e. that
   // vill is unset.
   bool VILL = false;
+  bool UseTWiden = false;
+  bool UseAltFmt = false;
 
   // Return true if any part of VTYPE was used
   bool usedVTYPE() const {
-    return SEW || LMUL || SEWLMULRatio || TailPolicy || MaskPolicy || VILL;
+    return SEW || LMUL || SEWLMULRatio || TailPolicy || MaskPolicy || VILL ||
+           UseTWiden || UseAltFmt;
   }
 
   // Return true if any property of VL was used
@@ -183,6 +186,8 @@ struct DemandedFields {
     TailPolicy = true;
     MaskPolicy = true;
     VILL = true;
+    UseTWiden = true;
+    UseAltFmt = true;
   }
 
   // Mark all VL properties as demanded
@@ -208,6 +213,8 @@ struct DemandedFields {
     TailPolicy |= B.TailPolicy;
     MaskPolicy |= B.MaskPolicy;
     VILL |= B.VILL;
+    UseAltFmt |= B.UseAltFmt;
+    UseTWiden |= B.UseTWiden;
   }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
@@ -255,6 +262,8 @@ struct DemandedFields {
     OS << "TailPolicy=" << TailPolicy << ", ";
     OS << "MaskPolicy=" << MaskPolicy << ", ";
     OS << "VILL=" << VILL;
+    OS << ", UseAltFmt=" << UseAltFmt << ", ";
+    OS << "UseTWiden=" << UseTWiden;
     OS << "}";
   }
 #endif
@@ -324,6 +333,15 @@ static bool areCompatibleVTYPEs(uint64_t CurVType, uint64_t NewVType,
   if (Used.MaskPolicy && RISCVVType::isMaskAgnostic(CurVType) !=
                              RISCVVType::isMaskAgnostic(NewVType))
     return false;
+  if (Used.UseTWiden && (RISCVVType::hasXSfmmWiden(CurVType) !=
+                             RISCVVType::hasXSfmmWiden(NewVType) ||
+                         (RISCVVType::hasXSfmmWiden(CurVType) &&
+                          RISCVVType::getXSfmmWiden(CurVType) !=
+                              RISCVVType::getXSfmmWiden(NewVType))))
+    return false;
+  if (Used.UseAltFmt &&
+      RISCVVType::isAltFmt(CurVType) != RISCVVType::isAltFmt(NewVType))
+    return fals...
[truncated]
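
The TSFlags additions in the RISCVBaseInfo.h hunk follow a chained shift/mask layout, where each field starts where the previous one ends. A minimal standalone sketch of that packing follows; the starting value of `DestEEWShift` and the helper `withAltFmtType` are assumptions made for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed starting bit: the real DestEEWShift depends on fields declared
// earlier in the enum, which this hunk does not show.
constexpr unsigned DestEEWShift = 20;

// Each field begins right after the previous one (shift + width in bits).
constexpr unsigned AltFmtTypeShift = DestEEWShift + 2; // DestEEW: 2 bits
constexpr uint64_t AltFmtTypeMask = 3ULL << AltFmtTypeShift;
constexpr unsigned IsWidenShift = AltFmtTypeShift + 2; // AltFmtType: 2 bits
constexpr uint64_t IsWidenMask = 1ULL << IsWidenShift;
constexpr unsigned HasTWidenOpShift = IsWidenShift + 1;
constexpr uint64_t HasTWidenOpMask = 1ULL << HasTWidenOpShift;
constexpr unsigned HasTMOpShift = HasTWidenOpShift + 1;
constexpr uint64_t HasTMOpMask = 1ULL << HasTMOpShift;
constexpr unsigned HasTKOpShift = HasTMOpShift + 1;
constexpr uint64_t HasTKOpMask = 1ULL << HasTKOpShift;

// The masks must not overlap, or one flag would corrupt another.
static_assert((AltFmtTypeMask & IsWidenMask) == 0, "fields overlap");
static_assert((IsWidenMask & HasTWidenOpMask) == 0, "fields overlap");
static_assert((HasTWidenOpMask & HasTMOpMask) == 0, "fields overlap");
static_assert((HasTMOpMask & HasTKOpMask) == 0, "fields overlap");

enum class AltFmtType { DontCare, NotAltFmt, AltFmt };

// Same decoding as the patch: mask the field, then shift it down.
constexpr AltFmtType getAltFmtType(uint64_t TSFlags) {
  return static_cast<AltFmtType>((TSFlags & AltFmtTypeMask) >> AltFmtTypeShift);
}

constexpr bool hasTKOp(uint64_t TSFlags) {
  return (TSFlags & HasTKOpMask) != 0;
}

// Hypothetical helper (not in the patch): store an AltFmtType into TSFlags
// without disturbing any other field.
constexpr uint64_t withAltFmtType(uint64_t TSFlags, AltFmtType T) {
  return (TSFlags & ~AltFmtTypeMask) |
         (static_cast<uint64_t>(T) << AltFmtTypeShift);
}
```

For instance, `getAltFmtType(withAltFmtType(0, AltFmtType::AltFmt))` yields `AltFmtType::AltFmt`, and setting that field leaves the `HasTKOp` bit untouched.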

@github-actions

github-actions bot commented Jun 6, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
MachineBasicBlock::iterator InsertPt, DebugLoc DL,
const VSETVLIInfo &Info, const VSETVLIInfo &PrevInfo) {

Member

Drop this unnecessary change

@efriedma-quic
Collaborator

Is there an ABI specification for this somewhere? For SME, we did a lot of work to ensure that operations involving the matrix tile work cleanly. Which... I guess you don't necessarily need to do all of that, but you probably want a spec that says something about it.

[ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>, IntrNoMem]>;

// Input: (tss, base, tn)
// IntrReadMem, IntrHasSideEffects does not work for pattern matching.
Collaborator

@topperc topperc Jul 7, 2025

Aren't these intrinsics manually selected? Not sure what you mean by pattern matching here.

We definitely need IntrHasSideEffects. I do recall that IntrReadMem+IntrHasSideEffects has some issue in attribute generation. You're better off omitting IntrReadMem than not having IntrHasSideEffects. I guess maybe that's the default if you don't say anything?

Member Author

Oh, I probably meant that we don't use pattern matching here because IntrReadMem + IntrHasSideEffects doesn't work properly; it's eliminated during instruction selection.
I'll keep IntrHasSideEffects anyway lol

@4vtomat
Member Author

4vtomat commented Aug 5, 2025

Is there an ABI specification for this somewhere? For SME, we did a lot of work to ensure that operations involving the matrix tile work cleanly. Which... I guess you don't necessarily need to do all of that, but you probably want a spec that says something about it.

Yeah I'm cooking on this, thanks for reminding!
I'll have another PR for that!

@4vtomat 4vtomat force-pushed the xsfmm_llvm_ir branch 2 times, most recently from 6bcdbaa to 3914b96 Compare September 22, 2025 14:55
Collaborator

@topperc topperc left a comment

LGTM other than the FRM question


SmallVector<SDValue, 10> Operands(
{CurDAG->getRegister(getTileReg(TileNum), XLenVT), Op1, Op2});
if (HasFRM)
Collaborator

Do you plan to add FRM to the intrinsic operands?

Member Author

Yeah I think I'll list as TODO
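
To make the operand positions behind this question concrete, here is a minimal standalone sketch of the index arithmetic from the RISCVBaseInfo.h hunk. `PseudoDesc` is a hypothetical stand-in for `MCInstrDesc` plus its TSFlags, and the operand layouts in the comments are inferred from the operand lists built in RISCVISelDAGToDAG.cpp (with the chain excluded), so treat them as assumptions rather than the definitive pseudo definitions.

```cpp
#include <cassert>

// Hypothetical stand-in for MCInstrDesc and the relevant TSFlags bits.
// Every pseudo modeled here is assumed to have HasTWidenOp set.
struct PseudoDesc {
  unsigned NumOperands;
  bool HasTMOp;
  bool HasTKOp;
  bool HasRoundModeOp;
};

// Assumed layouts, read from the end of the operand list:
//   sf.mm.*:  tile, rs1, rs2, [frm], tm, tn, tk, log2sew, twiden
//   vtzero.t: tile, tm, tn, log2sew, twiden

// twiden is last, so log2sew sits two from the end.
inline unsigned getSEWOpNum(const PseudoDesc &D) { return D.NumOperands - 2; }

inline unsigned getTKOpNum(const PseudoDesc &D) {
  assert(D.HasTKOp);
  return D.NumOperands - 3;
}

// TN moves one slot further from the end when a TK operand follows it.
inline unsigned getTNOpNum(const PseudoDesc &D) {
  return D.NumOperands - (D.HasTKOp ? 4 : 3);
}

inline unsigned getTMOpNum(const PseudoDesc &D) {
  assert(D.HasTMOp);
  return D.NumOperands - (D.HasTKOp ? 5 : 4);
}

// FRM, when present, is placed immediately before TM.
inline int getFRMOpNum(const PseudoDesc &D) {
  if (!D.HasRoundModeOp)
    return -1;
  return static_cast<int>(getTMOpNum(D)) - 1;
}
```

With nine operands including FRM (tile, two sources, frm, tm, tn, tk, log2sew, twiden), the helpers return 3 for FRM and 4, 5, 6 for TM, TN, TK. For the five-operand `vtzero.t` layout they return 1 for TM and 2 for TN.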

@4vtomat 4vtomat enabled auto-merge (squash) October 14, 2025 01:06
@4vtomat 4vtomat merged commit 6cec362 into llvm:main Oct 14, 2025
10 checks passed
@llvm-ci
Collaborator

llvm-ci commented Oct 14, 2025

LLVM Buildbot has detected a new failure on builder arc-builder running on arc-worker while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/23368

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/X86/sse2-intrinsics-fast-isel.ll' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/buildbot/worker/arc-folder/build/bin/llc < /buildbot/worker/arc-folder/llvm-project/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll -show-mc-encoding -fast-isel -mtriple=i386-unknown-unknown -mattr=+sse2 | /buildbot/worker/arc-folder/build/bin/FileCheck /buildbot/worker/arc-folder/llvm-project/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll --check-prefixes=CHECK,X86,SSE,X86-SSE
# executed command: /buildbot/worker/arc-folder/build/bin/llc -show-mc-encoding -fast-isel -mtriple=i386-unknown-unknown -mattr=+sse2
# .---command stderr------------
# | LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse2.clflush
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
# | Stack dump:
# | 0.	Program arguments: /buildbot/worker/arc-folder/build/bin/llc -show-mc-encoding -fast-isel -mtriple=i386-unknown-unknown -mattr=+sse2
# | 1.	Running pass 'Function Pass Manager' on module '<stdin>'.
# | 2.	Running pass 'X86 DAG->DAG Instruction Selection' on function '@test_mm_clflush'
# |  #0 0x0000000002394e68 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/buildbot/worker/arc-folder/build/bin/llc+0x2394e68)
# |  #1 0x0000000002391d75 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
# |  #2 0x00007fb7f9b34630 __restore_rt sigaction.c:0:0
# |  #3 0x00007fb7f88843d7 raise (/usr/lib64/libc.so.6+0x363d7)
# |  #4 0x00007fb7f8885ac8 abort (/usr/lib64/libc.so.6+0x37ac8)
# |  #5 0x000000000072664b llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) JSON.cpp:0:0
# |  #6 0x0000000002116629 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/buildbot/worker/arc-folder/build/bin/llc+0x2116629)
# |  #7 0x000000000211b1ba llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/buildbot/worker/arc-folder/build/bin/llc+0x211b1ba)
# |  #8 0x000000000096ce27 (anonymous namespace)::X86DAGToDAGISel::Select(llvm::SDNode*) X86ISelDAGToDAG.cpp:0:0
# |  #9 0x0000000002111e6f llvm::SelectionDAGISel::DoInstructionSelection() (/buildbot/worker/arc-folder/build/bin/llc+0x2111e6f)
# | #10 0x0000000002121d78 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/buildbot/worker/arc-folder/build/bin/llc+0x2121d78)
# | #11 0x0000000002125e8e llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/buildbot/worker/arc-folder/build/bin/llc+0x2125e8e)
# | #12 0x0000000002126ae5 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/buildbot/worker/arc-folder/build/bin/llc+0x2126ae5)
# | #13 0x000000000211167f llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/buildbot/worker/arc-folder/build/bin/llc+0x211167f)
# | #14 0x000000000121fdb7 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.0) MachineFunctionPass.cpp:0:0
# | #15 0x00000000018a0dcb llvm::FPPassManager::runOnFunction(llvm::Function&) (/buildbot/worker/arc-folder/build/bin/llc+0x18a0dcb)
# | #16 0x00000000018a1171 llvm::FPPassManager::runOnModule(llvm::Module&) (/buildbot/worker/arc-folder/build/bin/llc+0x18a1171)
# | #17 0x00000000018a1d85 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/buildbot/worker/arc-folder/build/bin/llc+0x18a1d85)
# | #18 0x000000000080a66d compileModule(char**, llvm::LLVMContext&) llc.cpp:0:0
# | #19 0x000000000072f096 main (/buildbot/worker/arc-folder/build/bin/llc+0x72f096)
# | #20 0x00007fb7f8870555 __libc_start_main (/usr/lib64/libc.so.6+0x22555)
# | #21 0x00000000007ffa66 _start (/buildbot/worker/arc-folder/build/bin/llc+0x7ffa66)
# `-----------------------------
# error: command failed with exit status: -6
# executed command: /buildbot/worker/arc-folder/build/bin/FileCheck /buildbot/worker/arc-folder/llvm-project/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll --check-prefixes=CHECK,X86,SSE,X86-SSE
# .---command stderr------------
# | /buildbot/worker/arc-folder/llvm-project/llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll:399:14: error: SSE-LABEL: expected string not found in input
# | ; SSE-LABEL: test_mm_bsrli_si128:
# |              ^
# | <stdin>:170:21: note: scanning from here
# | test_mm_bslli_si128: # @test_mm_bslli_si128
# |                     ^
# | <stdin>:178:9: note: possible intended match here
# |  .globl test_mm_bsrli_si128 # 
# |         ^
...

akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025
@efriedma-quic
Collaborator

Is there an ABI specification for this somewhere? For SME, we did a lot of work to ensure that operations involving the matrix tile work cleanly. Which... I guess you don't necessarily need to do all of that, but you probably want a spec that says something about it.

Did anything happen for this?

@efriedma-quic
Collaborator

Is there an ABI specification for this somewhere? For SME, we did a lot of work to ensure that operations involving the matrix tile work cleanly. Which... I guess you don't necessarily need to do all of that, but you probably want a spec that says something about it.

Did anything happen for this?

Ping

@4vtomat
Member Author

4vtomat commented Oct 21, 2025

Is there an ABI specification for this somewhere? For SME, we did a lot of work to ensure that operations involving the matrix tile work cleanly. Which... I guess you don't necessarily need to do all of that, but you probably want a spec that says something about it.

Did anything happen for this?

Ping

Sorry for the late reply. We have a PR under review downstream; once that settles, we'll publish a spec and open a public PR.

4vtomat added a commit that referenced this pull request Oct 24, 2025
In this version of the intrinsics, users must manage the lifetime of
tiles on their own. The compiler has no tile type for variables, both
for design simplicity and to keep users from writing poorly performing
code that could cause tile spills, which are quite expensive in terms
of cycles.

Intrinsics are specified at the end of this document
https://www.sifive.com/document-file/xsfmm-matrix-extensions-specification

stack on: #143068 and
#143069
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 24, 2025
dvbuka pushed a commit to dvbuka/llvm-project that referenced this pull request Oct 27, 2025
Lukacma pushed a commit to Lukacma/llvm-project that referenced this pull request Oct 29, 2025
aokblast pushed a commit to aokblast/llvm-project that referenced this pull request Oct 30, 2025