[SPIR-V] Legalize vector arithmetic and intrinsics for large vectors #170668

s-perron · 2025-12-04T14:34:06Z

This patch improves the legalization of vector operations, particularly
focusing on vectors that exceed the maximum supported size (e.g., 4 elements
for shaders). This includes better handling for insert and extract element
operations, which facilitates the legalization of loads and stores for
long vectors—a common pattern when compiling HLSL matrices with Clang.

Key changes include:

- Adding legalization rules for G_FMA, G_INSERT_VECTOR_ELT, and various
  arithmetic operations to handle splitting of large vectors.
- Updating G_CONCAT_VECTORS and G_SPLAT_VECTOR to be legal for allowed
  types.
- Implementing custom legalization for G_INSERT_VECTOR_ELT using the
  spv_insertelt intrinsic.
- Enhancing SPIRVPostLegalizer to deduce types for arithmetic instructions
  and vector element intrinsics (spv_insertelt, spv_extractelt).
- Refactoring legalizeIntrinsic to uniformly handle vector legalization
  requirements.

The strategy for insert and extract operations mirrors that of bitcasts:
incoming intrinsics are converted to generic MIR instructions (G_INSERT_VECTOR_ELT
and G_EXTRACT_VECTOR_ELT) to leverage standard legalization rules (like splitting).
After legalization, they are converted back to their respective SPIR-V intrinsics
(spv_insertelt, spv_extractelt) because later passes in the backend expect these
intrinsics rather than the generic instructions.

This ensures that operations on large vectors (e.g., <16 x float>) are
correctly broken down into legal sub-vectors.
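
As a rough illustration of what "broken down into legal sub-vectors" means, here is a minimal standalone sketch. It does not use any LLVM API: `needsLegalization` mirrors the condition in the patch's `needsVectorLegalization` but takes a plain element count, and `chunkedAdd` is a hypothetical model of an element-wise operation carried out in pieces of at most `MaxVectorSize` lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the patch's needsVectorLegalization check, with the vector
// described only by its element count (an assumption made for brevity).
static bool needsLegalization(unsigned NumElements, unsigned MaxVectorSize) {
  bool IsPow2 = NumElements != 0 && (NumElements & (NumElements - 1)) == 0;
  return (NumElements > 4 && !IsPow2) || NumElements > MaxVectorSize;
}

// Hypothetical model of splitting: add two wide vectors one chunk of at most
// MaxVectorSize lanes at a time, and count how many chunks were needed.
static std::vector<float> chunkedAdd(const std::vector<float> &A,
                                     const std::vector<float> &B,
                                     std::size_t MaxVectorSize,
                                     std::size_t &NumChunks) {
  assert(A.size() == B.size() && MaxVectorSize > 0);
  std::vector<float> Result(A.size());
  NumChunks = 0;
  for (std::size_t Base = 0; Base < A.size(); Base += MaxVectorSize) {
    std::size_t End =
        Base + MaxVectorSize < A.size() ? Base + MaxVectorSize : A.size();
    for (std::size_t I = Base; I < End; ++I)
      Result[I] = A[I] + B[I];
    ++NumChunks;
  }
  return Result;
}
```

With a shader limit of 4, a 16-element add in this model is processed as four 4-lane chunks, which is the shape the new tests check for in the generated SPIR-V.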

llvmbot · 2025-12-04T14:34:42Z

@llvm/pr-subscribers-backend-spir-v

Author: Steven Perron (s-perron)

Changes

This patch improves the legalization of vector operations, particularly
focusing on vectors that exceed the maximum supported size (e.g., 4 elements
for shaders). This includes better handling for insert and extract element
operations, which facilitates the legalization of loads and stores for
long vectors—a common pattern when compiling HLSL matrices with Clang.

Key changes include:

- Adding legalization rules for G_FMA, G_INSERT_VECTOR_ELT, and various
  arithmetic operations to handle splitting of large vectors.
- Updating G_CONCAT_VECTORS and G_SPLAT_VECTOR to be legal for allowed
  types.
- Implementing custom legalization for G_INSERT_VECTOR_ELT using the
  spv_insertelt intrinsic.
- Enhancing SPIRVPostLegalizer to deduce types for arithmetic instructions
  and vector element intrinsics (spv_insertelt, spv_extractelt).
- Refactoring legalizeIntrinsic to uniformly handle vector legalization
  requirements.

The strategy for insert and extract operations mirrors that of bitcasts:
incoming intrinsics are converted to generic MIR instructions (G_INSERT_VECTOR_ELT
and G_EXTRACT_VECTOR_ELT) to leverage standard legalization rules (like splitting).
After legalization, they are converted back to their respective SPIR-V intrinsics
(spv_insertelt, spv_extractelt) because later passes in the backend expect these
intrinsics rather than the generic instructions.

This ensures that operations on large vectors (e.g., <16 x float>) are
correctly broken down into legal sub-vectors.

Patch is 29.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/170668.diff

4 Files Affected:

(modified) llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp (+83-34)
(modified) llvm/lib/Target/SPIRV/SPIRVPostLegalizer.cpp (+38-9)
(added) llvm/test/CodeGen/SPIRV/legalization/load-store-global.ll (+194)
(added) llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic.ll (+149)

diff --git a/llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp b/llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp
index b5912c27316c9..4d83649c0f84f 100644
--- a/llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp
@@ -113,6 +113,8 @@ SPIRVLegalizerInfo::SPIRVLegalizerInfo(const SPIRVSubtarget &ST) {
                            v3s1, v3s8, v3s16, v3s32, v3s64,
                            v4s1, v4s8, v4s16, v4s32, v4s64};
 
+  auto allScalars = {s1, s8, s16, s32};
+
   auto allScalarsAndVectors = {
       s1,   s8,   s16,   s32,   s64,   v2s1,  v2s8,  v2s16,  v2s32,  v2s64,
       v3s1, v3s8, v3s16, v3s32, v3s64, v4s1,  v4s8,  v4s16,  v4s32,  v4s64,
@@ -172,9 +174,25 @@ SPIRVLegalizerInfo::SPIRVLegalizerInfo(const SPIRVSubtarget &ST) {
 
   for (auto Opc : getTypeFoldingSupportedOpcodes()) {
     if (Opc != G_EXTRACT_VECTOR_ELT)
-      getActionDefinitionsBuilder(Opc).custom();
+      getActionDefinitionsBuilder(Opc)
+          .customFor(allScalars)
+          .customFor(allowedVectorTypes)
+          .moreElementsToNextPow2(0)
+          .fewerElementsIf(vectorElementCountIsGreaterThan(0, MaxVectorSize),
+                           LegalizeMutations::changeElementCountTo(
+                               0, ElementCount::getFixed(MaxVectorSize)))
+          .custom();
   }
 
+  getActionDefinitionsBuilder(TargetOpcode::G_FMA)
+      .legalFor(allScalars)
+      .legalFor(allowedVectorTypes)
+      .moreElementsToNextPow2(0)
+      .fewerElementsIf(vectorElementCountIsGreaterThan(0, MaxVectorSize),
+                       LegalizeMutations::changeElementCountTo(
+                           0, ElementCount::getFixed(MaxVectorSize)))
+      .alwaysLegal();
+
   getActionDefinitionsBuilder(G_INTRINSIC_W_SIDE_EFFECTS).custom();
 
   getActionDefinitionsBuilder(G_SHUFFLE_VECTOR)
@@ -192,6 +210,13 @@ SPIRVLegalizerInfo::SPIRVLegalizerInfo(const SPIRVSubtarget &ST) {
                            1, ElementCount::getFixed(MaxVectorSize)))
       .custom();
 
+  getActionDefinitionsBuilder(G_INSERT_VECTOR_ELT)
+      .moreElementsToNextPow2(0)
+      .fewerElementsIf(vectorElementCountIsGreaterThan(0, MaxVectorSize),
+                       LegalizeMutations::changeElementCountTo(
+                           0, ElementCount::getFixed(MaxVectorSize)))
+      .custom();
+
   // Illegal G_UNMERGE_VALUES instructions should be handled
   // during the combine phase.
   getActionDefinitionsBuilder(G_BUILD_VECTOR)
@@ -215,14 +240,13 @@ SPIRVLegalizerInfo::SPIRVLegalizerInfo(const SPIRVSubtarget &ST) {
       .lowerIf(vectorElementCountIsGreaterThan(1, MaxVectorSize))
       .custom();
 
+  // If the result is still illegal, the combiner should be able to remove it.
   getActionDefinitionsBuilder(G_CONCAT_VECTORS)
-      .legalIf(vectorElementCountIsLessThanOrEqualTo(0, MaxVectorSize))
-      .moreElementsToNextPow2(0)
-      .lowerIf(vectorElementCountIsGreaterThan(0, MaxVectorSize))
-      .alwaysLegal();
+      .legalForCartesianProduct(allowedVectorTypes, allowedVectorTypes)
+      .moreElementsToNextPow2(0);
 
   getActionDefinitionsBuilder(G_SPLAT_VECTOR)
-      .legalIf(vectorElementCountIsLessThanOrEqualTo(0, MaxVectorSize))
+      .legalFor(allowedVectorTypes)
       .moreElementsToNextPow2(0)
       .fewerElementsIf(vectorElementCountIsGreaterThan(0, MaxVectorSize),
                        LegalizeMutations::changeElementSizeTo(0, MaxVectorSize))
@@ -458,6 +482,23 @@ static bool legalizeExtractVectorElt(LegalizerHelper &Helper, MachineInstr &MI,
   return true;
 }
 
+static bool legalizeInsertVectorElt(LegalizerHelper &Helper, MachineInstr &MI,
+                                    SPIRVGlobalRegistry *GR) {
+  MachineIRBuilder &MIRBuilder = Helper.MIRBuilder;
+  Register DstReg = MI.getOperand(0).getReg();
+  Register SrcReg = MI.getOperand(1).getReg();
+  Register ValReg = MI.getOperand(2).getReg();
+  Register IdxReg = MI.getOperand(3).getReg();
+
+  MIRBuilder
+      .buildIntrinsic(Intrinsic::spv_insertelt, ArrayRef<Register>{DstReg})
+      .addUse(SrcReg)
+      .addUse(ValReg)
+      .addUse(IdxReg);
+  MI.eraseFromParent();
+  return true;
+}
+
 static Register convertPtrToInt(Register Reg, LLT ConvTy, SPIRVType *SpvType,
                                 LegalizerHelper &Helper,
                                 MachineRegisterInfo &MRI,
@@ -483,6 +524,8 @@ bool SPIRVLegalizerInfo::legalizeCustom(
     return legalizeBitcast(Helper, MI);
   case TargetOpcode::G_EXTRACT_VECTOR_ELT:
     return legalizeExtractVectorElt(Helper, MI, GR);
+  case TargetOpcode::G_INSERT_VECTOR_ELT:
+    return legalizeInsertVectorElt(Helper, MI, GR);
   case TargetOpcode::G_INTRINSIC:
   case TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS:
     return legalizeIntrinsic(Helper, MI);
@@ -512,6 +555,15 @@ bool SPIRVLegalizerInfo::legalizeCustom(
   }
 }
 
+static bool needsVectorLegalization(const LLT &Ty, const SPIRVSubtarget &ST) {
+  if (!Ty.isVector())
+    return false;
+  unsigned NumElements = Ty.getNumElements();
+  unsigned MaxVectorSize = ST.isShader() ? 4 : 16;
+  return (NumElements > 4 && !isPowerOf2_32(NumElements)) ||
+         NumElements > MaxVectorSize;
+}
+
 bool SPIRVLegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
                                            MachineInstr &MI) const {
   LLVM_DEBUG(dbgs() << "legalizeIntrinsic: " << MI);
@@ -528,41 +580,38 @@ bool SPIRVLegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
     LLT DstTy = MRI.getType(DstReg);
     LLT SrcTy = MRI.getType(SrcReg);
 
-    int32_t MaxVectorSize = ST.isShader() ? 4 : 16;
-
-    bool DstNeedsLegalization = false;
-    bool SrcNeedsLegalization = false;
-
-    if (DstTy.isVector()) {
-      if (DstTy.getNumElements() > 4 &&
-          !isPowerOf2_32(DstTy.getNumElements())) {
-        DstNeedsLegalization = true;
-      }
-
-      if (DstTy.getNumElements() > MaxVectorSize) {
-        DstNeedsLegalization = true;
-      }
-    }
-
-    if (SrcTy.isVector()) {
-      if (SrcTy.getNumElements() > 4 &&
-          !isPowerOf2_32(SrcTy.getNumElements())) {
-        SrcNeedsLegalization = true;
-      }
-
-      if (SrcTy.getNumElements() > MaxVectorSize) {
-        SrcNeedsLegalization = true;
-      }
-    }
-
     // If an spv_bitcast needs to be legalized, we convert it to G_BITCAST to
     // allow using the generic legalization rules.
-    if (DstNeedsLegalization || SrcNeedsLegalization) {
+    if (needsVectorLegalization(DstTy, ST) ||
+        needsVectorLegalization(SrcTy, ST)) {
       LLVM_DEBUG(dbgs() << "Replacing with a G_BITCAST\n");
       MIRBuilder.buildBitcast(DstReg, SrcReg);
       MI.eraseFromParent();
     }
     return true;
+  } else if (IntrinsicID == Intrinsic::spv_insertelt) {
+    Register DstReg = MI.getOperand(0).getReg();
+    LLT DstTy = MRI.getType(DstReg);
+
+    if (needsVectorLegalization(DstTy, ST)) {
+      Register SrcReg = MI.getOperand(2).getReg();
+      Register ValReg = MI.getOperand(3).getReg();
+      Register IdxReg = MI.getOperand(4).getReg();
+      MIRBuilder.buildInsertVectorElement(DstReg, SrcReg, ValReg, IdxReg);
+      MI.eraseFromParent();
+    }
+    return true;
+  } else if (IntrinsicID == Intrinsic::spv_extractelt) {
+    Register SrcReg = MI.getOperand(2).getReg();
+    LLT SrcTy = MRI.getType(SrcReg);
+
+    if (needsVectorLegalization(SrcTy, ST)) {
+      Register DstReg = MI.getOperand(0).getReg();
+      Register IdxReg = MI.getOperand(3).getReg();
+      MIRBuilder.buildExtractVectorElement(DstReg, SrcReg, IdxReg);
+      MI.eraseFromParent();
+    }
+    return true;
   }
   return true;
 }
diff --git a/llvm/lib/Target/SPIRV/SPIRVPostLegalizer.cpp b/llvm/lib/Target/SPIRV/SPIRVPostLegalizer.cpp
index c90e6d8cfbfb4..d91016a38539b 100644
--- a/llvm/lib/Target/SPIRV/SPIRVPostLegalizer.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVPostLegalizer.cpp
@@ -16,6 +16,7 @@
 #include "SPIRV.h"
 #include "SPIRVSubtarget.h"
 #include "SPIRVUtils.h"
+#include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h"
 #include "llvm/IR/IntrinsicsSPIRV.h"
 #include "llvm/Support/Debug.h"
 #include <stack>
@@ -66,8 +67,9 @@ static bool deduceAndAssignTypeForGUnmerge(MachineInstr *I, MachineFunction &MF,
     for (unsigned i = 0; i < I->getNumDefs() && !ScalarType; ++i) {
       for (const auto &Use :
            MRI.use_nodbg_instructions(I->getOperand(i).getReg())) {
-        assert(Use.getOpcode() == TargetOpcode::G_BUILD_VECTOR &&
-               "Expected use of G_UNMERGE_VALUES to be a G_BUILD_VECTOR");
+        if (Use.getOpcode() != TargetOpcode::G_BUILD_VECTOR)
+          continue;
+
         if (auto *VecType =
                 GR->getSPIRVTypeForVReg(Use.getOperand(0).getReg())) {
           ScalarType = GR->getScalarOrVectorComponentType(VecType);
@@ -133,10 +135,10 @@ static SPIRVType *deduceTypeFromOperandRange(MachineInstr *I,
   return ResType;
 }
 
-static SPIRVType *deduceTypeForResultRegister(MachineInstr *Use,
-                                              Register UseRegister,
-                                              SPIRVGlobalRegistry *GR,
-                                              MachineIRBuilder &MIB) {
+static SPIRVType *deduceTypeFromResultRegister(MachineInstr *Use,
+                                               Register UseRegister,
+                                               SPIRVGlobalRegistry *GR,
+                                               MachineIRBuilder &MIB) {
   for (const MachineOperand &MO : Use->defs()) {
     if (!MO.isReg())
       continue;
@@ -159,16 +161,43 @@ static SPIRVType *deduceTypeFromUses(Register Reg, MachineFunction &MF,
   MachineRegisterInfo &MRI = MF.getRegInfo();
   for (MachineInstr &Use : MRI.use_nodbg_instructions(Reg)) {
     SPIRVType *ResType = nullptr;
+    LLVM_DEBUG(dbgs() << "Looking at use " << Use);
     switch (Use.getOpcode()) {
     case TargetOpcode::G_BUILD_VECTOR:
     case TargetOpcode::G_EXTRACT_VECTOR_ELT:
     case TargetOpcode::G_UNMERGE_VALUES:
-      LLVM_DEBUG(dbgs() << "Looking at use " << Use << "\n");
-      ResType = deduceTypeForResultRegister(&Use, Reg, GR, MIB);
+    case TargetOpcode::G_ADD:
+    case TargetOpcode::G_SUB:
+    case TargetOpcode::G_MUL:
+    case TargetOpcode::G_SDIV:
+    case TargetOpcode::G_UDIV:
+    case TargetOpcode::G_SREM:
+    case TargetOpcode::G_UREM:
+    case TargetOpcode::G_FADD:
+    case TargetOpcode::G_FSUB:
+    case TargetOpcode::G_FMUL:
+    case TargetOpcode::G_FDIV:
+    case TargetOpcode::G_FREM:
+    case TargetOpcode::G_FMA:
+      ResType = deduceTypeFromResultRegister(&Use, Reg, GR, MIB);
+      break;
+    case TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS:
+    case TargetOpcode::G_INTRINSIC: {
+      auto IntrinsicID = cast<GIntrinsic>(Use).getIntrinsicID();
+      if (IntrinsicID == Intrinsic::spv_insertelt) {
+        if (Reg == Use.getOperand(2).getReg())
+          ResType = deduceTypeFromResultRegister(&Use, Reg, GR, MIB);
+      } else if (IntrinsicID == Intrinsic::spv_extractelt) {
+        if (Reg == Use.getOperand(2).getReg())
+          ResType = deduceTypeFromResultRegister(&Use, Reg, GR, MIB);
+      }
       break;
     }
-    if (ResType)
+    }
+    if (ResType) {
+      LLVM_DEBUG(dbgs() << "Deduced type from use " << *ResType);
       return ResType;
+    }
   }
   return nullptr;
 }
diff --git a/llvm/test/CodeGen/SPIRV/legalization/load-store-global.ll b/llvm/test/CodeGen/SPIRV/legalization/load-store-global.ll
new file mode 100644
index 0000000000000..468d3ded4c306
--- /dev/null
+++ b/llvm/test/CodeGen/SPIRV/legalization/load-store-global.ll
@@ -0,0 +1,194 @@
+; RUN: llc -O0 -verify-machineinstrs -mtriple=spirv-unknown-vulkan %s -o - | FileCheck %s
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv-unknown-vulkan %s -o - -filetype=obj | spirv-val %}
+
+; CHECK-DAG: OpName %[[#test_int32_double_conversion:]] "test_int32_double_conversion"
+; CHECK-DAG: %[[#int:]] = OpTypeInt 32 0
+; CHECK-DAG: %[[#v4i32:]] = OpTypeVector %[[#int]] 4
+; CHECK-DAG: %[[#double:]] = OpTypeFloat 64
+; CHECK-DAG: %[[#v4f64:]] = OpTypeVector %[[#double]] 4
+; CHECK-DAG: %[[#v2i32:]] = OpTypeVector %[[#int]] 2
+; CHECK-DAG: %[[#ptr_private_v4i32:]] = OpTypePointer Private %[[#v4i32]]
+; CHECK-DAG: %[[#ptr_private_v4f64:]] = OpTypePointer Private %[[#v4f64]]
+; CHECK-DAG: %[[#global_double:]] = OpVariable %[[#ptr_private_v4f64]] Private
+; CHECK-DAG: %[[#C15:]] = OpConstant %[[#int]] 15{{$}}
+; CHECK-DAG: %[[#C14:]] = OpConstant %[[#int]] 14{{$}}
+; CHECK-DAG: %[[#C13:]] = OpConstant %[[#int]] 13{{$}}
+; CHECK-DAG: %[[#C12:]] = OpConstant %[[#int]] 12{{$}}
+; CHECK-DAG: %[[#C11:]] = OpConstant %[[#int]] 11{{$}}
+; CHECK-DAG: %[[#C10:]] = OpConstant %[[#int]] 10{{$}}
+; CHECK-DAG: %[[#C9:]] = OpConstant %[[#int]] 9{{$}}
+; CHECK-DAG: %[[#C8:]] = OpConstant %[[#int]] 8{{$}}
+; CHECK-DAG: %[[#C7:]] = OpConstant %[[#int]] 7{{$}}
+; CHECK-DAG: %[[#C6:]] = OpConstant %[[#int]] 6{{$}}
+; CHECK-DAG: %[[#C5:]] = OpConstant %[[#int]] 5{{$}}
+; CHECK-DAG: %[[#C4:]] = OpConstant %[[#int]] 4{{$}}
+; CHECK-DAG: %[[#C3:]] = OpConstant %[[#int]] 3{{$}}
+; CHECK-DAG: %[[#C2:]] = OpConstant %[[#int]] 2{{$}}
+; CHECK-DAG: %[[#C1:]] = OpConstant %[[#int]] 1{{$}}
+; CHECK-DAG: %[[#C0:]] = OpConstant %[[#int]] 0{{$}}
+
+@G_16 = internal addrspace(10) global [16 x i32] zeroinitializer
+@G_4_double = internal addrspace(10) global <4 x double> zeroinitializer
+@G_4_int = internal addrspace(10) global <4 x i32> zeroinitializer
+
+
+; This is the way matrices will be represented in HLSL. The memory type will be
+; an array, but it will be loaded as a vector.
+define spir_func void @test_load_store_global() {
+entry:
+; CHECK-DAG: %[[#PTR0:]] = OpAccessChain %[[#ptr_int:]] %[[#G16:]] %[[#C0]]
+; CHECK-DAG: %[[#VAL0:]] = OpLoad %[[#int]] %[[#PTR0]] Aligned 4
+; CHECK-DAG: %[[#PTR1:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C1]]
+; CHECK-DAG: %[[#VAL1:]] = OpLoad %[[#int]] %[[#PTR1]] Aligned 4
+; CHECK-DAG: %[[#PTR2:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C2]]
+; CHECK-DAG: %[[#VAL2:]] = OpLoad %[[#int]] %[[#PTR2]] Aligned 4
+; CHECK-DAG: %[[#PTR3:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C3]]
+; CHECK-DAG: %[[#VAL3:]] = OpLoad %[[#int]] %[[#PTR3]] Aligned 4
+; CHECK-DAG: %[[#PTR4:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C4]]
+; CHECK-DAG: %[[#VAL4:]] = OpLoad %[[#int]] %[[#PTR4]] Aligned 4
+; CHECK-DAG: %[[#PTR5:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C5]]
+; CHECK-DAG: %[[#VAL5:]] = OpLoad %[[#int]] %[[#PTR5]] Aligned 4
+; CHECK-DAG: %[[#PTR6:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C6]]
+; CHECK-DAG: %[[#VAL6:]] = OpLoad %[[#int]] %[[#PTR6]] Aligned 4
+; CHECK-DAG: %[[#PTR7:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C7]]
+; CHECK-DAG: %[[#VAL7:]] = OpLoad %[[#int]] %[[#PTR7]] Aligned 4
+; CHECK-DAG: %[[#PTR8:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C8]]
+; CHECK-DAG: %[[#VAL8:]] = OpLoad %[[#int]] %[[#PTR8]] Aligned 4
+; CHECK-DAG: %[[#PTR9:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C9]]
+; CHECK-DAG: %[[#VAL9:]] = OpLoad %[[#int]] %[[#PTR9]] Aligned 4
+; CHECK-DAG: %[[#PTR10:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C10]]
+; CHECK-DAG: %[[#VAL10:]] = OpLoad %[[#int]] %[[#PTR10]] Aligned 4
+; CHECK-DAG: %[[#PTR11:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C11]]
+; CHECK-DAG: %[[#VAL11:]] = OpLoad %[[#int]] %[[#PTR11]] Aligned 4
+; CHECK-DAG: %[[#PTR12:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C12]]
+; CHECK-DAG: %[[#VAL12:]] = OpLoad %[[#int]] %[[#PTR12]] Aligned 4
+; CHECK-DAG: %[[#PTR13:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C13]]
+; CHECK-DAG: %[[#VAL13:]] = OpLoad %[[#int]] %[[#PTR13]] Aligned 4
+; CHECK-DAG: %[[#PTR14:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C14]]
+; CHECK-DAG: %[[#VAL14:]] = OpLoad %[[#int]] %[[#PTR14]] Aligned 4
+; CHECK-DAG: %[[#PTR15:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C15]]
+; CHECK-DAG: %[[#VAL15:]] = OpLoad %[[#int]] %[[#PTR15]] Aligned 4
+; CHECK-DAG: %[[#INS0:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL0]] %[[#UNDEF:]] 0
+; CHECK-DAG: %[[#INS1:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL1]] %[[#INS0]] 1
+; CHECK-DAG: %[[#INS2:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL2]] %[[#INS1]] 2
+; CHECK-DAG: %[[#INS3:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL3]] %[[#INS2]] 3
+; CHECK-DAG: %[[#INS4:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL4]] %[[#UNDEF]] 0
+; CHECK-DAG: %[[#INS5:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL5]] %[[#INS4]] 1
+; CHECK-DAG: %[[#INS6:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL6]] %[[#INS5]] 2
+; CHECK-DAG: %[[#INS7:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL7]] %[[#INS6]] 3
+; CHECK-DAG: %[[#INS8:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL8]] %[[#UNDEF]] 0
+; CHECK-DAG: %[[#INS9:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL9]] %[[#INS8]] 1
+; CHECK-DAG: %[[#INS10:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL10]] %[[#INS9]] 2
+; CHECK-DAG: %[[#INS11:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL11]] %[[#INS10]] 3
+; CHECK-DAG: %[[#INS12:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL12]] %[[#UNDEF]] 0
+; CHECK-DAG: %[[#INS13:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL13]] %[[#INS12]] 1
+; CHECK-DAG: %[[#INS14:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL14]] %[[#INS13]] 2
+; CHECK-DAG: %[[#INS15:]] = OpCompositeInsert %[[#v4i32]] %[[#VAL15]] %[[#INS14]] 3
+  %0 = load <16 x i32>, ptr addrspace(10) @G_16, align 64
+ 
+; CHECK-DAG: %[[#PTR0_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C0]]
+; CHECK-DAG: %[[#VAL0_S:]] = OpCompositeExtract %[[#int]] %[[#INS3]] 0
+; CHECK-DAG: OpStore %[[#PTR0_S]] %[[#VAL0_S]] Aligned 64
+; CHECK-DAG: %[[#PTR1_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C1]]
+; CHECK-DAG: %[[#VAL1_S:]] = OpCompositeExtract %[[#int]] %[[#INS3]] 1
+; CHECK-DAG: OpStore %[[#PTR1_S]] %[[#VAL1_S]] Aligned 64
+; CHECK-DAG: %[[#PTR2_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C2]]
+; CHECK-DAG: %[[#VAL2_S:]] = OpCompositeExtract %[[#int]] %[[#INS3]] 2
+; CHECK-DAG: OpStore %[[#PTR2_S]] %[[#VAL2_S]] Aligned 64
+; CHECK-DAG: %[[#PTR3_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C3]]
+; CHECK-DAG: %[[#VAL3_S:]] = OpCompositeExtract %[[#int]] %[[#INS3]] 3
+; CHECK-DAG: OpStore %[[#PTR3_S]] %[[#VAL3_S]] Aligned 64
+; CHECK-DAG: %[[#PTR4_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C4]]
+; CHECK-DAG: %[[#VAL4_S:]] = OpCompositeExtract %[[#int]] %[[#INS7]] 0
+; CHECK-DAG: OpStore %[[#PTR4_S]] %[[#VAL4_S]] Aligned 64
+; CHECK-DAG: %[[#PTR5_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C5]]
+; CHECK-DAG: %[[#VAL5_S:]] = OpCompositeExtract %[[#int]] %[[#INS7]] 1
+; CHECK-DAG: OpStore %[[#PTR5_S]] %[[#VAL5_S]] Aligned 64
+; CHECK-DAG: %[[#PTR6_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C6]]
+; CHECK-DAG: %[[#VAL6_S:]] = OpCompositeExtract %[[#int]] %[[#INS7]] 2
+; CHECK-DAG: OpStore %[[#PTR6_S]] %[[#VAL6_S]] Aligned 64
+; CHECK-DAG: %[[#PTR7_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C7]]
+; CHECK-DAG: %[[#VAL7_S:]] = OpCompositeExtract %[[#int]] %[[#INS7]] 3
+; CHECK-DAG: OpStore %[[#PTR7_S]] %[[#VAL7_S]] Aligned 64
+; CHECK-DAG: %[[#PTR8_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C8]]
+; CHECK-DAG: %[[#VAL8_S:]] = OpCompositeExtract %[[#int]] %[[#INS11]] 0
+; CHECK-DAG: OpStore %[[#PTR8_S]] %[[#VAL8_S]] Aligned 64
+; CHECK-DAG: %[[#PTR9_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C9]]
+; CHECK-DAG: %[[#VAL9_S:]] = OpCompositeExtract %[[#int]] %[[#INS11]] 1
+; CHECK-DAG: OpStore %[[#PTR9_S]] %[[#VAL9_S]] Aligned 64
+; CHECK-DAG: %[[#PTR10_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C10]]
+; CHECK-DAG: %[[#VAL10_S:]] = OpCompositeExtract %[[#int]] %[[#INS11]] 2
+; CHECK-DAG: OpStore %[[#PTR10_S]] %[[#VAL10_S]] Aligned 64
+; CHECK-DAG: %[[#PTR11_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C11]]
+; CHECK-DAG: %[[#VAL11_S:]] = OpCompositeExtract %[[#int]] %[[#INS11]] 3
+; CHECK-DAG: OpStore %[[#PTR11_S]] %[[#VAL11_S]] Aligned 64
+; CHECK-DAG: %[[#PTR12_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C12]]
+; CHECK-DAG: %[[#VAL12_S:]] = OpCompositeExtract %[[#int]] %[[#INS15]] 0
+; CHECK-DAG: OpStore %[[#PTR12_S]] %[[#VAL12_S]] Aligned 64
+; CHECK-DAG: %[[#PTR13_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C13]]
+; CHECK-DAG: %[[#VAL13_S:]] = OpCompositeExtract %[[#int]] %[[#INS15]] 1
+; CHECK-DAG: OpStore %[[#PTR13_S]] %[[#VAL13_S]] Aligned 64
+; CHECK-DAG: %[[#PTR14_S:]] = OpAccessChain %[[#ptr_int]] %[[#G16]] %[[#C14]]
+; CHECK-DAG: %[[#VAL14_S:]] = OpCompositeExtract %[[#int]] %[[#INS15]] 2
+; CHECK-DAG: OpStore %[[#PTR14_S]] %[[#VAL14_S]] Aligned 64
+; CHECK-DAG: %[[#PTR15...
[truncated]

s-perron · 2025-12-04T14:37:20Z

@farzonl, this PR should enable the SPIR-V backend to handle some loads and stores, and it should be able to handle the element-wise vector/matrix operations.

It will still not be able to handle a variable index to get a particular element.
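
To make the constant-versus-variable index distinction concrete, here is a small sketch. `locateElement` is a hypothetical helper, not something from the patch: it shows how a compile-time index into a wide vector maps to a (sub-vector, lane) pair once the vector has been split into fixed-size pieces. A variable index has no such compile-time answer, which is one way to see why that case is harder.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: map a constant element index of a wide vector to the
// (sub-vector, lane) pair it occupies after the vector is split into pieces
// of ChunkSize elements each.
static std::pair<unsigned, unsigned> locateElement(unsigned Index,
                                                   unsigned ChunkSize) {
  assert(ChunkSize > 0);
  return {Index / ChunkSize, Index % ChunkSize};
}
```

For a `<16 x i32>` split into 4-element pieces, element 4 lands in lane 0 of the second piece, which matches the test's `OpCompositeExtract ... %[[#INS7]] 0` check for the fifth stored value.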

Keenuts · 2025-12-09T14:20:35Z

llvm/test/CodeGen/SPIRV/legalization/load-store-global.ll

+; CHECK-DAG: %[[#C1:]] = OpConstant %[[#int]] 1{{$}}
+; CHECK-DAG: %[[#C0:]] = OpConstant %[[#int]] 0{{$}}
+
+@G_16 = internal addrspace(10) global [16 x i32] zeroinitializer


Do you know what the Align actually implies in SPIR-V stores?

Looking at:

@var = internal addrspace(10) global [5 x double] zeroinitializer
%tmp = load <5 x double>, ptr addrspace(10) @var
store <5 x double> %tmp, ptr addrspace(10) @var

We get 5 OpAccessChain, with 5 loads Aligned 8, one for each double.
But at the store, we have:

%37 = OpAccessChain %_ptr_Private_double %var %uint_0
%38 = OpCompositeExtract %double %tmp1 0
OpStore %37 %38 Aligned 64
%39 = OpAccessChain %_ptr_Private_double %var %uint_1
%40 = OpCompositeExtract %double %tmp1 1
OpStore %39 %40 Aligned 8
%41 = OpAccessChain %_ptr_Private_double %var %uint_2
%42 = OpCompositeExtract %double %tmp1 2
OpStore %41 %42 Aligned 16
%43 = OpAccessChain %_ptr_Private_double %var %uint_3
%44 = OpCompositeExtract %double %tmp1 3
OpStore %43 %44 Aligned 8
%45 = OpAccessChain %_ptr_Private_double %var %uint_4
%46 = OpCompositeExtract %double %tmp2 0
OpStore %45 %46 Aligned 32

s-perron · 2025-12-09

The alignment is a guarantee of a minimum alignment. The alignments on the loads could probably be improved. As long as there are no regressions, we can handle more cases in a follow-up PR.
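
One way to read the store alignments quoted above: each value is the largest power of two that is guaranteed to divide the element's address, given the alignment of the whole access and the element's byte offset. The sketch below assumes a base alignment of 64 bytes (inferred from the first `OpStore`, not stated in the thread) and 8-byte doubles. `alignmentAtOffset` is an illustrative standalone function, not the backend's code.

```cpp
#include <cassert>
#include <cstdint>

// Minimum alignment guaranteed at BaseAddress + Offset when BaseAddress is
// aligned to BaseAlign (a power of two): the smaller of BaseAlign and the
// lowest set bit of Offset. An offset of 0 keeps the base alignment.
static uint64_t alignmentAtOffset(uint64_t BaseAlign, uint64_t Offset) {
  if (Offset == 0)
    return BaseAlign;
  uint64_t LowBit = Offset & (~Offset + 1); // lowest set bit of Offset
  return LowBit < BaseAlign ? LowBit : BaseAlign;
}
```

Under those assumptions, byte offsets 0, 8, 16, 24 and 32 give 64, 8, 16, 8 and 32, the same sequence as the `Aligned` operands in the five stores.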

github-actions · 2025-12-10T21:21:14Z

🐧 Linux x64 Test Results

167005 tests passed
2929 tests skipped
1 test failed

Failed Tests

LLVM

LLVM.CodeGen/SPIRV/legalization/vector-arithmetic-6.ll

Exit Code: 2

Command Output (stdout):
--
# RUN: at line 1
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -O0 -verify-machineinstrs -mtriple=spirv-unknown-vulkan /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic-6.ll -o - | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic-6.ll
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -O0 -verify-machineinstrs -mtriple=spirv-unknown-vulkan /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic-6.ll -o -
# .---command stderr------------
# | LLVM ERROR: unable to legalize instruction: %70:vid(<6 x s32>) = G_UREM %69:vid, %68:vid (in function: test_int_vector_arithmetic)
# `-----------------------------
# error: command failed with exit status: 1
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic-6.ll
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic-6.ll
# `-----------------------------
# error: command failed with exit status: 2

--

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.

github-actions · 2025-12-10T21:21:14Z

🪟 Windows x64 Test Results

128523 tests passed
2805 tests skipped
1 test failed

Failed Tests

LLVM

LLVM.CodeGen/SPIRV/legalization/vector-arithmetic-6.ll

Exit Code: 2

Command Output (stdout):
--
# RUN: at line 1
c:\_work\llvm-project\llvm-project\build\bin\llc.exe -O0 -verify-machineinstrs -mtriple=spirv-unknown-vulkan C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\SPIRV\legalization\vector-arithmetic-6.ll -o - | c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\SPIRV\legalization\vector-arithmetic-6.ll
# executed command: 'c:\_work\llvm-project\llvm-project\build\bin\llc.exe' -O0 -verify-machineinstrs -mtriple=spirv-unknown-vulkan 'C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\SPIRV\legalization\vector-arithmetic-6.ll' -o -
# .---command stderr------------
# | LLVM ERROR: unable to legalize instruction: %70:vid(<6 x s32>) = G_UREM %69:vid, %68:vid (in function: test_int_vector_arithmetic)
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
# | Stack dump:
# | 0.	Program arguments: c:\\_work\\llvm-project\\llvm-project\\build\\bin\\llc.exe -O0 -verify-machineinstrs -mtriple=spirv-unknown-vulkan C:\\_work\\llvm-project\\llvm-project\\llvm\\test\\CodeGen\\SPIRV\\legalization\\vector-arithmetic-6.ll -o -
# | 1.	Running pass 'Function Pass Manager' on module 'C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\SPIRV\legalization\vector-arithmetic-6.ll'.
# | 2.	Running pass 'Legalizer' on function '@test_int_vector_arithmetic'
# `-----------------------------
# error: command failed with exit status: 1
# executed command: 'c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe' 'C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\SPIRV\legalization\vector-arithmetic-6.ll'
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe C:\_work\llvm-project\llvm-project\llvm\test\CodeGen\SPIRV\legalization\vector-arithmetic-6.ll
# `-----------------------------
# error: command failed with exit status: 2

--

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.

s-perron requested review from Keenuts and luciechoi December 4, 2025 14:34

llvmbot added the backend:SPIR-V label Dec 4, 2025

s-perron requested a review from farzonl December 4, 2025 14:34

Keenuts reviewed Dec 8, 2025

llvm/test/CodeGen/SPIRV/legalization/load-store-global.ll (outdated)

VyacheslavLevytskyy reviewed Dec 8, 2025

llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp (outdated)

s-perron added 2 commits December 8, 2025 15:44

Add s64 to all scalars.

92f96f7

Fix alignment in stores generated in legalize pointer cast.

c6c4a27

s-perron requested review from Keenuts and VyacheslavLevytskyy December 8, 2025 21:08

Keenuts reviewed Dec 9, 2025

Set all def to default type in post legalization.

0760562

farzonl reviewed Dec 9, 2025

llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic.ll (outdated)

farzonl reviewed Dec 9, 2025

llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic.ll (outdated)

farzonl reviewed Dec 9, 2025

llvm/test/CodeGen/SPIRV/legalization/vector-arithmetic.ll

farzonl reviewed Dec 9, 2025

llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp (outdated)

s-perron added 4 commits December 10, 2025 14:36

Update tests, and add tests for 6-element vectors

740bf6e

Handle G_STRICT_FMA

64afb6c

Scalarize rem and div with illegal vector size that is not a power of 2.

9ef180b

Merge branch 'main' into load-vec-from-array

e503f4d
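
The rules in the diff, together with the commit titled "Scalarize rem and div with illegal vector size that is not a power of 2", suggest roughly the following decision table. This is a simplified reading, not the legalizer's actual implementation: `Strategy` and `chooseStrategy` are hypothetical names, and the sketch ignores element types entirely.

```cpp
#include <cassert>

// Hypothetical labels for the outcomes discussed in this PR.
enum class Strategy { Legal, Split, WidenThenSplit, Scalarize };

static bool isPow2(unsigned N) { return N != 0 && (N & (N - 1)) == 0; }

// Simplified model: choose how an element-wise operation on a vector of
// NumElements lanes is handled, given the target's maximum vector size.
static Strategy chooseStrategy(unsigned NumElements, unsigned MaxVectorSize,
                               bool IsDivOrRem) {
  // Sizes up to 4, or power-of-two sizes within the limit, are kept as-is.
  if (NumElements <= MaxVectorSize &&
      (NumElements <= 4 || isPow2(NumElements)))
    return Strategy::Legal;
  // Power-of-two sizes above the limit are split into legal pieces.
  if (isPow2(NumElements))
    return Strategy::Split;
  // Other sizes: division and remainder are scalarized in this model;
  // everything else is widened to the next power of two, then split.
  return IsDivOrRem ? Strategy::Scalarize : Strategy::WidenThenSplit;
}
```

In this model the `<6 x s32>` G_UREM from the CI failure above, compiled for a shader target (limit 4), falls into the scalarize case, while a 6-element add would be widened and then split.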
