@RKSimon RKSimon commented Jan 31, 2025

As noted on #124499, cost handling for the strided load/store intrinsics is currently missing from the type-only analysis, which was falling back to scalarization for fixed vectors (and failing entirely for scalable vectors).
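
The fallback described above can be pictured with a small toy model. This is only a sketch under stated assumptions: every name below (`ToyIntrinsic`, `ToyVectorType`, `toyTypeBasedCost`, and so on) is hypothetical and not an LLVM API, and the numbers are arbitrary rather than real RISC-V costs. It only shows why a missing `case` in a type-only dispatch yields a scalarized cost for fixed vectors and an invalid cost for scalable ones.

```cpp
#include <cassert>
#include <optional>

// Toy stand-ins for illustration only; none of these names exist in LLVM.
enum class ToyIntrinsic { StridedLoad, StridedStore, Other };

struct ToyVectorType {
  unsigned MinElts; // element count (minimum count for scalable vectors)
  bool Scalable;    // true for a <vscale x N x T> style type
};

// std::nullopt models an "Invalid cost" result.
using ToyCost = std::optional<unsigned>;

// Hypothetical target hook: one cost unit per (minimum) element.
inline ToyCost toyStridedMemoryOpCost(const ToyVectorType &Ty) {
  return Ty.MinElts;
}

// Hypothetical generic fallback: scalarize fixed vectors, give up on
// scalable ones because the lane count is unknown at compile time.
inline ToyCost toyScalarizedCost(const ToyVectorType &Ty) {
  if (Ty.Scalable)
    return std::nullopt;
  return Ty.MinElts * 5; // arbitrary per-lane cost, assumption only
}

// Type-only dispatch. HasStridedCase toggles between the behaviour before
// (false) and after (true) a dedicated strided case is added.
inline ToyCost toyTypeBasedCost(ToyIntrinsic ID, const ToyVectorType &Ty,
                                bool HasStridedCase) {
  assert(Ty.MinElts > 0 && "vector types have at least one element");
  switch (ID) {
  case ToyIntrinsic::StridedLoad:
  case ToyIntrinsic::StridedStore:
    if (HasStridedCase)
      return toyStridedMemoryOpCost(Ty);
    break; // no dedicated case: fall through to the generic path
  case ToyIntrinsic::Other:
    break;
  }
  return toyScalarizedCost(Ty);
}
```

In this toy model, calling `toyTypeBasedCost(ToyIntrinsic::StridedLoad, ToyVectorType{4, true}, false)` yields no value, mirroring the "Invalid cost" lines in the old test expectations, while passing `true` for the last argument routes both fixed and scalable types through the strided hook.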

… load/store intrinsics

@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jan 31, 2025

llvmbot commented Jan 31, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Simon Pilgrim (RKSimon)

Patch is 20.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125223.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+14)
  • (modified) llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll (+44-44)
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 596db392392131..9571bd9330de6c 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -2204,6 +2204,20 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
       return thisT()->getMaskedMemoryOpCost(Instruction::Load, Ty, TyAlign, 0,
                                             CostKind);
     }
+    case Intrinsic::experimental_vp_strided_store: {
+      auto *Ty = cast<VectorType>(ICA.getArgTypes()[0]);
+      Align Alignment = thisT()->DL.getABITypeAlign(Ty->getElementType());
+      return thisT()->getStridedMemoryOpCost(
+          Instruction::Store, Ty, /*Ptr=*/nullptr, /*VariableMask=*/true,
+          Alignment, CostKind, ICA.getInst());
+    }
+    case Intrinsic::experimental_vp_strided_load: {
+      auto *Ty = cast<VectorType>(ICA.getReturnType());
+      Align Alignment = thisT()->DL.getABITypeAlign(Ty->getElementType());
+      return thisT()->getStridedMemoryOpCost(
+          Instruction::Load, Ty, /*Ptr=*/nullptr, /*VariableMask=*/true,
+          Alignment, CostKind, ICA.getInst());
+    }
     case Intrinsic::vector_reduce_add:
     case Intrinsic::vector_reduce_mul:
     case Intrinsic::vector_reduce_and:
diff --git a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
index 0245a0f7ee6cbc..7c5197b283fbab 100644
--- a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
@@ -1482,30 +1482,30 @@ define void @strided_load() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPEBASED-LABEL: 'strided_load'
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %ti1_2 = call <2 x i1> @llvm.experimental.vp.strided.load.v2i1.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 25 for instruction: %ti1_4 = call <4 x i1> @llvm.experimental.vp.strided.load.v4i1.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 49 for instruction: %ti1_8 = call <8 x i1> @llvm.experimental.vp.strided.load.v8i1.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 97 for instruction: %ti1_16 = call <16 x i1> @llvm.experimental.vp.strided.load.v16i1.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %t0 = call <2 x i8> @llvm.experimental.vp.strided.load.v2i8.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %t2 = call <4 x i8> @llvm.experimental.vp.strided.load.v4i8.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %t4 = call <8 x i8> @llvm.experimental.vp.strided.load.v8i8.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %t6 = call <16 x i8> @llvm.experimental.vp.strided.load.v16i8.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %t8.a = call <2 x i64> @llvm.experimental.vp.strided.load.v2i64.p0.i64(ptr align 8 undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: %t10.a = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.p0.i64(ptr align 8 undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: %t13.a = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.p0.i64(ptr align 8 undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: %t15.a = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr align 8 undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %t8 = call <2 x i64> @llvm.experimental.vp.strided.load.v2i64.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: %t10 = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: %t13 = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: %t15 = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t17 = call <vscale x 2 x i8> @llvm.experimental.vp.strided.load.nxv2i8.p0.i64(ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t19 = call <vscale x 4 x i8> @llvm.experimental.vp.strided.load.nxv4i8.p0.i64(ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t21 = call <vscale x 8 x i8> @llvm.experimental.vp.strided.load.nxv8i8.p0.i64(ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t23 = call <vscale x 16 x i8> @llvm.experimental.vp.strided.load.nxv16i8.p0.i64(ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t25 = call <vscale x 2 x i64> @llvm.experimental.vp.strided.load.nxv2i64.p0.i64(ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t27 = call <vscale x 4 x i64> @llvm.experimental.vp.strided.load.nxv4i64.p0.i64(ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t29 = call <vscale x 8 x i64> @llvm.experimental.vp.strided.load.nxv8i64.p0.i64(ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t31 = call <vscale x 16 x i64> @llvm.experimental.vp.strided.load.nxv16i64.p0.i64(ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %ti1_2 = call <2 x i1> @llvm.experimental.vp.strided.load.v2i1.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %ti1_4 = call <4 x i1> @llvm.experimental.vp.strided.load.v4i1.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %ti1_8 = call <8 x i1> @llvm.experimental.vp.strided.load.v8i1.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 128 for instruction: %ti1_16 = call <16 x i1> @llvm.experimental.vp.strided.load.v16i1.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %t0 = call <2 x i8> @llvm.experimental.vp.strided.load.v2i8.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t2 = call <4 x i8> @llvm.experimental.vp.strided.load.v4i8.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t4 = call <8 x i8> @llvm.experimental.vp.strided.load.v8i8.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t6 = call <16 x i8> @llvm.experimental.vp.strided.load.v16i8.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %t8.a = call <2 x i64> @llvm.experimental.vp.strided.load.v2i64.p0.i64(ptr align 8 undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t10.a = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.p0.i64(ptr align 8 undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t13.a = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.p0.i64(ptr align 8 undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t15.a = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr align 8 undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %t8 = call <2 x i64> @llvm.experimental.vp.strided.load.v2i64.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t10 = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t13 = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t15 = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t17 = call <vscale x 2 x i8> @llvm.experimental.vp.strided.load.nxv2i8.p0.i64(ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t19 = call <vscale x 4 x i8> @llvm.experimental.vp.strided.load.nxv4i8.p0.i64(ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t21 = call <vscale x 8 x i8> @llvm.experimental.vp.strided.load.nxv8i8.p0.i64(ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %t23 = call <vscale x 16 x i8> @llvm.experimental.vp.strided.load.nxv16i8.p0.i64(ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t25 = call <vscale x 2 x i64> @llvm.experimental.vp.strided.load.nxv2i64.p0.i64(ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t27 = call <vscale x 4 x i64> @llvm.experimental.vp.strided.load.nxv4i64.p0.i64(ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t29 = call <vscale x 8 x i64> @llvm.experimental.vp.strided.load.nxv8i64.p0.i64(ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %t31 = call <vscale x 16 x i64> @llvm.experimental.vp.strided.load.nxv16i64.p0.i64(ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
 ; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   %ti1_2 = call <2 x i1> @llvm.experimental.vp.strided.load.v2i1.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
@@ -1560,26 +1560,26 @@ define void @strided_store() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPEBASED-LABEL: 'strided_store'
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.experimental.vp.strided.store.v2i8.p0.i64(<2 x i8> undef, ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: call void @llvm.experimental.vp.strided.store.v4i8.p0.i64(<4 x i8> undef, ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: call void @llvm.experimental.vp.strided.store.v8i8.p0.i64(<8 x i8> undef, ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: call void @llvm.experimental.vp.strided.store.v16i8.p0.i64(<16 x i8> undef, ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> undef, ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: call void @llvm.experimental.vp.strided.store.v4i64.p0.i64(<4 x i64> undef, ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: call void @llvm.experimental.vp.strided.store.v8i64.p0.i64(<8 x i64> undef, ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: call void @llvm.experimental.vp.strided.store.v16i64.p0.i64(<16 x i64> undef, ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> undef, ptr align 8 undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: call void @llvm.experimental.vp.strided.store.v4i64.p0.i64(<4 x i64> undef, ptr align 8 undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: call void @llvm.experimental.vp.strided.store.v8i64.p0.i64(<8 x i64> undef, ptr align 8 undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: call void @llvm.experimental.vp.strided.store.v16i64.p0.i64(<16 x i64> undef, ptr align 8 undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv2i8.p0.i64(<vscale x 2 x i8> undef, ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv4i8.p0.i64(<vscale x 4 x i8> undef, ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv8i8.p0.i64(<vscale x 8 x i8> undef, ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv16i8.p0.i64(<vscale x 16 x i8> undef, ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv2i64.p0.i64(<vscale x 2 x i64> undef, ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv4i64.p0.i64(<vscale x 4 x i64> undef, ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv8i64.p0.i64(<vscale x 8 x i64> undef, ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv16i64.p0.i64(<vscale x 16 x i64> undef, ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.experimental.vp.strided.store.v2i8.p0.i64(<2 x i8> undef, ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.v4i8.p0.i64(<4 x i8> undef, ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.v8i8.p0.i64(<8 x i8> undef, ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.v16i8.p0.i64(<16 x i8> undef, ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> undef, ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.v4i64.p0.i64(<4 x i64> undef, ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.v8i64.p0.i64(<8 x i64> undef, ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.v16i64.p0.i64(<16 x i64> undef, ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> undef, ptr align 8 undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.v4i64.p0.i64(<4 x i64> undef, ptr align 8 undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.v8i64.p0.i64(<8 x i64> undef, ptr align 8 undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.v16i64.p0.i64(<16 x i64> undef, ptr align 8 undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.nxv2i8.p0.i64(<vscale x 2 x i8> undef, ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.nxv4i8.p0.i64(<vscale x 4 x i8> undef, ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.nxv8i8.p0.i64(<vscale x 8 x i8> undef, ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.experimental.vp.strided.store.nxv16i8.p0.i64(<vscale x 16 x i8> undef, ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.nxv2i64.p0.i64(<vscale x 2 x i64> undef, ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.nxv4i64.p0.i64(<vscale x 4 x i64> undef, ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.nxv8i64.p0.i64(<vscale x 8 x i64> undef, ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.experimental.vp.strided.store.nxv16i64.p0.i64(<vscale x 16 x i64> undef, ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
 ; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   call void @llvm.experimental.vp.strided.store.v2i8.i64(<2 x i8> undef, ptr undef, ...
[truncated]

@preames preames left a comment

LGTM

@RKSimon RKSimon merged commit 48a66e9 into llvm:main Jan 31, 2025
10 checks passed
@RKSimon RKSimon deleted the costmodel-strided-memory branch January 31, 2025 16:06
RKSimon added a commit to RKSimon/llvm-project that referenced this pull request Jan 31, 2025
…BASED + TYPEBASED test coverage

Inspired by llvm#125223 - helps identify when the cost models are relying on arg data (or failures in getTypeBasedIntrinsicInstrCost)

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-devrel-x86-64 running on ml-opt-devrel-x86-64-b1 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/175/builds/12482

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/ml-opt-devrel-x86-64-b1/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /b/ml-opt-devrel-x86-64-b1/build/bin/FileCheck /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/ml-opt-devrel-x86-64-b1/build/bin/FileCheck /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/ml-opt-devrel-x86-64-b1/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
/b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

RKSimon added a commit that referenced this pull request Jan 31, 2025
…ling for strided load/store intrinsics (#125223)"

Investigating build bot failures (I think due to some other recent reversions).

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder clang-debian-cpp20 running on clang-debian-cpp20 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/108/builds/8802

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/FileCheck /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/FileCheck /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
/vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...


llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder clang-x86_64-debian-fast running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/17644

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder llvm-x86_64-debian-dylib running on gribozavr4 while building llvm at step 7 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/60/builds/18518

Here is the relevant piece of the build log for reference:
Step 7 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-x86_64-debian-dylib/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/1/llvm-x86_64-debian-dylib/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-debian running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/13053

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-ubuntu-fast running on as-builder-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/33/builds/10634

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci

llvm-ci commented Feb 1, 2025

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/21568

Here is the relevant piece of the build log for reference:
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /build/buildbot/premerge-monolithic-linux/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /build/buildbot/premerge-monolithic-linux/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

RKSimon added a commit that referenced this pull request Feb 1, 2025
… load/store intrinsics (#125223) (REAPPLIED)

As noted on #124499 - this is currently missing for type-only analysis and was falling back to scalarization for fixed vectors (and failing entirely for scalable vectors)
RKSimon added a commit to RKSimon/llvm-project that referenced this pull request Feb 1, 2025
…BASED + TYPEBASED test coverage

Inspired by llvm#125223 - helps identify when the cost models are relying on arg data (or failures in getTypeBasedIntrinsicInstrCost)
RKSimon added a commit that referenced this pull request Feb 1, 2025
…BASED + TYPEBASED test coverage (#125245)

Inspired by #125223 - helps identify when the cost models are relying on arg data (or failures in getTypeBasedIntrinsicInstrCost)
RKSimon added a commit that referenced this pull request Feb 3, 2025
…e cost analysis. (#124129) (REAPPLIED)

We were only constructing the IntrinsicCostAttributes with the arg type info, and not the args themselves, preventing more detailed cost analysis (constant / uniform args etc.)

Just pass the whole IntrinsicInst to the constructor and let it resolve everything it can.

Noticed while having yet another attempt at #63980

Reapplied cleanup now that #125223 and #124984 have landed.
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…e cost analysis. (llvm#124129) (REAPPLIED)

We were only constructing the IntrinsicCostAttributes with the arg type info, and not the args themselves, preventing more detailed cost analysis (constant / uniform args etc.)

Just pass the whole IntrinsicInst to the constructor and let it resolve everything it can.

Noticed while having yet another attempt at llvm#63980

Reapplied cleanup now that llvm#125223 and llvm#124984 have landed.