Skip to content

[VPlan] Return invalid cost if any skeleton block has invalid costs. #151940

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 7, 2025

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Aug 4, 2025

We need to reject plans that contain recipes with invalid costs. LICM can move recipes with invalid costs out of the loop region, which then get missed by the main cost computation.

Extend the logic to check recipes for invalid cost currently only covering the middle block to include all skeleton blocks.

Fixes #144358
Fixes #151664

We need to reject plans that contain recipes with invalid costs. LICM
can move recipes with invalid costs out of the loop region, which then
get missed by the main cost computation.

Extend the logic to check recipes for invalid cost currently only
covering the middle block to include all skeleton blocks.

Fixes llvm#144358
Fixes llvm#151664
@llvmbot
Copy link
Member

llvmbot commented Aug 4, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-vectorizers

Author: Florian Hahn (fhahn)

Changes

We need to reject plans that contain recipes with invalid costs. LICM can move recipes with invalid costs out of the loop region, which then get missed by the main cost computation.

Extend the logic to check recipes for invalid cost currently only covering the middle block to include all skeleton blocks.

Fixes #144358
Fixes #151664


Full diff: https://github.com/llvm/llvm-project/pull/151940.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlan.cpp (+7-3)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/pr151664-cost-hoisted-vector-scalable.ll (+11-30)
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index 2138b4154d2c2..9cc62247ee553 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -1072,9 +1072,13 @@ InstructionCost VPlan::cost(ElementCount VF, VPCostContext &Ctx) {
   // blocks, like the preheader or middle blocks.
   InstructionCost Cost = getVectorLoopRegion()->cost(VF, Ctx);
 
-  // If any instructions in the middle block are invalid return invalid.
-  // TODO: Remove once no VPlans with VF == vscale x 1 and first-order recurrences are created.
-  if (!getMiddleBlock()->cost(VF, Ctx).isValid())
+  // If any instructions in the skeleton outside loop regions are invalid return
+  // invalid.
+  if (any_of(VPBlockUtils::blocksOnly<VPBasicBlock>(
+                 vp_depth_first_shallow(getEntry())),
+             [&VF, &Ctx](VPBasicBlock *VPBB) {
+               return !VPBB->cost(VF, Ctx).isValid();
+             }))
     return InstructionCost::getInvalid();
 
   return Cost;
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/pr151664-cost-hoisted-vector-scalable.ll b/llvm/test/Transforms/LoopVectorize/AArch64/pr151664-cost-hoisted-vector-scalable.ll
index 8495deea31e8b..b4df63dd08fd6 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/pr151664-cost-hoisted-vector-scalable.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/pr151664-cost-hoisted-vector-scalable.ll
@@ -1,47 +1,28 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals none --filter-out-after "^scalar.ph" --version 5
-; REQUIRES: asserts
-; RUN: opt -passes=loop-vectorize -mtriple=aarch64 -mattr=+sve -S \
-; RUN:   -debug-only=loop-vectorize %s 2>&1 | FileCheck %s
+; RUN: opt -passes=loop-vectorize -mtriple=aarch64 -mattr=+sve -S %s | FileCheck %s
 
-; FIXME: Hoisted vector code should be costed with scalable cost.
-; In this example, `<vscale x 4 x float> @llvm.minimumnum` has an invalid cost,
-; and hence should not be produced by LoopVectorize.
-
-; CHECK: LV: Found an estimated cost of Invalid for VF vscale x 4 For instruction:   %res = tail call float @llvm.minimumnum.f32(float %arg, float 0.000000e+00)
 define void @cost_hoisted_vector_code(ptr %p, float %arg) {
 ; CHECK-LABEL: define void @cost_hoisted_vector_code(
 ; CHECK-SAME: ptr [[P:%.*]], float [[ARG:%.*]]) #[[ATTR0:[0-9]+]] {
 ; CHECK-NEXT:  [[ENTRY:.*:]]
-; CHECK-NEXT:    [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT:    [[TMP1:%.*]] = mul nuw i64 [[TMP0]], 8
 ; CHECK-NEXT:    br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
 ; CHECK:       [[VECTOR_PH]]:
-; CHECK-NEXT:    [[TMP2:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT:    [[TMP3:%.*]] = mul nuw i64 [[TMP2]], 8
-; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 -1, [[TMP3]]
-; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 -1, [[N_MOD_VF]]
-; CHECK-NEXT:    [[TMP4:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT:    [[TMP5:%.*]] = mul nuw i64 [[TMP4]], 8
-; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <vscale x 4 x float> poison, float [[ARG]], i64 0
-; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <vscale x 4 x float> [[BROADCAST_SPLATINSERT]], <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer
-; CHECK-NEXT:    [[TMP6:%.*]] = add i64 1, [[N_VEC]]
-; CHECK-NEXT:    [[TMP7:%.*]] = call <vscale x 4 x float> @llvm.minimumnum.nxv4f32(<vscale x 4 x float> [[BROADCAST_SPLAT]], <vscale x 4 x float> zeroinitializer)
+; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x float> poison, float [[ARG]], i64 0
+; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x float> [[BROADCAST_SPLATINSERT]], <4 x float> poison, <4 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP0:%.*]] = call <4 x float> @llvm.minimumnum.v4f32(<4 x float> [[BROADCAST_SPLAT]], <4 x float> zeroinitializer)
 ; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
 ; CHECK:       [[VECTOR_BODY]]:
 ; CHECK-NEXT:    [[INDEX1:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[INDEX:%.*]] = add i64 1, [[INDEX1]]
 ; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr float, ptr [[P]], i64 [[INDEX]]
-; CHECK-NEXT:    [[TMP9:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT:    [[TMP10:%.*]] = mul nuw i64 [[TMP9]], 4
-; CHECK-NEXT:    [[TMP11:%.*]] = getelementptr float, ptr [[TMP8]], i64 [[TMP10]]
-; CHECK-NEXT:    store <vscale x 4 x float> [[TMP7]], ptr [[TMP8]], align 4
-; CHECK-NEXT:    store <vscale x 4 x float> [[TMP7]], ptr [[TMP11]], align 4
-; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX1]], [[TMP5]]
-; CHECK-NEXT:    [[TMP12:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; CHECK-NEXT:    br i1 [[TMP12]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr float, ptr [[TMP8]], i32 4
+; CHECK-NEXT:    store <4 x float> [[TMP0]], ptr [[TMP8]], align 4
+; CHECK-NEXT:    store <4 x float> [[TMP0]], ptr [[TMP2]], align 4
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX1]], 8
+; CHECK-NEXT:    [[TMP3:%.*]] = icmp eq i64 [[INDEX_NEXT]], -8
+; CHECK-NEXT:    br i1 [[TMP3]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
 ; CHECK:       [[MIDDLE_BLOCK]]:
-; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 -1, [[N_VEC]]
-; CHECK-NEXT:    br i1 [[CMP_N]], [[EXIT:label %.*]], label %[[SCALAR_PH]]
+; CHECK-NEXT:    br label %[[SCALAR_PH]]
 ; CHECK:       [[SCALAR_PH]]:
 ;
 entry:

Copy link
Contributor

@artagnon artagnon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for fixing this!

// invalid.
if (any_of(VPBlockUtils::blocksOnly<VPBasicBlock>(
vp_depth_first_shallow(getEntry())),
[&VF, &Ctx](VPBasicBlock *VPBB) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[&VF, &Ctx](VPBasicBlock *VPBB) {
[&VF, &Ctx](const VPBasicBlock *VPBB) {

Nit. I hope cost() is marked cost?

Copy link
Collaborator

@ayalz ayalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of post-approval nits, somewhat independent.

@@ -1072,9 +1072,13 @@ InstructionCost VPlan::cost(ElementCount VF, VPCostContext &Ctx) {
// blocks, like the preheader or middle blocks.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment deserves updating?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to clarify that other blocks are checked for recipes with invalid costs, thanks.

@@ -1072,9 +1072,13 @@ InstructionCost VPlan::cost(ElementCount VF, VPCostContext &Ctx) {
// blocks, like the preheader or middle blocks.
InstructionCost Cost = getVectorLoopRegion()->cost(VF, Ctx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better bail out first if invalid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, added a check to the if and updated the comment, thanks!

@fhahn fhahn merged commit 95c32bf into llvm:main Aug 7, 2025
8 of 9 checks passed
@fhahn fhahn deleted the vplan-cost-check-skeleton-for-invalid branch August 7, 2025 09:45
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 7, 2025
…lid costs. (#151940)

We need to reject plans that contain recipes with invalid costs. LICM
can move recipes with invalid costs out of the loop region, which then
get missed by the main cost computation.

Extend the logic to check recipes for invalid cost currently only
covering the middle block to include all skeleton blocks.

Fixes llvm/llvm-project#144358
Fixes llvm/llvm-project#151664

PR: llvm/llvm-project#151940
Comment on lines +1063 to +1067
if (!Cost.isValid() || any_of(VPBlockUtils::blocksOnly<VPBasicBlock>(
vp_depth_first_shallow(getEntry())),
[&VF, &Ctx](VPBasicBlock *VPBB) {
return !VPBB->cost(VF, Ctx).isValid();
}))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth checking the any_of() part and potentially bailing out before computing the Cost of the vector loop?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[LV] Hoisted vector code is costed without vscale [Aarch64][Codegen] fatal error: Invalid size request on a scalable vector
4 participants