[VPlan] Account for dead FOR splice simplification in cost model #131486
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,61 @@ | ||||||||||
| ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 | ||||||||||
|
||||||||||
| ; RUN: opt -p loop-vectorize -S %s | FileCheck %s | ||||||||||
|
|
||||||||||
| ; Make sure the legacy cost model doesn't add a cost for a splice when the | ||||||||||
| ; first-order recurrence isn't used inside the loop. The VPlan cost model | ||||||||||
| ; eliminates the dead VPInstruction::FirstOrderRecurrenceSplice so the two cost | ||||||||||
| ; models would go out of sync otherwise. | ||||||||||
|
|
||||||||||
| target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" | ||||||||||
|
||||||||||
| target triple = "x86_64" | ||||||||||
|
|
||||||||||
| define void @h() { | ||||||||||
| ; CHECK-LABEL: define void @h() { | ||||||||||
| ; CHECK-NEXT: [[ENTRY:.*]]: | ||||||||||
| ; CHECK-NEXT: br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]] | ||||||||||
| ; CHECK: [[VECTOR_PH]]: | ||||||||||
| ; CHECK-NEXT: br label %[[VECTOR_BODY:.*]] | ||||||||||
| ; CHECK: [[VECTOR_BODY]]: | ||||||||||
| ; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ] | ||||||||||
| ; CHECK-NEXT: [[VECTOR_RECUR:%.*]] = phi <4 x i32> [ <i32 poison, i32 poison, i32 poison, i32 0>, %[[VECTOR_PH]] ], [ [[STEP_ADD:%.*]], %[[VECTOR_BODY]] ] | ||||||||||
| ; CHECK-NEXT: [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, %[[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], %[[VECTOR_BODY]] ] | ||||||||||
| ; CHECK-NEXT: [[STEP_ADD]] = add <4 x i32> [[VEC_IND]], splat (i32 4) | ||||||||||
| ; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 8 | ||||||||||
| ; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i32> [[STEP_ADD]], splat (i32 4) | ||||||||||
| ; CHECK-NEXT: [[TMP0:%.*]] = icmp eq i32 [[INDEX_NEXT]], 40 | ||||||||||
| ; CHECK-NEXT: br i1 [[TMP0]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]] | ||||||||||
| ; CHECK: [[MIDDLE_BLOCK]]: | ||||||||||
| ; CHECK-NEXT: [[VECTOR_RECUR_EXTRACT:%.*]] = extractelement <4 x i32> [[STEP_ADD]], i32 3 | ||||||||||
| ; CHECK-NEXT: br i1 false, label %[[F_EXIT:.*]], label %[[SCALAR_PH]] | ||||||||||
| ; CHECK: [[SCALAR_PH]]: | ||||||||||
| ; CHECK-NEXT: [[SCALAR_RECUR_INIT:%.*]] = phi i32 [ [[VECTOR_RECUR_EXTRACT]], %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ] | ||||||||||
| ; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i32 [ 40, %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ] | ||||||||||
| ; CHECK-NEXT: br label %[[FOR_COND_I:.*]] | ||||||||||
| ; CHECK: [[FOR_COND_I]]: | ||||||||||
| ; CHECK-NEXT: [[D_0_I:%.*]] = phi i32 [ [[SCALAR_RECUR_INIT]], %[[SCALAR_PH]] ], [ [[E_0_I:%.*]], %[[FOR_COND_I]] ] | ||||||||||
| ; CHECK-NEXT: [[E_0_I]] = phi i32 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[INC_I:%.*]], %[[FOR_COND_I]] ] | ||||||||||
| ; CHECK-NEXT: [[INC_I]] = add i32 [[E_0_I]], 1 | ||||||||||
| ; CHECK-NEXT: [[EXITCOND_NOT_I:%.*]] = icmp eq i32 [[E_0_I]], 43 | ||||||||||
| ; CHECK-NEXT: br i1 [[EXITCOND_NOT_I]], label %[[F_EXIT]], label %[[FOR_COND_I]], !llvm.loop [[LOOP3:![0-9]+]] | ||||||||||
| ; CHECK: [[F_EXIT]]: | ||||||||||
| ; CHECK-NEXT: ret void | ||||||||||
| ; | ||||||||||
| entry: | ||||||||||
| br label %for.cond.i | ||||||||||
|
|
||||||||||
| for.cond.i: | ||||||||||
|
||||||||||
Suggested change:
| - for.cond.i: | |
| + loop: | |
Might be slightly clearer to use more descriptive names for phis.
Suggested change:
| - %d.0.i = phi i32 [ 0, %entry ], [ %e.0.i, %for.cond.i ] | |
| - %e.0.i = phi i32 [ 0, %entry ], [ %inc.i, %for.cond.i ] | |
| + %for = phi i32 [ 0, %entry ], [ %e.0.i, %for.cond.i ] | |
| + %iv = phi i32 [ 0, %entry ], [ %iv.next, %for.cond.i ] | |
This leaves another case where we would crash, e.g. if the first-order recurrence is used by another instruction that can be removed. It may be enough to add a variant of the test where the FOR is used by another binary instruction? Actually, that simplification would already trigger planContainsAdditionalSimplifications.
Could we catch this FOR simplification in planContainsAdditionalSimplifications as well?
Oh, good point: adding a dead binary op does indeed cause another assertion failure. I've added a test for it.
It looks like a dead use of the FOR isn't enough to trigger planContainsAdditionalSimplifications on its own, because the legacy cost model will detect the use as dead and return true for skipCostComputation.
So I've moved the check out of the legacy cost model and into planContainsAdditionalSimplifications, which now checks for VPFirstOrderRecurrencePHIs without any VPInstruction::FirstOrderRecurrenceSplice uses.
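For readers following along, here is a rough, self-contained C++ sketch of the shape of that check. It is a toy model only: `Kind`, `Recipe`, `isForPhiWithoutSplice`, and `planHasForWithoutSplice` are hypothetical names made up for illustration and are not the real VPlan classes or the code in this patch. The idea it shows is just "flag the plan if any first-order recurrence phi has no splice among its users".

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-ins for VPlan recipes. These are NOT the real LLVM types;
// they only model "a recipe has a kind and a list of users".
enum class Kind { FirstOrderRecurrencePhi, FirstOrderRecurrenceSplice, Other };

struct Recipe {
  Kind kind;
  std::vector<const Recipe *> users; // recipes that consume this recipe's value
};

// True if R is a first-order recurrence phi with no splice user, i.e. the
// splice was never needed or has already been removed as dead.
bool isForPhiWithoutSplice(const Recipe &R) {
  if (R.kind != Kind::FirstOrderRecurrencePhi)
    return false;
  return std::none_of(R.users.begin(), R.users.end(), [](const Recipe *U) {
    return U->kind == Kind::FirstOrderRecurrenceSplice;
  });
}

// True if any header phi in the (toy) plan is a FOR phi without a splice.
// In this sketch, that is the situation in which the two cost models are
// expected to disagree, so the plan counts as "additionally simplified".
bool planHasForWithoutSplice(const std::vector<const Recipe *> &HeaderPhis) {
  return std::any_of(
      HeaderPhis.begin(), HeaderPhis.end(),
      [](const Recipe *Phi) { return isForPhiWithoutSplice(*Phi); });
}
```

In this model, a plan whose FOR phi still feeds a splice is not flagged, while a plan whose FOR phi has no users, or only non-splice users such as a dead binary op, is flagged.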