LAA: version unit stride for stores #124567
Changes from 1 commit
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,26 +5,27 @@ define void @test_variable_stride(ptr %dst, i32 %scale) { | |
| ; CHECK-LABEL: define void @test_variable_stride | ||
| ; CHECK-SAME: (ptr [[DST:%.*]], i32 [[SCALE:%.*]]) { | ||
| ; CHECK-NEXT: entry: | ||
| ; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]] | ||
| ; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_SCEVCHECK:%.*]] | ||
| ; CHECK: vector.scevcheck: | ||
| ; CHECK-NEXT: [[IDENT_CHECK:%.*]] = icmp ne i32 [[SCALE]], 1 | ||
| ; CHECK-NEXT: br i1 [[IDENT_CHECK]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]] | ||
|
Comment on lines +10 to +11
Contributor
This looks like a regression?
Contributor
Author
I'm unsure about this one: nothing is actually vectorized, if you look further down. It's a degenerate case, and the motivation for the patch was to follow the same flow for stores as we do for loads. I'm still unsure about the test updates, though.
Contributor
I think it depends on whether you enter the vector loop or not, right? Certainly in the vector loop the code is faster without the mul. I would imagine that turning on symbolic stride predicates for loads would have done the same thing in the past? Is this not just a special case because we're (presumably) choosing VF=1, IC=2?
Contributor
Yeah it would be faster with the |
||
| ; CHECK: vector.ph: | ||
| ; CHECK-NEXT: br label [[VECTOR_BODY:%.*]] | ||
| ; CHECK: vector.body: | ||
| ; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ] | ||
| ; CHECK-NEXT: [[TMP0:%.*]] = add i32 [[INDEX]], 0 | ||
| ; CHECK-NEXT: [[TMP1:%.*]] = add i32 [[INDEX]], 1 | ||
| ; CHECK-NEXT: [[TMP2:%.*]] = mul i32 [[TMP0]], [[SCALE]] | ||
| ; CHECK-NEXT: [[TMP3:%.*]] = mul i32 [[TMP1]], [[SCALE]] | ||
| ; CHECK-NEXT: [[TMP4:%.*]] = getelementptr i16, ptr [[DST]], i32 [[TMP2]] | ||
| ; CHECK-NEXT: [[TMP5:%.*]] = getelementptr i16, ptr [[DST]], i32 [[TMP3]] | ||
| ; CHECK-NEXT: store i32 [[TMP0]], ptr [[TMP4]], align 2 | ||
| ; CHECK-NEXT: store i32 [[TMP1]], ptr [[TMP5]], align 2 | ||
| ; CHECK-NEXT: [[TMP2:%.*]] = getelementptr i16, ptr [[DST]], i32 [[TMP0]] | ||
| ; CHECK-NEXT: [[TMP3:%.*]] = getelementptr i16, ptr [[DST]], i32 [[TMP1]] | ||
| ; CHECK-NEXT: store i32 [[TMP0]], ptr [[TMP2]], align 2 | ||
| ; CHECK-NEXT: store i32 [[TMP1]], ptr [[TMP3]], align 2 | ||
| ; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2 | ||
| ; CHECK-NEXT: [[TMP6:%.*]] = icmp eq i32 [[INDEX_NEXT]], 1000 | ||
| ; CHECK-NEXT: br i1 [[TMP6]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]] | ||
| ; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i32 [[INDEX_NEXT]], 1000 | ||
| ; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]] | ||
| ; CHECK: middle.block: | ||
| ; CHECK-NEXT: br i1 true, label [[EXIT:%.*]], label [[SCALAR_PH]] | ||
| ; CHECK: scalar.ph: | ||
| ; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i32 [ 1000, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ] | ||
| ; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i32 [ 1000, [[MIDDLE_BLOCK]] ], [ 0, [[VECTOR_SCEVCHECK]] ], [ 0, [[ENTRY:%.*]] ] | ||
| ; CHECK-NEXT: br label [[LOOP:%.*]] | ||
| ; CHECK: loop: | ||
| ; CHECK-NEXT: [[IV:%.*]] = phi i32 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ] | ||
|
|
||
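The trade-off discussed in the review comments can be sketched in C++. This is only an illustration of stride versioning, not the code LAA or the vectorizer emits: the function names `store_strided` and `store_versioned` are hypothetical, and the stores are simplified to 16-bit values rather than the `i32` stores to `i16`-indexed addresses in the test. The guard plays the role of `IDENT_CHECK`, and the specialized loop mirrors the interleaved-by-2 `vector.body` with the `mul` removed.

```cpp
#include <cassert>
#include <cstdint>

// Original loop shape (simplified): iteration i writes to index i * scale.
void store_strided(int16_t *dst, int32_t scale, int32_t n) {
  assert(n >= 0);
  for (int32_t i = 0; i < n; ++i)
    dst[i * scale] = static_cast<int16_t>(i);
}

// Versioned form: a runtime check on the symbolic stride selects either the
// original loop or a copy specialized for scale == 1.
void store_versioned(int16_t *dst, int32_t scale, int32_t n) {
  assert(n >= 0);
  // Counterpart of `icmp ne i32 %scale, 1`: fall back to the original loop.
  if (scale != 1) {
    store_strided(dst, scale, n);
    return;
  }
  // Specialized loop: the stride is known to be 1, so no multiply is needed.
  // Two stores per iteration stand in for VF=1, IC=2.
  int32_t i = 0;
  for (; i + 2 <= n; i += 2) {
    dst[i] = static_cast<int16_t>(i);
    dst[i + 1] = static_cast<int16_t>(i + 1);
  }
  // Remainder iteration when n is odd.
  for (; i < n; ++i)
    dst[i] = static_cast<int16_t>(i);
}
```

When `scale` is 1 the specialized loop saves a multiply per store; for any other value the only cost is the extra compare and branch before the original loop, which is the overhead the first review comment asks about.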