-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[InstCombine] Pull extract through broadcast #143380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 9 commits
34c1222
31cb512
e315c89
274cec9
4fda162
5d9e4d9
a1aec2e
70d3c01
52f5c4b
e21bde6
09d5e4c
ce809f6
21b8638
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| ; RUN: opt -passes=instcombine -S < %s | FileCheck %s | ||
|
|
||
| define float @extract_from_zero_init_shuffle(<2 x float> %1, i64 %idx) { | ||
| ; CHECK-LABEL: @extract_from_zero_init_shuffle( | ||
| ; CHECK-NEXT: [[TMP1:%.*]] = extractelement <2 x float> [[W:%.*]], i64 0 | ||
| ; CHECK-NEXT: ret float [[TMP1]] | ||
| ; | ||
| %3 = shufflevector <2 x float> %1, <2 x float> poison, <4 x i32> zeroinitializer | ||
|
||
| %4 = extractelement <4 x float> %3, i64 %idx | ||
| ret float %4 | ||
| } | ||
|
|
||
|
|
||
| define float @extract_from_general_splat(<2 x float> %1, i64 %idx) { | ||
| ; CHECK-LABEL: @extract_from_general_splat( | ||
| ; CHECK-NEXT: [[TMP1:%.*]] = extractelement <2 x float> [[W:%.*]], i64 1 | ||
| ; CHECK-NEXT: ret float [[TMP1]] | ||
| ; | ||
| %3 = shufflevector <2 x float> %1, <2 x float> poison, <4 x i32> <i32 1, i32 1, i32 1, i32 1> | ||
| %4 = extractelement <4 x float> %3, i64 %idx | ||
| ret float %4 | ||
| } | ||
|
|
||
| define float @extract_from_general_scalable_splat(<vscale x 2 x float> %1, i64 %idx) { | ||
| ; CHECK-LABEL: @extract_from_general_scalable_splat( | ||
| ; CHECK-NEXT: [[TMP1:%.*]] = extractelement <vscale x 2 x float> [[W:%.*]], i64 0 | ||
| ; CHECK-NEXT: ret float [[TMP1]] | ||
| ; | ||
| %3 = shufflevector <vscale x 2 x float> %1, <vscale x 2 x float> poison, <vscale x 4 x i32> zeroinitializer | ||
| %4 = extractelement <vscale x 4 x float> %3, i64 %idx | ||
| ret float %4 | ||
| } | ||
|
|
||
| define float @extract_from_splat_with_poison_0(<2 x float> %1, i64 %idx) { | ||
| ; CHECK-LABEL: @extract_from_splat_with_poison_0( | ||
| ; CHECK-NEXT: [[TMP2:%.*]] = extractelement <2 x float> [[TMP1:%.*]], i64 1 | ||
| ; CHECK-NEXT: ret float [[TMP2]] | ||
| ; | ||
| %3 = shufflevector <2 x float> %1, <2 x float> poison, <4 x i32> <i32 poison, i32 1, i32 1, i32 1> | ||
| %4 = extractelement <4 x float> %3, i64 %idx | ||
| ret float %4 | ||
| } | ||
agorenstein-nvidia marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| define float @extract_from_splat_with_poison_1(<2 x float> %1, i64 %idx) { | ||
| ; CHECK-LABEL: @extract_from_splat_with_poison_1( | ||
| ; CHECK-NEXT: [[TMP2:%.*]] = extractelement <2 x float> [[TMP1:%.*]], i64 1 | ||
| ; CHECK-NEXT: ret float [[TMP2]] | ||
| ; | ||
| %3 = shufflevector <2 x float> %1, <2 x float> poison, <4 x i32> <i32 1, i32 poison, i32 1, i32 1> | ||
| %4 = extractelement <4 x float> %3, i64 %idx | ||
| ret float %4 | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -91,10 +91,7 @@ define i8 @extractelement_bitcast_insert_extra_use_bitcast(<vscale x 2 x i32> %a | |
|
|
||
| define i32 @extractelement_shuffle_maybe_out_of_range(i32 %v) { | ||
| ; CHECK-LABEL: @extractelement_shuffle_maybe_out_of_range( | ||
| ; CHECK-NEXT: [[IN:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[V:%.*]], i64 0 | ||
| ; CHECK-NEXT: [[SPLAT:%.*]] = shufflevector <vscale x 4 x i32> [[IN]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer | ||
| ; CHECK-NEXT: [[R:%.*]] = extractelement <vscale x 4 x i32> [[SPLAT]], i64 4 | ||
| ; CHECK-NEXT: ret i32 [[R]] | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is analogous to the other comment; it may be that the
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. might be useful to add a comment above the test explaining the reasoning,similar to your comment here
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did so! Thanks. |
||
| ; CHECK-NEXT: ret i32 [[V:%.*]] | ||
| ; | ||
| %in = insertelement <vscale x 4 x i32> poison, i32 %v, i32 0 | ||
| %splat = shufflevector <vscale x 4 x i32> %in, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer | ||
|
|
@@ -104,10 +101,7 @@ define i32 @extractelement_shuffle_maybe_out_of_range(i32 %v) { | |
|
|
||
| define i32 @extractelement_shuffle_invalid_index(i32 %v) { | ||
| ; CHECK-LABEL: @extractelement_shuffle_invalid_index( | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here we optimize, but also in doing so miss the Without tracing through, it looks like this may "just" be an order-of-transformations issue. I'm inclined to preserve the behavior. I've simply changed the test to reflect the new output; is there a preferred way of updating/changing these tests in this sort of situation? |
||
| ; CHECK-NEXT: [[IN:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[V:%.*]], i64 0 | ||
| ; CHECK-NEXT: [[SPLAT:%.*]] = shufflevector <vscale x 4 x i32> [[IN]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer | ||
| ; CHECK-NEXT: [[R:%.*]] = extractelement <vscale x 4 x i32> [[SPLAT]], i64 4294967295 | ||
| ; CHECK-NEXT: ret i32 [[R]] | ||
| ; CHECK-NEXT: ret i32 [[V:%.*]] | ||
| ; | ||
| %in = insertelement <vscale x 4 x i32> poison, i32 %v, i32 0 | ||
| %splat = shufflevector <vscale x 4 x i32> %in, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely happy with this. I've also considered either a tiny helper function like
bool test(int* outParam), or carrying alongbool ValidSrcIdx, or rewritinggetSplatIndexto be able to meaningfully return the all-poison splat indicator, and this seemed the best option.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use
std::optional<int> SrcIdxinstead?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agorenstein-nvidia did std::optional not help simplify this logic?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion and ping! I was out from Thursday until yesterday [edit: inclusive]; this remains top of my "priority queue" and I'm integrating your suggestions now. Provisionally it looks like that exactly captures the intent, and clearly I hadn't considered this approach before. I'll finish putting this change in, and your test-improvement-suggestions (and run the tests/linter!) and push ASAP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been done in the latest push, please take a look when you get the chance. It passes my local versions of git-clang-format and the lit tests, so hopefully that carries through to the CI here.