[InstCombine] Fold out-of-range bits for squaring signed integers #153484
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -49,3 +49,36 @@ define i1 @vec_reverse_known_bits_demanded_fail(<4 x i8> %xx) { | ||
| %r = icmp slt i8 %ele, 0 | ||
| ret i1 %r | ||
| } | ||
| | ||
| ; Test known bits for (sext i8 x) * (sext i8 x) | ||
| ; RUN: opt -passes=instcombine < %s -S | FileCheck %s --check-prefix=SEXT_SQUARE | ||
| | ||
| define i1 @sext_square_bit31(i8 %x) { | ||
| ; SEXT_SQUARE-LABEL: @sext_square_bit31( | ||
| ; SEXT_SQUARE-NEXT: ret i1 false | ||
| %sx = sext i8 %x to i32 | ||
| %mul = mul nsw i32 %sx, %sx | ||
| %and = and i32 %mul, 2147483648 ; 1 << 31 | ||
| %cmp = icmp ne i32 %and, 0 | ||
| ret i1 %cmp | ||
| } | ||
| | ||
| define i1 @sext_square_bit30(i8 %x) { | ||
| ; SEXT_SQUARE-LABEL: @sext_square_bit30( | ||
| ; SEXT_SQUARE-NEXT: ret i1 false | ||
| %sx = sext i8 %x to i32 | ||
| %mul = mul nsw i32 %sx, %sx | ||
| %and = and i32 %mul, 1073741824 ; 1 << 30 | ||
| %cmp = icmp ne i32 %and, 0 | ||
| ret i1 %cmp | ||
| } | ||
| | ||
| define i1 @sext_square_bit14(i8 %x) { | ||
| ; SEXT_SQUARE-LABEL: @sext_square_bit14( | ||
| ; SEXT_SQUARE-NOT: ret i1 false | ||
| %sx = sext i8 %x to i32 | ||
| %mul = mul nsw i32 %sx, %sx | ||
| %and = and i32 %mul, 16384 ; 1 << 14 | ||
| %cmp = icmp ne i32 %and, 0 | ||
| ret i1 %cmp | ||
| } | ||
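
The expectations in these three tests (bits 31 and 30 fold to false, bit 14 does not) can be checked by brute force outside LLVM. The following is a minimal standalone sketch, not part of the PR; the function name `possibleSquareBits` is made up for illustration. It ORs together the square of every sign-extended i8 value, so any bit left clear can never be set in the product.

```cpp
#include <cstdint>

// Hypothetical helper, not an LLVM API: OR together X * X for every i8 value X
// after sign-extending it to 32 bits. A bit that is clear in the result is
// zero in every possible square.
constexpr uint32_t possibleSquareBits() {
  uint32_t Seen = 0;
  for (int X = -128; X <= 127; ++X) {
    int32_t SX = X; // models `sext i8 %x to i32`
    Seen |= static_cast<uint32_t>(SX * SX);
  }
  return Seen;
}

// Bits 15..31 are never set, so the bit-31 and bit-30 tests can fold to false.
static_assert((possibleSquareBits() >> 15) == 0, "high bits must be zero");
// Bit 14 is reachable (X = -128 gives 16384), so that test must not fold.
static_assert((possibleSquareBits() & (1u << 14)) != 0, "bit 14 is reachable");
```

Because the checks are `static_assert`s, compiling the file is enough to confirm them.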
I don't think the zext handling here is useful. This will be handled by the generic code.
For the sext case, we know that the result is non-negative (due to self-multiply) and that we have a certain number of sign bits (due to multiply of sext), so together we know that the sign bits are actually zero bits.
I think the principled thing to do here would be, for self-multiplies, to call ComputeNumSignBits() and then set all those bits to zero.
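As a worked instance of this argument, write $w$ for the bit width and $n$ for the number of sign bits of the operand $x$ (both symbols are introduced here for illustration and do not appear in the patch). A value with $n$ sign bits fits in $w - n + 1$ signed bits, so:

```math
x \in \left[-2^{\,w-n},\; 2^{\,w-n}-1\right]
\quad\Longrightarrow\quad
0 \le x^2 \le 2^{\,2(w-n)}
```

Provided $2(w-n) < w-1$, the square cannot overflow and every bit above position $2(w-n)$ is zero, which is $w - 2(w-n) - 1$ high bits. For `sext i8` to `i32` we have $w = 32$ and $n = 25$, giving $0 \le x^2 \le 2^{14}$: bits 15 through 31 (17 bits) are known zero, while bit 14 can still be set by $x = -128$.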
Oh right, the zext is redundant. I’ve updated the code so that for self-multiplies using sext, we now call ComputeNumSignBits() to determine the number of sign bits and mark them as known zero.
Actually, after reviewing the previous commit, how should we call ComputeNumSignBits() and set the corresponding bits to zero? In this function, we only track known bits and don’t explicitly compute the product, so it’s unclear how to determine the exact number of sign bits.
I’ve made another commit that reverts to the previous approach using max/min value boundaries and removes the zext handling for now.
You can use the same logic as ComputeNumSignBits:
llvm-project/llvm/lib/Analysis/ValueTracking.cpp
Lines 4278 to 4280 in 92a91f7
Adjusted for the case where the sign bits are the same for both operands:
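
The embedded snippets referenced in this comment were not captured on this page. As a rough standalone sketch of the arithmetic being described (not the code from ValueTracking.cpp and not the code in the PR), assume a fixed 32-bit width and a made-up helper name `knownZeroMaskOfSquare32`. Each operand needs `TyBits - SignBits + 1` valid bits and the product needs at most twice that. Because a square is non-negative, every resulting sign bit is a zero bit.

```cpp
#include <cstdint>

// Hypothetical helper, not an LLVM API. Given the number of sign bits of a
// 32-bit value X (1 <= SignBits <= 32), return a mask of the high bits that
// are known to be zero in X * X.
constexpr uint32_t knownZeroMaskOfSquare32(unsigned SignBits) {
  constexpr unsigned TyBits = 32;
  // Both operands are the same value, so the valid-bit count is simply doubled.
  unsigned OutValidBits = 2 * (TyBits - SignBits + 1);
  // The product may not fit, so nothing is learned about the high bits.
  if (OutValidBits > TyBits)
    return 0;
  unsigned ResultSignBits = TyBits - OutValidBits + 1;
  // The square is non-negative, so all ResultSignBits top bits are zero.
  return ~uint32_t(0) << (TyBits - ResultSignBits);
}

// sext i8 -> i32 has 25 sign bits, so bits 15..31 are known zero.
static_assert(knownZeroMaskOfSquare32(25) == 0xFFFF8000u, "i8 case");
```

The i8 result excludes bit 14, which agrees with the `sext_square_bit14` test not folding.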
OK thanks, I have added this in the last commit. One question: currently my code uses match, while other parts of this function use Op0 == Op1. Should we only handle the explicit self-multiply case (x * x), or also consider cases where both operands are sign-extensions of the same value?
It's not necessary to handle sign extensions of the same value, as CSE will convert this into one sign extension used in both operands. So we should use Op0 == Op1.
Oh OK, I moved it into the self-multiply handling instead, which uses Op0 == Op1.