-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[Loads] Support dereference for non-constant offset #149551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -361,29 +361,29 @@ bool llvm::isDereferenceableAndAlignedInLoop( | |
AccessSize = MaxPtrDiff; | ||
AccessSizeSCEV = PtrDiff; | ||
} else if (auto *MinAdd = dyn_cast<SCEVAddExpr>(AccessStart)) { | ||
if (MinAdd->getNumOperands() != 2) | ||
return false; | ||
const auto *NewBase = dyn_cast<SCEVUnknown>(SE.getPointerBase(MinAdd)); | ||
const auto *OffsetSCEV = SE.removePointerBase(MinAdd); | ||
|
||
const auto *Offset = dyn_cast<SCEVConstant>(MinAdd->getOperand(0)); | ||
const auto *NewBase = dyn_cast<SCEVUnknown>(MinAdd->getOperand(1)); | ||
if (!Offset || !NewBase) | ||
if (!OffsetSCEV || !NewBase) | ||
return false; | ||
|
||
// The following code below assumes the offset is unsigned, but GEP | ||
// offsets are treated as signed so we can end up with a signed value | ||
// here too. For example, suppose the initial PHI value is (i8 255), | ||
// the offset will be treated as (i8 -1) and sign-extended to (i64 -1). | ||
if (Offset->getAPInt().isNegative()) | ||
if (!SE.isKnownNonNegative(OffsetSCEV)) | ||
return false; | ||
|
||
// For the moment, restrict ourselves to the case where the offset is a | ||
// multiple of the requested alignment and the base is aligned. | ||
// TODO: generalize if a case found which warrants | ||
if (Offset->getAPInt().urem(Alignment.value()) != 0) | ||
auto *OffsetSCEVTy = OffsetSCEV->getType(); | ||
if (!SE.isKnownPredicate( | ||
ICmpInst::ICMP_EQ, | ||
annamthomas marked this conversation as resolved.
Show resolved
Hide resolved
|
||
SE.getURemExpr(OffsetSCEV, | ||
SE.getConstant(OffsetSCEVTy, Alignment.value())), | ||
SE.getZero(OffsetSCEVTy))) | ||
return false; | ||
|
||
AccessSize = MaxPtrDiff + Offset->getAPInt(); | ||
AccessSizeSCEV = SE.getAddExpr(PtrDiff, Offset); | ||
AccessSizeSCEV = SE.getAddExpr(PtrDiff, OffsetSCEV); | ||
const auto *Offset = dyn_cast<SCEVConstant>(OffsetSCEV); | ||
AccessSize = MaxPtrDiff + (Offset ? Offset->getAPInt() | ||
: SE.getUnsignedRangeMax(OffsetSCEV)); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just always using getUnsignedRangeMax here should be fine. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not familiar with this code, does it handle overflow in these adds correctly? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thats a good question. Just going by the original code as well, we can overflow this add:
I wonder if it is enough to just use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The other approach is using
PtrDiff is actually So, the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the only comment I'm unsure of (and haven't addressed yet). In the original code as well, there was a chance of overflow. I think we need to add the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What's the expression for AccessStart here? I feel like just having nuw on it may be sufficient. getStartAndEndForAccess should already guarantee that there is no wrapping between AccessStart and AccessEnd, so it should just be a matter of the addrec itself wrapping. Unless that's already excluded by something else at this point. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. AccessStart: (16 + (4 * %iv.start) + %P) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, I think what we need to check is that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfortunately, it doesn't. I tried isKnownPredicateAt with Loop's predecessor's terminator as CtxI as well (in case assumes help).
NewBase is %P. AccessStart is `(16 + (4 * %iv.start) + %P). Note the nuw/nsw flags on the "4 * %iv.start". This is present because of the range on iv.start. From the above we should get that the offset computation itself does not wrap: should be known from the range on %iv.start. Isn't that enough for |
||
Base = NewBase->getValue(); | ||
} else | ||
return false; | ||
|
Uh oh!
There was an error while loading. Please reload this page.