-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[LoopInterchange] Don't consider loops with BTC=0 #167113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-llvm-transforms Author: Sjoerd Meijer (sjoerdmeijer) ChangesDo not consider loops with a zero backedge taken count as candidates for interchange. This seems like a sensible thing to do to me, because it suggests the loop doesn't execute and there is no point in interchanging. This avoids triggering an assert about phis and their uses. I have a feeling that this fix might be hiding the issue, but I haven't yet been able to trigger the assert with other test cases; every time the loops are rejected for other reasons. Since I think this is a self-contained improvement that avoids a lot of test failures, I propose to reject this loops while I investigate further if I can still trigger this in some way. (Partial) fix for #163954 Full diff: https://github.com/llvm/llvm-project/pull/167113.diff 7 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index 9aaf6a5aa4d6a..11e1723eb03d9 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -101,6 +101,12 @@ static cl::opt<unsigned int> MaxLoopNestDepth(
"loop-interchange-max-loop-nest-depth", cl::init(10), cl::Hidden,
cl::desc("Maximum depth of loop nest considered for the transform"));
+// This is mainly for testing purposes, and certain tests that rely on
+// behaviour that is more difficult to trigger otherwise.
+static cl::opt<bool> SkipLoopsWithZeroBTC(
+ "loop-interchange-skip-zero-btc", cl::init(true), cl::Hidden,
+ cl::desc("Do not consider loops with a backedge taken count of 0"));
+
// We prefer cache cost to vectorization by default.
static cl::list<RuleTy> Profitabilities(
"loop-interchange-profitabilities", cl::ZeroOrMore,
@@ -428,6 +434,13 @@ static bool isComputableLoopNest(ScalarEvolution *SE,
LLVM_DEBUG(dbgs() << "Couldn't compute backedge count\n");
return false;
}
+ // A loop with a backedge that isn't taken, e.g. an unconditional branch
+ // true, isn't really a loop and we don't want to consider it as a
+ // candidate.
+ if (ExitCountOuter && SkipLoopsWithZeroBTC && ExitCountOuter->isZero()) {
+ LLVM_DEBUG(dbgs() << "Single iteration loop\n");
+ return false;
+ }
if (L->getNumBackEdges() != 1) {
LLVM_DEBUG(dbgs() << "NumBackEdges is not equal to 1\n");
return false;
@@ -1808,6 +1821,7 @@ static void moveLCSSAPhis(BasicBlock *InnerExit, BasicBlock *InnerHeader,
assert(all_of(P.users(),
[OuterHeader, OuterExit, IncI, InnerHeader](User *U) {
+ dbgs() << "USER: "; U->dump();
return (cast<PHINode>(U)->getParent() == OuterHeader &&
IncI->getParent() == InnerHeader) ||
cast<PHINode>(U)->getParent() == OuterExit;
diff --git a/llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll b/llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll
index 70fff161154d8..a4dbd5a005082 100644
--- a/llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll
+++ b/llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll
@@ -1,5 +1,5 @@
; REQUIRES: asserts
-; RUN: opt < %s -passes="loop(loop-interchange,loop-interchange)" -cache-line-size=8 -verify-dom-info -verify-loop-info \
+; RUN: opt < %s -passes="loop(loop-interchange,loop-interchange)" -cache-line-size=8 -verify-dom-info -verify-loop-info -loop-interchange-skip-zero-btc=false \
; RUN: -debug-only=loop-interchange 2>&1 | FileCheck %s
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
diff --git a/llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll b/llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll
index a5e3accaf8e10..ec02110c90e3b 100644
--- a/llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll
+++ b/llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll
@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
-; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s
+; RUN: opt < %s -passes=loop-interchange -loop-interchange-skip-zero-btc=false -cache-line-size=64 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s
; This test is checking that blocks outer.body and outer.latch, where outer.body is the exit
; block of the inner loop and outer.latch the latch of the outer loop, correctly
diff --git a/llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll b/llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll
index 6b25c3bc9a4ba..56140236825b7 100644
--- a/llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll
+++ b/llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll
@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S
+; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -loop-interchange-skip-zero-btc=false -S
; RUN: FileCheck --input-file=%t %s
@b = external dso_local global [5 x i32], align 16
diff --git a/llvm/test/Transforms/LoopInterchange/pr43326.ll b/llvm/test/Transforms/LoopInterchange/pr43326.ll
index c25c4fadd3042..8ce3d59e3420f 100644
--- a/llvm/test/Transforms/LoopInterchange/pr43326.ll
+++ b/llvm/test/Transforms/LoopInterchange/pr43326.ll
@@ -1,5 +1,5 @@
; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \
-; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1
+; RUN: -loop-interchange-skip-zero-btc=false -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1
; RUN: FileCheck --input-file=%t --check-prefix=REMARKS %s
@a = global i32 0
diff --git a/llvm/test/Transforms/LoopInterchange/pr57148.ll b/llvm/test/Transforms/LoopInterchange/pr57148.ll
index 0d4194762a692..51cad15964ed4 100644
--- a/llvm/test/Transforms/LoopInterchange/pr57148.ll
+++ b/llvm/test/Transforms/LoopInterchange/pr57148.ll
@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes=loop-interchange -cache-line-size=4 -loop-interchange-threshold=-100 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s
+; RUN: opt < %s -passes=loop-interchange -loop-interchange-skip-zero-btc=false -cache-line-size=4 -loop-interchange-threshold=-100 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s
; Make sure the loops are in LCSSA form after loop interchange,
; and loop interchange does not hit assertion errors and crash.
diff --git a/llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll b/llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
index 27d99e05e84ee..fad6a6e70037c 100644
--- a/llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
+++ b/llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
@@ -1,5 +1,5 @@
; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \
-; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | FileCheck %s
+; RUN: -loop-interchange-skip-zero-btc=false -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | FileCheck %s
; RUN: FileCheck --input-file=%t --check-prefix=REMARKS %s
|
899d66b to
c19c535
Compare
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
madhur13490
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some nits.
| // This is mainly for testing purposes, and certain tests that rely on | ||
| // behaviour that is more difficult to trigger otherwise. | ||
| static cl::opt<bool> SkipLoopsWithZeroBTC( | ||
| "loop-interchange-skip-zero-btc", cl::init(true), cl::Hidden, | ||
| cl::desc("Do not consider loops with a backedge taken count of 0")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about adding an option just for this this; Ignoring BTC = 0 seems fine, although it would be good to first check/fix the phi assertion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this option ideally shouldn't exist. I will clarify that in the comment. The point is that are a couple of current tests that test some corner cases, which we wouldn't test anymore as they would now be rejected for interchange. Thus, keeping the behaviour with this option is easier than re-architecting the tests. But I will follow up on this as I am going to see if I can trigger the assert in different ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, so there are a few tests with BTC = 0? if so, they should be updated to have a non-zero BTC, but this needs more than simply adjusting the exit check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly that. Modifying these tests isn't a straightforward fix up, and requires diving into the behaviour that they want to test and coming up with a modified or new test case. I will do that for this assert that I am trying to address in a follow up, but want to delay that for the existing test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to simply changing the exit check for almost all cases.
kasuga-fj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The transformation will abort if any loop with BTC=0 is found. So we will lost the opportunity to interchange the j-loop and k-loop in the following case.
for (int i = 0; i < 1; i++)
for (int j = 0; j < 100; j++)
for (int k = 0; k < 100; k++)
A[k][j] = ...;I think it's okay as such cases are probably rare. But in that case, I'm not sure if it's make sense to treat BTC=0 as a special case in the first place.
| @@ -1,5 +1,5 @@ | |||
| ; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \ | |||
| ; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | FileCheck %s | |||
| ; RUN: -loop-interchange-skip-zero-btc=false -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | FileCheck %s | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I don't overlook anything, there are no cases where BTC=0 in this file.
| @@ -1,5 +1,5 @@ | |||
| ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py | |||
| ; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S | |||
| ; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -loop-interchange-skip-zero-btc=false -S | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be a problem that has existed for some time, but as for this case, the pseudo code and the IR appear to be inconsistent. I think it's a good opportunity to revise the exit condition.
| @@ -1,5 +1,5 @@ | |||
| ; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \ | |||
| ; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | |||
| ; RUN: -loop-interchange-skip-zero-btc=false -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only case where it's unclear to me what's being tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the commit message from when this test was added, this also appears to be a case that BTC=0 is not required.
| // This is mainly for testing purposes, and certain tests that rely on | ||
| // behaviour that is more difficult to trigger otherwise. | ||
| static cl::opt<bool> SkipLoopsWithZeroBTC( | ||
| "loop-interchange-skip-zero-btc", cl::init(true), cl::Hidden, | ||
| cl::desc("Do not consider loops with a backedge taken count of 0")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to simply changing the exit check for almost all cases.
Precommit test fixups for llvm#167113
|
I have fixed the test cases in #167748 I will need to rebase this change on top of that; we then probably don't need the option. I will also see if I can move the isZero() BTC check around: instead of rejecting the whole loop nest, maybe only reject that loop as candidate. If for one reason or another that is difficult, then I will keep it as it is, but will at least add another test case for that. |
Precommit test fixups for #167113
Do not consider loops with a zero backedge taken count as candidates for interchange. This seems like a sensible thing to do to me, because it suggests the loop doesn't execute and there is no point in interchanging. This avoids triggering an assert about phis and their uses. I have a feeling that this fix might be hiding the issue, but I haven't yet been able to trigger the assert with other test cases; every time the loops are rejected for other reasons. Since I think this is a self-contained improvement that avoids a lot of test failures, I propose to reject this loops while I investigate further if I can still trigger this in some way. (Partial) fix for llvm#163954
a6a1caf to
56c3e9b
Compare
|
The last update address @kasuga-fj 's remark:
|
🐧 Linux x64 Test Results
|
kasuga-fj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please merge or rebase main instead of committing the same changes as #167748?
| LPMUpdater &U) { | ||
| Function &F = *LN.getParent(); | ||
| SmallVector<Loop *, 8> LoopList(LN.getLoops()); | ||
| std::map<const Loop *, const SCEV *> LoopBTC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ScalarEvolution caches the results of BTC. In general, it would be better to call getBackedgeTakenCount every time than holding local cache unless there are special circumstances.
|
|
||
| @D = common global [100 x [100 x [100 x i32]]] zeroinitializer | ||
|
|
||
| ; Test for interchange in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a brief description to clarify the purpose of this test? Like "check the loop with BTC=0 is considered unprofitable".
|
|
||
| inner2.body: | ||
| %k = phi i64 [ 0, %inner1.header ], [ %inc, %inner2.body ] | ||
| %arrayidx8 = getelementptr inbounds [100 x [100 x [100 x i32]]], ptr @D, i64 0, i64 %i, i64 %k, i64 %j |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| %arrayidx8 = getelementptr inbounds [100 x [100 x [100 x i32]]], ptr @D, i64 0, i64 %i, i64 %k, i64 %j | |
| %arrayidx8 = getelementptr inbounds [100 x [100 x i32]], ptr @D, i64 %i, i64 %k, i64 %j |
I remember hearing somewhere that this is the canonical form.
This is rebased/merged. The changes here to pr43326.ll and pr57148.ll should indeed have been included in #167748, but I missed them earlier. They were a bit masked by other things, and I discovered them during the rebase/changes here. Thanks for the review and remarks, I am now addressing them. |
Ah, I see, I misunderstood. |
kasuga-fj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo some nits.
| #include "llvm/Transforms/Utils/Local.h" | ||
| #include "llvm/Transforms/Utils/LoopUtils.h" | ||
| #include <cassert> | ||
| #include <map> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #include <map> |
Maybe unnecessary?
| // A loop with a backedge that isn't taken, e.g. an unconditional branch | ||
| // true, isn't really a loop and we don't want to consider it as a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a loop by definition, even if BTC=0. I feel this comment is a bit misleading.
Meinersbur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Thanks for the reviews! |
Do not consider loops with a zero backedge taken count as candidates for interchange. This seems like a sensible thing to do to me, because it suggests the loop doesn't execute and there is no point in interchanging.
This avoids triggering an assert about phis and their uses. I have a feeling that this fix might be hiding the issue, but I haven't yet been able to trigger the assert with other test cases; every time the loops are rejected for other reasons.
Since I think this is a self-contained improvement that avoids a lot of test failures, I propose to reject this loops while I investigate further if I can still trigger this in some way.
(Partial) fix for #163954