Skip to content

Conversation

@sjoerdmeijer
Copy link
Collaborator

Do not consider loops with a zero backedge taken count as candidates for interchange. This seems like a sensible thing to do to me, because it suggests the loop doesn't execute and there is no point in interchanging.

This avoids triggering an assert about phis and their uses. I have a feeling that this fix might be hiding the issue, but I haven't yet been able to trigger the assert with other test cases; every time the loops are rejected for other reasons.

Since I think this is a self-contained improvement that avoids a lot of test failures, I propose to reject this loops while I investigate further if I can still trigger this in some way.

(Partial) fix for #163954

@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Sjoerd Meijer (sjoerdmeijer)

Changes

Do not consider loops with a zero backedge taken count as candidates for interchange. This seems like a sensible thing to do to me, because it suggests the loop doesn't execute and there is no point in interchanging.

This avoids triggering an assert about phis and their uses. I have a feeling that this fix might be hiding the issue, but I haven't yet been able to trigger the assert with other test cases; every time the loops are rejected for other reasons.

Since I think this is a self-contained improvement that avoids a lot of test failures, I propose to reject this loops while I investigate further if I can still trigger this in some way.

(Partial) fix for #163954


Full diff: https://github.com/llvm/llvm-project/pull/167113.diff

7 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LoopInterchange.cpp (+14)
  • (modified) llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopInterchange/pr43326.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopInterchange/pr57148.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll (+1-1)
diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index 9aaf6a5aa4d6a..11e1723eb03d9 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -101,6 +101,12 @@ static cl::opt<unsigned int> MaxLoopNestDepth(
     "loop-interchange-max-loop-nest-depth", cl::init(10), cl::Hidden,
     cl::desc("Maximum depth of loop nest considered for the transform"));
 
+// This is mainly for testing purposes, and certain tests that rely on
+// behaviour that is more difficult to trigger otherwise.
+static cl::opt<bool> SkipLoopsWithZeroBTC(
+    "loop-interchange-skip-zero-btc", cl::init(true), cl::Hidden,
+    cl::desc("Do not consider loops with a backedge taken count of 0"));
+
 // We prefer cache cost to vectorization by default.
 static cl::list<RuleTy> Profitabilities(
     "loop-interchange-profitabilities", cl::ZeroOrMore,
@@ -428,6 +434,13 @@ static bool isComputableLoopNest(ScalarEvolution *SE,
       LLVM_DEBUG(dbgs() << "Couldn't compute backedge count\n");
       return false;
     }
+    // A loop with a backedge that isn't taken, e.g. an unconditional branch
+    // true, isn't really a loop and we don't want to consider it as a
+    // candidate.
+    if (ExitCountOuter && SkipLoopsWithZeroBTC && ExitCountOuter->isZero()) {
+      LLVM_DEBUG(dbgs() << "Single iteration loop\n");
+      return false;
+    }
     if (L->getNumBackEdges() != 1) {
       LLVM_DEBUG(dbgs() << "NumBackEdges is not equal to 1\n");
       return false;
@@ -1808,6 +1821,7 @@ static void moveLCSSAPhis(BasicBlock *InnerExit, BasicBlock *InnerHeader,
 
     assert(all_of(P.users(),
                   [OuterHeader, OuterExit, IncI, InnerHeader](User *U) {
+		  dbgs() << "USER: "; U->dump();
                     return (cast<PHINode>(U)->getParent() == OuterHeader &&
                             IncI->getParent() == InnerHeader) ||
                            cast<PHINode>(U)->getParent() == OuterExit;
diff --git a/llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll b/llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll
index 70fff161154d8..a4dbd5a005082 100644
--- a/llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll
+++ b/llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll
@@ -1,5 +1,5 @@
 ; REQUIRES: asserts
-; RUN: opt < %s -passes="loop(loop-interchange,loop-interchange)" -cache-line-size=8 -verify-dom-info -verify-loop-info \
+; RUN: opt < %s -passes="loop(loop-interchange,loop-interchange)" -cache-line-size=8 -verify-dom-info -verify-loop-info -loop-interchange-skip-zero-btc=false \
 ; RUN:  -debug-only=loop-interchange 2>&1 | FileCheck %s
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
diff --git a/llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll b/llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll
index a5e3accaf8e10..ec02110c90e3b 100644
--- a/llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll
+++ b/llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
-; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s
+; RUN: opt < %s -passes=loop-interchange -loop-interchange-skip-zero-btc=false -cache-line-size=64 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s
 
 ; This test is checking that blocks outer.body and outer.latch, where outer.body is the exit
 ; block of the inner loop and outer.latch the latch of the outer loop, correctly
diff --git a/llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll b/llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll
index 6b25c3bc9a4ba..56140236825b7 100644
--- a/llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll
+++ b/llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S
+; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -loop-interchange-skip-zero-btc=false -S
 ; RUN: FileCheck --input-file=%t %s
 
 @b = external dso_local global [5 x i32], align 16
diff --git a/llvm/test/Transforms/LoopInterchange/pr43326.ll b/llvm/test/Transforms/LoopInterchange/pr43326.ll
index c25c4fadd3042..8ce3d59e3420f 100644
--- a/llvm/test/Transforms/LoopInterchange/pr43326.ll
+++ b/llvm/test/Transforms/LoopInterchange/pr43326.ll
@@ -1,5 +1,5 @@
 ; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \
-; RUN:     -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1
+; RUN:     -loop-interchange-skip-zero-btc=false -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1
 ; RUN: FileCheck --input-file=%t --check-prefix=REMARKS %s
 
 @a = global i32 0
diff --git a/llvm/test/Transforms/LoopInterchange/pr57148.ll b/llvm/test/Transforms/LoopInterchange/pr57148.ll
index 0d4194762a692..51cad15964ed4 100644
--- a/llvm/test/Transforms/LoopInterchange/pr57148.ll
+++ b/llvm/test/Transforms/LoopInterchange/pr57148.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes=loop-interchange -cache-line-size=4 -loop-interchange-threshold=-100 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s
+; RUN: opt < %s -passes=loop-interchange -loop-interchange-skip-zero-btc=false -cache-line-size=4 -loop-interchange-threshold=-100 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s
 
 ; Make sure the loops are in LCSSA form after loop interchange,
 ; and loop interchange does not hit assertion errors and crash.
diff --git a/llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll b/llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
index 27d99e05e84ee..fad6a6e70037c 100644
--- a/llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
+++ b/llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
@@ -1,5 +1,5 @@
 ; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \
-; RUN:     -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | FileCheck %s
+; RUN:     -loop-interchange-skip-zero-btc=false -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | FileCheck %s
 ; RUN: FileCheck --input-file=%t --check-prefix=REMARKS %s
 
 

@github-actions
Copy link

github-actions bot commented Nov 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@madhur13490 madhur13490 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits.

Comment on lines 104 to 108
// This is mainly for testing purposes, and certain tests that rely on
// behaviour that is more difficult to trigger otherwise.
static cl::opt<bool> SkipLoopsWithZeroBTC(
"loop-interchange-skip-zero-btc", cl::init(true), cl::Hidden,
cl::desc("Do not consider loops with a backedge taken count of 0"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about adding an option just for this this; Ignoring BTC = 0 seems fine, although it would be good to first check/fix the phi assertion

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this option ideally shouldn't exist. I will clarify that in the comment. The point is that are a couple of current tests that test some corner cases, which we wouldn't test anymore as they would now be rejected for interchange. Thus, keeping the behaviour with this option is easier than re-architecting the tests. But I will follow up on this as I am going to see if I can trigger the assert in different ways.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, so there are a few tests with BTC = 0? if so, they should be updated to have a non-zero BTC, but this needs more than simply adjusting the exit check?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly that. Modifying these tests isn't a straightforward fix up, and requires diving into the behaviour that they want to test and coming up with a modified or new test case. I will do that for this assert that I am trying to address in a follow up, but want to delay that for the existing test cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to simply changing the exit check for almost all cases.

Copy link
Contributor

@kasuga-fj kasuga-fj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transformation will abort if any loop with BTC=0 is found. So we will lost the opportunity to interchange the j-loop and k-loop in the following case.

for (int i = 0; i < 1; i++)
  for (int j = 0; j < 100; j++)
    for (int k = 0; k < 100; k++)
      A[k][j] = ...;

I think it's okay as such cases are probably rare. But in that case, I'm not sure if it's make sense to treat BTC=0 as a special case in the first place.

@@ -1,5 +1,5 @@
; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \
; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | FileCheck %s
; RUN: -loop-interchange-skip-zero-btc=false -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1 | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't overlook anything, there are no cases where BTC=0 in this file.

@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S
; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -loop-interchange-skip-zero-btc=false -S
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be a problem that has existed for some time, but as for this case, the pseudo code and the IR appear to be inconsistent. I think it's a good opportunity to revise the exit condition.

@@ -1,5 +1,5 @@
; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \
; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1
; RUN: -loop-interchange-skip-zero-btc=false -verify-dom-info -verify-loop-info -verify-loop-lcssa -stats 2>&1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only case where it's unclear to me what's being tested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the commit message from when this test was added, this also appears to be a case that BTC=0 is not required.

Comment on lines 104 to 108
// This is mainly for testing purposes, and certain tests that rely on
// behaviour that is more difficult to trigger otherwise.
static cl::opt<bool> SkipLoopsWithZeroBTC(
"loop-interchange-skip-zero-btc", cl::init(true), cl::Hidden,
cl::desc("Do not consider loops with a backedge taken count of 0"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to simply changing the exit check for almost all cases.

sjoerdmeijer added a commit to sjoerdmeijer/llvm-project that referenced this pull request Nov 12, 2025
@sjoerdmeijer
Copy link
Collaborator Author

I have fixed the test cases in #167748

I will need to rebase this change on top of that; we then probably don't need the option.

I will also see if I can move the isZero() BTC check around: instead of rejecting the whole loop nest, maybe only reject that loop as candidate. If for one reason or another that is difficult, then I will keep it as it is, but will at least add another test case for that.

sjoerdmeijer added a commit that referenced this pull request Nov 14, 2025
Do not consider loops with a zero backedge taken count as candidates for
interchange. This seems like a sensible thing to do to me, because it
suggests the loop doesn't execute and there is no point in
interchanging.

This avoids triggering an assert about phis and their uses. I have a
feeling that this fix might be hiding the issue, but I haven't yet been
able to trigger the assert with other test cases; every time the loops
are rejected for other reasons.

Since I think this is a self-contained improvement that avoids a lot
of test failures, I propose to reject this loops while I investigate
further if I can still trigger this in some way.

(Partial) fix for llvm#163954
@sjoerdmeijer
Copy link
Collaborator Author

The last update address @kasuga-fj 's remark:

  • It moves the code from isComputableLoopNest to avoid that a whole loop nest is rejected, and
  • a new test has been added to check that ,
  • The checks now live in isProfitable,
  • A bit of code has been added to avoid recalculating the BTC; the BTC is calculated early on, and that is now passed on to the later stage.
  • Some existing tests needed patching up because they still had a BTC=0, I missed that in the clean up patch [LoopInterchange] Fix tests with loops that have BTC=0. NFC. #167748.

@github-actions
Copy link

github-actions bot commented Nov 17, 2025

🐧 Linux x64 Test Results

  • 186374 tests passed
  • 4855 tests skipped

Copy link
Contributor

@kasuga-fj kasuga-fj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please merge or rebase main instead of committing the same changes as #167748?

LPMUpdater &U) {
Function &F = *LN.getParent();
SmallVector<Loop *, 8> LoopList(LN.getLoops());
std::map<const Loop *, const SCEV *> LoopBTC;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ScalarEvolution caches the results of BTC. In general, it would be better to call getBackedgeTakenCount every time than holding local cache unless there are special circumstances.


@D = common global [100 x [100 x [100 x i32]]] zeroinitializer

; Test for interchange in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a brief description to clarify the purpose of this test? Like "check the loop with BTC=0 is considered unprofitable".


inner2.body:
%k = phi i64 [ 0, %inner1.header ], [ %inc, %inner2.body ]
%arrayidx8 = getelementptr inbounds [100 x [100 x [100 x i32]]], ptr @D, i64 0, i64 %i, i64 %k, i64 %j
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
%arrayidx8 = getelementptr inbounds [100 x [100 x [100 x i32]]], ptr @D, i64 0, i64 %i, i64 %k, i64 %j
%arrayidx8 = getelementptr inbounds [100 x [100 x i32]], ptr @D, i64 %i, i64 %k, i64 %j

I remember hearing somewhere that this is the canonical form.

@sjoerdmeijer
Copy link
Collaborator Author

Can you please merge or rebase main instead of committing the same changes as #167748?

This is rebased/merged. The changes here to pr43326.ll and pr57148.ll should indeed have been included in #167748, but I missed them earlier. They were a bit masked by other things, and I discovered them during the rebase/changes here.

Thanks for the review and remarks, I am now addressing them.

@kasuga-fj
Copy link
Contributor

This is rebased/merged. The changes here to pr43326.ll and pr57148.ll should indeed have been included in #167748, but I missed them earlier. They were a bit masked by other things, and I discovered them during the rebase/changes here.

Ah, I see, I misunderstood.

Copy link
Contributor

@kasuga-fj kasuga-fj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo some nits.

#include "llvm/Transforms/Utils/Local.h"
#include "llvm/Transforms/Utils/LoopUtils.h"
#include <cassert>
#include <map>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#include <map>

Maybe unnecessary?

Comment on lines 1466 to 1467
// A loop with a backedge that isn't taken, e.g. an unconditional branch
// true, isn't really a loop and we don't want to consider it as a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a loop by definition, even if BTC=0. I feel this comment is a bit misleading.

Copy link
Member

@Meinersbur Meinersbur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sjoerdmeijer
Copy link
Collaborator Author

Thanks for the reviews!

@sjoerdmeijer sjoerdmeijer merged commit a2ddb02 into llvm:main Nov 19, 2025
10 checks passed
@sjoerdmeijer sjoerdmeijer deleted the interchange-no-loops branch November 19, 2025 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants