33 changes: 33 additions & 0 deletions llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -44,6 +44,7 @@
#include "llvm/Transforms/Scalar/LoopPassManager.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/LoopUtils.h"
#include "llvm/Transforms/Utils/Local.h"
#include <cassert>
#include <utility>
#include <vector>
@@ -1837,6 +1838,38 @@ static void moveLCSSAPhis(BasicBlock *InnerExit, BasicBlock *InnerHeader,
for (PHINode *P : LcssaInnerLatch)
P->moveBefore(InnerExit->getFirstNonPHIIt());

// This deals with a corner case of LCSSA phi nodes in the outer loop latch
// block: the loop was in LCSSA form, some transformations can come along
// (e.g. unswitch) and create an empty block:
//
// BB4:
// br label %BB5
// BB5:
// %old.cond.lcssa = phi i16 [ %cond, %BB4 ]
// br outer.header
//
// Interchange then brings it in LCSSA form again and we get:
//
// BB4:
// %new.cond.lcssa = phi i16 [ %cond, %BB3 ]
// br label %BB5
// BB5:
// %old.cond.lcssa = phi i16 [ %new.cond.lcssa, %BB4 ]
//
// Which means that we have a chain of LCSSA phi nodes from %new.cond.lcssa
// to %old.cond.lcssa. The problem is that interchange can reorder blocks BB4
// and BB5, placing the use before the def if we don't check this. The
// observation is that %old.cond.lcssa is unused, so instead of moving and
// renaming such a phi node, just delete it if it's trivially dead. If it
// isn't trivially dead, it is handled above. The loop should still be in
// LCSSA form, and if it isn't, formLCSSARecursively is called after the
// interchange rewrite.
SmallVector<PHINode *, 8> LcssaOuterLatch(
llvm::make_pointer_range(OuterLatch->phis()));
Review comment (Member):

Consider iterating over OuterLatch->phis(), and only adding those instructions to the list to be erased. That's the more established pattern.
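
A minimal sketch of the collect-then-erase pattern suggested here, written against the C++ standard library rather than LLVM so that it compiles on its own. `ToyPhi`, `isTriviallyDeadToy`, and `eraseDeadPhisToy` are hypothetical stand-ins invented for illustration; they are not LLVM APIs, and "dead" is modelled simply as "has zero uses". Unlike the real pass, this toy does not account for one erased phi making another phi dead.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy stand-in for a PHI node: a name plus the number of remaining uses.
// This is NOT llvm::PHINode; it only models "has users or not".
struct ToyPhi {
  std::string Name;
  unsigned NumUses;
};

// Toy stand-in for isInstructionTriviallyDead: a phi with no uses is dead.
inline bool isTriviallyDeadToy(const ToyPhi &P) { return P.NumUses == 0; }

// Walk the block's phis once and remember only the dead ones, then erase
// them in a second step so the list is never mutated while it is iterated.
// Returns the names of the erased phis in block order.
inline std::vector<std::string> eraseDeadPhisToy(std::list<ToyPhi> &Block) {
  std::vector<std::list<ToyPhi>::iterator> Dead;
  for (auto It = Block.begin(); It != Block.end(); ++It)
    if (isTriviallyDeadToy(*It))
      Dead.push_back(It);

  std::vector<std::string> Erased;
  for (auto It : Dead) {
    Erased.push_back(It->Name);
    Block.erase(It); // std::list::erase invalidates only the erased iterator
  }
  return Erased;
}
```

Compared with copying every phi into a temporary vector first, the to-erase list only ever holds the nodes that will actually be removed, which is the difference the comment points at.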

for (PHINode *P : LcssaOuterLatch)
if (isInstructionTriviallyDead(P))
P->eraseFromParent();

Review comment (Contributor):

I'm not entirely sure, but what happens if the phi node isn't dead? For example, in this case, if %old.cond.lcssa is used inside BB5, would interchange still generate ill-formed IR?

Review comment (Collaborator, Author):

Thanks for the review.

It's a good point. I was claiming they will then be handled by the other checks. But let me go back and look at some examples and double check, and I guess at least some asserts are required here.

// Deal with LCSSA PHI nodes in the loop nest exit block. For PHIs that have
// incoming values defined in the outer loop, we have to add a new PHI
// in the inner loop latch, which became the exit block of the outer loop,
75 changes: 75 additions & 0 deletions llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll
@@ -0,0 +1,75 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --prefix-filecheck-ir-name TEST --version 6
; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -verify-dom-info -verify-loop-info -verify-scev -verify-loop-lcssa -S | FileCheck %s

; This test is checking that blocks BB4 and BB5, where BB4 is the exit
; block of the inner loop and BB5 the latch of the outer loop, correctly
; deal with the phi-node use-def chain %new.cond.lcssa -> %old.cond.lcssa.

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

define i16 @main() {
; CHECK-LABEL: define i16 @main() {
; CHECK-NEXT: [[ENTRY:.*:]]
; CHECK-NEXT: br label %[[BB2_PREHEADER:.*]]
; CHECK: [[BB1_PREHEADER:.*]]:
; CHECK-NEXT: br label %[[TESTBB1:.*]]
; CHECK: [[TESTBB1]]:
; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], %[[BB5:.*]] ], [ 1, %[[BB1_PREHEADER]] ]
; CHECK-NEXT: br label %[[BB2_SPLIT:.*]]
; CHECK: [[BB2_PREHEADER]]:
; CHECK-NEXT: br label %[[TESTBB2:.*]]
; CHECK: [[TESTBB2]]:
; CHECK-NEXT: [[J:%.*]] = phi i16 [ [[TMP1:%.*]], %[[BB3_SPLIT:.*]] ], [ 0, %[[BB2_PREHEADER]] ]
; CHECK-NEXT: br label %[[BB1_PREHEADER]]
; CHECK: [[BB2_SPLIT]]:
; CHECK-NEXT: [[ARRAYIDX_US_US:%.*]] = getelementptr i16, ptr null, i16 [[J]]
; CHECK-NEXT: [[TMP0:%.*]] = load i16, ptr [[ARRAYIDX_US_US]], align 1
; CHECK-NEXT: [[COND:%.*]] = select i1 false, i16 0, i16 0
; CHECK-NEXT: br label %[[TESTBB3:.*]]
; CHECK: [[TESTBB3]]:
; CHECK-NEXT: [[J_NEXT:%.*]] = add i16 [[J]], 1
; CHECK-NEXT: br label %[[TESTBB4:.*]]
; CHECK: [[BB3_SPLIT]]:
; CHECK-NEXT: [[NEW_COND_LCSSA:%.*]] = phi i16 [ [[COND]], %[[BB5]] ]
; CHECK-NEXT: [[TMP1]] = add i16 [[J]], 1
; CHECK-NEXT: br i1 true, label %[[EXIT:.*]], label %[[TESTBB2]]
; CHECK: [[TESTBB4]]:
; CHECK-NEXT: br label %[[BB5]]
; CHECK: [[BB5]]:
; CHECK-NEXT: [[I_NEXT]] = add i64 [[I]], 1
; CHECK-NEXT: [[CMP286_US:%.*]] = icmp ugt i64 [[I]], 0
; CHECK-NEXT: br i1 [[CMP286_US]], label %[[TESTBB1]], label %[[BB3_SPLIT]]
; CHECK: [[EXIT]]:
; CHECK-NEXT: ret i16 0
;
entry:
br label %BB1

BB1:
%i = phi i64 [ 1, %entry ], [ %i.next, %BB5 ]
br label %BB2

BB2:
%j = phi i16 [ 0, %BB1 ], [ %j.next, %BB3 ]
%arrayidx.us.us = getelementptr i16, ptr null, i16 %j
%0 = load i16, ptr %arrayidx.us.us, align 1
Review comment (Contributor):

Isn't this always UB?

Review comment (Member):

UB in tests is unfortunately common, but LoopInterchange does not care. You could replace null in the getelementptr with a function argument or a global.

%cond = select i1 false, i16 0, i16 0
Review comment (Contributor):

If you want to define an arbitrary value, it might be better to use freeze i16 poison.

Review comment (Collaborator, Author):

Yeah, interchange doesn't care, it's the reproducer from the bug report, and it's short.
But yeah, it's no effort to read from a function argument, so will do that and change this accordingly.

Review comment (Member):

> Yeah, interchange doesn't care,

A future DA (DependenceAnalysis) might care, when it queries AliasAnalysis on whether the base pointers themselves alias.

br label %BB3

BB3:
%j.next = add i16 %j, 1
br i1 true, label %BB4, label %BB2

BB4:
%new.cond.lcssa = phi i16 [ %cond, %BB3 ]
br label %BB5

BB5:
%old.cond.lcssa = phi i16 [ %new.cond.lcssa, %BB4 ]
%i.next = add i64 %i, 1
%cmp286.us = icmp ugt i64 %i, 0
br i1 %cmp286.us, label %BB1, label %exit

exit:
ret i16 0
}