Skip to content

Commit aadf55d

Browse files
committed
[NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%)
This is functionally equivalent to the old implementation. As per https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=4739e6e4eb54d3736e6457249c0919b30f6c855a&stat=instructions this is a clear geomean compile-time regression-free win with overall geomean of `-0.08%` 32 PHI's appears to be the sweet spot; both the 16 and 64 performed worse: https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=c4efe1fbbfdf0305ac26cd19eacb0c7774cdf60e&stat=instructions https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=e4989d1c67010d3339d1a40ff5286a31f10cfe82&stat=instructions If we have more PHI's than that, we fall-back to the original DenseSet-based implementation, so the not-so-fast cases will still be handled. However compile-time isn't the main motivation here. I can name at least 3 limitations of this CSE: 1. Assumes that all PHI nodes have incoming basic blocks in the same order (can be fixed while keeping the DenseMap) 2. Does not special-handle `undef` incoming values (i don't see how we can do this with hashing) 3. Does not special-handle backedge incoming values (maybe can be fixed by hashing backedge as some magical value) Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87408
1 parent 6f6d389 commit aadf55d

File tree

1 file changed

+52
-3
lines changed

1 file changed

+52
-3
lines changed

llvm/lib/Transforms/Utils/Local.cpp

Lines changed: 52 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -104,6 +104,12 @@ static cl::opt<bool> PHICSEDebugHash(
104104
cl::desc("Perform extra assertion checking to verify that PHINodes's hash "
105105
"function is well-behaved w.r.t. its isEqual predicate"));
106106

107+
static cl::opt<unsigned> PHICSENumPHISmallSize(
108+
"phicse-num-phi-smallsize", cl::init(32), cl::Hidden,
109+
cl::desc(
110+
"When the basic block contains not more than this number of PHI nodes, "
111+
"perform a (faster!) exhaustive search instead of set-driven one."));
112+
107113
// Max recursion depth for collectBitParts used when detecting bswap and
108114
// bitreverse idioms
109115
static const unsigned BitPartRecursionMaxDepth = 64;
@@ -1132,9 +1138,39 @@ bool llvm::TryToSimplifyUncondBranchFromEmptyBlock(BasicBlock *BB,
11321138
return true;
11331139
}
11341140

1135-
// WARNING: this logic must be kept in sync with
1136-
// Instruction::isIdenticalToWhenDefined()!
1137-
bool llvm::EliminateDuplicatePHINodes(BasicBlock *BB) {
1141+
static bool EliminateDuplicatePHINodesNaiveImpl(BasicBlock *BB) {
1142+
// This implementation doesn't currently consider undef operands
1143+
// specially. Theoretically, two phis which are identical except for
1144+
// one having an undef where the other doesn't could be collapsed.
1145+
1146+
bool Changed = false;
1147+
1148+
// Examine each PHI.
1149+
// Note that increment of I must *NOT* be in the iteration_expression, since
1150+
// we don't want to immediately advance when we restart from the beginning.
1151+
for (auto I = BB->begin(); PHINode *PN = dyn_cast<PHINode>(I);) {
1152+
++I;
1153+
// Is there an identical PHI node in this basic block?
1154+
// Note that we only look in the upper square's triangle,
1155+
// we already checked that the lower triangle PHI's aren't identical.
1156+
for (auto J = I; PHINode *DuplicatePN = dyn_cast<PHINode>(J); ++J) {
1157+
if (!DuplicatePN->isIdenticalToWhenDefined(PN))
1158+
continue;
1159+
// A duplicate. Replace this PHI with the base PHI.
1160+
++NumPHICSEs;
1161+
DuplicatePN->replaceAllUsesWith(PN);
1162+
DuplicatePN->eraseFromParent();
1163+
Changed = true;
1164+
1165+
// The RAUW can change PHIs that we already visited.
1166+
I = BB->begin();
1167+
break; // Start over from the beginning.
1168+
}
1169+
}
1170+
return Changed;
1171+
}
1172+
1173+
static bool EliminateDuplicatePHINodesSetBasedImpl(BasicBlock *BB) {
11381174
// This implementation doesn't currently consider undef operands
11391175
// specially. Theoretically, two phis which are identical except for
11401176
// one having an undef where the other doesn't could be collapsed.
@@ -1152,6 +1188,8 @@ bool llvm::EliminateDuplicatePHINodes(BasicBlock *BB) {
11521188
return PN == getEmptyKey() || PN == getTombstoneKey();
11531189
}
11541190

1191+
// WARNING: this logic must be kept in sync with
1192+
// Instruction::isIdenticalToWhenDefined()!
11551193
static unsigned getHashValueImpl(PHINode *PN) {
11561194
// Compute a hash value on the operands. Instcombine will likely have
11571195
// sorted them, which helps expose duplicates, but we have to check all
@@ -1191,6 +1229,7 @@ bool llvm::EliminateDuplicatePHINodes(BasicBlock *BB) {
11911229

11921230
// Set of unique PHINodes.
11931231
DenseSet<PHINode *, PHIDenseMapInfo> PHISet;
1232+
PHISet.reserve(4 * PHICSENumPHISmallSize);
11941233

11951234
// Examine each PHI.
11961235
bool Changed = false;
@@ -1213,6 +1252,16 @@ bool llvm::EliminateDuplicatePHINodes(BasicBlock *BB) {
12131252
return Changed;
12141253
}
12151254

1255+
bool llvm::EliminateDuplicatePHINodes(BasicBlock *BB) {
1256+
if (
1257+
#ifndef NDEBUG
1258+
!PHICSEDebugHash &&
1259+
#endif
1260+
hasNItemsOrLess(BB->phis(), PHICSENumPHISmallSize))
1261+
return EliminateDuplicatePHINodesNaiveImpl(BB);
1262+
return EliminateDuplicatePHINodesSetBasedImpl(BB);
1263+
}
1264+
12161265
/// enforceKnownAlignment - If the specified pointer points to an object that
12171266
/// we control, modify the object's alignment to PrefAlign. This isn't
12181267
/// often possible though. If alignment is important, a more reliable approach

0 commit comments

Comments
 (0)