-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[MachineLICM] Rematerialize instructions that may be hoisted before LICM #158479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -167,3 +167,48 @@ bool llvm::isCycleInvariant(const MachineCycle *Cycle, MachineInstr &I) { | |||||
// If we got this far, the instruction is cycle invariant! | ||||||
return true; | ||||||
} | ||||||
|
||||||
bool llvm::mayLoadFromGOTOrConstantPool(MachineInstr &MI) { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
const. Also could just be directly a function of MachineInstr? |
||||||
assert(MI.mayLoad() && "Expected MI that loads!"); | ||||||
|
||||||
// If we lost memory operands, conservatively assume that the instruction | ||||||
// reads from everything.. | ||||||
if (MI.memoperands_empty()) | ||||||
return true; | ||||||
|
||||||
for (MachineMemOperand *MemOp : MI.memoperands()) | ||||||
if (const PseudoSourceValue *PSV = MemOp->getPseudoValue()) | ||||||
if (PSV->isGOT() || PSV->isConstantPool()) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. PSV->isConstant? Or MemOp->isInvariant? Not sure if we verify those are consistent |
||||||
return true; | ||||||
|
||||||
return false; | ||||||
} | ||||||
|
||||||
bool llvm::isSinkIntoCycleCandidate(MachineInstr &MI, MachineCycle *Cycle, | ||||||
MachineRegisterInfo *MRI, | ||||||
const TargetInstrInfo *TII) { | ||||||
// Not sinking meta instruction. | ||||||
if (MI.isMetaInstruction()) | ||||||
return false; | ||||||
// Instruction not a candidate for this target. | ||||||
if (!TII->shouldSink(MI)) | ||||||
return false; | ||||||
// Instruction is not cycle invariant. | ||||||
if (!isCycleInvariant(Cycle, MI)) | ||||||
return false; | ||||||
// Instruction not safe to move. | ||||||
bool DontMoveAcrossStore = true; | ||||||
if (!MI.isSafeToMove(DontMoveAcrossStore)) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there any way to make this less restrictive than always assuming an aliasing store? Some way to do analysis such that we can determine that moving across a store isn't an issue? In my personal case, which is a matrix multiplication loading from/storing to global arrays, this condition blocks sinking. Pre-ISel, PRE/LICM is happy to hoist everything out and cause problems, but once we get here, we would be unable to sink things back in. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think technically there is preexisting code that could be reused to do load/store aliasing checks... llvm-project/llvm/lib/CodeGen/MachineSink.cpp Line 1650 in dfad983
Though IMO that'd probably be out of scope of the draft PR... |
||||||
return false; | ||||||
// Dont sink GOT or constant pool loads. | ||||||
if (MI.mayLoad() && !mayLoadFromGOTOrConstantPool(MI)) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this conditional reversed? If it's a load, and it doesn't load from GOT or constant pool, return false. This was likely copy-pasted from MachineSink, because that's where I saw this first. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, it does seem that way. I'll recheck this. |
||||||
return false; | ||||||
if (MI.isConvergent()) | ||||||
return false; | ||||||
const MachineOperand &MO = MI.getOperand(0); | ||||||
if (!MO.isReg() || !MO.getReg() || !MO.isDef()) | ||||||
return false; | ||||||
if (!MRI->hasOneDef(MO.getReg())) | ||||||
return false; | ||||||
return true; | ||||||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -98,6 +98,12 @@ DisableHoistingToHotterBlocks("disable-hoisting-to-hotter-blocks", | |
clEnumValN(UseBFI::All, "all", | ||
"enable the feature with/wo profile data"))); | ||
|
||
static cl::opt<bool> SinkInstsIntoCycleBeforeLICM( | ||
"sink-insts-before-licm", | ||
cl::desc("Sink instructions into cycles to avoid " | ||
"register spills"), | ||
cl::init(false), cl::Hidden); | ||
|
||
STATISTIC(NumHoisted, | ||
"Number of machine instructions hoisted out of loops"); | ||
STATISTIC(NumLowRP, | ||
|
@@ -287,6 +293,8 @@ namespace { | |
bool isTgtHotterThanSrc(MachineBasicBlock *SrcBlock, | ||
MachineBasicBlock *TgtBlock); | ||
MachineBasicBlock *getOrCreatePreheader(MachineLoop *CurLoop); | ||
|
||
bool rematerializeIntoCycle(MachineCycle *Cycle, MachineInstr &I); | ||
}; | ||
|
||
class MachineLICMBase : public MachineFunctionPass { | ||
|
@@ -304,7 +312,11 @@ namespace { | |
AU.addRequired<MachineBlockFrequencyInfoWrapperPass>(); | ||
AU.addRequired<MachineDominatorTreeWrapperPass>(); | ||
AU.addRequired<AAResultsWrapperPass>(); | ||
if (PreRegAlloc) | ||
AU.addRequired<MachineCycleInfoWrapperPass>(); | ||
AU.addPreserved<MachineLoopInfoWrapperPass>(); | ||
if (PreRegAlloc) | ||
AU.addPreserved<MachineCycleInfoWrapperPass>(); | ||
MachineFunctionPass::getAnalysisUsage(AU); | ||
} | ||
}; | ||
|
@@ -348,6 +360,7 @@ INITIALIZE_PASS_DEPENDENCY(MachineLoopInfoWrapperPass) | |
INITIALIZE_PASS_DEPENDENCY(MachineBlockFrequencyInfoWrapperPass) | ||
INITIALIZE_PASS_DEPENDENCY(MachineDominatorTreeWrapperPass) | ||
INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass) | ||
INITIALIZE_PASS_DEPENDENCY(MachineCycleInfoWrapperPass) | ||
INITIALIZE_PASS_END(EarlyMachineLICM, "early-machinelicm", | ||
"Early Machine Loop Invariant Code Motion", false, false) | ||
|
||
|
@@ -396,6 +409,26 @@ bool MachineLICMImpl::run(MachineFunction &MF) { | |
LLVM_DEBUG(dbgs() << MF.getName() << " ********\n"); | ||
|
||
if (PreRegAlloc) { | ||
if (SinkInstsIntoCycleBeforeLICM) { | ||
auto *CI = GET_RESULT(MachineCycle, getCycleInfo, Info); | ||
SmallVector<MachineCycle *, 8> Cycles(CI->toplevel_cycles()); | ||
for (auto *Cycle : Cycles) { | ||
MachineBasicBlock *Preheader = Cycle->getCyclePreheader(); | ||
if (!Preheader) { | ||
LLVM_DEBUG(dbgs() << "Rematerialization: Can't find preheader\n"); | ||
continue; | ||
} | ||
SmallVector<MachineInstr *, 8> Candidates; | ||
for (auto &MI : *Preheader) | ||
if (isSinkIntoCycleCandidate(MI, Cycle, MRI, TII)) | ||
Candidates.push_back(&MI); | ||
// Walk the candidates in reverse order so that we start with the use | ||
// of a def-use chain, if there is any. | ||
for (MachineInstr *I : llvm::reverse(Candidates)) | ||
if (rematerializeIntoCycle(Cycle, *I)) | ||
Changed = true; | ||
} | ||
} | ||
// Estimate register pressure during pre-regalloc pass. | ||
unsigned NumRPS = TRI->getNumRegPressureSets(); | ||
RegPressure.resize(NumRPS); | ||
|
@@ -1005,24 +1038,6 @@ MachineLICMImpl::calcRegisterCost(const MachineInstr *MI, bool ConsiderSeen, | |
return Cost; | ||
} | ||
|
||
/// Return true if this machine instruction loads from global offset table or | ||
/// constant pool. | ||
static bool mayLoadFromGOTOrConstantPool(MachineInstr &MI) { | ||
assert(MI.mayLoad() && "Expected MI that loads!"); | ||
|
||
// If we lost memory operands, conservatively assume that the instruction | ||
// reads from everything.. | ||
if (MI.memoperands_empty()) | ||
return true; | ||
|
||
for (MachineMemOperand *MemOp : MI.memoperands()) | ||
if (const PseudoSourceValue *PSV = MemOp->getPseudoValue()) | ||
if (PSV->isGOT() || PSV->isConstantPool()) | ||
return true; | ||
|
||
return false; | ||
} | ||
|
||
// This function iterates through all the operands of the input store MI and | ||
// checks that each register operand statisfies isCallerPreservedPhysReg. | ||
// This means, the value being stored and the address where it is being stored | ||
|
@@ -1744,13 +1759,97 @@ bool MachineLICMImpl::isTgtHotterThanSrc(MachineBasicBlock *SrcBlock, | |
return Ratio > BlockFrequencyRatioThreshold; | ||
} | ||
|
||
/// Rematerialize instructions into cycles before Machine LICM, | ||
/// since LICM in the middle-end hoisted every instructions without considering | ||
/// register pressure. | ||
bool MachineLICMImpl::rematerializeIntoCycle(MachineCycle *Cycle, | ||
MachineInstr &I) { | ||
LLVM_DEBUG(dbgs() << "Rematerialization: Finding sink block for: " << I); | ||
MachineBasicBlock *Preheader = Cycle->getCyclePreheader(); | ||
assert(Preheader && "Cycle sink needs a preheader block"); | ||
MachineBasicBlock *SinkBlock = nullptr; | ||
const MachineOperand &MO = I.getOperand(0); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because this is only checking the first def of MachineInstr You should change this to either account for all non-dead defs of |
||
for (MachineInstr &MI : MRI->use_instructions(MO.getReg())) { | ||
LLVM_DEBUG(dbgs() << "Rematerialization: Analysing use: " << MI); | ||
if (!Cycle->contains(MI.getParent())) { | ||
LLVM_DEBUG( | ||
dbgs() << "Rematerialization: Use not in cycle, can't sink.\n"); | ||
return false; | ||
} | ||
if (!SinkBlock) { | ||
SinkBlock = MI.getParent(); | ||
LLVM_DEBUG(dbgs() << "Rematerialization: Setting sink block to: " | ||
<< printMBBReference(*SinkBlock) << "\n"); | ||
continue; | ||
} | ||
if (MI.isPHI()) { | ||
for (unsigned I = 1; I != MI.getNumOperands(); I += 2) { | ||
Register SrcReg = MI.getOperand(I).getReg(); | ||
if (TRI->regsOverlap(SrcReg, MO.getReg())) { | ||
MachineBasicBlock *SrcBB = MI.getOperand(I + 1).getMBB(); | ||
if (SrcBB != SinkBlock) { | ||
SinkBlock = | ||
MDTU->getDomTree().findNearestCommonDominator(SinkBlock, SrcBB); | ||
if (!SinkBlock) | ||
break; | ||
} | ||
} | ||
} | ||
} else { | ||
SinkBlock = MDTU->getDomTree().findNearestCommonDominator(SinkBlock, | ||
MI.getParent()); | ||
} | ||
if (!SinkBlock) { | ||
LLVM_DEBUG( | ||
dbgs() << "Rematerialization: Can't find nearest dominator\n"); | ||
return false; | ||
} | ||
LLVM_DEBUG( | ||
dbgs() << "Rematerialization: Setting nearest common dom block: " | ||
<< printMBBReference(*SinkBlock) << "\n"); | ||
} | ||
if (!SinkBlock) { | ||
LLVM_DEBUG( | ||
dbgs() << "Rematerialization: Not sinking, can't find sink block.\n"); | ||
return false; | ||
} | ||
if (SinkBlock == Preheader) { | ||
LLVM_DEBUG( | ||
dbgs() | ||
<< "Rematerialization: Not sinking, sink block is the preheader\n"); | ||
return false; | ||
} | ||
for (MachineInstr &MI : MRI->use_instructions(MO.getReg())) { | ||
if (MI.isPHI() && MI.getParent() == SinkBlock) { | ||
LLVM_DEBUG(dbgs() << "Rematerialization: Not sinking, sink block is " | ||
"using it on PHI.\n"); | ||
return false; | ||
} | ||
} | ||
LLVM_DEBUG(dbgs() << "Rematerialization: Sinking instruction!\n"); | ||
SinkBlock->splice(SinkBlock->SkipPHIsAndLabels(SinkBlock->begin()), Preheader, | ||
I); | ||
// Conservatively clear any kill flags on uses of sunk instruction | ||
for (MachineOperand &MO : I.operands()) { | ||
if (MO.isReg() && MO.readsReg()) | ||
MRI->clearKillFlags(MO.getReg()); | ||
} | ||
// The instruction is moved from its basic block, so do not retain the | ||
// debug information. | ||
assert(!I.isDebugInstr() && "Should not sink debug inst"); | ||
I.setDebugLoc(DebugLoc()); | ||
return true; | ||
} | ||
|
||
template <typename DerivedT, bool PreRegAlloc> | ||
PreservedAnalyses MachineLICMBasePass<DerivedT, PreRegAlloc>::run( | ||
MachineFunction &MF, MachineFunctionAnalysisManager &MFAM) { | ||
bool Changed = MachineLICMImpl(PreRegAlloc, nullptr, &MFAM).run(MF); | ||
if (!Changed) | ||
return PreservedAnalyses::all(); | ||
auto PA = getMachineFunctionPassPreservedAnalyses(); | ||
if (PreRegAlloc) | ||
PA.preserve<MachineCycleAnalysis>(); | ||
PA.preserve<MachineLoopAnalysis>(); | ||
return PA; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this check so specific? Should it really be checking for invariant loads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it's unnecessary. I'll check it later.