5 changes: 5 additions & 0 deletions llvm/include/llvm/CodeGen/TargetLowering.h
@@ -5132,6 +5132,11 @@ class LLVM_ABI TargetLowering : public TargetLoweringBase {
/// Memory, Other, Unknown.
TargetLowering::ConstraintType ConstraintType = TargetLowering::C_Unknown;

/// The register may be folded. This is used if the constraint is "rm",
/// where we prefer using a register, but can fall back to a memory slot
/// under register pressure.
bool MayFoldRegister = false;

Comment on lines +5135 to +5139
Contributor

Move to last field? Or could this be expressed as a new RegisterOrMemory ConstraintType?

Collaborator Author
@bwendling Jul 23, 2024
Sure, done.

We already encode the 'MayBeFolded' in the InlineAsm::Flag value. If we go with a C_RegisterOrMemory type we can reclaim that bit. However, it amounts to the same, and there's not a lot more we can do with it beyond "select one constraint and fall back to the other if the first one fails." I've toyed with the idea of having a completely new way of representing multi-constraints, but it's trickier than it first appears. For instance, we could encode all constraints in the INLINEASM machine instruction, selecting the "best" one during register allocation and discarding the rest. This runs into some issues though. First off, cleaning up the unused instructions might be tricky because it relies upon memory analysis, etc. The back end does DCE, but I'm worried that it wouldn't remove all instructions.

So I've been going down the rabbit hole that's been suggested in this PR: forcing the fast register allocator to spill all "foldable" registers. As you can imagine, this is easier said than done. There are some situations where the code necessary to support a memory constraint simply doesn't exist (i.e. it wasn't generated by the selection DAG builder) and needs to be replicated. It'd be great if there was a general way to generate these instructions, but I haven't found it. (If there is a place where it's done and I just haven't found it, please let me know.) So I've resorted to searching for said generated instructions, and if I don't find them I cobble together the best versions I can and cross my fingers. This works for several cases, but not for all, as you can imagine.

I'm planning on severely limiting the scope of this feature to "simple" inputs/outputs---no tied constraints, etc.---on X86 and calling it a day. It's a half-step forward, and hacky like nothing else, but the chewing gum and Flex TAPE should hold for the instances we actually care about...

/// If this is the result output operand or a clobber, this is null,
/// otherwise it is the incoming operand to the CallInst. This gets
/// modified as the asm is processed.
25 changes: 15 additions & 10 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -1014,7 +1014,8 @@ void RegsForValue::getCopyToRegs(SDValue Val, SelectionDAG &DAG,
}

void RegsForValue::AddInlineAsmOperands(InlineAsm::Kind Code, bool HasMatching,
unsigned MatchingIdx, const SDLoc &dl,
unsigned MatchingIdx,
bool MayFoldRegister, const SDLoc &dl,
SelectionDAG &DAG,
std::vector<SDValue> &Ops) const {
const TargetLowering &TLI = DAG.getTargetLoweringInfo();
@@ -1030,7 +1031,9 @@ void RegsForValue::AddInlineAsmOperands(InlineAsm::Kind Code, bool HasMatching,
// from the def.
const MachineRegisterInfo &MRI = DAG.getMachineFunction().getRegInfo();
const TargetRegisterClass *RC = MRI.getRegClass(Regs.front());

Flag.setRegClass(RC->getID());
Flag.setRegMayBeFolded(MayFoldRegister);
}

SDValue Res = DAG.getTargetConstant(Flag, dl, MVT::i32);
@@ -10132,8 +10135,8 @@ void SelectionDAGBuilder::visitInlineAsm(const CallBase &Call,
AsmNodeOperands.push_back(OpInfo.CallOperand);
} else {
// Otherwise, this outputs to a register (directly for C_Register /
// C_RegisterClass, and a target-defined fashion for
// C_Immediate/C_Other). Find a register that we can use.
// C_RegisterClass, and a target-defined fashion for C_Immediate /
// C_Other). Find a register that we can use.
if (OpInfo.AssignedRegs.Regs.empty()) {
emitInlineAsmError(
Call, "couldn't allocate output register for constraint '" +
@@ -10149,7 +10152,8 @@ void SelectionDAGBuilder::visitInlineAsm(const CallBase &Call,
OpInfo.AssignedRegs.AddInlineAsmOperands(
OpInfo.isEarlyClobber ? InlineAsm::Kind::RegDefEarlyClobber
: InlineAsm::Kind::RegDef,
false, 0, getCurSDLoc(), DAG, AsmNodeOperands);
false, 0, OpInfo.MayFoldRegister, getCurSDLoc(), DAG,
AsmNodeOperands);
}
break;

@@ -10191,9 +10195,9 @@ void SelectionDAGBuilder::visitInlineAsm(const CallBase &Call,
SDLoc dl = getCurSDLoc();
// Use the produced MatchedRegs object to
MatchedRegs.getCopyToRegs(InOperandVal, DAG, dl, Chain, &Glue, &Call);
MatchedRegs.AddInlineAsmOperands(InlineAsm::Kind::RegUse, true,
OpInfo.getMatchedOperand(), dl, DAG,
AsmNodeOperands);
MatchedRegs.AddInlineAsmOperands(
InlineAsm::Kind::RegUse, true, OpInfo.getMatchedOperand(),
OpInfo.MayFoldRegister, dl, DAG, AsmNodeOperands);
break;
}

@@ -10325,16 +10329,17 @@ void SelectionDAGBuilder::visitInlineAsm(const CallBase &Call,
&Call);

OpInfo.AssignedRegs.AddInlineAsmOperands(InlineAsm::Kind::RegUse, false,
0, dl, DAG, AsmNodeOperands);
0, OpInfo.MayFoldRegister, dl,
DAG, AsmNodeOperands);
break;
}
case InlineAsm::isClobber:
// Add the clobbered value to the operand list, so that the register
// allocator is aware that the physreg got clobbered.
if (!OpInfo.AssignedRegs.Regs.empty())
OpInfo.AssignedRegs.AddInlineAsmOperands(InlineAsm::Kind::Clobber,
false, 0, getCurSDLoc(), DAG,
AsmNodeOperands);
false, 0, false, getCurSDLoc(),
DAG, AsmNodeOperands);
break;
}
}
5 changes: 3 additions & 2 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
@@ -784,8 +784,9 @@ struct RegsForValue {
/// code marker, matching input operand index (if applicable), and includes
/// the number of values added into it.
void AddInlineAsmOperands(InlineAsm::Kind Code, bool HasMatching,
unsigned MatchingIdx, const SDLoc &dl,
SelectionDAG &DAG, std::vector<SDValue> &Ops) const;
unsigned MatchingIdx, bool MayFoldRegister,
const SDLoc &dl, SelectionDAG &DAG,
std::vector<SDValue> &Ops) const;

/// Check if the total RegCount is greater than one.
bool occupiesMultipleRegs() const {
30 changes: 29 additions & 1 deletion llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -23,6 +23,7 @@
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/SDPatternMatch.h"
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"
@@ -35,6 +36,7 @@
#include "llvm/Support/KnownBits.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/TargetParser/Triple.h"
#include <cctype>
#include <deque>
using namespace llvm;
@@ -5899,6 +5901,7 @@ TargetLowering::ParseConstraints(const DataLayout &DL,
unsigned ResNo = 0; // ResNo - The result number of the next output.
unsigned LabelNo = 0; // LabelNo - CallBr indirect dest number.

const Triple &T = getTargetMachine().getTargetTriple();
for (InlineAsm::ConstraintInfo &CI : IA->ParseConstraints()) {
ConstraintOperands.emplace_back(std::move(CI));
AsmOperandInfo &OpInfo = ConstraintOperands.back();
@@ -5909,6 +5912,16 @@

OpInfo.ConstraintVT = MVT::Other;

// Special treatment for platforms (currently only x86) that can fold a
// register into a spill slot. This is used for the "rm" constraint, where we
// would vastly prefer 'r' over 'm', but normally can't: LLVM picks the most
// "conservative" constraint to ensure that (in the case of "rm") register
// pressure doesn't cause bad things to happen.
if (T.isX86() && !OpInfo.hasMatchingInput() && OpInfo.Codes.size() == 2 &&
llvm::is_contained(OpInfo.Codes, "r") &&
llvm::is_contained(OpInfo.Codes, "m"))
OpInfo.MayFoldRegister = true;

// Compute the value type for each operand.
switch (OpInfo.Type) {
case InlineAsm::isOutput:
@@ -6189,14 +6202,29 @@ TargetLowering::ConstraintWeight
/// 1) If there is an 'other' constraint, and if the operand is valid for
/// that constraint, use it. This makes us take advantage of 'i'
/// constraints when available.
/// 2) Otherwise, pick the most general constraint present. This prefers
/// 2) Special processing is done for the "rm" constraint. If specified, we
/// opt for the 'r' constraint, but mark the operand as being "foldable."
/// In the face of register exhaustion, the register allocator is free to
/// choose to use a stack slot. This only applies to the greedy and default
/// register allocators. FIXME: Support other allocators (fast?).
/// 3) Otherwise, pick the most general constraint present. This prefers
/// 'm' over 'r', for example.
///
TargetLowering::ConstraintGroup TargetLowering::getConstraintPreferences(
TargetLowering::AsmOperandInfo &OpInfo) const {
ConstraintGroup Ret;

Ret.reserve(OpInfo.Codes.size());

// If we can fold the register (i.e. it has an "rm" constraint), opt for the
// 'r' constraint, and allow the register allocator to spill if need be.
// Applies only to the greedy and default register allocators.
if (OpInfo.MayFoldRegister) {
Ret.emplace_back(ConstraintPair("r", getConstraintType("r")));
Ret.emplace_back(ConstraintPair("m", getConstraintType("m")));
return Ret;
}

for (StringRef Code : OpInfo.Codes) {
TargetLowering::ConstraintType CType = getConstraintType(Code);
