Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion llvm/include/llvm/CodeGen/SelectionDAG.h
Original file line number Diff line number Diff line change
Expand Up @@ -1959,6 +1959,10 @@ class SelectionDAG {
LLVM_ABI SDValue makeEquivalentMemoryOrdering(LoadSDNode *OldLoad,
SDValue NewMemOp);

/// Get all the nodes in their topological order without modifying any states.
LLVM_ABI void getTopologicallyOrderedNodes(
SmallVectorImpl<const SDNode *> &SortedNodes) const;

/// Topological-sort the AllNodes list and a
/// assign a unique node id for each node in the DAG based on their
/// topological order. Returns the number of nodes.
Expand Down Expand Up @@ -2009,7 +2013,9 @@ class SelectionDAG {
/// function mirrors \c llvm::salvageDebugInfo.
LLVM_ABI void salvageDebugInfo(SDNode &N);

LLVM_ABI void dump() const;
/// Dump the textual format of this DAG. Print nodes in sorted orders if \p
/// Sorted is true.
LLVM_ABI void dump(bool Sorted = false) const;

/// In most cases this function returns the ABI alignment for a given type,
/// except for illegal vector types where the alignment exceeds that of the
Expand Down
129 changes: 49 additions & 80 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12610,87 +12610,56 @@ void SelectionDAG::ReplaceAllUsesOfValuesWith(const SDValue *From,
/// based on their topological order. It returns the maximum id and a vector
/// of the SDNodes* in assigned order by reference.
unsigned SelectionDAG::AssignTopologicalOrder() {
unsigned DAGSize = 0;

// SortedPos tracks the progress of the algorithm. Nodes before it are
// sorted, nodes after it are unsorted. When the algorithm completes
// it is at the end of the list.
allnodes_iterator SortedPos = allnodes_begin();

// Visit all the nodes. Move nodes with no operands to the front of
// the list immediately. Annotate nodes that do have operands with their
// operand count. Before we do this, the Node Id fields of the nodes
// may contain arbitrary values. After, the Node Id fields for nodes
// before SortedPos will contain the topological sort index, and the
// Node Id fields for nodes At SortedPos and after will contain the
// count of outstanding operands.
for (SDNode &N : llvm::make_early_inc_range(allnodes())) {
SmallVector<const SDNode *> SortedNodes(AllNodes.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this vector have any compile time cost on large DAGs?

Copy link
Member Author

@mshockwave mshockwave Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the very least, it'll be slower to not reserving enough space for all nodes ahead of time. But other than that, yeah the heap allocation time might add up if we're going to create this vector every time we dump a DAG. One solution is allowing users of this API to pass in a pre-allocated (auxiliary) vector to avoid the problem we're discussing here, what do you think?

EDIT: I thought this is the dump API, but I think having a pre-allocated space might still be an option.

Copy link
Collaborator

@topperc topperc Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that this added a vector to the non-dumping path too. Directly related to the downside you listed

The only downside I can think of is that while AssignTopologicalOrder sorts nodes in-place, getTopologicallyOrderedNodes uses auxiliary spaces to store nodes and metadata.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can also leave AssignTopologicalOrder untouched and only use getTopologicallyOrderedNodes in SelectionDAG::dump

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've reverted the code of AssignTopologicalOrder to its original one.

getTopologicallyOrderedNodes(SortedNodes);

for (auto [Idx, ConstN] : enumerate(SortedNodes)) {
auto *N = const_cast<SDNode *>(ConstN);
N->setNodeId(Idx);
if (N->getIterator() != std::prev(allnodes_end()))
AllNodes.push_back(AllNodes.remove(N));
}

return SortedNodes.size();
}

void SelectionDAG::getTopologicallyOrderedNodes(
SmallVectorImpl<const SDNode *> &SortedNodes) const {
SortedNodes.clear();
// Node -> remaining number of outstanding operands.
DenseMap<const SDNode *, unsigned> RemainingOperands;

// Put nodes without any operands into SortedNodes first.
for (const SDNode &N : allnodes()) {
checkForCycles(&N, this);
unsigned Degree = N.getNumOperands();
if (Degree == 0) {
// A node with no uses, add it to the result array immediately.
N.setNodeId(DAGSize++);
allnodes_iterator Q(&N);
if (Q != SortedPos)
SortedPos = AllNodes.insert(SortedPos, AllNodes.remove(Q));
assert(SortedPos != AllNodes.end() && "Overran node list");
++SortedPos;
} else {
// Temporarily use the Node Id as scratch space for the degree count.
N.setNodeId(Degree);
}
}

// Visit all the nodes. As we iterate, move nodes into sorted order,
// such that by the time the end is reached all nodes will be sorted.
for (SDNode &Node : allnodes()) {
SDNode *N = &Node;
checkForCycles(N, this);
// N is in sorted position, so all its uses have one less operand
// that needs to be sorted.
for (SDNode *P : N->users()) {
unsigned Degree = P->getNodeId();
assert(Degree != 0 && "Invalid node degree");
--Degree;
if (Degree == 0) {
// All of P's operands are sorted, so P may sorted now.
P->setNodeId(DAGSize++);
if (P->getIterator() != SortedPos)
SortedPos = AllNodes.insert(SortedPos, AllNodes.remove(P));
assert(SortedPos != AllNodes.end() && "Overran node list");
++SortedPos;
} else {
// Update P's outstanding operand count.
P->setNodeId(Degree);
}
}
if (Node.getIterator() == SortedPos) {
#ifndef NDEBUG
allnodes_iterator I(N);
SDNode *S = &*++I;
dbgs() << "Overran sorted position:\n";
S->dumprFull(this); dbgs() << "\n";
dbgs() << "Checking if this is due to cycles\n";
checkForCycles(this, true);
#endif
llvm_unreachable(nullptr);
}
}

assert(SortedPos == AllNodes.end() &&
"Topological sort incomplete!");
assert(AllNodes.front().getOpcode() == ISD::EntryToken &&
"First node in topological sort is not the entry token!");
assert(AllNodes.front().getNodeId() == 0 &&
"First node in topological sort has non-zero id!");
assert(AllNodes.front().getNumOperands() == 0 &&
"First node in topological sort has operands!");
assert(AllNodes.back().getNodeId() == (int)DAGSize-1 &&
"Last node in topologic sort has unexpected id!");
assert(AllNodes.back().use_empty() &&
"Last node in topologic sort has users!");
assert(DAGSize == allnodes_size() && "Node count mismatch!");
return DAGSize;
unsigned NumOperands = N.getNumOperands();
if (NumOperands == 0)
SortedNodes.push_back(&N);
else
// Record their total number of outstanding operands.
RemainingOperands[&N] = NumOperands;
}

// A node is pushed into SortedNodes when all of its operands (predecessors in
// the graph) are also in SortedNodes.
for (unsigned i = 0U; i < SortedNodes.size(); ++i) {
const SDNode *N = SortedNodes[i];
for (const SDNode *U : N->users()) {
unsigned &NumRemOperands = RemainingOperands[U];
assert(NumRemOperands && "Invalid number of remaining operands");
--NumRemOperands;
if (!NumRemOperands)
SortedNodes.push_back(U);
}
}

assert(SortedNodes.size() == AllNodes.size() && "Node count mismatch");
assert(SortedNodes.front()->getOpcode() == ISD::EntryToken &&
"First node in topological sort is not the entry token");
assert(SortedNodes.front()->getNumOperands() == 0 &&
"First node in topological sort has operands");
assert(SortedNodes.back()->use_empty() &&
"Last node in topologic sort has users");
}

/// AddDbgValue - Add a dbg_value SDNode. If SD is non-null that means the
Expand Down
14 changes: 12 additions & 2 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1061,13 +1061,23 @@ static void DumpNodes(const SDNode *N, unsigned indent, const SelectionDAG *G) {
N->dump(G);
}

LLVM_DUMP_METHOD void SelectionDAG::dump() const {
LLVM_DUMP_METHOD void SelectionDAG::dump(bool Sorted) const {
dbgs() << "SelectionDAG has " << AllNodes.size() << " nodes:\n";

for (const SDNode &N : allnodes()) {
auto dumpEachNode = [this](const SDNode &N) {
if (!N.hasOneUse() && &N != getRoot().getNode() &&
(!shouldPrintInline(N, this) || N.use_empty()))
DumpNodes(&N, 2, this);
};

if (Sorted) {
SmallVector<const SDNode *> SortedNodes(AllNodes.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is initializing AllNodes.size() elements in the vector so it will perform a memset. getTopologicallyOrderedNodes will clear the vector which will change the size, but not the capacity back to 0. Should we use reserve instead to avoid the memset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It's updated now.

getTopologicallyOrderedNodes(SortedNodes);
for (const SDNode *N : SortedNodes)
dumpEachNode(*N);
} else {
for (const SDNode &N : allnodes())
dumpEachNode(N);
}

if (getRoot().getNode()) DumpNodes(getRoot().getNode(), 2, this);
Expand Down
25 changes: 15 additions & 10 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,11 @@ UseMBPI("use-mbpi",
cl::init(true), cl::Hidden);

#ifndef NDEBUG
static cl::opt<bool>
DumpSortedDAG("dump-sorted-dags", cl::Hidden,
cl::desc("Print DAGs with sorted nodes in debug dump"),
cl::init(false));

static cl::opt<std::string>
FilterDAGBasicBlockName("filter-view-dags", cl::Hidden,
cl::desc("Only display the basic block whose name "
Expand Down Expand Up @@ -932,7 +937,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nInitial selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -952,7 +957,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nOptimized lowered selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -974,7 +979,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nType-legalized selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -998,7 +1003,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nOptimized type-legalized selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -1016,7 +1021,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nVector-legalized selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -1032,7 +1037,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nVector/type-legalized selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -1052,7 +1057,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nOptimized vector-legalized selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -1072,7 +1077,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nLegalized selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -1092,7 +1097,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nOptimized legalized selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

#if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
if (TTI->hasBranchDivergence())
Expand All @@ -1116,7 +1121,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
ISEL_DUMP(dbgs() << "\nSelected selection DAG: "
<< printMBBReference(*FuncInfo->MBB) << " '" << BlockName
<< "'\n";
CurDAG->dump());
CurDAG->dump(DumpSortedDAG));

if (ViewSchedDAGs && MatchFilterBB)
CurDAG->viewGraph("scheduler input for " + BlockName);
Expand Down