Skip to content

Conversation

@gulfemsavrun
Copy link
Contributor

@gulfemsavrun gulfemsavrun commented Sep 16, 2025

This patch significantly optimizes the LiveElementPrinter
by replacing a slow linear search with efficient hash map
lookups. It refactors the code to use a map-based system
for tracking live element addresses and managing column
assignments, leading to a major performance improvement
for large binaries.

@llvmbot
Copy link
Member

llvmbot commented Sep 16, 2025

@llvm/pr-subscribers-llvm-binary-utilities

Author: None (gulfemsavrun)

Changes

This patch significantly optimizes the LiveElementPrinter by replacing a slow linear search with efficient hash map lookups. It refactors the code to use a map-based system for tracking live element addresses and managing column assignments, leading to a major performance improvement for large binaries.


Full diff: https://github.com/llvm/llvm-project/pull/158763.diff

2 Files Affected:

  • (modified) llvm/tools/llvm-objdump/SourcePrinter.cpp (+218-52)
  • (modified) llvm/tools/llvm-objdump/SourcePrinter.h (+26-6)
diff --git a/llvm/tools/llvm-objdump/SourcePrinter.cpp b/llvm/tools/llvm-objdump/SourcePrinter.cpp
index b0ff89da97123..6dd559142a25f 100644
--- a/llvm/tools/llvm-objdump/SourcePrinter.cpp
+++ b/llvm/tools/llvm-objdump/SourcePrinter.cpp
@@ -52,6 +52,8 @@ void InlinedFunction::printElementLine(raw_ostream &OS,
                                        bool IsEnd) const {
   bool LiveIn = !IsEnd && Range.LowPC == Addr.Address;
   bool LiveOut = IsEnd && Range.HighPC == Addr.Address;
+  // This check is technically redundant as the function is only called when
+  // either a start or end address matches, but it serves as a small safeguard.
   if (!(LiveIn || LiveOut))
     return;
 
@@ -126,8 +128,18 @@ void LiveElementPrinter::addInlinedFunction(DWARFDie FuncDie,
   DWARFUnit *U = InlinedFuncDie.getDwarfUnit();
   const char *InlinedFuncName = InlinedFuncDie.getName(DINameKind::LinkageName);
   DWARFAddressRange Range{FuncLowPC, FuncHighPC, SectionIndex};
+  // Add the new element to the main vector.
   LiveElements.emplace_back(std::make_unique<InlinedFunction>(
       InlinedFuncName, U, FuncDie, InlinedFuncDie, Range));
+  // Map the element's low address (LowPC) to its pointer for fast range start
+  // lookup.
+  LiveElementsByAddress.emplace(FuncLowPC, LiveElements.back().get());
+  // Map the element's high address (HighPC) to its pointer for fast range end
+  // lookup.
+  LiveElementsByEndAddress.emplace(FuncHighPC, LiveElements.back().get());
+  // Map the pointer to its DWARF discovery index (ElementIdx) for deterministic
+  // ordering.
+  ElementPtrToIndex[LiveElements.back().get()] = LiveElements.size() - 1;
 }
 
 void LiveElementPrinter::addVariable(DWARFDie FuncDie, DWARFDie VarDie) {
@@ -147,9 +159,9 @@ void LiveElementPrinter::addVariable(DWARFDie FuncDie, DWARFDie VarDie) {
   }
 
   for (const DWARFLocationExpression &LocExpr : *Locs) {
+    std::unique_ptr<LiveVariable> NewVar;
     if (LocExpr.Range) {
-      LiveElements.emplace_back(
-          std::make_unique<LiveVariable>(LocExpr, VarName, U, FuncDie));
+      NewVar = std::make_unique<LiveVariable>(LocExpr, VarName, U, FuncDie);
     } else {
       // If the LocExpr does not have an associated range, it is valid for
       // the whole of the function.
@@ -157,8 +169,24 @@ void LiveElementPrinter::addVariable(DWARFDie FuncDie, DWARFDie VarDie) {
       // LocExpr, does that happen in reality?
       DWARFLocationExpression WholeFuncExpr{
           DWARFAddressRange(FuncLowPC, FuncHighPC, SectionIndex), LocExpr.Expr};
-      LiveElements.emplace_back(
-          std::make_unique<LiveVariable>(WholeFuncExpr, VarName, U, FuncDie));
+      NewVar =
+          std::make_unique<LiveVariable>(WholeFuncExpr, VarName, U, FuncDie);
+    }
+
+    // Add the new variable to all the data structures.
+    if (NewVar) {
+      LiveElements.emplace_back(std::move(NewVar));
+      LiveVariable *CurrentVar =
+          static_cast<LiveVariable *>(LiveElements.back().get());
+      // Map from a LiveElement pointer to its index in the LiveElements vector.
+      ElementPtrToIndex.emplace(CurrentVar, LiveElements.size() - 1);
+      if (CurrentVar->getLocExpr().Range) {
+        // Add the variable to address-based maps.
+        LiveElementsByAddress.emplace(CurrentVar->getLocExpr().Range->LowPC,
+                                      CurrentVar);
+        LiveElementsByEndAddress.emplace(CurrentVar->getLocExpr().Range->HighPC,
+                                         CurrentVar);
+      }
     }
   }
 }
@@ -205,14 +233,56 @@ unsigned LiveElementPrinter::moveToFirstVarColumn(formatted_raw_ostream &OS) {
   return FirstUnprintedLogicalColumn;
 }
 
-unsigned LiveElementPrinter::findFreeColumn() {
-  for (unsigned ColIdx = 0; ColIdx < ActiveCols.size(); ++ColIdx)
-    if (!ActiveCols[ColIdx].isActive())
-      return ColIdx;
+unsigned LiveElementPrinter::getOrCreateColumn(unsigned ElementIdx) {
+  // Check if the element already has an assigned column.
+  auto it = ElementToColumn.find(ElementIdx);
+  if (it != ElementToColumn.end()) {
+    return it->second;
+  }
+
+  unsigned ColIdx;
+  if (!FreeCols.empty()) {
+    // Get the smallest available index from the set.
+    ColIdx = *FreeCols.begin();
+    // Remove the index from the set.
+    FreeCols.erase(FreeCols.begin());
+  } else {
+    // No free columns, so create a new one.
+    ColIdx = ActiveCols.size();
+    ActiveCols.emplace_back();
+  }
+
+  // Assign the element to the column and update the map.
+  ElementToColumn[ElementIdx] = ColIdx;
+  ActiveCols[ColIdx].ElementIdx = ElementIdx;
+  return ColIdx;
+}
+
+void LiveElementPrinter::freeColumn(unsigned ColIdx) {
+  unsigned ElementIdx = ActiveCols[ColIdx].ElementIdx;
 
-  size_t OldSize = ActiveCols.size();
-  ActiveCols.grow(std::max<size_t>(OldSize * 2, 1));
-  return OldSize;
+  // Clear the column's data and add it to the free list.
+  ActiveCols[ColIdx].ElementIdx = Column::NullElementIdx;
+  ActiveCols[ColIdx].LiveIn = false;
+  ActiveCols[ColIdx].LiveOut = false;
+  ActiveCols[ColIdx].MustDrawLabel = false;
+
+  // Remove the element's entry from the map and add the column to the free
+  // list.
+  ElementToColumn.erase(ElementIdx);
+  FreeCols.insert(ColIdx);
+}
+
+std::vector<unsigned>
+LiveElementPrinter::getSortedActiveElementIndices() const {
+  // Get all ElementIdx values that currently have an assigned column.
+  std::vector<unsigned> Indices;
+  for (const auto &Pair : ElementToColumn)
+    Indices.push_back(Pair.first);
+
+  // Sort by ElementIdx, which is the DWARF discovery order.
+  llvm::stable_sort(Indices);
+  return Indices;
 }
 
 void LiveElementPrinter::dump() const {
@@ -239,57 +309,109 @@ void LiveElementPrinter::addCompileUnit(DWARFDie D) {
 void LiveElementPrinter::update(object::SectionedAddress ThisAddr,
                                 object::SectionedAddress NextAddr,
                                 bool IncludeDefinedVars) {
-  // Do not create live ranges when debug-inlined-funcs option is provided with
-  // line format option.
+  // Exit early if only printing function limits.
   if (DbgInlinedFunctions == DFLimitsOnly)
     return;
 
-  // First, check variables which have already been assigned a column, so
-  // that we don't change their order.
-  SmallSet<unsigned, 8> CheckedElementIdxs;
+  // Free columns identified in the previous cycle.
+  for (unsigned ColIdx : ColumnsToFreeNextCycle)
+    freeColumn(ColIdx);
+  ColumnsToFreeNextCycle.clear();
+
+  // Update status of active columns and collect those to free next cycle.
   for (unsigned ColIdx = 0, End = ActiveCols.size(); ColIdx < End; ++ColIdx) {
     if (!ActiveCols[ColIdx].isActive())
       continue;
 
-    CheckedElementIdxs.insert(ActiveCols[ColIdx].ElementIdx);
     const std::unique_ptr<LiveElement> &LE =
         LiveElements[ActiveCols[ColIdx].ElementIdx];
     ActiveCols[ColIdx].LiveIn = LE->liveAtAddress(ThisAddr);
     ActiveCols[ColIdx].LiveOut = LE->liveAtAddress(NextAddr);
-    std::string Name = Demangle ? demangle(LE->getName()) : LE->getName();
-    LLVM_DEBUG(dbgs() << "pass 1, " << ThisAddr.Address << "-"
-                      << NextAddr.Address << ", " << Name << ", Col " << ColIdx
-                      << ": LiveIn=" << ActiveCols[ColIdx].LiveIn
-                      << ", LiveOut=" << ActiveCols[ColIdx].LiveOut << "\n");
-
-    if (!ActiveCols[ColIdx].LiveIn && !ActiveCols[ColIdx].LiveOut)
-      ActiveCols[ColIdx].ElementIdx = Column::NullElementIdx;
+
+    LLVM_DEBUG({
+      std::string Name = Demangle ? demangle(LE->getName()) : LE->getName();
+      dbgs() << "pass 1, " << ThisAddr.Address << "-" << NextAddr.Address
+             << ", " << Name << ", Col " << ColIdx
+             << ": LiveIn=" << ActiveCols[ColIdx].LiveIn
+             << ", LiveOut=" << ActiveCols[ColIdx].LiveOut << "\n";
+    });
+
+    // Mark for cleanup in the next cycle if range ends here.
+    if (ActiveCols[ColIdx].LiveIn && !ActiveCols[ColIdx].LiveOut)
+      ColumnsToFreeNextCycle.push_back(ColIdx);
   }
 
   // Next, look for variables which don't already have a column, but which
-  // are now live.
+  // are now live (those starting at ThisAddr or NextAddr).
   if (IncludeDefinedVars) {
-    for (unsigned ElementIdx = 0, End = LiveElements.size(); ElementIdx < End;
-         ++ElementIdx) {
-      if (CheckedElementIdxs.count(ElementIdx))
+    // Collect all elements starting at ThisAddr and NextAddr.
+    std::vector<std::pair<unsigned, LiveElement *>> NewLiveElements;
+    // Process elements from a map range and add them to NewLiveElements.
+    auto CollectNewElements = [&](const auto &Range) {
+      for (auto it = Range.first; it != Range.second; ++it) {
+        LiveElement *LE = it->second;
+
+        // Get the ElementIdx for sorting and column management.
+        auto IndexIt = ElementPtrToIndex.find(LE);
+        if (IndexIt == ElementPtrToIndex.end()) {
+          LLVM_DEBUG(
+              dbgs()
+              << "Error: LiveElement in map but not in ElementPtrToIndex!\n");
+          continue;
+        }
+
+        unsigned ElementIdx = IndexIt->second;
+        // Skip elements that already have a column.
+        if (ElementToColumn.count(ElementIdx))
+          continue;
+
+        bool LiveIn = LE->liveAtAddress(ThisAddr);
+        bool LiveOut = LE->liveAtAddress(NextAddr);
+        if (!LiveIn && !LiveOut)
+          continue;
+
+        NewLiveElements.emplace_back(ElementIdx, LE);
+      }
+    };
+
+    // Collect elements starting at ThisAddr.
+    CollectNewElements(LiveElementsByAddress.equal_range(ThisAddr.Address));
+    // Collect elements starting at NextAddr (the address immediately following
+    // the instruction).
+    CollectNewElements(LiveElementsByAddress.equal_range(NextAddr.Address));
+    // Sort elements by DWARF discovery order (ElementIdx) for deterministic
+    // column assignment.
+    llvm::stable_sort(NewLiveElements, [](const auto &A, const auto &B) {
+      return A.first < B.first;
+    });
+
+    // Assign columns in deterministic order.
+    for (const auto &ElementPair : NewLiveElements) {
+      unsigned ElementIdx = ElementPair.first;
+      // Skip if element was already added from the first range.
+      if (ElementToColumn.count(ElementIdx))
         continue;
 
-      const std::unique_ptr<LiveElement> &LE = LiveElements[ElementIdx];
+      LiveElement *LE = ElementPair.second;
       bool LiveIn = LE->liveAtAddress(ThisAddr);
       bool LiveOut = LE->liveAtAddress(NextAddr);
-      if (!LiveIn && !LiveOut)
-        continue;
 
-      unsigned ColIdx = findFreeColumn();
-      std::string Name = Demangle ? demangle(LE->getName()) : LE->getName();
-      LLVM_DEBUG(dbgs() << "pass 2, " << ThisAddr.Address << "-"
-                        << NextAddr.Address << ", " << Name << ", Col "
-                        << ColIdx << ": LiveIn=" << LiveIn
-                        << ", LiveOut=" << LiveOut << "\n");
-      ActiveCols[ColIdx].ElementIdx = ElementIdx;
+      // Assign or create a column.
+      unsigned ColIdx = getOrCreateColumn(ElementIdx);
+      LLVM_DEBUG({
+        std::string Name = Demangle ? demangle(LE->getName()) : LE->getName();
+        dbgs() << "pass 2, " << ThisAddr.Address << "-" << NextAddr.Address
+               << ", " << Name << ", Col " << ColIdx << ": LiveIn=" << LiveIn
+               << ", LiveOut=" << LiveOut << "\n";
+      });
+
       ActiveCols[ColIdx].LiveIn = LiveIn;
       ActiveCols[ColIdx].LiveOut = LiveOut;
       ActiveCols[ColIdx].MustDrawLabel = true;
+
+      // Mark for cleanup next cycle if range ends here.
+      if (ActiveCols[ColIdx].LiveIn && !ActiveCols[ColIdx].LiveOut)
+        ColumnsToFreeNextCycle.push_back(ColIdx);
     }
   }
 }
@@ -360,7 +482,14 @@ void LiveElementPrinter::printAfterOtherLine(formatted_raw_ostream &OS,
 void LiveElementPrinter::printBetweenInsts(formatted_raw_ostream &OS,
                                            bool MustPrint) {
   bool PrintedSomething = false;
-  for (unsigned ColIdx = 0, End = ActiveCols.size(); ColIdx < End; ++ColIdx) {
+  // Get all active elements, sorted by discovery order (ElementIdx).
+  std::vector<unsigned> SortedElementIndices = getSortedActiveElementIndices();
+  // The outer loop iterates over the deterministic DWARF discovery order
+  // (ElementIdx).
+  for (unsigned ElementIdx : SortedElementIndices) {
+    // Look up the physical column index (ColIdx) assigned to this
+    // element. We use .at() because we are certain the element is active.
+    unsigned ColIdx = ElementToColumn.at(ElementIdx);
     if (ActiveCols[ColIdx].isActive() && ActiveCols[ColIdx].MustDrawLabel) {
       // First we need to print the live range markers for any active
       // columns to the left of this one.
@@ -375,8 +504,7 @@ void LiveElementPrinter::printBetweenInsts(formatted_raw_ostream &OS,
           OS << "  ";
       }
 
-      const std::unique_ptr<LiveElement> &LE =
-          LiveElements[ActiveCols[ColIdx].ElementIdx];
+      const std::unique_ptr<LiveElement> &LE = LiveElements[ElementIdx];
       // Then print the variable name and location of the new live range,
       // with box drawing characters joining it to the live range line.
       OS << getLineChar(ActiveCols[ColIdx].LiveIn ? LineChar::LabelCornerActive
@@ -440,20 +568,58 @@ void LiveElementPrinter::printAfterInst(formatted_raw_ostream &OS) {
 
 void LiveElementPrinter::printStartLine(formatted_raw_ostream &OS,
                                         object::SectionedAddress Addr) {
-  // Print a line to idenfity the start of an inlined function if line format
-  // is specified.
-  if (DbgInlinedFunctions == DFLimitsOnly)
-    for (const std::unique_ptr<LiveElement> &LE : LiveElements)
-      LE->printElementLine(OS, Addr, false);
+  // Only print the start line for inlined functions if DFLimitsOnly is
+  // enabled.
+  if (DbgInlinedFunctions != DFLimitsOnly)
+    return;
+
+  // Use the map to find all elements that start at the given address.
+  auto Range = LiveElementsByAddress.equal_range(Addr.Address);
+  std::vector<unsigned> ElementIndices;
+  for (auto it = Range.first; it != Range.second; ++it) {
+    LiveElement *LE = it->second;
+    // Look up the ElementIdx from the pointer.
+    auto IndexIt = ElementPtrToIndex.find(LE);
+    if (IndexIt != ElementPtrToIndex.end())
+      ElementIndices.push_back(IndexIt->second);
+  }
+
+  // Sort the indices to ensure deterministic output order (by DWARF discovery
+  // order).
+  llvm::stable_sort(ElementIndices);
+
+  for (unsigned ElementIdx : ElementIndices) {
+    LiveElement *LE = LiveElements[ElementIdx].get();
+    LE->printElementLine(OS, Addr, false);
+  }
 }
 
 void LiveElementPrinter::printEndLine(formatted_raw_ostream &OS,
                                       object::SectionedAddress Addr) {
-  // Print a line to idenfity the end of an inlined function if line format is
-  // specified.
-  if (DbgInlinedFunctions == DFLimitsOnly)
-    for (const std::unique_ptr<LiveElement> &LE : LiveElements)
-      LE->printElementLine(OS, Addr, true);
+  // Only print the end line for inlined functions if DFLimitsOnly is
+  // enabled.
+  if (DbgInlinedFunctions != DFLimitsOnly)
+    return;
+
+  // Use the map to find elements that end at the given address.
+  auto Range = LiveElementsByEndAddress.equal_range(Addr.Address);
+  std::vector<unsigned> ElementIndices;
+  for (auto it = Range.first; it != Range.second; ++it) {
+    LiveElement *LE = it->second;
+    // Look up the ElementIdx from the pointer.
+    auto IndexIt = ElementPtrToIndex.find(LE);
+    if (IndexIt != ElementPtrToIndex.end())
+      ElementIndices.push_back(IndexIt->second);
+  }
+
+  // Sort the indices to ensure deterministic output order (by DWARF discovery
+  // order).
+  llvm::stable_sort(ElementIndices);
+
+  for (unsigned ElementIdx : ElementIndices) {
+    LiveElement *LE = LiveElements[ElementIdx].get();
+    LE->printElementLine(OS, Addr, true);
+  }
 }
 
 bool SourcePrinter::cacheSource(const DILineInfo &LineInfo) {
diff --git a/llvm/tools/llvm-objdump/SourcePrinter.h b/llvm/tools/llvm-objdump/SourcePrinter.h
index 5c131a0eb1fd7..d70d9ea29d7b7 100644
--- a/llvm/tools/llvm-objdump/SourcePrinter.h
+++ b/llvm/tools/llvm-objdump/SourcePrinter.h
@@ -78,6 +78,7 @@ class LiveVariable : public LiveElement {
   bool liveAtAddress(object::SectionedAddress Addr) const override;
   void print(raw_ostream &OS, const MCRegisterInfo &MRI) const override;
   void dump(raw_ostream &OS) const override;
+  const DWARFLocationExpression &getLocExpr() const { return LocExpr; }
 };
 
 /// Helper class for printing source locations for variables and inlined
@@ -97,11 +98,22 @@ class LiveElementPrinter {
         std::numeric_limits<unsigned>::max();
   };
 
-  // All live elements we know about in the object/image file.
+  // Vector that owns all LiveElement objects for memory management.
   std::vector<std::unique_ptr<LiveElement>> LiveElements;
-
-  // The columns we are currently drawing.
-  IndexedMap<Column> ActiveCols;
+  // Map for fast lookup of live elements by their starting address (LowPC).
+  std::unordered_multimap<uint64_t, LiveElement *> LiveElementsByAddress;
+  // Map for fast lookup of live elements by their ending address (HighPC).
+  std::unordered_multimap<uint64_t, LiveElement *> LiveElementsByEndAddress;
+  // Map from a LiveElement pointer to its index in the LiveElements vector.
+  std::unordered_map<LiveElement *, unsigned> ElementPtrToIndex;
+  // Map from a live element index to column index for efficient lookup.
+  std::unordered_map<unsigned, unsigned> ElementToColumn;
+  // Vector of columns currently used for printing live ranges.
+  std::vector<Column> ActiveCols;
+  // Set of available column indices kept sorted for efficient reuse.
+  std::set<unsigned> FreeCols;
+  // Vector of available column indices that can be reused.
+  std::vector<unsigned> ColumnsToFreeNextCycle;
 
   const MCRegisterInfo &MRI;
   const MCSubtargetInfo &STI;
@@ -122,11 +134,19 @@ class LiveElementPrinter {
   // put live element lines. Pick a less overloaded word.
   unsigned moveToFirstVarColumn(formatted_raw_ostream &OS);
 
-  unsigned findFreeColumn();
+  // Get an existing column for a live element, or find a free one.
+  unsigned getOrCreateColumn(unsigned ElementIdx);
+
+  // Free a column when its element is no longer live.
+  void freeColumn(unsigned ColIdx);
+
+  // Returns the indices of all currently active elements, sorted by their DWARF
+  // discovery order (ElementIdx).
+  std::vector<unsigned> getSortedActiveElementIndices() const;
 
 public:
   LiveElementPrinter(const MCRegisterInfo &MRI, const MCSubtargetInfo &STI)
-      : ActiveCols(Column()), MRI(MRI), STI(STI) {}
+      : MRI(MRI), STI(STI) {}
 
   void dump() const;
 

This patch significantly optimizes the LiveElementPrinter
by replacing a slow linear search with efficient hash map
lookups. It refactors the code to use a map-based system
for tracking live element addresses and managing column
assignments, leading to a major performance improvement
for large binaries.
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea. Can you give some example performance numbers with and without this change, please? Both memory consumption and duration would be good, if possible.

One minor point: you seem to have added a lot of comments. Could I ask your motivation for doing so?

Comment on lines 55 to 56
// This check is technically redundant as the function is only called when
// either a start or end address matches, but it serves as a small safeguard.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it just be an assertion then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can completely remove this since the callers of this function ensure that.

@gulfemsavrun
Copy link
Contributor Author

Interesting idea. Can you give some example performance numbers with and without this change, please? Both memory consumption and duration would be good, if possible.

We attempted to enable this feature (--debug-inlined-funcs=limits-only) by default for developers performing low-level debugging (e.g., in the kernel). A developer testing it on the kernel binary reported a significant performance slowdown. I investigated the slowdown, which I initially suspected was slow debug info traversal. However, I found that the issue is actually inefficient linear traversal after profiling.

Kernel binary is one of our larger binaries, where we encounter over 200,000 LiveElements (inlined functions). This is the size of our kernel binary.

$ llvm-size kernel_x64/vmzircon
   text    data     bss     dec     hex filename
3976115   82381  351345 4409841  4349f1 kernel_x64/vmzircon

Checking every inlined function for every address lookup is the primary source of the high cost.

Here are the time measurements:

Default (No flag)

$ time llvm-objdump --disassemble kernel_x64/vmzircon
real    0m55.830s
user    0m5.147s
sys     0m50.670s

Linear search (--debug-inlined-funcs)

$ time llvm-objdump --disassemble --debug-inlined-funcs=limits-only kernel_x64/vmzircon
real    24m12.487s
user    23m53.948s
sys     0m17.230s

Optimized maps (--debug-inlined-funcs)

$ time llvm-objdump --disassemble --debug-inlined-funcs=limits-only kernel_x64/vmzircon
real    1m24.995s
user    0m24.895s
sys     1m0.056s

We have not observed this radical slowdown in smaller binaries, and the issue only manifests in a relatively large binary.

One minor point: you seem to have added a lot of comments. Could I ask your motivation for doing so?

I added extensive comments because the original authors are no longer active in this area. My goal was to make the code self-explanatory for any future developers working on related features. I am, however, happy to remove any comments you think are redundant.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Your numbers are certainly compelling. Is it possible to get peak memory usage too? I imagine the cost is small comparatively, but would be interesting to see them, since you'll be increasing the usage with this change (the age old memory versus speed trade-off).

I've started reviewing this, but will need to come back to finish off. I've left a couple of comments for you to get on with.

// Map for fast lookup of live elements by their ending address (HighPC).
std::unordered_multimap<uint64_t, LiveElement *> LiveElementsByEndAddress;
// Map from a LiveElement pointer to its index in the LiveElements vector.
std::unordered_map<LiveElement *, unsigned> ElementPtrToIndex;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you're better off with one of LLVM's map types, based on the discussion at https://llvm.org/docs/ProgrammersManual.html#other-map-like-container-options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. I replaced std::unordered_map with llvm::MapVector.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you forget to push? I'm still seeing std::unordered_map in a couple of places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is my mistake; my apologies. I replaced all instances of std::unordered_map in the recent version of the code.

@gulfemsavrun
Copy link
Contributor Author

gulfemsavrun commented Oct 1, 2025

Thanks. Your numbers are certainly compelling. Is it possible to get peak memory usage too? I imagine the cost is small comparatively, but would be interesting to see them, since you'll be increasing the usage with this change (the age old memory versus speed trade-off).

I used the /usr/bin/time -v utility to measure the Maximum Resident Set Size (RSS) in kilobytes, which I think represents the peak memory usage.

Default (No flag)

$ /usr/bin/time -v llvm-objdump --disassemble kernel_x64/vmzircon
Maximum resident set size (kbytes): 37,980

Linear search (--debug-inlined-funcs)

$ /usr/bin/time -v llvm-objdump --disassemble --debug-inlined-funcs=limits-only kernel_x64/vmzircon
Maximum resident set size (kbytes): 343,624

Optimized maps (--debug-inlined-funcs)

$ /usr/bin/time -v llvm-objdump --disassemble --debug-inlined-funcs=limits-only kernel_x64/vmzircon
Maximum resident set size (kbytes): 341,708

AFAICT, use of -debug-inlined-funcs flag, regardless of the underlying implementation, introduces a significant increase in peak memory usage over the default disassembly mode. However, the optimized maps approach does not introduce any additional memory cost compared to the original linear search method.

I've started reviewing this, but will need to come back to finish off. I've left a couple of comments for you to get on with.

Thank you very much!

@gulfemsavrun
Copy link
Contributor Author

I've started reviewing this, but will need to come back to finish off. I've left a couple of comments for you to get on with.

Thank you very much!

@jh7370 Have you had time to look over this?

@jh7370
Copy link
Collaborator

jh7370 commented Oct 31, 2025

I've started reviewing this, but will need to come back to finish off. I've left a couple of comments for you to get on with.

Thank you very much!

@jh7370 Have you had time to look over this?

My apologies, this fell off my review backlog for some reason. I'll try to take a look at it either later today or some time early next week.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies, I clicked the wrong window when posting an LGTM for a different PR. I haven't looked at this yet.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the performance figures. I'm surprised that the map approach is using (slightly) less memory!

// Map for fast lookup of live elements by their ending address (HighPC).
std::unordered_multimap<uint64_t, LiveElement *> LiveElementsByEndAddress;
// Map from a LiveElement pointer to its index in the LiveElements vector.
std::unordered_map<LiveElement *, unsigned> ElementPtrToIndex;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you forget to push? I'm still seeing std::unordered_map in a couple of places.

void freeColumn(unsigned ColIdx);

// Returns the indices of all currently active elements, sorted by their DWARF
// discovery order (ElementIdx).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the reference to ElementIdx means in the context of this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed all mentions of ElementIdx in the comments.

if (IndexIt == ElementPtrToIndex.end()) {
LLVM_DEBUG(
dbgs()
<< "Error: LiveElement in map but not in ElementPtrToIndex!\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be an assertion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I converted the check to an assertion.

return;

const std::vector<LiveElement *> &ElementList = It->second;
// Get the ElementIdx for sorting and column management.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than refer to the variable name (which you don't even use until later), just use the real term, i.e. something like "element index".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// Map the element's high address (HighPC) to its pointer for fast range end
// lookup.
LiveElementsByEndAddress[FuncHighPC].push_back(LiveElements.back().get());
// Map the pointer to its DWARF discovery index (ElementIdx) for deterministic
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as elsewhere "ElementIdx" in this context is not really a useful term. Better would be to say what is meant by "discovery index".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for (LiveElement *LE : It->second) {
// Look up the ElementIdx from the pointer.
auto IndexIt = ElementPtrToIndex.find(LE);
if (IndexIt != ElementPtrToIndex.end())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances can IndexIt be the end iterator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is more reasonable to convert this into an assertion. As long as live elements are correctly populated across all data structures, IndexIt will never equal the end iterator.

if (DbgInlinedFunctions == DFLimitsOnly)
for (const std::unique_ptr<LiveElement> &LE : LiveElements)
LE->printElementLine(OS, Addr, true);
// Only print the end line for inlined functions if DFLimitsOnly is
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is nearly identical to printStartLine, just with a different address map to look up in and a true instead of a false. Could you fold the business logic together and just pass these in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

unsigned LiveElementPrinter::getOrCreateColumn(unsigned ElementIdx) {
// Check if the element already has an assigned column.
auto it = ElementToColumn.find(ElementIdx);
if (it != ElementToColumn.end()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: coding standard for single line if says no braces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (LocExpr.Range) {
LiveElements.emplace_back(
std::make_unique<LiveVariable>(LocExpr, VarName, U, FuncDie));
NewVar = std::make_unique<LiveVariable>(LocExpr, VarName, U, FuncDie);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than creating a local variable here and in the else clause, then immediately moving it after the if/else, perhaps you could have a new function along the lines of createNewLiveVariable that creates the entry in LiveElements directly, like before, and adds it to the various data structures too? I think it would help reduce the size of this function, which is getting a bit big.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced registerNewVariable function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants