dzhidzhoev
Member

Depends on:

With this change, DINodeInfoHolder is used to store abstract
and concrete out-of-line subprogram DIEs in DwarfInfoHolder.

Every definition subprogram DIE is associated with a corresponding
llvm::Function (declaration subprograms are associated with nullptr).
When a concrete subprogram DIE is queried via getOrCreateSubprogramDIE,
the corresponding llvm::Function should be provided. If none is provided:

  • DwarfUnit/DwarfTypeUnit falls back and returns any concrete DIE for
    the given DISubprogram,
  • DwarfCompileUnit is expected to return the abstract DIE.

This is a step toward supporting the attachment of a DISubprogram to multiple
llvm::Functions (to establish a one-to-one-to-many correspondence between
DISubprograms, abstract DIEs and function clones and, later,
to make the backend use uniqued DISubprograms).
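
The lookup rules above can be sketched with standard containers in place of LLVM's DenseMap. This is only an illustrative model, not the code in the patch: `SubprogramNode`, `IRFunction`, `Die`, `SubprogramDieTable` and its member names are hypothetical stand-ins for llvm::DISubprogram, llvm::Function, llvm::DIE and the holder class.

```cpp
#include <cassert>
#include <map>
#include <unordered_map>

// Hypothetical stand-ins for llvm::DISubprogram, llvm::Function and llvm::DIE.
struct SubprogramNode {};
struct IRFunction {};
struct Die {};

// Illustrative model only: one abstract DIE per subprogram, and one concrete
// DIE per (subprogram, function) pair.
class SubprogramDieTable {
  std::unordered_map<const SubprogramNode *, Die *> Abstract;
  std::unordered_map<const SubprogramNode *,
                     std::map<const IRFunction *, Die *>> Concrete;

public:
  void insertAbstract(const SubprogramNode *SP, Die *D) {
    Abstract.emplace(SP, D);
  }

  // F is nullptr for declaration subprograms.
  void insertConcrete(const SubprogramNode *SP, const IRFunction *F, Die *D) {
    Concrete[SP].emplace(F, D);
  }

  Die *getAbstract(const SubprogramNode *SP) const {
    auto It = Abstract.find(SP);
    return It == Abstract.end() ? nullptr : It->second;
  }

  Die *getConcrete(const SubprogramNode *SP, const IRFunction *F) const {
    auto It = Concrete.find(SP);
    if (It == Concrete.end())
      return nullptr;
    auto FIt = It->second.find(F);
    return FIt == It->second.end() ? nullptr : FIt->second;
  }

  Die *getAnyConcrete(const SubprogramNode *SP) const {
    auto It = Concrete.find(SP);
    if (It == Concrete.end() || It->second.empty())
      return nullptr;
    return It->second.begin()->second;
  }

  // DwarfUnit/DwarfTypeUnit-style query: without a function, any concrete DIE.
  Die *queryFromUnit(const SubprogramNode *SP, const IRFunction *F) const {
    return F ? getConcrete(SP, F) : getAnyConcrete(SP);
  }

  // DwarfCompileUnit-style query: without a function, the abstract DIE.
  Die *queryFromCompileUnit(const SubprogramNode *SP,
                            const IRFunction *F) const {
    return F ? getConcrete(SP, F) : getAbstract(SP);
  }
};
```

With two functions attached to the same subprogram, passing a function selects exactly that function's concrete DIE, while passing `nullptr` gives either some concrete DIE or the abstract DIE, depending on which kind of unit is asked.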

@llvmbot
Member

llvmbot commented Oct 10, 2025

@llvm/pr-subscribers-debuginfo

Author: Vladislav Dzhidzhoev (dzhidzhoev)

Changes

Patch is 27.12 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/162852.diff

7 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+10-5)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h (+9-17)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+9-6)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h (+2-1)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfFile.h (+143-33)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (+45-34)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h (+19-2)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 518121e200190..ba8daf7662319 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -178,7 +178,7 @@ unsigned DwarfCompileUnit::getOrCreateSourceID(const DIFile *File) {
 DIE *DwarfCompileUnit::getOrCreateGlobalVariableDIE(
     const DIGlobalVariable *GV, ArrayRef<GlobalExpr> GlobalExprs) {
   // Check for pre-existence.
-  if (DIE *Die = getDIE(GV))
+  if (DIE *Die = getDIEs(GV).getVariableDIE(GV))
     return Die;
 
   assert(GV);
@@ -795,7 +795,9 @@ DIE *DwarfCompileUnit::constructLexicalScopeDIE(LexicalScope *Scope) {
 
 DIE *DwarfCompileUnit::constructVariableDIE(DbgVariable &DV, bool Abstract) {
   auto *VariableDie = DIE::get(DIEValueAllocator, DV.getTag());
-  insertDIE(DV.getVariable(), VariableDie);
+  getDIEs(DV.getVariable())
+      .getLVs()
+      .insertDIE(DV.getVariable(), &DV, VariableDie, Abstract);
   DV.setDIE(*VariableDie);
   // Abstract variables don't get common attributes later, so apply them now.
   if (Abstract) {
@@ -1010,7 +1012,9 @@ DIE *DwarfCompileUnit::constructVariableDIE(DbgVariable &DV,
 DIE *DwarfCompileUnit::constructLabelDIE(DbgLabel &DL,
                                          const LexicalScope &Scope) {
   auto LabelDie = DIE::get(DIEValueAllocator, DL.getTag());
-  insertDIE(DL.getLabel(), LabelDie);
+  getDIEs(DL.getLabel())
+      .getLabels()
+      .insertDIE(DL.getLabel(), &DL, LabelDie, Scope.isAbstractScope());
   DL.setDIE(*LabelDie);
 
   if (Scope.isAbstractScope())
@@ -1472,8 +1476,9 @@ DIE *DwarfCompileUnit::getOrCreateImportedEntityDIE(
   return IMDie;
 }
 
-void DwarfCompileUnit::finishSubprogramDefinition(const DISubprogram *SP) {
-  DIE *D = getDIE(SP);
+void DwarfCompileUnit::finishSubprogramDefinition(const DISubprogram *SP,
+                                                  const Function *F) {
+  DIE *D = getDIEs(SP).getLocalScopes().getConcreteDIE(SP, F);
   if (DIE *AbsSPDIE = getAbstractScopeDIEs().lookup(SP)) {
     if (D)
       // If this subprogram has an abstract definition, reference that
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
index a3bbc8364599d..b0dcc3e432a03 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
@@ -79,16 +79,10 @@ class DwarfCompileUnit final : public DwarfUnit {
   // List of concrete lexical block scopes belong to subprograms within this CU.
   DenseMap<const DILocalScope *, DIE *> LexicalBlockDIEs;
 
-  // List of abstract local scopes (either DISubprogram or DILexicalBlock).
-  DenseMap<const DILocalScope *, DIE *> AbstractLocalScopeDIEs;
-  SmallPtrSet<const DISubprogram *, 8> FinalizedAbstractSubprograms;
-
   // List of inlined lexical block scopes that belong to subprograms within this
   // CU.
   DenseMap<const DILocalScope *, SmallVector<DIE *, 2>> InlinedLocalScopeDIEs;
 
-  DenseMap<const DINode *, std::unique_ptr<DbgEntity>> AbstractEntities;
-
   /// DWO ID for correlating skeleton and split units.
   uint64_t DWOId = 0;
 
@@ -126,22 +120,20 @@ class DwarfCompileUnit final : public DwarfUnit {
 
   bool isDwoUnit() const override;
 
-  DenseMap<const DILocalScope *, DIE *> &getAbstractScopeDIEs() {
-    if (isDwoUnit() && !DD->shareAcrossDWOCUs())
-      return AbstractLocalScopeDIEs;
-    return DU->getAbstractScopeDIEs();
+  DwarfInfoHolder &getDIEs(const DINode *N) { return DwarfUnit::getDIEs(N); }
+
+  DwarfInfoHolder &getDIEs() { return getDIEs(nullptr); }
+
+  DwarfInfoHolder::AbstractScopeMapT &getAbstractScopeDIEs() {
+    return getDIEs().getLocalScopes().getAbstractDIEs();
   }
 
   DenseMap<const DINode *, std::unique_ptr<DbgEntity>> &getAbstractEntities() {
-    if (isDwoUnit() && !DD->shareAcrossDWOCUs())
-      return AbstractEntities;
-    return DU->getAbstractEntities();
+    return getDIEs().getAbstractEntities();
   }
 
   auto &getFinalizedAbstractSubprograms() {
-    if (isDwoUnit() && !DD->shareAcrossDWOCUs())
-      return FinalizedAbstractSubprograms;
-    return DU->getFinalizedAbstractSubprograms();
+    return getDIEs().getFinalizedAbstractSubprograms();
   }
 
   void finishNonUnitTypeDIE(DIE& D, const DICompositeType *CTy) override;
@@ -327,7 +319,7 @@ class DwarfCompileUnit final : public DwarfUnit {
   DIE *getOrCreateImportedEntityDIE(const DIImportedEntity *IE);
   DIE *constructImportedEntityDIE(const DIImportedEntity *IE);
 
-  void finishSubprogramDefinition(const DISubprogram *SP);
+  void finishSubprogramDefinition(const DISubprogram *SP, const Function *F);
   void finishEntityDefinition(const DbgEntity *Entity);
   void attachLexicalScopesAbstractOrigins();
 
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index d751a7f9f01ef..5aa8b932facdc 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -502,7 +502,8 @@ void DwarfDebug::addSubprogramNames(
   // well into the name table. Only do that if we are going to actually emit
   // that name.
   if (LinkageName != "" && SP->getName() != LinkageName &&
-      (useAllLinkageNames() || InfoHolder.getAbstractScopeDIEs().lookup(SP)))
+      (useAllLinkageNames() ||
+       InfoHolder.getDIEs().getLocalScopes().getAbstractDIEs().lookup(SP)))
     addAccelName(Unit, NameTableKind, LinkageName, Die);
 
   // If this is an Objective-C selector name add it to the ObjC accelerator
@@ -1263,11 +1264,13 @@ void DwarfDebug::finishEntityDefinitions() {
 }
 
 void DwarfDebug::finishSubprogramDefinitions() {
-  for (const DISubprogram *SP : ProcessedSPNodes) {
+  for (auto SPF : ProcessedSPNodes) {
+    const DISubprogram *SP = SPF.first;
     assert(SP->getUnit()->getEmissionKind() != DICompileUnit::NoDebug);
-    forBothCUs(
-        getOrCreateDwarfCompileUnit(SP->getUnit()),
-        [&](DwarfCompileUnit &CU) { CU.finishSubprogramDefinition(SP); });
+    forBothCUs(getOrCreateDwarfCompileUnit(SP->getUnit()),
+               [&](DwarfCompileUnit &CU) {
+                 CU.finishSubprogramDefinition(SP, SPF.second);
+               });
   }
 }
 
@@ -2784,7 +2787,7 @@ void DwarfDebug::endFunctionImpl(const MachineFunction *MF) {
     constructAbstractSubprogramScopeDIE(TheCU, AScope);
   }
 
-  ProcessedSPNodes.insert(SP);
+  ProcessedSPNodes.insert(std::make_pair(SP, &F));
   DIE &ScopeDIE =
       TheCU.constructSubprogramScopeDIE(SP, F, FnScope, FunctionLineTableLabel);
   if (auto *SkelCU = TheCU.getSkeleton())
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
index 1a1b28a6fc035..42ac225e2d17e 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
@@ -373,7 +373,8 @@ class DwarfDebug : public DebugHandlerBase {
 
   /// This is a collection of subprogram MDNodes that are processed to
   /// create DIEs.
-  SmallSetVector<const DISubprogram *, 16> ProcessedSPNodes;
+  SmallSetVector<std::pair<const DISubprogram *, const Function *>, 16>
+      ProcessedSPNodes;
 
   /// Map function-local imported entities to their parent local scope
   /// (either DILexicalBlock or DISubprogram) for a processed function
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfFile.h b/llvm/lib/CodeGen/AsmPrinter/DwarfFile.h
index ef1524d875c84..94d4e5f0b7f05 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfFile.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfFile.h
@@ -15,9 +15,12 @@
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/CodeGen/DIE.h"
+#include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/Support/Allocator.h"
+#include <functional>
 #include <map>
 #include <memory>
+#include <optional>
 #include <utility>
 
 namespace llvm {
@@ -26,9 +29,6 @@ class AsmPrinter;
 class DbgEntity;
 class DbgVariable;
 class DbgLabel;
-class DINode;
-class DILocalScope;
-class DISubprogram;
 class DwarfCompileUnit;
 class DwarfUnit;
 class LexicalScope;
@@ -53,6 +53,144 @@ struct RangeSpanList {
   SmallVector<RangeSpan, 2> Ranges;
 };
 
+/// Tracks abstract and concrete DIEs for debug info entities of a certain type.
+template <typename DINodeT, typename DbgEntityT> class DINodeInfoHolder {
+public:
+  using AbstractMapT = DenseMap<const DINodeT *, DIE *>;
+  using ConcreteMapT =
+      DenseMap<const DINodeT *, SmallDenseMap<const DbgEntityT *, DIE *, 2>>;
+
+private:
+  AbstractMapT AbstractMap;
+  ConcreteMapT ConcreteMap;
+
+public:
+  void insertAbstractDIE(const DINodeT *N, DIE *D) {
+    auto [_, Inserted] = AbstractMap.try_emplace(N, D);
+    assert(Inserted && "Duplicate abstract DIE for debug info node");
+  }
+
+  void insertConcreteDIE(const DINodeT *N, const DbgEntityT *E, DIE *D) {
+    auto [_, Inserted] = ConcreteMap[N].try_emplace(E, D);
+    assert(Inserted && "Duplicate concrete DIE for debug info node");
+  }
+
+  void insertDIE(const DINodeT *N, const DbgEntityT *E, DIE *D, bool Abstract) {
+    if (Abstract)
+      insertAbstractDIE(N, D);
+    else
+      insertConcreteDIE(N, E, D);
+  }
+
+  DIE *getAbstractDIE(const DINodeT *N) const { return AbstractMap.lookup(N); }
+
+  std::optional<
+      std::reference_wrapper<const typename ConcreteMapT::mapped_type>>
+  getConcreteDIEs(const DINodeT *N) const {
+    if (auto I = ConcreteMap.find(N); I != ConcreteMap.end())
+      return std::make_optional(std::ref(I->second));
+    return std::nullopt;
+  }
+
+  DIE *getConcreteDIE(const DINodeT *N, const DbgEntityT *E) const {
+    if (auto I = getConcreteDIEs(N))
+      return I->get().lookup(E);
+    return nullptr;
+  }
+
+  DIE *getAnyConcreteDIE(const DINodeT *N) const {
+    if (auto I = getConcreteDIEs(N))
+      return I->get().empty() ? nullptr : I->get().begin()->second;
+    return nullptr;
+  }
+
+  /// Returns abstract DIE for the entity.
+  /// If no abstract DIE was created, returns any concrete DIE for the entity.
+  DIE *getDIE(const DINodeT *N) const {
+    if (DIE *D = getAbstractDIE(N))
+      return D;
+
+    return getAnyConcreteDIE(N);
+  }
+
+  AbstractMapT &getAbstractDIEs() { return AbstractMap; }
+};
+
+/// Tracks DIEs for debug info entities.
+/// These DIEs can be shared across CUs, that is why we keep the map here
+/// instead of in DwarfCompileUnit.
+class DwarfInfoHolder {
+public:
+  using LocalScopeHolderT = DINodeInfoHolder<DILocalScope, Function>;
+  using AbstractScopeMapT = LocalScopeHolderT::AbstractMapT;
+
+private:
+  /// DIEs of local DbgVariables.
+  DINodeInfoHolder<DILocalVariable, DbgVariable> LVHolder;
+  /// DIEs of labels.
+  DINodeInfoHolder<DILabel, DbgLabel> LabelHolder;
+  DenseMap<const DINode *, std::unique_ptr<DbgEntity>> AbstractEntities;
+  /// DIEs of abstract local scopes and concrete non-inlined subprograms.
+  /// Inlined subprograms and concrete lexical blocks are not stored here.
+  LocalScopeHolderT LSHolder;
+  /// Keeps track of abstract subprograms to populate them only once.
+  // FIXME: merge creation and population of abstract scopes.
+  SmallPtrSet<const DISubprogram *, 8> FinalizedAbstractSubprograms;
+
+  /// Other DINodes with the corresponding DIEs.
+  DenseMap<const DINode *, DIE *> MDNodeToDieMap;
+
+public:
+  void insertDIE(const DINode *N, DIE *Die) {
+    assert((!isa<DILabel>(N) && !isa<DILocalVariable>(N) &&
+            !isa<DILocalScope>(N)) &&
+           "Use getLabels().insertDIE() for labels or getLVs().insertDIE() for "
+           "local variables, or getSubprogram().insertDIE() for subprograms.");
+    auto [_, Inserted] = MDNodeToDieMap.try_emplace(N, Die);
+    assert((Inserted || isa<DIType>(N)) &&
+           "DIE for this DINode has already been added");
+  }
+
+  void insertDIE(DIE *D) { MDNodeToDieMap.try_emplace(nullptr, D); }
+
+  DIE *getDIE(const DINode *N) const {
+    DIE *D = MDNodeToDieMap.lookup(N);
+    assert((!D || (!isa<DILabel>(N) && !isa<DILocalVariable>(N) &&
+                   !isa<DILocalScope>(N))) &&
+           "Use getLabels().getDIE() for labels or getLVs().getDIE() for "
+           "local variables, or getLocalScopes().getDIE() for local scopes.");
+    return D;
+  }
+
+  auto &getLVs() { return LVHolder; }
+  auto &getLVs() const { return LVHolder; }
+
+  auto &getLabels() { return LabelHolder; }
+  auto &getLabels() const { return LabelHolder; }
+
+  auto &getLocalScopes() { return LSHolder; }
+  auto &getLocalScopes() const { return LSHolder; }
+
+  /// For a global variable, returns DIE of the variable.
+  ///
+  /// For a local variable, returns abstract DIE of the variable.
+  /// If no abstract DIE was created, returns any concrete DIE of the variable.
+  DIE *getVariableDIE(const DIVariable *V) const {
+    if (auto *LV = dyn_cast<DILocalVariable>(V))
+      if (DIE *D = getLVs().getDIE(LV))
+        return D;
+    return getDIE(V);
+  }
+
+  DenseMap<const DINode *, std::unique_ptr<DbgEntity>> &getAbstractEntities() {
+    return AbstractEntities;
+  }
+
+  auto &getFinalizedAbstractSubprograms() {
+    return FinalizedAbstractSubprograms;
+  }
+};
+
 class DwarfFile {
   // Target of Dwarf emission, used for sizing of abbreviations.
   AsmPrinter *Asm;
@@ -93,17 +231,7 @@ class DwarfFile {
   using LabelList = SmallVector<DbgLabel *, 4>;
   DenseMap<LexicalScope *, LabelList> ScopeLabels;
 
-  // Collection of abstract subprogram DIEs.
-  DenseMap<const DILocalScope *, DIE *> AbstractLocalScopeDIEs;
-  DenseMap<const DINode *, std::unique_ptr<DbgEntity>> AbstractEntities;
-  /// Keeps track of abstract subprograms to populate them only once.
-  // FIXME: merge creation and population of abstract scopes.
-  SmallPtrSet<const DISubprogram *, 8> FinalizedAbstractSubprograms;
-
-  /// Maps MDNodes for type system with the corresponding DIEs. These DIEs can
-  /// be shared across CUs, that is why we keep the map here instead
-  /// of in DwarfCompileUnit.
-  DenseMap<const MDNode *, DIE *> DITypeNodeToDieMap;
+  DwarfInfoHolder InfoHolder;
 
 public:
   DwarfFile(AsmPrinter *AP, StringRef Pref, BumpPtrAllocator &DA);
@@ -171,25 +299,7 @@ class DwarfFile {
     return ScopeLabels;
   }
 
-  DenseMap<const DILocalScope *, DIE *> &getAbstractScopeDIEs() {
-    return AbstractLocalScopeDIEs;
-  }
-
-  DenseMap<const DINode *, std::unique_ptr<DbgEntity>> &getAbstractEntities() {
-    return AbstractEntities;
-  }
-
-  auto &getFinalizedAbstractSubprograms() {
-    return FinalizedAbstractSubprograms;
-  }
-
-  void insertDIE(const MDNode *TypeMD, DIE *Die) {
-    DITypeNodeToDieMap.insert(std::make_pair(TypeMD, Die));
-  }
-
-  DIE *getDIE(const MDNode *TypeMD) {
-    return DITypeNodeToDieMap.lookup(TypeMD);
-  }
+  DwarfInfoHolder &getDIEs() { return InfoHolder; }
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index aa078f3f81d49..b0d0fa147b3fc 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -188,28 +188,16 @@ bool DwarfUnit::isShareableAcrossCUs(const DINode *D) const {
   // together.
   if (isDwoUnit() && !DD->shareAcrossDWOCUs())
     return false;
-  return (isa<DIType>(D) ||
-          (isa<DISubprogram>(D) && !cast<DISubprogram>(D)->isDefinition())) &&
-         !DD->generateTypeUnits();
-}
-
-DIE *DwarfUnit::getDIE(const DINode *D) const {
-  if (isShareableAcrossCUs(D))
-    return DU->getDIE(D);
-  return MDNodeToDieMap.lookup(D);
+  return !D || ((isa<DIType>(D) || (isa<DISubprogram>(D) &&
+                                    !cast<DISubprogram>(D)->isDefinition())) &&
+                !DD->generateTypeUnits());
 }
 
 void DwarfUnit::insertDIE(const DINode *Desc, DIE *D) {
-  if (isShareableAcrossCUs(Desc)) {
-    DU->insertDIE(Desc, D);
-    return;
-  }
-  MDNodeToDieMap.insert(std::make_pair(Desc, D));
+  getDIEs(Desc).insertDIE(Desc, D);
 }
 
-void DwarfUnit::insertDIE(DIE *D) {
-  MDNodeToDieMap.insert(std::make_pair(nullptr, D));
-}
+void DwarfUnit::insertDIE(DIE *D) { InfoHolder.insertDIE(D); }
 
 void DwarfUnit::addFlag(DIE &Die, dwarf::Attribute Attribute) {
   if (DD->getDwarfVersion() >= 4)
@@ -424,6 +412,14 @@ DIE &DwarfUnit::createAndAddDIE(dwarf::Tag Tag, DIE &Parent, const DINode *N) {
   return Die;
 }
 
+DIE &DwarfUnit::createAndAddSubprogramDIE(DIE &Parent, const DISubprogram *SP,
+                                          const Function *F) {
+  DIE &Die =
+      Parent.addChild(DIE::get(DIEValueAllocator, dwarf::DW_TAG_subprogram));
+  getDIEs(SP).getLocalScopes().insertConcreteDIE(SP, F, &Die);
+  return Die;
+}
+
 void DwarfUnit::addBlock(DIE &Die, dwarf::Attribute Attribute, DIELoc *Loc) {
   Loc->computeSize(Asm->getDwarfFormParams());
   DIELocs.push_back(Loc); // Memoize so we can call the destructor later on.
@@ -803,7 +799,7 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIStringType *STy) {
     addString(Buffer, dwarf::DW_AT_name, Name);
 
   if (DIVariable *Var = STy->getStringLength()) {
-    if (auto *VarDIE = getDIE(Var))
+    if (auto *VarDIE = getDIEs(Var).getVariableDIE(Var))
       addDIEEntry(Buffer, dwarf::DW_AT_string_length, *VarDIE);
   } else if (DIExpression *Expr = STy->getStringLengthExp()) {
     DIELoc *Loc = new (DIEValueAllocator) DIELoc;
@@ -1122,8 +1118,8 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
           constructTypeDIE(VariantPart, Composite);
         }
       } else if (Tag == dwarf::DW_TAG_namelist) {
-        auto *Var = dyn_cast<DINode>(Element);
-        auto *VarDIE = getDIE(Var);
+        auto *Var = dyn_cast<DIVariable>(Element);
+        auto *VarDIE = getDIEs(Var).getVariableDIE(Var);
         if (VarDIE) {
           DIE &ItemDie = createAndAddDIE(dwarf::DW_TAG_namelist_item, Buffer);
           addDIEEntry(ItemDie, dwarf::DW_AT_namelist_item, *VarDIE);
@@ -1185,7 +1181,7 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
       Tag == dwarf::DW_TAG_class_type || Tag == dwarf::DW_TAG_structure_type ||
       Tag == dwarf::DW_TAG_union_type) {
     if (auto *Var = dyn_cast_or_null<DIVariable>(CTy->getRawSizeInBits())) {
-      if (auto *VarDIE = getDIE(Var))
+      if (auto *VarDIE = getDIEs(Var).getVariableDIE(Var))
         addDIEEntry(Buffer, dwarf::DW_AT_bit_size, *VarDIE);
     } else if (auto *Exp =
                    dyn_cast_or_null<DIExpression>(CTy->getRawSizeInBits())) {
@@ -1340,6 +1336,19 @@ DIE *DwarfUnit::getOrCreateModule(const DIModule *M) {
   return &MDie;
 }
 
+DIE *DwarfUnit::getExistingSubprogramDIE(const DISubprogram *SP,
+                                         const Function *F) const {
+  if (!F) {
+    if (DIE *SPDie = getDIEs(SP).getLocalScopes().getAnyConcreteDIE(SP))
+      return SPDie;
+  } else {
+    if (DIE *SPDie = getDIEs(SP).getLocalScopes().getConcreteDIE(SP, F))
+      return SPDie;
+  }
+
+  return nullptr;
+}
+
 DIE *DwarfUnit::getOrCreateSubprogramDIE(const DISubprogram *SP,
                                          const Function *FnHint, bool Minimal) {
   // Construct the context before querying for the existence of the DIE in case
@@ -1348,7 +1357,7 @@ DIE *DwarfUnit::getOrCreateSubprogramDIE(const DISubprogram *SP,
   DIE *ContextDIE =
       getOrCreateSubprogramContextDIE(SP, shouldPlaceInUnitDIE(SP, Minimal));
 
-  if (DIE *SPDie = getDIE(SP))
+  if (DIE *SPDie = getExistingSubprogramDIE(SP, FnHint))
     return SPDie;
 
   if (auto *SPDecl = SP->getDeclaration()) {
@@ -1360,13 +1369,13 @@ DIE *DwarfUnit::getOrCreateSubprogramDIE(const DISubprogram *SP,
       // FIXME: Should the creation of definition subprogram DIE during
       // the creation of declaration subprogram DIE be allowed?
       // See https://github.com/llvm/llvm-project/pull/154636.
-      if (DIE *SPDie = getDIE(SP))
+      if (DIE *SPDie = getExistingSubprogramDIE(SP, FnHint))
         return SPDie;
     }
   }
 
   // DW_TAG_inlined_subroutine may refer to this DIE.
-  DIE &SPDie = createAndAddDIE(dwarf::DW_TAG_subprogram, *ContextDIE, SP);
+  DIE &SPDie = createAndAddSubprogramDIE(*ContextDIE, SP, FnHint);
 
   // Stop here and fill this in later, depending on whether or not this
   // subprogram turns out to have inlined instances or not.
@@ -1392,7 +1401,8 @@ bool DwarfUnit::applySubprogramDefinitionAttributes(const DISubprogram *SP,
         if (DefinitionArgs[0] != nullptr && DeclArgs[0] != DefinitionArgs[0])
           addType(SPDie, Definiti...
[truncated]

dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request Oct 10, 2025
…o multiple Functions

Depends on:
* llvm#152680
* llvm#162852

llvm#75385 (and its follow-ups) attempted to support
attaching local types to DILocalScopes and to store function-local types in
DISubprogram's `retainedNodes:` field.

That patch failed to land due to issues arising during the LTO process.
If two definition DISubprograms from different compile units
represent essentially the same source code function and share a
common local DICompositeType, and if this DICompositeType is uniqued
(due to the ODRUniquingDebugTypes feature), the subprograms end up
with a wrong retainedNodes list/scoping relationship.

To tackle this issue, in
llvm#142166, it was proposed
to force-unique all DISubprograms even if they don't contain ODR-uniqued types
(llvm#142166 (comment)).
This should establish a one-to-one-to-many relationship between
DISubprograms, abstract DIEs and function clones (from different CUs, in
the case of LTO).

To implement that, AsmPrinter should support correct emission of debug info
for DISubprograms attached to multiple functions. This is the goal of
this commit.

Here, LexicalScope's function map is changed to a multimap between a
DISubprogram and the (possibly multiple) functions attached to it.
LexicalScope is modified to create an abstract scope for a DISubprogram
that has multiple llvm::Function attachments.
`DwarfCompileUnit::getOrCreateSubprogramDIE` can recognize the case of a
DISubprogram attached to multiple Functions, and return the abstract DIE
when needed.
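
As a rough illustration of the one-to-many bookkeeping described above, the sketch below uses `std::unordered_multimap`. `SubprogramMD`, `FunctionIR` and `AttachmentMap` are hypothetical names, and the abstract-scope rule is modelled only for the multiple-attachment case; inlining, which also requires abstract scopes, is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-ins for llvm::DISubprogram and llvm::Function.
struct SubprogramMD {};
struct FunctionIR {};

// Illustrative model: a subprogram may be attached to several functions.
class AttachmentMap {
  std::unordered_multimap<const SubprogramMD *, const FunctionIR *> Map;

public:
  void attach(const SubprogramMD *SP, const FunctionIR *F) {
    Map.emplace(SP, F);
  }

  std::size_t numFunctions(const SubprogramMD *SP) const {
    return Map.count(SP);
  }

  // Assumption of this sketch: an abstract scope is requested as soon as a
  // subprogram has more than one function attached.
  bool needsAbstractScope(const SubprogramMD *SP) const {
    return numFunctions(SP) > 1;
  }
};
```

A subprogram with a single attachment keeps the existing one-to-one behaviour in this model; attaching a second function flips `needsAbstractScope` to true.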

CodeViewDebug is adapted as well. UDTs are ensured to be emitted properly in
the cases that are addressed here. Please let me know if more changes
to CodeView are needed, as I'm not very familiar with the format.
@dzhidzhoev dzhidzhoev self-assigned this Oct 10, 2025
`DwarfCompileUnit::constructVariableDIE()` and `constructLabelDIE()` are meant
for constructing both abstract and concrete DIEs of a DbgEntity. They use
`DwarfUnit::insertDIE()` to store a freshly created DIE. However,
`insertDIE()`/`DwarfUnit::DITypeNodeToDieMap` store only a single DIE per DINode.
If `insertDIE()` is called several times for the same instance of DINode, only
the first DIE is saved in `DwarfUnit::DITypeNodeToDieMap`, as follows from
the `DenseMap::insert()` specification.

This means that, depending on which is called first,
`DwarfCompileUnit::constructVariableDIE(LV, /* Abstract */ true)` or
`DwarfCompileUnit::constructVariableDIE(LV, /* Abstract */ false)`,
`DwarfUnit::DITypeNodeToDieMap` stores either the abstract or the concrete DIE of a
node.

This behavior makes the DwarfCompileUnit API obscure, as it depends on
function call order and leaves it unclear what `DwarfUnit::DITypeNodeToDieMap` is
meant to store.
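
The order dependence can be reproduced in isolation. The sketch below uses `std::unordered_map`, whose `insert()` likewise keeps the existing entry when the key is already present; the integer key and the string labels are stand-ins chosen for illustration and do not appear in the patch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a node -> DIE map; the string records which kind
// of DIE ended up stored for the node.
using NodeToDie = std::unordered_map<int, std::string>;

// Mimics creating the abstract and the concrete DIE of one node in a given
// order, then asks the map which one it kept.
inline std::string storedAfter(bool AbstractFirst) {
  NodeToDie Map;
  const int Node = 0;
  if (AbstractFirst) {
    Map.insert({Node, "abstract"});
    Map.insert({Node, "concrete"}); // Key exists: this insert is a no-op.
  } else {
    Map.insert({Node, "concrete"});
    Map.insert({Node, "abstract"}); // Key exists: this insert is a no-op.
  }
  return Map.at(Node);
}
```

Calling `storedAfter(true)` and `storedAfter(false)` gives different answers for the same node, which is the ambiguity that a single node-to-DIE map cannot express.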

To address that, the DwarfInfoHolder class is introduced. It stores DIEs for
DILocalVariables and DILabels separately from DIEs for other DINodes (as
DILocalVariables and DILabels may have both concrete and abstract DIEs), and allows
explicit access to the abstract/concrete DIEs of a debug info entity.

Also, DwarfFile and DwarfUnit share a small piece of duplicated code.
The tracking of AbstractEntities, AbstractLocalScopeDIEs and FinalizedAbstractSubprograms
was moved to DwarfInfoHolder, as the corresponding entities may be
shared across CUs.

DwarfInfoHolder may later be used for tracking DIEs of abstract/concrete lexical
scopes. Currently, concrete lexical block/subprogram DIEs are distinguished by
their DISubprogram/DILocalScope/DILocalScope+inlinedAt in DwarfCompileUnit. As a
result, the same DISubprogram can't be attached to two llvm::Functions
(https://lists.llvm.org/pipermail/llvm-dev/2020-September/145342.html). Matching
DISubprogram/DILocalScope DIEs with their LexicalScopes and letting DwarfUnit
members access abstract scopes may enable linking a DISubprogram to several
llvm::Functions, and allow the transition from distinct to uniqued DISubprograms
proposed here
llvm#142166 (comment).
@dzhidzhoev dzhidzhoev force-pushed the debuginfo/dwarf-info-handler/subprograms/base branch from 7745940 to 17c20a4 Compare October 10, 2025 14:34
@dwblaikie
Collaborator

Could you refresh my memory/add some notes here about the function clones situation - link to the original discussions we had on this, maybe? It's been a while and I've lost context on how we got to "we should have multiple functions associated with the same DISubprogram". It'd be good to have some breadcrumbs here to follow back.

@dwblaikie
Collaborator

> Could you refresh my memory/add some notes here about the function clones situation - link to the original discussions we had on this, maybe? It's been a while and I've lost context on how we got to "we should have multiple functions associated with the same DISubprogram". It'd be good to have some breadcrumbs here to follow back.

Ah, right, it's pretty well summarized in the description of #162854

@dwblaikie
Collaborator

So... I /think/ the original conversations were maybe of the form "we picked the function from one LTO input, but the class from another - and now they don't match" - are there actually cases where we pick /both/ functions (two functions survive LTO that both refer to conceptually the same DISubprogram)? It sounds like that's what you're proposing supporting - and I'm not sure if that was discussed (if you've got pointers to prior discussions on this, please link them so I can catch back up - sorry for rehashing all this - the timelines are very long & it's hard to keep track of it all :/ ) - it'd be nice if we could avoid the complexity of that if possible.

@dzhidzhoev
Member Author

dzhidzhoev commented Oct 11, 2025

> So... I /think/ the original conversations were maybe of the form "we picked the function from one LTO input, but the class from another - and now they don't match" - are there actually cases where we pick /both/ functions (two functions survive LTO that both refer to conceptually the same DISubprogram)? It sounds like that's what you're proposing supporting - and I'm not sure if that was discussed (if you've got pointers to prior discussions on this, please link them so I can catch back up - sorry for rehashing all this - the timelines are very long & it's hard to keep track of it all :/ ) - it'd be nice if we could avoid the complexity of that if possible.

Hmm, we haven't explicitly discussed the case when two functions survive LTO and get their DISubprograms merged, but I think it may happen. If it's not a concern, I'll be glad to skip these patches :)

Consider a case where there is a header unique-type.h:

#pragma once
__attribute__((noinline))
static int bar(int x, int y) {
  return x + y;
}

and there are source files unique-type1.cpp:

#include "unique-type.h"
int bar1(int a, int b) {
  return bar(a, b);
}

and unique-type2.cpp:

#include "unique-type.h"
int bar2(int a, int b) {
  return bar(a, b);
}

After applying:

clang unique-type1.cpp -o unique-type1.bc -flto=full -c -g -fstandalone-debug -O0
clang unique-type2.cpp -o unique-type2.bc -flto=full -c -g -fstandalone-debug -O0
llvm-link unique-type1.bc unique-type2.bc -o unique-type.bc -d -v
opt unique-type.bc -o unique-type.opt.bc --passes=mergefunc

unique-type.bc will contain two functions:

define internal noundef i32 @_ZL3barii(i32 noundef %x, i32 noundef %y) #1 !dbg !24 {
...
}

define internal noundef i32 @_ZL3barii.1(i32 noundef %x, i32 noundef %y) #1 !dbg !43 {
...
}

!24 = distinct !DISubprogram(name: "bar", linkageName: "_ZL3barii", scope: !25, file: !25, line: 3, type: !12, scopeLine: 3, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: !0, retainedNodes: !15)
!43 = distinct !DISubprogram(name: "bar", linkageName: "_ZL3barii", scope: !25, file: !25, line: 3, type: !12, scopeLine: 3, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: !2, retainedNodes: !15)

After opt, they will be merged, indeed. Should AsmPrinter rely on that?

This example can be modified to prevent function merging: if unique-type1.cpp is compiled with -O0 (for any reason), and unique-type2.cpp is compiled with -O2, function merging doesn't happen, and in the resulting binary we have:

$ clang -flto=full unique-type1.bc unique-type2.bc -g -o unique-type.dylib -dynamiclib -fstandalone-debug -v
$ objdump -t unique-type.dylib
...
0000000000000304      d  *UND* __ZL3barii
...
0000000000000328      d  *UND* __ZL3barii.1
...

What would be better to do in such a case, if DISubprogram uniquing is assumed?

@dzhidzhoev
Member Author

> This example can be modified to prevent function merging: if unique-type1.cpp is compiled with -O0 (for any reason), and unique-type2.cpp is compiled with -O2, function merging doesn't happen, and in the resulting binary we have:
>
> $ clang -flto=full unique-type1.bc unique-type2.bc -g -o unique-type.dylib -dynamiclib -fstandalone-debug -v
> $ objdump -t unique-type.dylib
> ...
> 0000000000000304      d  *UND* __ZL3barii
> ...
> 0000000000000328      d  *UND* __ZL3barii.1
> ...
>
> What would be better to do in such a case, if DISubprogram uniquing is assumed?

An alternative solution could be to keep DISubprogram's linkageName up to date with the llvm::Function name, and to maintain the list of DISubprogram's retainedNodes in MetadataLoader.
