Skip to content

Conversation

dzhidzhoev
Copy link
Member

@dzhidzhoev dzhidzhoev commented Aug 8, 2025

DwarfCompileUnit::constructVariableDIE() and constructLabelDIE() are meant
for constructing both abstract and concrete DIEs of a DbgEntity. They use
DwarfUnit::insertDIE() to store a freshly-created DIE. However,
insertDIE()/DwarfUnit::DITypeNodeToDieMap store only single DIE per DINode.
If insertDIE() is called several times for the same instance of DINode, only
first DIE is saved in DwarfUnit::DITypeNodeToDieMap, as follows from
DenseMap::insert() specification.

It means, depending on what is called first,
DwarfCompileUnit::constructVariableDIE(LV, /* Abstract */ true) or
DwarfCompileUnit::constructVariableDIE(LV, /* Abstract */ false),
DwarfUnit::DITypeNodeToDieMap stores either abstract or concrete DIE of a
node.

This behavior suggests an obscure API of DwarfCompileUnit, as it depends on
function call order and makes it unclear what DwarfUnit::DITypeNodeToDieMap is
meant to store.

To address that, DwarfInfoHolder class is introduced, which stores DIEs for
DILocalVariables and DILabels separately from DIEs for other DINodes (as
DILocalVariables and DILabels may have concrete and abstract DIEs), and allows
explicit access to abstract/concrete DIEs of a debug info entity.

Also, DwarfFile and DwarfUnit have a tiny duplicate code piece.
AbstractEntities, AbstractLocalScopeDIEs and FinalizedAbstractSubprograms
tracking were moved to DwarfInfoHolder, as the corresponding entities may be
shared across CUs.

DwarfInfoHolder may later be used for tracking DIEs of abstract/concrete lexical
scopes. Currently, concrete lexical block/subprogram DIEs are distinguished by
their DISubprogram/DILocalScope/DILocalScope+inlinedAt in DwarfCompileUnit. As a
result, the same DISubprogram can't be attached to two llvm::Functions
(https://lists.llvm.org/pipermail/llvm-dev/2020-September/145342.html). Matching
DISubprogram/DILocalScope DIEs with their LexicalScopes and letting DwarfUnit
members to access abstract scopes may enable linking DISubprogram to several
llvm::Functions, and allow the transition from distinct to uniqued DISubprograms
proposed here #142166 (comment).

@llvmbot
Copy link
Member

llvmbot commented Aug 8, 2025

@llvm/pr-subscribers-debuginfo

Author: Vladislav Dzhidzhoev (dzhidzhoev)

Changes

DwarfCompileUnit::constructVariableDIE() and constructLabelDIE() are meant for constructing both abstract and concrete DIEs of a DbgEntity. They use DwarfUnit::insertDIE() to store a freshly-created DIE. However, insertDIE()/DwarfUnit::DITypeNodeToDieMap store only single DIE per DINode. If insertDIE() is called several times for the same instance of DINode, only first DIE is saved in DwarfUnit::DITypeNodeToDieMap, as follows from DenseMap::insert() specification.

It means, depending on what is called first,
DwarfCompileUnit::constructVariableDIE(LV, /* Abstract */ true) or
DwarfCompileUnit::constructVariableDIE(LV, /* Abstract */ false),
DwarfUnit::DITypeNodeToDieMap stores either abstract or concrete DIE of a node.

This behavior suggests an obscure API of DwarfCompileUnit, as it silently discards one of the DIEs and makes it unclear what DwarfUnit::DITypeNodeToDieMap is meant to store.

Also, DwarfFile and DwarfUnit have a tiny duplicate code piece.

To address that, DwarfInfoHolder class is introduced, which stores DIEs for DILocalVariables and DILabels separately from DIEs for other DINodes (as DILocalVariables and DILabels may have concrete and abstract DIEs), and allows explicit access to abstract/concrete DIEs of a debug info entity.

DwarfInfoHolder may later be used for tracking DIEs of abstract/concrete lexical scopes.
Currently, concrete lexical block/subprogram DIEs are distinguished by their DISubprogram/DILocalScope/DILocalScope+inlinedAt in DwarfCompileUnit. As a result, the same DISubprogram can't be attached to two llvm::Functions (https://lists.llvm.org/pipermail/llvm-dev/2020-September/145342.html). Matching DISubprogram/DILocalScope DIEs with their LexicalScopes and letting DwarfUnit members to access abstract scopes may enable linking DISubprogram to several llvm::Functions, and allow the transition from distinct to uniqued DISubprograms proposed here #142166 (comment).


Full diff: https://github.com/llvm/llvm-project/pull/152680.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+3-3)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfFile.h (+100-14)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (+15-25)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h (+13-1)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 67f526fe91464..61996a38e0188 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -178,7 +178,7 @@ unsigned DwarfCompileUnit::getOrCreateSourceID(const DIFile *File) {
 DIE *DwarfCompileUnit::getOrCreateGlobalVariableDIE(
     const DIGlobalVariable *GV, ArrayRef<GlobalExpr> GlobalExprs) {
   // Check for pre-existence.
-  if (DIE *Die = getDIE(GV))
+  if (DIE *Die = DIEs(GV).getVariableDIE(GV))
     return Die;
 
   assert(GV);
@@ -794,7 +794,7 @@ DIE *DwarfCompileUnit::constructLexicalScopeDIE(LexicalScope *Scope) {
 
 DIE *DwarfCompileUnit::constructVariableDIE(DbgVariable &DV, bool Abstract) {
   auto *VariableDie = DIE::get(DIEValueAllocator, DV.getTag());
-  insertDIE(DV.getVariable(), VariableDie);
+  DIEs(DV.getVariable()).LVs().insertDIE(DV, VariableDie, Abstract);
   DV.setDIE(*VariableDie);
   // Abstract variables don't get common attributes later, so apply them now.
   if (Abstract) {
@@ -1009,7 +1009,7 @@ DIE *DwarfCompileUnit::constructVariableDIE(DbgVariable &DV,
 DIE *DwarfCompileUnit::constructLabelDIE(DbgLabel &DL,
                                          const LexicalScope &Scope) {
   auto LabelDie = DIE::get(DIEValueAllocator, DL.getTag());
-  insertDIE(DL.getLabel(), LabelDie);
+  DIEs(DL.getLabel()).Labels().insertDIE(DL, LabelDie, Scope.isAbstractScope());
   DL.setDIE(*LabelDie);
 
   if (Scope.isAbstractScope())
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfFile.h b/llvm/lib/CodeGen/AsmPrinter/DwarfFile.h
index 0fc2b91ddfa91..8b54845cea273 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfFile.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfFile.h
@@ -14,6 +14,7 @@
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/CodeGen/DIE.h"
+#include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/Support/Allocator.h"
 #include <map>
 #include <memory>
@@ -25,9 +26,6 @@ class AsmPrinter;
 class DbgEntity;
 class DbgVariable;
 class DbgLabel;
-class DINode;
-class DILocalScope;
-class DwarfCompileUnit;
 class DwarfUnit;
 class LexicalScope;
 class MCSection;
@@ -51,6 +49,102 @@ struct RangeSpanList {
   SmallVector<RangeSpan, 2> Ranges;
 };
 
+/// Tracks abstract and concrete DIEs for debug info entities of a certain type.
+template <typename DINodeT, typename DbgEntityT> class DINodeInfoHolder {
+  DenseMap<const DINodeT *, DIE *> AbstractMap;
+  DenseMap<const DINodeT *, SmallDenseMap<const DbgEntityT *, DIE *, 2>>
+      ConcreteMap;
+
+  static const DINodeT *getNode(const DbgEntityT &N) {
+    return cast<DINodeT>(N.getEntity());
+  }
+
+public:
+  void insertAbstractDIE(const DINodeT *N, DIE *D) {
+    auto [_, Inserted] = AbstractMap.try_emplace(N, D);
+    assert(Inserted && "Duplicate abstract DIE for debug info node");
+  }
+
+  void insertConcreteDIE(const DbgEntityT &N, DIE *D) {
+    auto [_, Inserted] = ConcreteMap[getNode(N)].try_emplace(&N, D);
+    assert(Inserted && "Duplicate concrete DIE for debug info node");
+  }
+
+  void insertDIE(const DbgEntityT &N, DIE *D, bool Abstract) {
+    if (Abstract)
+      insertAbstractDIE(getNode(N), D);
+    else
+      insertConcreteDIE(N, D);
+  }
+
+  DIE *getAbstractDIE(const DINodeT *N) const { return AbstractMap.lookup(N); }
+
+  DIE *getConcreteDIE(const DbgEntityT &N) const {
+    if (auto I = ConcreteMap.find(getNode(N)); I != ConcreteMap.end())
+      return I->second.lookup(&N);
+    return nullptr;
+  }
+
+  /// Returns abstract DIE for the entity.
+  /// If no abstract DIE was created, returns any concrete DIE for the entity.
+  DIE *getDIE(const DINodeT *N) const {
+    if (DIE *D = getAbstractDIE(N))
+      return D;
+
+    if (auto I = ConcreteMap.find(N); I != ConcreteMap.end())
+      return I->second.empty() ? nullptr : I->second.begin()->second;
+
+    return nullptr;
+  }
+};
+
+/// Tracks DIEs for debug info entites.
+/// These DIEs can be shared across CUs, that is why we keep the map here
+/// instead of in DwarfCompileUnit.
+class DwarfInfoHolder {
+  // DIEs of local DbgVariables.
+  DINodeInfoHolder<DILocalVariable, DbgVariable> LVHolder;
+  DINodeInfoHolder<DILabel, DbgLabel> LabelHolder;
+
+  /// Other DINodes with the corresponding DIEs.
+  DenseMap<const DINode *, DIE *> MDNodeToDieMap;
+
+public:
+  void insertDIE(const DINode *N, DIE *Die) {
+    assert((!isa<DILabel>(N) && !isa<DILocalVariable>(N)) &&
+           "Use Labels().insertDIE() for labels or LVs().insertDIE() for "
+           "local variables");
+    auto [_, Inserted] = MDNodeToDieMap.try_emplace(N, Die);
+    assert((Inserted || isa<DISubprogram>(N) || isa<DIType>(N)) &&
+           "DIE for this DINode has already been added");
+  }
+
+  void insertDIE(DIE *D) { MDNodeToDieMap.try_emplace(nullptr, D); }
+
+  DIE *getDIE(const DINode *N) const {
+    DIE *D = MDNodeToDieMap.lookup(N);
+    assert((!D || (!isa<DILabel>(N) && !isa<DILocalVariable>(N))) &&
+           "Use Labels().getDIE() for labels or LVs().getDIE() for "
+           "local variables");
+    return D;
+  }
+
+  decltype(LVHolder) &LVs() { return LVHolder; }
+
+  decltype(LabelHolder) &Labels() { return LabelHolder; }
+
+  /// For a global variable, returns DIE of the variable.
+  ///
+  /// For a local variable, returns abstract DIE of the variable.
+  /// If no abstract DIE was created, returns any concrete DIE of the variable.
+  DIE *getVariableDIE(const DIVariable *V) {
+    if (auto *LV = dyn_cast<DILocalVariable>(V))
+      if (DIE *D = LVs().getDIE(LV))
+        return D;
+    return getDIE(V);
+  }
+};
+
 class DwarfFile {
   // Target of Dwarf emission, used for sizing of abbreviations.
   AsmPrinter *Asm;
@@ -92,13 +186,11 @@ class DwarfFile {
   DenseMap<LexicalScope *, LabelList> ScopeLabels;
 
   // Collection of abstract subprogram DIEs.
+  // TODO: move it to InfoHolder?
   DenseMap<const DILocalScope *, DIE *> AbstractLocalScopeDIEs;
   DenseMap<const DINode *, std::unique_ptr<DbgEntity>> AbstractEntities;
 
-  /// Maps MDNodes for type system with the corresponding DIEs. These DIEs can
-  /// be shared across CUs, that is why we keep the map here instead
-  /// of in DwarfCompileUnit.
-  DenseMap<const MDNode *, DIE *> DITypeNodeToDieMap;
+  DwarfInfoHolder InfoHolder;
 
 public:
   DwarfFile(AsmPrinter *AP, StringRef Pref, BumpPtrAllocator &DA);
@@ -174,13 +266,7 @@ class DwarfFile {
     return AbstractEntities;
   }
 
-  void insertDIE(const MDNode *TypeMD, DIE *Die) {
-    DITypeNodeToDieMap.insert(std::make_pair(TypeMD, Die));
-  }
-
-  DIE *getDIE(const MDNode *TypeMD) {
-    return DITypeNodeToDieMap.lookup(TypeMD);
-  }
+  DwarfInfoHolder &DIEs() { return InfoHolder; }
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index b03fac2d22a52..4d54d1075512a 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -193,23 +193,13 @@ bool DwarfUnit::isShareableAcrossCUs(const DINode *D) const {
          !DD->generateTypeUnits();
 }
 
-DIE *DwarfUnit::getDIE(const DINode *D) const {
-  if (isShareableAcrossCUs(D))
-    return DU->getDIE(D);
-  return MDNodeToDieMap.lookup(D);
-}
+DIE *DwarfUnit::getDIE(const DINode *D) const { return DIEs(D).getDIE(D); }
 
 void DwarfUnit::insertDIE(const DINode *Desc, DIE *D) {
-  if (isShareableAcrossCUs(Desc)) {
-    DU->insertDIE(Desc, D);
-    return;
-  }
-  MDNodeToDieMap.insert(std::make_pair(Desc, D));
+  DIEs(Desc).insertDIE(Desc, D);
 }
 
-void DwarfUnit::insertDIE(DIE *D) {
-  MDNodeToDieMap.insert(std::make_pair(nullptr, D));
-}
+void DwarfUnit::insertDIE(DIE *D) { InfoHolder.insertDIE(D); }
 
 void DwarfUnit::addFlag(DIE &Die, dwarf::Attribute Attribute) {
   if (DD->getDwarfVersion() >= 4)
@@ -798,7 +788,7 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIStringType *STy) {
     addString(Buffer, dwarf::DW_AT_name, Name);
 
   if (DIVariable *Var = STy->getStringLength()) {
-    if (auto *VarDIE = getDIE(Var))
+    if (auto *VarDIE = DIEs(Var).getVariableDIE(Var))
       addDIEEntry(Buffer, dwarf::DW_AT_string_length, *VarDIE);
   } else if (DIExpression *Expr = STy->getStringLengthExp()) {
     DIELoc *Loc = new (DIEValueAllocator) DIELoc;
@@ -1117,8 +1107,8 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
           constructTypeDIE(VariantPart, Composite);
         }
       } else if (Tag == dwarf::DW_TAG_namelist) {
-        auto *Var = dyn_cast<DINode>(Element);
-        auto *VarDIE = getDIE(Var);
+        auto *Var = dyn_cast<DIVariable>(Element);
+        auto *VarDIE = DIEs(Var).getVariableDIE(Var);
         if (VarDIE) {
           DIE &ItemDie = createAndAddDIE(dwarf::DW_TAG_namelist_item, Buffer);
           addDIEEntry(ItemDie, dwarf::DW_AT_namelist_item, *VarDIE);
@@ -1180,7 +1170,7 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
       Tag == dwarf::DW_TAG_class_type || Tag == dwarf::DW_TAG_structure_type ||
       Tag == dwarf::DW_TAG_union_type) {
     if (auto *Var = dyn_cast_or_null<DIVariable>(CTy->getRawSizeInBits())) {
-      if (auto *VarDIE = getDIE(Var))
+      if (auto *VarDIE = DIEs(Var).getVariableDIE(Var))
         addDIEEntry(Buffer, dwarf::DW_AT_bit_size, *VarDIE);
     } else if (auto *Exp =
                    dyn_cast_or_null<DIExpression>(CTy->getRawSizeInBits())) {
@@ -1578,7 +1568,7 @@ void DwarfUnit::constructSubrangeDIE(DIE &DW_Subrange, const DISubrangeType *SR,
   auto AddBoundTypeEntry = [&](dwarf::Attribute Attr,
                                DISubrangeType::BoundType Bound) -> void {
     if (auto *BV = dyn_cast_if_present<DIVariable *>(Bound)) {
-      if (auto *VarDIE = getDIE(BV))
+      if (auto *VarDIE = DIEs(BV).getVariableDIE(BV))
         addDIEEntry(DW_Subrange, Attr, *VarDIE);
     } else if (auto *BE = dyn_cast_if_present<DIExpression *>(Bound)) {
       DIELoc *Loc = new (DIEValueAllocator) DIELoc;
@@ -1620,7 +1610,7 @@ void DwarfUnit::constructSubrangeDIE(DIE &Buffer, const DISubrange *SR) {
   auto AddBoundTypeEntry = [&](dwarf::Attribute Attr,
                                DISubrange::BoundType Bound) -> void {
     if (auto *BV = dyn_cast_if_present<DIVariable *>(Bound)) {
-      if (auto *VarDIE = getDIE(BV))
+      if (auto *VarDIE = DIEs(BV).getVariableDIE(BV))
         addDIEEntry(DW_Subrange, Attr, *VarDIE);
     } else if (auto *BE = dyn_cast_if_present<DIExpression *>(Bound)) {
       DIELoc *Loc = new (DIEValueAllocator) DIELoc;
@@ -1662,7 +1652,7 @@ void DwarfUnit::constructGenericSubrangeDIE(DIE &Buffer,
   auto AddBoundTypeEntry = [&](dwarf::Attribute Attr,
                                DIGenericSubrange::BoundType Bound) -> void {
     if (auto *BV = dyn_cast_if_present<DIVariable *>(Bound)) {
-      if (auto *VarDIE = getDIE(BV))
+      if (auto *VarDIE = DIEs(BV).getVariableDIE(BV))
         addDIEEntry(DwGenericSubrange, Attr, *VarDIE);
     } else if (auto *BE = dyn_cast_if_present<DIExpression *>(Bound)) {
       if (BE->isConstant() &&
@@ -1742,7 +1732,7 @@ void DwarfUnit::constructArrayTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
   }
 
   if (DIVariable *Var = CTy->getDataLocation()) {
-    if (auto *VarDIE = getDIE(Var))
+    if (auto *VarDIE = DIEs(Var).getVariableDIE(Var))
       addDIEEntry(Buffer, dwarf::DW_AT_data_location, *VarDIE);
   } else if (DIExpression *Expr = CTy->getDataLocationExp()) {
     DIELoc *Loc = new (DIEValueAllocator) DIELoc;
@@ -1753,7 +1743,7 @@ void DwarfUnit::constructArrayTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
   }
 
   if (DIVariable *Var = CTy->getAssociated()) {
-    if (auto *VarDIE = getDIE(Var))
+    if (auto *VarDIE = DIEs(Var).getVariableDIE(Var))
       addDIEEntry(Buffer, dwarf::DW_AT_associated, *VarDIE);
   } else if (DIExpression *Expr = CTy->getAssociatedExp()) {
     DIELoc *Loc = new (DIEValueAllocator) DIELoc;
@@ -1764,7 +1754,7 @@ void DwarfUnit::constructArrayTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
   }
 
   if (DIVariable *Var = CTy->getAllocated()) {
-    if (auto *VarDIE = getDIE(Var))
+    if (auto *VarDIE = DIEs(Var).getVariableDIE(Var))
       addDIEEntry(Buffer, dwarf::DW_AT_allocated, *VarDIE);
   } else if (DIExpression *Expr = CTy->getAllocatedExp()) {
     DIELoc *Loc = new (DIEValueAllocator) DIELoc;
@@ -1887,7 +1877,7 @@ DIE &DwarfUnit::constructMemberDIE(DIE &Buffer, const DIDerivedType *DT) {
 
     // Handle the size.
     if (auto *Var = dyn_cast_or_null<DIVariable>(DT->getRawSizeInBits())) {
-      if (auto *VarDIE = getDIE(Var))
+      if (auto *VarDIE = DIEs(Var).getVariableDIE(Var))
         addDIEEntry(MemberDie, dwarf::DW_AT_bit_size, *VarDIE);
     } else if (auto *Exp =
                    dyn_cast_or_null<DIExpression>(DT->getRawSizeInBits())) {
@@ -1913,7 +1903,7 @@ DIE &DwarfUnit::constructMemberDIE(DIE &Buffer, const DIDerivedType *DT) {
     // See https://dwarfstd.org/issues/250501.1.html
     if (auto *Var = dyn_cast_or_null<DIVariable>(DT->getRawOffsetInBits())) {
       if (!Asm->TM.Options.DebugStrictDwarf || DD->getDwarfVersion() >= 6) {
-        if (auto *VarDIE = getDIE(Var))
+        if (auto *VarDIE = DIEs(Var).getVariableDIE(Var))
           addDIEEntry(MemberDie, dwarf::DW_AT_data_bit_offset, *VarDIE);
       }
     } else if (auto *Expr =
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
index fe05766cf36e1..be8179bae6caf 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
@@ -60,7 +60,7 @@ class DwarfUnit : public DIEUnit {
 
   /// Tracks the mapping of unit level debug information variables to debug
   /// information entries.
-  DenseMap<const MDNode *, DIE *> MDNodeToDieMap;
+  DwarfInfoHolder InfoHolder;
 
   /// A list of all the DIEBlocks in use.
   std::vector<DIEBlock *> DIEBlocks;
@@ -152,6 +152,18 @@ class DwarfUnit : public DIEUnit {
 
   void insertDIE(DIE *D);
 
+  const DwarfInfoHolder &DIEs(const DINode *N) const {
+    if (isShareableAcrossCUs(N))
+      return DU->DIEs();
+
+    return InfoHolder;
+  }
+
+  DwarfInfoHolder &DIEs(const DINode *N) {
+    return const_cast<DwarfInfoHolder &>(
+        const_cast<const DwarfUnit *>(this)->DIEs(N));
+  }
+
   /// Add a flag that is true to the DIE.
   void addFlag(DIE &Die, dwarf::Attribute Attribute);
 

@dzhidzhoev dzhidzhoev changed the title [DwarfDebug] Move abstract entities tracking from DwarfCompileUnit to DwarfUnit [DwarfDebug] Track abstract entities in DwarfUnit separately Aug 11, 2025
@dzhidzhoev
Copy link
Member Author

Gentle ping

@dzhidzhoev dzhidzhoev force-pushed the debuginfo/dwarf-info-handler/base branch from 28fed0f to 6442e01 Compare October 10, 2025 14:08
dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request Oct 10, 2025
Depends on:
* llvm#152680

With this change, DINodeInfoHolder is used to store abstract
and concrete out-of-line subprogram DIEs in DwarfInfoHolder.

Every definition subprogram DIE is associated with a corresponding
llvm::Function (declaration subprograms are associated with nullptr).
When a concrete subprogram DIE is queried via `getOrCreateSubprogramDIE`,
the corresponding llvm::Function should be provided. If none is provided:

* DwarfUnit/DwarfTypeUnit falls back and returns any concrete DIE for
  the given DISubprogram,
* DwarfCompileUnit is expected to return abstract DIE.

This is a step to support attachment of a DISubprogram to multiple
llvm::Functions (and to establish one-to-one-to-many correspondence between
DISubprograms, abstract DIEs and function clones, and, later,
to make the backend use uniquied DISubprograms).
dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request Oct 10, 2025
…o multiple Functions

Depends on:
* llvm#152680
* llvm#162852

In llvm#75385 (and the following
tries), an attempt was made, to support attaching local types to
DILocalScopes, and to store function local types in DISubprogram's
`retainedNodes:` field.

That patch failed to land due to issues arising during LTO process.
If two definition DISubprograms from different compile units
represent, essentially, the same source code function, and have
common local DICompositeType, and if this DICompositeType is uniqued
(due to ODRUniquingDebugTypes feature), the subprograms end up
having wrong retainedNodes list/scoping relationship.

To tackle this issue, in
llvm#142166, it was proposed
to force-unique all DISubporgrams even if they don't contain odr-uniqued types
(llvm#142166 (comment)).
It should establish one-to-one-to-many relationship between
DISubprograms, abstract DIEs and function clones (from different CUs, in
case of LTO).

To implement that, AsmPrinter should support correct emission of debug info
for DISubprograms attached to multiple functions. This is the goal of
this commit.

Here, LexicalScope's function map is changed to multimap between
DISubprogram and (possible multiple) functions attached to it.
LexicalScope is modified to create an abstract scope for a DISubprogram
having multiple lllvm::Function attachments.
`DwarfCompileUnit::getOrCreateSubprogramDIE` can recognize the case of
DISubprogram attached to multiple Functions, and return abstract DIE
when needed.

CodeViewDebug is adopted as well. UDTs are ensured to be emmited properly in
the cases that are addressed here. Please let me know if more changes
to CodeView needed, as I'm not very familiar with the format.
@dzhidzhoev dzhidzhoev self-assigned this Oct 10, 2025
@dzhidzhoev dzhidzhoev requested a review from Michael137 October 10, 2025 14:30
`DwarfCompileUnit::constructVariableDIE()` and `constructLabelDIE()` are meant
for constructing both abstract and concrete DIEs of a DbgEntity. They use
`DwarfUnit::insertDIE()` to store a freshly-created DIE. However,
`insertDIE()`/`DwarfUnit::DITypeNodeToDieMap` store only single DIE per DINode.
If `insertDIE()` is called several times for the same instance of DINode, only
first DIE is saved in `DwarfUnit::DITypeNodeToDieMap`, as follows from
`DenseMap::insert()` specification.

It means, depending on what is called first,
`DwarfCompileUnit::constructVariableDIE(LV, /* Abstract */ true)` or
`DwarfCompileUnit::constructVariableDIE(LV, /* Abstract */ false)`,
`DwarfUnit::DITypeNodeToDieMap` stores either abstract or concrete DIE of a
node.

This behavior suggests an obscure API of DwarfCompileUnit, as it depends on
function call order and makes it unclear what `DwarfUnit::DITypeNodeToDieMap` is
meant to store.

To address that, DwarfInfoHolder class is introduced, which stores DIEs for
DILocalVariables and DILabels separately from DIEs for other DINodes (as
DILocalVariables and DILabels may have concrete and abstract DIEs), and allows
explicit access to abstract/concrete DIEs of a debug info entity.

Also, DwarfFile and DwarfUnit have a tiny duplicate code piece.
AbstractEntities, AbstractLocalScopeDIEs and FinalizedAbstractSubprograms
tracking were moved to DwarfInfoHolder, as the corresponding entities may be
shared across CUs.

DwarfInfoHolder may later be used for tracking DIEs of abstract/concrete lexical
scopes.  Currently, concrete lexical block/subprogram DIEs are distinguished by
their DISubprogram/DILocalScope/DILocalScope+inlinedAt in DwarfCompileUnit. As a
result, the same DISubprogram can't be attached to two llvm::Functions
(https://lists.llvm.org/pipermail/llvm-dev/2020-September/145342.html). Matching
DISubprogram/DILocalScope DIEs with their LexicalScopes and letting DwarfUnit
members to access abstract scopes may enable linking DISubprogram to several
llvm::Functions, and allow the transition from distinct to uniqued DISubprograms
proposed here
llvm#142166 (comment).
@dzhidzhoev dzhidzhoev force-pushed the debuginfo/dwarf-info-handler/base branch from 6442e01 to f9a78ea Compare October 10, 2025 14:33
dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request Oct 10, 2025
Depends on:
* llvm#152680

With this change, DINodeInfoHolder is used to store abstract
and concrete out-of-line subprogram DIEs in DwarfInfoHolder.

Every definition subprogram DIE is associated with a corresponding
llvm::Function (declaration subprograms are associated with nullptr).
When a concrete subprogram DIE is queried via `getOrCreateSubprogramDIE`,
the corresponding llvm::Function should be provided. If none is provided:

* DwarfUnit/DwarfTypeUnit falls back and returns any concrete DIE for
  the given DISubprogram,
* DwarfCompileUnit is expected to return abstract DIE.

This is a step to support attachment of a DISubprogram to multiple
llvm::Functions (and to establish one-to-one-to-many correspondence between
DISubprograms, abstract DIEs and function clones, and, later,
to make the backend use uniquied DISubprograms).
dzhidzhoev added a commit to dzhidzhoev/llvm-project that referenced this pull request Oct 10, 2025
…o multiple Functions

Depends on:
* llvm#152680
* llvm#162852

In llvm#75385 (and the following
tries), an attempt was made, to support attaching local types to
DILocalScopes, and to store function local types in DISubprogram's
`retainedNodes:` field.

That patch failed to land due to issues arising during LTO process.
If two definition DISubprograms from different compile units
represent, essentially, the same source code function, and have
common local DICompositeType, and if this DICompositeType is uniqued
(due to ODRUniquingDebugTypes feature), the subprograms end up
having wrong retainedNodes list/scoping relationship.

To tackle this issue, in
llvm#142166, it was proposed
to force-unique all DISubporgrams even if they don't contain odr-uniqued types
(llvm#142166 (comment)).
It should establish one-to-one-to-many relationship between
DISubprograms, abstract DIEs and function clones (from different CUs, in
case of LTO).

To implement that, AsmPrinter should support correct emission of debug info
for DISubprograms attached to multiple functions. This is the goal of
this commit.

Here, LexicalScope's function map is changed to multimap between
DISubprogram and (possible multiple) functions attached to it.
LexicalScope is modified to create an abstract scope for a DISubprogram
having multiple lllvm::Function attachments.
`DwarfCompileUnit::getOrCreateSubprogramDIE` can recognize the case of
DISubprogram attached to multiple Functions, and return abstract DIE
when needed.

CodeViewDebug is adopted as well. UDTs are ensured to be emmited properly in
the cases that are addressed here. Please let me know if more changes
to CodeView needed, as I'm not very familiar with the format.
@dzhidzhoev
Copy link
Member Author

Updated&fixed PR description.

DIE *getDIE(const DINodeT *N) const {
if (DIE *D = getAbstractDIE(N))
return D;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra blank line could be removed to make this consistent with similar code above?

std::reference_wrapper<const typename ConcreteMapT::mapped_type>>
getConcreteDIEs(const DINodeT *N) const {
if (auto I = ConcreteMap.find(N); I != ConcreteMap.end())
return std::make_optional(std::ref(I->second));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional references are a bit awkward - could we use a raw DIE*? (or does it need to be a DIE**, I guess? That'd be OK, if a bit awkward in its own right)

};

/// Tracks abstract and concrete DIEs for debug info entities of a certain type.
template <typename DINodeT, typename DbgEntityT> class DINodeInfoHolder {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be DINodeMap?

Comment on lines +157 to +161
auto &getLVs() { return LVHolder; }
auto &getLVs() const { return LVHolder; }

auto &getLabels() { return LabelHolder; }
auto &getLabels() const { return LabelHolder; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth spelling out the return types

auto *Var = dyn_cast<DINode>(Element);
auto *VarDIE = getDIE(Var);
auto *Var = dyn_cast<DIVariable>(Element);
auto *VarDIE = getDIEs(Var).getVariableDIE(Var);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some checking we can add to the general DINode map (not the label or variable ones you've added) to ensure labels and variables aren't added to or queried from that map?

Comment on lines +122 to +135
class DwarfInfoHolder {
/// DIEs of local DbgVariables.
DINodeInfoHolder<DILocalVariable, DbgVariable> LVHolder;
/// DIEs of labels.
DINodeInfoHolder<DILabel, DbgLabel> LabelHolder;
DenseMap<const DINode *, std::unique_ptr<DbgEntity>> AbstractEntities;
// List of abstract local scopes (either DISubprogram or DILexicalBlock).
DenseMap<const DILocalScope *, DIE *> AbstractLocalScopeDIEs;
/// Keeps track of abstract subprograms to populate them only once.
// FIXME: merge creation and population of abstract scopes.
SmallPtrSet<const DISubprogram *, 8> FinalizedAbstractSubprograms;

/// Other DINodes with the corresponding DIEs.
DenseMap<const DINode *, DIE *> MDNodeToDieMap;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this creates a union of the way we were tracking the node<>die map at the file level and the unit level - and then stores that union at both the file and the unit level?

That seems like a bit of added complexity (but I guess it's the title of the patch). I guess I didn't quite follow from the patch description why that's the goal.

Could you try to summarize/rephrase this goal/motivation?

Copy link
Member Author

@dzhidzhoev dzhidzhoev Oct 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we have fields, where the DINodes shared across CUs are stored:

  • DwarfFile::DITypeNodeToDieMap (which can also serve as a container for DISubprograms, according to DwarfUnit::isShareableAcrossCUs logic)
  • DwarfFile::AbstractLocalScopeDIEs
  • DwarfFile::AbstractEntities
  • DwarfFile::FinalizedAbstractSubprograms

And we have the corresponding fields in DwarfUnit/DwarfCompileUnit:

  • DwarfUnit::MDNodeToDieMap
  • DwarfCompileUnit::AbstractLocalScopeDIEs
  • DwarfCompileUnit::AbstractEntities
  • DwarfCompileUnit::FinalizedAbstractSubprograms

The first goal of DwarfInfoHolder is to deduplicate these field definitions by putting them inside one "container", and reuse that container when needed. DwarfUnit::InfoHolder will store CU-local (or type-unit-local) DIEs, DwarfFile::InfoHolder will store shareable DIEs.

So this creates a union of the way we were tracking the node<>die map at the file level and the unit level - and then stores that union at both the file and the unit level?

So, an instance of DwarfInfoHolder holds either shareable or CU-local DIEs. It's not a union of them.

Second goal is to split the general DIE map into separate maps for DILocalVariables, DILabels, DILocalScopes (in the following two PRs), and a map for the rest of DINodes (mostly types), since, as I assume, we can get DILocalVariables/DILabels/DILocalScopes used in multiple llvm::Functions during LTO with DISubprogram uniquing, so DIEs for these nodes created in different Functions should be somehow distinguished. It's done with additional DbgEntityT pointer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants