Skip to content

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Jul 21, 2025

Depends on

This patch is an implementation of this discussion about handling ABI-tagged structors during expression evaluation.

Motivation

LLDB encodes the mangled name of a DW_TAG_subprogram into AsmLabels on function and method Clang AST nodes. This means that when calls to these functions get lowered into IR (when running JITted expressions), the address resolver can locate the appropriate symbol by mangled name (and it is guaranteed to find the symbol because we got the mangled name from debug-info, instead of letting Clang mangle it based on AST structure). However, we don't do this for CXXConstructorDecls/CXXDestructorDecls because these structor declarations in DWARF don't have a linkage name. This is because there can be multiple variants of a structor, each with a distinct mangling in the Itanium ABI. Each structor variant has its own definition DW_TAG_subprogram. So LLDB doesn't know which mangled name to put into the AsmLabel.

Currently this means using ABI-tagged structors in LLDB expressions won't work (see this RFC for concrete examples).

Proposed Solution

The FunctionCallLabel encoding that we put into AsmLabels already supports stuffing more info about a DIE into it. So this patch extends the FunctionCallLabel to contain an optional discriminator (a sequence of bytes) which the SymbolFileDWARF plugin interprets as the constructor/destructor variant of that DIE. So when searching for the definition DIE, LLDB will include the structor variant in its heuristic for determining a match.

There's a few subtleties here:

  1. At the point at which LLDB first constructs the label, it has no way of knowing (just by looking at the debug-info declaration), which structor variant the expression evaluator is supposed to call. That's something that gets decided when compiling the expression. So we let the Clang mangler inject the correct structor variant into the AsmLabel during JITing. I adjusted the AsmLabelAttr mangling for this in [clang][Mangle] Inject structor type into mangled name when mangling for LLDB JIT expressions #155485. An option would've been to create a new Clang attribute which behaved like an AsmLabel but with these special semantics for LLDB. My main concern there is that we'd have to adjust all the AsmLabelAttr checks around Clang to also now account for this new attribute.
  2. The compiler is free to omit the C1 variant of a constructor if the C2 variant is sufficient. In that case it may alias C1 to C2, leaving us with only the C2 DW_TAG_subprogram in the object file. Linux is one of the platforms where this occurs. For those cases I added a heuristic in SymbolFileDWARF where we pick C2 if we asked for C1 but it doesn't exist. This may not always be correct (e.g., if the compiler decided to drop C1 for other reasons).
  3. In [clang][DebugInfo] Emit unified (Itanium) mangled name to structor declarations #154142 Clang will emit C4/D4 variants of ctors/dtors on declarations. When resolving the FunctionCallLabel we will now substitute the actual variant that Clang told us we need to call into the mangled name. We do this using LLDB's ManglingSubstitutor. That way we find the definition DIE exactly the same way we do for regular function calls.
  4. In cases where declarations and definitions live in separate modules, the DIE ID encoded in the function call label may not be enough to find the definition DIE in the encoded module ID. For those cases we fall back to how LLDB used to work: look up in all images of the target. To make sure we don't use the unified mangled name for the fallback lookup, we change the lookup name to whatever mangled name the FunctionCallLabel resolved to.

rdar://104968288

@Michael137 Michael137 requested a review from labath July 21, 2025 14:45
@Michael137 Michael137 requested a review from adrian-prantl July 21, 2025 14:46
@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels-structors branch 2 times, most recently from aeebf1f to ce7fa42 Compare July 22, 2025 15:08
@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels-structors branch from ce7fa42 to 2a70642 Compare July 31, 2025 08:52
Copy link

github-actions bot commented Jul 31, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels-structors branch from 2a70642 to 44410a4 Compare July 31, 2025 13:19
Michael137 added a commit that referenced this pull request Aug 1, 2025
…#148877)

LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](#149827)

This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.

**Implementation**

The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).

Depends on #151355
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 1, 2025
…n AsmLabels (#148877)

LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](llvm/llvm-project#149827)

This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.

**Implementation**

The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).

Depends on llvm/llvm-project#151355
@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels-structors branch 3 times, most recently from 499aa52 to 9ec9eef Compare August 1, 2025 21:47
@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels-structors branch 5 times, most recently from e10a1eb to 82075a2 Compare August 4, 2025 11:30
@Michael137 Michael137 changed the title [DRAFT] [lldb][Expression] Add structor variant to LLDB's function call labels [lldb][Expression] Add structor variant to LLDB's function call labels Aug 4, 2025
@Michael137 Michael137 marked this pull request as ready for review August 4, 2025 11:33
@Michael137 Michael137 requested review from a team and JDevlieghere as code owners August 4, 2025 11:33
@llvmbot llvmbot added clang Clang issues not falling into any other category libc++abi libc++abi C++ Runtime Library. Not libc++. lldb clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Aug 4, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 4, 2025

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

Depends on #148877


Patch is 30.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/149827.diff

16 Files Affected:

  • (modified) clang/include/clang/Basic/Attr.td (+4-13)
  • (modified) clang/lib/AST/Mangle.cpp (+21-3)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+5-8)
  • (modified) clang/unittests/AST/DeclTest.cpp (+1-8)
  • (modified) libcxxabi/src/demangle/ItaniumDemangle.h (+2)
  • (modified) lldb/include/lldb/Expression/Expression.h (+6-2)
  • (modified) lldb/source/Expression/Expression.cpp (+17-12)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+45-2)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+131-18)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h (+4)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+3-4)
  • (modified) lldb/unittests/Expression/ExpressionTest.cpp (+22-15)
  • (modified) lldb/unittests/Symbol/TestTypeSystemClang.cpp (+7-7)
  • (modified) llvm/include/llvm/Demangle/Demangle.h (+3)
  • (modified) llvm/include/llvm/Demangle/ItaniumDemangle.h (+2)
  • (modified) llvm/lib/Demangle/ItaniumDemangle.cpp (+7-3)
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index b3ff45b3e90a3..a0efb21218312 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1059,22 +1059,13 @@ def AVRSignal : InheritableAttr, TargetSpecificAttr<TargetAVR> {
 def AsmLabel : InheritableAttr {
   let Spellings = [CustomKeyword<"asm">, CustomKeyword<"__asm__">];
   let Args = [
-    // Label specifies the mangled name for the decl.
-    StringArgument<"Label">,
-
-    // IsLiteralLabel specifies whether the label is literal (i.e. suppresses
-    // the global C symbol prefix) or not. If not, the mangle-suppression prefix
-    // ('\01') is omitted from the decl name at the LLVM IR level.
-    //
-    // Non-literal labels are used by some external AST sources like LLDB.
-    BoolArgument<"IsLiteralLabel", /*optional=*/0, /*fake=*/1>
-  ];
+      // Label specifies the mangled name for the decl.
+      StringArgument<"Label">, ];
   let SemaHandler = 0;
   let Documentation = [AsmLabelDocs];
-  let AdditionalMembers =
-[{
+  let AdditionalMembers = [{
 bool isEquivalent(AsmLabelAttr *Other) const {
-  return getLabel() == Other->getLabel() && getIsLiteralLabel() == Other->getIsLiteralLabel();
+  return getLabel() == Other->getLabel();
 }
 }];
 }
diff --git a/clang/lib/AST/Mangle.cpp b/clang/lib/AST/Mangle.cpp
index 9652fdbc4e125..58029476572c5 100644
--- a/clang/lib/AST/Mangle.cpp
+++ b/clang/lib/AST/Mangle.cpp
@@ -152,6 +152,20 @@ bool MangleContext::shouldMangleDeclName(const NamedDecl *D) {
   return shouldMangleCXXName(D);
 }
 
+static llvm::StringRef g_lldb_func_call_label_prefix = "$__lldb_func";
+
+static void emitLLDBAsmLabel(llvm::StringRef label, GlobalDecl GD,
+                             llvm::raw_ostream &Out) {
+  Out << g_lldb_func_call_label_prefix << ":";
+
+  if (llvm::isa<clang::CXXConstructorDecl>(GD.getDecl()))
+    Out << "C" << GD.getCtorType();
+  else if (llvm::isa<clang::CXXDestructorDecl>(GD.getDecl()))
+    Out << "D" << GD.getDtorType();
+
+  Out << label.substr(g_lldb_func_call_label_prefix.size() + 1);
+}
+
 void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
   const ASTContext &ASTContext = getASTContext();
   const NamedDecl *D = cast<NamedDecl>(GD.getDecl());
@@ -161,9 +175,9 @@ void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
   if (const AsmLabelAttr *ALA = D->getAttr<AsmLabelAttr>()) {
     // If we have an asm name, then we use it as the mangling.
 
-    // If the label isn't literal, or if this is an alias for an LLVM intrinsic,
+    // If the label is an alias for an LLVM intrinsic,
     // do not add a "\01" prefix.
-    if (!ALA->getIsLiteralLabel() || ALA->getLabel().starts_with("llvm.")) {
+    if (ALA->getLabel().starts_with("llvm.")) {
       Out << ALA->getLabel();
       return;
     }
@@ -185,7 +199,11 @@ void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
     if (!UserLabelPrefix.empty())
       Out << '\01'; // LLVM IR Marker for __asm("foo")
 
-    Out << ALA->getLabel();
+    if (ALA->getLabel().starts_with(g_lldb_func_call_label_prefix))
+      emitLLDBAsmLabel(ALA->getLabel(), GD, Out);
+    else
+      Out << ALA->getLabel();
+
     return;
   }
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index e2ac648320c0f..885d04b9ab926 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -8113,9 +8113,7 @@ NamedDecl *Sema::ActOnVariableDeclarator(
       }
     }
 
-    NewVD->addAttr(AsmLabelAttr::Create(Context, Label,
-                                        /*IsLiteralLabel=*/true,
-                                        SE->getStrTokenLoc(0)));
+    NewVD->addAttr(AsmLabelAttr::Create(Context, Label, SE->getStrTokenLoc(0)));
   } else if (!ExtnameUndeclaredIdentifiers.empty()) {
     llvm::DenseMap<IdentifierInfo*,AsmLabelAttr*>::iterator I =
       ExtnameUndeclaredIdentifiers.find(NewVD->getIdentifier());
@@ -10345,9 +10343,8 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, DeclContext *DC,
   if (Expr *E = D.getAsmLabel()) {
     // The parser guarantees this is a string.
     StringLiteral *SE = cast<StringLiteral>(E);
-    NewFD->addAttr(AsmLabelAttr::Create(Context, SE->getString(),
-                                        /*IsLiteralLabel=*/true,
-                                        SE->getStrTokenLoc(0)));
+    NewFD->addAttr(
+        AsmLabelAttr::Create(Context, SE->getString(), SE->getStrTokenLoc(0)));
   } else if (!ExtnameUndeclaredIdentifiers.empty()) {
     llvm::DenseMap<IdentifierInfo*,AsmLabelAttr*>::iterator I =
       ExtnameUndeclaredIdentifiers.find(NewFD->getIdentifier());
@@ -20598,8 +20595,8 @@ void Sema::ActOnPragmaRedefineExtname(IdentifierInfo* Name,
                                          LookupOrdinaryName);
   AttributeCommonInfo Info(AliasName, SourceRange(AliasNameLoc),
                            AttributeCommonInfo::Form::Pragma());
-  AsmLabelAttr *Attr = AsmLabelAttr::CreateImplicit(
-      Context, AliasName->getName(), /*IsLiteralLabel=*/true, Info);
+  AsmLabelAttr *Attr =
+      AsmLabelAttr::CreateImplicit(Context, AliasName->getName(), Info);
 
   // If a declaration that:
   // 1) declares a function or a variable
diff --git a/clang/unittests/AST/DeclTest.cpp b/clang/unittests/AST/DeclTest.cpp
index ed635da683aab..afaf413493299 100644
--- a/clang/unittests/AST/DeclTest.cpp
+++ b/clang/unittests/AST/DeclTest.cpp
@@ -74,7 +74,6 @@ TEST(Decl, AsmLabelAttr) {
   StringRef Code = R"(
     struct S {
       void f() {}
-      void g() {}
     };
   )";
   auto AST =
@@ -87,11 +86,8 @@ TEST(Decl, AsmLabelAttr) {
   const auto *DeclS =
       selectFirst<CXXRecordDecl>("d", match(cxxRecordDecl().bind("d"), Ctx));
   NamedDecl *DeclF = *DeclS->method_begin();
-  NamedDecl *DeclG = *(++DeclS->method_begin());
 
-  // Attach asm labels to the decls: one literal, and one not.
-  DeclF->addAttr(AsmLabelAttr::Create(Ctx, "foo", /*LiteralLabel=*/true));
-  DeclG->addAttr(AsmLabelAttr::Create(Ctx, "goo", /*LiteralLabel=*/false));
+  DeclF->addAttr(AsmLabelAttr::Create(Ctx, "foo"));
 
   // Mangle the decl names.
   std::string MangleF, MangleG;
@@ -99,14 +95,11 @@ TEST(Decl, AsmLabelAttr) {
       ItaniumMangleContext::create(Ctx, Diags));
   {
     llvm::raw_string_ostream OS_F(MangleF);
-    llvm::raw_string_ostream OS_G(MangleG);
     MC->mangleName(DeclF, OS_F);
-    MC->mangleName(DeclG, OS_G);
   }
 
   ASSERT_EQ(MangleF, "\x01"
                      "foo");
-  ASSERT_EQ(MangleG, "goo");
 }
 
 TEST(Decl, MangleDependentSizedArray) {
diff --git a/libcxxabi/src/demangle/ItaniumDemangle.h b/libcxxabi/src/demangle/ItaniumDemangle.h
index 6f27da7b9cadf..7b3983bc89367 100644
--- a/libcxxabi/src/demangle/ItaniumDemangle.h
+++ b/libcxxabi/src/demangle/ItaniumDemangle.h
@@ -1766,6 +1766,8 @@ class CtorDtorName final : public Node {
 
   template<typename Fn> void match(Fn F) const { F(Basename, IsDtor, Variant); }
 
+  int getVariant() const { return Variant; }
+
   void printLeft(OutputBuffer &OB) const override {
     if (IsDtor)
       OB += "~";
diff --git a/lldb/include/lldb/Expression/Expression.h b/lldb/include/lldb/Expression/Expression.h
index 20067f469895b..847226167d584 100644
--- a/lldb/include/lldb/Expression/Expression.h
+++ b/lldb/include/lldb/Expression/Expression.h
@@ -103,11 +103,15 @@ class Expression {
 ///
 /// The format being:
 ///
-///   <prefix>:<module uid>:<symbol uid>:<name>
+///   <prefix>:<discriminator>:<module uid>:<symbol uid>:<name>
 ///
 /// The label string needs to stay valid for the entire lifetime
 /// of this object.
 struct FunctionCallLabel {
+  /// Arbitrary string which language plugins can interpret for their
+  /// own needs.
+  llvm::StringRef discriminator;
+
   /// Unique identifier of the lldb_private::Module
   /// which contains the symbol identified by \c symbol_id.
   lldb::user_id_t module_id;
@@ -133,7 +137,7 @@ struct FunctionCallLabel {
   ///
   /// The representation roundtrips through \c fromString:
   /// \code{.cpp}
-  /// llvm::StringRef encoded = "$__lldb_func:0x0:0x0:_Z3foov";
+  /// llvm::StringRef encoded = "$__lldb_func:blah:0x0:0x0:_Z3foov";
   /// FunctionCallLabel label = *fromString(label);
   ///
   /// assert (label.toString() == encoded);
diff --git a/lldb/source/Expression/Expression.cpp b/lldb/source/Expression/Expression.cpp
index 796851ff15ca3..16ecb1d7deef8 100644
--- a/lldb/source/Expression/Expression.cpp
+++ b/lldb/source/Expression/Expression.cpp
@@ -34,10 +34,10 @@ Expression::Expression(ExecutionContextScope &exe_scope)
 
 llvm::Expected<FunctionCallLabel>
 lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
-  llvm::SmallVector<llvm::StringRef, 4> components;
-  label.split(components, ":", /*MaxSplit=*/3);
+  llvm::SmallVector<llvm::StringRef, 5> components;
+  label.split(components, ":", /*MaxSplit=*/4);
 
-  if (components.size() != 4)
+  if (components.size() != 5)
     return llvm::createStringError("malformed function call label.");
 
   if (components[0] != FunctionCallLabelPrefix)
@@ -45,8 +45,10 @@ lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
         "expected function call label prefix '{0}' but found '{1}' instead.",
         FunctionCallLabelPrefix, components[0]));
 
-  llvm::StringRef module_label = components[1];
-  llvm::StringRef die_label = components[2];
+  llvm::StringRef discriminator = components[1];
+  llvm::StringRef module_label = components[2];
+  llvm::StringRef die_label = components[3];
+  llvm::StringRef lookup_name = components[4];
 
   lldb::user_id_t module_id = 0;
   if (!llvm::to_integer(module_label, module_id))
@@ -58,20 +60,23 @@ lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
     return llvm::createStringError(
         llvm::formatv("failed to parse symbol ID from '{0}'.", die_label));
 
-  return FunctionCallLabel{/*.module_id=*/module_id,
+  return FunctionCallLabel{/*.discriminator=*/discriminator,
+                           /*.module_id=*/module_id,
                            /*.symbol_id=*/die_id,
-                           /*.lookup_name=*/components[3]};
+                           /*.lookup_name=*/lookup_name};
 }
 
 std::string lldb_private::FunctionCallLabel::toString() const {
-  return llvm::formatv("{0}:{1:x}:{2:x}:{3}", FunctionCallLabelPrefix,
-                       module_id, symbol_id, lookup_name)
+  return llvm::formatv("{0}:{1}:{2:x}:{3:x}:{4}", FunctionCallLabelPrefix,
+                       discriminator, module_id, symbol_id, lookup_name)
       .str();
 }
 
 void llvm::format_provider<FunctionCallLabel>::format(
     const FunctionCallLabel &label, raw_ostream &OS, StringRef Style) {
-  OS << llvm::formatv("FunctionCallLabel{ module_id: {0:x}, symbol_id: {1:x}, "
-                      "lookup_name: {2} }",
-                      label.module_id, label.symbol_id, label.lookup_name);
+  OS << llvm::formatv("FunctionCallLabel{ discriminator: {0}, module_id: "
+                      "{1:x}, symbol_id: {2:x}, "
+                      "lookup_name: {3} }",
+                      label.discriminator, label.module_id, label.symbol_id,
+                      label.lookup_name);
 }
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 781c1c6c5745d..5c674308160e5 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -250,11 +250,52 @@ static unsigned GetCXXMethodCVQuals(const DWARFDIE &subprogram,
   return cv_quals;
 }
 
+static const char *GetMangledOrStructorName(const DWARFDIE &die) {
+  const char *name = die.GetMangledName(/*substitute_name_allowed*/ false);
+  if (name)
+    return name;
+
+  name = die.GetName();
+  if (!name)
+    return nullptr;
+
+  DWARFDIE parent = die.GetParent();
+  if (!parent.IsStructUnionOrClass())
+    return nullptr;
+
+  const char *parent_name = parent.GetName();
+  if (!parent_name)
+    return nullptr;
+
+  // Constructor.
+  if (::strcmp(parent_name, name) == 0)
+    return name;
+
+  // Destructor.
+  if (name[0] == '~' && ::strcmp(parent_name, name + 1))
+    return name;
+
+  return nullptr;
+}
+
 static std::string MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
-  char const *name = die.GetMangledName(/*substitute_name_allowed*/ false);
+  char const *name = GetMangledOrStructorName(die);
   if (!name)
     return {};
 
+  auto *cu = die.GetCU();
+  if (!cu)
+    return {};
+
+  // FIXME: When resolving function call labels, we check that
+  // that the definition's DW_AT_specification points to the
+  // declaration that we encoded into the label here. But if the
+  // declaration came from a type-unit (and the definition from
+  // .debug_info), that check won't work. So for now, don't use
+  // function call labels for declaration DIEs from type-units.
+  if (cu->IsTypeUnit())
+    return {};
+
   SymbolFileDWARF *dwarf = die.GetDWARF();
   if (!dwarf)
     return {};
@@ -285,7 +326,9 @@ static std::string MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
   if (die_id == LLDB_INVALID_UID)
     return {};
 
-  return FunctionCallLabel{/*module_id=*/module_id,
+  // Note, discriminator is added by Clang during mangling.
+  return FunctionCallLabel{/*discriminator=*/{},
+                           /*module_id=*/module_id,
                            /*symbol_id=*/die_id,
                            /*.lookup_name=*/name}
       .toString();
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 42a66ce75d6d6..911ce5930ca4c 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -7,7 +7,9 @@
 //===----------------------------------------------------------------------===//
 
 #include "SymbolFileDWARF.h"
+#include "clang/Basic/ABI.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/DebugInfo/DWARF/DWARFAddressRange.h"
 #include "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
 #include "llvm/Support/Casting.h"
@@ -78,6 +80,7 @@
 
 #include "llvm/DebugInfo/DWARF/DWARFContext.h"
 #include "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
+#include "llvm/Demangle/Demangle.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/FormatVariadic.h"
 
@@ -2482,6 +2485,133 @@ bool SymbolFileDWARF::ResolveFunction(const DWARFDIE &orig_die,
   return false;
 }
 
+static int ClangToItaniumCtorKind(clang::CXXCtorType kind) {
+  switch (kind) {
+  case clang::CXXCtorType::Ctor_Complete:
+    return 1;
+  case clang::CXXCtorType::Ctor_Base:
+    return 2;
+  case clang::CXXCtorType::Ctor_CopyingClosure:
+  case clang::CXXCtorType::Ctor_DefaultClosure:
+  case clang::CXXCtorType::Ctor_Comdat:
+    llvm_unreachable("Unexpected constructor kind.");
+  }
+}
+
+static int ClangToItaniumDtorKind(clang::CXXDtorType kind) {
+  switch (kind) {
+  case clang::CXXDtorType::Dtor_Deleting:
+    return 0;
+  case clang::CXXDtorType::Dtor_Complete:
+    return 1;
+  case clang::CXXDtorType::Dtor_Base:
+    return 2;
+  case clang::CXXDtorType::Dtor_Comdat:
+    llvm_unreachable("Unexpected destructor kind.");
+  }
+}
+
+static std::optional<int>
+GetItaniumCtorDtorVariant(llvm::StringRef discriminator) {
+  const bool is_ctor = discriminator.consume_front("C");
+  if (!is_ctor && !discriminator.consume_front("D"))
+    return std::nullopt;
+
+  uint64_t structor_kind;
+  if (!llvm::to_integer(discriminator, structor_kind))
+    return std::nullopt;
+
+  if (is_ctor) {
+    if (structor_kind > clang::CXXCtorType::Ctor_DefaultClosure)
+      return std::nullopt;
+
+    return ClangToItaniumCtorKind(
+        static_cast<clang::CXXCtorType>(structor_kind));
+  }
+
+  if (structor_kind > clang::CXXDtorType::Dtor_Deleting)
+    return std::nullopt;
+
+  return ClangToItaniumDtorKind(static_cast<clang::CXXDtorType>(structor_kind));
+}
+
+DWARFDIE SymbolFileDWARF::FindFunctionDefinition(const FunctionCallLabel &label,
+                                                 const DWARFDIE &declaration) {
+  DWARFDIE definition;
+  llvm::DenseMap<int, DWARFDIE> structor_variant_to_die;
+
+  // eFunctionNameTypeFull for mangled name lookup.
+  // eFunctionNameTypeMethod is required for structor lookups (since we look
+  // those up by DW_AT_name).
+  Module::LookupInfo info(ConstString(label.lookup_name),
+                          lldb::eFunctionNameTypeFull |
+                              lldb::eFunctionNameTypeMethod,
+                          lldb::eLanguageTypeUnknown);
+
+  m_index->GetFunctions(info, *this, {}, [&](DWARFDIE entry) {
+    if (entry.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
+      return IterationAction::Continue;
+
+    auto spec = entry.GetAttributeValueAsReferenceDIE(DW_AT_specification);
+    if (!spec)
+      return IterationAction::Continue;
+
+    if (spec != declaration)
+      return IterationAction::Continue;
+
+    // We're not picking a specific structor variant DIE, so we're done here.
+    if (label.discriminator.empty()) {
+      definition = entry;
+      return IterationAction::Stop;
+    }
+
+    const char *mangled =
+        entry.GetMangledName(/*substitute_name_allowed=*/false);
+    if (!mangled)
+      return IterationAction::Continue;
+
+    llvm::ItaniumPartialDemangler D;
+    if (D.partialDemangle(mangled))
+      return IterationAction::Continue;
+
+    auto structor_variant = D.getCtorOrDtorVariant();
+    if (!structor_variant)
+      return IterationAction::Continue;
+
+    auto [_, inserted] = structor_variant_to_die.try_emplace(*structor_variant,
+                                                             std::move(entry));
+    assert(inserted);
+
+    // The compiler may choose to alias the constructor variants
+    // (notably this happens on Linux), so we might not have a definition
+    // DIE for some structor variants. Hence we iterate over all variants
+    // and pick the most appropriate one out of those.
+    return IterationAction::Continue;
+  });
+
+  if (definition)
+    return definition;
+
+  auto label_variant = GetItaniumCtorDtorVariant(label.discriminator);
+  if (!label_variant)
+    return {};
+
+  auto it = structor_variant_to_die.find(*label_variant);
+
+  // Found the exact variant.
+  if (it != structor_variant_to_die.end())
+    return it->getSecond();
+
+  // C1 was aliased to C2
+  if (!label.lookup_name.starts_with("~") && label_variant == 1) {
+    if (auto it = structor_variant_to_die.find(2);
+        it != structor_variant_to_die.end())
+      return it->getSecond();
+  }
+
+  return {};
+}
+
 llvm::Expected<SymbolContext>
 SymbolFileDWARF::ResolveFunctionCallLabel(const FunctionCallLabel &label) {
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
@@ -2494,24 +2624,7 @@ SymbolFileDWARF::ResolveFunctionCallLabel(const FunctionCallLabel &label) {
   // Label was created using a declaration DIE. Need to fetch the definition
   // to resolve the function call.
   if (die.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0)) {
-    Module::LookupInfo info(ConstString(label.lookup_name),
-                            lldb::eFunctionNameTypeFull,
-                            lldb::eLanguageTypeUnknown);
-
-    m_index->GetFunctions(info, *this, {}, [&](DWARFDIE entry) {
-      if (entry.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
-        return IterationAction::Continue;
-
-      // We don't check whether the specification DIE for this function
-      // corresponds to the declaration DIE because the declaration might be in
-      // a type-unit but the definition in the compile-unit (and it's
-      // specifcation would point to the declaration in the compile-unit). We
-      // rely on the mangled name within the module to be enough to find us the
-      // unique definition.
-      die = entry;
-      return IterationAction::Stop;
-    });
-
+    die = FindFunctionDefinition(label, die);
     if (die.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
       return llvm::createStringError(
           llvm::formatv("failed to find definition DIE for {0}", label));
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
index 3ec538da8cf77..843346619db37 100644
--- a/lldb/s...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Aug 4, 2025

@llvm/pr-subscribers-libcxxabi

Author: Michael Buch (Michael137)

Changes

Depends on #148877


Patch is 30.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/149827.diff

16 Files Affected:

  • (modified) clang/include/clang/Basic/Attr.td (+4-13)
  • (modified) clang/lib/AST/Mangle.cpp (+21-3)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+5-8)
  • (modified) clang/unittests/AST/DeclTest.cpp (+1-8)
  • (modified) libcxxabi/src/demangle/ItaniumDemangle.h (+2)
  • (modified) lldb/include/lldb/Expression/Expression.h (+6-2)
  • (modified) lldb/source/Expression/Expression.cpp (+17-12)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+45-2)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+131-18)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h (+4)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+3-4)
  • (modified) lldb/unittests/Expression/ExpressionTest.cpp (+22-15)
  • (modified) lldb/unittests/Symbol/TestTypeSystemClang.cpp (+7-7)
  • (modified) llvm/include/llvm/Demangle/Demangle.h (+3)
  • (modified) llvm/include/llvm/Demangle/ItaniumDemangle.h (+2)
  • (modified) llvm/lib/Demangle/ItaniumDemangle.cpp (+7-3)
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index b3ff45b3e90a3..a0efb21218312 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1059,22 +1059,13 @@ def AVRSignal : InheritableAttr, TargetSpecificAttr<TargetAVR> {
 def AsmLabel : InheritableAttr {
   let Spellings = [CustomKeyword<"asm">, CustomKeyword<"__asm__">];
   let Args = [
-    // Label specifies the mangled name for the decl.
-    StringArgument<"Label">,
-
-    // IsLiteralLabel specifies whether the label is literal (i.e. suppresses
-    // the global C symbol prefix) or not. If not, the mangle-suppression prefix
-    // ('\01') is omitted from the decl name at the LLVM IR level.
-    //
-    // Non-literal labels are used by some external AST sources like LLDB.
-    BoolArgument<"IsLiteralLabel", /*optional=*/0, /*fake=*/1>
-  ];
+      // Label specifies the mangled name for the decl.
+      StringArgument<"Label">, ];
   let SemaHandler = 0;
   let Documentation = [AsmLabelDocs];
-  let AdditionalMembers =
-[{
+  let AdditionalMembers = [{
 bool isEquivalent(AsmLabelAttr *Other) const {
-  return getLabel() == Other->getLabel() && getIsLiteralLabel() == Other->getIsLiteralLabel();
+  return getLabel() == Other->getLabel();
 }
 }];
 }
diff --git a/clang/lib/AST/Mangle.cpp b/clang/lib/AST/Mangle.cpp
index 9652fdbc4e125..58029476572c5 100644
--- a/clang/lib/AST/Mangle.cpp
+++ b/clang/lib/AST/Mangle.cpp
@@ -152,6 +152,20 @@ bool MangleContext::shouldMangleDeclName(const NamedDecl *D) {
   return shouldMangleCXXName(D);
 }
 
+static llvm::StringRef g_lldb_func_call_label_prefix = "$__lldb_func";
+
+static void emitLLDBAsmLabel(llvm::StringRef label, GlobalDecl GD,
+                             llvm::raw_ostream &Out) {
+  Out << g_lldb_func_call_label_prefix << ":";
+
+  if (llvm::isa<clang::CXXConstructorDecl>(GD.getDecl()))
+    Out << "C" << GD.getCtorType();
+  else if (llvm::isa<clang::CXXDestructorDecl>(GD.getDecl()))
+    Out << "D" << GD.getDtorType();
+
+  Out << label.substr(g_lldb_func_call_label_prefix.size() + 1);
+}
+
 void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
   const ASTContext &ASTContext = getASTContext();
   const NamedDecl *D = cast<NamedDecl>(GD.getDecl());
@@ -161,9 +175,9 @@ void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
   if (const AsmLabelAttr *ALA = D->getAttr<AsmLabelAttr>()) {
     // If we have an asm name, then we use it as the mangling.
 
-    // If the label isn't literal, or if this is an alias for an LLVM intrinsic,
+    // If the label is an alias for an LLVM intrinsic,
     // do not add a "\01" prefix.
-    if (!ALA->getIsLiteralLabel() || ALA->getLabel().starts_with("llvm.")) {
+    if (ALA->getLabel().starts_with("llvm.")) {
       Out << ALA->getLabel();
       return;
     }
@@ -185,7 +199,11 @@ void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
     if (!UserLabelPrefix.empty())
       Out << '\01'; // LLVM IR Marker for __asm("foo")
 
-    Out << ALA->getLabel();
+    if (ALA->getLabel().starts_with(g_lldb_func_call_label_prefix))
+      emitLLDBAsmLabel(ALA->getLabel(), GD, Out);
+    else
+      Out << ALA->getLabel();
+
     return;
   }
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index e2ac648320c0f..885d04b9ab926 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -8113,9 +8113,7 @@ NamedDecl *Sema::ActOnVariableDeclarator(
       }
     }
 
-    NewVD->addAttr(AsmLabelAttr::Create(Context, Label,
-                                        /*IsLiteralLabel=*/true,
-                                        SE->getStrTokenLoc(0)));
+    NewVD->addAttr(AsmLabelAttr::Create(Context, Label, SE->getStrTokenLoc(0)));
   } else if (!ExtnameUndeclaredIdentifiers.empty()) {
     llvm::DenseMap<IdentifierInfo*,AsmLabelAttr*>::iterator I =
       ExtnameUndeclaredIdentifiers.find(NewVD->getIdentifier());
@@ -10345,9 +10343,8 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, DeclContext *DC,
   if (Expr *E = D.getAsmLabel()) {
     // The parser guarantees this is a string.
     StringLiteral *SE = cast<StringLiteral>(E);
-    NewFD->addAttr(AsmLabelAttr::Create(Context, SE->getString(),
-                                        /*IsLiteralLabel=*/true,
-                                        SE->getStrTokenLoc(0)));
+    NewFD->addAttr(
+        AsmLabelAttr::Create(Context, SE->getString(), SE->getStrTokenLoc(0)));
   } else if (!ExtnameUndeclaredIdentifiers.empty()) {
     llvm::DenseMap<IdentifierInfo*,AsmLabelAttr*>::iterator I =
       ExtnameUndeclaredIdentifiers.find(NewFD->getIdentifier());
@@ -20598,8 +20595,8 @@ void Sema::ActOnPragmaRedefineExtname(IdentifierInfo* Name,
                                          LookupOrdinaryName);
   AttributeCommonInfo Info(AliasName, SourceRange(AliasNameLoc),
                            AttributeCommonInfo::Form::Pragma());
-  AsmLabelAttr *Attr = AsmLabelAttr::CreateImplicit(
-      Context, AliasName->getName(), /*IsLiteralLabel=*/true, Info);
+  AsmLabelAttr *Attr =
+      AsmLabelAttr::CreateImplicit(Context, AliasName->getName(), Info);
 
   // If a declaration that:
   // 1) declares a function or a variable
diff --git a/clang/unittests/AST/DeclTest.cpp b/clang/unittests/AST/DeclTest.cpp
index ed635da683aab..afaf413493299 100644
--- a/clang/unittests/AST/DeclTest.cpp
+++ b/clang/unittests/AST/DeclTest.cpp
@@ -74,7 +74,6 @@ TEST(Decl, AsmLabelAttr) {
   StringRef Code = R"(
     struct S {
       void f() {}
-      void g() {}
     };
   )";
   auto AST =
@@ -87,11 +86,8 @@ TEST(Decl, AsmLabelAttr) {
   const auto *DeclS =
       selectFirst<CXXRecordDecl>("d", match(cxxRecordDecl().bind("d"), Ctx));
   NamedDecl *DeclF = *DeclS->method_begin();
-  NamedDecl *DeclG = *(++DeclS->method_begin());
 
-  // Attach asm labels to the decls: one literal, and one not.
-  DeclF->addAttr(AsmLabelAttr::Create(Ctx, "foo", /*LiteralLabel=*/true));
-  DeclG->addAttr(AsmLabelAttr::Create(Ctx, "goo", /*LiteralLabel=*/false));
+  DeclF->addAttr(AsmLabelAttr::Create(Ctx, "foo"));
 
   // Mangle the decl names.
   std::string MangleF, MangleG;
@@ -99,14 +95,11 @@ TEST(Decl, AsmLabelAttr) {
       ItaniumMangleContext::create(Ctx, Diags));
   {
     llvm::raw_string_ostream OS_F(MangleF);
-    llvm::raw_string_ostream OS_G(MangleG);
     MC->mangleName(DeclF, OS_F);
-    MC->mangleName(DeclG, OS_G);
   }
 
   ASSERT_EQ(MangleF, "\x01"
                      "foo");
-  ASSERT_EQ(MangleG, "goo");
 }
 
 TEST(Decl, MangleDependentSizedArray) {
diff --git a/libcxxabi/src/demangle/ItaniumDemangle.h b/libcxxabi/src/demangle/ItaniumDemangle.h
index 6f27da7b9cadf..7b3983bc89367 100644
--- a/libcxxabi/src/demangle/ItaniumDemangle.h
+++ b/libcxxabi/src/demangle/ItaniumDemangle.h
@@ -1766,6 +1766,8 @@ class CtorDtorName final : public Node {
 
   template<typename Fn> void match(Fn F) const { F(Basename, IsDtor, Variant); }
 
+  int getVariant() const { return Variant; }
+
   void printLeft(OutputBuffer &OB) const override {
     if (IsDtor)
       OB += "~";
diff --git a/lldb/include/lldb/Expression/Expression.h b/lldb/include/lldb/Expression/Expression.h
index 20067f469895b..847226167d584 100644
--- a/lldb/include/lldb/Expression/Expression.h
+++ b/lldb/include/lldb/Expression/Expression.h
@@ -103,11 +103,15 @@ class Expression {
 ///
 /// The format being:
 ///
-///   <prefix>:<module uid>:<symbol uid>:<name>
+///   <prefix>:<discriminator>:<module uid>:<symbol uid>:<name>
 ///
 /// The label string needs to stay valid for the entire lifetime
 /// of this object.
 struct FunctionCallLabel {
+  /// Arbitrary string which language plugins can interpret for their
+  /// own needs.
+  llvm::StringRef discriminator;
+
   /// Unique identifier of the lldb_private::Module
   /// which contains the symbol identified by \c symbol_id.
   lldb::user_id_t module_id;
@@ -133,7 +137,7 @@ struct FunctionCallLabel {
   ///
   /// The representation roundtrips through \c fromString:
   /// \code{.cpp}
-  /// llvm::StringRef encoded = "$__lldb_func:0x0:0x0:_Z3foov";
+  /// llvm::StringRef encoded = "$__lldb_func:blah:0x0:0x0:_Z3foov";
   /// FunctionCallLabel label = *fromString(label);
   ///
   /// assert (label.toString() == encoded);
diff --git a/lldb/source/Expression/Expression.cpp b/lldb/source/Expression/Expression.cpp
index 796851ff15ca3..16ecb1d7deef8 100644
--- a/lldb/source/Expression/Expression.cpp
+++ b/lldb/source/Expression/Expression.cpp
@@ -34,10 +34,10 @@ Expression::Expression(ExecutionContextScope &exe_scope)
 
 llvm::Expected<FunctionCallLabel>
 lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
-  llvm::SmallVector<llvm::StringRef, 4> components;
-  label.split(components, ":", /*MaxSplit=*/3);
+  llvm::SmallVector<llvm::StringRef, 5> components;
+  label.split(components, ":", /*MaxSplit=*/4);
 
-  if (components.size() != 4)
+  if (components.size() != 5)
     return llvm::createStringError("malformed function call label.");
 
   if (components[0] != FunctionCallLabelPrefix)
@@ -45,8 +45,10 @@ lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
         "expected function call label prefix '{0}' but found '{1}' instead.",
         FunctionCallLabelPrefix, components[0]));
 
-  llvm::StringRef module_label = components[1];
-  llvm::StringRef die_label = components[2];
+  llvm::StringRef discriminator = components[1];
+  llvm::StringRef module_label = components[2];
+  llvm::StringRef die_label = components[3];
+  llvm::StringRef lookup_name = components[4];
 
   lldb::user_id_t module_id = 0;
   if (!llvm::to_integer(module_label, module_id))
@@ -58,20 +60,23 @@ lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
     return llvm::createStringError(
         llvm::formatv("failed to parse symbol ID from '{0}'.", die_label));
 
-  return FunctionCallLabel{/*.module_id=*/module_id,
+  return FunctionCallLabel{/*.discriminator=*/discriminator,
+                           /*.module_id=*/module_id,
                            /*.symbol_id=*/die_id,
-                           /*.lookup_name=*/components[3]};
+                           /*.lookup_name=*/lookup_name};
 }
 
 std::string lldb_private::FunctionCallLabel::toString() const {
-  return llvm::formatv("{0}:{1:x}:{2:x}:{3}", FunctionCallLabelPrefix,
-                       module_id, symbol_id, lookup_name)
+  return llvm::formatv("{0}:{1}:{2:x}:{3:x}:{4}", FunctionCallLabelPrefix,
+                       discriminator, module_id, symbol_id, lookup_name)
       .str();
 }
 
 void llvm::format_provider<FunctionCallLabel>::format(
     const FunctionCallLabel &label, raw_ostream &OS, StringRef Style) {
-  OS << llvm::formatv("FunctionCallLabel{ module_id: {0:x}, symbol_id: {1:x}, "
-                      "lookup_name: {2} }",
-                      label.module_id, label.symbol_id, label.lookup_name);
+  OS << llvm::formatv("FunctionCallLabel{ discriminator: {0}, module_id: "
+                      "{1:x}, symbol_id: {2:x}, "
+                      "lookup_name: {3} }",
+                      label.discriminator, label.module_id, label.symbol_id,
+                      label.lookup_name);
 }
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 781c1c6c5745d..5c674308160e5 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -250,11 +250,52 @@ static unsigned GetCXXMethodCVQuals(const DWARFDIE &subprogram,
   return cv_quals;
 }
 
+static const char *GetMangledOrStructorName(const DWARFDIE &die) {
+  const char *name = die.GetMangledName(/*substitute_name_allowed*/ false);
+  if (name)
+    return name;
+
+  name = die.GetName();
+  if (!name)
+    return nullptr;
+
+  DWARFDIE parent = die.GetParent();
+  if (!parent.IsStructUnionOrClass())
+    return nullptr;
+
+  const char *parent_name = parent.GetName();
+  if (!parent_name)
+    return nullptr;
+
+  // Constructor.
+  if (::strcmp(parent_name, name) == 0)
+    return name;
+
+  // Destructor.
+  if (name[0] == '~' && ::strcmp(parent_name, name + 1))
+    return name;
+
+  return nullptr;
+}
+
 static std::string MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
-  char const *name = die.GetMangledName(/*substitute_name_allowed*/ false);
+  char const *name = GetMangledOrStructorName(die);
   if (!name)
     return {};
 
+  auto *cu = die.GetCU();
+  if (!cu)
+    return {};
+
+  // FIXME: When resolving function call labels, we check that
+  // that the definition's DW_AT_specification points to the
+  // declaration that we encoded into the label here. But if the
+  // declaration came from a type-unit (and the definition from
+  // .debug_info), that check won't work. So for now, don't use
+  // function call labels for declaration DIEs from type-units.
+  if (cu->IsTypeUnit())
+    return {};
+
   SymbolFileDWARF *dwarf = die.GetDWARF();
   if (!dwarf)
     return {};
@@ -285,7 +326,9 @@ static std::string MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
   if (die_id == LLDB_INVALID_UID)
     return {};
 
-  return FunctionCallLabel{/*module_id=*/module_id,
+  // Note, discriminator is added by Clang during mangling.
+  return FunctionCallLabel{/*discriminator=*/{},
+                           /*module_id=*/module_id,
                            /*symbol_id=*/die_id,
                            /*.lookup_name=*/name}
       .toString();
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 42a66ce75d6d6..911ce5930ca4c 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -7,7 +7,9 @@
 //===----------------------------------------------------------------------===//
 
 #include "SymbolFileDWARF.h"
+#include "clang/Basic/ABI.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/DebugInfo/DWARF/DWARFAddressRange.h"
 #include "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
 #include "llvm/Support/Casting.h"
@@ -78,6 +80,7 @@
 
 #include "llvm/DebugInfo/DWARF/DWARFContext.h"
 #include "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
+#include "llvm/Demangle/Demangle.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/FormatVariadic.h"
 
@@ -2482,6 +2485,133 @@ bool SymbolFileDWARF::ResolveFunction(const DWARFDIE &orig_die,
   return false;
 }
 
+static int ClangToItaniumCtorKind(clang::CXXCtorType kind) {
+  switch (kind) {
+  case clang::CXXCtorType::Ctor_Complete:
+    return 1;
+  case clang::CXXCtorType::Ctor_Base:
+    return 2;
+  case clang::CXXCtorType::Ctor_CopyingClosure:
+  case clang::CXXCtorType::Ctor_DefaultClosure:
+  case clang::CXXCtorType::Ctor_Comdat:
+    llvm_unreachable("Unexpected constructor kind.");
+  }
+}
+
+static int ClangToItaniumDtorKind(clang::CXXDtorType kind) {
+  switch (kind) {
+  case clang::CXXDtorType::Dtor_Deleting:
+    return 0;
+  case clang::CXXDtorType::Dtor_Complete:
+    return 1;
+  case clang::CXXDtorType::Dtor_Base:
+    return 2;
+  case clang::CXXDtorType::Dtor_Comdat:
+    llvm_unreachable("Unexpected destructor kind.");
+  }
+}
+
+static std::optional<int>
+GetItaniumCtorDtorVariant(llvm::StringRef discriminator) {
+  const bool is_ctor = discriminator.consume_front("C");
+  if (!is_ctor && !discriminator.consume_front("D"))
+    return std::nullopt;
+
+  uint64_t structor_kind;
+  if (!llvm::to_integer(discriminator, structor_kind))
+    return std::nullopt;
+
+  if (is_ctor) {
+    if (structor_kind > clang::CXXCtorType::Ctor_DefaultClosure)
+      return std::nullopt;
+
+    return ClangToItaniumCtorKind(
+        static_cast<clang::CXXCtorType>(structor_kind));
+  }
+
+  if (structor_kind > clang::CXXDtorType::Dtor_Deleting)
+    return std::nullopt;
+
+  return ClangToItaniumDtorKind(static_cast<clang::CXXDtorType>(structor_kind));
+}
+
+DWARFDIE SymbolFileDWARF::FindFunctionDefinition(const FunctionCallLabel &label,
+                                                 const DWARFDIE &declaration) {
+  DWARFDIE definition;
+  llvm::DenseMap<int, DWARFDIE> structor_variant_to_die;
+
+  // eFunctionNameTypeFull for mangled name lookup.
+  // eFunctionNameTypeMethod is required for structor lookups (since we look
+  // those up by DW_AT_name).
+  Module::LookupInfo info(ConstString(label.lookup_name),
+                          lldb::eFunctionNameTypeFull |
+                              lldb::eFunctionNameTypeMethod,
+                          lldb::eLanguageTypeUnknown);
+
+  m_index->GetFunctions(info, *this, {}, [&](DWARFDIE entry) {
+    if (entry.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
+      return IterationAction::Continue;
+
+    auto spec = entry.GetAttributeValueAsReferenceDIE(DW_AT_specification);
+    if (!spec)
+      return IterationAction::Continue;
+
+    if (spec != declaration)
+      return IterationAction::Continue;
+
+    // We're not picking a specific structor variant DIE, so we're done here.
+    if (label.discriminator.empty()) {
+      definition = entry;
+      return IterationAction::Stop;
+    }
+
+    const char *mangled =
+        entry.GetMangledName(/*substitute_name_allowed=*/false);
+    if (!mangled)
+      return IterationAction::Continue;
+
+    llvm::ItaniumPartialDemangler D;
+    if (D.partialDemangle(mangled))
+      return IterationAction::Continue;
+
+    auto structor_variant = D.getCtorOrDtorVariant();
+    if (!structor_variant)
+      return IterationAction::Continue;
+
+    auto [_, inserted] = structor_variant_to_die.try_emplace(*structor_variant,
+                                                             std::move(entry));
+    assert(inserted);
+
+    // The compiler may choose to alias the constructor variants
+    // (notably this happens on Linux), so we might not have a definition
+    // DIE for some structor variants. Hence we iterate over all variants
+    // and pick the most appropriate one out of those.
+    return IterationAction::Continue;
+  });
+
+  if (definition)
+    return definition;
+
+  auto label_variant = GetItaniumCtorDtorVariant(label.discriminator);
+  if (!label_variant)
+    return {};
+
+  auto it = structor_variant_to_die.find(*label_variant);
+
+  // Found the exact variant.
+  if (it != structor_variant_to_die.end())
+    return it->getSecond();
+
+  // C1 was aliased to C2
+  if (!label.lookup_name.starts_with("~") && label_variant == 1) {
+    if (auto it = structor_variant_to_die.find(2);
+        it != structor_variant_to_die.end())
+      return it->getSecond();
+  }
+
+  return {};
+}
+
 llvm::Expected<SymbolContext>
 SymbolFileDWARF::ResolveFunctionCallLabel(const FunctionCallLabel &label) {
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
@@ -2494,24 +2624,7 @@ SymbolFileDWARF::ResolveFunctionCallLabel(const FunctionCallLabel &label) {
   // Label was created using a declaration DIE. Need to fetch the definition
   // to resolve the function call.
   if (die.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0)) {
-    Module::LookupInfo info(ConstString(label.lookup_name),
-                            lldb::eFunctionNameTypeFull,
-                            lldb::eLanguageTypeUnknown);
-
-    m_index->GetFunctions(info, *this, {}, [&](DWARFDIE entry) {
-      if (entry.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
-        return IterationAction::Continue;
-
-      // We don't check whether the specification DIE for this function
-      // corresponds to the declaration DIE because the declaration might be in
-      // a type-unit but the definition in the compile-unit (and it's
-      // specifcation would point to the declaration in the compile-unit). We
-      // rely on the mangled name within the module to be enough to find us the
-      // unique definition.
-      die = entry;
-      return IterationAction::Stop;
-    });
-
+    die = FindFunctionDefinition(label, die);
     if (die.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
       return llvm::createStringError(
           llvm::formatv("failed to find definition DIE for {0}", label));
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
index 3ec538da8cf77..843346619db37 100644
--- a/lldb/s...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Aug 4, 2025

@llvm/pr-subscribers-clang

Author: Michael Buch (Michael137)

Changes

Depends on #148877


Patch is 30.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/149827.diff

16 Files Affected:

  • (modified) clang/include/clang/Basic/Attr.td (+4-13)
  • (modified) clang/lib/AST/Mangle.cpp (+21-3)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+5-8)
  • (modified) clang/unittests/AST/DeclTest.cpp (+1-8)
  • (modified) libcxxabi/src/demangle/ItaniumDemangle.h (+2)
  • (modified) lldb/include/lldb/Expression/Expression.h (+6-2)
  • (modified) lldb/source/Expression/Expression.cpp (+17-12)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+45-2)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+131-18)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h (+4)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+3-4)
  • (modified) lldb/unittests/Expression/ExpressionTest.cpp (+22-15)
  • (modified) lldb/unittests/Symbol/TestTypeSystemClang.cpp (+7-7)
  • (modified) llvm/include/llvm/Demangle/Demangle.h (+3)
  • (modified) llvm/include/llvm/Demangle/ItaniumDemangle.h (+2)
  • (modified) llvm/lib/Demangle/ItaniumDemangle.cpp (+7-3)
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index b3ff45b3e90a3..a0efb21218312 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1059,22 +1059,13 @@ def AVRSignal : InheritableAttr, TargetSpecificAttr<TargetAVR> {
 def AsmLabel : InheritableAttr {
   let Spellings = [CustomKeyword<"asm">, CustomKeyword<"__asm__">];
   let Args = [
-    // Label specifies the mangled name for the decl.
-    StringArgument<"Label">,
-
-    // IsLiteralLabel specifies whether the label is literal (i.e. suppresses
-    // the global C symbol prefix) or not. If not, the mangle-suppression prefix
-    // ('\01') is omitted from the decl name at the LLVM IR level.
-    //
-    // Non-literal labels are used by some external AST sources like LLDB.
-    BoolArgument<"IsLiteralLabel", /*optional=*/0, /*fake=*/1>
-  ];
+      // Label specifies the mangled name for the decl.
+      StringArgument<"Label">, ];
   let SemaHandler = 0;
   let Documentation = [AsmLabelDocs];
-  let AdditionalMembers =
-[{
+  let AdditionalMembers = [{
 bool isEquivalent(AsmLabelAttr *Other) const {
-  return getLabel() == Other->getLabel() && getIsLiteralLabel() == Other->getIsLiteralLabel();
+  return getLabel() == Other->getLabel();
 }
 }];
 }
diff --git a/clang/lib/AST/Mangle.cpp b/clang/lib/AST/Mangle.cpp
index 9652fdbc4e125..58029476572c5 100644
--- a/clang/lib/AST/Mangle.cpp
+++ b/clang/lib/AST/Mangle.cpp
@@ -152,6 +152,20 @@ bool MangleContext::shouldMangleDeclName(const NamedDecl *D) {
   return shouldMangleCXXName(D);
 }
 
+static llvm::StringRef g_lldb_func_call_label_prefix = "$__lldb_func";
+
+static void emitLLDBAsmLabel(llvm::StringRef label, GlobalDecl GD,
+                             llvm::raw_ostream &Out) {
+  Out << g_lldb_func_call_label_prefix << ":";
+
+  if (llvm::isa<clang::CXXConstructorDecl>(GD.getDecl()))
+    Out << "C" << GD.getCtorType();
+  else if (llvm::isa<clang::CXXDestructorDecl>(GD.getDecl()))
+    Out << "D" << GD.getDtorType();
+
+  Out << label.substr(g_lldb_func_call_label_prefix.size() + 1);
+}
+
 void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
   const ASTContext &ASTContext = getASTContext();
   const NamedDecl *D = cast<NamedDecl>(GD.getDecl());
@@ -161,9 +175,9 @@ void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
   if (const AsmLabelAttr *ALA = D->getAttr<AsmLabelAttr>()) {
     // If we have an asm name, then we use it as the mangling.
 
-    // If the label isn't literal, or if this is an alias for an LLVM intrinsic,
+    // If the label is an alias for an LLVM intrinsic,
     // do not add a "\01" prefix.
-    if (!ALA->getIsLiteralLabel() || ALA->getLabel().starts_with("llvm.")) {
+    if (ALA->getLabel().starts_with("llvm.")) {
       Out << ALA->getLabel();
       return;
     }
@@ -185,7 +199,11 @@ void MangleContext::mangleName(GlobalDecl GD, raw_ostream &Out) {
     if (!UserLabelPrefix.empty())
       Out << '\01'; // LLVM IR Marker for __asm("foo")
 
-    Out << ALA->getLabel();
+    if (ALA->getLabel().starts_with(g_lldb_func_call_label_prefix))
+      emitLLDBAsmLabel(ALA->getLabel(), GD, Out);
+    else
+      Out << ALA->getLabel();
+
     return;
   }
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index e2ac648320c0f..885d04b9ab926 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -8113,9 +8113,7 @@ NamedDecl *Sema::ActOnVariableDeclarator(
       }
     }
 
-    NewVD->addAttr(AsmLabelAttr::Create(Context, Label,
-                                        /*IsLiteralLabel=*/true,
-                                        SE->getStrTokenLoc(0)));
+    NewVD->addAttr(AsmLabelAttr::Create(Context, Label, SE->getStrTokenLoc(0)));
   } else if (!ExtnameUndeclaredIdentifiers.empty()) {
     llvm::DenseMap<IdentifierInfo*,AsmLabelAttr*>::iterator I =
       ExtnameUndeclaredIdentifiers.find(NewVD->getIdentifier());
@@ -10345,9 +10343,8 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, DeclContext *DC,
   if (Expr *E = D.getAsmLabel()) {
     // The parser guarantees this is a string.
     StringLiteral *SE = cast<StringLiteral>(E);
-    NewFD->addAttr(AsmLabelAttr::Create(Context, SE->getString(),
-                                        /*IsLiteralLabel=*/true,
-                                        SE->getStrTokenLoc(0)));
+    NewFD->addAttr(
+        AsmLabelAttr::Create(Context, SE->getString(), SE->getStrTokenLoc(0)));
   } else if (!ExtnameUndeclaredIdentifiers.empty()) {
     llvm::DenseMap<IdentifierInfo*,AsmLabelAttr*>::iterator I =
       ExtnameUndeclaredIdentifiers.find(NewFD->getIdentifier());
@@ -20598,8 +20595,8 @@ void Sema::ActOnPragmaRedefineExtname(IdentifierInfo* Name,
                                          LookupOrdinaryName);
   AttributeCommonInfo Info(AliasName, SourceRange(AliasNameLoc),
                            AttributeCommonInfo::Form::Pragma());
-  AsmLabelAttr *Attr = AsmLabelAttr::CreateImplicit(
-      Context, AliasName->getName(), /*IsLiteralLabel=*/true, Info);
+  AsmLabelAttr *Attr =
+      AsmLabelAttr::CreateImplicit(Context, AliasName->getName(), Info);
 
   // If a declaration that:
   // 1) declares a function or a variable
diff --git a/clang/unittests/AST/DeclTest.cpp b/clang/unittests/AST/DeclTest.cpp
index ed635da683aab..afaf413493299 100644
--- a/clang/unittests/AST/DeclTest.cpp
+++ b/clang/unittests/AST/DeclTest.cpp
@@ -74,7 +74,6 @@ TEST(Decl, AsmLabelAttr) {
   StringRef Code = R"(
     struct S {
       void f() {}
-      void g() {}
     };
   )";
   auto AST =
@@ -87,11 +86,8 @@ TEST(Decl, AsmLabelAttr) {
   const auto *DeclS =
       selectFirst<CXXRecordDecl>("d", match(cxxRecordDecl().bind("d"), Ctx));
   NamedDecl *DeclF = *DeclS->method_begin();
-  NamedDecl *DeclG = *(++DeclS->method_begin());
 
-  // Attach asm labels to the decls: one literal, and one not.
-  DeclF->addAttr(AsmLabelAttr::Create(Ctx, "foo", /*LiteralLabel=*/true));
-  DeclG->addAttr(AsmLabelAttr::Create(Ctx, "goo", /*LiteralLabel=*/false));
+  DeclF->addAttr(AsmLabelAttr::Create(Ctx, "foo"));
 
   // Mangle the decl names.
   std::string MangleF, MangleG;
@@ -99,14 +95,11 @@ TEST(Decl, AsmLabelAttr) {
       ItaniumMangleContext::create(Ctx, Diags));
   {
     llvm::raw_string_ostream OS_F(MangleF);
-    llvm::raw_string_ostream OS_G(MangleG);
     MC->mangleName(DeclF, OS_F);
-    MC->mangleName(DeclG, OS_G);
   }
 
   ASSERT_EQ(MangleF, "\x01"
                      "foo");
-  ASSERT_EQ(MangleG, "goo");
 }
 
 TEST(Decl, MangleDependentSizedArray) {
diff --git a/libcxxabi/src/demangle/ItaniumDemangle.h b/libcxxabi/src/demangle/ItaniumDemangle.h
index 6f27da7b9cadf..7b3983bc89367 100644
--- a/libcxxabi/src/demangle/ItaniumDemangle.h
+++ b/libcxxabi/src/demangle/ItaniumDemangle.h
@@ -1766,6 +1766,8 @@ class CtorDtorName final : public Node {
 
   template<typename Fn> void match(Fn F) const { F(Basename, IsDtor, Variant); }
 
+  int getVariant() const { return Variant; }
+
   void printLeft(OutputBuffer &OB) const override {
     if (IsDtor)
       OB += "~";
diff --git a/lldb/include/lldb/Expression/Expression.h b/lldb/include/lldb/Expression/Expression.h
index 20067f469895b..847226167d584 100644
--- a/lldb/include/lldb/Expression/Expression.h
+++ b/lldb/include/lldb/Expression/Expression.h
@@ -103,11 +103,15 @@ class Expression {
 ///
 /// The format being:
 ///
-///   <prefix>:<module uid>:<symbol uid>:<name>
+///   <prefix>:<discriminator>:<module uid>:<symbol uid>:<name>
 ///
 /// The label string needs to stay valid for the entire lifetime
 /// of this object.
 struct FunctionCallLabel {
+  /// Arbitrary string which language plugins can interpret for their
+  /// own needs.
+  llvm::StringRef discriminator;
+
   /// Unique identifier of the lldb_private::Module
   /// which contains the symbol identified by \c symbol_id.
   lldb::user_id_t module_id;
@@ -133,7 +137,7 @@ struct FunctionCallLabel {
   ///
   /// The representation roundtrips through \c fromString:
   /// \code{.cpp}
-  /// llvm::StringRef encoded = "$__lldb_func:0x0:0x0:_Z3foov";
+  /// llvm::StringRef encoded = "$__lldb_func:blah:0x0:0x0:_Z3foov";
   /// FunctionCallLabel label = *fromString(label);
   ///
   /// assert (label.toString() == encoded);
diff --git a/lldb/source/Expression/Expression.cpp b/lldb/source/Expression/Expression.cpp
index 796851ff15ca3..16ecb1d7deef8 100644
--- a/lldb/source/Expression/Expression.cpp
+++ b/lldb/source/Expression/Expression.cpp
@@ -34,10 +34,10 @@ Expression::Expression(ExecutionContextScope &exe_scope)
 
 llvm::Expected<FunctionCallLabel>
 lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
-  llvm::SmallVector<llvm::StringRef, 4> components;
-  label.split(components, ":", /*MaxSplit=*/3);
+  llvm::SmallVector<llvm::StringRef, 5> components;
+  label.split(components, ":", /*MaxSplit=*/4);
 
-  if (components.size() != 4)
+  if (components.size() != 5)
     return llvm::createStringError("malformed function call label.");
 
   if (components[0] != FunctionCallLabelPrefix)
@@ -45,8 +45,10 @@ lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
         "expected function call label prefix '{0}' but found '{1}' instead.",
         FunctionCallLabelPrefix, components[0]));
 
-  llvm::StringRef module_label = components[1];
-  llvm::StringRef die_label = components[2];
+  llvm::StringRef discriminator = components[1];
+  llvm::StringRef module_label = components[2];
+  llvm::StringRef die_label = components[3];
+  llvm::StringRef lookup_name = components[4];
 
   lldb::user_id_t module_id = 0;
   if (!llvm::to_integer(module_label, module_id))
@@ -58,20 +60,23 @@ lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
     return llvm::createStringError(
         llvm::formatv("failed to parse symbol ID from '{0}'.", die_label));
 
-  return FunctionCallLabel{/*.module_id=*/module_id,
+  return FunctionCallLabel{/*.discriminator=*/discriminator,
+                           /*.module_id=*/module_id,
                            /*.symbol_id=*/die_id,
-                           /*.lookup_name=*/components[3]};
+                           /*.lookup_name=*/lookup_name};
 }
 
 std::string lldb_private::FunctionCallLabel::toString() const {
-  return llvm::formatv("{0}:{1:x}:{2:x}:{3}", FunctionCallLabelPrefix,
-                       module_id, symbol_id, lookup_name)
+  return llvm::formatv("{0}:{1}:{2:x}:{3:x}:{4}", FunctionCallLabelPrefix,
+                       discriminator, module_id, symbol_id, lookup_name)
       .str();
 }
 
 void llvm::format_provider<FunctionCallLabel>::format(
     const FunctionCallLabel &label, raw_ostream &OS, StringRef Style) {
-  OS << llvm::formatv("FunctionCallLabel{ module_id: {0:x}, symbol_id: {1:x}, "
-                      "lookup_name: {2} }",
-                      label.module_id, label.symbol_id, label.lookup_name);
+  OS << llvm::formatv("FunctionCallLabel{ discriminator: {0}, module_id: "
+                      "{1:x}, symbol_id: {2:x}, "
+                      "lookup_name: {3} }",
+                      label.discriminator, label.module_id, label.symbol_id,
+                      label.lookup_name);
 }
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 781c1c6c5745d..5c674308160e5 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -250,11 +250,52 @@ static unsigned GetCXXMethodCVQuals(const DWARFDIE &subprogram,
   return cv_quals;
 }
 
+static const char *GetMangledOrStructorName(const DWARFDIE &die) {
+  const char *name = die.GetMangledName(/*substitute_name_allowed*/ false);
+  if (name)
+    return name;
+
+  name = die.GetName();
+  if (!name)
+    return nullptr;
+
+  DWARFDIE parent = die.GetParent();
+  if (!parent.IsStructUnionOrClass())
+    return nullptr;
+
+  const char *parent_name = parent.GetName();
+  if (!parent_name)
+    return nullptr;
+
+  // Constructor.
+  if (::strcmp(parent_name, name) == 0)
+    return name;
+
+  // Destructor.
+  if (name[0] == '~' && ::strcmp(parent_name, name + 1))
+    return name;
+
+  return nullptr;
+}
+
 static std::string MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
-  char const *name = die.GetMangledName(/*substitute_name_allowed*/ false);
+  char const *name = GetMangledOrStructorName(die);
   if (!name)
     return {};
 
+  auto *cu = die.GetCU();
+  if (!cu)
+    return {};
+
+  // FIXME: When resolving function call labels, we check that
+  // that the definition's DW_AT_specification points to the
+  // declaration that we encoded into the label here. But if the
+  // declaration came from a type-unit (and the definition from
+  // .debug_info), that check won't work. So for now, don't use
+  // function call labels for declaration DIEs from type-units.
+  if (cu->IsTypeUnit())
+    return {};
+
   SymbolFileDWARF *dwarf = die.GetDWARF();
   if (!dwarf)
     return {};
@@ -285,7 +326,9 @@ static std::string MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
   if (die_id == LLDB_INVALID_UID)
     return {};
 
-  return FunctionCallLabel{/*module_id=*/module_id,
+  // Note, discriminator is added by Clang during mangling.
+  return FunctionCallLabel{/*discriminator=*/{},
+                           /*module_id=*/module_id,
                            /*symbol_id=*/die_id,
                            /*.lookup_name=*/name}
       .toString();
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 42a66ce75d6d6..911ce5930ca4c 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -7,7 +7,9 @@
 //===----------------------------------------------------------------------===//
 
 #include "SymbolFileDWARF.h"
+#include "clang/Basic/ABI.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/DebugInfo/DWARF/DWARFAddressRange.h"
 #include "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
 #include "llvm/Support/Casting.h"
@@ -78,6 +80,7 @@
 
 #include "llvm/DebugInfo/DWARF/DWARFContext.h"
 #include "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
+#include "llvm/Demangle/Demangle.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/FormatVariadic.h"
 
@@ -2482,6 +2485,133 @@ bool SymbolFileDWARF::ResolveFunction(const DWARFDIE &orig_die,
   return false;
 }
 
+static int ClangToItaniumCtorKind(clang::CXXCtorType kind) {
+  switch (kind) {
+  case clang::CXXCtorType::Ctor_Complete:
+    return 1;
+  case clang::CXXCtorType::Ctor_Base:
+    return 2;
+  case clang::CXXCtorType::Ctor_CopyingClosure:
+  case clang::CXXCtorType::Ctor_DefaultClosure:
+  case clang::CXXCtorType::Ctor_Comdat:
+    llvm_unreachable("Unexpected constructor kind.");
+  }
+}
+
+static int ClangToItaniumDtorKind(clang::CXXDtorType kind) {
+  switch (kind) {
+  case clang::CXXDtorType::Dtor_Deleting:
+    return 0;
+  case clang::CXXDtorType::Dtor_Complete:
+    return 1;
+  case clang::CXXDtorType::Dtor_Base:
+    return 2;
+  case clang::CXXDtorType::Dtor_Comdat:
+    llvm_unreachable("Unexpected destructor kind.");
+  }
+}
+
+static std::optional<int>
+GetItaniumCtorDtorVariant(llvm::StringRef discriminator) {
+  const bool is_ctor = discriminator.consume_front("C");
+  if (!is_ctor && !discriminator.consume_front("D"))
+    return std::nullopt;
+
+  uint64_t structor_kind;
+  if (!llvm::to_integer(discriminator, structor_kind))
+    return std::nullopt;
+
+  if (is_ctor) {
+    if (structor_kind > clang::CXXCtorType::Ctor_DefaultClosure)
+      return std::nullopt;
+
+    return ClangToItaniumCtorKind(
+        static_cast<clang::CXXCtorType>(structor_kind));
+  }
+
+  if (structor_kind > clang::CXXDtorType::Dtor_Deleting)
+    return std::nullopt;
+
+  return ClangToItaniumDtorKind(static_cast<clang::CXXDtorType>(structor_kind));
+}
+
+DWARFDIE SymbolFileDWARF::FindFunctionDefinition(const FunctionCallLabel &label,
+                                                 const DWARFDIE &declaration) {
+  DWARFDIE definition;
+  llvm::DenseMap<int, DWARFDIE> structor_variant_to_die;
+
+  // eFunctionNameTypeFull for mangled name lookup.
+  // eFunctionNameTypeMethod is required for structor lookups (since we look
+  // those up by DW_AT_name).
+  Module::LookupInfo info(ConstString(label.lookup_name),
+                          lldb::eFunctionNameTypeFull |
+                              lldb::eFunctionNameTypeMethod,
+                          lldb::eLanguageTypeUnknown);
+
+  m_index->GetFunctions(info, *this, {}, [&](DWARFDIE entry) {
+    if (entry.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
+      return IterationAction::Continue;
+
+    auto spec = entry.GetAttributeValueAsReferenceDIE(DW_AT_specification);
+    if (!spec)
+      return IterationAction::Continue;
+
+    if (spec != declaration)
+      return IterationAction::Continue;
+
+    // We're not picking a specific structor variant DIE, so we're done here.
+    if (label.discriminator.empty()) {
+      definition = entry;
+      return IterationAction::Stop;
+    }
+
+    const char *mangled =
+        entry.GetMangledName(/*substitute_name_allowed=*/false);
+    if (!mangled)
+      return IterationAction::Continue;
+
+    llvm::ItaniumPartialDemangler D;
+    if (D.partialDemangle(mangled))
+      return IterationAction::Continue;
+
+    auto structor_variant = D.getCtorOrDtorVariant();
+    if (!structor_variant)
+      return IterationAction::Continue;
+
+    auto [_, inserted] = structor_variant_to_die.try_emplace(*structor_variant,
+                                                             std::move(entry));
+    assert(inserted);
+
+    // The compiler may choose to alias the constructor variants
+    // (notably this happens on Linux), so we might not have a definition
+    // DIE for some structor variants. Hence we iterate over all variants
+    // and pick the most appropriate one out of those.
+    return IterationAction::Continue;
+  });
+
+  if (definition)
+    return definition;
+
+  auto label_variant = GetItaniumCtorDtorVariant(label.discriminator);
+  if (!label_variant)
+    return {};
+
+  auto it = structor_variant_to_die.find(*label_variant);
+
+  // Found the exact variant.
+  if (it != structor_variant_to_die.end())
+    return it->getSecond();
+
+  // C1 was aliased to C2
+  if (!label.lookup_name.starts_with("~") && label_variant == 1) {
+    if (auto it = structor_variant_to_die.find(2);
+        it != structor_variant_to_die.end())
+      return it->getSecond();
+  }
+
+  return {};
+}
+
 llvm::Expected<SymbolContext>
 SymbolFileDWARF::ResolveFunctionCallLabel(const FunctionCallLabel &label) {
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
@@ -2494,24 +2624,7 @@ SymbolFileDWARF::ResolveFunctionCallLabel(const FunctionCallLabel &label) {
   // Label was created using a declaration DIE. Need to fetch the definition
   // to resolve the function call.
   if (die.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0)) {
-    Module::LookupInfo info(ConstString(label.lookup_name),
-                            lldb::eFunctionNameTypeFull,
-                            lldb::eLanguageTypeUnknown);
-
-    m_index->GetFunctions(info, *this, {}, [&](DWARFDIE entry) {
-      if (entry.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
-        return IterationAction::Continue;
-
-      // We don't check whether the specification DIE for this function
-      // corresponds to the declaration DIE because the declaration might be in
-      // a type-unit but the definition in the compile-unit (and it's
-      // specifcation would point to the declaration in the compile-unit). We
-      // rely on the mangled name within the module to be enough to find us the
-      // unique definition.
-      die = entry;
-      return IterationAction::Stop;
-    });
-
+    die = FindFunctionDefinition(label, die);
     if (die.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
       return llvm::createStringError(
           llvm::formatv("failed to find definition DIE for {0}", label));
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
index 3ec538da8cf77..843346619db37 100644
--- a/lldb/s...
[truncated]

Michael137 added a commit that referenced this pull request Sep 1, 2025
…een declaration and definition (#154137)

This patch is motivated by
#149827, where we plan on using
mangled names on structor declarations to find the exact structor
definition that LLDB's expression evaluator should call.

So far LLVM expects the declaration and definition linkage names to be
identical (or the declaration to just not have a linkage name). But we
plan on attaching the GCC-style "unified" mangled name to declarations,
which will be different to linkage name on the definition. This patch
relaxes this restriction.
@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels-structors branch from 4340cbc to 419c94e Compare September 1, 2025 14:45
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Sep 1, 2025
…ferent between declaration and definition (#154137)

This patch is motivated by
llvm/llvm-project#149827, where we plan on using
mangled names on structor declarations to find the exact structor
definition that LLDB's expression evaluator should call.

So far LLVM expects the declaration and definition linkage names to be
identical (or the declaration to just not have a linkage name). But we
plan on attaching the GCC-style "unified" mangled name to declarations,
which will be different to linkage name on the definition. This patch
relaxes this restriction.
@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels-structors branch from 419c94e to a3afd66 Compare September 1, 2025 15:59
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 2, 2025
…llvm#148877)

LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](llvm#149827)

This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.

**Implementation**

The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).

Depends on llvm#151355

(cherry picked from commit f890591)
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 2, 2025
This was added purely for the needs of LLDB. However, starting with
llvm#151355 and
llvm#148877 we no longer create
literal AsmLabels in LLDB either. So this is unused and we can get rid
of it.

In the near future LLDB will want to add special support for mangling
`AsmLabel`s (specifically
llvm#149827).

(cherry picked from commit cd8f348)
Michael137 added a commit that referenced this pull request Sep 9, 2025
…for LLDB JIT expressions (#155485)

Part of #149827

This patch adds special handling for `AsmLabel`s created by LLDB. LLDB
uses `AsmLabel`s to encode information about a function declaration to
make it easier to locate function symbols when JITing C++ expressions.
For constructors/destructors LLDB doesn't know at the time of creating
the `AsmLabelAttr` which structor variant the expression evaluator will
need to call (this is decided when compiling the expression). So we make
the Clang mangler inject this information into our custom label when
we're JITting the expression.
This patch is an implementation of [this discussion](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7) about handling ABI-tagged structors during expression evaluation.

**Motivation**

LLDB encodes the mangled name of a `DW_TAG_subprogram` into `AsmLabel`s on function and method Clang AST nodes. This means that when calls to these functions get lowered into IR (when running JITted expressions), the address resolver can locate the appropriate symbol by mangled name (and it's guaranteed to find the symbol because we got the mangled name from debug-info, instead of letting Clang mangle it based on AST structure). However, we don't do this for `CXXConstructorDecl`s/`CXXDestructorDecl`s because these structor declarations in DWARF don't have a linkage name. This is because there can be multiple variants of a structor, each with a distinct mangling in the Itanium ABI. Each structor variant has its own definition `DW_TAG_subprogram`. So LLDB doesn't know which mangled name to put into the `AsmLabel`.

Currently this means using an ABI-tagged constructor in LLDB expressions won't work (see [this RFC](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816) for concrete examples).

**Proposed Solution**

The `FunctionCallLabel` encoding that we put into `AsmLabel`s already supports stuffing more info about a DIE into it. So this patch extends the `FunctionCallLabel` to contain an optional discriminator (a sequence of bytes) which the `SymbolFileDWARF` plugin interprets as the constructor/destructor variant of that DIE.

There's a few subtleties here:
1. At the point at which LLDB first constructs the label, it has no way of knowing (just by looking the debug-info declaration), which structor variant the expression evaluator is supposed to call. That's something that gets decided when compiling the expression. So we let the Clang mangler inject the correct structor variant into the `AsmLabel` during JITing. I adjusted the `AsmLabelAttr` mangling for this. An option would've been to create a new Clang attribute which behaved like an `AsmLabel` but with these special semantics for LLDB. My main concern there is that we'd have to adjust all the `AsmLabelAttr` checks around Clang to also now account for this new attribute.
2. The compiler is free to omit the `C1` variant of a constructor if the `C2` variant is sufficient. In that case it may alias `C1` to `C2`, leaving us with only the `C2` `DW_TAG_subprogram` in the object file. Linux is one of the platforms where this occurs. For those cases there's heuristic in `SymbolFileDWARF` where we pick `C2` if we asked for `C1` but it doesn't exist. This may not always be correct (if for some reason the compiler decided to drop `C1` for other reasons).
@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels-structors branch from a3afd66 to c362890 Compare September 9, 2025 08:11
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Sep 9, 2025
…n mangling for LLDB JIT expressions (#155485)

Part of llvm/llvm-project#149827

This patch adds special handling for `AsmLabel`s created by LLDB. LLDB
uses `AsmLabel`s to encode information about a function declaration to
make it easier to locate function symbols when JITing C++ expressions.
For constructors/destructors LLDB doesn't know at the time of creating
the `AsmLabelAttr` which structor variant the expression evaluator will
need to call (this is decided when compiling the expression). So we make
the Clang mangler inject this information into our custom label when
we're JITting the expression.
Michael137 added a commit that referenced this pull request Sep 9, 2025
…clarations (#154142)

Depends on #154137

This patch is motivated by
#149827, where we plan on using
mangled names on structor declarations to find the exact structor
definition that LLDB's expression evaluator should call.

Given a `DW_TAG_subprogram` for a function declaration, the most
convenient way for a debugger to find the corresponding definition is to
use the `DW_AT_linkage_name` (i.e., the mangled name). However, we
currently can't do that for constructors/destructors because Clang
doesn't attach linkage names to them. This is because, depending on ABI,
there can be multiple definitions for a single constructor/destructor
declaration. The way GCC works around this is by producing a `C4`/`D4`
"unified" mangling for structor declarations (see
[godbolt](https://godbolt.org/z/Wds6cja9K)). GDB uses this to locate the
relevant definitions.

This patch aligns Clang with GCC's DWARF output and allows us to
implement the same lookup scheme in LLDB.
@Michael137 Michael137 enabled auto-merge (squash) September 9, 2025 08:50
@Michael137 Michael137 merged commit 57a7907 into llvm:main Sep 9, 2025
9 checks passed
@Michael137 Michael137 deleted the lldb/reliable-function-call-labels-structors branch September 9, 2025 09:03
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Sep 9, 2025
…structor declarations (#154142)

Depends on llvm/llvm-project#154137

This patch is motivated by
llvm/llvm-project#149827, where we plan on using
mangled names on structor declarations to find the exact structor
definition that LLDB's expression evaluator should call.

Given a `DW_TAG_subprogram` for a function declaration, the most
convenient way for a debugger to find the corresponding definition is to
use the `DW_AT_linkage_name` (i.e., the mangled name). However, we
currently can't do that for constructors/destructors because Clang
doesn't attach linkage names to them. This is because, depending on ABI,
there can be multiple definitions for a single constructor/destructor
declaration. The way GCC works around this is by producing a `C4`/`D4`
"unified" mangling for structor declarations (see
[godbolt](https://godbolt.org/z/Wds6cja9K)). GDB uses this to locate the
relevant definitions.

This patch aligns Clang with GCC's DWARF output and allows us to
implement the same lookup scheme in LLDB.
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 9, 2025
llvm#155483)

Part of llvm#149827

Allows us to use the mangling substitution facilities in
CPlusPlusLanguage but also SymbolFileDWARF.

Added tests now that they're "public".

(cherry picked from commit 9dd38b0)
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 9, 2025
…een declaration and definition (llvm#154137)

This patch is motivated by
llvm#149827, where we plan on using
mangled names on structor declarations to find the exact structor
definition that LLDB's expression evaluator should call.

So far LLVM expects the declaration and definition linkage names to be
identical (or the declaration to just not have a linkage name). But we
plan on attaching the GCC-style "unified" mangled name to declarations,
which will be different to linkage name on the definition. This patch
relaxes this restriction.

(cherry picked from commit c6286b3)
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 9, 2025
…for LLDB JIT expressions (llvm#155485)

Part of llvm#149827

This patch adds special handling for `AsmLabel`s created by LLDB. LLDB
uses `AsmLabel`s to encode information about a function declaration to
make it easier to locate function symbols when JITing C++ expressions.
For constructors/destructors LLDB doesn't know at the time of creating
the `AsmLabelAttr` which structor variant the expression evaluator will
need to call (this is decided when compiling the expression). So we make
the Clang mangler inject this information into our custom label when
we're JITting the expression.

(cherry picked from commit db8cad0)
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 9, 2025
…clarations (llvm#154142)

Depends on llvm#154137

This patch is motivated by
llvm#149827, where we plan on using
mangled names on structor declarations to find the exact structor
definition that LLDB's expression evaluator should call.

Given a `DW_TAG_subprogram` for a function declaration, the most
convenient way for a debugger to find the corresponding definition is to
use the `DW_AT_linkage_name` (i.e., the mangled name). However, we
currently can't do that for constructors/destructors because Clang
doesn't attach linkage names to them. This is because, depending on ABI,
there can be multiple definitions for a single constructor/destructor
declaration. The way GCC works around this is by producing a `C4`/`D4`
"unified" mangling for structor declarations (see
[godbolt](https://godbolt.org/z/Wds6cja9K)). GDB uses this to locate the
relevant definitions.

This patch aligns Clang with GCC's DWARF output and allows us to
implement the same lookup scheme in LLDB.

(cherry picked from commit 06d202b)
@antmox
Copy link
Contributor

antmox commented Sep 9, 2025

Hi @Michael137 ,

It looks like this patch caused a bot failure ( lldb-aarch64-windows ) :
https://lab.llvm.org/buildbot/#/builders/141/builds/11398/

Could you please look at this ?

https://lab.llvm.org/buildbot/#/builders/141/builds/11398/steps/6/logs/stdio :
FAIL: test_nested_no_structor_linkage_names_dwarf (TestAbiTagStructors.AbiTagStructorsTestCase.test_nested_no_structor_linkage_names_dwarf)
FAIL: test_nested_with_structor_linkage_names_dwarf (TestAbiTagStructors.AbiTagStructorsTestCase.test_nested_with_structor_linkage_names_dwarf)
FAIL: test_no_structor_linkage_names_dwarf (TestAbiTagStructors.AbiTagStructorsTestCase.test_no_structor_linkage_names_dwarf)
FAIL: test_with_structor_linkage_names_dwarf (TestAbiTagStructors.AbiTagStructorsTestCase.test_with_structor_linkage_names_dwarf)

@DavidSpickett
Copy link
Collaborator

Might be fixed next build, I saw the Linux ones fail then go green right after.

@Michael137
Copy link
Member Author

Might be fixed next build, I saw the Linux ones fail then go green right after.

Yea I hoped so too, but looks like that test run had both my changes. I think this is just not supported on Windows for now (i doubt this worked before the tests were added either). I XFAILed them for now in 6dfc255ee103

Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 9, 2025
llvm#149827)

Depends on
* llvm#148877
* llvm#155483
* llvm#155485
* llvm#154137
* llvm#154142

This patch is an implementation of [this
discussion](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
about handling ABI-tagged structors during expression evaluation.

**Motivation**

LLDB encodes the mangled name of a `DW_TAG_subprogram` into `AsmLabel`s
on function and method Clang AST nodes. This means that when calls to
these functions get lowered into IR (when running JITted expressions),
the address resolver can locate the appropriate symbol by mangled name
(and it is guaranteed to find the symbol because we got the mangled name
from debug-info, instead of letting Clang mangle it based on AST
structure). However, we don't do this for
`CXXConstructorDecl`s/`CXXDestructorDecl`s because these structor
declarations in DWARF don't have a linkage name. This is because there
can be multiple variants of a structor, each with a distinct mangling in
the Itanium ABI. Each structor variant has its own definition
`DW_TAG_subprogram`. So LLDB doesn't know which mangled name to put into
the `AsmLabel`.

Currently this means using ABI-tagged structors in LLDB expressions
won't work (see [this
RFC](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
for concrete examples).

**Proposed Solution**

The `FunctionCallLabel` encoding that we put into `AsmLabel`s already
supports stuffing more info about a DIE into it. So this patch extends
the `FunctionCallLabel` to contain an optional discriminator (a sequence
of bytes) which the `SymbolFileDWARF` plugin interprets as the
constructor/destructor variant of that DIE. So when searching for the
definition DIE, LLDB will include the structor variant in its heuristic
for determining a match.

There's a few subtleties here:
1. At the point at which LLDB first constructs the label, it has no way
of knowing (just by looking at the debug-info declaration), which
structor variant the expression evaluator is supposed to call. That's
something that gets decided when compiling the expression. So we let the
Clang mangler inject the correct structor variant into the `AsmLabel`
during JITing. I adjusted the `AsmLabelAttr` mangling for this in
llvm#155485. An option would've
been to create a new Clang attribute which behaved like an `AsmLabel`
but with these special semantics for LLDB. My main concern there is that
we'd have to adjust all the `AsmLabelAttr` checks around Clang to also
now account for this new attribute.
2. The compiler is free to omit the `C1` variant of a constructor if the
`C2` variant is sufficient. In that case it may alias `C1` to `C2`,
leaving us with only the `C2` `DW_TAG_subprogram` in the object file.
Linux is one of the platforms where this occurs. For those cases I added
a heuristic in `SymbolFileDWARF` where we pick `C2` if we asked for `C1`
but it doesn't exist. This may not always be correct (e.g., if the
compiler decided to drop `C1` for other reasons).
3. In llvm#154142 Clang will emit
`C4`/`D4` variants of ctors/dtors on declarations. When resolving the
`FunctionCallLabel` we will now substitute the actual variant that Clang
told us we need to call into the mangled name. We do this using LLDB's
`ManglingSubstitutor`. That way we find the definition DIE exactly the
same way we do for regular function calls.
4. In cases where declarations and definitions live in separate modules,
the DIE ID encoded in the function call label may not be enough to find
the definition DIE in the encoded module ID. For those cases we fall
back to how LLDB used to work: look up in all images of the target. To
make sure we don't use the unified mangled name for the fallback lookup,
we change the lookup name to whatever mangled name the FunctionCallLabel
resolved to.

rdar://104968288
(cherry picked from commit 57a7907)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category debuginfo libc++abi libc++abi C++ Runtime Library. Not libc++. lldb llvm:codegen

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants