Skip to content

[lldb][Expression] Encode Module and DIE UIDs into function AsmLabels #148877

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 39 commits into from
Aug 1, 2025

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Jul 15, 2025

LLDB currently attaches AsmLabels to FunctionDecls such that that the IRExecutionUnit can determine which mangled name to call (we can't rely on Clang deriving the correct mangled name to call because the debug-info AST doesn't contain all the info that would be encoded in the DWARF linkage names). However, we don't attach AsmLabels for structors because they have multiple variants and thus it's not clear which mangled name to use. In the RFC on fixing expression evaluation of abi-tagged structors we discussed encoding the structor variant into the AsmLabels. Specifically in this thread we discussed that the contents of the AsmLabel are completely under LLDB's control and we could make use of it to uniquely identify a function by encoding the exact module and DIE that the function is associated with (mangled names need not be enough since two identical mangled symbols may live in different modules). So if we already have a custom AsmLabel format, we can encode the structor variant in a follow-up (the current idea is to append the structor variant as a suffix to our custom AsmLabel when Clang emits the mangled name into the JITted IR). Then we would just have to teach the IRExecutionUnit to pick the correct structor variant DIE during symbol resolution. The draft of this is available here

This patch sets up the infrastructure for the custom AsmLabel format by encoding the module id, DIE id and mangled name in it.

Implementation

The flow is as follows:

  1. Create the label in DWARFASTParserClang. The format is: $__lldb_func:module_id:die_id:mangled_name
  2. When resolving external symbols in IRExecutionUnit, we parse this label and then do a lookup by DIE ID (or mangled name into the module if the encoded DIE is a declaration).

Depends on #151355

@Michael137 Michael137 requested a review from labath July 15, 2025 16:01
Copy link

github-actions bot commented Jul 15, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines 270 to 277
// Module UID is only a Darwin concept (?)
// If UUID is not available encode as pointer.
// Maybe add character to signal whether this is a pointer
// or UUID. Or maybe if it's not hex that implies a UUID?
auto module_id = module_sp->GetUUID();
Module * module_ptr = nullptr;
if (!module_id.IsValid())
module_ptr = module_sp.get();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@labath I'm trying to clean up this prototype of your suggestion in my ABI-tagged structors RFC.

This function here is what encodes the label.

Something I wasn't sure about though was how we should encode the Module in the string. Since UIDs need not exist (e.g., on Linux IIUC?), was your original suggestion to simply print out the pointer? If so, how would we defend against the module getting unloaded under our feet? I might've misunderstood your idea though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(e.g., on Linux IIUC?)

It's more common there, but even on darwin you can create binaries without a UUID (ld --no_uuid). And even if there is one, there's no guarantee it's going to be truly unique.

was your original suggestion to simply print out the pointer? If so, how would we defend against the module getting unloaded under our feet? I might've misunderstood your idea though.

Yes, that was the idea, though I can't say I have given much thought about this aspect. Intuitively, I would expect this to be fine. Since the module pointer is only encoded in the AST inside that module, the fact that we got that pointer here means that the module was present at the beginning of the compilation process. The pointer is consumed after clang is done (which happens midway into compilation), and I'd hope that noone can unload a module between those two points.

That said, I admit this is risky, and I'm not pushing for this particular encoding. In particular, it may enable someone to crash lldb by creating a function with a funny name.

There are plenty of other ways to implement this detail though, the most obvious one being assigning a surrogate ID (an increasing integer) to each module upon creation (just like we do with debuggers).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are plenty of other ways to implement this detail though, the most obvious one being assigning a surrogate ID (an increasing integer) to each module upon creation (just like we do with debuggers).

Yea that seems like a good alternative

@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels branch 8 times, most recently from cfc23ca to dd5782e Compare July 19, 2025 21:55
@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels branch 6 times, most recently from 4c87014 to fd6b6e8 Compare July 21, 2025 09:01
@Michael137 Michael137 marked this pull request as ready for review July 21, 2025 11:08
@Michael137 Michael137 requested a review from JDevlieghere as a code owner July 21, 2025 11:08
@Michael137 Michael137 changed the title [DRAFT] [lldb][Expression] Encode Module and DIE UIDs into function AsmLabels [lldb][Expression] Encode Module and DIE UIDs into function AsmLabels Jul 21, 2025
@llvmbot llvmbot added the lldb label Jul 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 21, 2025

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

LLDB currently attaches AsmLabels to FunctionDecls such that that the IRExecutionUnit can determine which mangled name to call (we can't rely on Clang deriving the correct mangled name to call because the debug-info AST doesn't contain all the info that would be encoded in the DWARF linkage names). However, we don't attach AsmLabels for structors because they have multiple variants and thus it's not clear which mangled name to use. In the RFC on fixing expression evaluation of abi-tagged structors we discussed encoding the structor variant into the AsmLabels. Specifically in this thread we discussed that the contents of the AsmLabel are completely under LLDB's control and we could make use of it to uniquely identify a function by encoding the exact module and DIE that the function is associated with (mangled names need not be enough since two identical mangled symbols may live in different modules). So if we already have a custom AsmLabel format, we can encode the structor variant in a follow-up (the current idea is to append the structor variant as a suffix to our custom AsmLabel when Clang emits the mangled name into the JITted IR). Then we would just have to teach the IRExecutionUnit to pick the correct structor variant DIE during symbol resolution.

This patch sets up the infrastructure for the custom AsmLabel format by encoding the module id, DIE id and mangled name in it.

Implementation

The flow is as follows:

  1. Create the label in DWARFASTParserClang. The format is: $__lldb_func:mangled_name:module_id:die_id
  2. When resolving external symbols in IRExecutionUnit, we parse this label and then do a lookup by mangled name into the module. The reason I'm not using the DIE ID here is that for C++ methods, we create the decl using the declaration DW_TAG_subprogram. So we would still need to do some kind of search for the definition.

I made the label creation an API on TypeSystemClang because the whole AsmLabel logic is Clang-specific, and in the future I want to stuff knowledge about constructor/destructor variants into the label, which is even more C++/Clang specific.

But happy to consider other architectures for this.


Patch is 34.44 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/148877.diff

17 Files Affected:

  • (modified) lldb/include/lldb/Core/Module.h (+3-1)
  • (modified) lldb/include/lldb/Expression/Expression.h (+25)
  • (modified) lldb/include/lldb/Symbol/SymbolFile.h (+14)
  • (modified) lldb/include/lldb/Symbol/TypeSystem.h (+16)
  • (modified) lldb/source/Core/Module.cpp (+16-3)
  • (modified) lldb/source/Expression/Expression.cpp (+16)
  • (modified) lldb/source/Expression/IRExecutionUnit.cpp (+74)
  • (modified) lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.cpp (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+26-17)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+41)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h (+3)
  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp (+3-3)
  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/UdtRecordCompleter.cpp (+2-3)
  • (modified) lldb/source/Plugins/SymbolFile/PDB/PDBASTParser.cpp (+4-3)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+76-5)
  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.h (+9-2)
  • (modified) lldb/unittests/Symbol/TestTypeSystemClang.cpp (+6-6)
diff --git a/lldb/include/lldb/Core/Module.h b/lldb/include/lldb/Core/Module.h
index 8bb55c95773bc..3991a12997541 100644
--- a/lldb/include/lldb/Core/Module.h
+++ b/lldb/include/lldb/Core/Module.h
@@ -86,7 +86,8 @@ struct ModuleFunctionSearchOptions {
 ///
 /// The module will parse more detailed information as more queries are made.
 class Module : public std::enable_shared_from_this<Module>,
-               public SymbolContextScope {
+               public SymbolContextScope,
+               public UserID {
 public:
   class LookupInfo;
   // Static functions that can track the lifetime of module objects. This is
@@ -97,6 +98,7 @@ class Module : public std::enable_shared_from_this<Module>,
   // using the "--global" (-g for short).
   static size_t GetNumberAllocatedModules();
 
+  static Module *GetAllocatedModuleWithUID(lldb::user_id_t uid);
   static Module *GetAllocatedModuleAtIndex(size_t idx);
 
   static std::recursive_mutex &GetAllocationModuleCollectionMutex();
diff --git a/lldb/include/lldb/Expression/Expression.h b/lldb/include/lldb/Expression/Expression.h
index 8de9364436ccf..f32878c9bf876 100644
--- a/lldb/include/lldb/Expression/Expression.h
+++ b/lldb/include/lldb/Expression/Expression.h
@@ -96,6 +96,31 @@ class Expression {
                                  ///invalid.
 };
 
+/// Holds parsed information about a function call label that
+/// LLDB attaches as an AsmLabel to function AST nodes it parses
+/// from debug-info.
+///
+/// The format being:
+///
+///   <prefix>:<mangled name>:<module id>:<DIE id>
+///
+/// The label string needs to stay valid for the entire lifetime
+/// of this object.
+struct FunctionCallLabel {
+  llvm::StringRef m_lookup_name;
+  lldb::user_id_t m_module_id;
+
+  /// Mostly for debuggability.
+  lldb::user_id_t m_die_id;
+};
+
+/// LLDB attaches this prefix to mangled names of functions that it get called
+/// from JITted expressions.
+inline constexpr llvm::StringRef FunctionCallLabelPrefix = "$__lldb_func";
+
+bool consumeFunctionCallLabelPrefix(llvm::StringRef &name);
+bool hasFunctionCallLabelPrefix(llvm::StringRef name);
+
 } // namespace lldb_private
 
 #endif // LLDB_EXPRESSION_EXPRESSION_H
diff --git a/lldb/include/lldb/Symbol/SymbolFile.h b/lldb/include/lldb/Symbol/SymbolFile.h
index e95f95553c17c..6aca276fc85b6 100644
--- a/lldb/include/lldb/Symbol/SymbolFile.h
+++ b/lldb/include/lldb/Symbol/SymbolFile.h
@@ -18,6 +18,7 @@
 #include "lldb/Symbol/CompilerType.h"
 #include "lldb/Symbol/Function.h"
 #include "lldb/Symbol/SourceModule.h"
+#include "lldb/Symbol/SymbolContext.h"
 #include "lldb/Symbol/Type.h"
 #include "lldb/Symbol/TypeList.h"
 #include "lldb/Symbol/TypeSystem.h"
@@ -328,6 +329,19 @@ class SymbolFile : public PluginInterface {
   GetMangledNamesForFunction(const std::string &scope_qualified_name,
                              std::vector<ConstString> &mangled_names);
 
+  /// Resolves the function DIE identified by \c lookup_name within
+  /// this SymbolFile.
+  ///
+  /// \param[in,out] sc_list The resolved functions will be appended to this
+  /// list.
+  ///
+  /// \param[in] lookup_name The UID of the function DIE to resolve.
+  ///
+  virtual llvm::Error FindAndResolveFunction(SymbolContextList &sc_list,
+                                             llvm::StringRef lookup_name) {
+    return llvm::createStringError("Not implemented");
+  }
+
   virtual void GetTypes(lldb_private::SymbolContextScope *sc_scope,
                         lldb::TypeClass type_mask,
                         lldb_private::TypeList &type_list) = 0;
diff --git a/lldb/include/lldb/Symbol/TypeSystem.h b/lldb/include/lldb/Symbol/TypeSystem.h
index cb1f0130b548d..742c09251ea2f 100644
--- a/lldb/include/lldb/Symbol/TypeSystem.h
+++ b/lldb/include/lldb/Symbol/TypeSystem.h
@@ -548,6 +548,22 @@ class TypeSystem : public PluginInterface,
   bool GetHasForcefullyCompletedTypes() const {
     return m_has_forcefully_completed_types;
   }
+
+  /// Returns the components of the specified function call label.
+  ///
+  /// The format of \c label is described in \c FunctionCallLabel.
+  /// The label prefix is not one of the components.
+  virtual llvm::Expected<llvm::SmallVector<llvm::StringRef, 3>>
+  splitFunctionCallLabel(llvm::StringRef label) const {
+    return llvm::createStringError("Not implemented.");
+  }
+
+  // Decodes the function label into a \c FunctionCallLabel.
+  virtual llvm::Expected<FunctionCallLabel>
+  makeFunctionCallLabel(llvm::StringRef label) const {
+    return llvm::createStringError("Not implemented.");
+  }
+
 protected:
   SymbolFile *m_sym_file = nullptr;
   /// Used for reporting statistics.
diff --git a/lldb/source/Core/Module.cpp b/lldb/source/Core/Module.cpp
index 90997dada3666..edd79aff5d065 100644
--- a/lldb/source/Core/Module.cpp
+++ b/lldb/source/Core/Module.cpp
@@ -121,6 +121,15 @@ size_t Module::GetNumberAllocatedModules() {
   return GetModuleCollection().size();
 }
 
+Module *Module::GetAllocatedModuleWithUID(lldb::user_id_t uid) {
+  std::lock_guard<std::recursive_mutex> guard(
+      GetAllocationModuleCollectionMutex());
+  for (Module *mod : GetModuleCollection())
+    if (mod->GetID() == uid)
+      return mod;
+  return nullptr;
+}
+
 Module *Module::GetAllocatedModuleAtIndex(size_t idx) {
   std::lock_guard<std::recursive_mutex> guard(
       GetAllocationModuleCollectionMutex());
@@ -130,8 +139,11 @@ Module *Module::GetAllocatedModuleAtIndex(size_t idx) {
   return nullptr;
 }
 
+// TODO: needs a mutex
+static lldb::user_id_t g_unique_id = 1;
+
 Module::Module(const ModuleSpec &module_spec)
-    : m_unwind_table(*this), m_file_has_changed(false),
+    : UserID(g_unique_id++), m_unwind_table(*this), m_file_has_changed(false),
       m_first_file_changed_log(false) {
   // Scope for locker below...
   {
@@ -236,7 +248,8 @@ Module::Module(const ModuleSpec &module_spec)
 Module::Module(const FileSpec &file_spec, const ArchSpec &arch,
                ConstString object_name, lldb::offset_t object_offset,
                const llvm::sys::TimePoint<> &object_mod_time)
-    : m_mod_time(FileSystem::Instance().GetModificationTime(file_spec)),
+    : UserID(g_unique_id++),
+      m_mod_time(FileSystem::Instance().GetModificationTime(file_spec)),
       m_arch(arch), m_file(file_spec), m_object_name(object_name),
       m_object_offset(object_offset), m_object_mod_time(object_mod_time),
       m_unwind_table(*this), m_file_has_changed(false),
@@ -257,7 +270,7 @@ Module::Module(const FileSpec &file_spec, const ArchSpec &arch,
 }
 
 Module::Module()
-    : m_unwind_table(*this), m_file_has_changed(false),
+    : UserID(g_unique_id++), m_unwind_table(*this), m_file_has_changed(false),
       m_first_file_changed_log(false) {
   std::lock_guard<std::recursive_mutex> guard(
       GetAllocationModuleCollectionMutex());
diff --git a/lldb/source/Expression/Expression.cpp b/lldb/source/Expression/Expression.cpp
index 93f585edfce3d..e19c804caa3c6 100644
--- a/lldb/source/Expression/Expression.cpp
+++ b/lldb/source/Expression/Expression.cpp
@@ -10,6 +10,10 @@
 #include "lldb/Target/ExecutionContextScope.h"
 #include "lldb/Target/Target.h"
 
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Error.h"
+
 using namespace lldb_private;
 
 Expression::Expression(Target &target)
@@ -26,3 +30,15 @@ Expression::Expression(ExecutionContextScope &exe_scope)
       m_jit_end_addr(LLDB_INVALID_ADDRESS) {
   assert(m_target_wp.lock());
 }
+
+bool lldb_private::consumeFunctionCallLabelPrefix(llvm::StringRef &name) {
+  // On Darwin mangled names get a '_' prefix.
+  name.consume_front("_");
+  return name.consume_front(FunctionCallLabelPrefix);
+}
+
+bool lldb_private::hasFunctionCallLabelPrefix(llvm::StringRef name) {
+  // On Darwin mangled names get a '_' prefix.
+  name.consume_front("_");
+  return name.starts_with(FunctionCallLabelPrefix);
+}
diff --git a/lldb/source/Expression/IRExecutionUnit.cpp b/lldb/source/Expression/IRExecutionUnit.cpp
index 6f812b91a8b1d..a9ea244889cce 100644
--- a/lldb/source/Expression/IRExecutionUnit.cpp
+++ b/lldb/source/Expression/IRExecutionUnit.cpp
@@ -13,6 +13,7 @@
 #include "llvm/IR/DiagnosticInfo.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/Module.h"
+#include "llvm/Support/Error.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/raw_ostream.h"
 
@@ -20,6 +21,7 @@
 #include "lldb/Core/Disassembler.h"
 #include "lldb/Core/Module.h"
 #include "lldb/Core/Section.h"
+#include "lldb/Expression/Expression.h"
 #include "lldb/Expression/IRExecutionUnit.h"
 #include "lldb/Expression/ObjectFileJIT.h"
 #include "lldb/Host/HostInfo.h"
@@ -36,6 +38,7 @@
 #include "lldb/Utility/LLDBAssert.h"
 #include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/Log.h"
+#include "lldb/lldb-defines.h"
 
 #include <optional>
 
@@ -771,6 +774,63 @@ class LoadAddressResolver {
   lldb::addr_t m_best_internal_load_address = LLDB_INVALID_ADDRESS;
 };
 
+/// Returns address of the function referred to by the special function call
+/// label \c label.
+///
+/// \param[in] label Function call label encoding the unique location of the
+/// function to look up.
+///                  Assumes that the \c FunctionCallLabelPrefix has been
+///                  stripped from the front of the label.
+static llvm::Expected<lldb::addr_t>
+ResolveFunctionCallLabel(llvm::StringRef name,
+                         const lldb_private::SymbolContext &sc,
+                         bool &symbol_was_missing_weak) {
+  symbol_was_missing_weak = false;
+
+  if (!sc.target_sp)
+    return llvm::createStringError("target not available.");
+
+  auto ts_or_err = sc.target_sp->GetScratchTypeSystemForLanguage(
+      lldb::eLanguageTypeC_plus_plus);
+  if (!ts_or_err || !*ts_or_err)
+    return llvm::joinErrors(
+        llvm::createStringError(
+            "failed to find scratch C++ TypeSystem for current target."),
+        ts_or_err.takeError());
+
+  auto ts_sp = *ts_or_err;
+
+  auto label_or_err = ts_sp->makeFunctionCallLabel(name);
+  if (!label_or_err)
+    return llvm::joinErrors(
+        llvm::createStringError("failed to create FunctionCallLabel from: %s",
+                                name.data()),
+        label_or_err.takeError());
+
+  const auto &label = *label_or_err;
+
+  Module *module = Module::GetAllocatedModuleWithUID(label.m_module_id);
+
+  if (!module)
+    return llvm::createStringError(
+        llvm::formatv("failed to find module by UID {0}", label.m_module_id));
+
+  auto *symbol_file = module->GetSymbolFile();
+  if (!symbol_file)
+    return llvm::createStringError(
+        llvm::formatv("no SymbolFile found on module {0:x}.", module));
+
+  SymbolContextList sc_list;
+  if (auto err =
+          symbol_file->FindAndResolveFunction(sc_list, label.m_lookup_name))
+    return llvm::joinErrors(
+        llvm::createStringError("failed to resolve function by UID"),
+        std::move(err));
+
+  LoadAddressResolver resolver(*sc.target_sp, symbol_was_missing_weak);
+  return resolver.Resolve(sc_list).value_or(LLDB_INVALID_ADDRESS);
+}
+
 lldb::addr_t
 IRExecutionUnit::FindInSymbols(const std::vector<ConstString> &names,
                                const lldb_private::SymbolContext &sc,
@@ -906,6 +966,20 @@ lldb::addr_t IRExecutionUnit::FindInUserDefinedSymbols(
 
 lldb::addr_t IRExecutionUnit::FindSymbol(lldb_private::ConstString name,
                                          bool &missing_weak) {
+  if (hasFunctionCallLabelPrefix(name.GetStringRef())) {
+    if (auto addr_or_err = ResolveFunctionCallLabel(name.GetStringRef(),
+                                                    m_sym_ctx, missing_weak)) {
+      return *addr_or_err;
+    } else {
+      LLDB_LOG_ERROR(GetLog(LLDBLog::Expressions), addr_or_err.takeError(),
+                     "Failed to resolve function call label {1}: {0}",
+                     name.GetStringRef());
+      return LLDB_INVALID_ADDRESS;
+    }
+  }
+
+  // TODO: do we still need the following lookups?
+
   std::vector<ConstString> candidate_C_names;
   std::vector<ConstString> candidate_CPlusPlus_names;
 
diff --git a/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.cpp b/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.cpp
index 9f77fbc1d2434..a6c4334bf2e59 100644
--- a/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.cpp
+++ b/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionDeclMap.cpp
@@ -1991,7 +1991,7 @@ void ClangExpressionDeclMap::AddContextClassType(NameSearchContext &context,
     const bool is_artificial = false;
 
     CXXMethodDecl *method_decl = m_clang_ast_context->AddMethodToCXXRecordType(
-        copied_clang_type.GetOpaqueQualType(), "$__lldb_expr", nullptr,
+        copied_clang_type.GetOpaqueQualType(), "$__lldb_expr", std::nullopt,
         method_type, lldb::eAccessPublic, is_virtual, is_static, is_inline,
         is_explicit, is_attr_used, is_artificial);
 
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index c76d67b47b336..6e9598c13ed5c 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -24,6 +24,7 @@
 #include "Plugins/Language/ObjC/ObjCLanguage.h"
 #include "lldb/Core/Module.h"
 #include "lldb/Core/Value.h"
+#include "lldb/Expression/Expression.h"
 #include "lldb/Host/Host.h"
 #include "lldb/Symbol/CompileUnit.h"
 #include "lldb/Symbol/Function.h"
@@ -249,6 +250,28 @@ static unsigned GetCXXMethodCVQuals(const DWARFDIE &subprogram,
   return cv_quals;
 }
 
+static std::optional<std::string> MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
+  char const *name = die.GetMangledName(/*substitute_name_allowed*/ false);
+  if (!name)
+    return std::nullopt;
+
+  auto module_sp = die.GetModule();
+  if (!module_sp)
+    return std::nullopt;
+
+  lldb::user_id_t module_id = module_sp->GetID();
+  if (module_id == LLDB_INVALID_UID)
+    return std::nullopt;
+
+  const auto die_id = die.GetID();
+  if (die_id == LLDB_INVALID_UID)
+    return std::nullopt;
+
+  return llvm::formatv("{0}:{1}:{2:x}:{3:x}", FunctionCallLabelPrefix, name,
+                       module_id, die_id)
+      .str();
+}
+
 TypeSP DWARFASTParserClang::ParseTypeFromClangModule(const SymbolContext &sc,
                                                      const DWARFDIE &die,
                                                      Log *log) {
@@ -1231,7 +1254,7 @@ std::pair<bool, TypeSP> DWARFASTParserClang::ParseCXXMethod(
 
   clang::CXXMethodDecl *cxx_method_decl = m_ast.AddMethodToCXXRecordType(
       class_opaque_type.GetOpaqueQualType(), attrs.name.GetCString(),
-      attrs.mangled_name, clang_type, accessibility, attrs.is_virtual,
+      MakeLLDBFuncAsmLabel(die), clang_type, accessibility, attrs.is_virtual,
       is_static, attrs.is_inline, attrs.is_explicit, is_attr_used,
       attrs.is_artificial);
 
@@ -1384,7 +1407,7 @@ DWARFASTParserClang::ParseSubroutine(const DWARFDIE &die,
             ignore_containing_context ? m_ast.GetTranslationUnitDecl()
                                       : containing_decl_ctx,
             GetOwningClangModule(die), name, clang_type, attrs.storage,
-            attrs.is_inline);
+            attrs.is_inline, MakeLLDBFuncAsmLabel(die));
         std::free(name_buf);
 
         if (has_template_params) {
@@ -1394,7 +1417,7 @@ DWARFASTParserClang::ParseSubroutine(const DWARFDIE &die,
               ignore_containing_context ? m_ast.GetTranslationUnitDecl()
                                         : containing_decl_ctx,
               GetOwningClangModule(die), attrs.name.GetStringRef(), clang_type,
-              attrs.storage, attrs.is_inline);
+              attrs.storage, attrs.is_inline, /*asm_label=*/std::nullopt);
           clang::FunctionTemplateDecl *func_template_decl =
               m_ast.CreateFunctionTemplateDecl(
                   containing_decl_ctx, GetOwningClangModule(die),
@@ -1406,20 +1429,6 @@ DWARFASTParserClang::ParseSubroutine(const DWARFDIE &die,
         lldbassert(function_decl);
 
         if (function_decl) {
-          // Attach an asm(<mangled_name>) label to the FunctionDecl.
-          // This ensures that clang::CodeGen emits function calls
-          // using symbols that are mangled according to the DW_AT_linkage_name.
-          // If we didn't do this, the external symbols wouldn't exactly
-          // match the mangled name LLDB knows about and the IRExecutionUnit
-          // would have to fall back to searching object files for
-          // approximately matching function names. The motivating
-          // example is generating calls to ABI-tagged template functions.
-          // This is done separately for member functions in
-          // AddMethodToCXXRecordType.
-          if (attrs.mangled_name)
-            function_decl->addAttr(clang::AsmLabelAttr::CreateImplicit(
-                m_ast.getASTContext(), attrs.mangled_name, /*literal=*/false));
-
           LinkDeclContextToDIE(function_decl, die);
 
           const clang::FunctionProtoType *function_prototype(
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 5b16ce5f75138..b6960bfe38fde 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -2471,6 +2471,47 @@ bool SymbolFileDWARF::ResolveFunction(const DWARFDIE &orig_die,
   return false;
 }
 
+llvm::Error
+SymbolFileDWARF::FindAndResolveFunction(SymbolContextList &sc_list,
+                                        llvm::StringRef lookup_name) {
+  std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
+
+  DWARFDIE die;
+  Module::LookupInfo info(ConstString(lookup_name), lldb::eFunctionNameTypeFull,
+                          lldb::eLanguageTypeUnknown);
+
+  m_index->GetFunctions(info, *this, {}, [&](DWARFDIE entry) {
+    if (entry.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
+      return true;
+
+    // We don't check whether the specification DIE for this function
+    // corresponds to the declaration DIE because the declaration might be in
+    // a type-unit but the definition in the compile-unit (and it's
+    // specifcation would point to the declaration in the compile-unit). We
+    // rely on the mangled name within the module to be enough to find us the
+    // unique definition.
+    die = entry;
+    return false;
+  });
+
+  if (!die.IsValid())
+    return llvm::createStringError(
+        llvm::formatv("failed to find definition DIE for '{0}'", lookup_name));
+
+  if (!ResolveFunction(die, false, sc_list))
+    return llvm::createStringError("failed to resolve function DIE");
+
+  if (sc_list.IsEmpty())
+    return llvm::createStringError("no definition DIE found");
+
+  if (sc_list.GetSize() > 1)
+    return llvm::createStringError(
+        "found %d functions for %s but expected only 1", sc_list.GetSize(),
+        lookup_name.data());
+
+  return llvm::Error::success();
+}
+
 bool SymbolFileDWARF::DIEInDeclContext(const CompilerDeclContext &decl_ctx,
                                        const DWARFDIE &die,
                                        bool only_root_namespaces) {
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
index 2dc862cccca14..ee966366d3f2b 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
@@ -434,6 +434,9 @@ class SymbolFileDWARF : public SymbolFileCommon {
   DIEArray MergeBlockAbstractParameters(const DWARFDIE &block_die,
                                         DIEArray &&variable_dies);
 
+  llvm::Error FindAndResolveFunction(SymbolContextList &sc_list,
+                                     llvm::StringRef lookup_name) override;
+
   // Given a die_offset, figure out the symbol context representing that die.
   bool ResolveFunction(const DWARFDIE &die, bool include_inlines,
                        SymbolContextList &sc_list);
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
index 702ec5e5c9ea9..bce721c149fee 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
+++ b/lldb/source/Plugins/SymbolFile/Nativ...
[truncated]

lldb::user_id_t m_module_id;

/// Mostly for debuggability.
lldb::user_id_t m_die_id;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't use the work DIE here in the generic code (PDB and all). It's just an function identifier that the symbol file knows how to interpret.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about symbol_id or function_id?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

/// from JITted expressions.
inline constexpr llvm::StringRef FunctionCallLabelPrefix = "$__lldb_func";

bool consumeFunctionCallLabelPrefix(llvm::StringRef &name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment to say this accounts for the target-specific mangling prefix. Otherwise, it's not clear why one needs this function.
That said, we might be able to use the llvm \01 mangling escape prefix to avoid this difference. Clang seems to do that automatically:

$ clang -target arm64-apple-macosx -x c -o - -S - <<<'void f() asm("FFF"); void f(){}' | grep FFF
	.globl	FFF                             ; -- Begin function FFF
FFF:                                    ; @"\01FFF"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, we might be able to use the llvm \01 mangling escape prefix to avoid this difference. Clang seems to do that automatically:

Good point. LLDB disabled the mangling prefix in:

commit f6bc251274f39dee5d09f668a56922c88bd027d8
Author: Vedant Kumar <[email protected]>
Date:   Wed Sep 25 18:00:31 2019 +0000

    [Mangle] Add flag to asm labels to disable '\01' prefixing
    
    LLDB synthesizes decls using asm labels. These decls cannot have a mangle
    different than the one specified in the label name. I.e., the '\01' prefix
    should not be added.
    
    Fixes an expression evaluation failure in lldb's TestVirtual.py on iOS.

But that wouldn't apply with this patch anymore since we're doing the handling of the AsmLabel ourselves.

Didn't realize that the \01 prevents the global mangling prefix to get added into the IR names.

I'll go with this approach. Then we don't need these APIs either

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe that means we can also get rid of this functionality in Clang (since it was just added for LLDB).

Comment on lines 124 to 132
Module *Module::GetAllocatedModuleWithUID(lldb::user_id_t uid) {
std::lock_guard<std::recursive_mutex> guard(
GetAllocationModuleCollectionMutex());
for (Module *mod : GetModuleCollection())
if (mod->GetID() == uid)
return mod;
return nullptr;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably put this function into the Target class. It will not need to iterate through that many modules and it avoids accidentally pulling in an unrelated module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah now I remember why i did it this way initially. In some of the tests, the module that we get from die.GetModule() is for an object file (e.g., main.o) but the target image list doesn't contain that module. So I iterated over all the allocated modules. There's probably a better way to get from DIE module to module in the image list. Though I haven't found a good way yet

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, that's tricky. We have something sort of similar in (e.g.) SymbolFileDWARF::GetDIEToType() which returns the DWARFDebugMap die-to-type map when dealing with a MachO .o files. It's not completely ideal, but I think it wouldn't be too bad if we had a SymbolFileDWARF function to return the ID of the "main" module.

Alternatively, since main symbol file which creates the type system (and the type system creates the DWARFASTParser), we may be able to plumb the module ID to the DWARFASTParser constructor.

@@ -130,8 +139,11 @@ Module *Module::GetAllocatedModuleAtIndex(size_t idx) {
return nullptr;
}

// TODO: needs a mutex
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. or you can make it an atomic.

/// The label string needs to stay valid for the entire lifetime
/// of this object.
struct FunctionCallLabel {
llvm::StringRef m_lookup_name;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't use the m_ prefix for plain structs.

Comment on lines 793 to 794
auto ts_or_err = sc.target_sp->GetScratchTypeSystemForLanguage(
lldb::eLanguageTypeC_plus_plus);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it makes sense to involve the type system here. At this point, do you even know if you're running a c++ expression?

Is there any harm in making this mangling scheme global? We can always change it or even add some language identifier if needed.

Alternatively, we could make each module responsible for its own mangling scheme. We could just parse out the module ID here and then pass the rest of the string verbatim.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I made it TypeSystemClang specific is that in the follow-up patch to support constructor/destructor variants (#149827) I plan to put a Clang type into the FunctionCallLabel structure. Of course I could avoid pulling Clang in by using plain integers instead of (clang::CXXCtorType/clang::CXXDtorType). Not sure if we get here outside of expression evaluation, but happy to hoist this to somewhere more global.

Do you prefer me move this out of TypeSystem and into Expression (or something similar). And we can refactor it if needed for the constructor/destructor patch?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to put clang-specific types into the generic FunctionCallLabel structure. If you want that, then the struct itself should by plugin-specific. I think that'd would be doable if you make the generic code treat the label.. generically (as a string I guess). The parsing would then happen inside the SymbolFile, which can know the encoding scheme used, the language and what not. In this world, it may make sense for this code to live in the type system, but then I'd make sure it's called from the SymbolFile class, and not directly from generic code.

However, I think that keeping the struct generic is also a viable option. We just need to avoid language and plugin-specific terms. For the structor type field, we could just say we have an additional "discriminator" field which the plugin can use to discriminate different "variants" of the given function. In this world, the label parsing should not be in a plugin. It could be just a method defined in Expression.h, next to the struct definition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea happy to do that instead. That was what my first iteration of the patch did. I'll add a "uint64_t discriminator" field here instead. And pass the FunctionCallLabel structure into the SymbolFile instead of the individual components. Makes things much simpler

if (die_id == LLDB_INVALID_UID)
return std::nullopt;

return llvm::formatv("{0}:{1}:{2:x}:{3:x}", FunctionCallLabelPrefix, name,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should at least be in the same plugin (ideally, in the same file) as the code that parses it.

std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());

DWARFDIE die;
Module::LookupInfo info(ConstString(lookup_name), lldb::eFunctionNameTypeFull,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised to see a name lookup here. Is the idea to change that in a follow-up?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I wanted to just use the DIE ID that we encoded. We can do that for regular functions, because the DWARFASTParserClang encodes the definition DIE ID. But for methods (and especially constructors/destructors) the DWARFASTParserClang just looks at the declaration DIE. So I do the lookup here.

We could make sure to encode the definition DIE into the label and move this logic into the DWARFASTParserClang, but unless I'm missing something we need the lookup to fetch the definitions (for methods at least)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So, we definitely don't want to look for the definition DIE just so we can encode it into the label. What we could do is encode the declaration DIE, and then when (and if -- I expect that the majority of functions will never be called) we need to resolve the call, we look up the matching definition DIE. That would be my ideal implementation as that's the only way to avoid ambiguities arising from multiple functions having identical linkage names (these occur in anonymous namespace. the ambiguities are solved because there's no need to look up definition DIEs, as these DIEs will be CU-local and always be definitions). It would also let us debug code which doesn't include the linkage names in the definition DIEs (which is what clang does with -gsce).

Another reason for dropping the linkage names might be to reduce the memory footprint of long asm labels, but if this isn't big, we may want to keep them anyway to help with debugging.

That said, I can imagine postponing that to a later time. However, I'd suggest designing this API such that it is possible to implement that in the future, such as by passing the entire FunctionCallLabel object so that the module can choose what it uses to perform the lookup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encode the declaration DIE, and then when (and if -- I expect that the majority of functions will never be called) we need to resolve the call, we look up the matching definition DIE

Hmm how do we do this without a name lookup? SymbolFileDWARF::FindDefinitionDIE?

Another reason for dropping the linkage names might be to reduce the memory footprint of long asm labels, but if this isn't big, we may want to keep them anyway to help with debugging.

Yea had the same thought. For debugging it has been useful actually. When something goes wrong and you get a couldn't resolve symbol $__lldb_func:_Z3foov:0x123:0xf00d having the mangled name there is pretty nice.

Copy link
Collaborator

@labath labath Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm how do we do this without a name lookup? SymbolFileDWARF::FindDefinitionDIE?

Maybe, although I suspect that function only works on types/classes right now. Ultimately, we are going to need some kind of a name lookup to go from the declaration to the definition DIE, and I think that using the mangled name (when it exists) for that is the fastest and simplest way to achieve that. However, I think we should do that only if its needed. The most unique identifier is the DIE ID, so I'd start by looking that up. After that, there are three cases:

  1. This is a definition DIE. We don't need to do anything. The DIE doesn't even have to contain a mangled name. We just take the DW_AT_low/entry_pc, and we're done.
  2. This is a declaration of a CU-local entity (anonymous namespace). Here, we need to do a lookup, but we know the result must be in the current CU. The mangled name is the simplest way to do that. We could try to do structural matching (like we do for classes with simplified template names) if DW_AT_linkage_name is not present, but that's a job for another patch.
  3. This is a declaration of a global entity. We also need to do a lookup, but the result could be anywhere(*) -- though we still may want to prefer the definition in the current CU, if it exists.

(*) I just realized that "anywhere" also means "in a different module", which means that there should be some sort of a fallback to searching in other modules. Encoding the mangled name into the asm label might make that easier, as a retry could happen at a different level -- though it would still be nice if the "owning module" would signal that this kind of search is necessary (instead of blindly doing the full search whenever the first module doesn't produce a match).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying! Yes, that's about what I imagined it would look like

I just realized that "anywhere" also means "in a different module", which means that there should be some sort of a fallback to searching in other modules

Great point. I think we can re-use the existing IRExecutionUnit to do a fallback search through the other modules.

@@ -9759,3 +9782,51 @@ void TypeSystemClang::LogCreation() const {
LLDB_LOG(log, "Created new TypeSystem for (ASTContext*){0:x} '{1}'",
&getASTContext(), getDisplayName());
}

// Expected format is:
// $__lldb_func:<mangled name>:<module id>:<definition/declaration DIE id>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put mangled name last? That way you don't have to worry about it containing :s...

@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels branch from e64ce7d to 9aec4ec Compare July 28, 2025 14:45
lldb_private::ConstString name(constant_func->getName());
lldb_private::ConstString name(
llvm::GlobalValue::dropLLVMManglingEscape(
constant_func->getName()));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to drop the \01 mangling prefix here since we're not creating literal AsmLabels anymore. Fixed a couple of test failures where we ran expressions like expression func (to print the function pointer). We would try to find the symbol here (including the mangling prefix) and fail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do this in a separate PR technically

@Michael137 Michael137 force-pushed the lldb/reliable-function-call-labels branch from 3b5a380 to f1e3ce1 Compare July 31, 2025 07:54
Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for working on this.

@Michael137 Michael137 merged commit f890591 into llvm:main Aug 1, 2025
9 checks passed
@Michael137 Michael137 deleted the lldb/reliable-function-call-labels branch August 1, 2025 06:21
@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 1, 2025

LLVM Buildbot has detected a new failure on builder lldb-remote-linux-win running on as-builder-10 while building lldb at step 16 "test-check-lldb-unit".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/197/builds/7609

Here is the relevant piece of the build log for the reference
Step 16 (test-check-lldb-unit) failure: Test just built components: check-lldb-unit completed (failure)
******************** TEST 'lldb-unit :: Symbol/./SymbolTests.exe/56/91' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\buildbot\as-builder-10\lldb-x-aarch64\build\tools\lldb\unittests\Symbol\.\SymbolTests.exe-lldb-unit-11500-56-91.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=91 GTEST_SHARD_INDEX=56 C:\buildbot\as-builder-10\lldb-x-aarch64\build\tools\lldb\unittests\Symbol\.\SymbolTests.exe
--

Script:
--
C:\buildbot\as-builder-10\lldb-x-aarch64\build\tools\lldb\unittests\Symbol\.\SymbolTests.exe --gtest_filter=TestTypeSystemClang.AsmLabel_CtorDtor
--
C:\buildbot\as-builder-10\lldb-x-aarch64\llvm-project\lldb\unittests\Symbol\TestTypeSystemClang.cpp(1170): error: Expected equality of these values:
  m_ast->DeclGetMangledName(ctor_nolabel).GetCString()
    Which is: "??0S@@QEAA@XZ"
  "_ZN1SC1Ev"


C:\buildbot\as-builder-10\lldb-x-aarch64\llvm-project\lldb\unittests\Symbol\TestTypeSystemClang.cpp:1170
Expected equality of these values:
  m_ast->DeclGetMangledName(ctor_nolabel).GetCString()
    Which is: "??0S@@QEAA@XZ"
  "_ZN1SC1Ev"



********************


@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 1, 2025

LLVM Buildbot has detected a new failure on builder lldb-x86_64-win running on as-builder-10 while building lldb at step 8 "test-check-lldb-unit".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/211/builds/985

Here is the relevant piece of the build log for the reference
Step 8 (test-check-lldb-unit) failure: Test just built components: check-lldb-unit completed (failure)
******************** TEST 'lldb-unit :: Symbol/./SymbolTests.exe/56/91' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\buildbot\as-builder-10\lldb-x86-64\build\tools\lldb\unittests\Symbol\.\SymbolTests.exe-lldb-unit-10636-56-91.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=91 GTEST_SHARD_INDEX=56 C:\buildbot\as-builder-10\lldb-x86-64\build\tools\lldb\unittests\Symbol\.\SymbolTests.exe
--

Script:
--
C:\buildbot\as-builder-10\lldb-x86-64\build\tools\lldb\unittests\Symbol\.\SymbolTests.exe --gtest_filter=TestTypeSystemClang.AsmLabel_CtorDtor
--
C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\unittests\Symbol\TestTypeSystemClang.cpp(1170): error: Expected equality of these values:
  m_ast->DeclGetMangledName(ctor_nolabel).GetCString()
    Which is: "??0S@@QEAA@XZ"
  "_ZN1SC1Ev"


C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\unittests\Symbol\TestTypeSystemClang.cpp:1170
Expected equality of these values:
  m_ast->DeclGetMangledName(ctor_nolabel).GetCString()
    Which is: "??0S@@QEAA@XZ"
  "_ZN1SC1Ev"



********************

Step 9 (test-check-lldb-api) failure: Test just built components: check-lldb-api completed (failure)
...
PASS: lldb-api :: lang/cpp/default-template-args/TestDefaultTemplateArgs.py (795 of 1292)
PASS: lldb-api :: lang/cpp/dereferencing_references/TestCPPDereferencingReferences.py (796 of 1292)
UNSUPPORTED: lldb-api :: lang/cpp/enum_types/TestCPP11EnumTypes.py (797 of 1292)
PASS: lldb-api :: lang/cpp/enum_promotion/TestCPPEnumPromotion.py (798 of 1292)
PASS: lldb-api :: lang/cpp/class_types/TestClassTypesDisassembly.py (799 of 1292)
PASS: lldb-api :: lang/cpp/elaborated-types/TestElaboratedTypes.py (800 of 1292)
PASS: lldb-api :: lang/cpp/dynamic-value-same-basename/TestDynamicValueSameBase.py (801 of 1292)
PASS: lldb-api :: lang/cpp/dynamic-value/TestCppValueCast.py (802 of 1292)
PASS: lldb-api :: lang/cpp/class_static/TestStaticVariables.py (803 of 1292)
UNRESOLVED: lldb-api :: lang/cpp/expr-definition-in-dylib/TestExprDefinitionInDylib.py (804 of 1292)
******************** TEST 'lldb-api :: lang/cpp/expr-definition-in-dylib/TestExprDefinitionInDylib.py' FAILED ********************
Script:
--
C:/Python312/python.exe C:/buildbot/as-builder-10/lldb-x86-64/llvm-project/lldb\test\API\dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=C:/buildbot/as-builder-10/lldb-x86-64/build/./lib --env LLVM_INCLUDE_DIR=C:/buildbot/as-builder-10/lldb-x86-64/build/include --env LLVM_TOOLS_DIR=C:/buildbot/as-builder-10/lldb-x86-64/build/./bin --arch x86_64 --build-dir C:/buildbot/as-builder-10/lldb-x86-64/build/lldb-test-build.noindex --lldb-module-cache-dir C:/buildbot/as-builder-10/lldb-x86-64/build/lldb-test-build.noindex/module-cache-lldb\lldb-api --clang-module-cache-dir C:/buildbot/as-builder-10/lldb-x86-64/build/lldb-test-build.noindex/module-cache-clang\lldb-api --executable C:/buildbot/as-builder-10/lldb-x86-64/build/./bin/lldb.exe --compiler C:/buildbot/as-builder-10/lldb-x86-64/build/./bin/clang.exe --dsymutil C:/buildbot/as-builder-10/lldb-x86-64/build/./bin/dsymutil.exe --make C:/ninja/make.exe --llvm-tools-dir C:/buildbot/as-builder-10/lldb-x86-64/build/./bin --lldb-obj-root C:/buildbot/as-builder-10/lldb-x86-64/build/tools/lldb --lldb-libs-dir C:/buildbot/as-builder-10/lldb-x86-64/build/./lib --cmake-build-type Release --skip-category=lldb-dap C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\test\API\lang\cpp\expr-definition-in-dylib -p TestExprDefinitionInDylib.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 22.0.0git (https://github.com/llvm/llvm-project.git revision f89059140bd54699771fe076da30dd1db060cc93)
  clang revision f89059140bd54699771fe076da30dd1db060cc93
  llvm revision f89059140bd54699771fe076da30dd1db060cc93
Skipping the following test categories: ['lldb-dap', 'libc++', 'libstdcxx', 'dwo', 'dsym', 'gmodules', 'debugserver', 'objc', 'fork', 'pexpect']


--
Command Output (stderr):
--
FAIL: LLDB (C:\buildbot\as-builder-10\lldb-x86-64\build\bin\clang.exe-x86_64) :: test (TestExprDefinitionInDylib.ExprDefinitionInDylibTestCase.test)

======================================================================

ERROR: test (TestExprDefinitionInDylib.ExprDefinitionInDylibTestCase.test)

   Tests that we can call functions whose definition

----------------------------------------------------------------------

Error when building test subject.



Build Command:

C:/ninja/make.exe 'VPATH=C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\test\API\lang\cpp\expr-definition-in-dylib' -C 'C:\buildbot\as-builder-10\lldb-x86-64\build\lldb-test-build.noindex\lang\cpp\expr-definition-in-dylib\TestExprDefinitionInDylib.test' -I 'C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\test\API\lang\cpp\expr-definition-in-dylib' -I 'C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\packages\Python\lldbsuite\test\make' -f 'C:\buildbot\as-builder-10\lldb-x86-64\llvm-project\lldb\test\API\lang\cpp\expr-definition-in-dylib\Makefile' all ARCH=x86_64 'CC=C:\buildbot\as-builder-10\lldb-x86-64\build\bin\clang.exe' CC_TYPE=clang 'CXX=C:\buildbot\as-builder-10\lldb-x86-64\build\bin\clang++.exe' 'LLVM_AR=C:/buildbot/as-builder-10/lldb-x86-64/build/./bin\llvm-ar.exe' 'AR=C:/buildbot/as-builder-10/lldb-x86-64/build/./bin\llvm-ar.exe' 'OBJCOPY=C:/buildbot/as-builder-10/lldb-x86-64/build/./bin\llvm-objcopy.exe' 'STRIP=C:/buildbot/as-builder-10/lldb-x86-64/build/./bin\llvm-strip.exe' 'ARCHIVER=C:/buildbot/as-builder-10/lldb-x86-64/build/./bin\llvm-ar.exe' 'DWP=C:/buildbot/as-builder-10/lldb-x86-64/build/./bin\llvm-dwp.exe' 'CLANG_MODULE_CACHE_DIR=C:/buildbot/as-builder-10/lldb-x86-64/build/lldb-test-build.noindex/module-cache-clang\lldb-api' LLDB_OBJ_ROOT=C:/buildbot/as-builder-10/lldb-x86-64/build/tools/lldb OS=Windows_NT HOST_OS=Windows_NT



Build Command Output:

@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 1, 2025

LLVM Buildbot has detected a new failure on builder lldb-aarch64-windows running on linaro-armv8-windows-msvc-05 while building lldb at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/141/builds/10514

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: lang/cpp/dereferencing_references/TestCPPDereferencingReferences.py (809 of 2283)
PASS: lldb-api :: lang/cpp/dynamic-value-same-basename/TestDynamicValueSameBase.py (810 of 2283)
PASS: lldb-api :: lang/cpp/diamond/TestCppDiamond.py (811 of 2283)
PASS: lldb-api :: lang/cpp/dynamic-value/TestCppValueCast.py (812 of 2283)
PASS: lldb-api :: lang/cpp/elaborated-types/TestElaboratedTypes.py (813 of 2283)
PASS: lldb-api :: lang/cpp/enum_promotion/TestCPPEnumPromotion.py (814 of 2283)
UNSUPPORTED: lldb-api :: lang/cpp/enum_types/TestCPP11EnumTypes.py (815 of 2283)
XFAIL: lldb-api :: lang/cpp/dynamic-value/TestDynamicValue.py (816 of 2283)
UNSUPPORTED: lldb-api :: lang/cpp/exceptions/TestCPPExceptionBreakpoints.py (817 of 2283)
UNRESOLVED: lldb-api :: lang/cpp/expr-definition-in-dylib/TestExprDefinitionInDylib.py (818 of 2283)
******************** TEST 'lldb-api :: lang/cpp/expr-definition-in-dylib/TestExprDefinitionInDylib.py' FAILED ********************
Script:
--
C:/Users/tcwg/scoop/apps/python/current/python.exe C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/llvm-project/lldb\test\API\dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./lib --env LLVM_INCLUDE_DIR=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/include --env LLVM_TOOLS_DIR=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin --arch aarch64 --build-dir C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex --lldb-module-cache-dir C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-lldb\lldb-api --clang-module-cache-dir C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-clang\lldb-api --executable C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin/lldb.exe --compiler C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin/clang.exe --dsymutil C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin/dsymutil.exe --make C:/Users/tcwg/scoop/shims/make.exe --llvm-tools-dir C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin --lldb-obj-root C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb --lldb-libs-dir C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./lib --cmake-build-type Release --skip-category=watchpoint C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\lang\cpp\expr-definition-in-dylib -p TestExprDefinitionInDylib.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 22.0.0git (https://github.com/llvm/llvm-project.git revision f89059140bd54699771fe076da30dd1db060cc93)
  clang revision f89059140bd54699771fe076da30dd1db060cc93
  llvm revision f89059140bd54699771fe076da30dd1db060cc93
Skipping the following test categories: ['watchpoint', 'libc++', 'libstdcxx', 'dwo', 'dsym', 'gmodules', 'debugserver', 'objc', 'fork', 'pexpect']


--
Command Output (stderr):
--
FAIL: LLDB (C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe-aarch64) :: test (TestExprDefinitionInDylib.ExprDefinitionInDylibTestCase.test)

======================================================================

ERROR: test (TestExprDefinitionInDylib.ExprDefinitionInDylibTestCase.test)

   Tests that we can call functions whose definition

----------------------------------------------------------------------

Error when building test subject.



Build Command:

C:/Users/tcwg/scoop/shims/make.exe 'VPATH=C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\lang\cpp\expr-definition-in-dylib' -C 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\lldb-test-build.noindex\lang\cpp\expr-definition-in-dylib\TestExprDefinitionInDylib.test' -I 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\lang\cpp\expr-definition-in-dylib' -I 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make' -f 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\lang\cpp\expr-definition-in-dylib\Makefile' all ARCH=aarch64 'CC=C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe' CC_TYPE=clang 'CXX=C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang++.exe' 'LLVM_AR=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin\llvm-ar.exe' 'AR=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin\llvm-ar.exe' 'OBJCOPY=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin\llvm-objcopy.exe' 'STRIP=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin\llvm-strip.exe' 'ARCHIVER=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin\llvm-ar.exe' 'DWP=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/./bin\llvm-dwp.exe' 'CLANG_MODULE_CACHE_DIR=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-clang\lldb-api' LLDB_OBJ_ROOT=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb OS=Windows_NT HOST_OS=Windows_NT



Build Command Output:

Michael137 added a commit to Michael137/llvm-project that referenced this pull request Aug 3, 2025
This was added purely for the needs of LLDB. However, starting with
llvm#151355 and llvm#148877 we no longer create literal AsmLabels in LLDB either. So this is unused and we can get rid of it.

In the near future LLDB will want to add special support for mangling `AsmLabel`s, and the "literal label" codepath in the mangler made this more cumbersome.
Michael137 added a commit to Michael137/llvm-project that referenced this pull request Aug 3, 2025
This was added purely for the needs of LLDB. However, starting with
llvm#151355 and llvm#148877 we no longer create literal AsmLabels in LLDB either. So this is unused and we can get rid of it.

In the near future LLDB will want to add special support for mangling `AsmLabel`s, and the "literal label" codepath in the mangler made this more cumbersome.

(cherry picked from commit 342f0d5)
Michael137 added a commit to Michael137/llvm-project that referenced this pull request Aug 4, 2025
This was added purely for the needs of LLDB. However, starting with
llvm#151355 and llvm#148877 we no longer create literal AsmLabels in LLDB either. So this is unused and we can get rid of it.

In the near future LLDB will want to add special support for mangling `AsmLabel`s, and the "literal label" codepath in the mangler made this more cumbersome.

(cherry picked from commit 342f0d5)
Michael137 added a commit that referenced this pull request Aug 4, 2025
This was added purely for the needs of LLDB. However, starting with
#151355 and
#148877 we no longer create
literal AsmLabels in LLDB either. So this is unused and we can get rid
of it.

In the near future LLDB will want to add special support for mangling
`AsmLabel`s (specifically
#149827).
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 4, 2025
…(#151858)

This was added purely for the needs of LLDB. However, starting with
llvm/llvm-project#151355 and
llvm/llvm-project#148877 we no longer create
literal AsmLabels in LLDB either. So this is unused and we can get rid
of it.

In the near future LLDB will want to add special support for mangling
`AsmLabel`s (specifically
llvm/llvm-project#149827).
krishna2803 pushed a commit to krishna2803/llvm-project that referenced this pull request Aug 12, 2025
…vm#151355)

Split out from llvm#148877

This patch prepares `TypeSystemClang` APIs to take `AsmLabel`s which
concatenated strings (hence `std::string`) instead of a plain `const
char*`.
krishna2803 pushed a commit to krishna2803/llvm-project that referenced this pull request Aug 12, 2025
…llvm#148877)

LLDB currently attaches `AsmLabel`s to `FunctionDecl`s such that that
the `IRExecutionUnit` can determine which mangled name to call (we can't
rely on Clang deriving the correct mangled name to call because the
debug-info AST doesn't contain all the info that would be encoded in the
DWARF linkage names). However, we don't attach `AsmLabel`s for structors
because they have multiple variants and thus it's not clear which
mangled name to use. In the [RFC on fixing expression evaluation of
abi-tagged
structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
we discussed encoding the structor variant into the `AsmLabel`s.
Specifically in [this
thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
we discussed that the contents of the `AsmLabel` are completely under
LLDB's control and we could make use of it to uniquely identify a
function by encoding the exact module and DIE that the function is
associated with (mangled names need not be enough since two identical
mangled symbols may live in different modules). So if we already have a
custom `AsmLabel` format, we can encode the structor variant in a
follow-up (the current idea is to append the structor variant as a
suffix to our custom `AsmLabel` when Clang emits the mangled name into
the JITted IR). Then we would just have to teach the `IRExecutionUnit`
to pick the correct structor variant DIE during symbol resolution. The
draft of this is available
[here](llvm#149827)

This patch sets up the infrastructure for the custom `AsmLabel` format
by encoding the module id, DIE id and mangled name in it.

**Implementation**

The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is:
`$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this
label and then do a lookup by DIE ID (or mangled name into the module if
the encoded DIE is a declaration).

Depends on llvm#151355
krishna2803 pushed a commit to krishna2803/llvm-project that referenced this pull request Aug 12, 2025
This was added purely for the needs of LLDB. However, starting with
llvm#151355 and
llvm#148877 we no longer create
literal AsmLabels in LLDB either. So this is unused and we can get rid
of it.

In the near future LLDB will want to add special support for mangling
`AsmLabel`s (specifically
llvm#149827).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants