@imkiva imkiva commented Oct 31, 2025

Currently, LLDB's ParseRustVariantPart generates the following CXXRecordDecl for a Rust enum:

enum AA {
  A(u8)
}
CXXRecordDecl 0x5555568d5970 <<invalid sloc>> <invalid sloc> struct AA
|-CXXRecordDecl 0x5555568d5ab0 <<invalid sloc>> <invalid sloc> union test_issue::AA$Inner definition
| |-CXXRecordDecl 0x5555568d5d18 <<invalid sloc>> <invalid sloc> struct A$Variant definition
| | |-DefinitionData pass_in_registers aggregate standard_layout trivially_copyable trivial
| | | `-Destructor simple irrelevant trivial needs_implicit
| | `-FieldDecl 0x555555a77880 <<invalid sloc>> <invalid sloc> value 'test_issue::AA::A'
| `-FieldDecl 0x555555a778f0 <<invalid sloc>> <invalid sloc> $variant$ 'test_issue::AA::test_issue::AA$Inner::A$Variant'
|-CXXRecordDecl 0x5555568d5c48 <<invalid sloc>> <invalid sloc> struct A definition
| `-FieldDecl 0x555555a777e0 <<invalid sloc>> <invalid sloc> __0 'unsigned char'
`-FieldDecl 0x555555a77960 <<invalid sloc>> <invalid sloc> $variants$ 'test_issue::AA::test_issue::AA$Inner'

However, when the Rust enum's type name is the same as one of its variant names, the generated CXXRecordDecl becomes the following. There is a circular reference between struct A$Variant and struct A, which causes #163048.

enum A {
  A(u8)
}
CXXRecordDecl 0x5555568d5760 <<invalid sloc>> <invalid sloc> struct A
|-CXXRecordDecl 0x5555568d58a0 <<invalid sloc>> <invalid sloc> union test_issue::A$Inner definition
| |-CXXRecordDecl 0x5555568d5a38 <<invalid sloc>> <invalid sloc> struct A$Variant definition
| | `-FieldDecl 0x5555568d5b70 <<invalid sloc>> <invalid sloc> value 'test_issue::A'    <---- bug here
| `-FieldDecl 0x5555568d5be0 <<invalid sloc>> <invalid sloc> $variant$ 'test_issue::A::test_issue::A$Inner::A$Variant'
`-FieldDecl 0x5555568d5c50 <<invalid sloc>> <invalid sloc> $variants$ 'test_issue::A::test_issue::A$Inner'

The problem was caused by GetUniqueTypeNameAndDeclaration not returning the correct qualified name for the DWARF DIE test_issue::A::A; instead, it returned A. As a result, ParseStructureLikeDIE found the wrong type, test_issue::A, and returned early.

The failure in GetUniqueTypeNameAndDeclaration appears to stem from a language check that returns early unless the language is C++. I changed it so that Rust follows the C++ path instead of returning. I'm not entirely sure this is the right approach: Rust's qualified-name rules look similar to C++'s, but they may not be identical. Alternatively, we could add a Rust-specific implementation that forms qualified names according to Rust's rules.
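
To make the name collision described above concrete, here is a minimal sketch in plain C++. It is a toy model, not LLDB code: `ToyDie`, `QualifiedName`, `TypeCache`, and `FindCached` are hypothetical names invented for illustration, and the cache only loosely stands in for whatever lookup ParseStructureLikeDIE performs. It shows that a lookup keyed by the bare name A finds the enum when the variant struct is requested, while a lookup keyed by the qualified name does not.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a DWARF DIE: just a name and a parent link.
struct ToyDie {
  std::string name;
  const ToyDie *parent;
};

// Joins the names from the outermost scope down to `die` with "::".
std::string QualifiedName(const ToyDie &die) {
  std::vector<const ToyDie *> chain;
  for (const ToyDie *d = &die; d != nullptr; d = d->parent)
    chain.push_back(d);
  std::string result;
  for (auto it = chain.rbegin(); it != chain.rend(); ++it) {
    if (!result.empty())
      result += "::";
    result += (*it)->name;
  }
  return result;
}

// Toy cache of already-parsed types, keyed by whatever name the caller picks.
using TypeCache = std::map<std::string, const ToyDie *>;

// Returns the cached DIE for `key`, or nullptr when nothing is cached yet.
const ToyDie *FindCached(const TypeCache &cache, const std::string &key) {
  auto it = cache.find(key);
  return it == cache.end() ? nullptr : it->second;
}
```

With a namespace `test_issue`, an enum `A` inside it, and a variant struct `A` inside the enum, a cache keyed by the bare name returns the enum when asked for the variant. A cache keyed by `QualifiedName` keeps `test_issue::A` and `test_issue::A::A` apart.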

@imkiva imkiva requested a review from JDevlieghere as a code owner October 31, 2025 08:48
@llvmbot llvmbot added the lldb label Oct 31, 2025

llvmbot commented Oct 31, 2025

@llvm/pr-subscribers-lldb

Author: Kiva (imkiva)

Full diff: https://github.com/llvm/llvm-project/pull/165840.diff

3 Files Affected:

  • (modified) lldb/include/lldb/Target/Language.h (+2)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+5-1)
  • (modified) lldb/source/Target/Language.cpp (+4)
diff --git a/lldb/include/lldb/Target/Language.h b/lldb/include/lldb/Target/Language.h
index 9958b6ea2f815..f3aac6a324c34 100644
--- a/lldb/include/lldb/Target/Language.h
+++ b/lldb/include/lldb/Target/Language.h
@@ -436,6 +436,8 @@ class Language : public PluginInterface {
 
   static bool LanguageIsC(lldb::LanguageType language);
 
+  static bool LanguageIsRust(lldb::LanguageType language);
+
   /// Equivalent to \c LanguageIsC||LanguageIsObjC||LanguageIsCPlusPlus.
   static bool LanguageIsCFamily(lldb::LanguageType language);
 
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index c049829f37219..388ec20cdc5fe 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -1700,7 +1700,11 @@ void DWARFASTParserClang::GetUniqueTypeNameAndDeclaration(
   // For C++, we rely solely upon the one definition rule that says
   // only one thing can exist at a given decl context. We ignore the
   // file and line that things are declared on.
-  if (!die.IsValid() || !Language::LanguageIsCPlusPlus(language) ||
+  // For Rust, we do the same since Rust also has a similar qualified name?
+  // Is there a better way to do this for Rust?
+  if (!die.IsValid() ||
+      (!Language::LanguageIsCPlusPlus(language) &&
+       !Language::LanguageIsRust(language)) ||
       unique_typename.IsEmpty())
     return;
   decl_declaration.Clear();
diff --git a/lldb/source/Target/Language.cpp b/lldb/source/Target/Language.cpp
index 8268d4ae4bb27..ad11fd94bb6b6 100644
--- a/lldb/source/Target/Language.cpp
+++ b/lldb/source/Target/Language.cpp
@@ -316,6 +316,10 @@ bool Language::LanguageIsCPlusPlus(LanguageType language) {
   }
 }
 
+bool Language::LanguageIsRust(LanguageType language) {
+  return language == eLanguageTypeRust;
+}
+
 bool Language::LanguageIsObjC(LanguageType language) {
   switch (language) {
   case eLanguageTypeObjC:

}

bool Language::LanguageIsRust(LanguageType language) {
return language == eLanguageTypeRust;

Let's remove this API and just check the language type directly at the call site for now.


Maybe eventually we'd want something like a "PretendsToBeCxx" API, which is true for Rust and a couple of other languages. But that's out of scope.

// For Rust, we do the same since Rust also has a similar qualified name?
// Is there a better way to do this for Rust?
if (!die.IsValid() ||
(!Language::LanguageIsCPlusPlus(language) &&

Can we split the language check into a separate if-statement? That makes it easier to read.

// file and line that things are declared on.
if (!die.IsValid() || !Language::LanguageIsCPlusPlus(language) ||
// For Rust, we do the same since Rust also has a similar qualified name?
// Is there a better way to do this for Rust?

I would reword this to something like:
"FIXME: Rust pretends to be C++ for now, so use C++ name qualification rules"


@Michael137 Michael137 left a comment

Could we add a test? The original crash from the issue should do?


imkiva commented Oct 31, 2025

> Could we add a test? The original crash from the issue should do?

Yes, I am doing so. I noticed the Rust tests use obj2yaml, and I am learning it right now 😆

@Michael137

> > Could we add a test? The original crash from the issue should do?
>
> Yes I am doing so. I noticed Rust tests used obj2yaml and I am learning it rn😆

A unit test might be simpler. Check DWARFASTParserClangTests for examples; they have pretty minimal YAML inputs. Feel free to ping here if you need help.


github-actions bot commented Oct 31, 2025

✅ With the latest revision this PR passed the Python code formatter.


imkiva commented Nov 3, 2025

I added a test modeled on the existing enum-structs tests in LLDB. Do we need an additional unit-test version of this? Maybe we could check the generated CXXRecordDecl to ensure it does not introduce circular references?
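
As a rough illustration of what such a circular-reference check could look like, here is a minimal sketch in plain C++. It does not use Clang or LLDB APIs. `RecordGraph`, `HasCycleFrom`, and `HasCircularReference` are hypothetical names, and the graph is assumed to be built by hand from the record names and by-value field types seen in the AST dumps from the description.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model: each record name maps to the record types of its by-value fields.
using RecordGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first search; `on_path` holds the records currently being expanded.
static bool HasCycleFrom(const RecordGraph &graph, const std::string &record,
                         std::set<std::string> &on_path,
                         std::set<std::string> &done) {
  if (done.count(record))
    return false; // Already fully explored without finding a cycle.
  if (!on_path.insert(record).second)
    return true; // Reached a record that is still being expanded.
  auto it = graph.find(record);
  if (it != graph.end())
    for (const std::string &field_type : it->second)
      if (HasCycleFrom(graph, field_type, on_path, done))
        return true;
  on_path.erase(record);
  done.insert(record);
  return false;
}

// Returns true if any record contains itself by value, directly or indirectly.
bool HasCircularReference(const RecordGraph &graph) {
  std::set<std::string> on_path, done;
  for (const auto &entry : graph)
    if (HasCycleFrom(graph, entry.first, on_path, done))
      return true;
  return false;
}
```

Feeding it the buggy shape (A contains A$Inner, which contains A$Variant, which contains A) reports a cycle, while the AA shape, where A$Variant holds the separate payload struct, does not.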

Comment on lines 1704 to 1706
auto isCPlusPlusOrSimilar = Language::LanguageIsCPlusPlus(language) ||
language == lldb::eLanguageTypeRust;
if (!die.IsValid() || !isCPlusPlusOrSimilar || unique_typename.IsEmpty())

I would just do:

Suggested change
-  auto isCPlusPlusOrSimilar = Language::LanguageIsCPlusPlus(language) ||
-                              language == lldb::eLanguageTypeRust;
-  if (!die.IsValid() || !isCPlusPlusOrSimilar || unique_typename.IsEmpty())
+  if (!Language::LanguageIsCPlusPlus(language) &&
+      language != lldb::eLanguageTypeRust)
+    return;
+  if (!die.IsValid() || unique_typename.IsEmpty())
+    return;
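
For reference, the two-step early return can be sketched in isolation as below. This is a self-contained toy, not the LLDB function: `ToyLanguage`, `IsCPlusPlus`, and `ShouldQualifyName` are hypothetical names, and the boolean parameters stand in for `die.IsValid()` and `unique_typename.IsEmpty()`. One detail worth keeping in mind when writing the language test: in C++, `!x == y` parses as `(!x) == y`, so the Rust comparison is written with `!=` here.

```cpp
// Toy language enum; the values are hypothetical, not LLDB's LanguageType.
enum class ToyLanguage { C, CPlusPlus, Rust, Swift };

// Stand-in for an "is this a C++ dialect" check.
bool IsCPlusPlus(ToyLanguage language) {
  return language == ToyLanguage::CPlusPlus;
}

// Returns true when name qualification should run for this input.
bool ShouldQualifyName(ToyLanguage language, bool die_is_valid,
                       bool name_is_empty) {
  // Language check first: only C++ and Rust take the C++ qualification path.
  if (!IsCPlusPlus(language) && language != ToyLanguage::Rust)
    return false;
  // Then the validity checks, kept in their own statement for readability.
  if (!die_is_valid || name_is_empty)
    return false;
  return true;
}
```

Under these assumptions, C++ and Rust inputs with a valid DIE and a non-empty name proceed, and everything else returns early.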

Member Author

Fixed, thanks~
