Skip to content

Conversation

@Jlalond
Copy link
Contributor

@Jlalond Jlalond commented Nov 5, 2025

Recently I've been deep diving ELF cores in LLDB, aspiring to move LLDB closer to GDB in capability. One issue I encountered was a system lib losing it's unwind plan when loading the debuginfo. The reason for this was the debuginfo has the eh_frame section stripped and the main executable did not.

The root cause of this was this line in ObjectFileElf

  // For eTypeDebugInfo files, the Symbol Vendor will take care of updating the
  // unified section list.
  if (GetType() != eTypeDebugInfo)
    unified_section_list = *m_sections_up;

This would always be executed because CalculateType can never return an eTypeDebugInfo

ObjectFile::Type ObjectFileELF::CalculateType() {
  switch (m_header.e_type) {
  case llvm::ELF::ET_NONE:
    // 0 - No file type
    return eTypeUnknown;

  case llvm::ELF::ET_REL:
    // 1 - Relocatable file
    return eTypeObjectFile;

  case llvm::ELF::ET_EXEC:
    // 2 - Executable file
    return eTypeExecutable;

  case llvm::ELF::ET_DYN:
    // 3 - Shared object file
    return eTypeSharedLibrary;

  case ET_CORE:
    // 4 - Core file
    return eTypeCoreFile;

  default:
    break;
  }
  return eTypeUnknown;
}

This makes sense as there isn't a explicit sh_type to denote that this file is a debuginfo. After some discussion with @clayborg and @GeorgeHuyubo we settled on joining the exciting unified section list with whatever new sections were being added. Adding each new unique section, or taking the section with the maximum file size. We picked this strategy to pick the section with the most information. In most scenarios, LHS should be SHT_NOBITS and RHS would be SHT_PROGBITS.

Here is a diagram documenting the existing vs proposed new way.
image

@llvmbot
Copy link
Member

llvmbot commented Nov 5, 2025

@llvm/pr-subscribers-lldb

Author: Jacob Lalonde (Jlalond)

Changes

Recently I've been deep diving ELF cores in LLDB, aspiring to move LLDB closer to GDB in capability. One issue I encountered was a system lib losing it's unwind plan when loading the debuginfo. The reason for this was the debuginfo has the eh_frame section stripped and the main executable did not.

The root cause of this was this line in ObjectFileElf

  // For eTypeDebugInfo files, the Symbol Vendor will take care of updating the
  // unified section list.
  if (GetType() != eTypeDebugInfo)
    unified_section_list = *m_sections_up;

This would always be executed because CalculateType can never return an eTypeDebugInfo

ObjectFile::Type ObjectFileELF::CalculateType() {
  switch (m_header.e_type) {
  case llvm::ELF::ET_NONE:
    // 0 - No file type
    return eTypeUnknown;

  case llvm::ELF::ET_REL:
    // 1 - Relocatable file
    return eTypeObjectFile;

  case llvm::ELF::ET_EXEC:
    // 2 - Executable file
    return eTypeExecutable;

  case llvm::ELF::ET_DYN:
    // 3 - Shared object file
    return eTypeSharedLibrary;

  case ET_CORE:
    // 4 - Core file
    return eTypeCoreFile;

  default:
    break;
  }
  return eTypeUnknown;
}

This makes sense as there isn't a explicit sh_type to denote that this file is a debuginfo. After some discussion with @clayborg and @GeorgeHuyubo we settled on joining the exciting unified section list with whatever new sections were being added. Adding new unique section, or overwriting in the case that the unified section list has SHT_NOBITS.

Here is a diagram documenting the existing vs proposed new way.
<img width="1666" height="1093" alt="image" src="https://github.com/user-attachments/assets/73ba9620-c737-439e-9934-ac350d88a3b5" />


Full diff: https://github.com/llvm/llvm-project/pull/166635.diff

6 Files Affected:

  • (modified) lldb/include/lldb/Core/Section.h (+11)
  • (modified) lldb/source/Core/Section.cpp (+28)
  • (modified) lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp (+41-4)
  • (added) lldb/test/API/python_api/unified_section_list/Makefile (+5)
  • (added) lldb/test/API/python_api/unified_section_list/TestModuleUnifiedSectionList.py (+50)
  • (added) lldb/test/API/python_api/unified_section_list/main.cpp (+5)
diff --git a/lldb/include/lldb/Core/Section.h b/lldb/include/lldb/Core/Section.h
index f0f5a0b3499c0..3bde0f3adc12f 100644
--- a/lldb/include/lldb/Core/Section.h
+++ b/lldb/include/lldb/Core/Section.h
@@ -96,6 +96,17 @@ class SectionList {
   /// information.
   uint64_t GetDebugInfoSize() const;
 
+  // Callback to decide of two matching sections should be used in the merged
+  // output.
+  using MergeCallback =
+      std::function<lldb::SectionSP(lldb::SectionSP, lldb::SectionSP)>;
+
+  // Function that merges two different sections into a new output list. All
+  // unique sections will be checked for conflict and resolved using the
+  // supplied merging callback.
+  static std::shared_ptr<SectionList> Merge(SectionList &lhs, SectionList &rhs,
+                                            MergeCallback filter);
+
 protected:
   collection m_sections;
 };
diff --git a/lldb/source/Core/Section.cpp b/lldb/source/Core/Section.cpp
index 02d9d86fe5374..244cdf934c0ec 100644
--- a/lldb/source/Core/Section.cpp
+++ b/lldb/source/Core/Section.cpp
@@ -683,6 +683,34 @@ uint64_t SectionList::GetDebugInfoSize() const {
   return debug_info_size;
 }
 
+std::shared_ptr<SectionList>
+SectionList::Merge(SectionList &lhs, SectionList &rhs, MergeCallback filter) {
+  std::shared_ptr<SectionList> output_sp = std::make_shared<SectionList>();
+
+  // Iterate through all the sections in lhs and see if we have matches in
+  // the rhs list.
+  for (const auto &lhs_section : lhs) {
+    auto rhs_section = rhs.FindSectionByName(lhs_section->GetName());
+    if (rhs_section) {
+      assert(lhs_section->GetName() == rhs_section->GetName());
+      output_sp->AddSection(filter(lhs_section, rhs_section));
+    } else
+      output_sp->AddSection(lhs_section);
+  }
+
+  // Now that we've visited all possible duplicates, we can iterate over
+  // the rhs and take any values not in lhs.
+  for (const auto &rhs_section : rhs) {
+    auto lhs_section = lhs.FindSectionByName(rhs_section->GetName());
+    // Because we already visited everything overlapping between rhs
+    // and lhs, any section not in lhs is unique and can be output.
+    if (!lhs_section)
+      output_sp->AddSection(rhs_section);
+  }
+
+  return output_sp;
+}
+
 namespace llvm {
 namespace json {
 
diff --git a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
index 097c91b623e8f..06815b03d47c5 100644
--- a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
+++ b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
@@ -130,6 +130,42 @@ class ELFRelocation {
 
   RelocUnion reloc;
 };
+
+lldb::SectionSP MergeSections(lldb::SectionSP lhs, lldb::SectionSP rhs) {
+  assert(lhs && rhs);
+
+  // We only take the RHS is the LHS is SHT_NOBITS, which would be
+  // represented as a size of 0. Where we can take the rhs.
+  if (lhs->GetByteSize() == 0)
+    return rhs;
+
+  lldb::ModuleSP lhs_module_parent = lhs->GetModule();
+  lldb::ModuleSP rhs_module_parent = rhs->GetModule();
+  assert(lhs_module_parent && rhs_module_parent);
+
+  // Now that we're taking the lhs, we should do some sanity checks to warn the
+  // user if this debuginfo/dwp looks invalid. However if RHS is SHT_NO_BITS
+  // we want to ignore the size comparison.
+  if (rhs->GetByteSize() > 0 && lhs->GetByteSize() != rhs->GetByteSize())
+    lhs_module_parent->ReportWarning(
+        "Mismatched size for section {0} when merging with {1}, expected: "
+        "{2:x}, "
+        "actual: {3:x}",
+        lhs->GetTypeAsCString(),
+        rhs_module_parent->GetFileSpec().GetPathAsConstString().GetCString(),
+        lhs->GetByteSize(), rhs->GetByteSize());
+
+  if (lhs->GetFileAddress() != rhs->GetFileAddress())
+    lhs_module_parent->ReportWarning(
+        "Mismatch addresses for section {0} when "
+        "merging with {1}, expected: {2:x}, "
+        "actual: {3:x}",
+        lhs->GetTypeAsCString(),
+        rhs_module_parent->GetFileSpec().GetPathAsConstString().GetCString(),
+        lhs->GetByteSize(), rhs->GetByteSize());
+
+  return lhs;
+}
 } // end anonymous namespace
 
 ELFRelocation::ELFRelocation(unsigned type) {
@@ -1967,10 +2003,11 @@ void ObjectFileELF::CreateSections(SectionList &unified_section_list) {
     provider.AddSection(std::move(*InfoOr), std::move(section_sp));
   }
 
-  // For eTypeDebugInfo files, the Symbol Vendor will take care of updating the
-  // unified section list.
-  if (GetType() != eTypeDebugInfo)
-    unified_section_list = *m_sections_up;
+  // Merge the two adding any new sections, and overwriting any existing
+  // sections that are SHT_NOBITS
+  std::shared_ptr<SectionList> merged_section_list =
+      SectionList::Merge(unified_section_list, *m_sections_up, MergeSections);
+  unified_section_list = *merged_section_list;
 
   // If there's a .gnu_debugdata section, we'll try to read the .symtab that's
   // embedded in there and replace the one in the original object file (if any).
diff --git a/lldb/test/API/python_api/unified_section_list/Makefile b/lldb/test/API/python_api/unified_section_list/Makefile
new file mode 100644
index 0000000000000..431e716ab8f69
--- /dev/null
+++ b/lldb/test/API/python_api/unified_section_list/Makefile
@@ -0,0 +1,5 @@
+CXX_SOURCES := main.cpp
+
+SPLIT_DEBUG_SYMBOLS := YES
+
+include Makefile.rules
diff --git a/lldb/test/API/python_api/unified_section_list/TestModuleUnifiedSectionList.py b/lldb/test/API/python_api/unified_section_list/TestModuleUnifiedSectionList.py
new file mode 100644
index 0000000000000..dfec5f630bfb0
--- /dev/null
+++ b/lldb/test/API/python_api/unified_section_list/TestModuleUnifiedSectionList.py
@@ -0,0 +1,50 @@
+"""
+Test Unified Section List merging.
+"""
+
+import os
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+from lldbsuite.test.lldbutil import symbol_type_to_str
+
+
+class ModuleUnifiedSectionList(TestBase):
+    def test_unified_section_list(self):
+        self.build()
+        exe = self.getBuildArtifact("a.out")
+        debug_info = self.getBuildArtifact("a.out.debug")
+        new_dir = os.path.join(os.path.dirname(debug_info), "new_dir")
+        os.mkdir(new_dir)
+        renamed_debug_info = os.path.join(new_dir, "renamed.debug")
+        os.rename(debug_info, renamed_debug_info)
+        target = self.dbg.CreateTarget(exe)
+        self.assertTrue(target, VALID_TARGET)
+        self.assertGreater(target.GetNumModules(), 0)
+
+        main_exe_module = target.GetModuleAtIndex(0)
+        eh_frame = main_exe_module.FindSection(".eh_frame")
+        self.assertTrue(eh_frame.IsValid())
+        self.assertGreater(eh_frame.size, 0)
+
+        # Should be stripped in main executable.
+        debug_info_section = main_exe_module.FindSection(".debug_info")
+        self.assertFalse(debug_info_section.IsValid())
+
+        ci = self.dbg.GetCommandInterpreter()
+        res = lldb.SBCommandReturnObject()
+        ci.HandleCommand(f"target symbols add {renamed_debug_info}", res)
+        self.assertTrue(res.Succeeded())
+
+        # Should be stripped in .debuginfo but be present in main executable.
+        main_exe_module = target.GetModuleAtIndex(0)
+        eh_frame = main_exe_module.FindSection(".eh_frame")
+        self.assertTrue(eh_frame.IsValid())
+        self.assertGreater(eh_frame.size, 0)
+
+        # Should be unified and both sections should have contents.
+        debug_info_section = main_exe_module.FindSection(".debug_info")
+        self.assertTrue(debug_info_section.IsValid())
+        self.assertGreater(debug_info_section.file_size, 0)
diff --git a/lldb/test/API/python_api/unified_section_list/main.cpp b/lldb/test/API/python_api/unified_section_list/main.cpp
new file mode 100644
index 0000000000000..943123b4059db
--- /dev/null
+++ b/lldb/test/API/python_api/unified_section_list/main.cpp
@@ -0,0 +1,5 @@
+#include <stdio.h>
+
+int main() {
+  printf("Hello World\n");
+}

@github-actions
Copy link

github-actions bot commented Nov 5, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@github-actions
Copy link

github-actions bot commented Nov 5, 2025

✅ With the latest revision this PR passed the Python code formatter.

@Jlalond Jlalond force-pushed the fix-debuginfo-overwriting-elf branch from 16af1a7 to bf0c366 Compare November 5, 2025 20:55
Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inlined comments.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall logic makes sense to me but others have the expertise to approve.

Arm 32-bit is usually sensitive to these changes because it needs to know what code regions are Thumb vs. Arm code. IIRC this uses a combination of mapping symbols and function symbol addresses.

I haven't looked too closely at this change but it seems likely that it would be at worst neutral and quite likely an improvement for Arm.

Our bot is good at catching this stuff so if it fails, I'll revert this and figure out the problem for you.

@Jlalond Jlalond force-pushed the fix-debuginfo-overwriting-elf branch from d2fd8d4 to 9692c6a Compare November 6, 2025 23:01
@Jlalond Jlalond merged commit 32ebf63 into llvm:main Nov 6, 2025
8 of 9 checks passed
@Jlalond Jlalond deleted the fix-debuginfo-overwriting-elf branch November 7, 2025 00:02
vinay-deshmukh pushed a commit to vinay-deshmukh/llvm-project that referenced this pull request Nov 8, 2025
…#166635)

Recently I've been deep diving ELF cores in LLDB, aspiring to move LLDB
closer to GDB in capability. One issue I encountered was a system lib
losing it's unwind plan when loading the debuginfo. The reason for this
was the debuginfo has the eh_frame section stripped and the main
executable did not.

The root cause of this was this line in
[ObjectFileElf](https://github.com/llvm/llvm-project/blob/163933e9e7099f352ff8df1973f9a9c3d7def6c5/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp#L1972)
```
  // For eTypeDebugInfo files, the Symbol Vendor will take care of updating the
  // unified section list.
  if (GetType() != eTypeDebugInfo)
    unified_section_list = *m_sections_up;
```

This would always be executed because CalculateType can never return an
eTypeDebugInfo
```
ObjectFile::Type ObjectFileELF::CalculateType() {
  switch (m_header.e_type) {
  case llvm::ELF::ET_NONE:
    // 0 - No file type
    return eTypeUnknown;

  case llvm::ELF::ET_REL:
    // 1 - Relocatable file
    return eTypeObjectFile;

  case llvm::ELF::ET_EXEC:
    // 2 - Executable file
    return eTypeExecutable;

  case llvm::ELF::ET_DYN:
    // 3 - Shared object file
    return eTypeSharedLibrary;

  case ET_CORE:
    // 4 - Core file
    return eTypeCoreFile;

  default:
    break;
  }
  return eTypeUnknown;
}
```

This makes sense as there isn't a explicit sh_type to denote that this
file is a debuginfo. After some discussion with @clayborg and
@GeorgeHuyubo we settled on joining the exciting unified section list
with whatever new sections were being added. Adding each new unique
section, or taking the section with the maximum file size. We picked
this strategy to pick the section with the most information. In most
scenarios, LHS should be SHT_NOBITS and RHS would be SHT_PROGBITS.

Here is a diagram documenting the existing vs proposed new way.
<img width="1666" height="1093" alt="image"
src="https://github.com/user-attachments/assets/73ba9620-c737-439e-9934-ac350d88a3b5"
/>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants