-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[LLDB] Fix debuginfo ELF files overwriting Unified Section List #166635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-lldb Author: Jacob Lalonde (Jlalond) ChangesRecently I've been deep diving ELF cores in LLDB, aspiring to move LLDB closer to GDB in capability. One issue I encountered was a system lib losing it's unwind plan when loading the debuginfo. The reason for this was the debuginfo has the eh_frame section stripped and the main executable did not. The root cause of this was this line in ObjectFileElf This would always be executed because CalculateType can never return an eTypeDebugInfo This makes sense as there isn't a explicit sh_type to denote that this file is a debuginfo. After some discussion with @clayborg and @GeorgeHuyubo we settled on joining the exciting unified section list with whatever new sections were being added. Adding new unique section, or overwriting in the case that the unified section list has SHT_NOBITS. Here is a diagram documenting the existing vs proposed new way. Full diff: https://github.com/llvm/llvm-project/pull/166635.diff 6 Files Affected:
diff --git a/lldb/include/lldb/Core/Section.h b/lldb/include/lldb/Core/Section.h
index f0f5a0b3499c0..3bde0f3adc12f 100644
--- a/lldb/include/lldb/Core/Section.h
+++ b/lldb/include/lldb/Core/Section.h
@@ -96,6 +96,17 @@ class SectionList {
/// information.
uint64_t GetDebugInfoSize() const;
+ // Callback to decide of two matching sections should be used in the merged
+ // output.
+ using MergeCallback =
+ std::function<lldb::SectionSP(lldb::SectionSP, lldb::SectionSP)>;
+
+ // Function that merges two different sections into a new output list. All
+ // unique sections will be checked for conflict and resolved using the
+ // supplied merging callback.
+ static std::shared_ptr<SectionList> Merge(SectionList &lhs, SectionList &rhs,
+ MergeCallback filter);
+
protected:
collection m_sections;
};
diff --git a/lldb/source/Core/Section.cpp b/lldb/source/Core/Section.cpp
index 02d9d86fe5374..244cdf934c0ec 100644
--- a/lldb/source/Core/Section.cpp
+++ b/lldb/source/Core/Section.cpp
@@ -683,6 +683,34 @@ uint64_t SectionList::GetDebugInfoSize() const {
return debug_info_size;
}
+std::shared_ptr<SectionList>
+SectionList::Merge(SectionList &lhs, SectionList &rhs, MergeCallback filter) {
+ std::shared_ptr<SectionList> output_sp = std::make_shared<SectionList>();
+
+ // Iterate through all the sections in lhs and see if we have matches in
+ // the rhs list.
+ for (const auto &lhs_section : lhs) {
+ auto rhs_section = rhs.FindSectionByName(lhs_section->GetName());
+ if (rhs_section) {
+ assert(lhs_section->GetName() == rhs_section->GetName());
+ output_sp->AddSection(filter(lhs_section, rhs_section));
+ } else
+ output_sp->AddSection(lhs_section);
+ }
+
+ // Now that we've visited all possible duplicates, we can iterate over
+ // the rhs and take any values not in lhs.
+ for (const auto &rhs_section : rhs) {
+ auto lhs_section = lhs.FindSectionByName(rhs_section->GetName());
+ // Because we already visited everything overlapping between rhs
+ // and lhs, any section not in lhs is unique and can be output.
+ if (!lhs_section)
+ output_sp->AddSection(rhs_section);
+ }
+
+ return output_sp;
+}
+
namespace llvm {
namespace json {
diff --git a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
index 097c91b623e8f..06815b03d47c5 100644
--- a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
+++ b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
@@ -130,6 +130,42 @@ class ELFRelocation {
RelocUnion reloc;
};
+
+lldb::SectionSP MergeSections(lldb::SectionSP lhs, lldb::SectionSP rhs) {
+ assert(lhs && rhs);
+
+ // We only take the RHS is the LHS is SHT_NOBITS, which would be
+ // represented as a size of 0. Where we can take the rhs.
+ if (lhs->GetByteSize() == 0)
+ return rhs;
+
+ lldb::ModuleSP lhs_module_parent = lhs->GetModule();
+ lldb::ModuleSP rhs_module_parent = rhs->GetModule();
+ assert(lhs_module_parent && rhs_module_parent);
+
+ // Now that we're taking the lhs, we should do some sanity checks to warn the
+ // user if this debuginfo/dwp looks invalid. However if RHS is SHT_NO_BITS
+ // we want to ignore the size comparison.
+ if (rhs->GetByteSize() > 0 && lhs->GetByteSize() != rhs->GetByteSize())
+ lhs_module_parent->ReportWarning(
+ "Mismatched size for section {0} when merging with {1}, expected: "
+ "{2:x}, "
+ "actual: {3:x}",
+ lhs->GetTypeAsCString(),
+ rhs_module_parent->GetFileSpec().GetPathAsConstString().GetCString(),
+ lhs->GetByteSize(), rhs->GetByteSize());
+
+ if (lhs->GetFileAddress() != rhs->GetFileAddress())
+ lhs_module_parent->ReportWarning(
+ "Mismatch addresses for section {0} when "
+ "merging with {1}, expected: {2:x}, "
+ "actual: {3:x}",
+ lhs->GetTypeAsCString(),
+ rhs_module_parent->GetFileSpec().GetPathAsConstString().GetCString(),
+ lhs->GetByteSize(), rhs->GetByteSize());
+
+ return lhs;
+}
} // end anonymous namespace
ELFRelocation::ELFRelocation(unsigned type) {
@@ -1967,10 +2003,11 @@ void ObjectFileELF::CreateSections(SectionList &unified_section_list) {
provider.AddSection(std::move(*InfoOr), std::move(section_sp));
}
- // For eTypeDebugInfo files, the Symbol Vendor will take care of updating the
- // unified section list.
- if (GetType() != eTypeDebugInfo)
- unified_section_list = *m_sections_up;
+ // Merge the two adding any new sections, and overwriting any existing
+ // sections that are SHT_NOBITS
+ std::shared_ptr<SectionList> merged_section_list =
+ SectionList::Merge(unified_section_list, *m_sections_up, MergeSections);
+ unified_section_list = *merged_section_list;
// If there's a .gnu_debugdata section, we'll try to read the .symtab that's
// embedded in there and replace the one in the original object file (if any).
diff --git a/lldb/test/API/python_api/unified_section_list/Makefile b/lldb/test/API/python_api/unified_section_list/Makefile
new file mode 100644
index 0000000000000..431e716ab8f69
--- /dev/null
+++ b/lldb/test/API/python_api/unified_section_list/Makefile
@@ -0,0 +1,5 @@
+CXX_SOURCES := main.cpp
+
+SPLIT_DEBUG_SYMBOLS := YES
+
+include Makefile.rules
diff --git a/lldb/test/API/python_api/unified_section_list/TestModuleUnifiedSectionList.py b/lldb/test/API/python_api/unified_section_list/TestModuleUnifiedSectionList.py
new file mode 100644
index 0000000000000..dfec5f630bfb0
--- /dev/null
+++ b/lldb/test/API/python_api/unified_section_list/TestModuleUnifiedSectionList.py
@@ -0,0 +1,50 @@
+"""
+Test Unified Section List merging.
+"""
+
+import os
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+from lldbsuite.test.lldbutil import symbol_type_to_str
+
+
+class ModuleUnifiedSectionList(TestBase):
+ def test_unified_section_list(self):
+ self.build()
+ exe = self.getBuildArtifact("a.out")
+ debug_info = self.getBuildArtifact("a.out.debug")
+ new_dir = os.path.join(os.path.dirname(debug_info), "new_dir")
+ os.mkdir(new_dir)
+ renamed_debug_info = os.path.join(new_dir, "renamed.debug")
+ os.rename(debug_info, renamed_debug_info)
+ target = self.dbg.CreateTarget(exe)
+ self.assertTrue(target, VALID_TARGET)
+ self.assertGreater(target.GetNumModules(), 0)
+
+ main_exe_module = target.GetModuleAtIndex(0)
+ eh_frame = main_exe_module.FindSection(".eh_frame")
+ self.assertTrue(eh_frame.IsValid())
+ self.assertGreater(eh_frame.size, 0)
+
+ # Should be stripped in main executable.
+ debug_info_section = main_exe_module.FindSection(".debug_info")
+ self.assertFalse(debug_info_section.IsValid())
+
+ ci = self.dbg.GetCommandInterpreter()
+ res = lldb.SBCommandReturnObject()
+ ci.HandleCommand(f"target symbols add {renamed_debug_info}", res)
+ self.assertTrue(res.Succeeded())
+
+ # Should be stripped in .debuginfo but be present in main executable.
+ main_exe_module = target.GetModuleAtIndex(0)
+ eh_frame = main_exe_module.FindSection(".eh_frame")
+ self.assertTrue(eh_frame.IsValid())
+ self.assertGreater(eh_frame.size, 0)
+
+ # Should be unified and both sections should have contents.
+ debug_info_section = main_exe_module.FindSection(".debug_info")
+ self.assertTrue(debug_info_section.IsValid())
+ self.assertGreater(debug_info_section.file_size, 0)
diff --git a/lldb/test/API/python_api/unified_section_list/main.cpp b/lldb/test/API/python_api/unified_section_list/main.cpp
new file mode 100644
index 0000000000000..943123b4059db
--- /dev/null
+++ b/lldb/test/API/python_api/unified_section_list/main.cpp
@@ -0,0 +1,5 @@
+#include <stdio.h>
+
+int main() {
+ printf("Hello World\n");
+}
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
|
✅ With the latest revision this PR passed the Python code formatter. |
16af1a7 to
bf0c366
Compare
clayborg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See inlined comments.
DavidSpickett
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall logic makes sense to me but others have the expertise to approve.
Arm 32-bit is usually sensitive to these changes because it needs to know what code regions are Thumb vs. Arm code. IIRC this uses a combination of mapping symbols and function symbol addresses.
I haven't looked too closely at this change but it seems likely that it would be at worst neutral and quite likely an improvement for Arm.
Our bot is good at catching this stuff so if it fails, I'll revert this and figure out the problem for you.
… yamlized elf files
…mpare against instead of needing to inspect files.
d2fd8d4 to
9692c6a
Compare
…#166635) Recently I've been deep diving ELF cores in LLDB, aspiring to move LLDB closer to GDB in capability. One issue I encountered was a system lib losing it's unwind plan when loading the debuginfo. The reason for this was the debuginfo has the eh_frame section stripped and the main executable did not. The root cause of this was this line in [ObjectFileElf](https://github.com/llvm/llvm-project/blob/163933e9e7099f352ff8df1973f9a9c3d7def6c5/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp#L1972) ``` // For eTypeDebugInfo files, the Symbol Vendor will take care of updating the // unified section list. if (GetType() != eTypeDebugInfo) unified_section_list = *m_sections_up; ``` This would always be executed because CalculateType can never return an eTypeDebugInfo ``` ObjectFile::Type ObjectFileELF::CalculateType() { switch (m_header.e_type) { case llvm::ELF::ET_NONE: // 0 - No file type return eTypeUnknown; case llvm::ELF::ET_REL: // 1 - Relocatable file return eTypeObjectFile; case llvm::ELF::ET_EXEC: // 2 - Executable file return eTypeExecutable; case llvm::ELF::ET_DYN: // 3 - Shared object file return eTypeSharedLibrary; case ET_CORE: // 4 - Core file return eTypeCoreFile; default: break; } return eTypeUnknown; } ``` This makes sense as there isn't a explicit sh_type to denote that this file is a debuginfo. After some discussion with @clayborg and @GeorgeHuyubo we settled on joining the exciting unified section list with whatever new sections were being added. Adding each new unique section, or taking the section with the maximum file size. We picked this strategy to pick the section with the most information. In most scenarios, LHS should be SHT_NOBITS and RHS would be SHT_PROGBITS. Here is a diagram documenting the existing vs proposed new way. <img width="1666" height="1093" alt="image" src="https://github.com/user-attachments/assets/73ba9620-c737-439e-9934-ac350d88a3b5" />
Recently I've been deep diving ELF cores in LLDB, aspiring to move LLDB closer to GDB in capability. One issue I encountered was a system lib losing it's unwind plan when loading the debuginfo. The reason for this was the debuginfo has the eh_frame section stripped and the main executable did not.
The root cause of this was this line in ObjectFileElf
This would always be executed because CalculateType can never return an eTypeDebugInfo
This makes sense as there isn't a explicit sh_type to denote that this file is a debuginfo. After some discussion with @clayborg and @GeorgeHuyubo we settled on joining the exciting unified section list with whatever new sections were being added. Adding each new unique section, or taking the section with the maximum file size. We picked this strategy to pick the section with the most information. In most scenarios, LHS should be SHT_NOBITS and RHS would be SHT_PROGBITS.
Here is a diagram documenting the existing vs proposed new way.
