Fix GetDIE is outside of its CU error from .debug_names #157574
base: main
@llvm/pr-subscribers-lldb
Author: jeffreytan81
Changes: A user reports that, with .debug_names and split DWARF (dwo) enabled, they keep getting large numbers of "GetDIE for DIE XXXX is outside of its CU" error messages. This PR fixes the issue by verifying that what GetNonSkeletonUnit() returned is indeed a non-skeleton unit. The newly added test case fails before the change and passes with it.
Full diff: https://github.com/llvm/llvm-project/pull/157574.diff
2 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index fa5baf1a0eeb1..08089a4e5ad39 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -131,8 +131,12 @@ DebugNamesDWARFIndex::GetNonSkeletonUnit(const DebugNames::Entry &entry) const {
     unit_offset = entry.getLocalTUOffset();
   if (unit_offset) {
     if (DWARFUnit *cu = m_debug_info.GetUnitAtOffset(DIERef::Section::DebugInfo,
-                                                     *unit_offset))
-      return &cu->GetNonSkeletonUnit();
+                                                     *unit_offset)) {
+      DWARFUnit &ret = cu->GetNonSkeletonUnit();
+      if (ret.IsSkeletonUnit())
+        return nullptr;
+      return &ret;
+    }
   }
   return nullptr;
 }
diff --git a/lldb/test/Shell/SymbolFile/DWARF/dwo-miss-getdie-ouside-cu-error.c b/lldb/test/Shell/SymbolFile/DWARF/dwo-miss-getdie-ouside-cu-error.c
new file mode 100644
index 0000000000000..bbe3dcebbe9ad
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/dwo-miss-getdie-ouside-cu-error.c
@@ -0,0 +1,29 @@
+/// Check that LLDB does not emit "GetDIE for DIE {{0x[0-9a-f]+}} is outside of its CU"
+/// error message when user is searching for a matching symbol from .debug_names
+/// and fail to locate the corresponding .dwo file.
+
+/// -gsplit-dwarf is supported only on Linux.
+// REQUIRES: system-linux
+
+// RUN: echo "Temp directory: %t.compdir"
+// RUN: rm -rf %t.compdir/
+// RUN: mkdir -p %t.compdir/a/b/
+// RUN: cp %s %t.compdir/a/b/main.c
+// RUN: cd %t.compdir/a/
+/// The produced DWO is named /b/main-main.dwo, with dwarf5 .debug_names
+// RUN: %clang_host -g -gsplit-dwarf -gpubnames -gdwarf-5 -fdebug-prefix-map=%t.compdir=. b/main.c -o b/main
+// RUN: cd ../..
+/// Move the DWO file away from the expected location.
+// RUN: mv %t.compdir/a/b/*.dwo %t.compdir/
+/// LLDB won't find the DWO next to the binary or by adding the relative path
+/// to any of the search paths. So it should find the DWO file at
+/// %t.compdir/main-main.dwo.
+// RUN: %lldb --no-lldbinit %t.compdir/a/b/main \
+// RUN: -o "b main" --batch 2>&1 | FileCheck %s
+
+// CHECK: warning: {{.*}}main unable to locate separate debug file (dwo, dwp). Debugging will be degraded.
+// CHECK-NOT: main GetDIE for DIE {{0x[0-9a-f]+}} is outside of its CU {{0x[0-9a-f]+}}
+
+int num = 5;
+
+int main(void) { return 0; }
Reviewer (on the "-gsplit-dwarf is supported only on Linux" comment): I might be wrong, but isn't it also supported on Windows?
Author: I copied this from dwo-missing-error.test and trusted it:
https://github.com/llvm/llvm-project/blob/main/lldb/test/Shell/SymbolFile/DWARF/dwo-missing-error.test#L5-L6
I do not have a Windows machine to test on, so it feels safer to limit the test to Linux. Let me know if you find otherwise and I can update the comment.
Reviewer (on the cd steps in the RUN lines): This might be more readable if we didn't cd in and out of directories here.
Author: Good point, updated.
Reviewer: It seems surprising that the GetNonSkeletonUnit() call can actually return a skeleton unit. Maybe a better approach here would be to model the fact that we can fail to find the non-skeleton unit by returning an Expected:
llvm::Expected<DWARFUnit&> DWARFUnit::GetNonSkeletonUnit()
I did a quick search and see other examples where it looks like we might hit this same bug.
Author: I agree with this in general. Actually, that is the original approach I suggested while discussing with Greg.
Unfortunately, many existing callers and code paths of DWARFUnit::GetNonSkeletonUnit expect the default behavior of falling back to the skeleton unit when the dwo files cannot be found. Changing the semantics of DWARFUnit::GetNonSkeletonUnit requires auditing all of those callers and code paths to make sure the new behavior is what they expect, which is a much bigger task than I thought. Yesterday I tried changing all callers of DWARFUnit::GetNonSkeletonUnit to use the new API and semantics, and several tests failed. Some of them involve Apple debug names, the -gmodules flag, and PCH modules containing a CU with only a dwo_id and no dwo_name (resulting in a dwo error), which I do not feel comfortable or justified fixing here.
Overall, I feel that fixing this known code path is safer (it does not fail any tests) and has a scope that is easier to reason about.
Reviewer: Thanks for trying out that approach. Do you have a WIP patch you can push to a branch somewhere? It might be something we want to tackle later.
Your fix seems targeted and reasonable given the amount of code relying on the existing behavior.
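For readers following the API discussion above, here is a minimal sketch of what a failure-aware lookup could look like next to an explicit opt-in fallback. It is not LLDB code: MockUnit, UnitLookup, FindNonSkeletonUnit and NonSkeletonUnitOrSelf are hypothetical names invented for illustration, and a plain struct stands in for llvm::Expected so the snippet compiles without LLVM.

```cpp
#include <string>

// Hypothetical stand-in for a DWARF unit; not LLDB's real DWARFUnit class.
struct MockUnit {
  bool is_skeleton = false;      // True for a skeleton CU that points at a .dwo.
  MockUnit *dwo_unit = nullptr;  // Non-null only when the .dwo was loaded.
};

// Hypothetical result type playing the role of llvm::Expected<DWARFUnit &>:
// either a usable unit or an error message.
struct UnitLookup {
  MockUnit *unit = nullptr;
  std::string error;
  explicit operator bool() const { return unit != nullptr; }
};

// Failure-aware variant: never hands back a skeleton unit.
UnitLookup FindNonSkeletonUnit(MockUnit &cu) {
  UnitLookup result;
  if (!cu.is_skeleton) {
    result.unit = &cu;  // Ordinary (non-split) unit: it is its own answer.
  } else if (cu.dwo_unit) {
    result.unit = cu.dwo_unit;  // Split unit whose .dwo was found.
  } else {
    result.error = "unable to locate .dwo for skeleton unit";
  }
  return result;
}

// Callers that rely on the historical fallback opt in explicitly.
MockUnit &NonSkeletonUnitOrSelf(MockUnit &cu) {
  UnitLookup result = FindNonSkeletonUnit(cu);
  return result ? *result.unit : cu;
}
```

Splitting the two behaviors this way would let each existing caller choose the fallback deliberately, which is the audit the author describes as the larger task.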
Branch updated: 4de6926 to 01ca83a
Author: Updated the test to remove unnecessary steps and make it more readable.
/// -gsplit-dwarf is supported only on Linux.
// REQUIRES: system-linux

// RUN: %clang_host -g -gsplit-dwarf -gpubnames -gdwarf-5 %s -o main
Reviewer: Thanks, the test looks much easier to understand now.
I think there is a problem with where it writes its files, though. The lit tests do not automatically get a unique build directory for their outputs; they use a shared directory for all the tests in that directory. That means writing and deleting files can interfere with other tests in the directory.
We can use %t to create a unique temporary directory for the test:
// RUN: mkdir -p %t.dir
// RUN: %clang_host -g -gsplit-dwarf -gpubnames -gdwarf-5 %s -o %t.dir/main
// RUN: rm %t.dir/*.dwo
// RUN: %lldb --no-lldbinit %t.dir/main \
// RUN: -o "b main" --batch 2>&1 | FileCheck %s
A user reports that, with .debug_names and split DWARF (dwo) enabled, they keep getting large numbers of "GetDIE for DIE XXXX is outside of its CU" error messages.

The root cause was a build configuration issue that failed to materialize the underlying dwo files. However, even when we fail to locate the dwo files, lldb should not keep emitting these errors. Investigation shows that when the .debug_names lookup fails to find the underlying dwo files, the GetNonSkeletonUnit() API falls back to the skeleton unit, and lldb then tries to get the DIE from the skeleton unit at the entry's offset. This fails because a .debug_names entry offset is relative to the non-skeleton unit in the underlying dwo file, not to its skeleton unit, which causes the error above to be reported.

This PR fixes the issue by verifying that what GetNonSkeletonUnit() returned is indeed a non-skeleton unit, and otherwise returning null to stop further probing of the DIE offset.

The newly added test case fails before the change and passes with it.
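
The failure mode and the guard described above can be modeled in a few lines. This is a sketch under stated assumptions, not LLDB's implementation: MockUnit, its offset fields, ContainsDIEOffset, LookupUnitBefore and LookupUnitAfter are hypothetical names, and only the GetNonSkeletonUnit()/IsSkeletonUnit() pair mirrors calls that appear in the diff.

```cpp
#include <cstdint>

// Hypothetical stand-in for a DWARF unit; not LLDB's real DWARFUnit class.
struct MockUnit {
  bool is_skeleton = false;       // True for a skeleton CU in the main binary.
  MockUnit *dwo_unit = nullptr;   // Non-null only when the .dwo was loaded.
  uint64_t first_die_offset = 0;  // DIE offsets valid here: [first, end).
  uint64_t end_offset = 0;

  // Models the fallback described above: use the dwo unit when it exists,
  // otherwise return this unit (which may be a skeleton).
  MockUnit &GetNonSkeletonUnit() { return dwo_unit ? *dwo_unit : *this; }
  bool IsSkeletonUnit() const { return is_skeleton; }
  bool ContainsDIEOffset(uint64_t offset) const {
    return offset >= first_die_offset && offset < end_offset;
  }
};

// Before the change: trust whatever the fallback returns. With a missing
// .dwo this is the skeleton unit, so a dwo-relative DIE offset lands outside.
MockUnit *LookupUnitBefore(MockUnit &cu) { return &cu.GetNonSkeletonUnit(); }

// After the change: refuse to hand back a skeleton unit, so the caller
// stops before probing a DIE offset that cannot be valid.
MockUnit *LookupUnitAfter(MockUnit &cu) {
  MockUnit &ret = cu.GetNonSkeletonUnit();
  if (ret.IsSkeletonUnit())
    return nullptr;
  return &ret;
}
```

With a skeleton unit whose dwo_unit is null, LookupUnitBefore returns the skeleton itself, while LookupUnitAfter returns nullptr; once dwo_unit is set, both return the dwo unit.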