-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB #105524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixes: llvm#104695 This patch adds the is_stmt flag to line table entries for the first instruction with a non-0 line location in each basic block, to ensure that it will be used for stepping even if the last instruction in the previous basic block had the same line number; this is important for cases where the new BB is reachable from BBs other than the preceding block.
|
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-debuginfo Author: Stephen Tozer (SLTozer) ChangesFixes: #104695 This patch adds the is_stmt flag to line table entries for the first instruction with a non-0 line location in each basic block, to ensure that it will be used for stepping even if the last instruction in the previous basic block had the same line number; this is important for cases where the new BB is reachable from BBs other than the preceding block. Full diff: https://github.com/llvm/llvm-project/pull/105524.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index f111b4aea06f1b..aa11200731e1b3 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2119,9 +2119,12 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
PrologEndLoc = DebugLoc();
}
// If the line changed, we call that a new statement; unless we went to
- // line 0 and came back, in which case it is not a new statement.
+ // line 0 and came back, in which case it is not a new statement. We also
+ // mark is_stmt for the first non-0 line in each BB, in case a predecessor BB
+ // ends with a different line.
unsigned OldLine = PrevInstLoc ? PrevInstLoc.getLine() : LastAsmLine;
- if (DL.getLine() && DL.getLine() != OldLine)
+ if (DL.getLine() &&
+ (DL.getLine() != OldLine || PrevInstBB != MI->getParent()))
Flags |= DWARF2_FLAG_IS_STMT;
const MDNode *Scope = DL.getScope();
diff --git a/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll b/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll
new file mode 100644
index 00000000000000..21efceab38eca0
--- /dev/null
+++ b/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll
@@ -0,0 +1,38 @@
+;; Checks that when an instruction at the start of a BasicBlock has the same
+;; DebugLoc as the instruction at the end of the previous BasicBlock, we add
+;; is_stmt to the new line, to ensure that we still step on it if we arrive from
+;; a BasicBlock other than the immediately preceding one.
+
+; RUN: %llc_dwarf -O0 -filetype=obj < %s | llvm-dwarfdump --debug-line - | FileCheck %s
+
+; CHECK: {{0x[0-9a-f]+}} 13 5 {{.+}} is_stmt
+; CHECK-NEXT: {{0x[0-9a-f]+}} 13 25 {{.+}} is_stmt
+
+define void @_Z1fi(i1 %cond) !dbg !21 {
+entry:
+ br i1 %cond, label %if.then2, label %if.else4
+
+if.then2: ; preds = %entry
+ br label %if.end8, !dbg !24
+
+if.else4: ; preds = %entry
+ %0 = load i32, ptr null, align 4, !dbg !28
+ %call5 = call i1 null(i32 %0)
+ ret void
+
+if.end8: ; preds = %if.then2
+ ret void
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!20}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 20.0.0", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.cpp", directory: "/home/gbtozers/dev/upstream-llvm")
+!20 = !{i32 2, !"Debug Info Version", i32 3}
+!21 = distinct !DISubprogram(name: "f", linkageName: "_Z1fi", scope: !1, file: !1, line: 7, type: !22, scopeLine: 7, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0)
+!22 = distinct !DISubroutineType(types: !23)
+!23 = !{null}
+!24 = !DILocation(line: 13, column: 5, scope: !25)
+!25 = distinct !DILexicalBlock(scope: !21, file: !1, line: 11, column: 27)
+!28 = !DILocation(line: 13, column: 25, scope: !25)
|
|
This does result in some is_stmts being added that are unnecessary, i.e. cases where the last instruction in the current BB's sole predecessor has the same line number as the first instruction in the current BB. We could potentially include that in the analysis, but I think there's no way to do so without doing a more complex (though probably not too expensive) analysis in DwarfDebug; we also wouldn't be able to account for indirect branches. I think in any case this patch is an improvement if we're considering such cases, as not only might the current BB not have the previous BB as its sole predecessor, but the previous BB might not even be a predecessor - making the is_stmt-omission completely incorrect. |
pogo59
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this. I should have considered BB transitions when I did the line-0 accommodation in the first place. See question about the new test.
| ; RUN: %llc_dwarf -O0 -filetype=obj < %s | llvm-dwarfdump --debug-line - | FileCheck %s | ||
|
|
||
| ; CHECK: {{0x[0-9a-f]+}} 13 5 {{.+}} is_stmt | ||
| ; CHECK-NEXT: {{0x[0-9a-f]+}} 13 25 {{.+}} is_stmt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ISTM the first check is for block if.then2, the second is for if.else4? Should there also be one for if.end8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this test specifically, as only these two instructions have non-0 line numbers (excluding 0x0, which has the function header).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, !dbg isn't inherited the way it is for asm.
But, you could use the same !dbg for both, showing that it really is the BB that triggers it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, trying to make this work has revealed another an issue - when the lines are actually identical, they will be merged into a single entry (whether this patch is in effect or not), which any jump to the branch would land in the middle of. I'm not sure whether that would result in a step in the debugger or not; it prevents the test from working, at the very least.
pogo59
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a test suggestion.
| ; RUN: %llc_dwarf -O0 -filetype=obj < %s | llvm-dwarfdump --debug-line - | FileCheck %s | ||
|
|
||
| ; CHECK: {{0x[0-9a-f]+}} 13 5 {{.+}} is_stmt | ||
| ; CHECK-NEXT: {{0x[0-9a-f]+}} 13 25 {{.+}} is_stmt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, !dbg isn't inherited the way it is for asm.
But, you could use the same !dbg for both, showing that it really is the BB that triggers it.
jryans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a very sensible improvement, thanks for working on this! 😄
|
Had a free moment so updated this - the previous version (as noted in my response to @pogo59 above) would still merge adjacent DLs across BB boundaries if they were exactly identical, rather than differing in column number; I've changed this to now only merge adjacent DLs within a BB, which strictly reduces the cases where we'll merge identical DLs. |
jryans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a good tweak, looks good overall to me.
pogo59
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Smooth. Still LGTM.
|
Updated a couple more tests whose outputs have changed as of the latest functional change - all of these look "correct" to me, in that they're adding is_stmt to instructions at the start of a basic block, or in |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/4660 Here is the relevant piece of the build log for the reference |
|
I added the test to the |
…n in BB (llvm#105524)" Reverted (along with the NFC followup fix) due to failures on buildbots. This reverts commit 3ef37e2, and commit 616f7d3.
…n in BB (#105524)" Reverted (along with the NFC followup fix) due to buildbot failure: https://lab.llvm.org/buildbot/#/builders/160/builds/4142 This reverts commit 3ef37e2, and commit 616f7d3.
|
Hm, another buildbot error - this time |
|
We're seeing a pretty significant size regression from this change in self-host builds of clang and some other internal binaries we track. For an internal debug build (with Split DWARF, compressed debug info sections in object files but uncompressed in the linked binary) we saw a total 3.6% increase in the size of the clang binary, a 12.7% increase in the size of the .debug_line section in object files, and a 18.3% increase in the size of .debug_line in the linked executable. Might be worth reverting this and reconsidering whether the growth is acceptable or whether there's a more targeted fix that can be made? |
I have another patch up now that tries to be more cautious with its application of is_stmt: #108251 It currently needs a fix, and confirmation that it has suitable performance, but it should result in fewer is_stmts being applied. With that said, if it doesn't bring the numbers down that much, there is a reasonable question about correctness vs size - it might not be feasible to compute the minimum set of is_stmts that don't result in missing steps, so there might be either a tradeoff to make, or a flag to add. |
|
Could you put this behind a flag or revert this for now (to see if you can find a solution that's sufficiently cautious to lower the size cost to something more acceptable) - I think the growth is a bit big to take by default? (perhaps other people have other opinions, though, in which case I'm open to that/we can take the hit if that's the prevailing opinion) |
I can revert it, at least until the next patch is ready to land. I've tried to investigate this locally to see if the situation is improved by the next patch along, but my local build isn't showing anywhere close to the same increase - I've got .debug_line size increasing by 1.3% as a result of this commit. When you say "Debug", does that mean the CMake "Debug" option, or a release build with debug info enabled? Edit: RelWithDebInfo shows an even smaller difference, which lines up with the compile time tracker results (showing a larger relative increase with optimizations enabled). |
…on in BB (#105524)" Reverted due to large .debug_line size regressions for some configurations; work currently in place to improve the output of this behaviour in PR #108251. This patch also modifies two tests that were created or modified after the original commit landed and are affected by the revert: llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll llvm/test/DebugInfo/X86/empty-line-info.ll This reverts commit 5fef40c.
…on in BB (llvm#105524)" Reverted due to large .debug_line size regressions for some configurations; work currently in place to improve the output of this behaviour in PR llvm#108251. This patch also modifies two tests that were created or modified after the original commit landed and are affected by the revert: llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll llvm/test/DebugInfo/X86/empty-line-info.ll This reverts commit 5fef40c.
…on in BB (llvm#105524)" Reverted due to large .debug_line size regressions for some configurations; work currently in place to improve the output of this behaviour in PR llvm#108251. This patch also modifies two tests that were created or modified after the original commit landed and are affected by the revert: llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll llvm/test/DebugInfo/X86/empty-line-info.ll This reverts commit 5fef40c.
…108251) This patch follows on from the changes made in #105524, by adding an additional heuristic that prevents us from applying the start-of-MBB is_stmt flag when we can see that, for all direct branches to the MBB, the last line stepped on before the branch is the same as the first line of the MBB. This is mainly to prevent certain pathological cases, such as macros that expand to multiple basic blocks that all have the same source location, from giving us repeated steps on the same line. This approach is not comprehensive, since it relies on analyzeBranch to read edges, but the default fallback of applying is_stmt may lead only to useless steps in some cases, rather than skipping useful steps altogether.
Fixes: #104695
This patch adds the is_stmt flag to line table entries for the first instruction with a non-0 line location in each basic block, to ensure that it will be used for stepping even if the last instruction in the previous basic block had the same line number; this is important for cases where the new BB is reachable from BBs other than the preceding block.