Skip to content

Conversation

@SLTozer
Copy link
Contributor

@SLTozer SLTozer commented Aug 21, 2024

Fixes: #104695

This patch adds the is_stmt flag to line table entries for the first instruction with a non-0 line location in each basic block, to ensure that it will be used for stepping even if the last instruction in the previous basic block had the same line number; this is important for cases where the new BB is reachable from BBs other than the preceding block.

Fixes: llvm#104695

This patch adds the is_stmt flag to line table entries for the first
instruction with a non-0 line location in each basic block, to ensure
that it will be used for stepping even if the last instruction in the
previous basic block had the same line number; this is important for cases
where the new BB is reachable from BBs other than the preceding block.
@llvmbot
Copy link
Member

llvmbot commented Aug 21, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-debuginfo

Author: Stephen Tozer (SLTozer)

Changes

Fixes: #104695

This patch adds the is_stmt flag to line table entries for the first instruction with a non-0 line location in each basic block, to ensure that it will be used for stepping even if the last instruction in the previous basic block had the same line number; this is important for cases where the new BB is reachable from BBs other than the preceding block.


Full diff: https://github.com/llvm/llvm-project/pull/105524.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+5-2)
  • (added) llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll (+38)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index f111b4aea06f1b..aa11200731e1b3 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2119,9 +2119,12 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
     PrologEndLoc = DebugLoc();
   }
   // If the line changed, we call that a new statement; unless we went to
-  // line 0 and came back, in which case it is not a new statement.
+  // line 0 and came back, in which case it is not a new statement. We also
+  // mark is_stmt for the first non-0 line in each BB, in case a predecessor BB
+  // ends with a different line.
   unsigned OldLine = PrevInstLoc ? PrevInstLoc.getLine() : LastAsmLine;
-  if (DL.getLine() && DL.getLine() != OldLine)
+  if (DL.getLine() &&
+      (DL.getLine() != OldLine || PrevInstBB != MI->getParent()))
     Flags |= DWARF2_FLAG_IS_STMT;
 
   const MDNode *Scope = DL.getScope();
diff --git a/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll b/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll
new file mode 100644
index 00000000000000..21efceab38eca0
--- /dev/null
+++ b/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll
@@ -0,0 +1,38 @@
+;; Checks that when an instruction at the start of a BasicBlock has the same
+;; DebugLoc as the instruction at the end of the previous BasicBlock, we add
+;; is_stmt to the new line, to ensure that we still step on it if we arrive from
+;; a BasicBlock other than the immediately preceding one.
+
+; RUN: %llc_dwarf -O0 -filetype=obj < %s | llvm-dwarfdump --debug-line - | FileCheck %s
+
+; CHECK:      {{0x[0-9a-f]+}}     13      5 {{.+}} is_stmt
+; CHECK-NEXT: {{0x[0-9a-f]+}}     13     25 {{.+}} is_stmt
+
+define void @_Z1fi(i1 %cond) !dbg !21 {
+entry:
+  br i1 %cond, label %if.then2, label %if.else4
+
+if.then2:                                         ; preds = %entry
+  br label %if.end8, !dbg !24
+
+if.else4:                                         ; preds = %entry
+  %0 = load i32, ptr null, align 4, !dbg !28
+  %call5 = call i1 null(i32 %0)
+  ret void
+
+if.end8:                                          ; preds = %if.then2
+  ret void
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!20}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 20.0.0", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.cpp", directory: "/home/gbtozers/dev/upstream-llvm")
+!20 = !{i32 2, !"Debug Info Version", i32 3}
+!21 = distinct !DISubprogram(name: "f", linkageName: "_Z1fi", scope: !1, file: !1, line: 7, type: !22, scopeLine: 7, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0)
+!22 = distinct !DISubroutineType(types: !23)
+!23 = !{null}
+!24 = !DILocation(line: 13, column: 5, scope: !25)
+!25 = distinct !DILexicalBlock(scope: !21, file: !1, line: 11, column: 27)
+!28 = !DILocation(line: 13, column: 25, scope: !25)

@SLTozer
Copy link
Contributor Author

SLTozer commented Aug 21, 2024

This does result in some is_stmts being added that are unnecessary, i.e. cases where the last instruction in the current BB's sole predecessor has the same line number as the first instruction in the current BB. We could potentially include that in the analysis, but I think there's no way to do so without doing a more complex (though probably not too expensive) analysis in DwarfDebug; we also wouldn't be able to account for indirect branches.

I think in any case this patch is an improvement if we're considering such cases, as not only might the current BB not have the previous BB as its sole predecessor, but the previous BB might not even be a predecessor - making the is_stmt-omission completely incorrect.

Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. I should have considered BB transitions when I did the line-0 accommodation in the first place. See question about the new test.

; RUN: %llc_dwarf -O0 -filetype=obj < %s | llvm-dwarfdump --debug-line - | FileCheck %s

; CHECK: {{0x[0-9a-f]+}} 13 5 {{.+}} is_stmt
; CHECK-NEXT: {{0x[0-9a-f]+}} 13 25 {{.+}} is_stmt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ISTM the first check is for block if.then2, the second is for if.else4? Should there also be one for if.end8?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this test specifically, as only these two instructions have non-0 line numbers (excluding 0x0, which has the function header).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, !dbg isn't inherited the way it is for asm.
But, you could use the same !dbg for both, showing that it really is the BB that triggers it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, trying to make this work has revealed another an issue - when the lines are actually identical, they will be merged into a single entry (whether this patch is in effect or not), which any jump to the branch would land in the middle of. I'm not sure whether that would result in a step in the debugger or not; it prevents the test from working, at the very least.

@SLTozer SLTozer requested a review from adrian-prantl August 21, 2024 14:24
Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a test suggestion.

; RUN: %llc_dwarf -O0 -filetype=obj < %s | llvm-dwarfdump --debug-line - | FileCheck %s

; CHECK: {{0x[0-9a-f]+}} 13 5 {{.+}} is_stmt
; CHECK-NEXT: {{0x[0-9a-f]+}} 13 25 {{.+}} is_stmt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, !dbg isn't inherited the way it is for asm.
But, you could use the same !dbg for both, showing that it really is the BB that triggers it.

Copy link
Member

@jryans jryans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a very sensible improvement, thanks for working on this! 😄

@SLTozer
Copy link
Contributor Author

SLTozer commented Aug 28, 2024

Had a free moment so updated this - the previous version (as noted in my response to @pogo59 above) would still merge adjacent DLs across BB boundaries if they were exactly identical, rather than differing in column number; I've changed this to now only merge adjacent DLs within a BB, which strictly reduces the cases where we'll merge identical DLs.

Copy link
Member

@jryans jryans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a good tweak, looks good overall to me.

Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smooth. Still LGTM.

@SLTozer
Copy link
Contributor Author

SLTozer commented Aug 28, 2024

Updated a couple more tests whose outputs have changed as of the latest functional change - all of these look "correct" to me, in that they're adding is_stmt to instructions at the start of a basic block, or in empty-inline.mir, adding a few new entries that prevent a line-0 instruction at the start of a basic block from having a non-0 fallthrough location, and adding a new is_stmt entry at the start of a block (similar to the case in the initial report).

@SLTozer SLTozer merged commit 3ef37e2 into llvm:main Aug 29, 2024
@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 29, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/4660

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: DebugInfo/Generic/is_stmt-at-block-start.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 6: /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/llc -O0 -filetype=obj < /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/llvm-dwarfdump --debug-line - | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/llc -O0 -filetype=obj
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/llvm-dwarfdump --debug-line -
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll
�[1m/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll:9:15: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: expected string not found in input
�[0m; CHECK-NEXT: {{0x[0-9a-f]+}} 13 5 {{.+}} is_stmt
�[0;1;32m              ^
�[0m�[1m<stdin>:37:40: �[0m�[0;1;30mnote: �[0m�[1mscanning from here
�[0m0x000000000000000c 13 5 1 0 0 0 is_stmt prologue_end
�[0;1;32m                                       ^
�[0m�[1m<stdin>:39:8: �[0m�[0;1;30mnote: �[0m�[1mpossible intended match here
�[0m0x0000000000000014 13 5 1 0 0 0 
�[0;1;32m       ^
�[0m
Input file: <stdin>
Check file: /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/DebugInfo/Generic/is_stmt-at-block-start.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
�[1m�[0m�[0;1;30m          1: �[0m�[1m�[0;1;46m-: file format Mach-O arm64 �[0m
�[0;1;30m          2: �[0m�[1m�[0;1;46m �[0m
�[0;1;30m          3: �[0m�[1m�[0;1;46m.debug_line contents: �[0m
�[0;1;30m          4: �[0m�[1m�[0;1;46mdebug_line[0x00000000] �[0m
�[0;1;30m          5: �[0m�[1m�[0;1;46mLine table prologue: �[0m
�[0;1;30m          6: �[0m�[1m�[0;1;46m total_length: 0x00000045 �[0m
�[0;1;30m          7: �[0m�[1m�[0;1;46m format: DWARF32 �[0m
�[0;1;30m          8: �[0m�[1m�[0;1;46m version: 4 �[0m
�[0;1;30m          9: �[0m�[1m�[0;1;46m prologue_length: 0x00000020 �[0m
�[0;1;30m         10: �[0m�[1m�[0;1;46m min_inst_length: 1 �[0m
�[0;1;30m         11: �[0m�[1m�[0;1;46mmax_ops_per_inst: 1 �[0m
�[0;1;30m         12: �[0m�[1m�[0;1;46m default_is_stmt: 1 �[0m
�[0;1;30m         13: �[0m�[1m�[0;1;46m line_base: -5 �[0m
�[0;1;30m         14: �[0m�[1m�[0;1;46m line_range: 14 �[0m
�[0;1;30m         15: �[0m�[1m�[0;1;46m opcode_base: 13 �[0m
�[0;1;30m         16: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_copy] = 0 �[0m
�[0;1;30m         17: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_advance_pc] = 1 �[0m
�[0;1;30m         18: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_advance_line] = 1 �[0m
�[0;1;30m         19: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_set_file] = 1 �[0m
�[0;1;30m         20: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_set_column] = 1 �[0m
�[0;1;30m         21: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_negate_stmt] = 0 �[0m
�[0;1;30m         22: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_set_basic_block] = 0 �[0m
�[0;1;30m         23: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_const_add_pc] = 0 �[0m
�[0;1;30m         24: �[0m�[1m�[0;1;46mstandard_opcode_lengths[DW_LNS_fixed_advance_pc] = 1 �[0m
...

@SLTozer
Copy link
Contributor Author

SLTozer commented Aug 29, 2024

I added the test to the Generic/ test dir since it doesn't test target-specific behaviour, but it does rely on the basic block layout which varies between targets; added commit 616f7d3 to move the test into X86/, which should fix the buildbot errors.

SLTozer added a commit to SLTozer/llvm-project that referenced this pull request Aug 29, 2024
…n in BB (llvm#105524)"

Reverted (along with the NFC followup fix) due to failures on buildbots.

This reverts commit 3ef37e2, and commit
616f7d3.
SLTozer added a commit that referenced this pull request Aug 29, 2024
…n in BB (#105524)"

Reverted (along with the NFC followup fix) due to buildbot failure:
https://lab.llvm.org/buildbot/#/builders/160/builds/4142

This reverts commit 3ef37e2, and commit
616f7d3.
@SLTozer
Copy link
Contributor Author

SLTozer commented Aug 29, 2024

Hm, another buildbot error - this time createMCCodeEmitter failed, which seems like it will also be a trivial fix, but I'll revert for now.

SLTozer added a commit that referenced this pull request Aug 29, 2024
…on in BB (#105524)"

Fixes the previous buildbot error by adding an explicit triple to the test,
ensuring that llc can produce a valid object file.

This reverts commit 926f097.
@dwblaikie
Copy link
Collaborator

We're seeing a pretty significant size regression from this change in self-host builds of clang and some other internal binaries we track.

For an internal debug build (with Split DWARF, compressed debug info sections in object files but uncompressed in the linked binary) we saw a total 3.6% increase in the size of the clang binary, a 12.7% increase in the size of the .debug_line section in object files, and a 18.3% increase in the size of .debug_line in the linked executable.

Might be worth reverting this and reconsidering whether the growth is acceptable or whether there's a more targeted fix that can be made?

@SLTozer
Copy link
Contributor Author

SLTozer commented Sep 13, 2024

Might be worth reverting this and reconsidering whether the growth is acceptable or whether there's a more targeted fix that can be made?

I have another patch up now that tries to be more cautious with its application of is_stmt: #108251

It currently needs a fix, and confirmation that it has suitable performance, but it should result in fewer is_stmts being applied. With that said, if it doesn't bring the numbers down that much, there is a reasonable question about correctness vs size - it might not be feasible to compute the minimum set of is_stmts that don't result in missing steps, so there might be either a tradeoff to make, or a flag to add.

@dwblaikie
Copy link
Collaborator

Could you put this behind a flag or revert this for now (to see if you can find a solution that's sufficiently cautious to lower the size cost to something more acceptable) - I think the growth is a bit big to take by default? (perhaps other people have other opinions, though, in which case I'm open to that/we can take the hit if that's the prevailing opinion)

@SLTozer
Copy link
Contributor Author

SLTozer commented Sep 17, 2024

Could you put this behind a flag or revert this for now

I can revert it, at least until the next patch is ready to land. I've tried to investigate this locally to see if the situation is improved by the next patch along, but my local build isn't showing anywhere close to the same increase - I've got .debug_line size increasing by 1.3% as a result of this commit. When you say "Debug", does that mean the CMake "Debug" option, or a release build with debug info enabled? Edit: RelWithDebInfo shows an even smaller difference, which lines up with the compile time tracker results (showing a larger relative increase with optimizations enabled).

SLTozer added a commit that referenced this pull request Sep 17, 2024
…on in BB (#105524)"

Reverted due to large .debug_line size regressions for some
configurations; work currently in place to improve the output of this
behaviour in PR #108251.

This patch also modifies two tests that were created or modified after
the original commit landed and are affected by the revert:

  llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
  llvm/test/DebugInfo/X86/empty-line-info.ll

This reverts commit 5fef40c.
hamphet pushed a commit to hamphet/llvm-project that referenced this pull request Sep 18, 2024
…on in BB (llvm#105524)"

Reverted due to large .debug_line size regressions for some
configurations; work currently in place to improve the output of this
behaviour in PR llvm#108251.

This patch also modifies two tests that were created or modified after
the original commit landed and are affected by the revert:

  llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
  llvm/test/DebugInfo/X86/empty-line-info.ll

This reverts commit 5fef40c.
tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
…on in BB (llvm#105524)"

Reverted due to large .debug_line size regressions for some
configurations; work currently in place to improve the output of this
behaviour in PR llvm#108251.

This patch also modifies two tests that were created or modified after
the original commit landed and are affected by the revert:

  llvm/test/CodeGen/X86/pseudo_cmov_lower2.ll
  llvm/test/DebugInfo/X86/empty-line-info.ll

This reverts commit 5fef40c.
SLTozer added a commit that referenced this pull request Nov 12, 2024
…108251)

This patch follows on from the changes made in #105524, by adding an
additional heuristic that prevents us from applying the start-of-MBB
is_stmt flag when we can see that, for all direct branches to the MBB,
the last line stepped on before the branch is the same as the first line
of the MBB. This is mainly to prevent certain pathological cases, such
as macros that expand to multiple basic blocks that all have the same
source location, from giving us repeated steps on the same line. This
approach is not comprehensive, since it relies on analyzeBranch to read
edges, but the default fallback of applying is_stmt may lead only to
useless steps in some cases, rather than skipping useful steps
altogether.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Missing is_stmt flag for 'else if' conditions

6 participants