Skip to content

Conversation

@jmorse
Copy link
Member

@jmorse jmorse commented Nov 22, 2024

The optimiser will produce empty blocks that are unconditionally executed according to the CFG -- while it may not be meaningful code, and won't get a prologue_end position, we need to not crash on this input.

The fault comes from assuming that there's always a next block with some instructions in it, that will eventually produce some meaningful control flow to stop at -- in the given reproducer in issue #117206 this isn't true, because the function terminates with unreachable. Thus, I've refactored the "get next instruction logic" into a helper that'll step through all blocks and terminate if there aren't any more.

Reproducer from @aeubanks

The optimiser will produce empty blocks that are unconditionally executed
according to the CFG -- while it may not be meaningful code, and won't get
a prologue_end position, we need to not crash on this input.
@llvmbot
Copy link
Member

llvmbot commented Nov 22, 2024

@llvm/pr-subscribers-debuginfo

Author: Jeremy Morse (jmorse)

Changes

The optimiser will produce empty blocks that are unconditionally executed according to the CFG -- while it may not be meaningful code, and won't get a prologue_end position, we need to not crash on this input.

The fault comes from assuming that there's always a next block with some instructions in it, that will eventually produce some meaningful control flow to stop at -- in the given reproducer in issue #117206 this isn't true, because the function terminates with unreachable. Thus, I've refactored the "get next instruction logic" into a helper that'll step through all blocks and terminate if there aren't any more.

Reproducer from @aeubanks


Full diff: https://github.com/llvm/llvm-project/pull/117320.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+34-23)
  • (modified) llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir (+49)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index affc72b5950fd9..09adf79dac4787 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2192,14 +2192,42 @@ findPrologueEndLoc(const MachineFunction *MF) {
   // better off synthesising an early prologue_end.
   auto CurBlock = MF->begin();
   auto CurInst = CurBlock->begin();
-  while (true) {
+
+  // Find the initial instruction, we're guaranteed one by the caller.
+  while (CurBlock->empty())
+    CurInst = (++CurBlock)->begin();
+
+  // Helper function for stepping through the initial sequence of
+  // unconditionally executed instructions. The optimiser will also chuck in
+  // sequences of empty blocks alas.
+  auto getNextInst = [&CurBlock, &CurInst, MF]() -> bool {
+    // We've reached the end of the block. Did we just look at a terminator?
+    if (CurInst->isTerminator()) {
+      // Some kind of "real" control flow is occurring. At the very least
+      // we would have to start exploring the CFG, a good signal that the
+      // prologue is over.
+      return false;
+    }
+
+    // If we've already fallen through into a loop, don't fall through
+    // further, use a backup-location.
+    if (CurBlock->pred_size() > 1)
+      return false;
+
+    // Fall-through from entry to the next block. This is common at -O0 when
+    // there's no initialisation in the function. Bail if we're also at the
+    // end of the function, or the remaining blocks have no instructions.
     // Skip empty blocks, in rare cases the entry can be empty.
-    if (CurInst == CurBlock->end()) {
+    do {
       ++CurBlock;
-      CurInst = CurBlock->begin();
-      continue;
-    }
+      if (CurBlock == MF->end())
+        return false;
+    } while (CurBlock->empty());
+    CurInst = CurBlock->begin();
+    return true;
+  };
 
+  while (true) {
     // Check whether this non-meta instruction a good position for prologue_end.
     if (!CurInst->isMetaInstruction()) {
       auto FoundInst = ExamineInst(*CurInst);
@@ -2216,25 +2244,8 @@ findPrologueEndLoc(const MachineFunction *MF) {
       continue;
     }
 
-    // We've reached the end of the block. Did we just look at a terminator?
-    if (CurInst->isTerminator()) {
-      // Some kind of "real" control flow is occurring. At the very least
-      // we would have to start exploring the CFG, a good signal that the
-      // prologue is over.
-      break;
-    }
-
-    // If we've already fallen through into a loop, don't fall through
-    // further, use a backup-location.
-    if (CurBlock->pred_size() > 1)
-      break;
-
-    // Fall-through from entry to the next block. This is common at -O0 when
-    // there's no initialisation in the function. Bail if we're also at the
-    // end of the function.
-    if (++CurBlock == MF->end())
+    if (!getNextInst())
       break;
-    CurInst = CurBlock->begin();
   }
 
   // We couldn't find any source-location, suggesting all meaningful information
diff --git a/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir b/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir
index c27655ac801316..29cdbc0853365b 100644
--- a/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir
+++ b/llvm/test/DebugInfo/MIR/X86/dbg-prologue-backup-loc2.mir
@@ -31,6 +31,13 @@
 # CHECK-NEXT:   .LBB0_1:
 # CHECK-LABEL:  addl    %esi, %edx
 
+## Second function in this file: test that we don't crash when having trailing
+## empty blocks and no location for a prologue. Test that a .loc is produced,
+## with an implicit-not check for there being no prologue_end.
+#
+# CHECK-LABEL: f:
+# CHECK: .loc    0 1234 0
+
  
 --- |
   ; ModuleID = 'out2.ll'
@@ -66,6 +73,17 @@
     ret i32 0, !dbg !17
   }
   
+  define void @f() !dbg !18 {
+  entry:
+    %0 = call ptr @llvm.returnaddress(i32 0)
+    br label %do.body
+
+  do.body:
+    unreachable
+  }
+
+  declare ptr @llvm.returnaddress(i32 immarg)
+
   !llvm.dbg.cu = !{!2}
   !llvm.module.flags = !{!6, !7}
   !llvm.ident = !{!8}
@@ -88,6 +106,7 @@
   !15 = distinct !DILexicalBlock(scope: !9, file: !3, line: 4, column: 15)
   !16 = !DILocation(line: 6, column: 9, scope: !15)
   !17 = !DILocation(line: 8, column: 3, scope: !9)
+  !18 = distinct !DISubprogram(name: "f", scope: !3, file: !3, line: 37, type: !10, scopeLine: 1234, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: !2)
 
 ...
 ---
@@ -132,3 +151,33 @@ body:             |
     RET64 $eax, debug-location !17
 
 ...
+---
+name:            f
+alignment:       16
+tracksRegLiveness: true
+noPhis:          true
+isSSA:           false
+noVRegs:         true
+hasFakeUses:     false
+tracksDebugUserValues: true
+frameInfo:
+  stackSize:       8
+  offsetAdjustment: -8
+  maxAlignment:    1
+  maxCallFrameSize: 0
+  isCalleeSavedInfoValid: true
+fixedStack:
+  - { id: 0, type: spill-slot, offset: -16, size: 8, alignment: 16 }
+machineFunctionInfo:
+  amxProgModel:    None
+body:             |
+  bb.0.entry:
+    frame-setup PUSH64r killed $rbp, implicit-def $rsp, implicit $rsp
+    frame-setup CFI_INSTRUCTION def_cfa_offset 16
+    frame-setup CFI_INSTRUCTION offset $rbp, -16
+    $rbp = frame-setup MOV64rr $rsp
+    frame-setup CFI_INSTRUCTION def_cfa_register $rbp
+
+  bb.1.do.body:
+
+...

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits

// there's no initialisation in the function. Bail if we're also at the
// end of the function.
if (++CurBlock == MF->end())
if (!getNextInst())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what this is doing but from the name getNextInst() I think it sounds more like it's doing auto NextInst = std::next(CurInst) as above. Maybe getNextBlockInst ? That doesn't sound great either. findNextBlock? Ymmv, feel free to ignore.

Comment on lines 2200 to 2202
// Helper function for stepping through the initial sequence of
// unconditionally executed instructions. The optimiser will also chuck in
// sequences of empty blocks alas.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if "The optimiser will also chuck in sequences of empty blocks alas." is slightly too informal (could cause trouble for folks with English as a 2nd language?). Also probably worth moving this part of the into the lambda body as it seems more like an implementation detail?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, this is written in my default "Jeremy-flippancy" dialect

// Find the initial instruction, we're guaranteed one by the caller.
while (CurBlock->empty())
CurInst = (++CurBlock)->begin();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Find the initial instruction, we're guaranteed one by the caller. Can we assert CurInst != CurBlock->end() here to check that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose so, yeah -- you mean immediately after the loop terminates, yes? (I'll send that up in the patch to demonstrate)

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM

@jmorse jmorse merged commit 624e52b into llvm:main Nov 26, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants