Skip to content

Conversation

spaits
Copy link
Contributor

@spaits spaits commented Sep 13, 2025

Reland #157363.
Fixes #148052.

This PR differs in three thing from #157363:

  • Properly detaching and emptying the catchret/cleanupret block.
  • Not explicitly handling catchswitches anymore.
  • Move some deletion and detaching related code into static function and reuse them for EH Pad handling.

The first and third points are pretty self explanatory :)

The second point is not that much. If I remove my handling code for the catchswithc, then we may have catchpad within poison, which is similar semanitic (or at leas I imagine, not offically, since on paper there shouldn't be poison value) to catchpad within none. But here is the catch: catchpad within none is not valid llvm-ir. I think, temporarily, when removing basic blocks it may work.

@llvmbot
Copy link
Member

llvmbot commented Sep 13, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Gábor Spaits (spaits)

Changes

This is a fix for a crash reported under PR #157363 that fixes issue #148052.

This PR contains two things:

  • Properly detaching and emptying the catchret/cleanupret block.
  • Not handling catchswitches anymore.

The first point is pretty explanatory :)

The second point is not that much. If I remove my handling code for the catchswithc, then we may have catchpad within poison, which is similar semanitic (or at leas I imagine, not offically) to catchpad within none. But here is the catch: catchpad within none is not valid llvm-ir. I think, temporarily, when removing basic blocks it may work.


Full diff: https://github.com/llvm/llvm-project/pull/158435.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/BasicBlockUtils.cpp (+51-53)
  • (modified) llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll (+37)
diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index d2391e166f942..7cd9dbfb56095 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -58,36 +58,64 @@ static cl::opt<unsigned> MaxDeoptOrUnreachableSuccessorCheckDepth(
              "is followed by a block that either has a terminating "
              "deoptimizing call or is terminated with an unreachable"));
 
-static void replaceFuncletPadsRetWithUnreachable(Instruction &I) {
+static void zapAllInstructionInDeadBasicBlock(BasicBlock *BB) {
+  while (!BB->empty()) {
+    Instruction &I = BB->back();
+    // If this instruction is used, replace uses with an arbitrary value.
+    // Because control flow can't get here, we don't care what we replace the
+    // value with. Note that since this block is unreachable, and all values
+    // contained within it must dominate their uses, that all uses will
+    // eventually be removed (they are themselves dead).
+    if (!I.use_empty())
+      I.replaceAllUsesWith(PoisonValue::get(I.getType()));
+    BB->back().eraseFromParent();
+  }
+  new UnreachableInst(BB->getContext(), BB);
+  assert(BB->size() == 1 && isa<UnreachableInst>(BB->getTerminator()) &&
+           "The successor list of BB isn't empty before "
+           "applying corresponding DTU updates.");
+}
+
+static void deleteBasicBlockFromSuccessor(
+    BasicBlock *BB, SmallVectorImpl<DominatorTree::UpdateType> *Updates,
+    bool KeepOneInputPHIs) {
+  SmallPtrSet<BasicBlock *, 4> UniqueSuccessors;
+  for (BasicBlock *Succ : successors(BB)) {
+    Succ->removePredecessor(BB, KeepOneInputPHIs);
+    if (Updates && UniqueSuccessors.insert(Succ).second)
+      Updates->push_back({DominatorTree::Delete, BB, Succ});
+  }
+}
+
+static void replaceFuncletPadsRetWithUnreachable(
+    Instruction &I, SmallVectorImpl<DominatorTree::UpdateType> *Updates,
+    bool KeepOneInputPHIs) {
   assert(isa<FuncletPadInst>(I) && "Instruction must be a funclet pad!");
   for (User *User : make_early_inc_range(I.users())) {
     Instruction *ReturnInstr = dyn_cast<Instruction>(User);
+    // If we have a cleanupret or catchret block, replace it with just an
+    // unreachable.
     if (isa<CatchReturnInst>(ReturnInstr) ||
         isa<CleanupReturnInst>(ReturnInstr)) {
       BasicBlock *ReturnInstrBB = ReturnInstr->getParent();
-      ReturnInstr->eraseFromParent();
-      new UnreachableInst(ReturnInstrBB->getContext(), ReturnInstrBB);
+      // This catchret or catchpad basic block is detached now. Let the
+      // successors know it.
+      deleteBasicBlockFromSuccessor(ReturnInstrBB, Updates, KeepOneInputPHIs);
+      zapAllInstructionInDeadBasicBlock(ReturnInstrBB);
     }
   }
 }
 
-void llvm::detachDeadBlocks(
-    ArrayRef<BasicBlock *> BBs,
-    SmallVectorImpl<DominatorTree::UpdateType> *Updates,
-    bool KeepOneInputPHIs) {
+void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
+                            SmallVectorImpl<DominatorTree::UpdateType> *Updates,
+                            bool KeepOneInputPHIs) {
   for (auto *BB : BBs) {
     // Loop through all of our successors and make sure they know that one
     // of their predecessors is going away.
     SmallPtrSet<BasicBlock *, 4> UniqueSuccessors;
-    for (BasicBlock *Succ : successors(BB)) {
-      Succ->removePredecessor(BB, KeepOneInputPHIs);
-      if (Updates && UniqueSuccessors.insert(Succ).second)
-        Updates->push_back({DominatorTree::Delete, BB, Succ});
-    }
-
-    // Zap all the instructions in the block.
-    while (!BB->empty()) {
-      Instruction &I = BB->back();
+    auto NonFirstPhiIt = BB->getFirstNonPHIIt();
+    if (NonFirstPhiIt != BB->end()) {
+      Instruction &I = *NonFirstPhiIt;
       // Exception handling funclets need to be explicitly addressed.
       // These funclets must begin with cleanuppad or catchpad and end with
       // cleanupred or catchret. The return instructions can be in different
@@ -95,42 +123,12 @@ void llvm::detachDeadBlocks(
       // first block, the we would have possible cleanupret and catchret
       // instructions with poison arguments, which wouldn't be valid.
       if (isa<FuncletPadInst>(I))
-        replaceFuncletPadsRetWithUnreachable(I);
-
-      // Catchswitch instructions have handlers, that must be catchpads and
-      // an unwind label, that is either a catchpad or catchswitch.
-      if (CatchSwitchInst *CSI = dyn_cast<CatchSwitchInst>(&I)) {
-        // Iterating over the handlers and the unwind basic block and processing
-        // catchpads. If the unwind label is a catchswitch, we just replace the
-        // label with poison later on.
-        for (unsigned I = 0; I < CSI->getNumSuccessors(); I++) {
-          BasicBlock *SucBlock = CSI->getSuccessor(I);
-          Instruction &SucFstInst = *(SucBlock->getFirstNonPHIIt());
-          if (isa<FuncletPadInst>(SucFstInst)) {
-            replaceFuncletPadsRetWithUnreachable(SucFstInst);
-            // There may be catchswitch instructions using the catchpad.
-            // Just replace those with poison.
-            if (!SucFstInst.use_empty())
-              SucFstInst.replaceAllUsesWith(
-                  PoisonValue::get(SucFstInst.getType()));
-            SucFstInst.eraseFromParent();
-          }
-        }
-      }
-
-      // Because control flow can't get here, we don't care what we replace the
-      // value with.  Note that since this block is unreachable, and all values
-      // contained within it must dominate their uses, that all uses will
-      // eventually be removed (they are themselves dead).
-      if (!I.use_empty())
-        I.replaceAllUsesWith(PoisonValue::get(I.getType()));
-      BB->back().eraseFromParent();
+        replaceFuncletPadsRetWithUnreachable(I, Updates, KeepOneInputPHIs);
     }
-    new UnreachableInst(BB->getContext(), BB);
-    assert(BB->size() == 1 &&
-           isa<UnreachableInst>(BB->getTerminator()) &&
-           "The successor list of BB isn't empty before "
-           "applying corresponding DTU updates.");
+
+    // Detaching and emptying the current basic block.
+    deleteBasicBlockFromSuccessor(BB, Updates, KeepOneInputPHIs);
+    zapAllInstructionInDeadBasicBlock(BB);
   }
 }
 
@@ -1704,8 +1702,8 @@ BranchInst *llvm::GetIfCondition(BasicBlock *BB, BasicBlock *&IfTrue,
 
   // We can only handle branches.  Other control flow will be lowered to
   // branches if possible anyway.
-  BranchInst *Pred1Br = dyn_cast<BranchInst>(Pred1->getTerminator());
-  BranchInst *Pred2Br = dyn_cast<BranchInst>(Pred2->getTerminator());
+  BranchInst *Pred1Br = dyn_cast_or_null<BranchInst>(Pred1->getTerminator());
+  BranchInst *Pred2Br = dyn_cast_or_null<BranchInst>(Pred2->getTerminator());
   if (!Pred1Br || !Pred2Br)
     return nullptr;
 
diff --git a/llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll b/llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll
index d2fccae6770db..d915b0c5272ee 100644
--- a/llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll
+++ b/llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll
@@ -166,4 +166,41 @@ bb13:
   cleanupret from %cleanuppad unwind label %funclet_bb10
 }
 
+%struct.foo = type { ptr, %struct.eggs, ptr }
+%struct.eggs = type { ptr, ptr, ptr }
+
+declare x86_thiscallcc ptr @quux(ptr, ptr, i32)
+
+define x86_thiscallcc ptr @baz(ptr %arg, ptr %arg1, ptr %arg2, i1 %arg3, ptr %arg4) personality ptr null {
+bb:
+  %alloca = alloca [2 x %struct.foo], align 4
+  %invoke = invoke x86_thiscallcc ptr @quux(ptr null, ptr null, i32 0)
+          to label %bb5 unwind label %bb10
+
+bb5:                                              ; preds = %bb
+  %getelementptr = getelementptr i8, ptr %arg, i32 20
+  %call = call x86_thiscallcc ptr null(ptr null, ptr null, i32 0)
+  br label %bb6
+
+bb6:                                              ; preds = %bb10, %bb5
+  %phi = phi ptr [ null, %bb10 ], [ null, %bb5 ]
+  ret ptr %phi
+
+bb7:                                              ; No predecessors!
+  %cleanuppad = cleanuppad within none []
+  %getelementptr8 = getelementptr i8, ptr %arg2, i32 -20
+  %icmp = icmp eq ptr %arg, null
+  br label %bb9
+
+bb9:                                              ; preds = %bb7
+  cleanupret from %cleanuppad unwind label %bb10
+
+bb10:                                             ; preds = %bb9, %bb
+  %phi11 = phi ptr [ %arg, %bb9 ], [ null, %bb ]
+  %cleanuppad12 = cleanuppad within none []
+  %getelementptr13 = getelementptr i8, ptr %phi11, i32 -20
+  store i32 0, ptr %phi11, align 4
+  br label %bb6
+}
+
 declare void @func(i64, i64, ptr)

Copy link

github-actions bot commented Sep 13, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@spaits spaits force-pushed the fix-funclet-crash-3 branch from be78e07 to 55b7d84 Compare September 13, 2025 20:55
@spaits spaits changed the title [BasicBlockUtils] Fix funclet deletion crash [BasicBlockUtils] Fix funclet deletion crash by correctly detaching return blocks Sep 14, 2025
…llvm#157363)

Fixes llvm#148052 .

When removing EH Pad blocks, the value defined by them becomes poison. These poison values are then used by `catchret` and `cleanupret`, which is invalid. This commit replaces those unreachable `catchret` and `cleanupret` instructions with `unreachable`.
@spaits spaits force-pushed the fix-funclet-crash-3 branch from 914670b to ece2d37 Compare September 14, 2025 20:06
@spaits spaits changed the title [BasicBlockUtils] Fix funclet deletion crash by correctly detaching return blocks Reland [BasicBlockUtils] Handle funclets when detaching EH pad blocks Sep 14, 2025
@spaits spaits changed the title Reland [BasicBlockUtils] Handle funclets when detaching EH pad blocks Reland "[BasicBlockUtils] Handle funclets when detaching EH pad blocks" Sep 14, 2025
Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@spaits
Copy link
Contributor Author

spaits commented Sep 16, 2025

Thank you very much @rnk for reviewing my PR. I will probably merge it today.

I have added a few NFC commits. Just cleaning up and more testing.

  • Added another test case.
  • Added some more comments, when deleting the uses of catchpad/cleanuppad.
  • Cleaned up the code a bit:
    • Merged two static functions, that are always called after each other (zapAllInstructionInDeadBasicBlock + deleteBasicBlockFromSuccessor -> emptyAndDetachBlock)
    • Deleted the static function, that was only called from one place after no more explicit catchswitch handling (replaceFuncletPadsRetWithUnreachable).

@spaits spaits force-pushed the fix-funclet-crash-3 branch from c8e01e8 to ae456f2 Compare September 16, 2025 14:08
…or -> emptyAndDetachBlock and delete replaceFuncletPadsRetWithUnreachable
@spaits spaits force-pushed the fix-funclet-crash-3 branch from ae456f2 to 9dcf6e5 Compare September 16, 2025 14:14
@spaits spaits merged commit 41cef78 into llvm:main Sep 17, 2025
9 checks passed
spaits added a commit to spaits/llvm-project that referenced this pull request Sep 17, 2025
spaits added a commit that referenced this pull request Sep 17, 2025
rnk pushed a commit that referenced this pull request Sep 20, 2025
…#159379)

Fixes #148052 .

Last PR did not account for the scenario, when more than one instruction
used the `catchpad` label.
In that case I have deleted uses, which were already "choosen to be
iterated over" by the early increment iterator. This issue was not
visible in normal release build on x86, but luckily later on the address
sanitizer build it has found it on the buildbot.

Here is the diff from the last version of this PR: #158435
```diff
diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index 91e245e..1dd8cb4 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -106,7 +106,8 @@ void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
       // first block, the we would have possible cleanupret and catchret
       // instructions with poison arguments, which wouldn't be valid.
       if (isa<FuncletPadInst>(I)) {
-        for (User *User : make_early_inc_range(I.users())) {
+        SmallPtrSet<BasicBlock *, 4> UniqueEHRetBlocksToDelete;
+        for (User *User : I.users()) {
           Instruction *ReturnInstr = dyn_cast<Instruction>(User);
           // If we have a cleanupret or catchret block, replace it with just an
           // unreachable. The other alternative, that may use a catchpad is a
@@ -114,33 +115,12 @@ void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
           if (isa<CatchReturnInst>(ReturnInstr) ||
               isa<CleanupReturnInst>(ReturnInstr)) {
             BasicBlock *ReturnInstrBB = ReturnInstr->getParent();
-            // This catchret or catchpad basic block is detached now. Let the
-            // successors know it.
-            // This basic block also may have some predecessors too. For
-            // example the following LLVM-IR is valid:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //   [cleanupret_block]
-            //
-            // The IR after the cleanup will look like this:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //     [unreachable]
-            //
-            // So regular_block will lead to an unreachable block, which is also
-            // valid. There is no need to replace regular_block with unreachable
-            // in this context now.
-            // On the other hand, the cleanupret/catchret block's successors
-            // need to know about the deletion of their predecessors.
-            emptyAndDetachBlock(ReturnInstrBB, Updates, KeepOneInputPHIs);
+            UniqueEHRetBlocksToDelete.insert(ReturnInstrBB);
           }
         }
+        for (BasicBlock *EHRetBB :
+             make_early_inc_range(UniqueEHRetBlocksToDelete))
+          emptyAndDetachBlock(EHRetBB, Updates, KeepOneInputPHIs);
       }
     }
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug in SimplifyCFG causes ICE
3 participants