
Conversation

@bwendling
Collaborator

@bwendling bwendling commented Jul 2, 2025

This pass isolates code paths with undefined behavior, which allows
the non-UB code to remain while the UB code is removed.

This first patch handles 'nullptr' accesses. It analyses PHI nodes
that produce a 'ptr' value and have constant 'null' as an incoming
value. It then searches that same basic block for any uses of the
PHI node in a 'load' or 'store' instruction.

For example, we convert this code:

foo:
  %phi.val = phi ptr [ %arrayidx.i, %pred1 ], [ null, %pred2 ]
  %load.val = load i32, ptr %phi.val, align 4
   ...

into:

foo:
  %load.val = load i32, ptr %arrayidx.i, align 4
   ...

foo.ub.path:
  unreachable

To replace the undefined behavior with an explicit "trap-unreachable"
sequence, use '-isolate-ub-path-to-trap-unreachable', giving:

foo.ub.path:
  %load.val = load volatile i32, ptr null, align 4
  tail call void @llvm.trap()
  unreachable

Note: we allow the NULL dereference to actually occur so that code that
wishes to catch the signal can do so.
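For reference, here is a self-contained sketch of a whole function containing
the pattern above (the function name, arguments, and surrounding control flow
are illustrative, not taken from the patch's test suite):

    define i32 @sample(ptr %arr, i1 %cond) {
    entry:
      br i1 %cond, label %pred1, label %pred2

    pred1:                                       ; valid pointer path
      %arrayidx.i = getelementptr inbounds i32, ptr %arr, i64 1
      br label %foo

    pred2:                                       ; null pointer path
      br label %foo

    foo:                                         ; the block the pass rewrites
      %phi.val = phi ptr [ %arrayidx.i, %pred1 ], [ null, %pred2 ]
      %load.val = load i32, ptr %phi.val, align 4
      ret i32 %load.val
    }

After the pass, %pred2 branches to the cloned 'foo.ub.path' block (ending in
'unreachable', or in the volatile load/trap sequence under the flag above),
while %pred1 continues to the original 'foo' with the load rewritten to use
%arrayidx.i directly.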

@llvmbot
Member

llvmbot commented Jul 2, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Bill Wendling (bwendling)

Changes

This pass isolates code paths with undefined behavior. This allows
us to replace the undefined behavior with an explicit 'trap'
instruction while allowing non-UB code to remain.

This first patch implements accessing a 'nullptr'. It analyses PHI
nodes that return a 'ptr' type and which have constant 'null' as an
incoming value. It then searches that basic block for any uses of
the PHI node in a 'load' or 'store' instruction. If a 'load' or 'store'
is found, a 'trap' instruction is generated instead.

Example, we convert this code:

foo:
  %phi.val = phi ptr [ %arrayidx.i, %pred1 ], [ null, %pred2 ]
  %load.val = load i32, ptr %phi.val, align 4
   ...

into:

foo:
  %load.val = load i32, ptr %ptr.val, align 4
   ...

foo.ub.path:
  %load.val = load i32, ptr null, align 4
  tail call void @llvm.trap()
  unreachable

Note: we allow the NULL dereference to actually occur so that code that
wishes to catch the signal can do so.


Patch is 38.98 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/146791.diff

14 Files Affected:

  • (added) llvm/include/llvm/Transforms/Scalar/IsolatePath.h (+58)
  • (modified) llvm/lib/Passes/PassBuilder.cpp (+1)
  • (modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+5)
  • (modified) llvm/lib/Passes/PassRegistry.def (+1)
  • (modified) llvm/lib/Transforms/Scalar/CMakeLists.txt (+1)
  • (added) llvm/lib/Transforms/Scalar/IsolatePath.cpp (+300)
  • (modified) llvm/test/Other/new-pm-defaults.ll (+2-1)
  • (modified) llvm/test/Other/new-pm-thinlto-postlink-defaults.ll (+2-1)
  • (modified) llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll (+1)
  • (modified) llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll (+1)
  • (modified) llvm/test/Other/new-pm-thinlto-prelink-defaults.ll (+2-1)
  • (modified) llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll (+3-1)
  • (modified) llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll (+1)
  • (added) llvm/test/Transforms/IsolatePath/ub-memory-accesses.ll (+370)
diff --git a/llvm/include/llvm/Transforms/Scalar/IsolatePath.h b/llvm/include/llvm/Transforms/Scalar/IsolatePath.h
new file mode 100644
index 0000000000000..fce4c4ceaa27b
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Scalar/IsolatePath.h
@@ -0,0 +1,58 @@
+//===- IsolatePath.cpp - Code to isolate paths with UB ----------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass isolates code paths with undefined behavior from paths without
+// undefined behavior, and then add a trap instruction on that path. This
+// prevents code generation where, after the UB instruction's eliminated, the
+// code can wander off the end of a function.
+//
+// For example, a nullptr dereference:
+//
+//   foo:
+//     %phi.val = phi ptr [ %arrayidx.i, %pred1 ], [ null, %pred2 ]
+//     %load.val = load i32, ptr %phi.val, align 4
+//
+// is converted into:
+//
+//   foo.ub.path:
+//     %load.val.ub = load volatile i32, ptr null, align 4
+//     tail call void @llvm.trap()
+//     unreachable
+//
+// Note: we allow the NULL dereference to actually occur so that code that
+// wishes to catch the signal can do so.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_SCALAR_ISOLATEPATH_H
+#define LLVM_TRANSFORMS_SCALAR_ISOLATEPATH_H
+
+#include "llvm/ADT/SmallPtrSet.h"
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+class BasicBlock;
+class DomTreeUpdater;
+class Function;
+
+/// This pass performs 'path isolation', which looks for undefined behavior and
+/// isolates the path from non-undefined behavior code and converts the UB into
+/// a trap call.
+class IsolatePathPass : public PassInfoMixin<IsolatePathPass> {
+  SmallPtrSet<BasicBlock *, 4> SplitUBBlocks;
+
+  bool ProcessPointerUndefinedBehavior(BasicBlock *BB, DomTreeUpdater *DTU);
+
+public:
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+};
+
+} // end namespace llvm
+
+#endif // LLVM_TRANSFORMS_SCALAR_ISOLATEPATH_H
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 0697a0a6b4c74..0addad282b89b 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -277,6 +277,7 @@
 #include "llvm/Transforms/Scalar/InferAddressSpaces.h"
 #include "llvm/Transforms/Scalar/InferAlignment.h"
 #include "llvm/Transforms/Scalar/InstSimplifyPass.h"
+#include "llvm/Transforms/Scalar/IsolatePath.h"
 #include "llvm/Transforms/Scalar/JumpTableToSwitch.h"
 #include "llvm/Transforms/Scalar/JumpThreading.h"
 #include "llvm/Transforms/Scalar/LICM.h"
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index c83d2dc1f1514..ef72e0dc1f182 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -98,6 +98,7 @@
 #include "llvm/Transforms/Scalar/IndVarSimplify.h"
 #include "llvm/Transforms/Scalar/InferAlignment.h"
 #include "llvm/Transforms/Scalar/InstSimplifyPass.h"
+#include "llvm/Transforms/Scalar/IsolatePath.h"
 #include "llvm/Transforms/Scalar/JumpTableToSwitch.h"
 #include "llvm/Transforms/Scalar/JumpThreading.h"
 #include "llvm/Transforms/Scalar/LICM.h"
@@ -598,6 +599,10 @@ PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,
         SimplifyCFGPass(SimplifyCFGOptions().convertSwitchRangeToICmp(true)));
   }
 
+  // Isolate paths with undefined behavior. At this point, all inlinable
+  // functions should be inlined and constants propagated.
+  FPM.addPass(IsolatePathPass());
+
   // Speculative execution if the target has divergent branches; otherwise nop.
   FPM.addPass(SpeculativeExecutionPass(/* OnlyIfDivergentTarget =*/true));
 
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index dd3dab3425975..ed473ceff9192 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -448,6 +448,7 @@ FUNCTION_PASS("interleaved-access", InterleavedAccessPass(TM))
 FUNCTION_PASS("interleaved-load-combine", InterleavedLoadCombinePass(TM))
 FUNCTION_PASS("invalidate<all>", InvalidateAllAnalysesPass())
 FUNCTION_PASS("irce", IRCEPass())
+FUNCTION_PASS("isolate-path", IsolatePathPass());
 FUNCTION_PASS("jump-threading", JumpThreadingPass())
 FUNCTION_PASS("jump-table-to-switch", JumpTableToSwitchPass());
 FUNCTION_PASS("kcfi", KCFIPass())
diff --git a/llvm/lib/Transforms/Scalar/CMakeLists.txt b/llvm/lib/Transforms/Scalar/CMakeLists.txt
index 84a5b02043d01..a1738aa7485df 100644
--- a/llvm/lib/Transforms/Scalar/CMakeLists.txt
+++ b/llvm/lib/Transforms/Scalar/CMakeLists.txt
@@ -24,6 +24,7 @@ add_llvm_component_library(LLVMScalarOpts
   InferAddressSpaces.cpp
   InferAlignment.cpp
   InstSimplifyPass.cpp
+  IsolatePath.cpp
   JumpThreading.cpp
   JumpTableToSwitch.cpp
   LICM.cpp
diff --git a/llvm/lib/Transforms/Scalar/IsolatePath.cpp b/llvm/lib/Transforms/Scalar/IsolatePath.cpp
new file mode 100644
index 0000000000000..6f561563c767f
--- /dev/null
+++ b/llvm/lib/Transforms/Scalar/IsolatePath.cpp
@@ -0,0 +1,300 @@
+//===- IsolatePath.cpp - Code to isolate paths with UB --------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass isolates code paths with undefined behavior from paths without
+// undefined behavior, and then add a trap instruction on that path. This
+// prevents code generation where, after the UB instruction's eliminated, the
+// code can wander off the end of a function.
+//
+// For example, a nullptr dereference:
+//
+//   foo:
+//     %phi.val = phi ptr [ %arrayidx.i, %pred1 ], [ null, %pred2 ]
+//     %load.val = load i32, ptr %phi.val, align 4
+//
+// is converted into:
+//
+//   foo:
+//     %load.val = load i32, ptr %ptr.val, align 4
+//
+//   foo.ub.path:
+//     %load.val.ub = load volatile i32, ptr null, align 4
+//     tail call void @llvm.trap()
+//     unreachable
+//
+// Note: we allow the NULL dereference to actually occur so that code that
+// wishes to catch the signal can do so.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Scalar/IsolatePath.h"
+#include "llvm/ADT/Statistic.h"
+#include "llvm/Analysis/DomTreeUpdater.h"
+#include "llvm/Analysis/PostDominators.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Dominators.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/Transforms/Utils/BasicBlockUtils.h"
+#include "llvm/Transforms/Utils/Cloning.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "isolate-path"
+
+STATISTIC(NumIsolatedBlocks, "Number of isolated blocks");
+
+/// Look through GEPs to see if the nullptr is accessed.
+static bool HasUBAccess(BasicBlock *Parent, GetElementPtrInst *GEP) {
+  for (Value *V : GEP->materialized_users()) {
+    if (auto *G = dyn_cast<GetElementPtrInst>(V)) {
+      if (G->getParent() != Parent)
+        return false;
+      return HasUBAccess(Parent, G);
+    } else if (auto *LI = dyn_cast<LoadInst>(V)) {
+      if (LI->getParent() != Parent)
+        return false;
+      if (GEP == LI->getPointerOperand())
+        return true;
+    } else if (auto *SI = dyn_cast<StoreInst>(V)) {
+      if (SI->getParent() != Parent)
+        return false;
+      if (GEP == SI->getPointerOperand())
+        return true;
+    }
+  }
+
+  return false;
+}
+
+static std::pair<PHINode *, Instruction *> GetFirstUBInst(BasicBlock *BB) {
+  // Find PHIs that have 'nullptr' inputs.
+  SmallPtrSet<PHINode *, 4> NullptrPhis;
+  for (PHINode &PN : BB->phis()) {
+    if (!PN.getType()->isPointerTy())
+      continue;
+
+    for (Value *V : PN.incoming_values())
+      if (isa<ConstantPointerNull>(V)) {
+        NullptrPhis.insert(&PN);
+        break;
+      }
+  }
+  if (NullptrPhis.empty())
+    return {};
+
+  // Grab instructions that may be UB.
+  SmallDenseMap<PHINode *, SmallPtrSet<Instruction *, 4>> MaybeUBInsts;
+  for (PHINode *PN : NullptrPhis) {
+    for (Value *V : PN->materialized_users()) {
+      if (auto *LI = dyn_cast<LoadInst>(V)) {
+        if (LI->getParent() == BB && PN == LI->getPointerOperand())
+          MaybeUBInsts[PN].insert(LI);
+      } else if (auto *SI = dyn_cast<StoreInst>(V)) {
+        if (SI->getParent() == BB && PN == SI->getPointerOperand())
+          MaybeUBInsts[PN].insert(SI);
+      } else if (auto *GEP = dyn_cast<GetElementPtrInst>(V)) {
+        if (GEP->getParent() == BB && HasUBAccess(BB, GEP))
+          MaybeUBInsts[PN].insert(GEP);
+      }
+    }
+  }
+  if (MaybeUBInsts.empty())
+    return {};
+
+  // Get the first UB instruction.
+  PHINode *FirstUBPhiNode = nullptr;
+  Instruction *FirstUBInst = nullptr;
+  for (auto Element : MaybeUBInsts) {
+    PHINode *PN = Element.getFirst();
+    SmallPtrSetImpl<Instruction *> &Insts = Element.getSecond();
+
+    for (Instruction &I : *BB) {
+      if (&I == FirstUBInst)
+        break;
+
+      if (Insts.contains(&I)) {
+        FirstUBPhiNode = PN;
+        FirstUBInst = &I;
+        break;
+      }
+    }
+  }
+
+  return std::make_pair(FirstUBPhiNode, FirstUBInst);
+}
+
+/// Convert any accesses of a nullptr within the BB into a trap.
+bool IsolatePathPass::ProcessPointerUndefinedBehavior(BasicBlock *BB,
+                                                      DomTreeUpdater *DTU) {
+  if (!BB->canSplitPredecessors())
+    return false;
+
+  // Get the first UB instruction and associated PHI node.
+  auto [FirstUBPhiNode, FirstUBInst] = GetFirstUBInst(BB);
+  if (!FirstUBInst)
+    return false;
+
+  // Now that we have the first UB instruction and the PHI node associated with
+  // it, determine how to split the predecessors.
+  SmallPtrSet<BasicBlock *, 4> UBPhiPreds;
+  SmallPtrSet<BasicBlock *, 4> NonUBPhiPreds;
+  unsigned Index = 0;
+  for (Value *V : FirstUBPhiNode->incoming_values())
+    if (isa<ConstantPointerNull>(V))
+      UBPhiPreds.insert(FirstUBPhiNode->getIncomingBlock(Index++));
+    else
+      NonUBPhiPreds.insert(FirstUBPhiNode->getIncomingBlock(Index++));
+
+  SmallVector<DominatorTree::UpdateType, 8> Updates;
+  BasicBlock *UBBlock = nullptr;
+  if (NonUBPhiPreds.empty()) {
+    // All PHI node values cause UB in the block. Just add the 'trap'
+    // instruction without cloning.
+    UBBlock = BB;
+
+    // Remove the block from any successors.
+    for (BasicBlock *Succ : successors(BB)) {
+      Succ->removePredecessor(BB);
+      Updates.push_back({DominatorTree::Delete, BB, Succ});
+    }
+  } else {
+    // Clone the block, isolating the UB instructions on their own path.
+    ValueToValueMapTy VMap;
+    UBBlock = CloneBasicBlock(BB, VMap, ".ub.path", BB->getParent());
+    VMap[BB] = UBBlock;
+    ++NumIsolatedBlocks;
+
+    // Replace the UB predecessors' terminators' targets with the new block.
+    llvm::for_each(UBPhiPreds, [&](BasicBlock *Pred) {
+      Pred->getTerminator()->replaceSuccessorWith(BB, UBBlock);
+    });
+
+    // Remove predecessors of isolated paths from the original PHI nodes.
+    for (PHINode &PN : BB->phis())
+      PN.removeIncomingValueIf([&](unsigned I) {
+        return UBPhiPreds.contains(PN.getIncomingBlock(I));
+      });
+
+    // Remove predecessors of valid paths from the isolated path PHI nodes.
+    for (PHINode &PN : UBBlock->phis())
+      PN.removeIncomingValueIf([&](unsigned I) {
+        return NonUBPhiPreds.contains(PN.getIncomingBlock(I));
+      });
+
+    // Rewrite the instructions in the cloned block to refer to the instructions
+    // in the cloned block.
+    for (auto &I : *UBBlock) {
+      RemapDbgRecordRange(BB->getModule(), I.getDbgRecordRange(), VMap,
+                          RF_NoModuleLevelChanges | RF_IgnoreMissingLocals);
+      RemapInstruction(&I, VMap,
+                       RF_NoModuleLevelChanges | RF_IgnoreMissingLocals);
+    }
+
+    // Update the dominator tree.
+    for (auto *Pred : UBPhiPreds) {
+      Updates.push_back({DominatorTree::Insert, Pred, UBBlock});
+      Updates.push_back({DominatorTree::Delete, Pred, BB});
+    }
+  }
+
+  // Get the index into the block of the first UB instruction.
+  unsigned UBIndex = 0;
+  for (auto Iter = BB->begin(); Iter != BB->end(); ++Iter, ++UBIndex)
+    if (&*Iter == FirstUBInst) {
+      if (isa<LoadInst>(FirstUBInst))
+        ++UBIndex;
+      break;
+    }
+
+  // Remove the instructions following the nullptr dereference.
+  for (unsigned Index = UBBlock->size(); Index > UBIndex; --Index)
+    UBBlock->rbegin()->eraseFromParent();
+
+  // Allow the NULL dereference to actually occur so that code that wishes to
+  // catch the signal can do so.
+  if (const auto *LI = dyn_cast<LoadInst>(&*UBBlock->rbegin()))
+    const_cast<LoadInst *>(LI)->setVolatile(true);
+
+  // Add a 'trap()' call followed by an 'unreachable' terminator.
+  IRBuilder<> Builder(UBBlock);
+  Function *TrapDecl =
+      Intrinsic::getOrInsertDeclaration(BB->getModule(), Intrinsic::trap);
+  Builder.CreateCall(TrapDecl);
+  Builder.CreateUnreachable();
+
+  if (!Updates.empty())
+    DTU->applyUpdates(Updates);
+
+  SplitUBBlocks.insert(UBBlock);
+  return true;
+}
+
+PreservedAnalyses IsolatePathPass::run(Function &F,
+                                       FunctionAnalysisManager &FAM) {
+  bool Changed = false;
+
+  auto &DT = FAM.getResult<DominatorTreeAnalysis>(F);
+  auto &PDT = FAM.getResult<PostDominatorTreeAnalysis>(F);
+  DomTreeUpdater DTU(&DT, &PDT, DomTreeUpdater::UpdateStrategy::Eager);
+
+  // Use a worklist of blocks because we'll be adding new blocks to the
+  // function and potentially processing the same block multiple times.
+  std::vector<BasicBlock *> Blocks;
+  Blocks.reserve(F.size());
+  llvm::transform(F, std::back_inserter(Blocks),
+                  [](BasicBlock &BB) { return &BB; });
+
+  while (!Blocks.empty()) {
+    BasicBlock *BB = Blocks.back();
+    Blocks.pop_back();
+    if (SplitUBBlocks.contains(BB))
+      continue;
+
+    // No PHI nodes.
+    if (BB->phis().empty())
+      continue;
+
+    // Ignore landing and EH pads for now.
+    // FIXME: Should we support them?
+    if (BB->isLandingPad() || BB->isEHPad())
+      continue;
+
+    // Support some of the more common predecessor terminators.
+    // FIXME: Add support for 'SwitchInst'.
+    if (llvm::any_of(predecessors(BB), [&](BasicBlock *Pred) {
+          Instruction *TI = Pred->getTerminator();
+          return !isa<BranchInst>(TI) && !isa<ReturnInst>(TI) &&
+                 !isa<SwitchInst>(TI);
+        }))
+      continue;
+
+    if (auto *BI = dyn_cast<BranchInst>(BB->getTerminator()))
+      // If a BB has an edge to itself, then duplication of BB could result in
+      // reallocation of the BB's PHI nodes.
+      if (llvm::any_of(BI->successors(),
+                       [&](BasicBlock *B) { return B == BB; }))
+        continue;
+
+    if (ProcessPointerUndefinedBehavior(BB, &DTU)) {
+      // Reprocess the block to handle further UB instructions.
+      Blocks.push_back(BB);
+      Changed = true;
+    }
+  }
+
+  if (!Changed)
+    return PreservedAnalyses::all();
+
+  // FIXME: Should we update LoopInfo and LCCSA like in SplitBlockPredecessors?
+  PreservedAnalyses PA;
+  PA.preserve<DominatorTreeAnalysis>();
+  PA.preserve<PostDominatorTreeAnalysis>();
+  return PA;
+}
diff --git a/llvm/test/Other/new-pm-defaults.ll b/llvm/test/Other/new-pm-defaults.ll
index c554fdbf4c799..491759b3d1e1a 100644
--- a/llvm/test/Other/new-pm-defaults.ll
+++ b/llvm/test/Other/new-pm-defaults.ll
@@ -156,6 +156,8 @@
 ; CHECK-O-NEXT: Running pass: EarlyCSEPass
 ; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
 ; CHECK-O-NEXT: Running analysis: AAManager
+; CHECK-O23SZ-NEXT: Running pass: IsolatePathPass
+; CHECK-O23SZ-NEXT: Running analysis: PostDominatorTreeAnalysis
 ; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
 ; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
 ; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
@@ -215,7 +217,6 @@
 ; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
 ; CHECK-O1-NEXT: Running pass: CoroElidePass
 ; CHECK-O-NEXT: Running pass: ADCEPass
-; CHECK-O23SZ-NEXT: Running analysis: PostDominatorTreeAnalysis
 ; CHECK-O23SZ-NEXT: Running pass: MemCpyOptPass
 ; CHECK-O23SZ-NEXT: Running pass: DSEPass
 ; CHECK-O23SZ-NEXT: Running pass: MoveAutoInitPass on foo
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
index 62bb02d9b3c40..6e038785eb763 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-defaults.ll
@@ -86,6 +86,8 @@
 ; CHECK-O-NEXT: Running pass: EarlyCSEPass
 ; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
 ; CHECK-O-NEXT: Running analysis: AAManager
+; CHECK-O23SZ-NEXT: Running pass: IsolatePathPass
+; CHECK-O23SZ-NEXT: Running analysis: PostDominatorTreeAnalysis
 ; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
 ; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
 ; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
@@ -140,7 +142,6 @@
 ; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
 ; CHECK-O1-NEXT: Running pass: CoroElidePass
 ; CHECK-O-NEXT: Running pass: ADCEPass
-; CHECK-O23SZ-NEXT: Running analysis: PostDominatorTreeAnalysis
 ; CHECK-O23SZ-NEXT: Running pass: MemCpyOptPass
 ; CHECK-O23SZ-NEXT: Running pass: DSEPass
 ; CHECK-O23SZ-NEXT: Running pass: MoveAutoInitPass on foo
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
index 0da7a9f73bdce..bec6d67302066 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
@@ -74,6 +74,7 @@
 ; CHECK-O-NEXT: Running pass: EarlyCSEPass
 ; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
 ; CHECK-O-NEXT: Running analysis: AAManager
+; CHECK-O23SZ-NEXT: Running pass: IsolatePathPass
 ; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
 ; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
 ; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
diff --git a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
index 38b7890682783..69ccba72c92e0 100644
--- a/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
@@ -83,6 +83,7 @@
 ; CHECK-O-NEXT: Running pass: EarlyCSEPass
 ; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
 ; CHECK-O-NEXT: Running analysis: AAManager
+; CHECK-O23SZ-NEXT: Running pass: IsolatePathPass
 ; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
 ; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
 ; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
index 5aacd26def2be..68f4d421bb6de 100644
--- a/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
+++ b/llvm/test/Other/new-pm-thinlto-prelink-defaults.ll
@@ -118,6 +118,8 @@
 ; CHECK-O-NEXT: Running pass: EarlyCSEPass
 ; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
 ; CHECK-O-NEXT: Running analysis: AAManager
+; CHECK-O23SZ-NEXT: Running pass: IsolatePathPass
+; CHECK-O23SZ-NEXT: Running analysis: PostDominatorTreeAnalysis
 ; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
 ; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
 ; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
@@ -172,7 +174,6 @@
 ; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
 ; CHECK-O1-NEXT: Running pass: CoroElidePass
 ; CHECK-O-NEXT: Running pass: ADCEPass
-; CHECK-O23SZ-NEXT: Running analysis: PostDominatorTreeAnalysis
 ; CHECK-O23SZ-NEXT: Running pass: MemCpyOptPass
 ; CHECK-O23SZ-NEXT: Running pass: DSEPass
 ; CHECK-O23SZ-NEXT: Running pass: MoveAutoInitPass
diff --git a/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll b/llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll
i...
[truncated]

@dtcxzyw
Member

dtcxzyw commented Jul 3, 2025

    foo:
      %phi.val = phi ptr [ %arrayidx.i, %pred1 ], [ null, %pred2 ]
      %load.val = load i32, ptr %phi.val, align 4
       ...

This pass should run before SimplifyCFGPass. Otherwise, the %pred2 -> %foo edge will be removed in removeUndefIntroducingPredecessor:

    // Load from null is undefined.
    if (LoadInst *LI = dyn_cast<LoadInst>(User))
      if (!LI->isVolatile())
        return !NullPointerIsDefined(LI->getFunction(),
                                     LI->getPointerAddressSpace());
    // Store to null is undefined.
    if (StoreInst *SI = dyn_cast<StoreInst>(User))
      if (!SI->isVolatile())
        return (!NullPointerIsDefined(SI->getFunction(),
                                      SI->getPointerAddressSpace())) &&
               SI->getPointerOperand() == I;
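Concretely (a sketch of the effect, reusing the names from the snippet above): once removeUndefIntroducingPredecessor drops the %pred2 -> %foo edge, the PHI loses its null input and the block folds to

    foo:
      %load.val = load i32, ptr %arrayidx.i, align 4
       ...

leaving no null-PHI pattern for this pass to find.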

@nikic
Contributor

nikic commented Jul 3, 2025

I fundamentally disagree with what this is trying to do. I want paths that always lead to UB to be eliminated. Both as a user of C/C++, but especially as a user of memory safe languages.

I might accept this as an optional hardening mode, but I don't think this has any business being part of the default pipeline. (And viewed as an optional hardening mode, I'm not sure a dedicated pass is the best way to do this, for phase ordering reasons. I'd expect disabling specific optimizations that aggressively backpropagate UB based on an attribute would be preferable.)

(My previous guidance to expose trap-unreachable in clang and enable it by default stands.)

@bwendling
Collaborator Author

bwendling commented Jul 7, 2025

I fundamentally disagree with what this is trying to do. I want paths that always lead to UB to be eliminated. Both as a user of C/C++, but especially as a user of memory safe languages.

This pass does exactly this. It isolates the paths with explicit undefined behavior (nullptr accesses) so that they can be eliminated. I suppose the only controversial element here would be a retained load and use of __builtin_trap() on the UB path. How this differs from what we currently have is that it applies a finer-toothed comb to the code. It isolates the bad paths so that the good paths aren't affected.

I might accept this as an optional hardening mode, but I don't think this has any business being part of the default pipeline. (And viewed as an optional hardening mode, I'm not sure a dedicated pass is the best way to do this, for phase ordering reasons. I'd expect disabling specific optimizations that aggressively backpropagate UB based on an attribute would be preferable.)

This isn't a hardening pass. This is a simple pass to remove UB code from the control path. I don't understand your objection to doing exactly what you say you want done...

(My previous guidance to expose trap-unreachable in clang and enable it by default stands.)

What guidance?

Here is the RFC for the same feature in GCC: https://gcc.gnu.org/pipermail/gcc-patches/2013-October/372710.html

@nikic
Contributor

nikic commented Jul 7, 2025

I fundamentally disagree with what this is trying to do. I want paths that always lead to UB to be eliminated. Both as a user of C/C++, but especially as a user of memory safe languages.

This pass does exactly this. It isolates the paths with explicit undefined behavior (nullptr accesses) so that they can be eliminated. I suppose the only controversial element here would be a retained load and use of __builtin_trap() on the UB path. How this differs from what we currently have is that it applies a finer-toothed comb to the code. It isolates the bad paths so that the good paths aren't affected.

Retaining the load and trap is exactly the problem. This means that the UB-triggering path is explicitly retained, not optimized away. Most importantly, the control flow deciding whether to take that path is not optimized away.

This is also why I called this a hardening pass: It replaces UB with traps.

What guidance?

That function fallthrough due to UB can be prevented using trap-unreachable mode, and I'd be open to having it enabled by default. This is in reference to this code comment: "This prevents code generation where, after the UB instruction's eliminated, the code can wander off the end of a function."

Maybe I misunderstood everything here, and your actual intention is to strengthen the optimization performed by removeUndefIntroducingPredecessor() using a dedicated pass -- but in that case this code should be eliminating UB paths, not converting them to traps, and function fallthrough should be addressed via trap-unreachable.

@bwendling
Collaborator Author

I fundamentally disagree with what this is trying to do. I want paths that always lead to UB to be eliminated. Both as a user of C/C++, but especially as a user of memory safe languages.

This pass does exactly this. It isolates the paths with explicit undefined behavior (nullptr accesses) so that they can be eliminated. I suppose the only controversial element here would be a retained load and use of __builtin_trap() on the UB path. How this differs from what we currently have is that it applies a finer-toothed comb to the code. It isolates the bad paths so that the good paths aren't affected.

Retaining the load and trap is exactly the problem. This means that the UB-triggering path is explicitly retained, not optimized away. Most importantly, the control flow deciding whether to take that path is not optimized away.

This is also why I called this a hardening pass: It replaces UB with traps.

Okay. Then I can put those parts behind a flag. They're a "nicety" to preserve some semantics, but probably aren't 100% necessary. :-)

What guidance?

That function fallthrough due to UB can be prevented using trap-unreachable mode, and I'd be open to having it enabled by default. This is in reference to this code comment: "This prevents code generation where, after the UB instruction's eliminated, the code can wander off the end of a function."

I agree that a trap-unreachable is far better than the alternative.

Maybe I misunderstood everything here, and your actual intention is to strengthen the optimization performed by removeUndefIntroducingPredecessor() using a dedicated pass -- but in that case this code should be eliminating UB paths, not converting them to traps, and function fallthrough should be addressed via trap-unreachable.

It's easy to remove them. I'll submit a change.

…hs are UB, allow further passes to handle them.
@bwendling
Collaborator Author

    foo:
      %phi.val = phi ptr [ %arrayidx.i, %pred1 ], [ null, %pred2 ]
      %load.val = load i32, ptr %phi.val, align 4
       ...

This pass should run before SimplifyCFGPass. Otherwise, the %pred2 -> %foo edge will be removed in removeUndefIntroducingPredecessor:

    // Load from null is undefined.
    if (LoadInst *LI = dyn_cast<LoadInst>(User))
      if (!LI->isVolatile())
        return !NullPointerIsDefined(LI->getFunction(),
                                     LI->getPointerAddressSpace());
    // Store to null is undefined.
    if (StoreInst *SI = dyn_cast<StoreInst>(User))
      if (!SI->isVolatile())
        return (!NullPointerIsDefined(SI->getFunction(),
                                      SI->getPointerAddressSpace())) &&
               SI->getPointerOperand() == I;

I don't want this to run before SimplifyCFGPass precisely because of this. :-) I want the undefined behavior to remain so that it can be teased out into its own "path" (i.e. a block containing an unreachable instruction). The SimplifyCFGPass will then do its magic to remove the blocks.

@bwendling
Collaborator Author

I updated the code to produce an unreachable instruction instead of the trap-unreachable sequence (which is now behind a -isolate-ub-path-to-trap-unreachable flag). Also updated the description to reflect reality.

PTAL.
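For anyone wanting to try this locally, a sketch of how the pass can be driven with opt ('isolate-path' is the name registered in PassRegistry.def in this patch; the trap-unreachable flag is assumed to be exposed as a plain opt option, as the description suggests):

    ; Default: the isolated UB path ends in 'unreachable'.
    ; RUN: opt -passes=isolate-path -S < %s

    ; Optional: keep the volatile null access and trap on the UB path.
    ; RUN: opt -passes=isolate-path -isolate-ub-path-to-trap-unreachable -S < %s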

This commit provides several improvements to the IsolatePath pass:

- Refactors the main processing function into smaller, more focused
  helpers to improve readability and maintainability.
- Fixes a bug where the pass would crash with a 'terminator in the
  middle of a basic block' error due to incorrect instruction mapping in
  cloned blocks.
- Ensures LoopInfo is correctly updated when a block within a loop is
  split, and preserves the analysis.
- Removes stale FIXME comments regarding SwitchInst support.
- Expands the test suite to cover more complex scenarios, including UB
  in loops and blocks terminated by SwitchInst.
@bwendling
Collaborator Author

Friendly ping.

I did some work to improve testing and make the code not smelly.

@vitalybuka
Collaborator

vitalybuka commented Jul 17, 2025

How is this different from -fsanitize=null -fsanitize-trap=null, which is likely more precise?

@bwendling
Collaborator Author

How is this different from -fsanitize=null -fsanitize-trap=null, which is likely more precise?

This pass would run by default. The flags introduce calls to `@llvm.ubsantrap`, which isn't what some people want. I'm not sure how it would be more precise as I assume they do roughly the same thing.
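For comparison, a rough sketch of what -fsanitize=null -fsanitize-trap=null emits for a dereference of a possibly-null pointer %p (the block names and the ubsantrap check ID are illustrative):

    %notnull = icmp ne ptr %p, null
    br i1 %notnull, label %cont, label %trap

    trap:
      call void @llvm.ubsantrap(i8 22)      ; check ID illustrative
      unreachable

    cont:
      %v = load i32, ptr %p, align 4

(The check is emitted as explicit control flow by the frontend.)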

@dtcxzyw
Member

dtcxzyw commented Jul 18, 2025

This pass would run by default.

I agree with enabling the check by default in the future. According to my previous experiment https://github.com/dtcxzyw/conservative-clang and https://web.ist.utl.pt/nuno.lopes/pubs/ub-pldi25.pdf, disabling UB exploitation may cause approximately a 10% performance regression without LTO. Fortunately, it can often be recovered with LTO. The reason for my objection to #137741 is that undefined behavior in C/C++ is different from undefined behavior in LLVM IR. We can recover the performance because we can recover the UB-related flags/attributes/metadata in the middle end.

The flags introduce calls to `@llvm.ubsantrap`, which isn't what some people want.

Currently, clang allows users to specify an error-handling callback with -ftrap-function=custom_handler: https://godbolt.org/z/Mnb9G6Y55. That is what I wanted to say in #137741 (comment). If we really need a trap instead of calling a custom handler with an error code, we can introduce a new flag and emit llvm.trap here:

    } else {
      TrapBB = createBasicBlock("trap");
      Builder.CreateCondBr(Checked, Cont, TrapBB,
                           MDHelper.createLikelyBranchWeights());
      EmitBlock(TrapBB);
      llvm::CallInst *TrapCall =
          Builder.CreateCall(CGM.getIntrinsic(llvm::Intrinsic::ubsantrap),
                             llvm::ConstantInt::get(CGM.Int8Ty, CheckHandlerID));
      if (!CGM.getCodeGenOpts().TrapFuncName.empty()) {
        auto A = llvm::Attribute::get(getLLVMContext(), "trap-func-name",
                                      CGM.getCodeGenOpts().TrapFuncName);
        TrapCall->addFnAttr(A);
      }
      if (NoMerge)
        TrapCall->addFnAttr(llvm::Attribute::NoMerge);
      TrapCall->setDoesNotReturn();
      TrapCall->setDoesNotThrow();
      Builder.CreateUnreachable();
    }
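With -ftrap-function=custom_handler, the quoted code only attaches the "trap-func-name" string attribute to the trap call; the backend is what turns it into a call to the named handler. Roughly (the check ID and attribute group number are illustrative):

    call void @llvm.ubsantrap(i8 22) #1
    unreachable
    ...
    attributes #1 = { "trap-func-name"="custom_handler" }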

I'm not sure how it would be more precise as I assume they do roughly the same thing.

Obviously, it is more precise (and slower) than adding a hardening pass in the middle of the pipeline, since some UB may already have been optimized away before this pass runs. If we move the pass to the start of the optimization pipeline, there is no difference between this solution and the clang-codegen solution.

@vitalybuka
Collaborator

How is this different from -fsanitize=null -fsanitize-trap=null, which is likely more precise?

This pass would run by default. The flags introduce calls to `@llvm.ubsantrap`, which isn't what some people want. I'm not sure how it would be more precise as I assume they do roughly the same thing.

"Precise" in my comment means that a "miscompile" can already happen before the pass, and I am not sure I understand why a miscompile can't happen after such a transformation.

Also, the patch does not explain why trap() would be less desirable than the load from nullptr. Either way should end up with termination, so why do we need something other than a trap?
