Skip to content

Conversation

@mtrofin
Copy link
Member

@mtrofin mtrofin commented Oct 18, 2025

In the case of a partial unswitch, we take the invariant part of an expression consisting of either conjunctions or disjunctions, and hoist it out of the loop, conditioning a branch on it (==the invariant part). We can't correctly calculate the branch probability of this new branch, but can use the probability of the existing branch as a bound. That would preserve block frequencies better than allowing for the default, static (50-50) probability for that branch.

Issue #147390

@mtrofin mtrofin marked this pull request as ready for review October 18, 2025 00:12
@llvmbot
Copy link
Member

llvmbot commented Oct 18, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Mircea Trofin (mtrofin)

Changes

When


Full diff: https://github.com/llvm/llvm-project/pull/164035.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp (+43-8)
  • (added) llvm/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch-profile.ll (+89)
  • (added) llvm/test/Transforms/SimpleLoopUnswitch/simple-unswitch-profile.ll (+157)
diff --git a/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp b/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
index e4ba70d1bce16..1ed7fd482b9de 100644
--- a/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
+++ b/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
@@ -82,6 +82,7 @@ STATISTIC(
 STATISTIC(NumInvariantConditionsInjected,
           "Number of invariant conditions injected and unswitched");
 
+namespace llvm {
 static cl::opt<bool> EnableNonTrivialUnswitch(
     "enable-nontrivial-unswitch", cl::init(false), cl::Hidden,
     cl::desc("Forcibly enables non-trivial loop unswitching rather than "
@@ -132,11 +133,17 @@ static cl::opt<bool> InjectInvariantConditions(
 
 static cl::opt<unsigned> InjectInvariantConditionHotnesThreshold(
     "simple-loop-unswitch-inject-invariant-condition-hotness-threshold",
-    cl::Hidden, cl::desc("Only try to inject loop invariant conditions and "
-                         "unswitch on them to eliminate branches that are "
-                         "not-taken 1/<this option> times or less."),
+    cl::Hidden,
+    cl::desc("Only try to inject loop invariant conditions and "
+             "unswitch on them to eliminate branches that are "
+             "not-taken 1/<this option> times or less."),
     cl::init(16));
 
+static cl::opt<bool> EstimateProfile("simple-loop-unswitch-estimate-profile",
+                                     cl::Hidden, cl::init(true));
+extern cl::opt<bool> ProfcheckDisableMetadataFixes;
+} // namespace llvm
+
 AnalysisKey ShouldRunExtraSimpleLoopUnswitch::Key;
 namespace {
 struct CompareDesc {
@@ -272,10 +279,33 @@ static bool areLoopExitPHIsLoopInvariant(const Loop &L,
 /// Copy a set of loop invariant values \p ToDuplicate and insert them at the
 /// end of \p BB and conditionally branch on the copied condition. We only
 /// branch on a single value.
+/// We attempt to estimate the profile of the resulting conditional branch from
+/// \p ComputeProfFrom, which is the original conditional branch we're
+/// unswitching.
 static void buildPartialUnswitchConditionalBranch(
     BasicBlock &BB, ArrayRef<Value *> Invariants, bool Direction,
     BasicBlock &UnswitchedSucc, BasicBlock &NormalSucc, bool InsertFreeze,
-    const Instruction *I, AssumptionCache *AC, const DominatorTree &DT) {
+    const Instruction *I, AssumptionCache *AC, const DominatorTree &DT,
+    const BranchInst &ComputeProfFrom) {
+
+  SmallVector<uint32_t> BranchWeights;
+  bool HasBranchWeights = EstimateProfile && !ProfcheckDisableMetadataFixes &&
+                          extractBranchWeights(ComputeProfFrom, BranchWeights);
+  // If Direction is true, that means we had a disjunction and that the "true"
+  // case exits. The probability of the disjunction of the subset of terms is at
+  // most as high as the original one. So, if the probability is higher than the
+  // one we'd assign in absence of a profile (i.e. 0.5), we will use 0.5,
+  // but if it's lower, we will use the original probability.
+  // Coversely, if Direction is false, that means we had a conjunction, and the
+  // probability of exiting is captured in the second branch weight. That
+  // probability is a disjunction (of the negation of the original terms). The
+  // same reasoning applies as above.
+  if (HasBranchWeights &&
+      static_cast<double>(BranchWeights[Direction ? 0 : 1]) /
+              static_cast<double>(sum_of(BranchWeights)) >
+          0.5)
+    HasBranchWeights = false;
+
   IRBuilder<> IRB(&BB);
   IRB.SetCurrentDebugLocation(DebugLoc::getCompilerGenerated());
 
@@ -288,8 +318,13 @@ static void buildPartialUnswitchConditionalBranch(
 
   Value *Cond = Direction ? IRB.CreateOr(FrozenInvariants)
                           : IRB.CreateAnd(FrozenInvariants);
-  IRB.CreateCondBr(Cond, Direction ? &UnswitchedSucc : &NormalSucc,
-                   Direction ? &NormalSucc : &UnswitchedSucc);
+  auto *BR = IRB.CreateCondBr(
+      Cond, Direction ? &UnswitchedSucc : &NormalSucc,
+      Direction ? &NormalSucc : &UnswitchedSucc,
+      HasBranchWeights ? ComputeProfFrom.getMetadata(LLVMContext::MD_prof)
+                       : nullptr);
+  if (!HasBranchWeights)
+    setExplicitlyUnknownBranchWeights(*BR, DEBUG_TYPE);
 }
 
 /// Copy a set of loop invariant values, and conditionally branch on them.
@@ -659,7 +694,7 @@ static bool unswitchTrivialBranch(Loop &L, BranchInst &BI, DominatorTree &DT,
              " condition!");
     buildPartialUnswitchConditionalBranch(
         *OldPH, Invariants, ExitDirection, *UnswitchedBB, *NewPH,
-        FreezeLoopUnswitchCond, OldPH->getTerminator(), nullptr, DT);
+        FreezeLoopUnswitchCond, OldPH->getTerminator(), nullptr, DT, BI);
   }
 
   // Update the dominator tree with the added edge.
@@ -2478,7 +2513,7 @@ static void unswitchNontrivialInvariants(
     else {
       buildPartialUnswitchConditionalBranch(
           *SplitBB, Invariants, Direction, *ClonedPH, *LoopPH,
-          FreezeLoopUnswitchCond, BI, &AC, DT);
+          FreezeLoopUnswitchCond, BI, &AC, DT, *BI);
     }
     DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
 
diff --git a/llvm/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch-profile.ll b/llvm/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch-profile.ll
new file mode 100644
index 0000000000000..9cc417f6b874e
--- /dev/null
+++ b/llvm/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch-profile.ll
@@ -0,0 +1,89 @@
+; RUN: split-file %s %t
+; RUN: cat %t/main.ll %t/probable-or.prof > %t/probable-or.ll
+; RUN: cat %t/main.ll %t/probable-and.prof > %t/probable-and.ll
+; RUN: opt -passes='loop(simple-loop-unswitch<nontrivial>)' -S %t/probable-or.ll -o -| FileCheck %t/probable-or.prof
+; RUN: opt -passes='loop(simple-loop-unswitch<nontrivial>)' -S %t/probable-and.ll -o -| FileCheck %t/probable-and.prof
+
+;--- main.ll
+declare i32 @a()
+declare i32 @b()
+
+define i32 @or(ptr %ptr, i1 %cond) !prof !0 {
+entry:
+  br label %loop_begin
+
+loop_begin:
+  %v1 = load i1, ptr %ptr
+  %cond_or = or i1 %v1, %cond
+  br i1 %cond_or, label %loop_a, label %loop_b, !prof !1
+
+loop_a:
+  call i32 @a()
+  br label %latch
+
+loop_b:
+  call i32 @b()
+  br label %latch
+
+latch:
+  %v2 = load i1, ptr %ptr
+  br i1 %v2, label %loop_begin, label %loop_exit, !prof !2
+
+loop_exit:
+  ret i32 0
+}
+
+define i32 @and(ptr %ptr, i1 %cond) !prof !0 {
+entry:
+  br label %loop_begin
+
+loop_begin:
+  %v1 = load i1, ptr %ptr
+  %cond_and = and i1 %v1, %cond
+  br i1 %cond_and, label %loop_a, label %loop_b, !prof !1
+
+loop_a:
+  call i32 @a()
+  br label %latch
+
+loop_b:
+  call i32 @b()
+  br label %latch
+
+latch:
+  %v2 = load i1, ptr %ptr
+  br i1 %v2, label %loop_begin, label %loop_exit, !prof !2
+
+loop_exit:
+  ret i32 0
+}
+
+;--- probable-or.prof
+!0 = !{!"function_entry_count", i32 10}
+!1 = !{!"branch_weights", i32 1, i32 1000}
+!2 = !{!"branch_weights", i32 5, i32 7}
+; CHECK-LABEL: @or
+; CHECK-LABEL: entry:
+; CHECK-NEXT:   %cond.fr = freeze i1 %cond
+; CHECK-NEXT:   br i1 %cond.fr, label %entry.split.us, label %entry.split, !prof !1
+; CHECK-LABEL: @and
+; CHECK-LABEL: entry:
+; CHECK-NEXT:   %cond.fr = freeze i1 %cond
+; CHECK-NEXT:   br i1 %cond.fr, label %entry.split, label %entry.split.us, !prof !3
+; CHECK: !1 = !{!"branch_weights", i32 1, i32 1000}
+; CHECK: !3 = !{!"unknown", !"simple-loop-unswitch"}
+
+;--- probable-and.prof
+!0 = !{!"function_entry_count", i32 10}
+!1 = !{!"branch_weights", i32 1000, i32 1}
+!2 = !{!"branch_weights", i32 5, i32 7}
+; CHECK-LABEL: @or
+; CHECK-LABEL: entry:
+; CHECK-NEXT:   %cond.fr = freeze i1 %cond
+; CHECK-NEXT:   br i1 %cond.fr, label %entry.split.us, label %entry.split, !prof !1
+; CHECK-LABEL: @and
+; CHECK-LABEL: entry:
+; CHECK-NEXT:   %cond.fr = freeze i1 %cond
+; CHECK-NEXT:   br i1 %cond.fr, label %entry.split, label %entry.split.us, !prof !3
+; CHECK: !1 = !{!"unknown", !"simple-loop-unswitch"}
+; CHECK: !3 = !{!"branch_weights", i32 1000, i32 1}
diff --git a/llvm/test/Transforms/SimpleLoopUnswitch/simple-unswitch-profile.ll b/llvm/test/Transforms/SimpleLoopUnswitch/simple-unswitch-profile.ll
new file mode 100644
index 0000000000000..ec6baa5b3772f
--- /dev/null
+++ b/llvm/test/Transforms/SimpleLoopUnswitch/simple-unswitch-profile.ll
@@ -0,0 +1,157 @@
+; RUN: split-file %s %t
+; RUN: cat %t/main.ll %t/probable-or.prof > %t/probable-or.ll
+; RUN: cat %t/main.ll %t/probable-and.prof > %t/probable-and.ll
+; RUN: opt -passes='loop-mssa(simple-loop-unswitch)' -S %t/probable-or.ll -o - | FileCheck %t/probable-or.prof
+; RUN: opt -passes='loop-mssa(simple-loop-unswitch)' -S %t/probable-and.ll -o - | FileCheck %t/probable-and.prof
+;
+; RUN: opt -passes='module(print<block-freq>),function(loop-mssa(simple-loop-unswitch)),module(print<block-freq>)' \
+; RUN:   %t/probable-or.ll -disable-output -simple-loop-unswitch-estimate-profile=0 2>&1 | FileCheck %t/probable-or.prof --check-prefixes=PROFILE-COM,PROFILE-REF
+
+; RUN: opt -passes='module(print<block-freq>),function(loop-mssa(simple-loop-unswitch)),module(print<block-freq>)' \
+; RUN:   %t/probable-or.ll -disable-output -simple-loop-unswitch-estimate-profile=1 2>&1 | FileCheck %t/probable-or.prof --check-prefixes=PROFILE-COM,PROFILE-CHK
+
+; RUN: opt -passes='module(print<block-freq>),function(loop-mssa(simple-loop-unswitch)),module(print<block-freq>)' \
+; RUN:   %t/probable-and.ll -disable-output -simple-loop-unswitch-estimate-profile=0 2>&1 | FileCheck %t/probable-and.prof --check-prefixes=PROFILE-COM,PROFILE-REF
+
+; RUN: opt -passes='module(print<block-freq>),function(loop-mssa(simple-loop-unswitch)),module(print<block-freq>)' \
+; RUN:   %t/probable-and.ll -disable-output -simple-loop-unswitch-estimate-profile=1 2>&1 | FileCheck %t/probable-and.prof --check-prefixes=PROFILE-COM,PROFILE-CHK
+
+;--- main.ll
+declare void @some_func() noreturn
+
+define i32 @or(i1 %cond1, i32 %var1) !prof !0 {
+entry:
+  br label %loop_begin
+
+loop_begin:
+  %var3 = phi i32 [%var1, %entry], [%var2, %do_something]
+  %cond2 = icmp eq i32 %var3, 10
+  %cond.or = or i1 %cond1, %cond2
+  br i1 %cond.or, label %loop_exit, label %do_something, !prof !1
+
+do_something:
+  %var2 = add i32 %var3, 1
+  call void @some_func() noreturn nounwind
+  br label %loop_begin
+
+loop_exit:
+  ret i32 0
+}
+
+define i32 @and(i1 %cond1, i32 %var1) !prof !0 {
+entry:
+  br label %loop_begin
+
+loop_begin:
+  %var3 = phi i32 [%var1, %entry], [%var2, %do_something]
+  %cond2 = icmp eq i32 %var3, 10
+  %cond.and = and i1 %cond1, %cond2
+  br i1 %cond.and, label %do_something, label %loop_exit, !prof !1
+
+do_something:
+  %var2 = add i32 %var3, 1
+  call void @some_func() noreturn nounwind
+  br label %loop_begin
+
+loop_exit:
+  ret i32 0
+}
+
+;--- probable-or.prof
+!0 = !{!"function_entry_count", i32 10}
+!1 = !{!"branch_weights", i32 1, i32 1000}
+; CHECK-LABEL: @or
+; CHECK-LABEL: entry:
+; CHECK-NEXT:   %cond1.fr = freeze i1 %cond1
+; CHECK-NEXT:   br i1 %cond1.fr, label %loop_exit.split, label %entry.split, !prof !1
+; CHECK-LABEL: @and
+; CHECK-LABEL: entry:
+; CHECK-NEXT:   %cond1.fr = freeze i1 %cond1
+; CHECK-NEXT:   br i1 %cond1.fr, label %entry.split, label %loop_exit.split, !prof !2
+; CHECK: !1 = !{!"branch_weights", i32 1, i32 1000}
+; CHECK: !2 = !{!"unknown", !"simple-loop-unswitch"}
+
+; PROFILE-COM: Printing analysis results of BFI for function 'or':
+; PROFILE-COM: block-frequency-info: or
+ ; PROFILE-COM: - entry: {{.*}} count = 10
+ ; PROFILE-COM: - loop_begin: {{.*}} count = 10010
+ ; PROFILE-COM: - do_something: {{.*}} count = 10000
+ ; PROFILE-COM: - loop_exit: {{.*}} count = 10
+
+; PROFILE-COM: Printing analysis results of BFI for function 'and':
+; PROFILE-COM: block-frequency-info: and
+ ; PROFILE-COM: - entry: {{.*}} count = 10
+ ; PROFILE-COM: - loop_begin: {{.*}} count = 10
+ ; PROFILE-COM: - do_something: {{.*}} count = 0
+ ; PROFILE-COM: - loop_exit: {{.*}} count = 10
+
+; PROFILE-COM: Printing analysis results of BFI for function 'or':
+; PROFILE-COM: block-frequency-info: or
+ ; PROFILE-COM: - entry: {{.*}} count = 10
+ ; PROFILE-REF: - entry.split: {{.*}} count = 5
+ ; PROFILE-CHK: - entry.split: {{.*}} count = 10
+ ; PROFILE-REF: - loop_begin: {{.*}} count = 5005
+ ; PROFILE-CHK: - loop_begin: {{.*}} count = 10000
+ ; PROFILE-REF: - do_something: {{.*}} count = 5000
+ ; PROFILE-CHK: - do_something: {{.*}} count = 9990
+ ; PROFILE-REF: - loop_exit: {{.*}} count = 5
+ ; PROFILE-CHK: - loop_exit: {{.*}} count = 10
+ ; PROFILE-COM: - loop_exit.split: {{.*}} count = 10
+
+; PROFILE-COM: Printing analysis results of BFI for function 'and':
+; PROFILE-COM: block-frequency-info: and
+ ; PROFILE-COM: - entry: {{.*}} count = 10
+ ; PROFILE-COM: - entry.split: {{.*}} count = 5
+ ; PROFILE-COM: - loop_begin: {{.*}} count = 5
+ ; PROFILE-COM: - do_something: {{.*}} count = 0
+ ; PROFILE-COM: - loop_exit: {{.*}} count = 5
+ ; PROFILE-COM: - loop_exit.split: {{.*}} count = 10
+
+;--- probable-and.prof
+!0 = !{!"function_entry_count", i32 10}
+!1 = !{!"branch_weights", i32 1000, i32 1}
+; CHECK-LABEL: @or
+; CHECK-LABEL: entry:
+; CHECK-NEXT:   %cond1.fr = freeze i1 %cond1
+; CHECK-NEXT:   br i1 %cond1.fr, label %loop_exit.split, label %entry.split, !prof !1
+; CHECK-LABEL: @and
+; CHECK-LABEL: entry:
+; CHECK-NEXT:   %cond1.fr = freeze i1 %cond1
+; CHECK-NEXT:   br i1 %cond1.fr, label %entry.split, label %loop_exit.split, !prof !2
+; CHECK: !1 = !{!"unknown", !"simple-loop-unswitch"}
+; CHECK: !2 = !{!"branch_weights", i32 1000, i32 1}
+; PROFILE-COM: Printing analysis results of BFI for function 'or':
+; PROFILE-COM: block-frequency-info: or
+ ; PROFILE-COM: - entry: {{.*}}, count = 10
+ ; PROFILE-COM: - loop_begin: {{.*}}, count = 10
+ ; PROFILE-COM: - do_something: {{.*}}, count = 0
+ ; PROFILE-COM: - loop_exit: {{.*}}, count = 10
+
+; PROFILE-COM: Printing analysis results of BFI for function 'and':
+; PROFILE-COM: block-frequency-info: and
+ ; PROFILE-COM: - entry: {{.*}} count = 10
+ ; PROFILE-COM: - loop_begin: {{.*}} count = 10010
+ ; PROFILE-COM: - do_something: {{.*}} count = 10000
+ ; PROFILE-COM: - loop_exit: {{.*}} count = 10
+
+; PROFILE-COM: Printing analysis results of BFI for function 'or':
+; PROFILE-COM: block-frequency-info: or
+ ; PROFILE-COM: - entry: {{.*}} count = 10
+ ; PROFILE-COM: - entry.split: {{.*}} count = 5
+ ; PROFILE-COM: - loop_begin: {{.*}} count = 5
+ ; PROFILE-COM: - do_something: {{.*}} count = 0
+ ; PROFILE-COM: - loop_exit: {{.*}} count = 5
+ ; PROFILE-COM: - loop_exit.split: {{.*}} count = 10
+
+; PROFILE-COM: Printing analysis results of BFI for function 'and':
+; PROFILE-COM: block-frequency-info: and
+ ; PROFILE-COM: - entry: {{.*}} count = 10
+ ; PROFILE-REF: - entry.split: {{.*}} count = 5
+ ; PROFILE-CHK: - entry.split: {{.*}} count = 10
+ ; PROFILE-REF: - loop_begin: {{.*}} count = 5005
+ ; PROFILE-CHK: - loop_begin: {{.*}} count = 10000
+ ; PROFILE-REF: - do_something: {{.*}} count = 5000
+ ; PROFILE-CHK: - do_something: {{.*}} count = 9990
+ ; PROFILE-REF: - loop_exit: {{.*}} count = 5
+ ; PROFILE-CHK: - loop_exit: {{.*}} count = 10
+ ; PROFILE-COM: - loop_exit.split: {{.*}} count = 10

Copy link
Member Author

mtrofin commented Oct 18, 2025

Come to think of it, maybe a possibility is to use the BFI of the loop header before the transformation and calculate the BPI to preserve it.

STATISTIC(NumInvariantConditionsInjected,
"Number of invariant conditions injected and unswitched");

namespace llvm {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change useful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fly-by akin to #161240, and in particular because ProfcheckDisableMetadataFixes is llvm-qualified, so if I was going to add llvm { might as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the link. If adding namespace llvm {...} around static variables is worthwhile, it would be nice to extend the coding standards discussion of namespaces vs. static. @nhaehnle

; PROFILE-REF: - loop_begin: {{.*}} count = 5005
; PROFILE-CHK: - loop_begin: {{.*}} count = 10000
; PROFILE-REF: - do_something: {{.*}} count = 5000
; PROFILE-CHK: - do_something: {{.*}} count = 9990
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, do_something would have count = 10000 as before the transformation, right? It seems the problem is that the new branch weights assign the original branch probability to each of the hoisted condition set and the unhoisted condition set, but instead the product of their probabilities should still be the original probability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - which is why I was wondering in the PR-level comment if maybe I should use BFIs as the invariants from which to calculate the new !prof. I didn't think hard about it, and it might be more complex than this change, hence offered it as a point of discussion. If we think there's some merit to it, I would prefer adding an issue for looking more at it in phase 2 (when we look at !prof values); and land this change, as it is still an improvement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in reply to that comment, my understanding is that the current PR is strictly an improvement, so it seems fine to me to figure out further improvements later.

@jdenny-ornl
Copy link
Collaborator

As I understand it, the current PR strictly improves the accuracy of probabilities but falls short of achieving the BFI from before the transformation in two ways:

  1. If it cannot prove the probability upper bound is less than .5, it offers no improvement. It's this limitation that ensures this PR cannot worsen accuracy.
  2. It does not adjust the probability of the unhoisted conditions to compensate for the probability added to the hoisted conditions. The combined probabilities cause BFI inaccuracies, as demonstrated in this test.

Am I understanding correctly?

If 2 can be addressed, I wonder if the limitation of 1 is necessary. But I'm not sure how you would decide to split the probability between the hoisted conditions and the unhoisted conditions.

Come to think of it, maybe a possibility is to use the BFI of the loop header before the transformation and calculate the BPI to preserve it.

It does seem like the original BFI should guide the above decision, if feasible.

@mtrofin mtrofin force-pushed the users/mtrofin/10-17-_slu_profcheck_estimate_branch_weights_in_partial_unswitch_cases branch from 9914c83 to d18d628 Compare October 29, 2025 15:02
Copy link
Collaborator

@jdenny-ornl jdenny-ornl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for the mangled comment.

Comment on lines 297 to 298
// If Direction is true, that means we had a disjunction and that the "true"
// Conversely, if Direction is false, that means we had a conjunction, and the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something went wrong here. The corrected "Conversely" sentence replaced the wrong sentence.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably stumbled on my fingers.

STATISTIC(NumInvariantConditionsInjected,
"Number of invariant conditions injected and unswitched");

namespace llvm {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the link. If adding namespace llvm {...} around static variables is worthwhile, it would be nice to extend the coding standards discussion of namespaces vs. static. @nhaehnle

; PROFILE-REF: - loop_begin: {{.*}} count = 5005
; PROFILE-CHK: - loop_begin: {{.*}} count = 10000
; PROFILE-REF: - do_something: {{.*}} count = 5000
; PROFILE-CHK: - do_something: {{.*}} count = 9990
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in reply to that comment, my understanding is that the current PR is strictly an improvement, so it seems fine to me to figure out further improvements later.

@mtrofin mtrofin force-pushed the users/mtrofin/10-17-_slu_profcheck_estimate_branch_weights_in_partial_unswitch_cases branch from d18d628 to 90d5f03 Compare October 30, 2025 05:30
@mtrofin mtrofin force-pushed the users/mtrofin/10-17-_slu_profcheck_estimate_branch_weights_in_partial_unswitch_cases branch from 90d5f03 to 92337a6 Compare October 30, 2025 15:09
Copy link
Member Author

mtrofin commented Oct 30, 2025

Merge activity

  • Oct 30, 5:35 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Oct 30, 5:37 PM UTC: @mtrofin merged this pull request with Graphite.

@mtrofin mtrofin merged commit e63f0f5 into main Oct 30, 2025
10 checks passed
@mtrofin mtrofin deleted the users/mtrofin/10-17-_slu_profcheck_estimate_branch_weights_in_partial_unswitch_cases branch October 30, 2025 17:37
luciechoi pushed a commit to luciechoi/llvm-project that referenced this pull request Nov 1, 2025
…lvm#164035)

In the case of a partial unswitch, we take the invariant part of an expression consisting of either conjunctions or disjunctions, and hoist it out of the loop, conditioning a branch on it (==the invariant part). We can't correctly calculate the branch probability of this new branch, but can use the probability of the existing branch as a bound. That would preserve block frequencies better than allowing for the default, static (50-50) probability for that branch.

Issue llvm#147390
DEBADRIBASAK pushed a commit to DEBADRIBASAK/llvm-project that referenced this pull request Nov 3, 2025
…lvm#164035)

In the case of a partial unswitch, we take the invariant part of an expression consisting of either conjunctions or disjunctions, and hoist it out of the loop, conditioning a branch on it (==the invariant part). We can't correctly calculate the branch probability of this new branch, but can use the probability of the existing branch as a bound. That would preserve block frequencies better than allowing for the default, static (50-50) probability for that branch.

Issue llvm#147390
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants