Skip to content

Conversation

mtrofin
Copy link
Member

@mtrofin mtrofin commented Sep 9, 2025

In a number of cases, InstCombine combines patterns into a select​. We can’t easily produce a profile for this select​ in the general case. For now, we just generate an unknown​ profile metadata.

Copy link
Member Author

mtrofin commented Sep 9, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@mtrofin mtrofin force-pushed the users/mtrofin/09-08-_instcombine_set_prof_metadata_on_selects branch 2 times, most recently from f139887 to d5c54f4 Compare September 9, 2025 04:23
@nikic
Copy link
Contributor

nikic commented Sep 9, 2025

TBH I don't think selects should be validated by profcheck at all, only branches.

@mtrofin
Copy link
Member Author

mtrofin commented Sep 9, 2025

TBH I don't think selects should be validated by profcheck at all, only branches.

Some get lowered to branches later on - otherwise, yes, I'd also love to not worry about them.

Copy link
Member Author

mtrofin commented Sep 9, 2025

In fact - to add some data to this. There are a few places where select may get lowered to a conditional branch, but I chose CodeGenPrepare in a ThinLTO, instrumented FDO build of a datacenter binary we care about. I then added this patch below and rebuilt.

Summed up the Total and Missed, and out of a total of 1322 lowerings, 1288 were not cold and without !prof.

--- a/work/llvm-project/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/work/llvm-project/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -368,7 +368,14 @@ class CodeGenPrepare {
   std::unique_ptr<DominatorTree> DT;
 
 public:
+  size_t Missed = 0;
+  size_t Total = 0;
   CodeGenPrepare(){};
+  ~CodeGenPrepare() {
+    if (Missed > 0)
+      dbgs() << "[MISSED]," << Missed << "," << Total << "\n";
+  }
+
   CodeGenPrepare(const TargetMachine *TM) : TM(TM){};
   /// If encounter huge function, we need to limit the build time.
   bool IsHugeFunc = false;
@@ -7613,6 +7620,12 @@ bool CodeGenPrepare::optimizeSelectInst(
   // We should split before any debug-info.
   SplitPt.setHeadBit(true);
 
+  if (!PSI->isColdBlock(StartBlock, BFI.get())) {
+    if (!SI->getMetadata(LLVMContext::MD_prof))
+      ++Missed;
+  }
+  ++Total;
+
   IRBuilder<> IB(SI);
   auto *CondFr = IB.CreateFreeze(SI->getCondition(), SI->getName() + ".frozen");

@mtrofin mtrofin marked this pull request as ready for review September 9, 2025 22:13
@mtrofin mtrofin requested a review from nikic as a code owner September 9, 2025 22:13
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Sep 9, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 9, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Mircea Trofin (mtrofin)

Changes

In a number of cases, InstCombine combines patterns into a select​. We can’t easily produce a profile for this select​ in the general case. For now, we just generate an unknown​ profile metadata.


Full diff: https://github.com/llvm/llvm-project/pull/157599.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+10)
  • (modified) llvm/utils/profcheck-xfail.txt (-160)
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index a74f292524b4d..1d0b7edc15530 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -81,6 +81,7 @@
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/PatternMatch.h"
+#include "llvm/IR/ProfDataUtils.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Use.h"
 #include "llvm/IR/User.h"
@@ -6008,6 +6009,15 @@ static bool combineInstructionsOverFunction(
     if (!MadeChangeInThisIteration)
       break;
 
+    // Issue #147390: InstCombine emits `select` as rewrites of instructions
+    // without metadata. It may be possible to synthesize some more meaningful
+    // profiles in some cases (based on operands or based on the pattern)
+    if (auto EC = F.getEntryCount(); EC.has_value() && EC->getCount() > 0)
+      for (auto &BB : F)
+        for (auto &I : BB)
+          if (isa<SelectInst>(I) && !I.getMetadata(LLVMContext::MD_prof))
+            setExplicitlyUnknownBranchWeights(I, DEBUG_TYPE);
+
     MadeIRChange = true;
     if (Iteration > Opts.MaxIterations) {
       reportFatalUsageError(
diff --git a/llvm/utils/profcheck-xfail.txt b/llvm/utils/profcheck-xfail.txt
index eac24dc5e2cc9..c0b47a5180be1 100644
--- a/llvm/utils/profcheck-xfail.txt
+++ b/llvm/utils/profcheck-xfail.txt
@@ -831,166 +831,6 @@ Transforms/IndVarSimplify/pr45835.ll
 Transforms/IndVarSimplify/preserving-debugloc-rem-div.ll
 Transforms/Inline/optimization-remarks-hotness-threshold.ll
 Transforms/InstCombine/2004-09-20-BadLoadCombine2.ll
-Transforms/InstCombine/2004-09-20-BadLoadCombine.ll
-Transforms/InstCombine/2005-04-07-UDivSelectCrash.ll
-Transforms/InstCombine/2011-02-14-InfLoop.ll
-Transforms/InstCombine/AArch64/sve-intrinsic-sel.ll
-Transforms/InstCombine/AArch64/sve-intrinsic-simplify-binop.ll
-Transforms/InstCombine/AArch64/sve-intrinsic-simplify-shift.ll
-Transforms/InstCombine/add2.ll
-Transforms/InstCombine/add.ll
-Transforms/InstCombine/add-mask.ll
-Transforms/InstCombine/add-shl-mul-umax.ll
-Transforms/InstCombine/add-shl-sdiv-to-srem.ll
-Transforms/InstCombine/AMDGPU/addrspacecast.ll
-Transforms/InstCombine/and2.ll
-Transforms/InstCombine/and-fcmp.ll
-Transforms/InstCombine/and.ll
-Transforms/InstCombine/and-or-icmp-nullptr.ll
-Transforms/InstCombine/and-or-icmps.ll
-Transforms/InstCombine/and-or-implied-cond-not.ll
-Transforms/InstCombine/apint-div1.ll
-Transforms/InstCombine/apint-div2.ll
-Transforms/InstCombine/apint-rem1.ll
-Transforms/InstCombine/apint-rem2.ll
-Transforms/InstCombine/ashr-demand.ll
-Transforms/InstCombine/atomic.ll
-Transforms/InstCombine/binop-cast.ll
-Transforms/InstCombine/binop-select-cast-of-select-cond.ll
-Transforms/InstCombine/binop-select.ll
-Transforms/InstCombine/bit-checks.ll
-Transforms/InstCombine/bitreverse.ll
-Transforms/InstCombine/branch.ll
-Transforms/InstCombine/builtin-dynamic-object-size.ll
-Transforms/InstCombine/canonicalize-clamp-like-pattern-between-negative-and-positive-thresholds.ll
-Transforms/InstCombine/canonicalize-clamp-like-pattern-between-zero-and-positive-threshold.ll
-Transforms/InstCombine/cast-mul-select.ll
-Transforms/InstCombine/clamp-to-minmax.ll
-Transforms/InstCombine/conditional-negation.ll
-Transforms/InstCombine/cttz.ll
-Transforms/InstCombine/debuginfo-invert.ll
-Transforms/InstCombine/demorgan.ll
-Transforms/InstCombine/div.ll
-Transforms/InstCombine/div-shift.ll
-Transforms/InstCombine/fabs.ll
-Transforms/InstCombine/fcmp-select.ll
-Transforms/InstCombine/ffs-1.ll
-Transforms/InstCombine/ffs-i16.ll
-Transforms/InstCombine/fmul-bool.ll
-Transforms/InstCombine/fmul.ll
-Transforms/InstCombine/fneg.ll
-Transforms/InstCombine/fold-ctpop-of-not.ll
-Transforms/InstCombine/fold-ext-eq-c-with-op.ll
-Transforms/InstCombine/free-inversion.ll
-Transforms/InstCombine/icmp-and-lowbit-mask.ll
-Transforms/InstCombine/icmp-equality-test.ll
-Transforms/InstCombine/icmp.ll
-Transforms/InstCombine/icmp-mul-and.ll
-Transforms/InstCombine/icmp-of-and-x.ll
-Transforms/InstCombine/icmp-of-or-x.ll
-Transforms/InstCombine/icmp-select-implies-common-op.ll
-Transforms/InstCombine/icmp-select.ll
-Transforms/InstCombine/icmp-with-selects.ll
-Transforms/InstCombine/intrinsic-select.ll
-Transforms/InstCombine/known-never-nan.ll
-Transforms/InstCombine/ldexp-ext.ll
-Transforms/InstCombine/ldexp.ll
-Transforms/InstCombine/load-bitcast-select.ll
-Transforms/InstCombine/load.ll
-Transforms/InstCombine/load-select.ll
-Transforms/InstCombine/loadstore-metadata.ll
-Transforms/InstCombine/logical-select-inseltpoison.ll
-Transforms/InstCombine/logical-select.ll
-Transforms/InstCombine/lshr.ll
-Transforms/InstCombine/masked_intrinsics-inseltpoison.ll
-Transforms/InstCombine/masked_intrinsics.ll
-Transforms/InstCombine/memchr-11.ll
-Transforms/InstCombine/memchr-2.ll
-Transforms/InstCombine/memchr-3.ll
-Transforms/InstCombine/memchr-6.ll
-Transforms/InstCombine/memchr-7.ll
-Transforms/InstCombine/memchr-9.ll
-Transforms/InstCombine/memchr.ll
-Transforms/InstCombine/mem-gep-zidx.ll
-Transforms/InstCombine/memrchr-3.ll
-Transforms/InstCombine/memrchr-4.ll
-Transforms/InstCombine/minmax-fold.ll
-Transforms/InstCombine/minmax-fp.ll
-Transforms/InstCombine/minmax-intrinsics.ll
-Transforms/InstCombine/mul-inseltpoison.ll
-Transforms/InstCombine/mul.ll
-Transforms/InstCombine/mul-masked-bits.ll
-Transforms/InstCombine/mul-pow2.ll
-Transforms/InstCombine/multiple-uses-load-bitcast-select.ll
-Transforms/InstCombine/narrow.ll
-Transforms/InstCombine/negated-bitmask.ll
-Transforms/InstCombine/nested-select.ll
-Transforms/InstCombine/not.ll
-Transforms/InstCombine/or-bitmask.ll
-Transforms/InstCombine/or-fcmp.ll
-Transforms/InstCombine/or.ll
-Transforms/InstCombine/phi-select-constant.ll
-Transforms/InstCombine/pow-1.ll
-Transforms/InstCombine/pow-3.ll
-Transforms/InstCombine/pow-sqrt.ll
-Transforms/InstCombine/pr24354.ll
-Transforms/InstCombine/pr35515.ll
-Transforms/InstCombine/ptrtoint-nullgep.ll
-Transforms/InstCombine/pull-conditional-binop-through-shift.ll
-Transforms/InstCombine/rem.ll
-Transforms/InstCombine/sdiv-canonicalize.ll
-Transforms/InstCombine/sdiv-guard.ll
-Transforms/InstCombine/select-and-cmp.ll
-Transforms/InstCombine/select-and-or.ll
-Transforms/InstCombine/select_arithmetic.ll
-Transforms/InstCombine/select-bitext.ll
-Transforms/InstCombine/select-cmp-br.ll
-Transforms/InstCombine/select-cmp.ll
-Transforms/InstCombine/select-factorize.ll
-Transforms/InstCombine/select_frexp.ll
-Transforms/InstCombine/select-icmp-and.ll
-Transforms/InstCombine/select.ll
-Transforms/InstCombine/select-min-max.ll
-Transforms/InstCombine/select-of-symmetric-selects.ll
-Transforms/InstCombine/select-or-cmp.ll
-Transforms/InstCombine/select-safe-bool-transforms.ll
-Transforms/InstCombine/select-safe-impliedcond-transforms.ll
-Transforms/InstCombine/select-safe-transforms.ll
-Transforms/InstCombine/select-select.ll
-Transforms/InstCombine/select-with-extreme-eq-cond.ll
-Transforms/InstCombine/shift.ll
-Transforms/InstCombine/shuffle-select-narrow-inseltpoison.ll
-Transforms/InstCombine/shuffle-select-narrow.ll
-Transforms/InstCombine/simplify-demanded-fpclass.ll
-Transforms/InstCombine/sink-not-into-another-hand-of-logical-and.ll
-Transforms/InstCombine/sink-not-into-another-hand-of-logical-or.ll
-Transforms/InstCombine/sink-not-into-logical-and.ll
-Transforms/InstCombine/sink-not-into-logical-or.ll
-Transforms/InstCombine/strchr-1.ll
-Transforms/InstCombine/strchr-3.ll
-Transforms/InstCombine/strlen-1.ll
-Transforms/InstCombine/strrchr-3.ll
-Transforms/InstCombine/sub-ashr-and-to-icmp-select.ll
-Transforms/InstCombine/sub-ashr-or-to-icmp-select.ll
-Transforms/InstCombine/sub.ll
-Transforms/InstCombine/sub-xor-cmp.ll
-Transforms/InstCombine/truncating-saturate.ll
-Transforms/InstCombine/trunc-inseltpoison.ll
-Transforms/InstCombine/trunc.ll
-Transforms/InstCombine/unordered-fcmp-select.ll
-Transforms/InstCombine/urem-via-cmp-select.ll
-Transforms/InstCombine/vec_sext.ll
-Transforms/InstCombine/vector-urem.ll
-Transforms/InstCombine/wcslen-1.ll
-Transforms/InstCombine/wcslen-3.ll
-Transforms/InstCombine/X86/blend_x86.ll
-Transforms/InstCombine/X86/x86-avx512-inseltpoison.ll
-Transforms/InstCombine/X86/x86-avx512.ll
-Transforms/InstCombine/xor-and-or.ll
-Transforms/InstCombine/xor-ashr.ll
-Transforms/InstCombine/xor.ll
-Transforms/InstCombine/zext-bool-add-sub.ll
-Transforms/InstCombine/zext-or-icmp.ll
 Transforms/IRCE/add-metadata-pre-post-loops.ll
 Transforms/IRCE/bad_expander.ll
 Transforms/IRCE/bug-mismatched-types.ll

Copy link

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, ~160 tests off the list for a 5 line change is great ROI.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable enough to me. This comes down to what Nikita thinks is reasonable though.

Base automatically changed from users/mtrofin/09-09-_profcheck_require_unknown_metadata_have_a_origin_parameter to main September 10, 2025 22:34
@mtrofin mtrofin force-pushed the users/mtrofin/09-08-_instcombine_set_prof_metadata_on_selects branch from d5c54f4 to 7325466 Compare September 10, 2025 22:36
@mtrofin mtrofin requested a review from alanzhao1 September 12, 2025 22:22
Copy link
Member Author

mtrofin commented Sep 12, 2025

Changing to Draft, @alanzhao1 is doing better things here with #158375 .

We can land this PR as a clean-all for cases that can't yet be handled correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants