Skip to content

Conversation

hassnaaHamdi
Copy link
Member

@hassnaaHamdi hassnaaHamdi commented Sep 16, 2025

This patch implements the speculative devirtualization feature in the LLVM backend.
It handles the case of single implementation devirtualization where there is a single
possible callee of a virtual function.

  • Add cl::opt 'devirtualize-speculatively' to enable it.
  • Flag is disabled by default.
  • It works regardless of the visibility of the object.
  • Not enabled for LTO for now.

@llvmbot
Copy link
Member

llvmbot commented Sep 16, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Hassnaa Hamdi (hassnaaHamdi)

Changes
  • Add cl::opt 'devirtualize-speculatively' to enable it.
  • Flag is disabled by default.
  • It works regardless of the visibility of the object.

Full diff: https://github.com/llvm/llvm-project/pull/159048.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp (+49-15)
  • (added) llvm/test/Transforms/WholeProgramDevirt/speculative-devirt-single-impl.ll (+131)
  • (modified) llvm/test/Transforms/WholeProgramDevirt/virtual-const-prop-check.ll (+7)
diff --git a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
index 09bffa7bf5846..64f574be8fd0e 100644
--- a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+++ b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
@@ -24,7 +24,8 @@
 //   returns 0, or a single vtable's function returns 1, replace each virtual
 //   call with a comparison of the vptr against that vtable's address.
 //
-// This pass is intended to be used during the regular and thin LTO pipelines:
+// This pass is intended to be used during the regular/thin and non-LTO
+// pipelines:
 //
 // During regular LTO, the pass determines the best optimization for each
 // virtual call and applies the resolutions directly to virtual calls that are
@@ -48,6 +49,14 @@
 //   is supported.
 // - Import phase: (same as with hybrid case above).
 //
+// During Speculative devirtualization mode -not restricted to LTO-:
+// - The pass applies speculative devirtualization without requiring any type of
+//   visibility.
+// - Skips other features like virtual constant propagation, uniform return
+//   value optimization, unique return value optimization and branch funnels as
+//   they need LTO.
+// - This mode is enabled via 'devirtualize-speculatively' flag.
+//
 //===----------------------------------------------------------------------===//
 
 #include "llvm/Transforms/IPO/WholeProgramDevirt.h"
@@ -61,7 +70,9 @@
 #include "llvm/Analysis/AssumptionCache.h"
 #include "llvm/Analysis/BasicAliasAnalysis.h"
 #include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/ModuleSummaryAnalysis.h"
 #include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/Analysis/ProfileSummaryInfo.h"
 #include "llvm/Analysis/TypeMetadataUtils.h"
 #include "llvm/Bitcode/BitcodeReader.h"
 #include "llvm/Bitcode/BitcodeWriter.h"
@@ -120,6 +131,13 @@ STATISTIC(NumVirtConstProp1Bit,
           "Number of 1 bit virtual constant propagations");
 STATISTIC(NumVirtConstProp, "Number of virtual constant propagations");
 
+// TODO: This option eventually should support any public visibility vtables
+// with/out LTO.
+static cl::opt<bool> ClDevirtualizeSpeculatively(
+    "devirtualize-speculatively",
+    cl::desc("Enable speculative devirtualization optimization"),
+    cl::init(false));
+
 static cl::opt<PassSummaryAction> ClSummaryAction(
     "wholeprogramdevirt-summary-action",
     cl::desc("What to do with the summary when running this pass"),
@@ -1083,10 +1101,10 @@ bool DevirtModule::tryFindVirtualCallTargets(
     if (!TM.Bits->GV->isConstant())
       return false;
 
-    // We cannot perform whole program devirtualization analysis on a vtable
-    // with public LTO visibility.
-    if (TM.Bits->GV->getVCallVisibility() ==
-        GlobalObject::VCallVisibilityPublic)
+    // Without ClDevirtualizeSpeculatively, we cannot perform whole program
+    // devirtualization analysis on a vtable with public LTO visibility.
+    if (!ClDevirtualizeSpeculatively && TM.Bits->GV->getVCallVisibility() ==
+                                            GlobalObject::VCallVisibilityPublic)
       return false;
 
     Function *Fn = nullptr;
@@ -1105,6 +1123,12 @@ bool DevirtModule::tryFindVirtualCallTargets(
     if (Fn->getName() == "__cxa_pure_virtual")
       continue;
 
+    // In most cases empty functions will be overridden by the
+    // implementation of the derived class, so we can skip them.
+    if (ClDevirtualizeSpeculatively && Fn->getReturnType()->isVoidTy() &&
+        Fn->getInstructionCount() <= 1)
+      continue;
+
     // We can disregard unreachable functions as possible call targets, as
     // unreachable functions shouldn't be called.
     if (mustBeUnreachableFunction(Fn, ExportSummary))
@@ -1223,10 +1247,12 @@ void DevirtModule::applySingleImplDevirt(VTableSlotInfo &SlotInfo,
         CallTrap->setDebugLoc(CB.getDebugLoc());
       }
 
-      // If fallback checking is enabled, add support to compare the virtual
-      // function pointer to the devirtualized target. In case of a mismatch,
-      // fall back to indirect call.
-      if (DevirtCheckMode == WPDCheckMode::Fallback) {
+      // If fallback checking or speculative devirtualization are enabled,
+      // add support to compare the virtual function pointer to the
+      // devirtualized target. In case of a mismatch, fall back to indirect
+      // call.
+      if (DevirtCheckMode == WPDCheckMode::Fallback ||
+          ClDevirtualizeSpeculatively) {
         MDNode *Weights = MDBuilder(M.getContext()).createLikelyBranchWeights();
         // Version the indirect call site. If the called value is equal to the
         // given callee, 'NewInst' will be executed, otherwise the original call
@@ -1325,10 +1351,10 @@ bool DevirtModule::trySingleImplDevirt(
   if (!IsExported)
     return false;
 
-  // If the only implementation has local linkage, we must promote to external
-  // to make it visible to thin LTO objects. We can only get here during the
-  // ThinLTO export phase.
-  if (TheFn->hasLocalLinkage()) {
+  // Out of speculative devirtualization mode, if the only implementation has
+  // local linkage, we must promote to external to make it visible to thin LTO
+  // objects.
+  if (!ClDevirtualizeSpeculatively && TheFn->hasLocalLinkage()) {
     std::string NewName = (TheFn->getName() + ".llvm.merged").str();
 
     // Since we are renaming the function, any comdats with the same name must
@@ -2350,6 +2376,11 @@ bool DevirtModule::run() {
 
   Function *TypeTestFunc =
       Intrinsic::getDeclarationIfExists(&M, Intrinsic::type_test);
+  // If we are in speculative devirtualization mode, we can work on the public
+  // type test intrinsics.
+  if (!TypeTestFunc && ClDevirtualizeSpeculatively)
+    TypeTestFunc =
+        Intrinsic::getDeclarationIfExists(&M, Intrinsic::public_type_test);
   Function *TypeCheckedLoadFunc =
       Intrinsic::getDeclarationIfExists(&M, Intrinsic::type_checked_load);
   Function *TypeCheckedLoadRelativeFunc = Intrinsic::getDeclarationIfExists(
@@ -2472,8 +2503,11 @@ bool DevirtModule::run() {
                  .WPDRes[S.first.ByteOffset];
     if (tryFindVirtualCallTargets(TargetsForSlot, TypeMemberInfos,
                                   S.first.ByteOffset, ExportSummary)) {
-
-      if (!trySingleImplDevirt(ExportSummary, TargetsForSlot, S.second, Res)) {
+      bool SingleImplDevirt =
+          trySingleImplDevirt(ExportSummary, TargetsForSlot, S.second, Res);
+      // Out of speculative devirtualization mode, Try to apply virtual constant
+      // propagation or branch funneling.
+      if (!SingleImplDevirt && !ClDevirtualizeSpeculatively) {
         DidVirtualConstProp |=
             tryVirtualConstProp(TargetsForSlot, S.second, Res, S.first);
 
diff --git a/llvm/test/Transforms/WholeProgramDevirt/speculative-devirt-single-impl.ll b/llvm/test/Transforms/WholeProgramDevirt/speculative-devirt-single-impl.ll
new file mode 100644
index 0000000000000..b7d1bda8d133b
--- /dev/null
+++ b/llvm/test/Transforms/WholeProgramDevirt/speculative-devirt-single-impl.ll
@@ -0,0 +1,131 @@
+; -stats requires asserts
+; REQUIRES: asserts
+
+; Check that we can still devirtualize outside LTO mode when speculative devirtualization is enabled.
+; Check that we skip devirtualization for empty functions in speculative devirtualization mode
+
+; RUN: opt -S -passes=wholeprogramdevirt -devirtualize-speculatively \
+; RUN: -pass-remarks=wholeprogramdevirt -stats %s 2>&1 | FileCheck %s
+
+target datalayout = "e-p:64:64"
+target triple = "x86_64-unknown-linux-gnu"
+
+; CHECK: remark: devirt-single.cc:30:32: single-impl: devirtualized a call to vf
+; CHECK: remark: devirt-single.cc:41:32: single-impl: devirtualized a call to vf
+; CHECK: remark: devirt-single.cc:51:32: single-impl: devirtualized a call to vf
+; CHECK: remark: devirt-single.cc:13:0: devirtualized vf
+; CHECK-NOT: devirtualized
+
+@vt1 = constant [1 x ptr] [ptr @vf], !type !8
+@vt2 = constant [1 x ptr] [ptr @vf_empty], !type !12
+
+define i1 @vf(ptr %this) #0 !dbg !7 {
+  ret i1 true
+}
+
+; This should NOT be devirtualized because during non-lto empty functions
+; are skipped.
+define void @vf_empty(ptr %this) !dbg !11 {
+  ret void
+}
+
+; CHECK: define void @call
+define void @call(ptr %obj) #1 !dbg !5 {
+  %vtable = load ptr, ptr %obj
+  %p = call i1 @llvm.type.test(ptr %vtable, metadata !"typeid")
+  call void @llvm.assume(i1 %p)
+  %fptr = load ptr, ptr %vtable
+  ; CHECK: if.true.direct_targ:
+  ; CHECK:   call i1 @vf(
+  ; CHECK: if.false.orig_indirect:
+  ; CHECK:   call i1 %fptr(
+  call i1 %fptr(ptr %obj), !dbg !6
+  ret void
+}
+
+
+; CHECK: define void @call1
+define void @call1(ptr %obj) #1 !dbg !9 {
+  %vtable = load ptr, ptr %obj
+  %p = call i1 @llvm.type.test(ptr %vtable, metadata !"typeid1")
+  call void @llvm.assume(i1 %p)
+  %fptr = load ptr, ptr %vtable, align 8
+  ; CHECK: call i1 %fptr
+  %1 = call i1 %fptr(ptr %obj), !dbg !10
+  ret void
+}
+declare ptr @llvm.load.relative.i32(ptr, i32)
+
+@vt3 = private unnamed_addr constant [1 x i32] [
+  i32 trunc (i64 sub (i64 ptrtoint (ptr dso_local_equivalent @vf to i64), i64 ptrtoint (ptr @vt3 to i64)) to i32)
+], align 4, !type !15
+
+; CHECK: define void @call2
+define void @call2(ptr %obj) #1 !dbg !13 {
+  %vtable = load ptr, ptr %obj
+  %p = call i1 @llvm.type.test(ptr %vtable, metadata !"typeid2")
+  call void @llvm.assume(i1 %p)
+  %fptr = call ptr @llvm.load.relative.i32(ptr %vtable, i32 0)
+  ; CHECK: if.true.direct_targ:
+  ; CHECK:   call i1 @vf(
+  ; CHECK: if.false.orig_indirect:
+  ; CHECK:   call i1 %fptr(
+  call i1 %fptr(ptr %obj), !dbg !14
+  ret void
+}
+
+@_ZTV1A.local = private unnamed_addr constant { [3 x i32] } { [3 x i32] [
+  i32 0,  ; offset to top
+  i32 0,  ; rtti
+  i32 trunc (i64 sub (i64 ptrtoint (ptr dso_local_equivalent @vf to i64), i64 ptrtoint (ptr getelementptr inbounds ({ [3 x i32] }, ptr @_ZTV1A.local, i32 0, i32 0, i32 2) to i64)) to i32)  ; vf_emptyunc offset
+] }, align 4, !type !18
+
+; CHECK: define void @call3
+define void @call3(ptr %obj) #1 !dbg !16 {
+  %vtable = load ptr, ptr %obj
+  %p = call i1 @llvm.type.test(ptr %vtable, metadata !"typeid3")
+  call void @llvm.assume(i1 %p)
+  %fptr = call ptr @llvm.load.relative.i32(ptr %vtable, i32 8)
+  ; CHECK: if.true.direct_targ:
+  ; CHECK:   call i1 @vf(
+  ; CHECK: if.false.orig_indirect:
+  ; CHECK:   call i1 %fptr(
+  call i1 %fptr(ptr %obj), !dbg !17
+  ret void
+}
+
+
+declare i1 @llvm.type.test(ptr, metadata)
+declare void @llvm.assume(i1)
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3}
+!llvm.ident = !{!4}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "clang version 4.0.0 (trunk 278098)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug)
+!1 = !DIFile(filename: "devirt-single.cc", directory: ".")
+!2 = !{i32 2, !"Dwarf Version", i32 4}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{!"clang version 4.0.0 (trunk 278098)"}
+!5 = distinct !DISubprogram(name: "call", linkageName: "_Z4callPv", scope: !1, file: !1, line: 29, isLocal: false, isDefinition: true, scopeLine: 9, flags: DIFlagPrototyped, isOptimized: false, unit: !0)
+!6 = !DILocation(line: 30, column: 32, scope: !5)
+!7 = distinct !DISubprogram(name: "vf", linkageName: "_ZN3vt12vfEv", scope: !1, file: !1, line: 13, isLocal: false, isDefinition: true, scopeLine: 13, flags: DIFlagPrototyped, isOptimized: false, unit: !0)
+!8 = !{i32 0, !"typeid"}
+
+!9 = distinct !DISubprogram(name: "call1", linkageName: "_Z5call1Pv", scope: !1, file: !1, line: 31, isLocal: false, isDefinition: true, scopeLine: 9, flags: DIFlagPrototyped, isOptimized: false, unit: !0)
+!10 = !DILocation(line: 35, column: 32, scope: !9)
+!11 = distinct !DISubprogram(name: "vf_empty", linkageName: "_ZN3vt18vf_emptyEv", scope: !1, file: !1, line: 23, isLocal: false, isDefinition: true, scopeLine: 23, flags: DIFlagPrototyped, isOptimized: false, unit: !0)
+!12 = !{i32 0, !"typeid1"}
+
+!13 = distinct !DISubprogram(name: "call2", linkageName: "_Z5call2Pv", scope: !1, file: !1, line: 40, isLocal: false, isDefinition: true, scopeLine: 9, flags: DIFlagPrototyped, isOptimized: false, unit: !0)
+!14 = !DILocation(line: 41, column: 32, scope: !13)
+!15 = !{i32 0, !"typeid2"}
+
+!16 = distinct !DISubprogram(name: "call3", linkageName: "_Z5call3Pv", scope: !1, file: !1, line: 50, isLocal: false, isDefinition: true, scopeLine: 9, flags: DIFlagPrototyped, isOptimized: false, unit: !0)
+!17 = !DILocation(line: 51, column: 32, scope: !16)
+!18 = !{i32 0, !"typeid3"}
+
+
+
+; CHECK: 1 wholeprogramdevirt - Number of whole program devirtualization targets
+; CHECK: 3 wholeprogramdevirt - Number of single implementation devirtualizations
diff --git a/llvm/test/Transforms/WholeProgramDevirt/virtual-const-prop-check.ll b/llvm/test/Transforms/WholeProgramDevirt/virtual-const-prop-check.ll
index d8f5c912e9a50..8327e1cfdf1d2 100644
--- a/llvm/test/Transforms/WholeProgramDevirt/virtual-const-prop-check.ll
+++ b/llvm/test/Transforms/WholeProgramDevirt/virtual-const-prop-check.ll
@@ -11,6 +11,9 @@
 ; Check wildcard
 ; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-skip=vf?i1 %s 2>&1 | FileCheck %s --check-prefix=SKIP
 
+; Check that no stats are reported in speculative devirtualization mode as the virtual const prop is disabled.
+; RUN: opt -S -passes=wholeprogramdevirt -devirtualize-speculatively -stats %s 2>&1 | FileCheck %s --check-prefix=CHECK-SPECULATIVE-WPD
+
 target datalayout = "e-p:64:64"
 target triple = "x86_64-unknown-linux-gnu"
 
@@ -225,3 +228,7 @@ declare ptr @llvm.load.relative.i32(ptr, i32)
 ; CHECK: 2 wholeprogramdevirt - Number of unique return value optimizations
 ; CHECK: 2 wholeprogramdevirt - Number of virtual constant propagations
 ; CHECK: 2 wholeprogramdevirt - Number of 1 bit virtual constant propagations
+
+; CHECK-SPECULATIVE-WPD-NOT: 0 wholeprogramdevirt - Number of unique return value optimizations
+; CHECK-SPECULATIVE-WPD-NOT: 0 wholeprogramdevirt - Number of virtual constant propagations
+; CHECK-SPECULATIVE-WPD-NOT: 0 wholeprogramdevirt - Number of 1 bit virtual constant propagations

@hassnaaHamdi
Copy link
Member Author

@teresajohnson

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, a few minor comments but largely looks good.

std::optional<ModuleSummaryIndex> Index;
if (!ExportSummary && !ImportSummary && DevirtSpeculatively) {
// Build the ExportSummary from the module.
assert(!ExportSummary &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert is redundant with the enclosing if condition

if (!DevirtModule(M, MAM, ExportSummary, ImportSummary).run())

std::optional<ModuleSummaryIndex> Index;
if (!ExportSummary && !ImportSummary && DevirtSpeculatively) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think DevirtSpeculatively is ever set currently (see comment about this earlier). So I don't think this change and the one below to pass DevirtSpeculatively can be tested in this PR, and should probably move to the follow on one.

@@ -226,11 +226,14 @@ struct WholeProgramDevirtPass : public PassInfoMixin<WholeProgramDevirtPass> {
ModuleSummaryIndex *ExportSummary;
const ModuleSummaryIndex *ImportSummary;
bool UseCommandLine = false;
bool DevirtSpeculatively = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is never set by this PR. Probably this should move into PR159685.

// Out of speculative devirtualization mode, if the only implementation has
// local linkage, we must promote to external to make it visible to thin LTO
// objects.
if (!DevirtSpeculatively && TheFn->hasLocalLinkage()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work if it is extended later to support speculative devirt of public vis vtables in LTO mode. Maybe track whether we originally had an ExportSummary, and do this only when we did?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that LTO flag should be passed to the constructor to differentiate between LTO mode and non-LTO mode.
It could happen in some tests that in LTO mode that the ExportSummary is null, ex: llvm/test/Other/new-pm-lto-defaults.ll

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added LTO flag. Please Let me know what do you think about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's full LTO. In that case no promotion should be needed, as full LTO merges everything into one module. This handling is for ThinLTO when we have hybrid ThinLTO/fullLTO (via LTO unit splitting), and the internal object in the full LTO object must be promoted to be exported to a ThinLTO object (hence the original comment "We can only get here during the ThinLTO export phase"). In that case we always have an ExportSummary (exports the full LTO analysis to the ThinLTO split modules). So I don't think we need to identify that we are in LTO.

@@ -2350,6 +2392,11 @@ bool DevirtModule::run() {

Function *TypeTestFunc =
Intrinsic::getDeclarationIfExists(&M, Intrinsic::type_test);
// If we are in speculative devirtualization mode, we can work on the public
// type test intrinsics.
if (!TypeTestFunc && DevirtSpeculatively)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we can actually have a mix of public and non-public type tests, depending on the original vtable visibility. Maybe add a TODO here to support that case? As written if there were hidden visibility vtables in the module, we will have a non-public type test, and would not do speculative devirtualization for any public type tests for non-hidden vtables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can you add a TODO to updatePublicTypeTestCalls to not drop the public type tests if we have enabled speculative devirtualization? That will be required to support this for LTO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great notice, thanks.

// type test intrinsics.
if (!TypeTestFunc && DevirtSpeculatively)
TypeTestFunc =
Intrinsic::getDeclarationIfExists(&M, Intrinsic::public_type_test);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test with public type tests?

Index.emplace(buildModuleSummaryIndex(M, nullptr, &PSI));
ExportSummary = Index.has_value() ? &Index.value() : nullptr;
}
if (!DevirtModule(M, MAM, ExportSummary, ImportSummary, DevirtSpeculatively)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think DevirtSpeculatively is set by this PR, as mentioned above.

bool SingleImplDevirt =
trySingleImplDevirt(ExportSummary, TargetsForSlot, S.second, Res);
// Out of speculative devirtualization mode, Try to apply virtual constant
// propagation or branch funneling.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a TODO that these optimizations should eventually be allowed on calls from non-public type tests?

// Out of speculative devirtualization mode, if the only implementation has
// local linkage, we must promote to external to make it visible to thin LTO
// objects.
if (!DevirtSpeculatively && TheFn->hasLocalLinkage()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's full LTO. In that case no promotion should be needed, as full LTO merges everything into one module. This handling is for ThinLTO when we have hybrid ThinLTO/fullLTO (via LTO unit splitting), and the internal object in the full LTO object must be promoted to be exported to a ThinLTO object (hence the original comment "We can only get here during the ThinLTO export phase"). In that case we always have an ExportSummary (exports the full LTO analysis to the ThinLTO split modules). So I don't think we need to identify that we are in LTO.

- Add cl::opt 'devirtualize-speculatively' to enable it.
- Flag is disabled by default.
- It works regardless of the visibility of the object.
- Drop commit where we were building ExportSummary as it can't be
  tested in this patch.
- Consider the case where we have public/non-public type tests
- Use LTO flag to differentiate between LTO mode and non-LTO mode.

Change-Id: I52acd6e7352fe1075c9f990aa78dfdc35b89f124
Add a flag to differentiate between ExportSummary
that is built locally and the external one.
@hassnaaHamdi
Copy link
Member Author

@teresajohnson I think it's better to handle the object visibility in this patch too ?
Direct devirtualization for hidden objects and speculative for non-hidden ones ?

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the building of the ExportSummary get inadvertently removed from this PR? BTW, if you could please avoid force pushing when you update that would make reviewing much easier - as it is I am unable to compare to old versions. See https://llvm.org/docs/GitHub.html#rebasing-pull-requests-and-force-pushes.

if (TheFn->hasLocalLinkage()) {
// If the only implementation has local linkage, we must promote
// to external to make it visible to thin LTO objects.
// This change should be safe only in LTO mode.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change this to something like: "We only want to do this in the ThinLTO export phase, in which case we will not have built a local summary for the module."

// True if ExportSummary was built locally from the module rather than
// provided externally to the pass (e.g., during LTO). Default value is false
// unless explicitly set when the Summary is explicitly built.
bool HasLocalSummary = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find where this is set. In fact, I can't find anymore where ExportSummary is being built locally?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, It's not built locally. But for future patches it will be built locally. So here I want to add the default behaviour (no summary is built)

@teresajohnson
Copy link
Contributor

@teresajohnson I think it's better to handle the object visibility in this patch too ? Direct devirtualization for hidden objects and speculative for non-hidden ones ?

I could go either way. It is fine if you want to add that here, but I'm also fine with doing it as a follow up which might make the patches a bit more compact (unless you think adding the support will cause many changes to what is added by this PR).

@hassnaaHamdi
Copy link
Member Author

hassnaaHamdi commented Oct 8, 2025

Did the building of the ExportSummary get inadvertently removed from this PR? BTW, if you could please avoid force pushing when you update that would make reviewing much easier - as it is I am unable to compare to old versions. See https://llvm.org/docs/GitHub.html#rebasing-pull-requests-and-force-pushes.

I just dropped the commit related to building the ExportSummary. Sorry if that makes confusion. I'll follow your advice, thanks for it.

@hassnaaHamdi
Copy link
Member Author

@teresajohnson I think it's better to handle the object visibility in this patch too ? Direct devirtualization for hidden objects and speculative for non-hidden ones ?

I could go either way. It is fine if you want to add that here, but I'm also fine with doing it as a follow up which might make the patches a bit more compact (unless you think adding the support will cause many changes to what is added by this PR).

I think the change of handling the visibility will just pass the object visibility to 'applySingleImplDevirt' to check if we apply speculative/direct devirtualization. So the changes will include changing the function prototype, and passing the parameter.
something like:

void DevirtModule::applySingleImplDevirt(VTableSlotInfo &SlotInfo,
                                         Constant *TheFn, bool &IsExported,
                                         std::optional<GlobalObject::VCallVisibility> ObjVisibility) {
 ..  
 if (DevirtCheckMode == WPDCheckMode::Fallback || (speculativeDevirt && ObjVisibility != hidden)
      ... apply speculative devirt.
  else // it will do the direct devirt.
}

bool DevirtModule::trySingleImplDevirt(
    ModuleSummaryIndex *ExportSummary,
    MutableArrayRef<VirtualCallTarget> TargetsForSlot, VTableSlotInfo &SlotInfo,
    WholeProgramDevirtResolution *Res) {
 ...
 applySingleImplDevirt(SlotInfo, TheFn, IsExported, TargetsForSlot.TM.Bits->GV->getVCallVisibility());
} 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants