-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[LTO][WPD] Suppress WPD on a class if the LTO unit doesn't have the prevailing definition of this class or any of its superclass #131721
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-lto Author: Mingming Liu (mingmingl-llvm) ChangesBefore this patch, whole program devirtualization is suppressed on a class if any superclass is visible to regular object files, by recording the class GUID in This patch suppresses whole program devirtualization on a class if any superclass is visible (or escape to shared library) for index-based WPD. Implementation summaries:
The test case is reduced from a small C++ program (main.cc, lib.cc/h pasted below in [1]). To reproduce the program failure without this patch, compile lib.cc into a shared library, and provide it to a ThinLTO build of main.cc (commands are pasted in [2]). [1]
[2] Full diff: https://github.com/llvm/llvm-project/pull/131721.diff 3 Files Affected:
diff --git a/llvm/lib/LTO/LTO.cpp b/llvm/lib/LTO/LTO.cpp
index e895a46b8cd77..6ec2dc381a932 100644
--- a/llvm/lib/LTO/LTO.cpp
+++ b/llvm/lib/LTO/LTO.cpp
@@ -1905,7 +1905,7 @@ Error LTO::runThinLTO(AddStreamFn AddStream, FileCache Cache,
auto IsVisibleToRegularObj = [&](StringRef name) {
auto It = GlobalResolutions->find(name);
return (It == GlobalResolutions->end() ||
- It->second.VisibleOutsideSummary);
+ It->second.VisibleOutsideSummary || It->second.ExportDynamic);
};
getVisibleToRegularObjVtableGUIDs(ThinLTO.CombinedIndex,
diff --git a/llvm/test/ThinLTO/X86/nodevirt_exort_dynamic.ll b/llvm/test/ThinLTO/X86/nodevirt_exort_dynamic.ll
new file mode 100644
index 0000000000000..ea33291a75161
--- /dev/null
+++ b/llvm/test/ThinLTO/X86/nodevirt_exort_dynamic.ll
@@ -0,0 +1,111 @@
+; RUN: rm -rf %t && mkdir %t && cd %t
+
+; Generate unsplit module with summary for ThinLTO index-based WPD.
+; RUN: opt -thinlto-bc -o summary.o %s
+
+; RUN: llvm-dis -o - summary.o
+
+;; TODO: Implement the fix for WPD in regular or hybrid LTO, and add test coverage.
+
+; Index based WPD
+; For `_ZTI7Derived`, the 'llvm-lto2' resolution arguments specifies `VisibleOutsideSummary` as false
+; and `ExportDynamic` as false. The callsite inside @_ZN4Base8dispatchEv
+; got devirtualized.
+; RUN: llvm-lto2 run summary.o -save-temps -pass-remarks=. \
+; RUN: -o tmp \
+; RUN: --whole-program-visibility-enabled-in-lto=true \
+; RUN: --validate-all-vtables-have-type-infos=true \
+; RUN: --all-vtables-have-type-infos=true \
+; RUN: -r=summary.o,__cxa_pure_virtual, \
+; RUN: -r=summary.o,_ZN8DerivedNC2Ev,x \
+; RUN: -r=summary.o,_ZN4Base8dispatchEv,px \
+; RUN: -r=summary.o,_ZN7DerivedC2Ev, \
+; RUN: -r=summary.o,_ZN8DerivedN5printEv,px \
+; RUN: -r=summary.o,_ZTS4Base, \
+; RUN: -r=summary.o,_ZTV8DerivedN,p \
+; RUN: -r=summary.o,_ZTI8DerivedN,p \
+; RUN: -r=summary.o,_ZTI4Base, \
+; RUN: -r=summary.o,_ZTS8DerivedN,p \
+; RUN: -r=summary.o,_ZTI7Derived, \
+; RUN: -r=summary.o,_ZTV4Base 2>&1 | FileCheck --allow-empty %s --check-prefix=REMARK
+
+; REMARK: single-impl: devirtualized a call to _ZN8DerivedN5printEv
+
+; Index based WPD
+; For `_ZTI7Derived`, the 'llvm-lto2' resolution arguments specifies `VisibleOutsideSummary` as false
+; and `ExportDynamic` as true. The callsite inside @_ZN4Base8dispatchEv won't
+; get devirtualized.
+; RUN: llvm-lto2 run summary.o -save-temps -pass-remarks=. \
+; RUN: -o tmp \
+; RUN: --whole-program-visibility-enabled-in-lto=true \
+; RUN: --validate-all-vtables-have-type-infos=true \
+; RUN: --all-vtables-have-type-infos=true \
+; RUN: -r=summary.o,__cxa_pure_virtual, \
+; RUN: -r=summary.o,_ZN8DerivedNC2Ev,x \
+; RUN: -r=summary.o,_ZN4Base8dispatchEv,px \
+; RUN: -r=summary.o,_ZN7DerivedC2Ev, \
+; RUN: -r=summary.o,_ZN8DerivedN5printEv,px \
+; RUN: -r=summary.o,_ZTS4Base, \
+; RUN: -r=summary.o,_ZTV8DerivedN,p \
+; RUN: -r=summary.o,_ZTI8DerivedN,p \
+; RUN: -r=summary.o,_ZTI4Base, \
+; RUN: -r=summary.o,_ZTS8DerivedN,p \
+; RUN: -r=summary.o,_ZTI7Derived,d \
+; RUN: -r=summary.o,_ZTV4Base 2>&1 | FileCheck %s --allow-empty --implicit-check-not='single-impl: devirtualized a call to'
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+@_ZTV8DerivedN = linkonce_odr hidden constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr @_ZTI8DerivedN, ptr @_ZN8DerivedN5printEv] }, !type !0, !type !1, !type !2, !type !3, !type !4, !type !5, !vcall_visibility !6
+@_ZTI8DerivedN = linkonce_odr hidden constant { ptr, ptr, ptr } { ptr null, ptr @_ZTS8DerivedN, ptr @_ZTI7Derived }
+@_ZTS8DerivedN = linkonce_odr hidden constant [10 x i8] c"8DerivedN\00", align 1
+@_ZTI7Derived = external constant ptr
+@_ZTV4Base = linkonce_odr hidden constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr @_ZTI4Base, ptr @__cxa_pure_virtual] }, !type !0, !type !1, !vcall_visibility !6
+@_ZTI4Base = linkonce_odr hidden constant { ptr, ptr } { ptr null, ptr @_ZTS4Base }
+@_ZTS4Base = linkonce_odr hidden constant [6 x i8] c"4Base\00", align 1
+
+@llvm.used = appending global [1 x ptr] [ptr @_ZN8DerivedNC2Ev], section "llvm.metadata"
+
+define hidden void @_ZN4Base8dispatchEv(ptr %this) {
+entry:
+ %this.addr = alloca ptr
+ store ptr %this, ptr %this.addr
+ %this1 = load ptr, ptr %this.addr
+ %vtable = load ptr, ptr %this1
+ %0 = call i1 @llvm.type.test(ptr %vtable, metadata !"_ZTS7Derived")
+ call void @llvm.assume(i1 %0)
+ %vfn = getelementptr inbounds ptr, ptr %vtable, i64 0
+ %1 = load ptr, ptr %vfn
+ call void %1(ptr %this1)
+ ret void
+}
+
+define linkonce_odr hidden void @_ZN8DerivedNC2Ev(ptr %this) #0 {
+entry:
+ %this.addr = alloca ptr
+ store ptr %this, ptr %this.addr
+ %this1 = load ptr, ptr %this.addr
+ call void @_ZN7DerivedC2Ev(ptr %this1)
+ store ptr getelementptr inbounds inrange(-16, 8) ({ [3 x ptr] }, ptr @_ZTV8DerivedN, i32 0, i32 0, i32 2), ptr %this1
+ ret void
+}
+
+define linkonce_odr hidden void @_ZN8DerivedN5printEv(ptr %this) #0 {
+entry:
+ ret void
+}
+
+attributes #0 = { noinline optnone }
+
+declare void @__cxa_pure_virtual()
+declare i1 @llvm.type.test(ptr, metadata)
+declare void @llvm.assume(i1)
+declare void @_ZN7DerivedC2Ev(ptr)
+
+!0 = !{i64 16, !"_ZTS4Base"}
+!1 = !{i64 16, !"_ZTSM4BaseFvvE.virtual"}
+!2 = !{i64 16, !"_ZTS7Derived"}
+!3 = !{i64 16, !"_ZTSM7DerivedFvvE.virtual"}
+!4 = !{i64 16, !"_ZTS8DerivedN"}
+!5 = !{i64 16, !"_ZTSM8DerivedNFvvE.virtual"}
+!6 = !{i64 0}
diff --git a/llvm/tools/llvm-lto2/llvm-lto2.cpp b/llvm/tools/llvm-lto2/llvm-lto2.cpp
index dbb7f2f3028aa..0317a610a25a3 100644
--- a/llvm/tools/llvm-lto2/llvm-lto2.cpp
+++ b/llvm/tools/llvm-lto2/llvm-lto2.cpp
@@ -187,6 +187,18 @@ static cl::opt<bool> EnableFreestanding(
cl::desc("Enable Freestanding (disable builtins / TLI) during LTO"),
cl::Hidden);
+static cl::opt<bool> WholeProgramVisibilityEnabledInLTO(
+ "whole-program-visibility-enabled-in-lto",
+ cl::desc("Enable whole program visibility during LTO"), cl::Hidden);
+
+static cl::opt<bool> ValidateAllVtablesHaveTypeInfos(
+ "validate-all-vtables-have-type-infos",
+ cl::desc("Validate that all vtables have type infos in LTO"), cl::Hidden);
+
+static cl::opt<bool>
+ AllVtablesHaveTypeInfos("all-vtables-have-type-infos", cl::Hidden,
+ cl::desc("All vtables have type infos"));
+
extern cl::opt<cl::boolOrDefault> LoadBitcodeIntoNewDbgInfoFormat;
extern cl::opt<cl::boolOrDefault> PreserveInputDbgFormat;
@@ -257,6 +269,8 @@ static int run(int argc, char **argv) {
Res.VisibleToRegularObj = true;
else if (C == 'r')
Res.LinkerRedefined = true;
+ else if (C == 'd')
+ Res.ExportDynamic = true;
else {
llvm::errs() << "invalid character " << C << " in resolution: " << R
<< '\n';
@@ -332,6 +346,13 @@ static int run(int argc, char **argv) {
Conf.PTO.LoopVectorization = Conf.OptLevel > 1;
Conf.PTO.SLPVectorization = Conf.OptLevel > 1;
+ if (WholeProgramVisibilityEnabledInLTO.getNumOccurrences() > 0)
+ Conf.HasWholeProgramVisibility = WholeProgramVisibilityEnabledInLTO;
+ if (ValidateAllVtablesHaveTypeInfos.getNumOccurrences() > 0)
+ Conf.ValidateAllVtablesHaveTypeInfos = ValidateAllVtablesHaveTypeInfos;
+ if (AllVtablesHaveTypeInfos.getNumOccurrences() > 0)
+ Conf.AllVtablesHaveTypeInfos = AllVtablesHaveTypeInfos;
+
ThinBackend Backend;
if (ThinLTODistributedIndexes)
Backend = createWriteIndexesThinBackend(llvm::hardware_concurrency(Threads),
|
teresajohnson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test name has a typo (missing 'p' in nodevirt_exort_dynamic.ll)
llvm/lib/LTO/LTO.cpp
Outdated
| auto It = GlobalResolutions->find(name); | ||
| return (It == GlobalResolutions->end() || | ||
| It->second.VisibleOutsideSummary); | ||
| It->second.VisibleOutsideSummary || It->second.ExportDynamic); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same fix should presumably go in for regular LTO (see the other def of IsVisibleToRegularObj in this file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, and added test coverage for hybrid LTO. Regular LTO test coverage is not very meaningful, since without this change, the VisibleOutsideSummary bit suppresses devirtualization with -Wl,--lto-whole-program-visibility -Wl,--lto-validate-all-vtables-have-type-infos in lld
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per offline discussion, opt -module-summary is the way to create LTO input for regular LTO. Now the regression test has coverage for {thin, hybrid, regular} LTO.
| @_ZTV8DerivedN = linkonce_odr hidden constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr @_ZTI8DerivedN, ptr @_ZN8DerivedN5printEv] }, !type !0, !type !1, !type !2, !type !3, !type !4, !type !5, !vcall_visibility !6 | ||
| @_ZTI8DerivedN = linkonce_odr hidden constant { ptr, ptr, ptr } { ptr null, ptr @_ZTS8DerivedN, ptr @_ZTI7Derived } | ||
| @_ZTS8DerivedN = linkonce_odr hidden constant [10 x i8] c"8DerivedN\00", align 1 | ||
| @_ZTI7Derived = external constant ptr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking more about this...presumably the GlobalResolution Prevailing flag and isPrevailingIRSymbol are false for _ZTI7Derived. Should we more broadly be returning true to IsVisibleToRegularObj for that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question about the prevailing bit. Let me think about it more and implement the change if needed as a separate patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, I'll look into whether checking Prevailing bit is a better alternative to ExportDynamic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated LTO.cpp to use IsPrevailing bit since it's semantically correct to do that.
Whoops! Fixed the test name. |
teresajohnson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments on reducing the test a bit more. Also, I don't think the llvm-lto2 changes are needed, as noted in the comments there.
| ; RUN: rm -rf %t && mkdir %t && cd %t | ||
|
|
||
| ; Tests that devirtualization is suppressed on a class when the LTO unit doesn't | ||
| ; have the prevailing definition of the class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"of the base class" maybe? Here it is suppressing devirt on DerivedN because base class Derived isn't prevailing in the LTO unit I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
| !1 = !{i64 16, !"_ZTSM7DerivedFvvE.virtual"} | ||
| !2 = !{i64 16, !"_ZTS8DerivedN"} | ||
| !3 = !{i64 16, !"_ZTSM8DerivedNFvvE.virtual"} | ||
| ;!4 = !{i64 16, !"_ZTS8DerivedN"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the commented out metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
|
||
| attributes #0 = { noinline optnone } | ||
|
|
||
| declare void @__cxa_pure_virtual() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this decl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| declare void @__cxa_pure_virtual() | ||
| declare i1 @llvm.type.test(ptr, metadata) | ||
| declare void @llvm.assume(i1) | ||
| declare void @_ZN7DerivedC2Ev(ptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this decl once the reference above is removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of adding this function to llvm.used is to keep vtable live.
Upon this comment, it's simpler to adding the vtable itself to llvm.used for this purpose.
So I removed this function and the other function decl to simplify the test case.
| Conf.PTO.SLPVectorization = Conf.OptLevel > 1; | ||
|
|
||
| if (WholeProgramVisibilityEnabledInLTO.getNumOccurrences() > 0) | ||
| Conf.HasWholeProgramVisibility = WholeProgramVisibilityEnabledInLTO; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the test can just use the existing -whole-program-visibility flag (defined in WholeProgramDevirt.cpp) which is OR'ed with this config value, instead of adding a new llvm-lto2 flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I spent some time looking into reusing -whole-program-visibility flag before introducing new options in llvm-lto2, but -whole-program-visibility itself doesn't initialize the VisibleToRegularObjSymbols set, which are used to skip vtable visibility upgrade.
Basically, the condition (
llvm-project/llvm/lib/LTO/LTO.cpp
Lines 1901 to 1902 in d0d33d2
| if (WholeProgramVisibilityEnabledInLTO && | |
| Conf.ValidateAllVtablesHaveTypeInfos) { |
WholeProgramVisibilityEnabledInLTO and Conf.ValidateAllVtablesHaveTypeInfos to get DenseSet<GlobalValue::GUID> VisibleToRegularObjSymbols, and WholeProgramVisibilityEnabledInLTO value is based on configuration value (llvm-project/llvm/lib/LTO/LTO.cpp
Lines 1889 to 1893 in d0d33d2
| bool WholeProgramVisibilityEnabledInLTO = | |
| Conf.HasWholeProgramVisibility && | |
| // If validation is enabled, upgrade visibility only when all vtables | |
| // have typeinfos. | |
| (!Conf.ValidateAllVtablesHaveTypeInfos || Conf.AllVtablesHaveTypeInfos); |
I tried to use -whole-program-visibility [1] flag in the second llvm-lto2 RUN command (which is expected to see no devirtualization), and the test failed..
[1]
--- a/llvm/test/ThinLTO/X86/devirt_prevailing.ll
+++ b/llvm/test/ThinLTO/X86/devirt_prevailing.ll
@@ -32,9 +32,7 @@
; RUN: llvm-lto2 run summary.o -save-temps -pass-remarks=. \
; RUN: -thinlto-threads=1 \
; RUN: -o tmp \
-; RUN: --whole-program-visibility-enabled-in-lto=true \
-; RUN: --validate-all-vtables-have-type-infos=true \
-; RUN: --all-vtables-have-type-infos=true \
+; RUN: -whole-program-visibility \
; RUN: -r=summary.o,_ZN4Base8dispatchEv,px \
; RUN: -r=summary.o,_ZN8DerivedN5printEv,px \
; RUN: -r=summary.o,_ZTV8DerivedN,p \
| if (WholeProgramVisibilityEnabledInLTO.getNumOccurrences() > 0) | ||
| Conf.HasWholeProgramVisibility = WholeProgramVisibilityEnabledInLTO; | ||
| if (ValidateAllVtablesHaveTypeInfos.getNumOccurrences() > 0) | ||
| Conf.ValidateAllVtablesHaveTypeInfos = ValidateAllVtablesHaveTypeInfos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think either of these are needed for the test as ValidateAllVtablesHaveTypeInfos defaults to false, meaning we don't do this validation, and AllVtablesHaveTypeInfos is only checked if ValidateAllVtablesHaveTypeInfos is true.
| ret void | ||
| } | ||
|
|
||
| define linkonce_odr hidden void @_ZN8DerivedNC2Ev(ptr %this) #0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this function be removed from the test? It doesn't seem to be referenced other than in the @llvm.used (which could then presumably also be removed).
teresajohnson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Before this patch, whole program devirtualization is suppressed on a class if any superclass is visible to regular object files, by recording the class GUID in
VisibleToRegularObjSymbols.This patch suppresses whole program devirtualization on a class if the LTO unit doesn't have the prevailing definition of this class or any of its superclass (e.g., the prevailing definition is in a shared library)
Implementation summaries:
IsVisibleToRegularObjis updated to look at the global resolution'sIsPrevailingbit for ThinLTO and regularLTO.llvm-lto2can overrideConf.HasWholeProgramVisibility,Conf.ValidateAllVtablesHaveTypeInfosandConf.AllVtablesHaveTypeInfos.The test case is reduced from a small C++ program (main.cc, lib.cc/h pasted below in [1]). To reproduce the program failure without this patch, compile lib.cc into a shared library, and provide it to a ThinLTO build of main.cc (commands are pasted in [2]).
[1]
[2]