Commit 33509ca

Resolve review comments:
- Improve documentation and comments.
- Update release notes.
1 parent 70c4b6f commit 33509ca

7 files changed: +64 -22 lines changed

clang/docs/ReleaseNotes.rst

Lines changed: 1 addition & 0 deletions
@@ -343,6 +343,7 @@ Modified Compiler Flags
343343
-----------------------
344344
- The `-gkey-instructions` compiler flag is now enabled by default when DWARF is emitted for plain C/C++ and optimizations are enabled. (#GH149509)
345345
- The `-fconstexpr-steps` compiler flag now accepts value `0` to opt out of this limit. (#GH160440)
346+
- The `-fdevirtualize-speculatively` compiler flag is now supported to enable speculative devirtualization of virtual function calls. It is disabled by default. (#GH159685)
346347

347348
Removed Compiler Flags
348349
-------------------------

clang/docs/UsersManual.rst

Lines changed: 46 additions & 7 deletions
@@ -2328,13 +2328,6 @@ are listed below.
23282328
This enables better devirtualization. Turned off by default, because it is
23292329
still experimental.
23302330

2331-
.. option:: -fdevirtualize-speculatively
2332-
2333-
Enable speculative devirtualization optimization, such as single-implementation
2334-
devirtualization. This optimization is used out of LTO mode for now.
2335-
Turned off by default.
2336-
TODO: Enable for LTO mode.
2337-
23382331
.. option:: -fwhole-program-vtables
23392332

23402333
Enable whole-program vtable optimizations, such as single-implementation
@@ -2359,6 +2352,52 @@ are listed below.
23592352
pure ThinLTO, as all split regular LTO modules are merged and LTO linked
23602353
with regular LTO.
23612354

2355+
.. option:: -fdevirtualize-speculatively
2356+
2357+
Enable speculative devirtualization optimization where a virtual call
2358+
can be transformed into a direct call under the assumption that its
2359+
object is of a particular type. A runtime check is inserted to validate
2360+
the assumption before making the direct call, and if the check fails,
2361+
the original virtual call is made instead. This optimization can enable
2362+
more inlining opportunities and better optimization of the direct call.
2363+
This is different from other whole program devirtualization optimizations
2364+
that rely on global analysis and hidden visibility of the objects to prove
2365+
that the object is always of a particular type at a virtual call site.
2366+
This optimization doesn't require global analysis or hidden visibility.
2367+
This optimization doesn't devirtualize all virtual calls, but only
2368+
when there's a single implementation of the virtual function.
2369+
There could be a single implementation of the virtual function
2370+
either because the function is not overridden in any derived class,
2371+
or because there is a single instantiated object that is using the function.
2372+
2373+
Example of IR before the optimization:
2374+
.. code-block:: llvm
2375+
%vtable = load ptr, ptr %BV, align 8, !tbaa !6
2376+
%0 = tail call i1 @llvm.public.type.test(ptr %vtable, metadata !"_ZTS4Base")
2377+
tail call void @llvm.assume(i1 %0)
2378+
%1 = load ptr, ptr %vtable, align 8
2379+
tail call void %1(ptr noundef nonnull align 8 dereferenceable(8) %BV)
2380+
ret void
2381+
2382+
IR after the optimization:
2383+
.. code-block:: llvm
2384+
%vtable = load ptr, ptr %BV, align 8, !tbaa !12
2385+
%0 = load ptr, ptr %vtable, align 8
2386+
%1 = icmp eq ptr %0, @_ZN4Base17virtual_function1Ev
2387+
br i1 %1, label %if.true.direct_targ, label %if.false.orig_indirect, !prof !15
2388+
if.true.direct_targ: ; preds = %entry
2389+
tail call void @_ZN4Base17virtual_function1Ev(ptr noundef nonnull align 8 dereferenceable(8) %BV)
2390+
br label %if.end.icp
2391+
if.false.orig_indirect: ; preds = %entry
2392+
tail call void %0(ptr noundef nonnull align 8 dereferenceable(8) %BV)
2393+
br label %if.end.icp
2394+
2395+
if.end.icp: ; preds = %if.false.orig_indirect, %if.true.direct_targ
2396+
ret void
2397+
This feature is temporarily ignored at the LLVM side when LTO is enabled.
2398+
TODO: Update the comment when the LLVM side supports it.
2399+
This feature is turned off by default.
2400+
23622401
.. option:: -f[no-]unique-source-file-names
23632402

23642403
When enabled, allows the compiler to assume that each object file
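
The guarded direct call described in the `-fdevirtualize-speculatively` documentation above can be pictured at the source level. The following is a minimal hand-written sketch, not what the compiler emits: the names `Base`, `Derived`, `call_virtual`, and `call_guarded` are hypothetical, and the guard uses `typeid` as a stand-in, whereas the IR in the documentation compares the function pointer loaded from the vtable against the single known implementation.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical class hierarchy used only for illustration.
struct Base {
  virtual ~Base() = default;
  virtual int virtual_function() { return 1; }
};

struct Derived : Base {
  int virtual_function() override { return 2; }
};

// Ordinary virtual call: the target is loaded from the vtable at run time.
int call_virtual(Base &b) { return b.virtual_function(); }

// Source-level analogue of the speculated form: check an assumption about
// the object, take a direct (non-virtual) call if it holds, and otherwise
// fall back to the original virtual call.
// ASSUMPTION: the typeid comparison stands in for the pointer comparison
// (icmp eq on the loaded vtable slot) shown in the documented IR.
int call_guarded(Base &b) {
  if (typeid(b) == typeid(Base))
    return b.Base::virtual_function(); // direct call, a candidate for inlining
  return b.virtual_function();         // fallback: original virtual call
}

// Both forms must agree for every dynamic type.
void self_check() {
  Base b;
  Derived d;
  assert(call_guarded(b) == call_virtual(b));
  assert(call_guarded(d) == call_virtual(d));
}
```

One difference worth noting: a guard on the function pointer, as in the documented IR, also takes the direct path for a derived class that does not override the function, while the `typeid` guard in this sketch does not.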

clang/lib/CodeGen/CGClass.cpp

Lines changed: 2 additions & 2 deletions
@@ -2827,9 +2827,9 @@ void CodeGenFunction::EmitTypeMetadataCodeForVCall(const CXXRecordDecl *RD,
28272827
SourceLocation Loc) {
28282828
if (SanOpts.has(SanitizerKind::CFIVCall))
28292829
EmitVTablePtrCheckForCall(RD, VTable, CodeGenFunction::CFITCK_VCall, Loc);
2830+
// Emit the type test assumes for the features of WPD (only when LTO
2831+
// visibility is NOT public) and speculative devirtualization.
28302832
else if ((CGM.getCodeGenOpts().WholeProgramVTables &&
2831-
// Don't insert type test assumes if we are forcing public
2832-
// visibility.
28332833
!CGM.AlwaysHasLTOVisibilityPublic(RD)) ||
28342834
CGM.getCodeGenOpts().DevirtualizeSpeculatively) {
28352835
CanQualType Ty = CGM.getContext().getCanonicalTagType(RD);
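
The branch structure of `EmitTypeMetadataCodeForVCall` shown above can be modeled as a small decision function. This is a simplified sketch under stated assumptions: `VCallOptions`, `VCallMetadata`, and `classify_vcall` are hypothetical names that do not exist in Clang, and the `None` result assumes nothing else is emitted when neither branch applies, which the hunk does not show.

```cpp
#include <cassert>

// Hypothetical flattened view of the options consulted in the hunk.
struct VCallOptions {
  bool cfi_vcall;                    // SanOpts.has(SanitizerKind::CFIVCall)
  bool whole_program_vtables;        // CodeGenOpts.WholeProgramVTables
  bool always_public_lto_visibility; // AlwaysHasLTOVisibilityPublic(RD)
  bool devirtualize_speculatively;   // CodeGenOpts.DevirtualizeSpeculatively
};

enum class VCallMetadata { CfiCheck, TypeTestAssume, None };

// Mirrors the if / else-if in the hunk: CFI takes priority; otherwise a
// type test assume is emitted for WPD (unless LTO visibility is forced
// public) or for speculative devirtualization.
VCallMetadata classify_vcall(const VCallOptions &o) {
  if (o.cfi_vcall)
    return VCallMetadata::CfiCheck;
  if ((o.whole_program_vtables && !o.always_public_lto_visibility) ||
      o.devirtualize_speculatively)
    return VCallMetadata::TypeTestAssume;
  return VCallMetadata::None; // ASSUMPTION: no other case emits metadata
}

void check_examples() {
  // Speculative devirtualization alone is enough to request the assume.
  assert(classify_vcall({false, false, false, true}) ==
         VCallMetadata::TypeTestAssume);
  // WPD with forced public visibility does not request it.
  assert(classify_vcall({false, true, true, false}) == VCallMetadata::None);
}
```

The point of the model is that the visibility restriction applies only to the whole-program path: the speculative flag bypasses it.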

clang/lib/CodeGen/ItaniumCXXABI.cpp

Lines changed: 7 additions & 3 deletions
@@ -716,6 +716,9 @@ CGCallee ItaniumCXXABI::EmitLoadOfMemberFunctionPointer(
716716

717717
bool ShouldEmitVFEInfo = CGM.getCodeGenOpts().VirtualFunctionElimination &&
718718
CGM.HasHiddenLTOVisibility(RD);
719+
// TODO: Update this name not to be restricted to WPD only
720+
// as we now emit the vtable info for speculative devirtualization as
721+
// well.
719722
bool ShouldEmitWPDInfo =
720723
(CGM.getCodeGenOpts().WholeProgramVTables &&
721724
// Don't insert type tests if we are forcing public visibility.
@@ -2111,9 +2114,10 @@ void ItaniumCXXABI::emitVTableDefinitions(CodeGenVTables &CGVT,
21112114

21122115
// Always emit type metadata on non-available_externally definitions, and on
21132116
// available_externally definitions if we are performing whole program
2114-
// devirtualization. For WPD we need the type metadata on all vtable
2115-
// definitions to ensure we associate derived classes with base classes
2116-
// defined in headers but with a strong definition only in a shared library.
2117+
// devirtualization or speculative devirtualization. We need the type metadata
2118+
// on all vtable definitions to ensure we associate derived classes with base
2119+
// classes defined in headers but with a strong definition only in a shared
2120+
// library.
21172121
if (!VTable->isDeclarationForLinker() ||
21182122
CGM.getCodeGenOpts().WholeProgramVTables ||
21192123
CGM.getCodeGenOpts().DevirtualizeSpeculatively) {

clang/lib/Driver/ToolChains/Clang.cpp

Lines changed: 0 additions & 2 deletions
@@ -7745,8 +7745,6 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
77457745

77467746
addOpenMPHostOffloadingArgs(C, JA, Args, CmdArgs);
77477747

7748-
// Temporarily disable this for LTO if it's not explicitly enabled.
7749-
// TODO: enable it by default for LTO also.
77507748
if (Args.hasFlag(options::OPT_fdevirtualize_speculatively,
77517749
options::OPT_fno_devirtualize_speculatively,
77527750
/*Default value*/ false))

clang/test/CodeGenCXX/type-metadata.cpp

Lines changed: 0 additions & 8 deletions
@@ -14,9 +14,6 @@
1414
// RUN: %clang_cc1 -O2 -flto -flto-unit -triple x86_64-unknown-linux -fwhole-program-vtables -emit-llvm -o - %s | FileCheck --check-prefix=ITANIUM-OPT --check-prefix=ITANIUM-OPT-LAYOUT %s
1515
// RUN: %clang_cc1 -flto -flto-unit -triple x86_64-pc-windows-msvc -fwhole-program-vtables -emit-llvm -o - %s | FileCheck --check-prefix=VTABLE-OPT --check-prefix=MS --check-prefix=MS-TYPEMETADATA --check-prefix=TT-MS %s
1616

17-
// Test for the speculative devirtualization feature in nonlto mode:
18-
// RUN: %clang_cc1 -triple x86_64-unknown-linux -fdevirtualize-speculatively -emit-llvm -o - %s | FileCheck --check-prefix=VTABLE-OPT --check-prefix=TT-ITANIUM-DEFAULT-NOLTO-SPECULATIVE-DEVIRT %s
19-
2017
// Tests for cfi + whole-program-vtables:
2118
// RUN: %clang_cc1 -flto -flto-unit -triple x86_64-unknown-linux -fvisibility=hidden -fsanitize=cfi-vcall -fsanitize-trap=cfi-vcall -fwhole-program-vtables -emit-llvm -o - %s | FileCheck --check-prefix=CFI --check-prefix=CFI-VT --check-prefix=ITANIUM-HIDDEN --check-prefix=ITANIUM-COMMON-MD --check-prefix=TC-ITANIUM --check-prefix=ITANIUM-NO-RV-MD %s
2219
// RUN: %clang_cc1 -flto -flto-unit -triple x86_64-pc-windows-msvc -fsanitize=cfi-vcall -fsanitize-trap=cfi-vcall -fwhole-program-vtables -emit-llvm -o - %s | FileCheck --check-prefix=CFI --check-prefix=CFI-VT --check-prefix=MS --check-prefix=MS-TYPEMETADATA --check-prefix=TC-MS %s
@@ -181,7 +178,6 @@ void af(A *a) {
181178
// TT-ITANIUM-HIDDEN: [[P:%[^ ]*]] = call i1 @llvm.type.test(ptr [[VT:%[^ ]*]], metadata !"_ZTS1A")
182179
// TT-ITANIUM-DEFAULT: [[P:%[^ ]*]] = call i1 @llvm.public.type.test(ptr [[VT:%[^ ]*]], metadata !"_ZTS1A")
183180
// TT-MS: [[P:%[^ ]*]] = call i1 @llvm.type.test(ptr [[VT:%[^ ]*]], metadata !"?AUA@@")
184-
// TT-ITANIUM-DEFAULT-NOLTO-SPECULATIVE-DEVIRT: [[P:%[^ ]*]] = call i1 @llvm.public.type.test(ptr [[VT:%[^ ]*]], metadata !"_ZTS1A")
185181
// TC-ITANIUM: [[PAIR:%[^ ]*]] = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 0, metadata !"_ZTS1A")
186182
// TC-ITANIUM-RV: [[PAIR:%[^ ]*]] = call { ptr, i1 } @llvm.type.checked.load.relative(ptr {{%[^ ]*}}, i32 0, metadata !"_ZTS1A")
187183
// TC-MS: [[PAIR:%[^ ]*]] = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 0, metadata !"?AUA@@")
@@ -216,7 +212,6 @@ void df1(D *d) {
216212
// TT-ITANIUM-HIDDEN: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata ![[DTYPE:[0-9]+]])
217213
// TT-ITANIUM-DEFAULT: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata ![[DTYPE:[0-9]+]])
218214
// TT-MS: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata !"?AUA@@")
219-
// TT-ITANIUM-DEFAULT-NOLTO-SPECULATIVE-DEVIRT: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata ![[DTYPE:[0-9]+]])
220215
// TC-ITANIUM: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 0, metadata ![[DTYPE:[0-9]+]])
221216
// TC-ITANIUM-RV: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load.relative(ptr {{%[^ ]*}}, i32 0, metadata ![[DTYPE:[0-9]+]])
222217
// TC-MS: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 0, metadata !"?AUA@@")
@@ -229,7 +224,6 @@ void dg1(D *d) {
229224
// TT-ITANIUM-HIDDEN: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata !"_ZTS1B")
230225
// TT-ITANIUM-DEFAULT: {{%[^ ]*}} = call i1 @llvm.public.type.test(ptr {{%[^ ]*}}, metadata !"_ZTS1B")
231226
// TT-MS: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata !"?AUB@@")
232-
// TT-ITANIUM-DEFAULT-NOLTO-SPECULATIVE-DEVIRT: {{%[^ ]*}} = call i1 @llvm.public.type.test(ptr {{%[^ ]*}}, metadata !"_ZTS1B")
233227
// TC-ITANIUM: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 8, metadata !"_ZTS1B")
234228
// TC-ITANIUM-RV: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load.relative(ptr {{%[^ ]*}}, i32 4, metadata !"_ZTS1B")
235229
// TC-MS: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 0, metadata !"?AUB@@")
@@ -242,7 +236,6 @@ void dh1(D *d) {
242236
// TT-ITANIUM-HIDDEN: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata ![[DTYPE]])
243237
// TT-ITANIUM-DEFAULT: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata ![[DTYPE]])
244238
// TT-MS: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata ![[DTYPE:[0-9]+]])
245-
// TT-ITANIUM-DEFAULT-NOLTO-SPECULATIVE-DEVIRT: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata ![[DTYPE]])
246239
// TC-ITANIUM: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 16, metadata ![[DTYPE]])
247240
// TC-ITANIUM-RV: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load.relative(ptr {{%[^ ]*}}, i32 8, metadata ![[DTYPE]])
248241
// TC-MS: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 8, metadata ![[DTYPE:[0-9]+]])
@@ -304,7 +297,6 @@ void f(D *d) {
304297
// TT-ITANIUM-HIDDEN: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata !"_ZTSN5test21DE")
305298
// TT-ITANIUM-DEFAULT: {{%[^ ]*}} = call i1 @llvm.public.type.test(ptr {{%[^ ]*}}, metadata !"_ZTSN5test21DE")
306299
// TT-MS: {{%[^ ]*}} = call i1 @llvm.type.test(ptr {{%[^ ]*}}, metadata !"?AUA@test2@@")
307-
// TT-ITANIUM-DEFAULT-NOLTO-SPECULATIVE-DEVIRT: {{%[^ ]*}} = call i1 @llvm.public.type.test(ptr {{%[^ ]*}}, metadata !"_ZTSN5test21DE")
308300
// TC-ITANIUM: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 8, metadata !"_ZTSN5test21DE")
309301
// TC-ITANIUM-RV: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load.relative(ptr {{%[^ ]*}}, i32 4, metadata !"_ZTSN5test21DE")
310302
// TC-MS: {{%[^ ]*}} = call { ptr, i1 } @llvm.type.checked.load(ptr {{%[^ ]*}}, i32 0, metadata !"?AUA@test2@@")

llvm/lib/Passes/PassBuilderPipelines.cpp

Lines changed: 8 additions & 0 deletions
@@ -1656,13 +1656,21 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
16561656
if (!LTOPreLink)
16571657
MPM.addPass(RelLookupTableConverterPass());
16581658

1659+
// Add devirtualization pass only when LTO is not enabled, as otherwise
1660+
// the pass is already enabled in the LTO pipeline.
16591661
if (PTO.DevirtualizeSpeculatively && LTOPhase == ThinOrFullLTOPhase::None) {
16601662
MPM.addPass(WholeProgramDevirtPass(
16611663
/*ExportSummary*/ nullptr,
16621664
/*ImportSummary*/ nullptr,
16631665
/*DevirtSpeculatively*/ PTO.DevirtualizeSpeculatively));
16641666
MPM.addPass(LowerTypeTestsPass(nullptr, nullptr,
16651667
lowertypetests::DropTestKind::Assume));
1668+
// Given that the devirtualization creates more opportunities for inlining,
1669+
// we run the Inliner again here to maximize the optimization gain we
1670+
// get from devirtualization.
1671+
// Also, we can't run devirtualization before inlining because the
1672+
// devirtualization depends on the passes optimizing/eliminating vtable GVs
1673+
// and those passes are only effective after inlining.
16661674
if (EnableModuleInliner) {
16671675
MPM.addPass(ModuleInlinerPass(getInlineParamsFromOptLevel(Level),
16681676
UseInlineAdvisor,
