-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[Clang]: Support opt-in speculative devirtualization #159685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
ab53801
70c4b6f
33509ca
3d6997c
d8863df
f64ee2c
eac8e32
2b1d72e
76ac87d
b1d1aae
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2827,10 +2827,15 @@ void CodeGenFunction::EmitTypeMetadataCodeForVCall(const CXXRecordDecl *RD, | |
| SourceLocation Loc) { | ||
| if (SanOpts.has(SanitizerKind::CFIVCall)) | ||
| EmitVTablePtrCheckForCall(RD, VTable, CodeGenFunction::CFITCK_VCall, Loc); | ||
| else if (CGM.getCodeGenOpts().WholeProgramVTables && | ||
| // Don't insert type test assumes if we are forcing public | ||
| // visibility. | ||
| !CGM.AlwaysHasLTOVisibilityPublic(RD)) { | ||
| // Emit the intrinsics of (type_test and assume) for the features of WPD and | ||
| // speculative devirtualization. For WPD, emit the intrinsics only for the | ||
| // case of non_public LTO visibility. | ||
| // TODO: refactor this condition and similar ones into a function (e.g., | ||
| // ShouldEmitDevirtualizationMD) to encapsulate the details of the different | ||
| // types of devirtualization. | ||
| else if ((CGM.getCodeGenOpts().WholeProgramVTables && | ||
| !CGM.AlwaysHasLTOVisibilityPublic(RD)) || | ||
| CGM.getCodeGenOpts().DevirtualizeSpeculatively) { | ||
| CanQualType Ty = CGM.getContext().getCanonicalTagType(RD); | ||
| llvm::Metadata *MD = CGM.CreateMetadataIdentifierForType(Ty); | ||
| llvm::Value *TypeId = | ||
|
|
@@ -2988,8 +2993,9 @@ void CodeGenFunction::EmitVTablePtrCheck(const CXXRecordDecl *RD, | |
| } | ||
|
|
||
| bool CodeGenFunction::ShouldEmitVTableTypeCheckedLoad(const CXXRecordDecl *RD) { | ||
| if (!CGM.getCodeGenOpts().WholeProgramVTables || | ||
| !CGM.HasHiddenLTOVisibility(RD)) | ||
| if ((!CGM.getCodeGenOpts().WholeProgramVTables || | ||
| !CGM.HasHiddenLTOVisibility(RD)) && | ||
| !CGM.getCodeGenOpts().DevirtualizeSpeculatively) | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. wait, is this devirtualizing ALL calls? How can that work? That doesn't seem like an appropriate implementation? I find myself wondering if this needs to be a 'smaller hammer' here other than TU level.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @erichkeane
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But how can we KNOW there is only a single implementation of the virtual function cross TU? thats the point about linking above. In the GCC case, we know the concrete type of
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Clang does not do the same: https://godbolt.org/z/fE48b3eK3
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See my other answer, but basically it is guarded by a function pointer comparison. @hassnaaHamdi I'll take a look at the code changes later today, but perhaps the PR description needs to have an explanation of what is meant by speculative devirtualization, with a short before and after code snippet.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here are different cases of usages of virtual function, there are cpp code which shows the c++ test case, another tap for IR without enabling devirtualization and another tap for the IR when devietualization is enabled.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it, thanks! The branch based on the equality of the vtables wasn't something that was clear to me, and I would love to see a clang-level test that shows us generating that without the pass happening? (or am I missing something else?).
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Clang here only emits the the intrinsics of (public_type_test and assume) and type metadata for each related vtable (as part of the CGVtables.cpp logic). You can see those metadata nodes at the end of the IR. For example here the vtable of Derived class without the devirt flag: While when we use the devirt flag, it does: That logic already exists in Clang for the LTO/wholeprogramdevirt feature, in this patch I just ask Clang to also emit it when the speculative_devirt flag is enabled.
Does this test fulfil your ask ? clang/test/CodeGenCXX/type-metadata.cpp
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That test isn't quite 'clear' enough? Though perhaps it is good enough for lit. Either way, I think I understand enought o be happy, thanks for your patience.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, there are a lot of other test cases in that test file. I will create another test dedicated to show the clang effects after enabling the flag.
Comment on lines
+2996
to
+2998
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We seem to do that check in multiple places - maybe a
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Which check specifically? this one:
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It might be nice to refactor the whole if statement, although there are a few different variations of the combination of conditions being checked. But yes, it should be named something more general, like shouldEmitVTableTypeInfo or something like that. |
||
| return false; | ||
|
|
||
| if (CGM.getCodeGenOpts().VirtualFunctionElimination) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -716,10 +716,14 @@ CGCallee ItaniumCXXABI::EmitLoadOfMemberFunctionPointer( | |
|
|
||
| bool ShouldEmitVFEInfo = CGM.getCodeGenOpts().VirtualFunctionElimination && | ||
| CGM.HasHiddenLTOVisibility(RD); | ||
| // TODO: Update this name not to be restricted to WPD only | ||
| // as we now emit the vtable info info for speculative devirtualization as | ||
| // well. | ||
| bool ShouldEmitWPDInfo = | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe add a comment that the WPD terminology is legacy and we now also perform speculative devirtualization using the same info. |
||
| CGM.getCodeGenOpts().WholeProgramVTables && | ||
| // Don't insert type tests if we are forcing public visibility. | ||
| !CGM.AlwaysHasLTOVisibilityPublic(RD); | ||
| (CGM.getCodeGenOpts().WholeProgramVTables && | ||
| // Don't insert type tests if we are forcing public visibility. | ||
| !CGM.AlwaysHasLTOVisibilityPublic(RD)) || | ||
| CGM.getCodeGenOpts().DevirtualizeSpeculatively; | ||
| llvm::Value *VirtualFn = nullptr; | ||
|
|
||
| { | ||
|
|
@@ -2110,17 +2114,20 @@ void ItaniumCXXABI::emitVTableDefinitions(CodeGenVTables &CGVT, | |
|
|
||
| // Always emit type metadata on non-available_externally definitions, and on | ||
| // available_externally definitions if we are performing whole program | ||
| // devirtualization. For WPD we need the type metadata on all vtable | ||
| // definitions to ensure we associate derived classes with base classes | ||
| // defined in headers but with a strong definition only in a shared library. | ||
| // devirtualization or speculative devirtualization. We need the type metadata | ||
| // on all vtable definitions to ensure we associate derived classes with base | ||
| // classes defined in headers but with a strong definition only in a shared | ||
| // library. | ||
| if (!VTable->isDeclarationForLinker() || | ||
| CGM.getCodeGenOpts().WholeProgramVTables) { | ||
| CGM.getCodeGenOpts().WholeProgramVTables || | ||
| CGM.getCodeGenOpts().DevirtualizeSpeculatively) { | ||
| CGM.EmitVTableTypeMetadata(RD, VTable, VTLayout); | ||
| // For available_externally definitions, add the vtable to | ||
| // @llvm.compiler.used so that it isn't deleted before whole program | ||
| // analysis. | ||
| if (VTable->isDeclarationForLinker()) { | ||
| assert(CGM.getCodeGenOpts().WholeProgramVTables); | ||
| assert(CGM.getCodeGenOpts().WholeProgramVTables || | ||
| CGM.getCodeGenOpts().DevirtualizeSpeculatively); | ||
| CGM.addCompilerUsedGlobal(VTable); | ||
| } | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| // Test that Clang emits vtable metadata when speculative devirtualization is enabled. | ||
| // RUN: %clang_cc1 -triple x86_64-unknown-linux -fdevirtualize-speculatively -emit-llvm -o - %s | FileCheck --check-prefix=CHECK %s | ||
|
|
||
| struct A { | ||
| A(); | ||
| virtual void f(); | ||
| }; | ||
|
|
||
| struct B : virtual A { | ||
| B(); | ||
| virtual void g(); | ||
| virtual void h(); | ||
| }; | ||
|
|
||
| namespace { | ||
|
|
||
| struct D : B { | ||
| D(); | ||
| virtual void f(); | ||
| virtual void h(); | ||
| }; | ||
|
|
||
| } | ||
|
|
||
| A::A() {} | ||
| B::B() {} | ||
| D::D() {} | ||
|
|
||
| void A::f() { | ||
| } | ||
|
|
||
| void B::g() { | ||
| } | ||
|
|
||
| void D::f() { | ||
| } | ||
|
|
||
| void D::h() { | ||
| } | ||
|
|
||
| void af(A *a) { | ||
| // CHECK: [[P:%[^ ]*]] = call i1 @llvm.public.type.test(ptr [[VT:%[^ ]*]], metadata !"_ZTS1A") | ||
| // CHECK-NEXT: call void @llvm.assume(i1 [[P]]) | ||
| a->f(); | ||
| } | ||
|
|
||
| void dg1(D *d) { | ||
| // CHECK: [[P:%[^ ]*]] = call i1 @llvm.public.type.test(ptr [[VT:%[^ ]*]], metadata !"_ZTS1B") | ||
| // CHECK-NEXT: call void @llvm.assume(i1 [[P]]) | ||
| d->g(); | ||
| } | ||
|
|
||
| void df1(D *d) { | ||
| // CHECK: [[P:%[^ ]*]] = call i1 @llvm.type.test(ptr [[VT:%[^ ]*]], metadata !11) | ||
| // CHECK-NEXT: call void @llvm.assume(i1 [[P]]) | ||
| d->f(); | ||
| } | ||
|
|
||
| void dh1(D *d) { | ||
| // CHECK: [[P:%[^ ]*]] = call i1 @llvm.type.test(ptr [[VT:%[^ ]*]], metadata !11) | ||
| // CHECK-NEXT: call void @llvm.assume(i1 [[P]]) | ||
| d->h(); | ||
| } | ||
|
|
||
|
|
||
| D d; | ||
|
|
||
| void foo() { | ||
| dg1(&d); | ||
| df1(&d); | ||
| dh1(&d); | ||
|
|
||
|
|
||
| struct FA : A { | ||
| void f() {} | ||
| } fa; | ||
| af(&fa); | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -305,6 +305,13 @@ static cl::opt<std::string> InstrumentColdFuncOnlyPath( | |
| "with --pgo-instrument-cold-function-only)"), | ||
| cl::Hidden); | ||
|
|
||
| // TODO: There is a similar flag in WPD pass, we should consolidate them by | ||
| // parsing the option only once in PassBuilder and share it across both places. | ||
| static cl::opt<bool> EnableDevirtualizeSpeculatively( | ||
| "enable-devirtualize-speculatively", | ||
| cl::desc("Enable speculative devirtualization optimization"), | ||
| cl::init(false)); | ||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To enable PhaseOrdering testing, I created that flag since devirtualization is conditionally added to the pipeline. However, since a similar flag exists in the WholeProgramDevirt pass itself, the optimal solution would be to: remove both flags, add flag parsing logic to PassBuilder (similar to some other passes), and update the WholeProgramDevirt pass accordingly. This would provide single flag handling across both PassBuilderPipeline and the pass implementation. |
||
|
|
||
| extern cl::opt<std::string> UseCtxProfile; | ||
| extern cl::opt<bool> PGOInstrumentColdFunctionOnly; | ||
|
|
||
|
|
@@ -326,6 +333,7 @@ PipelineTuningOptions::PipelineTuningOptions() { | |
| MergeFunctions = EnableMergeFunctions; | ||
| InlinerThreshold = -1; | ||
| EagerlyInvalidateAnalyses = EnableEagerlyInvalidateAnalyses; | ||
| DevirtualizeSpeculatively = EnableDevirtualizeSpeculatively; | ||
| } | ||
|
|
||
| namespace llvm { | ||
|
|
@@ -1655,6 +1663,34 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level, | |
| if (!LTOPreLink) | ||
| MPM.addPass(RelLookupTableConverterPass()); | ||
|
|
||
| // Add devirtualization pass only when LTO is not enabled, as otherwise | ||
| // the pass is already enabled in the LTO pipeline. | ||
| if (PTO.DevirtualizeSpeculatively && LTOPhase == ThinOrFullLTOPhase::None) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Add a TODO about LTO
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, TODO about what ? This check here is because that if LTO is enabled, the WPD pass will be already added to the LTO pipeline, so we should not add it again here.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry about the late reply, I initially thought I understood the comment, but now not 😅
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nevermind, this isn't the right place to add the LTO speculative devirt support so a TODO doesn't make sense. Current comment is fine. |
||
| // TODO: explore a better pipeline configuration that can improve | ||
| // compilation time overhead. | ||
| MPM.addPass(WholeProgramDevirtPass( | ||
| /*ExportSummary*/ nullptr, | ||
| /*ImportSummary*/ nullptr, | ||
| /*DevirtSpeculatively*/ PTO.DevirtualizeSpeculatively)); | ||
| MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, | ||
| lowertypetests::DropTestKind::Assume)); | ||
| // Given that the devirtualization creates more opportunities for inlining, | ||
| // we run the Inliner again here to maximize the optimization gain we | ||
| // get from devirtualization. | ||
| // Also, we can't run devirtualization before inlining because the | ||
| // devirtualization depends on the passes optimizing/eliminating vtable GVs | ||
| // and those passes are only effective after inlining. | ||
| if (EnableModuleInliner) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you remind me why you need another round of inlining vs doing this before the earlier inline pass? Is this an optimization for a specific use case that needs 2 rounds of inlining? I'm concerned about the potential side effects.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have added comments about this:
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you clarify what vtable GV optimization is needed to expose these opportunities?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, is there a test being added that would fail without this inliner invocation?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I mean optimisation passes like GlobalOpt.cpp pass which eliminates the unused GVs, and then when devirtualization works, it finds a single GV that is refering to the virtual function, so it can devirtualize.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will look at how the DevirtSCCRepeatedPass is invoked.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tested the compile-time on ct-mark, and the compilation overhead seems big for each of the inliner and the WPD.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, I see that integrating the WPD pass into another pipeline is non-trivial and would require changes across multiple places. Additionally, I'm uncertain if this approach will succeed given that WPD is a module pass. Does it make sense to keep the current opt-in speculative devirtualization changes for now, and revisit the pipeline integration as a follow-up?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm open to putting it in as is for now, since it is disabled by default, and adding a TODO to explore a better pipeline configuration.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added the TODO. |
||
| MPM.addPass(ModuleInlinerPass(getInlineParamsFromOptLevel(Level), | ||
| UseInlineAdvisor, | ||
| ThinOrFullLTOPhase::None)); | ||
| } else { | ||
| MPM.addPass(ModuleInlinerWrapperPass( | ||
| getInlineParamsFromOptLevel(Level), | ||
| /* MandatoryFirst */ true, | ||
| InlineContext{ThinOrFullLTOPhase::None, InlinePass::CGSCCInliner})); | ||
| } | ||
| } | ||
| return MPM; | ||
| } | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we document the implications that this has? This seems like it would break linking, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linking? I think linking is not related here, maybe you got confused by the LTO related code ?
LTO related code is old code, I just modified it to also emit the type test assumes when the flag of devirt is enabled. But the change here is not related to LTO or linking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with devirtualization is always that we don't know what is going on in other TUs, right? There could be other variants of the function in other types in the same virtual tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you are right. That's why this is only 'speculative' so it should be safe regardless of how other modules could affect the inheritance tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain "should be safe"? Do you mean: "The user is promising us that it is safe", or "we are proving it is safe"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speculative devirtualization transforms an indirect call to a guarded direct call. It is guarded by a comparison of the virtual function pointer to the expected target. So it is always safe. We also do this with profile information (in that case aka IndirectCallPromotion).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this seems interesting and more sound. Can you share some IR of the 'after' here that shows that? More importantly, can we see some tests to that effect?