-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[WPD]: Apply speculative WPD in non-lto mode. #145031
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1359,7 +1359,8 @@ void CodeGenModule::EmitVTableTypeMetadata(const CXXRecordDecl *RD, | |
| // Emit type metadata on vtables with LTO or IR instrumentation. | ||
| // In IR instrumentation, the type metadata is used to find out vtable | ||
| // definitions (for type profiling) among all global variables. | ||
| if (!getCodeGenOpts().LTOUnit && !getCodeGenOpts().hasProfileIRInstr()) | ||
| if (!getCodeGenOpts().LTOUnit && !getCodeGenOpts().hasProfileIRInstr() && | ||
| !getCodeGenOpts().WholeProgramVTables) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is this change needed?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi Teresa, thanks for the review. Without checking |
||
| return; | ||
|
|
||
| CharUnits ComponentWidth = GetTargetTypeStoreSize(getVTableComponentType()); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,15 +1,11 @@ | ||
| // RUN: not %clang -target x86_64-unknown-linux -fwhole-program-vtables -### %s 2>&1 | FileCheck --check-prefix=NO-LTO %s | ||
| // RUN: not %clang_cl --target=x86_64-pc-win32 -fwhole-program-vtables -### -- %s 2>&1 | FileCheck --check-prefix=NO-LTO %s | ||
| // NO-LTO: invalid argument '-fwhole-program-vtables' only allowed with '-flto' | ||
| // RUN: %clang -target x86_64-unknown-linux -fwhole-program-vtables -O1 -### %s 2>&1 | FileCheck --check-prefix=WPD-NO-LTO %s | ||
| // RUN: %clang_cl --target=x86_64-pc-win32 -fwhole-program-vtables -O1 -### -- %s 2>&1 | FileCheck --check-prefix=WPD-NO-LTO %s | ||
| // WPD-NO-LTO: "-fwhole-program-vtables" | ||
|
|
||
| // RUN: %clang -target x86_64-unknown-linux -fwhole-program-vtables -flto -### %s 2>&1 | FileCheck --check-prefix=LTO %s | ||
| // RUN: not %clang_cl --target=x86_64-pc-win32 -fwhole-program-vtables -flto -### -- %s 2>&1 | FileCheck --check-prefix=LTO %s | ||
| // LTO: "-fwhole-program-vtables" | ||
|
|
||
| /// -funified-lto does not imply -flto, so we still get an error that fwhole-program-vtables has no effect without -flto | ||
| // RUN: not %clang --target=x86_64-pc-linux-gnu -fwhole-program-vtables -funified-lto -### %s 2>&1 | FileCheck --check-prefix=NO-LTO %s | ||
| // RUN: not %clang --target=x86_64-pc-linux-gnu -fwhole-program-vtables -fno-unified-lto -### %s 2>&1 | FileCheck --check-prefix=NO-LTO %s | ||
|
|
||
| // RUN: %clang -target x86_64-unknown-linux -fwhole-program-vtables -fno-whole-program-vtables -flto -### %s 2>&1 | FileCheck --check-prefix=LTO-DISABLE %s | ||
| // RUN: not %clang_cl --target=x86_64-pc-win32 -fwhole-program-vtables -fno-whole-program-vtables -flto -### -- %s 2>&1 | FileCheck --check-prefix=LTO-DISABLE %s | ||
| // LTO-DISABLE-NOT: "-fwhole-program-vtables" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -98,6 +98,12 @@ class PipelineTuningOptions { | |
| // analyses after various module->function or cgscc->function adaptors in the | ||
| // default pipelines. | ||
| bool EagerlyInvalidateAnalyses; | ||
|
|
||
| /// Tuning option to enable/disable whole program devirtualization. | ||
| /// Its default value is false. | ||
| /// This is controlled by the `-whole-program-vtables` flag. | ||
| /// Used only in non-LTO mode. | ||
| bool WholeProgramDevirt; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Change this to match the proposed new option, and make it independent of LTO. We can presumably use this to do the speculative devirtualization when we are in LTO mode but don't have whole program visibility too. |
||
| }; | ||
|
|
||
| /// This class provides access to building LLVM's passes. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -321,6 +321,7 @@ PipelineTuningOptions::PipelineTuningOptions() { | |
| MergeFunctions = EnableMergeFunctions; | ||
| InlinerThreshold = -1; | ||
| EagerlyInvalidateAnalyses = EnableEagerlyInvalidateAnalyses; | ||
| WholeProgramDevirt = false; | ||
| } | ||
|
|
||
| namespace llvm { | ||
|
|
@@ -1629,6 +1630,23 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level, | |
| if (!LTOPreLink) | ||
| MPM.addPass(RelLookupTableConverterPass()); | ||
|
|
||
| if (PTO.WholeProgramDevirt && LTOPhase == ThinOrFullLTOPhase::None) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this need to be guarded by whether whole program vtables was enabled? We don't for LTO. What happens if it is simply always invoked for non-LTO?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi Teresa, thanks for looking at the patch.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. At least WPD and LTT should not do anything if there is no type metadata (the inliner invocation is a separate story, more comments about that below).
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have that check to avoid running passes that are not needed if the option is not enabled, not only the Inliner pass, but also the WPD and LowerTypeTests pass. |
||
| MPM.addPass(WholeProgramDevirtPass(/*ExportSummary*/ nullptr, | ||
| /*ImportSummary*/ nullptr, | ||
| /*InLTOMode=*/false)); | ||
| MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, | ||
| lowertypetests::DropTestKind::Assume)); | ||
| if (EnableModuleInliner) { | ||
| MPM.addPass(ModuleInlinerPass(getInlineParamsFromOptLevel(Level), | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think you want to reinvoke the inliner. Better to invoke WPD from the module simplifier right before the inliner is invoked there.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems that the passes that eliminate the unused GVs are depending on the inliner where constructors get inlined so it becomes clear which GVs are used and which are not.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess in LTO mode we effectively get 2 rounds of inlining too: one during the pre-LTO compile, which must be what is inlining the constructors, and then one in the LTO backends after WPD. This is unfortunately a bigger change, adding another full round of inlining. But I guess ok because this won't be on by default.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you summarize this info in a comment (i.e. specifically why another round of inlining is needed in non-LTO mode but effectively already exists elsewhere in LTO mode)? |
||
| UseInlineAdvisor, | ||
| ThinOrFullLTOPhase::None)); | ||
| } else { | ||
| MPM.addPass(ModuleInlinerWrapperPass( | ||
| getInlineParamsFromOptLevel(Level), | ||
| /* MandatoryFirst */ true, | ||
| InlineContext{ThinOrFullLTOPhase::None, InlinePass::CGSCCInliner})); | ||
| } | ||
| } | ||
| return MPM; | ||
| } | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking more about this, the option name doesn't really make sense in this case. Perhaps there should be a separate option controlling the non-LTO behavior. Let me think more about this part...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gcc has an option called -fdevirtualize-speculatively, that describes what is being enabled here: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fdevirtualize-speculatively
How about adding that to clang and using it to enable this instead of -fwhole-program-vtables, which doesn't really make sense as what is being enabled isn't whole program in this case.