[flang][Driver] Enable -Os and -Oz in flang #164707
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-flang-driver

Author: Tarun Prabhu (tarunprabhu)

Changes

The implementation adheres closely to the implementation in clang. The effect on the pass pipelines has been tested.

Fixes #62268

Full diff: https://github.com/llvm/llvm-project/pull/164707.diff

4 Files Affected:
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index dc3da7ba5c7f3..5892d3ef24e5a 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -20,6 +20,8 @@ CODEGENOPT(Name, Bits, Default)
#endif
CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified.
+/// The -Os (==1) or -Oz (==2) option is specified.
+CODEGENOPT(OptimizeSize, 2, 0)
CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
///< pass manager.
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 548ca675db5ea..6dfff3b001aea 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -114,6 +114,10 @@ static unsigned getOptimizationLevel(llvm::opt::ArgList &args,
assert(a->getOption().matches(clang::driver::options::OPT_O));
+ llvm::StringRef s(a->getValue());
+ if (s == "s" || s == "z")
+ return 2;
+
return getLastArgIntValue(args, clang::driver::options::OPT_O, defaultOpt,
diags);
}
@@ -121,6 +125,24 @@ static unsigned getOptimizationLevel(llvm::opt::ArgList &args,
return defaultOpt;
}
+/// Extracts the size-optimization level from \a args
+static unsigned getOptimizationLevelSize(llvm::opt::ArgList &args) {
+ if (llvm::opt::Arg *a =
+ args.getLastArg(clang::driver::options::OPT_O_Group)) {
+ if (a->getOption().matches(clang::driver::options::OPT_O)) {
+ switch (a->getValue()[0]) {
+ default:
+ return 0;
+ case 's':
+ return 1;
+ case 'z':
+ return 2;
+ }
+ }
+ }
+ return 0;
+}
+
bool Fortran::frontend::parseDiagnosticArgs(clang::DiagnosticOptions &opts,
llvm::opt::ArgList &args) {
opts.ShowColors = parseShowColorsArgs(args);
@@ -273,6 +295,7 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
llvm::opt::ArgList &args,
clang::DiagnosticsEngine &diags) {
opts.OptimizationLevel = getOptimizationLevel(args, diags);
+ opts.OptimizeSize = getOptimizationLevelSize(args);
if (args.hasFlag(clang::driver::options::OPT_fdebug_pass_manager,
clang::driver::options::OPT_fno_debug_pass_manager, false))
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 0c630d2ba876d..d6fb98bc930c6 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -602,7 +602,16 @@ mapToLevel(const Fortran::frontend::CodeGenOptions &opts) {
case 1:
return llvm::OptimizationLevel::O1;
case 2:
- return llvm::OptimizationLevel::O2;
+ switch (opts.OptimizeSize) {
+ default:
+ llvm_unreachable("Invalid optimization level for size!");
+ case 0:
+ return llvm::OptimizationLevel::O2;
+ case 1:
+ return llvm::OptimizationLevel::Os;
+ case 2:
+ return llvm::OptimizationLevel::Oz;
+ }
case 3:
return llvm::OptimizationLevel::O3;
}
diff --git a/flang/test/Driver/default-optimization-pipelines.f90 b/flang/test/Driver/default-optimization-pipelines.f90
index 08e407f73da5c..18108cd632220 100644
--- a/flang/test/Driver/default-optimization-pipelines.f90
+++ b/flang/test/Driver/default-optimization-pipelines.f90
@@ -14,10 +14,16 @@
! RUN: %flang_fc1 -S -O2 %s -flto=full -fdebug-pass-manager -o /dev/null 2>&1 | FileCheck %s --check-prefix=CHECK-O2-LTO
! RUN: %flang_fc1 -S -O2 %s -flto=thin -fdebug-pass-manager -o /dev/null 2>&1 | FileCheck %s --check-prefix=CHECK-O2-THINLTO
-! Verify that only the left-most `-O{n}` is used
+! Verify that only the right-most `-O{n}` is used
! RUN: %flang -S -O2 -O0 %s -Xflang -fdebug-pass-manager -o /dev/null 2>&1 | FileCheck %s --check-prefix=CHECK-O0
! RUN: %flang_fc1 -S -O2 -O0 %s -fdebug-pass-manager -o /dev/null 2>&1 | FileCheck %s --check-prefix=CHECK-O0
+! Verify that passing -Os/-Oz have the desired effect on the pass pipelines.
+! RUN: %flang -S -Os %s -Xflang -fdebug-pass-manager -o /dev/null 2>&1 \
+! RUN: | FileCheck %s --check-prefix=CHECK-OSIZE
+! RUN: %flang -S -Oz %s -Xflang -fdebug-pass-manager -o /dev/null 2>&1 \
+! RUN: | FileCheck %s --check-prefix=CHECK-OSIZE
+
! CHECK-O0-NOT: Running pass: SimplifyCFGPass on simple_loop_
! CHECK-O0: Running analysis: TargetLibraryAnalysis on simple_loop_
! CHECK-O0-ANYLTO: Running pass: CanonicalizeAliasesPass on [module]
@@ -33,6 +39,12 @@
! CHECK-O2-THINLTO: Running pass: CanonicalizeAliasesPass on [module]
! CHECK-O2-THINLTO: Running pass: NameAnonGlobalPass on [module]
+! -Os/-Oz imply -O2, so check that a pass that runs on O2 is run. Then check
+! that passes like LibShrinkWrap, that should not be run when optimizing for
+! size, are not run (see llvm/lib/Passes/PassBuilderPipelines.cpp).
+! CHECK-OSIZE: Running pass: SimplifyCFGPass on simple_loop_
+! CHECK-OSIZE-NOT: Running pass: LibCallsShrinkWrapPass on simple_loop_
+
subroutine simple_loop
integer :: i
do i=1,5
Could you post in https://discourse.llvm.org/t/code-size-optimization-flags-in-flang/69482 about this patch, in case anyone has a different opinion? |
|
Do -Os and -Oz run with optimization level 2? At these optimization levels flang might perform intrinsic inlining, copy-in inlining, math function inlining, loop versioning, etc., all of which might increase code size. |
Yes. This intentionally mirrors clang's implementation. Some LLVM optimizations like IPSCCP, inlining (or here) examine this level internally, but I haven't looked at exactly how these are affected. We could do something different in Do you think that should be done as part of this PR? |
I think what Kiran is referring to is the MLIR pass pipeline inside of flang. I don't think there is any equivalent to this in clang. See |
Force-pushed from 8619920 to 2d755d7.
Ah, thanks for clarifying. I'll take a look |
|
Note to reviewers: Please disregard any changes that you see in this PR until I change it back out of draft mode. |
For the LLVM passes, the implementation mirrors that in clang. A number of FIR and MLIR passes that may result in an increase in code size are not run when either of these options is provided. Explicit speedup and size levels are used instead of isOptimizingForSize since the latter assumes that speedupLevel is 0 if optimizing for size. However, optimizing for size implies that the speedup level is 2.
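To make the "explicit speedup and size levels" idea concrete, here is a minimal, self-contained sketch of how the value following `-O` could map to a pair of levels. It folds the two helpers from the diff (`getOptimizationLevel` and `getOptimizationLevelSize`) into one function for brevity. The names `OptLevels` and `parseOptValue` are hypothetical and do not appear in flang; the fallback for unrecognized values is also an assumption of this sketch, not the driver's actual diagnostic behavior.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for the pair of levels described above.
// These names are illustrative and are not flang's actual types.
struct OptLevels {
  unsigned speedup; // 0-3, as for -O0 through -O3
  unsigned size;    // 0 = not optimizing for size, 1 = -Os, 2 = -Oz
};

// Maps the value following "-O" (e.g. "2", "s", "z") to explicit levels.
// -Os and -Oz both imply a speedup level of 2, as in the diff.
// Assumption: unrecognized values fall back to {0, 0} in this sketch.
constexpr OptLevels parseOptValue(std::string_view value) {
  if (value == "s")
    return {2, 1};
  if (value == "z")
    return {2, 2};
  if (value.size() == 1 && value[0] >= '0' && value[0] <= '3')
    return {static_cast<unsigned>(value[0] - '0'), 0};
  return {0, 0};
}
```

Keeping the two levels separate means a consumer can ask "is the speedup level at least 2?" and "are we optimizing for size?" independently, which is the property the description says isOptimizingForSize does not provide.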
Force-pushed from 7154eb8 to 5d29984.
|
Have you gathered any metrics on the executable size/perf impact? For instance, I expect that All in all, I am a bit worried about complicating the flang pipeline (at least the MLIR pipeline) for this feature, whose use case in the Fortran user community is a bit unclear to me. I do expect that a big part of the code size impact is currently related to LLVM optimizations (inlining/loop versioning/unrolling). More options are great on paper, but they also mean more complexity in the compiler, more technical debt, more sources of bugs, and more difficulty reproducing issues. If you have metrics and a use-case that justify the modification of the MLIR pipeline (say something like getting x% executable size reduction at the cost of y% slow-down on SPEC where x% and y% are what users using |
I have not. That's a good point. I'll see what I can put together. I don't have a use-case beyond addressing the bug-report that was filed which referenced the SPEC 2017 benchmark suite. Perhaps @ceseo, who filed that report, has more insight.
Do you think it would be worth splitting this into two PRs? The first would accept the The second would tweak the MLIR pass pipeline, after a more careful analysis of the effects of the various passes (it may also be that we choose not to make any changes to the MLIR pass pipeline at all). |
|
Maybe we could just tweak the MLIR pipeline for the most obvious cases (LoopVersioning, Inlining) and be extremely conservative about anything with an uncertain payoff. |
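A conservative gating scheme along these lines might look like the sketch below. It is illustrative only: `PipelineConfig`, `selectOptionalPasses`, and the pass name strings are hypothetical placeholders, and the thresholds do not reflect flang's real MLIR pipeline. The point is just that only the transformations most likely to grow code are skipped when a size level is set, while everything else is left alone.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical configuration; the field names are illustrative only.
struct PipelineConfig {
  unsigned speedupLevel = 0; // 0-3
  unsigned sizeLevel = 0;    // 0 = none, 1 = -Os, 2 = -Oz
};

// Returns the names of the optional passes this sketch would schedule.
// The strings are placeholders, not real flang or MLIR pass names.
std::vector<std::string> selectOptionalPasses(const PipelineConfig &cfg) {
  std::vector<std::string> passes;
  if (cfg.speedupLevel == 0)
    return passes; // nothing optional runs without optimization

  // Placeholder for cleanups that are not expected to grow code.
  passes.push_back("simplify");

  // Skip only the most obvious size-increasing transformations when a
  // size level is requested; everything else stays unchanged.
  const bool optimizeForSize = cfg.sizeLevel > 0;
  if (!optimizeForSize) {
    passes.push_back("loop-versioning");
    passes.push_back("inlining");
  }
  return passes;
}
```

With this shape, adding or removing a pass from the size-sensitive set is a one-line change, which keeps the extra pipeline complexity small while metrics are still being gathered.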