@vzakhari
Contributor

Enable the option under opt-for-speed. Elementals with shapes
like `(0, HUGE)` should run faster.
@vzakhari vzakhari requested review from jeanPerier and tblah January 29, 2025 20:37
@llvmbot llvmbot added the flang (Flang issues not falling into any other category) and flang:fir-hlfir labels Jan 29, 2025
@llvmbot
Member

llvmbot commented Jan 29, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Slava Zakharin (vzakhari)

Changes

Enable the option under opt-for-speed. Elementals with shapes
like (0, HUGE) should run faster.


Full diff: https://github.com/llvm/llvm-project/pull/124982.diff

1 Files Affected:

  • (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+9-1)
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 1cc3f0b81c20ad..d55ad9e603ffaf 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -245,7 +245,15 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP,
   }
   pm.addPass(hlfir::createLowerHLFIROrderedAssignments());
   pm.addPass(hlfir::createLowerHLFIRIntrinsics());
-  pm.addPass(hlfir::createBufferizeHLFIR());
+
+  hlfir::BufferizeHLFIROptions bufferizeOptions;
+  // For opt-for-speed, avoid running any of the loops resulting
+  // from hlfir.elemental lowering, if the result is an empty array.
+  // This helps to avoid long running loops for elementals with
+  // shapes like (0, HUGE).
+  if (optLevel.isOptimizingForSpeed())
+    bufferizeOptions.optimizeEmptyElementals = true;
+  pm.addPass(hlfir::createBufferizeHLFIR(bufferizeOptions));
   // Run hlfir.assign inlining again after BufferizeHLFIR,
   // because the latter may introduce new hlfir.assign operations,
   // e.g. for copying an array into a temporary due to
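
To illustrate why a shape like `(0, HUGE)` can be slow without this option, here is a minimal standalone sketch in plain C++. It is not Flang code: the function names `naiveElemental` and `guardedElemental` are hypothetical, and the sketch assumes that the loop nest iterates the first dimension innermost (column-major order) and that the empty-result check takes the form of selecting zero for every upper bound. Both are assumptions made for illustration, not a description of the actual BufferizeHLFIR lowering.

```cpp
#include <cstdint>

// Hypothetical model of a column-major loop nest over a result of shape
// (n0, n1): the first dimension is assumed to be the innermost loop.
// Returns how many outer-loop iterations were executed.
std::uint64_t naiveElemental(std::uint64_t n0, std::uint64_t n1) {
  std::uint64_t outerTrips = 0;
  for (std::uint64_t j = 0; j < n1; ++j) {
    ++outerTrips; // Still executed n1 times when n0 == 0.
    for (std::uint64_t i = 0; i < n0; ++i) {
      // The element computation would go here.
    }
  }
  return outerTrips;
}

// Same nest, but every upper bound is replaced by zero when the result is
// empty, so none of the loops run (assumed form of the optimization).
std::uint64_t guardedElemental(std::uint64_t n0, std::uint64_t n1) {
  const bool isEmpty = (n0 == 0) || (n1 == 0);
  const std::uint64_t ub0 = isEmpty ? 0 : n0;
  const std::uint64_t ub1 = isEmpty ? 0 : n1;
  std::uint64_t outerTrips = 0;
  for (std::uint64_t j = 0; j < ub1; ++j) {
    ++outerTrips;
    for (std::uint64_t i = 0; i < ub0; ++i) {
      // The element computation would go here.
    }
  }
  return outerTrips;
}
```

For a shape of `(0, 1000)`, `naiveElemental` still performs 1000 outer iterations that do no work, while `guardedElemental` performs none. For non-empty shapes the two functions behave identically.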

@vzakhari
Contributor Author

An x86 performance run showed some fluctuations on fatigue2 and cactusADM, but they look like noise to me. The patch is not triggered in cactusADM at all. In fatigue2, selects are inserted in one spot, which shifts the instruction addresses, but otherwise the code looks the same.

Contributor

@jeanPerier jeanPerier left a comment

Thanks

Contributor

@tblah tblah left a comment

No change to SPEC2017 results on aarch64.

@vzakhari vzakhari merged commit 81f5098 into llvm:main Jan 30, 2025
11 checks passed