Skip to content

Conversation

@luporl
Copy link
Contributor

@luporl luporl commented Aug 20, 2025

Because character comparison appends spaces to the shorter character,
calls to trim() that are used only in the comparison can be eliminated.

Example:
trim(x) == trim(y)
can be simplified to
x == y

This makes 527.cam4_r about 3% faster, measured on Neoverse V2.

This patch implements the optimization above in a new pass:
ExpressionSimplification. Although no other expression simplifications
are planned at the moment, they could be easily added to the new pass.

The ExpressionSimplification pass runs early in the HLFIR pipeline, to
make it easy to identify expressions before any transformations occur.

Because character comparison appends spaces to the shorter character,
calls to trim() that are used only in the comparison can be eliminated.

Example:
`trim(x) == trim(y)`
can be simplified to
`x == y`

This makes 527.cam4_r about 3% faster, measured on Neoverse V2.

This patch implements the optimization above in a new pass:
ExpressionSimplification. Although no other expression simplifications
are planned at the moment, they could be easily added to the new pass.

The ExpressionSimplification pass runs early in the HLFIR pipeline, to
make it easy to identify expressions before any transformations occur.
@llvmbot llvmbot added flang:driver flang Flang issues not falling into any other category flang:fir-hlfir labels Aug 20, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 20, 2025

@llvm/pr-subscribers-flang-driver

@llvm/pr-subscribers-flang-fir-hlfir

Author: Leandro Lupori (luporl)

Changes

Because character comparison appends spaces to the shorter character,
calls to trim() that are used only in the comparison can be eliminated.

Example:
trim(x) == trim(y)
can be simplified to
x == y

This makes 527.cam4_r about 3% faster, measured on Neoverse V2.

This patch implements the optimization above in a new pass:
ExpressionSimplification. Although no other expression simplifications
are planned at the moment, they could be easily added to the new pass.

The ExpressionSimplification pass runs early in the HLFIR pipeline, to
make it easy to identify expressions before any transformations occur.


Patch is 25.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/154593.diff

6 Files Affected:

  • (modified) flang/include/flang/Optimizer/HLFIR/Passes.td (+5)
  • (modified) flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt (+1)
  • (added) flang/lib/Optimizer/HLFIR/Transforms/ExpressionSimplification.cpp (+294)
  • (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+3)
  • (modified) flang/test/Driver/mlir-pass-pipeline.f90 (+1)
  • (added) flang/test/HLFIR/expression-simplification.fir (+173)
diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td
index 04d7aec5fe489..43646c6c05ead 100644
--- a/flang/include/flang/Optimizer/HLFIR/Passes.td
+++ b/flang/include/flang/Optimizer/HLFIR/Passes.td
@@ -61,6 +61,11 @@ def SimplifyHLFIRIntrinsics : Pass<"simplify-hlfir-intrinsics"> {
                         "the hlfir.matmul.">];
 }
 
+def ExpressionSimplification
+    : Pass<"expression-simplification", "::mlir::ModuleOp"> {
+  let summary = "Simplify Fortran expressions";
+}
+
 def InlineElementals : Pass<"inline-elementals"> {
   let summary = "Inline chained hlfir.elemental operations";
 }
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
index cc74273d9c5d9..28153076df0fb 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
@@ -3,6 +3,7 @@ get_property(dialect_libs GLOBAL PROPERTY MLIR_DIALECT_LIBS)
 add_flang_library(HLFIRTransforms
   BufferizeHLFIR.cpp
   ConvertToFIR.cpp
+  ExpressionSimplification.cpp
   InlineElementals.cpp
   InlineHLFIRAssign.cpp
   InlineHLFIRCopyIn.cpp
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/ExpressionSimplification.cpp b/flang/lib/Optimizer/HLFIR/Transforms/ExpressionSimplification.cpp
new file mode 100644
index 0000000000000..d9e73482020f8
--- /dev/null
+++ b/flang/lib/Optimizer/HLFIR/Transforms/ExpressionSimplification.cpp
@@ -0,0 +1,294 @@
+#include "flang/Optimizer/Builder/FIRBuilder.h"
+#include "flang/Optimizer/HLFIR/HLFIROps.h"
+#include "flang/Optimizer/HLFIR/Passes.h"
+#include "flang/Runtime/entry-names.h"
+#include "llvm/Support/DebugLog.h"
+
+namespace hlfir {
+#define GEN_PASS_DEF_EXPRESSIONSIMPLIFICATION
+#include "flang/Optimizer/HLFIR/Passes.h.inc"
+} // namespace hlfir
+
+#define DEBUG_TYPE "expression-simplification"
+
+#define INDENT(n) std::string((n) * 2, ' ')
+
+static void removeOperands(mlir::Operation *op, int nestLevel);
+
+static void removeOp(mlir::Operation *op, int parentUses, int nestLevel) {
+  int opUses = std::distance(op->getUses().begin(), op->getUses().end());
+
+  if (opUses <= parentUses) {
+    LDBG() << INDENT(nestLevel) << "remove: " << *op;
+    removeOperands(op, nestLevel);
+    op->dropAllReferences();
+    op->dropAllUses();
+    op->erase();
+  }
+}
+
+static void removeOp(mlir::Operation *op) {
+  removeOp(op, /*parentUses=*/0, /*nestLevel=*/0);
+  LDBG();
+}
+
+static void removeOperands(mlir::Operation *op, int nestLevel) {
+  for (mlir::Value operand : op->getOperands()) {
+    if (!operand)
+      // Already removed.
+      continue;
+    if (mlir::Operation *operandOp = operand.getDefiningOp()) {
+      int uses = 0;
+      for (mlir::Operation *user : operandOp->getUsers())
+        if (user == op)
+          ++uses;
+      removeOp(operandOp, uses, nestLevel + 1);
+    }
+  }
+}
+
+template <typename UserOp>
+static UserOp getFirstUser(mlir::Operation *op) {
+  auto it = op->user_begin(), end = op->user_end(), prev = it;
+  for (; it != end; prev = it++)
+    ;
+  if (prev != end)
+    if (auto userOp = mlir::dyn_cast<UserOp>(*prev))
+      return userOp;
+  return {};
+}
+
+template <typename UserOp>
+static UserOp getLastUser(mlir::Operation *op) {
+  if (!op->getUsers().empty())
+    if (auto userOp = mlir::dyn_cast<UserOp>(*op->user_begin()))
+      return userOp;
+  return {};
+}
+
+template <typename UserOp>
+static UserOp getPreviousUser(mlir::Operation *op, mlir::Operation *curUser) {
+  for (auto user = op->user_begin(), end = op->user_end(); user != end;
+       ++user) {
+    if (*user == curUser) {
+      if (++user == end)
+        break;
+      if (auto userOp = mlir::dyn_cast<UserOp>(*user))
+        return userOp;
+      break;
+    }
+  }
+  return {};
+}
+
+// Check if operation has the expected number of uses.
+static bool expectUses(mlir::Operation *op, int expUses) {
+  int uses = std::distance(op->use_begin(), op->use_end());
+  if (uses != expUses) {
+    LDBG() << "expectUses: expected " << expUses << ", got " << uses;
+    for (mlir::Operation *user : op->getUsers())
+      LDBG() << "\t" << *user;
+  }
+  return uses == expUses;
+}
+
+template <typename Op>
+static Op expectOp(mlir::Value val) {
+  if (Op op = mlir::dyn_cast<Op>(val.getDefiningOp())) {
+    LDBG() << op;
+    return op;
+  }
+  return {};
+}
+
+static mlir::Value findBoxDef(mlir::Value val) {
+  if (auto op = expectOp<fir::ConvertOp>(val)) {
+    assert(op->getOperands().size() != 0);
+    if (auto boxOp = expectOp<fir::EmboxOp>(op->getOperand(0)))
+      return boxOp.getResult();
+  }
+  return {};
+}
+
+namespace {
+
+// This class analyzes a trimmed character and removes the call to trim() (and
+// its dependencies) if its result is not used elsewhere.
+class TrimRemover {
+public:
+  TrimRemover(fir::FirOpBuilder &builder, mlir::Value charVal,
+              mlir::Value charLenVal)
+      : builder(builder), charVal(charVal), charLenVal(charLenVal) {}
+  TrimRemover(const TrimRemover &) = delete;
+
+  bool charWasTrimmed();
+  void removeTrim();
+
+private:
+  // Class inputs.
+  fir::FirOpBuilder &builder;
+  mlir::Value charVal;
+  mlir::Value charLenVal;
+  // Operations found while analyzing inputs, that are needed when removing
+  // the trim call.
+  hlfir::DeclareOp charDeclOp;      // Trim input character.
+  fir::CallOp trimCallOp;           // Trim call.
+  hlfir::EndAssociateOp endAssocOp; // Trim result association.
+  hlfir::DestroyOp destroyExprOp;   // Trim result expression.
+  fir::AllocaOp allocaOp;           // Trim result alloca.
+};
+
+bool TrimRemover::charWasTrimmed() {
+  LDBG() << "\ncharWasTrimmed: " << charVal;
+
+  // Get the declare and expression operations associated to `charVal`, that
+  // correspond to the result of trim.
+  auto charCvtOp = expectOp<fir::ConvertOp>(charVal);
+  auto charLenCvtOp = expectOp<fir::ConvertOp>(charLenVal);
+  if (!charCvtOp || !charLenCvtOp || !expectUses(charCvtOp, 1) ||
+      !expectUses(charLenCvtOp, 1))
+    return false;
+  auto assocOp = expectOp<hlfir::AssociateOp>(charCvtOp.getOperand());
+  if (!assocOp || !expectUses(assocOp, 3)) // end_associate uses assocOp twice
+    return false;
+  endAssocOp = getLastUser<hlfir::EndAssociateOp>(assocOp);
+  if (!endAssocOp)
+    return false;
+  auto asExprOp = expectOp<hlfir::AsExprOp>(assocOp.getOperand(0));
+  if (!asExprOp || !expectUses(asExprOp, 2))
+    return false;
+  destroyExprOp = getLastUser<hlfir::DestroyOp>(asExprOp);
+  if (!destroyExprOp)
+    return false;
+  auto declOp = expectOp<hlfir::DeclareOp>(asExprOp.getOperand(0));
+  if (!declOp || !expectUses(declOp, 1))
+    return false;
+
+  // Get associated box and alloca.
+  auto boxAddrOp = expectOp<fir::BoxAddrOp>(declOp.getMemref());
+  if (!boxAddrOp || !expectUses(boxAddrOp, 1))
+    return false;
+  auto loadOp = expectOp<fir::LoadOp>(boxAddrOp.getOperand());
+  if (!loadOp || !getFirstUser<fir::BoxEleSizeOp>(loadOp) ||
+      !expectUses(loadOp, 2))
+    return false;
+  // The allocaOp is initialized by a store.
+  // Besides its use by the store and loadOp, it's also converted and used by
+  // the trim call.
+  allocaOp = expectOp<fir::AllocaOp>(loadOp.getMemref());
+  if (!allocaOp || !getFirstUser<fir::StoreOp>(allocaOp) ||
+      !expectUses(allocaOp, 3))
+    return false;
+
+  // Find the trim call that uses the allocaOp.
+  if (auto userOp = getPreviousUser<fir::ConvertOp>(allocaOp, loadOp))
+    if (userOp->hasOneUse())
+      trimCallOp = mlir::dyn_cast<fir::CallOp>(*userOp->user_begin());
+  if (!trimCallOp)
+    return false;
+  LDBG() << "call: " << trimCallOp;
+  llvm::StringRef calleeName =
+      trimCallOp.getCalleeAttr().getLeafReference().getValue();
+  LDBG() << "callee: " << calleeName;
+  if (calleeName != RTNAME_STRING(Trim))
+    return false;
+
+  // Get trim input character.
+  auto chrEmboxOp =
+      expectOp<fir::EmboxOp>(findBoxDef(trimCallOp.getOperand(1)));
+  if (!chrEmboxOp)
+    return false;
+  charDeclOp = expectOp<hlfir::DeclareOp>(chrEmboxOp.getMemref());
+  if (!charDeclOp)
+    return false;
+
+  // Found everything as expected.
+  return true;
+}
+
+void TrimRemover::removeTrim() {
+  LDBG() << "\nremoveTrim:";
+
+  auto charCvtOp = expectOp<fir::ConvertOp>(charVal);
+  auto charLenCvtOp = expectOp<fir::ConvertOp>(charLenVal);
+  assert(charCvtOp && charLenCvtOp);
+
+  // Replace trim output char with its input.
+  mlir::Location loc = charVal.getLoc();
+  auto cvtOp = fir::ConvertOp::create(builder, loc, charCvtOp.getType(),
+                                      charDeclOp.getOriginalBase());
+  charCvtOp.replaceAllUsesWith(cvtOp.getResult());
+
+  // Replace trim output length with its input.
+  mlir::Value chrLen = charDeclOp.getTypeparams().back();
+  auto cvtLenOp =
+      fir::ConvertOp::create(builder, loc, charLenCvtOp.getType(), chrLen);
+  charLenCvtOp.replaceAllUsesWith(cvtLenOp.getResult());
+
+  // Remove trim call and old conversions.
+  removeOp(charCvtOp);
+  removeOp(charLenCvtOp);
+  removeOp(trimCallOp);
+  // Remove association and expression.
+  removeOp(endAssocOp);
+  removeOp(destroyExprOp);
+  // The only remaining use of allocaOp should be its initialization.
+  // Remove the store and alloca operations.
+  if (auto userOp = getLastUser<fir::StoreOp>(allocaOp))
+    removeOp(userOp);
+}
+
+} // namespace
+
+namespace {
+
+class ExpressionSimplification
+    : public hlfir::impl::ExpressionSimplificationBase<
+          ExpressionSimplification> {
+public:
+  using ExpressionSimplificationBase<
+      ExpressionSimplification>::ExpressionSimplificationBase;
+
+  void runOnOperation() override;
+
+private:
+  // Simplify character comparisons.
+  // Because character comparison appends spaces to the shorter character,
+  // calls to trim() that are used only in the comparison can be eliminated.
+  //
+  // Example:
+  // `trim(x) == trim(y)`
+  // can be simplified to
+  // `x == y`
+  void simplifyCharCompare(fir::CallOp call, const fir::KindMapping &kindMap);
+};
+
+void ExpressionSimplification::simplifyCharCompare(
+    fir::CallOp call, const fir::KindMapping &kindMap) {
+  fir::FirOpBuilder builder{call, kindMap};
+  mlir::Operation::operand_range args = call.getArgs();
+  TrimRemover lhsTrimRem(builder, args[0], args[2]);
+  TrimRemover rhsTrimRem(builder, args[1], args[3]);
+
+  if (lhsTrimRem.charWasTrimmed())
+    lhsTrimRem.removeTrim();
+  if (rhsTrimRem.charWasTrimmed())
+    rhsTrimRem.removeTrim();
+}
+
+void ExpressionSimplification::runOnOperation() {
+  mlir::ModuleOp module = getOperation();
+  fir::KindMapping kindMap = fir::getKindMapping(module);
+  module.walk([&](mlir::Operation *op) {
+    if (auto call = mlir::dyn_cast<fir::CallOp>(op)) {
+      if (mlir::SymbolRefAttr callee = call.getCalleeAttr()) {
+        mlir::StringRef funcName = callee.getLeafReference().getValue();
+        if (funcName.starts_with(RTNAME_STRING(CharacterCompareScalar))) {
+          simplifyCharCompare(call, kindMap);
+        }
+      }
+    }
+  });
+}
+
+} // namespace
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index ca8e820608688..3b8f3e3add6e9 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -244,6 +244,9 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm,
 ///   passes pipeline
 void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP,
                                   llvm::OptimizationLevel optLevel) {
+  if (optLevel.getSizeLevel() > 0 || optLevel.getSpeedupLevel() > 0) {
+    pm.addPass(hlfir::createExpressionSimplification());
+  }
   if (optLevel.isOptimizingForSpeed()) {
     addCanonicalizerPassWithoutRegionSimplification(pm);
     addNestedPassToAllTopLevelOperations<PassConstructor>(
diff --git a/flang/test/Driver/mlir-pass-pipeline.f90 b/flang/test/Driver/mlir-pass-pipeline.f90
index 0bcd055a84b87..3d19a0a5700f5 100644
--- a/flang/test/Driver/mlir-pass-pipeline.f90
+++ b/flang/test/Driver/mlir-pass-pipeline.f90
@@ -15,6 +15,7 @@
 ! ALL: Pass statistics report
 
 ! ALL: Fortran::lower::VerifierPass
+! O2-NEXT: ExpressionSimplification
 ! O2-NEXT: Canonicalizer
 ! ALL:     Pipeline Collection : ['fir.global', 'func.func', 'omp.declare_reduction', 'omp.private']
 ! ALL-NEXT:'fir.global' Pipeline
diff --git a/flang/test/HLFIR/expression-simplification.fir b/flang/test/HLFIR/expression-simplification.fir
new file mode 100644
index 0000000000000..28e17de381461
--- /dev/null
+++ b/flang/test/HLFIR/expression-simplification.fir
@@ -0,0 +1,173 @@
+// RUN: fir-opt %s --expression-simplification | FileCheck %s
+
+// Test removal of trim() calls.
+
+//  logical function cmp(x, y)
+//    character(*) :: x, y
+//    cmp = trim(x) == trim(y)
+//  end function
+
+func.func @test_char_cmp(%arg0: !fir.boxchar<1> {fir.bindc_name = "x"},
+                         %arg1: !fir.boxchar<1> {fir.bindc_name = "y"}) -> !fir.logical<4> {
+  %0 = fir.alloca !fir.box<!fir.heap<!fir.char<1,?>>>
+  %1 = fir.alloca !fir.box<!fir.heap<!fir.char<1,?>>>
+  %2 = fir.dummy_scope : !fir.dscope
+  %3 = fir.alloca !fir.logical<4> {bindc_name = "cmp", uniq_name = "_QFcmpEcmp"}
+  %4:2 = hlfir.declare %3 {uniq_name = "_QFcmpEcmp"} : (!fir.ref<!fir.logical<4>>) -> (!fir.ref<!fir.logical<4>>, !fir.ref<!fir.logical<4>>)
+  %5:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> (!fir.ref<!fir.char<1,?>>, index)
+  %6:2 = hlfir.declare %5#0 typeparams %5#1 dummy_scope %2 {uniq_name = "_QFcmpEx"} : (!fir.ref<!fir.char<1,?>>, index, !fir.dscope) -> (!fir.boxchar<1>, !fir.ref<!fir.char<1,?>>)
+  %7:2 = fir.unboxchar %arg1 : (!fir.boxchar<1>) -> (!fir.ref<!fir.char<1,?>>, index)
+  %8:2 = hlfir.declare %7#0 typeparams %7#1 dummy_scope %2 {uniq_name = "_QFcmpEy"} : (!fir.ref<!fir.char<1,?>>, index, !fir.dscope) -> (!fir.boxchar<1>, !fir.ref<!fir.char<1,?>>)
+  %9 = fir.embox %6#1 typeparams %5#1 : (!fir.ref<!fir.char<1,?>>, index) -> !fir.box<!fir.char<1,?>>
+  %10 = fir.zero_bits !fir.heap<!fir.char<1,?>>
+  %c0 = arith.constant 0 : index
+  %11 = fir.embox %10 typeparams %c0 : (!fir.heap<!fir.char<1,?>>, index) -> !fir.box<!fir.heap<!fir.char<1,?>>>
+  fir.store %11 to %1 : !fir.ref<!fir.box<!fir.heap<!fir.char<1,?>>>>
+  %12 = fir.address_of(@_QQclX710b72634dc58109795123e3cbc3b5ed) : !fir.ref<!fir.char<1,65>>
+  %c5_i32 = arith.constant 5 : i32
+  %13 = fir.convert %1 : (!fir.ref<!fir.box<!fir.heap<!fir.char<1,?>>>>) -> !fir.ref<!fir.box<none>>
+  %14 = fir.convert %9 : (!fir.box<!fir.char<1,?>>) -> !fir.box<none>
+  %15 = fir.convert %12 : (!fir.ref<!fir.char<1,65>>) -> !fir.ref<i8>
+  fir.call @_FortranATrim(%13, %14, %15, %c5_i32) fastmath<contract> : (!fir.ref<!fir.box<none>>, !fir.box<none>, !fir.ref<i8>, i32) -> ()
+  %16 = fir.load %1 : !fir.ref<!fir.box<!fir.heap<!fir.char<1,?>>>>
+  %17 = fir.box_elesize %16 : (!fir.box<!fir.heap<!fir.char<1,?>>>) -> index
+  %18 = fir.box_addr %16 : (!fir.box<!fir.heap<!fir.char<1,?>>>) -> !fir.heap<!fir.char<1,?>>
+  %19:2 = hlfir.declare %18 typeparams %17 {uniq_name = ".tmp.intrinsic_result"} : (!fir.heap<!fir.char<1,?>>, index) -> (!fir.boxchar<1>, !fir.heap<!fir.char<1,?>>)
+  %true = arith.constant true
+  %20 = hlfir.as_expr %19#0 move %true : (!fir.boxchar<1>, i1) -> !hlfir.expr<!fir.char<1,?>>
+  %21 = fir.embox %8#1 typeparams %7#1 : (!fir.ref<!fir.char<1,?>>, index) -> !fir.box<!fir.char<1,?>>
+  %22 = fir.zero_bits !fir.heap<!fir.char<1,?>>
+  %c0_0 = arith.constant 0 : index
+  %23 = fir.embox %22 typeparams %c0_0 : (!fir.heap<!fir.char<1,?>>, index) -> !fir.box<!fir.heap<!fir.char<1,?>>>
+  fir.store %23 to %0 : !fir.ref<!fir.box<!fir.heap<!fir.char<1,?>>>>
+  %24 = fir.address_of(@_QQclX710b72634dc58109795123e3cbc3b5ed) : !fir.ref<!fir.char<1,65>>
+  %c5_i32_1 = arith.constant 5 : i32
+  %25 = fir.convert %0 : (!fir.ref<!fir.box<!fir.heap<!fir.char<1,?>>>>) -> !fir.ref<!fir.box<none>>
+  %26 = fir.convert %21 : (!fir.box<!fir.char<1,?>>) -> !fir.box<none>
+  %27 = fir.convert %24 : (!fir.ref<!fir.char<1,65>>) -> !fir.ref<i8>
+  fir.call @_FortranATrim(%25, %26, %27, %c5_i32_1) fastmath<contract> : (!fir.ref<!fir.box<none>>, !fir.box<none>, !fir.ref<i8>, i32) -> ()
+  %28 = fir.load %0 : !fir.ref<!fir.box<!fir.heap<!fir.char<1,?>>>>
+  %29 = fir.box_elesize %28 : (!fir.box<!fir.heap<!fir.char<1,?>>>) -> index
+  %30 = fir.box_addr %28 : (!fir.box<!fir.heap<!fir.char<1,?>>>) -> !fir.heap<!fir.char<1,?>>
+  %31:2 = hlfir.declare %30 typeparams %29 {uniq_name = ".tmp.intrinsic_result"} : (!fir.heap<!fir.char<1,?>>, index) -> (!fir.boxchar<1>, !fir.heap<!fir.char<1,?>>)
+  %true_2 = arith.constant true
+  %32 = hlfir.as_expr %31#0 move %true_2 : (!fir.boxchar<1>, i1) -> !hlfir.expr<!fir.char<1,?>>
+  %33:3 = hlfir.associate %20 typeparams %17 {adapt.valuebyref} : (!hlfir.expr<!fir.char<1,?>>, index) -> (!fir.boxchar<1>, !fir.ref<!fir.char<1,?>>, i1)
+  %34:3 = hlfir.associate %32 typeparams %29 {adapt.valuebyref} : (!hlfir.expr<!fir.char<1,?>>, index) -> (!fir.boxchar<1>, !fir.ref<!fir.char<1,?>>, i1)
+  %35 = fir.convert %33#1 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
+  %36 = fir.convert %34#1 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
+  %37 = fir.convert %17 : (index) -> i64
+  %38 = fir.convert %29 : (index) -> i64
+  %39 = fir.call @_FortranACharacterCompareScalar1(%35, %36, %37, %38) fastmath<contract> : (!fir.ref<i8>, !fir.ref<i8>, i64, i64) -> i32
+  %c0_i32 = arith.constant 0 : i32
+  %40 = arith.cmpi eq, %39, %c0_i32 : i32
+  hlfir.end_associate %33#1, %33#2 : !fir.ref<!fir.char<1,?>>, i1
+  hlfir.end_associate %34#1, %34#2 : !fir.ref<!fir.char<1,?>>, i1
+  %41 = fir.convert %40 : (i1) -> !fir.logical<4>
+  hlfir.assign %41 to %4#0 : !fir.logical<4>, !fir.ref<!fir.logical<4>>
+  hlfir.destroy %32 : !hlfir.expr<!fir.char<1,?>>
+  hlfir.destroy %20 : !hlfir.expr<!fir.char<1,?>>
+  %42 = fir.load %4#0 : !fir.ref<!fir.logical<4>>
+  return %42 : !fir.logical<4>
+}
+
+// CHECK-LABEL: func.func @test_char_cmp(
+// CHECK-SAME:        %[[VAL_4:.*]]: !fir.boxchar<1> {fir.bindc_name = "x"},
+// CHECK-SAME:        %[[VAL_7:.*]]: !fir.boxchar<1> {fir.bindc_name = "y"}) -> !fir.logical<4> {
+// CHECK:         %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope
+// CHECK:         %[[VAL_1:.*]] = fir.alloca !fir.logical<4> {bindc_name = "cmp", uniq_name = "_QFcmpEcmp"}
+// CHECK:         %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_1]] {uniq_name = "_QFcmpEcmp"} : (!fir.ref<!fir.logical<4>>) -> (!fir.ref<!fir.logical<4>>, !fir.ref<!fir.logical<4>>)
+// CHECK:         %[[VAL_3:.*]]:2 = fir.unboxchar %[[VAL_4]] : (!fir.boxchar<1>) -> (!fir.ref<!fir.char<1,?>>, index)
+// CHECK:         %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_3]]#0 typeparams %[[VAL_3]]#1 dummy_scope %[[VAL_0]] {uniq_name = "_QFcmpEx"} : (!fir.ref<!fir.char<1,?>>, index, !fir.dscope) -> (!fir.boxchar<1>, !fir.ref<!fir.char<1,?>>)
+// CHECK:         %[[VAL_6:.*]]:2 = fir.unboxchar %[[VAL_7]] : (!fir.boxchar<1>) -> (!fir.ref<!fir.char<1,?>>, index)
+// CHECK:         %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_6]]#0 typeparams %[[VAL_6]]#1 dummy_scope %[[VAL_0]] {uniq_name = "_QFcmpEy"} : (!fir.ref<!fir.char<1,?>>, index, !fir.dscope) -> (!fir.boxchar<1>, !fir.ref<!fir.char<1,?>>)
+// CHECK:         %[[VAL_9:.*]] = fir.convert %[[VAL_5]]#1 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
+// CHECK:         %[[VAL_10:.*]] = fir.convert %[[VAL_3]]#1 : (index) -> i64
+// CHECK:         %[[VAL_11:.*]] = fir.convert %[[VAL_8]]#1 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
+// CHECK:         %[[VAL_12:.*]] = fir.convert %[[VAL_6]]#1 : (index) -> i64
+// CHECK:         %[[VAL_13:.*]] = fir.call @_FortranACharacterCompareScalar1(%[[VAL_9]], %[[VAL_11]], %[[VAL_10]], %[[VAL_12]]) fastmath<contract> : (!fir.ref<i8>, !fir.ref<i8>, i64, i64) -> i32
+// CHECK:         %[[VAL_14:.*]] = arith.constant 0 : i32
+// CHECK:         %[[VAL_15:.*]] = arith.cmpi eq, %[[VAL_13]], %[[VAL_14]] : i32
+// CHECK:         %[[VAL_16:.*]] = fir.convert %[[VAL_15]] : (i1) -> !fir.logical<4>
+// CHECK:         hlfir.assign %[[VAL_16]] to %[[VAL_2]]#0 : !fir.logical<4>, !fir.ref<!fir.logical<4>>
+// CHECK:         %[[VAL_17:.*]] = fir.load %[[VAL_2]]#0 : !fir.ref<!fir.logical<4>>
+// CHECK:         return %[[VAL_17]] : !fir.logical<4>
+// CHECK:       }
+
+// Check that trim() is not removed when its result is stored.
+
+//  logical function eq_use3(x, y) result(res)
+//    char...
[truncated]

Copy link
Contributor

@vzakhari vzakhari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes!

I think it is better to introduce an hlfir.char_trim operation and then recognize the same pattern using the new operation rather than recognizing the runtime calls.

FYI, @valerydmit is wroking on adding an HLFIR operation for scalar character comparison, so we won't have _FortranACharacterCompareScalar1 soon.

@luporl
Copy link
Contributor Author

luporl commented Aug 20, 2025

Thanks for the quick review and suggestions!

I'll try to add an hlfir.char_trim operation and use it instead.
If it gets ready quickly, then I guess _FortranACharacterCompareScalar1 can remain, otherwise I'll switch to the new character comparison operation. Sounds good?

@vzakhari
Copy link
Contributor

Sounds good. Thank you!

luporl added a commit to luporl/llvm-project that referenced this pull request Aug 29, 2025
Fortran character trim is currently lowered directly into a runtime
call, which makes it more complex to simplify expressions using it.

With this patch trim is first lowered into an hlfir.char_trim
operation, that is only later transformed into a runtime call.
This makes it easier to remove unnecessary calls to trim, as
proposed in llvm#154593.
luporl added a commit that referenced this pull request Sep 4, 2025
Fortran character trim is currently lowered directly into a runtime
call, which makes it more complex to simplify expressions using it.

With this patch trim is first lowered into an hlfir.char_trim
operation, that is only later transformed into a runtime call.
This makes it easier to remove unnecessary calls to trim, as
proposed in #154593.
Copy link
Contributor

@vzakhari vzakhari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update! It looks much clearer now!

I have a couple of suggestions.


void ExpressionSimplification::runOnOperation() {
mlir::ModuleOp module = getOperation();
module.walk([&](hlfir::CmpCharOp cmpChar) { simplifyCmpChar(cmpChar); });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend walking all trim operations instead of the comparisons, e.g. consider the case where the same trim result is used in multiple character comparisons. We may not have such cases now, but the CSE may be able to optimize it in codes like (trim(x) == trim(y)).and.(trim(x) == trim(z)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks for the suggestion, it simplified the code even further.

At least for now the pass is able to erase all trims in (trim(x) == trim(y)).and.(trim(x) == trim(z)).
But for each trim an hlfir.char_trim is generated, no result is currently reused.

If this becomes the case in the future, we would need to check all users to see if they are composed only of hlfir.cmpchars and one hlfir.destroy. To keep the code simple, I have not made this change.
Let me know if you think it would be better to implement it already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed the notification. I will review now.

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very clean now. Good stuff.

Copy link
Contributor

@vzakhari vzakhari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with one comment for the LIT test.

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks great.

@luporl luporl merged commit 8d8bd0a into llvm:main Sep 24, 2025
9 checks passed
@luporl luporl deleted the luporl-rem-trim branch September 24, 2025 14:52
@luporl
Copy link
Contributor Author

luporl commented Sep 24, 2025

Thank you all for the reviews.

mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
Because character comparison appends spaces to the shorter character,
calls to trim() that are used only in the comparison can be eliminated.

Example:
`trim(x) == trim(y)`
can be simplified to
`x == y`

This makes 527.cam4_r about 3% faster, measured on Neoverse V2.

This patch implements the optimization above in a new pass:
ExpressionSimplification. Although no other expression simplifications
are planned at the moment, they could be easily added to the new pass.

The ExpressionSimplification pass runs early in the HLFIR pipeline, to
make it easy to identify expressions before any transformations occur.
luporl added a commit to luporl/llvm-project that referenced this pull request Oct 15, 2025
Fortran character trim is currently lowered directly into a runtime
call, which makes it more complex to simplify expressions using it.

With this patch trim is first lowered into an hlfir.char_trim
operation, that is only later transformed into a runtime call.
This makes it easier to remove unnecessary calls to trim, as
proposed in llvm#154593.
luporl added a commit to luporl/llvm-project that referenced this pull request Oct 15, 2025
Because character comparison appends spaces to the shorter character,
calls to trim() that are used only in the comparison can be eliminated.

Example:
`trim(x) == trim(y)`
can be simplified to
`x == y`

This makes 527.cam4_r about 3% faster, measured on Neoverse V2.

This patch implements the optimization above in a new pass:
ExpressionSimplification. Although no other expression simplifications
are planned at the moment, they could be easily added to the new pass.

The ExpressionSimplification pass runs early in the HLFIR pipeline, to
make it easy to identify expressions before any transformations occur.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

flang:driver flang:fir-hlfir flang Flang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants