@HanchengWu HanchengWu commented Sep 23, 2025

The pass generate-runtime-verification generates additional runtime op verification checks.

Currently, the pass is extremely expensive. For example, with a MobileNet v2 SSD network (converted to MLIR), running this pass alone in debug mode takes 30 minutes. The same has been observed for other networks as small as 5 MB.

The culprit is the line "op->print(stream, flags);" in the function "RuntimeVerifiableOpInterface::generateErrorMessage" in the file mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp.

Because we print each op with the names of its operands in the middle end, every op->print(...) call constructs a new SSANameState. As a result, we redo the SSA naming analysis for every error message generated.

Perf profiling shows that 98% of the time is spent in the constructor of SSANameState.

This change refactors the message generator. We create one top-level AsmState and reuse it for every op->print(stream, asmState) call. With a release build, this change reduces the pass execution time from ~160 seconds to 0.3 seconds on my machine.
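
To illustrate why sharing the printer state helps, here is a minimal standalone sketch. It does not use MLIR: ToyOp, ToyNameState, makeModule, printAllRebuilding and printAllShared are hypothetical names invented for this example, and ToyNameState only models the assumption that building the naming state walks every op in the module.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for an IR operation; only its identity matters here.
struct ToyOp {
  int id;
};

// Toy stand-in for the printer's naming state. Building it numbers every op
// in the module, so each construction costs O(number of ops).
class ToyNameState {
public:
  ToyNameState(const std::vector<ToyOp> &module, std::size_t &work) {
    for (const ToyOp &op : module) {
      std::string name = "%" + std::to_string(names.size());
      names.emplace(op.id, name);
      ++work; // count one unit of numbering work per op visited
    }
  }

  const std::string &nameOf(const ToyOp &op) const {
    auto it = names.find(op.id);
    assert(it != names.end() && "op is not part of the numbered module");
    return it->second;
  }

private:
  std::unordered_map<int, std::string> names;
};

// Hypothetical helper: a module of n ops with ids 0..n-1.
std::vector<ToyOp> makeModule(int n) {
  std::vector<ToyOp> module;
  for (int i = 0; i < n; ++i)
    module.push_back(ToyOp{i});
  return module;
}

// Slow pattern: a fresh naming state per printed op, O(N^2) work overall.
std::size_t printAllRebuilding(const std::vector<ToyOp> &module,
                               std::string &out) {
  std::size_t work = 0;
  for (const ToyOp &op : module) {
    ToyNameState state(module, work);
    out += state.nameOf(op) + "\n";
  }
  return work;
}

// Fast pattern: one naming state shared by every print, O(N) work overall.
std::size_t printAllShared(const std::vector<ToyOp> &module,
                           std::string &out) {
  std::size_t work = 0;
  ToyNameState state(module, work);
  for (const ToyOp &op : module)
    out += state.nameOf(op) + "\n";
  return work;
}
```

For a module of 100 toy ops, printAllRebuilding performs 10,000 units of numbering work and printAllShared performs 100, while both produce the same text. The real speedup depends on what SSANameState actually computes, which this toy does not try to reproduce.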

This change also adds a verbose option to the generate-runtime-verification pass.
verbose 0: print only source location with error message.
verbose 1: print the full op, including the name of the operands.

@llvmbot

llvmbot commented Sep 23, 2025

@llvm/pr-subscribers-mlir-tensor
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-linalg

Author: Hanchenng Wu (HanchengWu)

Changes

The pass generate-runtime-verification generates additional runtime op verification checks.

Currently, the pass is extremely expensive. For example, with a MobileNet v2 SSD network (converted to MLIR), running this pass alone takes 30 minutes. The same has been observed for other networks as small as 5 MB.

The culprit is the line "op->print(stream, flags);" in the function "RuntimeVerifiableOpInterface::generateErrorMessage" in the file mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp.

Because we print each op with the names of its operands in the middle end, every op->print(...) call constructs a new SSANameState. As a result, we redo the SSA naming analysis for every error message generated.

Perf profiling shows that 98% of the time is spent in the constructor of SSANameState.

This change adds a verbose option to the generate-runtime-verification pass.
verbose 0: print only source location with error message.
verbose 1: print source location and operation name and operand types with error message.
verbose 2: print the full op, including the name of the operands.

verbose 2 is the current behavior and is very expensive. I still keep the default as verbose 2.
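
The three levels can be pictured with a small standalone sketch. This is not the code in the diff: ToyOpInfo and formatError are hypothetical names, the op is represented by pre-rendered strings instead of an mlir::Operation, and the exact message layout is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy description of an op, with everything already rendered to text.
struct ToyOpInfo {
  std::string name;                      // e.g. "toy.op"
  std::vector<std::string> operandTypes; // e.g. "tensor<5xf32>"
  std::string fullText;                  // full printed op (the costly part)
  std::string location;                  // source location
};

// Build an error message whose detail depends on the verbosity level:
//   0 = message and location only,
//   1 = also the op name and operand types,
//   2 = also the full printed op.
std::string formatError(const ToyOpInfo &op, const std::string &msg,
                        unsigned level) {
  assert(level <= 2 && "verbose level must be 0, 1 or 2");
  std::string out = "ERROR: Runtime op verification failed\n";
  if (level == 2) {
    out += op.fullText + "\n";
  } else if (level == 1) {
    out += "Op: " + op.name + "\nOperand Types:";
    for (const std::string &type : op.operandTypes)
      out += " " + type;
    out += "\n";
  }
  out += "^ " + msg + "\nLocation: " + op.location;
  return out;
}
```

In this sketch only level 2 touches fullText, which mirrors the point above: levels 0 and 1 avoid the full op print and therefore avoid the expensive naming analysis.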

When we switch from verbose 2 to verbose 0/1, we see the following improvements.

For MLIR imported from MobileNet v2 SSD, the running time of the pass drops from 32 minutes to 21 seconds.
For another small network (only 5 MB in size), the running time of the pass drops from 15 minutes to 4 seconds.


Full diff: https://github.com/llvm/llvm-project/pull/160331.diff

8 Files Affected:

  • (modified) mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td (+4-2)
  • (modified) mlir/include/mlir/Transforms/Passes.h (+1)
  • (modified) mlir/include/mlir/Transforms/Passes.td (+8)
  • (modified) mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp (+5-3)
  • (modified) mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp (+25-21)
  • (modified) mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp (+20-4)
  • (modified) mlir/lib/Transforms/GenerateRuntimeVerification.cpp (+10-1)
  • (modified) mlir/test/Dialect/Linalg/runtime-verification.mlir (+14)
diff --git a/mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td b/mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td
index 6fd0df59d9d2e..e5c9336c8d8dc 100644
--- a/mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td
+++ b/mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td
@@ -32,14 +32,16 @@ def RuntimeVerifiableOpInterface : OpInterface<"RuntimeVerifiableOpInterface"> {
       /*retTy=*/"void",
       /*methodName=*/"generateRuntimeVerification",
       /*args=*/(ins "::mlir::OpBuilder &":$builder,
-                    "::mlir::Location":$loc)
+                    "::mlir::Location":$loc,
+                    "unsigned":$verboseLevel)
     >,
   ];
 
   let extraClassDeclaration = [{
     /// Generate the error message that will be printed to the user when 
     /// verification fails.
-    static std::string generateErrorMessage(Operation *op, const std::string &msg);
+    static std::string generateErrorMessage(Operation *op, const std::string &msg,
+                                            unsigned verboseLevel = 0);
   }];
 }
 
diff --git a/mlir/include/mlir/Transforms/Passes.h b/mlir/include/mlir/Transforms/Passes.h
index 41f208216374f..58ba0892df113 100644
--- a/mlir/include/mlir/Transforms/Passes.h
+++ b/mlir/include/mlir/Transforms/Passes.h
@@ -46,6 +46,7 @@ class GreedyRewriteConfig;
 #define GEN_PASS_DECL_SYMBOLPRIVATIZE
 #define GEN_PASS_DECL_TOPOLOGICALSORT
 #define GEN_PASS_DECL_COMPOSITEFIXEDPOINTPASS
+#define GEN_PASS_DECL_GENERATERUNTIMEVERIFICATION
 #include "mlir/Transforms/Passes.h.inc"
 
 /// Creates an instance of the Canonicalizer pass, configured with default
diff --git a/mlir/include/mlir/Transforms/Passes.td b/mlir/include/mlir/Transforms/Passes.td
index a39ab77fc8fb3..3d643d8a168db 100644
--- a/mlir/include/mlir/Transforms/Passes.td
+++ b/mlir/include/mlir/Transforms/Passes.td
@@ -271,8 +271,16 @@ def GenerateRuntimeVerification : Pass<"generate-runtime-verification"> {
     passes that are suspected to introduce faulty IR.
   }];
   let constructor = "mlir::createGenerateRuntimeVerificationPass()";
+  let options = [
+    Option<"verboseLevel", "verbose-level", "unsigned", /*default=*/"2",
+           "Verbosity level for runtime verification messages: "
+           "0 = Minimum (only source location), "
+           "1 = Basic (include operation type and operand type), "
+           "2 = Detailed (include full operation details, names, types, shapes, etc.)">
+  ];
 }
 
+
 def Inliner : Pass<"inline"> {
   let summary = "Inline function calls";
   let constructor = "mlir::createInlinerPass()";
diff --git a/mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp b/mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp
index b30182dc84079..608a6801af267 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp
@@ -32,7 +32,7 @@ struct StructuredOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           StructuredOpInterface<T>, T> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto linalgOp = llvm::cast<LinalgOp>(op);
 
     SmallVector<Range> loopRanges = linalgOp.createLoopRanges(builder, loc);
@@ -73,7 +73,8 @@ struct StructuredOpInterface
         auto msg = RuntimeVerifiableOpInterface::generateErrorMessage(
             linalgOp, "unexpected negative result on dimension #" +
                           std::to_string(dim) + " of input/output operand #" +
-                          std::to_string(opOperand.getOperandNumber()));
+                          std::to_string(opOperand.getOperandNumber()),
+            verboseLevel);
         builder.createOrFold<cf::AssertOp>(loc, cmpOp, msg);
 
         // Generate:
@@ -104,7 +105,8 @@ struct StructuredOpInterface
             linalgOp, "dimension #" + std::to_string(dim) +
                           " of input/output operand #" +
                           std::to_string(opOperand.getOperandNumber()) +
-                          " is incompatible with inferred dimension size");
+                          " is incompatible with inferred dimension size",
+            verboseLevel);
         builder.createOrFold<cf::AssertOp>(loc, cmpOp, msg);
       }
     }
diff --git a/mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp b/mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp
index cd92026562da9..d8a7a89a3fbe7 100644
--- a/mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp
+++ b/mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp
@@ -39,7 +39,7 @@ struct AssumeAlignmentOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           AssumeAlignmentOpInterface, AssumeAlignmentOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto assumeOp = cast<AssumeAlignmentOp>(op);
     Value ptr = builder.create<ExtractAlignedPointerAsIndexOp>(
         loc, assumeOp.getMemref());
@@ -53,7 +53,8 @@ struct AssumeAlignmentOpInterface
         loc, isAligned,
         RuntimeVerifiableOpInterface::generateErrorMessage(
             op, "memref is not aligned to " +
-                    std::to_string(assumeOp.getAlignment())));
+                    std::to_string(assumeOp.getAlignment()),
+            verboseLevel));
   }
 };
 
@@ -61,7 +62,7 @@ struct CastOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<CastOpInterface,
                                                          CastOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto castOp = cast<CastOp>(op);
     auto srcType = cast<BaseMemRefType>(castOp.getSource().getType());
 
@@ -79,8 +80,8 @@ struct CastOpInterface
           loc, arith::CmpIPredicate::eq, srcRank, resultRank);
       builder.create<cf::AssertOp>(
           loc, isSameRank,
-          RuntimeVerifiableOpInterface::generateErrorMessage(op,
-                                                             "rank mismatch"));
+          RuntimeVerifiableOpInterface::generateErrorMessage(
+              op, "rank mismatch", verboseLevel));
     }
 
     // Get source offset and strides. We do not have an op to get offsets and
@@ -119,7 +120,8 @@ struct CastOpInterface
       builder.create<cf::AssertOp>(
           loc, isSameSz,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "size mismatch of dim " + std::to_string(it.index())));
+              op, "size mismatch of dim " + std::to_string(it.index()),
+              verboseLevel));
     }
 
     // Get result offset and strides.
@@ -139,7 +141,7 @@ struct CastOpInterface
       builder.create<cf::AssertOp>(
           loc, isSameOffset,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "offset mismatch"));
+              op, "offset mismatch", verboseLevel));
     }
 
     // Check strides.
@@ -157,7 +159,8 @@ struct CastOpInterface
       builder.create<cf::AssertOp>(
           loc, isSameStride,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "stride mismatch of dim " + std::to_string(it.index())));
+              op, "stride mismatch of dim " + std::to_string(it.index()),
+              verboseLevel));
     }
   }
 };
@@ -166,7 +169,7 @@ struct CopyOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<CopyOpInterface,
                                                          CopyOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto copyOp = cast<CopyOp>(op);
     BaseMemRefType sourceType = copyOp.getSource().getType();
     BaseMemRefType targetType = copyOp.getTarget().getType();
@@ -201,7 +204,7 @@ struct CopyOpInterface
           loc, sameDimSize,
           RuntimeVerifiableOpInterface::generateErrorMessage(
               op, "size of " + std::to_string(i) +
-                      "-th source/target dim does not match"));
+                      "-th source/target dim does not match", verboseLevel));
     }
   }
 };
@@ -210,14 +213,14 @@ struct DimOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<DimOpInterface,
                                                          DimOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto dimOp = cast<DimOp>(op);
     Value rank = builder.create<RankOp>(loc, dimOp.getSource());
     Value zero = builder.create<arith::ConstantIndexOp>(loc, 0);
     builder.create<cf::AssertOp>(
         loc, generateInBoundsCheck(builder, loc, dimOp.getIndex(), zero, rank),
         RuntimeVerifiableOpInterface::generateErrorMessage(
-            op, "index is out of bounds"));
+            op, "index is out of bounds", verboseLevel));
   }
 };
 
@@ -228,7 +231,7 @@ struct LoadStoreOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           LoadStoreOpInterface<LoadStoreOp>, LoadStoreOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto loadStoreOp = cast<LoadStoreOp>(op);
 
     auto memref = loadStoreOp.getMemref();
@@ -251,7 +254,7 @@ struct LoadStoreOpInterface
     builder.create<cf::AssertOp>(
         loc, assertCond,
         RuntimeVerifiableOpInterface::generateErrorMessage(
-            op, "out-of-bounds access"));
+            op, "out-of-bounds access", verboseLevel));
   }
 };
 
@@ -295,7 +298,7 @@ struct ReinterpretCastOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           ReinterpretCastOpInterface, ReinterpretCastOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto reinterpretCast = cast<ReinterpretCastOp>(op);
     auto baseMemref = reinterpretCast.getSource();
     auto resultMemref =
@@ -323,7 +326,8 @@ struct ReinterpretCastOpInterface
         loc, assertCond,
         RuntimeVerifiableOpInterface::generateErrorMessage(
             op,
-            "result of reinterpret_cast is out-of-bounds of the base memref"));
+            "result of reinterpret_cast is out-of-bounds of the base memref",
+            verboseLevel));
   }
 };
 
@@ -331,7 +335,7 @@ struct SubViewOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<SubViewOpInterface,
                                                          SubViewOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto subView = cast<SubViewOp>(op);
     MemRefType sourceType = subView.getSource().getType();
 
@@ -357,7 +361,7 @@ struct SubViewOpInterface
       builder.create<cf::AssertOp>(
           loc, offsetInBounds,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "offset " + std::to_string(i) + " is out-of-bounds"));
+              op, "offset " + std::to_string(i) + " is out-of-bounds", verboseLevel));
 
       // Verify that slice does not run out-of-bounds.
       Value sizeMinusOne = builder.create<arith::SubIOp>(loc, size, one);
@@ -371,7 +375,7 @@ struct SubViewOpInterface
           loc, lastPosInBounds,
           RuntimeVerifiableOpInterface::generateErrorMessage(
               op, "subview runs out-of-bounds along dimension " +
-                      std::to_string(i)));
+                      std::to_string(i), verboseLevel));
     }
   }
 };
@@ -380,7 +384,7 @@ struct ExpandShapeOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<ExpandShapeOpInterface,
                                                          ExpandShapeOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto expandShapeOp = cast<ExpandShapeOp>(op);
 
     // Verify that the expanded dim sizes are a product of the collapsed dim
@@ -414,7 +418,7 @@ struct ExpandShapeOpInterface
           loc, isModZero,
           RuntimeVerifiableOpInterface::generateErrorMessage(
               op, "static result dims in reassoc group do not "
-                  "divide src dim evenly"));
+                  "divide src dim evenly", verboseLevel));
     }
   }
 };
diff --git a/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp b/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
index 8aa194befb420..8b54ed1dc3780 100644
--- a/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
+++ b/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
@@ -15,7 +15,7 @@ class OpBuilder;
 /// Generate an error message string for the given op and the specified error.
 std::string
 RuntimeVerifiableOpInterface::generateErrorMessage(Operation *op,
-                                                   const std::string &msg) {
+                                                   const std::string &msg, unsigned verboseLevel) {
   std::string buffer;
   llvm::raw_string_ostream stream(buffer);
   OpPrintingFlags flags;
@@ -26,9 +26,25 @@ RuntimeVerifiableOpInterface::generateErrorMessage(Operation *op,
   flags.skipRegions();
   flags.useLocalScope();
   stream << "ERROR: Runtime op verification failed\n";
-  op->print(stream, flags);
-  stream << "\n^ " << msg;
-  stream << "\nLocation: ";
+  if (verboseLevel == 2){
+    // print full op including operand names, very expensive
+    op->print(stream, flags);
+  stream << "\n " << msg;
+  }else if (verboseLevel == 1){
+    // print op name and operand types
+    stream << "Op: " << op->getName().getStringRef() << "\n";
+    stream << "Operand Types:";
+    for (const auto &operand : op->getOpOperands()) {
+      stream << " " << operand.get().getType();
+    }
+    stream << "\n" << msg;
+    stream << "Result Types:";
+    for (const auto &result : op->getResults()) {
+      stream << " " << result.getType();
+    }
+    stream << "\n" << msg;
+  }
+  stream << "^\nLocation: ";
   op->getLoc().print(stream);
   return buffer;
 }
diff --git a/mlir/lib/Transforms/GenerateRuntimeVerification.cpp b/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
index a40bc2b3272fc..7a54ce667c6ad 100644
--- a/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
+++ b/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
@@ -28,6 +28,14 @@ struct GenerateRuntimeVerificationPass
 } // namespace
 
 void GenerateRuntimeVerificationPass::runOnOperation() {
+  // Check verboseLevel is in range [0, 2].
+  if (verboseLevel > 2) {
+    getOperation()->emitError(
+      "generate-runtime-verification pass: set verboseLevel to 0, 1 or 2");
+    signalPassFailure();
+    return;
+  }
+
   // The implementation of the RuntimeVerifiableOpInterface may create ops that
   // can be verified. We don't want to generate verification for IR that
   // performs verification, so gather all runtime-verifiable ops first.
@@ -39,7 +47,8 @@ void GenerateRuntimeVerificationPass::runOnOperation() {
   OpBuilder builder(getOperation()->getContext());
   for (RuntimeVerifiableOpInterface verifiableOp : ops) {
     builder.setInsertionPoint(verifiableOp);
-    verifiableOp.generateRuntimeVerification(builder, verifiableOp.getLoc());
+    verifiableOp.generateRuntimeVerification(builder, verifiableOp.getLoc(),
+                                           verboseLevel);
   };
 }
 
diff --git a/mlir/test/Dialect/Linalg/runtime-verification.mlir b/mlir/test/Dialect/Linalg/runtime-verification.mlir
index a4f29d8457e58..238169adf496e 100644
--- a/mlir/test/Dialect/Linalg/runtime-verification.mlir
+++ b/mlir/test/Dialect/Linalg/runtime-verification.mlir
@@ -1,13 +1,25 @@
 // RUN: mlir-opt %s -generate-runtime-verification | FileCheck %s
+// RUN: mlir-opt %s --generate-runtime-verification="verbose-level=1" | FileCheck %s --check-prefix=VERBOSE1
+// RUN: mlir-opt %s --generate-runtime-verification="verbose-level=0" | FileCheck %s --check-prefix=VERBOSE0
 
 // Most of the tests for linalg runtime-verification are implemented as integration tests.
 
 #identity = affine_map<(d0) -> (d0)>
 
 // CHECK-LABEL: @static_dims
+// VERBOSE1-LABEL: @static_dims
+// VERBOSE0-LABEL: @static_dims
 func.func @static_dims(%arg0: tensor<5xf32>, %arg1: tensor<5xf32>) -> (tensor<5xf32>) {
     // CHECK: %[[TRUE:.*]] = index.bool.constant true
     // CHECK: cf.assert %[[TRUE]]
+    // VERBOSE1: %[[TRUE:.*]] = index.bool.constant true
+    // VERBOSE1: cf.assert %[[TRUE]]
+    // VERBOSE1: Operand Types: tensor<5xf32> tensor<5xf32> tensor<5xf32>
+    // VERBOSE1: Result Types
+    // VERBOSE1: Location: loc
+    // VERBOSE0-NOT: Operand Types: tensor<5xf32> tensor<5xf32> tensor<5xf32>
+    // VERBOSE0-NOT: Result Types
+    // VERBOSE0: Location: loc
     %result = tensor.empty() : tensor<5xf32> 
     %0 = linalg.generic {
       indexing_maps = [#identity, #identity, #identity],
@@ -26,9 +38,11 @@ func.func @static_dims(%arg0: tensor<5xf32>, %arg1: tensor<5xf32>) -> (tensor<5x
 #map = affine_map<() -> ()>
 
 // CHECK-LABEL: @scalars
+// VERBOSE1-LABEL: @scalars
 func.func @scalars(%arg0: tensor<f32>, %arg1: tensor<f32>) -> (tensor<f32>) {
     // No runtime checks are required if the operands are all scalars
     // CHECK-NOT: cf.assert
+    // VERBOSE1-NOT: cf.assert
     %result = tensor.empty() : tensor<f32> 
     %0 = linalg.generic {
       indexing_maps = [#map, #map, #map],

@llvmbot

llvmbot commented Sep 23, 2025

@llvm/pr-subscribers-mlir-core

                                                          DimOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto dimOp = cast<DimOp>(op);
     Value rank = builder.create<RankOp>(loc, dimOp.getSource());
     Value zero = builder.create<arith::ConstantIndexOp>(loc, 0);
     builder.create<cf::AssertOp>(
         loc, generateInBoundsCheck(builder, loc, dimOp.getIndex(), zero, rank),
         RuntimeVerifiableOpInterface::generateErrorMessage(
-            op, "index is out of bounds"));
+            op, "index is out of bounds", verboseLevel));
   }
 };
 
@@ -228,7 +231,7 @@ struct LoadStoreOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           LoadStoreOpInterface<LoadStoreOp>, LoadStoreOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto loadStoreOp = cast<LoadStoreOp>(op);
 
     auto memref = loadStoreOp.getMemref();
@@ -251,7 +254,7 @@ struct LoadStoreOpInterface
     builder.create<cf::AssertOp>(
         loc, assertCond,
         RuntimeVerifiableOpInterface::generateErrorMessage(
-            op, "out-of-bounds access"));
+            op, "out-of-bounds access", verboseLevel));
   }
 };
 
@@ -295,7 +298,7 @@ struct ReinterpretCastOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           ReinterpretCastOpInterface, ReinterpretCastOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto reinterpretCast = cast<ReinterpretCastOp>(op);
     auto baseMemref = reinterpretCast.getSource();
     auto resultMemref =
@@ -323,7 +326,8 @@ struct ReinterpretCastOpInterface
         loc, assertCond,
         RuntimeVerifiableOpInterface::generateErrorMessage(
             op,
-            "result of reinterpret_cast is out-of-bounds of the base memref"));
+            "result of reinterpret_cast is out-of-bounds of the base memref",
+            verboseLevel));
   }
 };
 
@@ -331,7 +335,7 @@ struct SubViewOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<SubViewOpInterface,
                                                          SubViewOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto subView = cast<SubViewOp>(op);
     MemRefType sourceType = subView.getSource().getType();
 
@@ -357,7 +361,7 @@ struct SubViewOpInterface
       builder.create<cf::AssertOp>(
           loc, offsetInBounds,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "offset " + std::to_string(i) + " is out-of-bounds"));
+              op, "offset " + std::to_string(i) + " is out-of-bounds", verboseLevel));
 
       // Verify that slice does not run out-of-bounds.
       Value sizeMinusOne = builder.create<arith::SubIOp>(loc, size, one);
@@ -371,7 +375,7 @@ struct SubViewOpInterface
           loc, lastPosInBounds,
           RuntimeVerifiableOpInterface::generateErrorMessage(
               op, "subview runs out-of-bounds along dimension " +
-                      std::to_string(i)));
+                      std::to_string(i), verboseLevel));
     }
   }
 };
@@ -380,7 +384,7 @@ struct ExpandShapeOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<ExpandShapeOpInterface,
                                                          ExpandShapeOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto expandShapeOp = cast<ExpandShapeOp>(op);
 
     // Verify that the expanded dim sizes are a product of the collapsed dim
@@ -414,7 +418,7 @@ struct ExpandShapeOpInterface
           loc, isModZero,
           RuntimeVerifiableOpInterface::generateErrorMessage(
               op, "static result dims in reassoc group do not "
-                  "divide src dim evenly"));
+                  "divide src dim evenly", verboseLevel));
     }
   }
 };
diff --git a/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp b/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
index 8aa194befb420..8b54ed1dc3780 100644
--- a/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
+++ b/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
@@ -15,7 +15,7 @@ class OpBuilder;
 /// Generate an error message string for the given op and the specified error.
 std::string
 RuntimeVerifiableOpInterface::generateErrorMessage(Operation *op,
-                                                   const std::string &msg) {
+                                                   const std::string &msg, unsigned verboseLevel) {
   std::string buffer;
   llvm::raw_string_ostream stream(buffer);
   OpPrintingFlags flags;
@@ -26,9 +26,25 @@ RuntimeVerifiableOpInterface::generateErrorMessage(Operation *op,
   flags.skipRegions();
   flags.useLocalScope();
   stream << "ERROR: Runtime op verification failed\n";
-  op->print(stream, flags);
-  stream << "\n^ " << msg;
-  stream << "\nLocation: ";
+  if (verboseLevel == 2){
+    // print full op including operand names, very expensive
+    op->print(stream, flags);
+  stream << "\n " << msg;
+  }else if (verboseLevel == 1){
+    // print op name and operand types
+    stream << "Op: " << op->getName().getStringRef() << "\n";
+    stream << "Operand Types:";
+    for (const auto &operand : op->getOpOperands()) {
+      stream << " " << operand.get().getType();
+    }
+    stream << "\n" << msg;
+    stream << "Result Types:";
+    for (const auto &result : op->getResults()) {
+      stream << " " << result.getType();
+    }
+    stream << "\n" << msg;
+  }
+  stream << "^\nLocation: ";
   op->getLoc().print(stream);
   return buffer;
 }
diff --git a/mlir/lib/Transforms/GenerateRuntimeVerification.cpp b/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
index a40bc2b3272fc..7a54ce667c6ad 100644
--- a/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
+++ b/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
@@ -28,6 +28,14 @@ struct GenerateRuntimeVerificationPass
 } // namespace
 
 void GenerateRuntimeVerificationPass::runOnOperation() {
+  // Check verboseLevel is in range [0, 2].
+  if (verboseLevel > 2) {
+    getOperation()->emitError(
+      "generate-runtime-verification pass: set verboseLevel to 0, 1 or 2");
+    signalPassFailure();
+    return;
+  }
+
   // The implementation of the RuntimeVerifiableOpInterface may create ops that
   // can be verified. We don't want to generate verification for IR that
   // performs verification, so gather all runtime-verifiable ops first.
@@ -39,7 +47,8 @@ void GenerateRuntimeVerificationPass::runOnOperation() {
   OpBuilder builder(getOperation()->getContext());
   for (RuntimeVerifiableOpInterface verifiableOp : ops) {
     builder.setInsertionPoint(verifiableOp);
-    verifiableOp.generateRuntimeVerification(builder, verifiableOp.getLoc());
+    verifiableOp.generateRuntimeVerification(builder, verifiableOp.getLoc(),
+                                           verboseLevel);
   };
 }
 
diff --git a/mlir/test/Dialect/Linalg/runtime-verification.mlir b/mlir/test/Dialect/Linalg/runtime-verification.mlir
index a4f29d8457e58..238169adf496e 100644
--- a/mlir/test/Dialect/Linalg/runtime-verification.mlir
+++ b/mlir/test/Dialect/Linalg/runtime-verification.mlir
@@ -1,13 +1,25 @@
 // RUN: mlir-opt %s -generate-runtime-verification | FileCheck %s
+// RUN: mlir-opt %s --generate-runtime-verification="verbose-level=1" | FileCheck %s --check-prefix=VERBOSE1
+// RUN: mlir-opt %s --generate-runtime-verification="verbose-level=0" | FileCheck %s --check-prefix=VERBOSE0
 
 // Most of the tests for linalg runtime-verification are implemented as integration tests.
 
 #identity = affine_map<(d0) -> (d0)>
 
 // CHECK-LABEL: @static_dims
+// VERBOSE1-LABEL: @static_dims
+// VERBOSE0-LABEL: @static_dims
 func.func @static_dims(%arg0: tensor<5xf32>, %arg1: tensor<5xf32>) -> (tensor<5xf32>) {
     // CHECK: %[[TRUE:.*]] = index.bool.constant true
     // CHECK: cf.assert %[[TRUE]]
+    // VERBOSE1: %[[TRUE:.*]] = index.bool.constant true
+    // VERBOSE1: cf.assert %[[TRUE]]
+    // VERBOSE1: Operand Types: tensor<5xf32> tensor<5xf32> tensor<5xf32>
+    // VERBOSE1: Result Types
+    // VERBOSE1: Location: loc
+    // VERBOSE0-NOT: Operand Types: tensor<5xf32> tensor<5xf32> tensor<5xf32>
+    // VERBOSE0-NOT: Result Types
+    // VERBOSE0: Location: loc
     %result = tensor.empty() : tensor<5xf32> 
     %0 = linalg.generic {
       indexing_maps = [#identity, #identity, #identity],
@@ -26,9 +38,11 @@ func.func @static_dims(%arg0: tensor<5xf32>, %arg1: tensor<5xf32>) -> (tensor<5x
 #map = affine_map<() -> ()>
 
 // CHECK-LABEL: @scalars
+// VERBOSE1-LABEL: @scalars
 func.func @scalars(%arg0: tensor<f32>, %arg1: tensor<f32>) -> (tensor<f32>) {
     // No runtime checks are required if the operands are all scalars
     // CHECK-NOT: cf.assert
+    // VERBOSE1-NOT: cf.assert
     %result = tensor.empty() : tensor<f32> 
     %0 = linalg.generic {
       indexing_maps = [#map, #map, #map],

@llvmbot
Member

llvmbot commented Sep 23, 2025

@llvm/pr-subscribers-mlir-memref

Author: Hanchenng Wu (HanchengWu)

Changes

The pass generate-runtime-verification generates additional runtime op verification checks.

Currently, the pass is extremely expensive. For example, on a MobileNet v2 SSD network (converted to MLIR), running this pass alone takes 30 minutes. The same behavior has been observed on other networks as small as 5 MB.

The culprit is the line "op->print(stream, flags);" in the function "RuntimeVerifiableOpInterface::generateErrorMessage" in the file mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp.

Because we print each op with the names of all its operands in the middle end, every op->print(...) call constructs a new SSANameState. As a result, a fresh SSA naming analysis runs for each error message generated.

Perf profiling shows that 98% of the time is spent in the constructor of SSANameState.
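
To make the cost argument concrete, here is a minimal toy model of it. It is not MLIR code: `NameState`, `printAllRebuilding`, and `printAllShared` are hypothetical names, and the "IR" is just a count of ops. The only assumption carried over from the description above is that building the name state walks every op in the printed scope.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the printer's name state (hypothetical, not MLIR's class).
// Constructing it visits every op in scope to assign names %0, %1, ...
struct NameState {
  std::vector<std::string> names;

  // `work` counts how many ops were visited while numbering.
  NameState(std::size_t numOps, std::size_t &work) {
    names.reserve(numOps);
    for (std::size_t i = 0; i < numOps; ++i) {
      names.push_back("%" + std::to_string(i));
      ++work;
    }
  }
};

// One fresh name state per printed op: N prints visit N * N ops in total.
std::size_t printAllRebuilding(std::size_t numOps) {
  std::size_t work = 0;
  for (std::size_t op = 0; op < numOps; ++op) {
    NameState state(numOps, work);
    assert(state.names[op] == "%" + std::to_string(op));
  }
  return work;
}

// One name state shared by all prints: the numbering walk happens once.
std::size_t printAllShared(std::size_t numOps) {
  std::size_t work = 0;
  NameState state(numOps, work);
  for (std::size_t op = 0; op < numOps; ++op)
    assert(state.names[op] == "%" + std::to_string(op));
  return work;
}
```

In this model the rebuilding variant does quadratic work and the shared variant does linear work. In MLIR terms, the shared variant corresponds to building one `AsmState` up front and passing it to each `op->print(stream, asmState)` call.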

This change adds a verbose-level option to the generate-runtime-verification pass.
verbose 0: print only the source location with the error message.
verbose 1: print the source location, operation name, and operand types with the error message.
verbose 2: print the full op, including the names of the operands.

verbose 2 is the current behavior and is very expensive. I have kept verbose 2 as the default.
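
As an illustration of the three levels listed above, here is a small self-contained sketch. It is not the code in this PR: `ToyOp` and `makeErrorMessage` are hypothetical, and plain strings stand in for `mlir::Operation`, its types, and its location. It follows the wording of the list, so level 1 shows only the op name and operand types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical plain-data stand-in for an operation.
struct ToyOp {
  std::string fullText;                  // what a full op print would produce
  std::string name;                      // e.g. "memref.load"
  std::vector<std::string> operandTypes; // printed at level 1
  std::string location;                  // printed at every level
};

// Build the message for the given verbosity level (0, 1, or 2).
std::string makeErrorMessage(const ToyOp &op, const std::string &msg,
                             unsigned verboseLevel) {
  assert(verboseLevel <= 2 && "verbose level must be 0, 1 or 2");
  std::string out = "ERROR: Runtime op verification failed\n";
  if (verboseLevel == 2) {
    // Full op text, including operand names (the expensive case).
    out += op.fullText + "\n";
  } else if (verboseLevel == 1) {
    // Op name and operand types only; no SSA names are needed.
    out += "Op: " + op.name + "\nOperand Types:";
    for (const std::string &type : op.operandTypes)
      out += " " + type;
    out += "\n";
  }
  // The error message and source location are printed at every level.
  out += "^ " + msg + "\nLocation: " + op.location;
  return out;
}
```

Levels 0 and 1 never need operand names, which is why they avoid the name-state construction entirely.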

Switching from verbose 2 to verbose 0 or 1 gives the following improvements:

For MLIR imported from MobileNet v2 SSD, the running time of the pass drops from 32 minutes to 21 seconds.
For another small network (only 5 MB in size), the running time of the pass drops from 15 minutes to 4 seconds.
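
Expressed as speedup factors, using only the timings quoted above:

```latex
\frac{32 \times 60\ \text{s}}{21\ \text{s}} = \frac{1920}{21} \approx 91
\qquad
\frac{15 \times 60\ \text{s}}{4\ \text{s}} = \frac{900}{4} = 225
```

That is roughly a 91x speedup for the MobileNet v2 SSD case and a 225x speedup for the 5 MB network.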


Full diff: https://github.com/llvm/llvm-project/pull/160331.diff

8 Files Affected:

  • (modified) mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td (+4-2)
  • (modified) mlir/include/mlir/Transforms/Passes.h (+1)
  • (modified) mlir/include/mlir/Transforms/Passes.td (+8)
  • (modified) mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp (+5-3)
  • (modified) mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp (+25-21)
  • (modified) mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp (+20-4)
  • (modified) mlir/lib/Transforms/GenerateRuntimeVerification.cpp (+10-1)
  • (modified) mlir/test/Dialect/Linalg/runtime-verification.mlir (+14)
diff --git a/mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td b/mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td
index 6fd0df59d9d2e..e5c9336c8d8dc 100644
--- a/mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td
+++ b/mlir/include/mlir/Interfaces/RuntimeVerifiableOpInterface.td
@@ -32,14 +32,16 @@ def RuntimeVerifiableOpInterface : OpInterface<"RuntimeVerifiableOpInterface"> {
       /*retTy=*/"void",
       /*methodName=*/"generateRuntimeVerification",
       /*args=*/(ins "::mlir::OpBuilder &":$builder,
-                    "::mlir::Location":$loc)
+                    "::mlir::Location":$loc,
+                    "unsigned":$verboseLevel)
     >,
   ];
 
   let extraClassDeclaration = [{
     /// Generate the error message that will be printed to the user when 
     /// verification fails.
-    static std::string generateErrorMessage(Operation *op, const std::string &msg);
+    static std::string generateErrorMessage(Operation *op, const std::string &msg,
+                                            unsigned verboseLevel = 0);
   }];
 }
 
diff --git a/mlir/include/mlir/Transforms/Passes.h b/mlir/include/mlir/Transforms/Passes.h
index 41f208216374f..58ba0892df113 100644
--- a/mlir/include/mlir/Transforms/Passes.h
+++ b/mlir/include/mlir/Transforms/Passes.h
@@ -46,6 +46,7 @@ class GreedyRewriteConfig;
 #define GEN_PASS_DECL_SYMBOLPRIVATIZE
 #define GEN_PASS_DECL_TOPOLOGICALSORT
 #define GEN_PASS_DECL_COMPOSITEFIXEDPOINTPASS
+#define GEN_PASS_DECL_GENERATERUNTIMEVERIFICATION
 #include "mlir/Transforms/Passes.h.inc"
 
 /// Creates an instance of the Canonicalizer pass, configured with default
diff --git a/mlir/include/mlir/Transforms/Passes.td b/mlir/include/mlir/Transforms/Passes.td
index a39ab77fc8fb3..3d643d8a168db 100644
--- a/mlir/include/mlir/Transforms/Passes.td
+++ b/mlir/include/mlir/Transforms/Passes.td
@@ -271,8 +271,16 @@ def GenerateRuntimeVerification : Pass<"generate-runtime-verification"> {
     passes that are suspected to introduce faulty IR.
   }];
   let constructor = "mlir::createGenerateRuntimeVerificationPass()";
+  let options = [
+    Option<"verboseLevel", "verbose-level", "unsigned", /*default=*/"2",
+           "Verbosity level for runtime verification messages: "
+           "0 = Minimum (only source location), "
+           "1 = Basic (include operation type and operand type), "
+           "2 = Detailed (include full operation details, names, types, shapes, etc.)">
+  ];
 }
 
+
 def Inliner : Pass<"inline"> {
   let summary = "Inline function calls";
   let constructor = "mlir::createInlinerPass()";
diff --git a/mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp b/mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp
index b30182dc84079..608a6801af267 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/RuntimeOpVerification.cpp
@@ -32,7 +32,7 @@ struct StructuredOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           StructuredOpInterface<T>, T> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto linalgOp = llvm::cast<LinalgOp>(op);
 
     SmallVector<Range> loopRanges = linalgOp.createLoopRanges(builder, loc);
@@ -73,7 +73,8 @@ struct StructuredOpInterface
         auto msg = RuntimeVerifiableOpInterface::generateErrorMessage(
             linalgOp, "unexpected negative result on dimension #" +
                           std::to_string(dim) + " of input/output operand #" +
-                          std::to_string(opOperand.getOperandNumber()));
+                          std::to_string(opOperand.getOperandNumber()),
+            verboseLevel);
         builder.createOrFold<cf::AssertOp>(loc, cmpOp, msg);
 
         // Generate:
@@ -104,7 +105,8 @@ struct StructuredOpInterface
             linalgOp, "dimension #" + std::to_string(dim) +
                           " of input/output operand #" +
                           std::to_string(opOperand.getOperandNumber()) +
-                          " is incompatible with inferred dimension size");
+                          " is incompatible with inferred dimension size",
+            verboseLevel);
         builder.createOrFold<cf::AssertOp>(loc, cmpOp, msg);
       }
     }
diff --git a/mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp b/mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp
index cd92026562da9..d8a7a89a3fbe7 100644
--- a/mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp
+++ b/mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp
@@ -39,7 +39,7 @@ struct AssumeAlignmentOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           AssumeAlignmentOpInterface, AssumeAlignmentOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto assumeOp = cast<AssumeAlignmentOp>(op);
     Value ptr = builder.create<ExtractAlignedPointerAsIndexOp>(
         loc, assumeOp.getMemref());
@@ -53,7 +53,8 @@ struct AssumeAlignmentOpInterface
         loc, isAligned,
         RuntimeVerifiableOpInterface::generateErrorMessage(
             op, "memref is not aligned to " +
-                    std::to_string(assumeOp.getAlignment())));
+                    std::to_string(assumeOp.getAlignment()),
+            verboseLevel));
   }
 };
 
@@ -61,7 +62,7 @@ struct CastOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<CastOpInterface,
                                                          CastOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto castOp = cast<CastOp>(op);
     auto srcType = cast<BaseMemRefType>(castOp.getSource().getType());
 
@@ -79,8 +80,8 @@ struct CastOpInterface
           loc, arith::CmpIPredicate::eq, srcRank, resultRank);
       builder.create<cf::AssertOp>(
           loc, isSameRank,
-          RuntimeVerifiableOpInterface::generateErrorMessage(op,
-                                                             "rank mismatch"));
+          RuntimeVerifiableOpInterface::generateErrorMessage(
+              op, "rank mismatch", verboseLevel));
     }
 
     // Get source offset and strides. We do not have an op to get offsets and
@@ -119,7 +120,8 @@ struct CastOpInterface
       builder.create<cf::AssertOp>(
           loc, isSameSz,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "size mismatch of dim " + std::to_string(it.index())));
+              op, "size mismatch of dim " + std::to_string(it.index()),
+              verboseLevel));
     }
 
     // Get result offset and strides.
@@ -139,7 +141,7 @@ struct CastOpInterface
       builder.create<cf::AssertOp>(
           loc, isSameOffset,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "offset mismatch"));
+              op, "offset mismatch", verboseLevel));
     }
 
     // Check strides.
@@ -157,7 +159,8 @@ struct CastOpInterface
       builder.create<cf::AssertOp>(
           loc, isSameStride,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "stride mismatch of dim " + std::to_string(it.index())));
+              op, "stride mismatch of dim " + std::to_string(it.index()),
+              verboseLevel));
     }
   }
 };
@@ -166,7 +169,7 @@ struct CopyOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<CopyOpInterface,
                                                          CopyOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto copyOp = cast<CopyOp>(op);
     BaseMemRefType sourceType = copyOp.getSource().getType();
     BaseMemRefType targetType = copyOp.getTarget().getType();
@@ -201,7 +204,7 @@ struct CopyOpInterface
           loc, sameDimSize,
           RuntimeVerifiableOpInterface::generateErrorMessage(
               op, "size of " + std::to_string(i) +
-                      "-th source/target dim does not match"));
+                      "-th source/target dim does not match", verboseLevel));
     }
   }
 };
@@ -210,14 +213,14 @@ struct DimOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<DimOpInterface,
                                                          DimOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto dimOp = cast<DimOp>(op);
     Value rank = builder.create<RankOp>(loc, dimOp.getSource());
     Value zero = builder.create<arith::ConstantIndexOp>(loc, 0);
     builder.create<cf::AssertOp>(
         loc, generateInBoundsCheck(builder, loc, dimOp.getIndex(), zero, rank),
         RuntimeVerifiableOpInterface::generateErrorMessage(
-            op, "index is out of bounds"));
+            op, "index is out of bounds", verboseLevel));
   }
 };
 
@@ -228,7 +231,7 @@ struct LoadStoreOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           LoadStoreOpInterface<LoadStoreOp>, LoadStoreOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto loadStoreOp = cast<LoadStoreOp>(op);
 
     auto memref = loadStoreOp.getMemref();
@@ -251,7 +254,7 @@ struct LoadStoreOpInterface
     builder.create<cf::AssertOp>(
         loc, assertCond,
         RuntimeVerifiableOpInterface::generateErrorMessage(
-            op, "out-of-bounds access"));
+            op, "out-of-bounds access", verboseLevel));
   }
 };
 
@@ -295,7 +298,7 @@ struct ReinterpretCastOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<
           ReinterpretCastOpInterface, ReinterpretCastOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto reinterpretCast = cast<ReinterpretCastOp>(op);
     auto baseMemref = reinterpretCast.getSource();
     auto resultMemref =
@@ -323,7 +326,8 @@ struct ReinterpretCastOpInterface
         loc, assertCond,
         RuntimeVerifiableOpInterface::generateErrorMessage(
             op,
-            "result of reinterpret_cast is out-of-bounds of the base memref"));
+            "result of reinterpret_cast is out-of-bounds of the base memref",
+            verboseLevel));
   }
 };
 
@@ -331,7 +335,7 @@ struct SubViewOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<SubViewOpInterface,
                                                          SubViewOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto subView = cast<SubViewOp>(op);
     MemRefType sourceType = subView.getSource().getType();
 
@@ -357,7 +361,7 @@ struct SubViewOpInterface
       builder.create<cf::AssertOp>(
           loc, offsetInBounds,
           RuntimeVerifiableOpInterface::generateErrorMessage(
-              op, "offset " + std::to_string(i) + " is out-of-bounds"));
+              op, "offset " + std::to_string(i) + " is out-of-bounds", verboseLevel));
 
       // Verify that slice does not run out-of-bounds.
       Value sizeMinusOne = builder.create<arith::SubIOp>(loc, size, one);
@@ -371,7 +375,7 @@ struct SubViewOpInterface
           loc, lastPosInBounds,
           RuntimeVerifiableOpInterface::generateErrorMessage(
               op, "subview runs out-of-bounds along dimension " +
-                      std::to_string(i)));
+                      std::to_string(i), verboseLevel));
     }
   }
 };
@@ -380,7 +384,7 @@ struct ExpandShapeOpInterface
     : public RuntimeVerifiableOpInterface::ExternalModel<ExpandShapeOpInterface,
                                                          ExpandShapeOp> {
   void generateRuntimeVerification(Operation *op, OpBuilder &builder,
-                                   Location loc) const {
+                                   Location loc, unsigned verboseLevel) const {
     auto expandShapeOp = cast<ExpandShapeOp>(op);
 
     // Verify that the expanded dim sizes are a product of the collapsed dim
@@ -414,7 +418,7 @@ struct ExpandShapeOpInterface
           loc, isModZero,
           RuntimeVerifiableOpInterface::generateErrorMessage(
               op, "static result dims in reassoc group do not "
-                  "divide src dim evenly"));
+                  "divide src dim evenly", verboseLevel));
     }
   }
 };
diff --git a/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp b/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
index 8aa194befb420..8b54ed1dc3780 100644
--- a/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
+++ b/mlir/lib/Interfaces/RuntimeVerifiableOpInterface.cpp
@@ -15,7 +15,7 @@ class OpBuilder;
 /// Generate an error message string for the given op and the specified error.
 std::string
 RuntimeVerifiableOpInterface::generateErrorMessage(Operation *op,
-                                                   const std::string &msg) {
+                                                   const std::string &msg, unsigned verboseLevel) {
   std::string buffer;
   llvm::raw_string_ostream stream(buffer);
   OpPrintingFlags flags;
@@ -26,9 +26,25 @@ RuntimeVerifiableOpInterface::generateErrorMessage(Operation *op,
   flags.skipRegions();
   flags.useLocalScope();
   stream << "ERROR: Runtime op verification failed\n";
-  op->print(stream, flags);
-  stream << "\n^ " << msg;
-  stream << "\nLocation: ";
+  if (verboseLevel == 2){
+    // print full op including operand names, very expensive
+    op->print(stream, flags);
+  stream << "\n " << msg;
+  }else if (verboseLevel == 1){
+    // print op name and operand types
+    stream << "Op: " << op->getName().getStringRef() << "\n";
+    stream << "Operand Types:";
+    for (const auto &operand : op->getOpOperands()) {
+      stream << " " << operand.get().getType();
+    }
+    stream << "\n" << msg;
+    stream << "Result Types:";
+    for (const auto &result : op->getResults()) {
+      stream << " " << result.getType();
+    }
+    stream << "\n" << msg;
+  }
+  stream << "^\nLocation: ";
   op->getLoc().print(stream);
   return buffer;
 }
diff --git a/mlir/lib/Transforms/GenerateRuntimeVerification.cpp b/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
index a40bc2b3272fc..7a54ce667c6ad 100644
--- a/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
+++ b/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
@@ -28,6 +28,14 @@ struct GenerateRuntimeVerificationPass
 } // namespace
 
 void GenerateRuntimeVerificationPass::runOnOperation() {
+  // Check verboseLevel is in range [0, 2].
+  if (verboseLevel > 2) {
+    getOperation()->emitError(
+      "generate-runtime-verification pass: set verboseLevel to 0, 1 or 2");
+    signalPassFailure();
+    return;
+  }
+
   // The implementation of the RuntimeVerifiableOpInterface may create ops that
   // can be verified. We don't want to generate verification for IR that
   // performs verification, so gather all runtime-verifiable ops first.
@@ -39,7 +47,8 @@ void GenerateRuntimeVerificationPass::runOnOperation() {
   OpBuilder builder(getOperation()->getContext());
   for (RuntimeVerifiableOpInterface verifiableOp : ops) {
     builder.setInsertionPoint(verifiableOp);
-    verifiableOp.generateRuntimeVerification(builder, verifiableOp.getLoc());
+    verifiableOp.generateRuntimeVerification(builder, verifiableOp.getLoc(),
+                                           verboseLevel);
   };
 }
 
diff --git a/mlir/test/Dialect/Linalg/runtime-verification.mlir b/mlir/test/Dialect/Linalg/runtime-verification.mlir
index a4f29d8457e58..238169adf496e 100644
--- a/mlir/test/Dialect/Linalg/runtime-verification.mlir
+++ b/mlir/test/Dialect/Linalg/runtime-verification.mlir
@@ -1,13 +1,25 @@
 // RUN: mlir-opt %s -generate-runtime-verification | FileCheck %s
+// RUN: mlir-opt %s --generate-runtime-verification="verbose-level=1" | FileCheck %s --check-prefix=VERBOSE1
+// RUN: mlir-opt %s --generate-runtime-verification="verbose-level=0" | FileCheck %s --check-prefix=VERBOSE0
 
 // Most of the tests for linalg runtime-verification are implemented as integration tests.
 
 #identity = affine_map<(d0) -> (d0)>
 
 // CHECK-LABEL: @static_dims
+// VERBOSE1-LABEL: @static_dims
+// VERBOSE0-LABEL: @static_dims
 func.func @static_dims(%arg0: tensor<5xf32>, %arg1: tensor<5xf32>) -> (tensor<5xf32>) {
     // CHECK: %[[TRUE:.*]] = index.bool.constant true
     // CHECK: cf.assert %[[TRUE]]
+    // VERBOSE1: %[[TRUE:.*]] = index.bool.constant true
+    // VERBOSE1: cf.assert %[[TRUE]]
+    // VERBOSE1: Operand Types: tensor<5xf32> tensor<5xf32> tensor<5xf32>
+    // VERBOSE1: Result Types
+    // VERBOSE1: Location: loc
+    // VERBOSE0-NOT: Operand Types: tensor<5xf32> tensor<5xf32> tensor<5xf32>
+    // VERBOSE0-NOT: Result Types
+    // VERBOSE0: Location: loc
     %result = tensor.empty() : tensor<5xf32> 
     %0 = linalg.generic {
       indexing_maps = [#identity, #identity, #identity],
@@ -26,9 +38,11 @@ func.func @static_dims(%arg0: tensor<5xf32>, %arg1: tensor<5xf32>) -> (tensor<5x
 #map = affine_map<() -> ()>
 
 // CHECK-LABEL: @scalars
+// VERBOSE1-LABEL: @scalars
 func.func @scalars(%arg0: tensor<f32>, %arg1: tensor<f32>) -> (tensor<f32>) {
     // No runtime checks are required if the operands are all scalars
     // CHECK-NOT: cf.assert
+    // VERBOSE1-NOT: cf.assert
     %result = tensor.empty() : tensor<f32> 
     %0 = linalg.generic {
       indexing_maps = [#map, #map, #map],

@joker-eph
Copy link
Collaborator

Perf profiling shows that 98% percent of the time is spent in the constructor of SSANameState.

That seems to point to a caching issue to me, we probably should start there first?

@HanchengWu HanchengWu force-pushed the add-options-for-generate-runtime-verification branch from 5eac44b to 3b8f7dd Compare September 30, 2025 13:41
@HanchengWu HanchengWu requested a review from hanhanW as a code owner September 30, 2025 13:41
@HanchengWu
Copy link
Contributor Author

Perf profiling shows that 98% percent of the time is spent in the constructor of SSANameState.

That seems to point to a caching issue to me, we probably should start there first?

@HanchengWu HanchengWu closed this Sep 30, 2025
@HanchengWu HanchengWu reopened this Sep 30, 2025
@HanchengWu
Copy link
Contributor Author

Apologies, I accidentally clicked the close button (the screen is a bit laggy in vncviewer) and have reopened the PR. Please see my previous answer above.

Copy link

github-actions bot commented Oct 1, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@HanchengWu HanchengWu force-pushed the add-options-for-generate-runtime-verification branch from 3b8f7dd to c111294 Compare October 2, 2025 17:20
@joker-eph joker-eph changed the title Add options to generate-runtime-verification to enable faster pass running [MLIR] Add options to generate-runtime-verification to enable faster pass running Oct 2, 2025
@joker-eph
Copy link
Collaborator

This LGTM, but can we get an upstream repro for:

Currently, the pass is extremely expensive. For example, with a mobilenet v2 ssd network(converted to mlir), running this pass alone will take 30 minutes. The same observation has been made to other networks as small as 5 Mb.
Perf profiling shows that 98% percent of the time is spent in the constructor of SSANameState.

I'd like to be able to benchmark this upstream to check how to improve this pass.

@HanchengWu
Copy link
Contributor Author

Hi Mehdi,
Below are the steps to reproduce, if you want to try the LLVM LKG.

https://drive.google.com/file/d/1KavPyfMPobtpnnQxqmK6Iq5IBbdMe3PY/view?usp=sharing

Download ssd_mobilenet_v2.standard.mlir, then run the following:

mlir-opt --generate-runtime-verification -mlir-disable-threading=true -mlir-timing -mlir-timing-display=tree ssd_mobilenet_v2.standard.mlir -o ssd_mobilenet_v2.lkg.mlir

The reported running time is somewhere between 150 and 200 seconds, depending on the machine.

Then comment out the line op->print(...) in "RuntimeVerifiableOpInterface.cpp", recompile LLVM, and run the command above again.
The pass should then finish within 0.3 seconds.

As I replied earlier, the operand names that op->print(...) puts in the error message are not that useful: they appear neither in the input MLIR nor in the output MLIR.

Alternatively, you can check out my changes and try verbose 0, 1, and 2. Verbose 2 takes 150-200 seconds, while verbose 0 and 1 take only 0.2-0.3 seconds.

@joker-eph
Copy link
Collaborator

Can you apply this patch:

diff --git a/mlir/lib/Transforms/GenerateRuntimeVerification.cpp b/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
index cfe531385fef..9a25b7087d4d 100644
--- a/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
+++ b/mlir/lib/Transforms/GenerateRuntimeVerification.cpp
@@ -6,6 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "mlir/IR/AsmState.h"
 #include "mlir/Transforms/Passes.h"
 
 #include "mlir/IR/Builders.h"
@@ -43,23 +44,22 @@ void GenerateRuntimeVerificationPass::runOnOperation() {
   getOperation()->walk([&](RuntimeVerifiableOpInterface verifiableOp) {
     ops.push_back(verifiableOp);
   });
-
+  OpPrintingFlags flags;
+  // We may generate a lot of error messages and so we need to ensure the
+  // printing is fast.
+  flags.elideLargeElementsAttrs();
+  flags.skipRegions();
+  flags.useLocalScope();
+  AsmState state(getOperation(), flags);
   // Create error message generator based on verboseLevel
-  auto errorMsgGenerator = [vLevel = verboseLevel.getValue()](
+  auto errorMsgGenerator = [vLevel = verboseLevel.getValue(), &state](
                                Operation *op, StringRef msg) -> std::string {
     std::string buffer;
     llvm::raw_string_ostream stream(buffer);
-    OpPrintingFlags flags;
-    // We may generate a lot of error messages and so we need to ensure the
-    // printing is fast.
-    flags.elideLargeElementsAttrs();
-    flags.printGenericOpForm();
-    flags.skipRegions();
-    flags.useLocalScope();
     stream << "ERROR: Runtime op verification failed\n";
     if (vLevel == 2) {
       // print full op including operand names, very expensive
-      op->print(stream, flags);
+      op->print(stream, state);
       stream << "\n " << msg;
     } else if (vLevel == 1) {
       // print op name and operand types

Then redo the benchmarks and the SSA value comparisons, and let me know how this looks now?

@HanchengWu
Copy link
Contributor Author

sure, will do today

@HanchengWu
Copy link
Contributor Author

Hi,
The patch works: it constructs an AsmState at the very beginning and uses it throughout the pass.

With this patch, I observed the following.

  1. Verbose 2 now finishes very quickly, within 0.3 seconds, similar to verbose 0 and 1.
  2. For verbose 2, the op string that we insert in the error message is the same string that is used for the op in the input MLIR file.

This seems to be the ideal case, where the error message refers to op strings from the input MLIR.

I had actually seen this "op->print(stream, asmState)" overload before, but since the definition of AsmState mentions that "The IR should not be mutated in-between invocations using this state", I didn't try it. It does seem to work now that I have tried it. This is probably because the pass only inserts new nodes into the IR and never deletes anything, so whatever the old AsmState stored is still valid for the old IR nodes. I guess something would break if we tried to print a node inserted during the pass run using this old AsmState.

If you think using this is safe, then I propose that we move forward and remove the verbose 1 option. The new options would be:

verbose 0: print location in input mlir only.
verbose 1: print op string and location in input mlir.
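
The cost difference discussed above can be illustrated with a toy model. This is only a sketch and not MLIR code: `ToyOp`, `ToyNameState`, `printRebuilding`, and `printReusing` are hypothetical names, and the naming state is assumed to do nothing more than number every op in its constructor. Under that assumption, rebuilding the state for each message numbers n ops n times, while reusing one state numbers them once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for an operation; NOT an MLIR type.
struct ToyOp {
  std::string name;
};

// Toy stand-in for a printer naming state. Its constructor numbers every op
// in the program, so building it costs O(number of ops).
class ToyNameState {
public:
  explicit ToyNameState(const std::vector<ToyOp> &ops) {
    for (std::size_t i = 0; i < ops.size(); ++i)
      ids[&ops[i]] = i;
  }

  // True if this op was numbered when the state was built.
  bool knows(const ToyOp &op) const { return ids.count(&op) != 0; }

  // Number of ops visited while building the state.
  std::size_t numberedOps() const { return ids.size(); }

  // Prints an op using its precomputed number.
  std::string print(const ToyOp &op) const {
    assert(knows(op) && "op was not numbered by this state");
    return "%" + std::to_string(ids.at(&op)) + " = " + op.name;
  }

private:
  std::unordered_map<const ToyOp *, std::size_t> ids;
};

// Rebuilds the state for every printed op; returns the total ops numbered.
std::size_t printRebuilding(const std::vector<ToyOp> &ops,
                            std::vector<std::string> &out) {
  std::size_t numbered = 0;
  for (const ToyOp &op : ops) {
    ToyNameState state(ops); // O(n) work for each message
    numbered += state.numberedOps();
    out.push_back(state.print(op));
  }
  return numbered;
}

// Builds the state once and reuses it; returns the total ops numbered.
std::size_t printReusing(const std::vector<ToyOp> &ops,
                         std::vector<std::string> &out) {
  ToyNameState state(ops); // O(n) work once
  for (const ToyOp &op : ops)
    out.push_back(state.print(op));
  return state.numberedOps();
}
```

Both functions produce the same strings. In this toy model, an op created after the state was built is not known to it (`knows` returns false), which loosely mirrors the caveat about mutating the IR between print calls. Whether and how the real AsmState fails in that situation is not something this sketch shows.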

@joker-eph
Copy link
Collaborator

LG

@HanchengWu HanchengWu force-pushed the add-options-for-generate-runtime-verification branch from c111294 to 3151d3c Compare October 6, 2025 20:47
@joker-eph
Copy link
Collaborator

Can you please update the description to make it a good commit message? (not too long, but descriptive enough, and up-to-date with the current state)

@HanchengWu HanchengWu force-pushed the add-options-for-generate-runtime-verification branch from 3151d3c to 23abdae Compare October 6, 2025 21:03
…ion pass, and add location only pass option.
@HanchengWu HanchengWu force-pushed the add-options-for-generate-runtime-verification branch from 23abdae to 1bcb096 Compare October 6, 2025 21:10
@HanchengWu
Copy link
Contributor Author

Done. Updated commit message and PR description, also rebased to LLVM main tip

@HanchengWu HanchengWu changed the title [MLIR] Add options to generate-runtime-verification to enable faster pass running [MLIR] Reuse AsmState to enable fast generate-runtime-verification pass; add location-only pass option Oct 7, 2025
@HanchengWu
Copy link
Contributor Author

@joker-eph
Hi, Mehdi
The checks have passed. I don't have permission to merge the PR. Can you help merge it?

@joker-eph joker-eph merged commit a6d1a52 into llvm:main Oct 8, 2025
9 checks passed
Copy link

github-actions bot commented Oct 8, 2025

@HanchengWu Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 8, 2025

LLVM Buildbot has detected a new failure on builder mlir-rocm-mi200 running on mi200-buildbot while building mlir at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/177/builds/22238

Here is the relevant piece of the build log for the reference
Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/Dialect/MemRef/dim-runtime-verification.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mi200-buildbot/mlir-rocm-mi200/build/bin/mlir-opt /vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/Integration/Dialect/MemRef/dim-runtime-verification.mlir -generate-runtime-verification      -expand-strided-metadata      -test-cf-assert      -convert-to-llvm |  /vol/worker/mi200-buildbot/mlir-rocm-mi200/build/bin/mlir-runner -e main -entry-point-result=void      -shared-libs=/vol/worker/mi200-buildbot/mlir-rocm-mi200/build/lib/libmlir_runner_utils.so 2>&1 |  /vol/worker/mi200-buildbot/mlir-rocm-mi200/build/bin/FileCheck /vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/Integration/Dialect/MemRef/dim-runtime-verification.mlir
# executed command: /vol/worker/mi200-buildbot/mlir-rocm-mi200/build/bin/mlir-opt /vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/Integration/Dialect/MemRef/dim-runtime-verification.mlir -generate-runtime-verification -expand-strided-metadata -test-cf-assert -convert-to-llvm
# executed command: /vol/worker/mi200-buildbot/mlir-rocm-mi200/build/bin/mlir-runner -e main -entry-point-result=void -shared-libs=/vol/worker/mi200-buildbot/mlir-rocm-mi200/build/lib/libmlir_runner_utils.so
# executed command: /vol/worker/mi200-buildbot/mlir-rocm-mi200/build/bin/FileCheck /vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/Integration/Dialect/MemRef/dim-runtime-verification.mlir
# .---command stderr------------
# | /vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/Integration/Dialect/MemRef/dim-runtime-verification.mlir:23:17: error: CHECK-NEXT: expected string not found in input
# |  // CHECK-NEXT: "memref.dim"(%{{.*}}, %{{.*}}) : (memref<1xf32>, index) -> index
# |                 ^
# | <stdin>:1:38: note: scanning from here
# | ERROR: Runtime op verification failed
# |                                      ^
# | <stdin>:2:8: note: possible intended match here
# | %dim = memref.dim %alloca, %c4 : memref<1xf32>
# |        ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/Integration/Dialect/MemRef/dim-runtime-verification.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |            1: ERROR: Runtime op verification failed 
# | next:23'0                                          X error: no match found
# |            2: %dim = memref.dim %alloca, %c4 : memref<1xf32> 
# | next:23'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:23'1            ?                                        possible intended match
# |            3: ^ 
# | next:23'0     ~~
# |            4: Location: loc("/vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/Integration/Dialect/MemRef/dim-runtime-verification.mlir":26:10) 
# | next:23'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

********************


@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 8, 2025

LLVM Buildbot has detected a new failure on builder mlir-nvidia-gcc7 running on mlir-nvidia while building mlir at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/19426

Here is the relevant piece of the build log for the reference
Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/Dialect/Linalg/CPU/runtime-verification.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/Dialect/Linalg/CPU/runtime-verification.mlir -generate-runtime-verification  -one-shot-bufferize="bufferize-function-boundaries"  -buffer-deallocation-pipeline  -convert-bufferization-to-memref  -convert-linalg-to-loops  -expand-strided-metadata  -lower-affine  -convert-scf-to-cf  -test-cf-assert  -convert-index-to-llvm  -finalize-memref-to-llvm  -convert-func-to-llvm  -convert-arith-to-llvm  -convert-cf-to-llvm  -reconcile-unrealized-casts |  /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner -e main -entry-point-result=void      -shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so      -shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_c_runner_utils.so 2>&1 |  /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/Dialect/Linalg/CPU/runtime-verification.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/Dialect/Linalg/CPU/runtime-verification.mlir -generate-runtime-verification -one-shot-bufferize=bufferize-function-boundaries -buffer-deallocation-pipeline -convert-bufferization-to-memref -convert-linalg-to-loops -expand-strided-metadata -lower-affine -convert-scf-to-cf -test-cf-assert -convert-index-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner -e main -entry-point-result=void -shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so -shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_c_runner_utils.so
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/Dialect/Linalg/CPU/runtime-verification.mlir
# .---command stderr------------
# | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/Dialect/Linalg/CPU/runtime-verification.mlir:32:12: error: CHECK: expected string not found in input
# |  // CHECK: ^ dimension #0 of input/output operand #1 is incompatible with inferred dimension size
# |            ^
# | <stdin>:2:20: note: scanning from here
# | %0 = linalg.generic {indexing_maps = [affine_map<(d0) -> (d0)>, affine_map<(d0) -> (d0)>], iterator_types = ["parallel"]} ins(%arg0 : tensor<?xf32>) outs(%arg1 : tensor<?xf32>) {...} -> tensor<?xf32>
# |                    ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/Dialect/Linalg/CPU/runtime-verification.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |           1: ERROR: Runtime op verification failed 
# |           2: %0 = linalg.generic {indexing_maps = [affine_map<(d0) -> (d0)>, affine_map<(d0) -> (d0)>], iterator_types = ["parallel"]} ins(%arg0 : tensor<?xf32>) outs(%arg1 : tensor<?xf32>) {...} -> tensor<?xf32> 
# | check:32                        X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |           3: ^ 
# | check:32     ~~
# |           4: Location: loc("/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/Dialect/Linalg/CPU/runtime-verification.mlir":113:10) 
# | check:32     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           5: ERROR: Runtime op verification failed 
# | check:32     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           6: %0 = linalg.generic {indexing_maps = [affine_map<(d0) -> (d0)>, affine_map<(d0) -> (d0)>], iterator_types = ["parallel"]} ins(%arg0 : tensor<?xf32>) outs(%arg1 : tensor<?xf32>) {...} -> tensor<?xf32> 
# | check:32     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           7: ^ 
# | check:32     ~~
# |           .
# |           .
# |           .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

********************


@HanchengWu
Copy link
Contributor Author

HanchengWu commented Oct 8, 2025

@joker-eph
Hi Mehdi,

My apologies. I think I made a mistake in formatting the error message generator, which resulted in some test failures.

Should I revert the PR or submit a fix? The fix is straightforward.

HanchengWu added a commit to HanchengWu/llvm-project that referenced this pull request Oct 8, 2025
…m#160331

PR llvm#160331 introduced a mistake that removed the error message for generate-runtime-verification
pass, leading to two test failures during `check-mlir` build.

This patch restores the missing error message, resolving the failures. Verified locally.

Fixes post-merge regression from: llvm#160331
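
For reference, the message layout that the failing checks expect can be sketched as a small standalone function. This is a hypothetical helper, not the code in the MLIR pass: `formatRuntimeError` and its parameters are made up for illustration, the level-1 layout is inferred from the buildbot logs above (op line, then `^ ` followed by the message, then `Location:`), and the level-0 layout is an assumption based on the PR description ("print only source location with error message").

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not the MLIR implementation) that assembles a runtime
// verification error string for the two verbose levels discussed in this PR.
std::string formatRuntimeError(int verboseLevel, const std::string &opText,
                               const std::string &msg,
                               const std::string &loc) {
  assert((verboseLevel == 0 || verboseLevel == 1) && "unknown verbose level");
  std::string out = "ERROR: Runtime op verification failed\n";
  if (verboseLevel == 1)
    out += opText + "\n"; // level 1: include the printed op
  // The failing logs show this line as a bare "^ " because msg was dropped.
  out += "^ " + msg + "\n";
  out += "Location: " + loc;
  return out;
}
```

With an empty `msg`, the level-1 output has the same shape as the buildbot logs: an `ERROR:` line, the op, a bare `^ `, and the `Location:` line. That is consistent with the commit message's statement that the error message had been removed.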
@DavidSpickett
Copy link
Collaborator

Just for completeness, we also had failures on AArch64 Linux: https://lab.llvm.org/buildbot/#/builders/143/builds/11522

They look identical to the ones already posted here though.

Also for future contributions please adjust your email settings - https://llvm.org/docs/DeveloperPolicy.html#email-addresses. Then you will get notified of failed buildbots that include more than just your commit (only builds where your PR is the only commit will report back to GitHub).

@HanchengWu
Copy link
Contributor Author

Thanks! I have set my email preferences according to that policy.
I am testing the fix now.

@HanchengWu
Copy link
Contributor Author

I have the fix now. However, I did not have "-DMLIR_INCLUDE_INTEGRATION_TESTS=ON" set when I configured my LLVM build, so these tests were skipped as "UNSUPPORTED". I will verify the fix against them and submit a new PR as soon as possible.

@DavidSpickett
Copy link
Collaborator

Cool.

If they require hardware/a simulator, I have access to that so I can test your fix. There are some AArch64 SVE/SME tests that'll use qemu if you're not on that specific hardware.

HanchengWu added a commit to HanchengWu/llvm-project that referenced this pull request Oct 8, 2025
llvm#160331

PR llvm#160331 introduced a mistake that removed the error message for generate-runtime-verification
pass, leading to test failures during `test-build-check-mlir-build-only-check-mlir`.

This patch restores the missing error message.

In addition, for related tests, the op strings used in FileChecks are updated with the same op
formats as used in input mlirs.

Verified locally.

Fixes post-merge regression from: llvm#160331
@HanchengWu
Copy link
Contributor Author

@DavidSpickett @joker-eph
This is the fix.
#162533

@HanchengWu
Copy link
Contributor Author

Thanks for the help! Luckily, these generic tests do not require specific hardware.

I have submitted the fix PR, linked above.

joker-eph pushed a commit that referenced this pull request Oct 8, 2025
#160331 (#162533)

[MLIR] Fix test failures for generate-runtime-verification pass from PR #160331
    
PR #160331 introduced a mistake that removed the error message for
generate-runtime-verification pass, leading to test failures during
`test-build-check-mlir-build-only-check-mlir`.

This patch restores the missing error message.

In addition, for related tests, the op strings used in FileChecks are
updated with the same op formats as used in input mlirs.

Verified locally.

Fixes post-merge regression from: #160331
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 8, 2025
… pass from PR #160331 (#162533)

[MLIR] Fix test failures for generate-runtime-verification pass from PR #160331

PR #160331 introduced a mistake that removed the error message for
generate-runtime-verification
pass, leading to test failures during
`test-build-check-mlir-build-only-check-mlir`.

This patch restores the missing error message.

In addition, for related tests, the op strings used in FileChecks are
updated with the same op
formats as used in input mlirs.

Verified locally.

Fixes post-merge regression from:
llvm/llvm-project#160331
6 participants