@linuxlonelyeagle
Member

I encountered this in the pass I wrote.

matmul.mlir:5:3: error: 'affine.for' op operand cannot be used as a symbol
  linalg.matmul
  ^
matmul.mlir:5:3: note: see current operation: 
"affine.for"(%9, %8) <{lowerBoundMap = affine_map<()[s0] -> (s0)>, operandSegmentSizes = array<i32: 1, 1, 0>, step = 32 : index, upperBoundMap = affine_map<()[s0] -> (s0)>}> ({
^bb0(%arg3: index):
  "affine.yield"() : () -> ()
}) : (index, index) -> ()
make: *** [makefile:56: gemm-opt-matmul-lower] Error 1

This happens when the lower or upper bound of a newly created affine.for uses an affine map whose symbol operand comes from a memref.dim on a memref that is a function argument; the affine.for verifier then fails.
The IR below only sketches the pattern and is not meaningful on its own. I created an affine.for like this in a pass and hit the error above. Note, however, that the same IR written by hand causes no problem, so I didn't add a test.

#map = affine_map<()[s0] -> (s0)>

func.func @func(%A : memref<32x128xf32>) {
  %0 = arith.constant 0 : index
  %1 = arith.constant 1 : index
  %dim_0 = memref.dim %A, %0 : memref<32x128xf32>
  %dim_1 = memref.dim %A, %1 : memref<32x128xf32>
  affine.for %it = #map()[%dim_0] to #map()[%dim_1] {

  }
  return
}

@llvmbot
Member

llvmbot commented Nov 26, 2024

@llvm/pr-subscribers-mlir-affine

@llvm/pr-subscribers-mlir

Author: lonely eagle (linuxlonelyeagle)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/117721.diff

1 Files Affected:

  • (modified) mlir/lib/Dialect/Affine/IR/AffineOps.cpp (+13-3)
diff --git a/mlir/lib/Dialect/Affine/IR/AffineOps.cpp b/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
index 1c5466730a5589..0d24e434328419 100644
--- a/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
+++ b/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
@@ -17,6 +17,7 @@
 #include "mlir/IR/Matchers.h"
 #include "mlir/IR/OpDefinition.h"
 #include "mlir/IR/PatternMatch.h"
+#include "mlir/Interfaces/FunctionInterfaces.h"
 #include "mlir/Interfaces/ShapedOpInterfaces.h"
 #include "mlir/Interfaces/ValueBoundsOpInterface.h"
 #include "mlir/Transforms/InliningUtils.h"
@@ -352,9 +353,13 @@ static bool isDimOpValidSymbol(ShapedDimOpInterface dimOp, Region *region) {
 
   // Conservatively handle remaining BlockArguments as non-valid symbols.
   // E.g. scf.for iterArgs.
-  if (llvm::isa<BlockArgument>(dimOp.getShapedValue()))
-    return false;
-
+  if (auto blockArgument =
+          llvm::dyn_cast<BlockArgument>(dimOp.getShapedValue())) {
+    if (!llvm::isa<FunctionOpInterface>(
+            blockArgument.getParentRegion()->getParentOp())) {
+      return false;
+    }
+  }
   // The dim op is also okay if its operand memref is a view/subview whose
   // corresponding size is a valid symbol.
   std::optional<int64_t> index = getConstantIntValue(dimOp.getDimension());
@@ -365,6 +370,11 @@ static bool isDimOpValidSymbol(ShapedDimOpInterface dimOp, Region *region) {
 
   // Skip over all memref.cast ops (if any).
   Operation *op = dimOp.getShapedValue().getDefiningOp();
+
+  // the ShapedValue of the dim is the function block argument.
+  if (!op)
+    return true;
+
   while (auto castOp = dyn_cast<memref::CastOp>(op)) {
     // Bail on unranked memrefs.
     if (isa<UnrankedMemRefType>(castOp.getSource().getType()))

@linuxlonelyeagle
Member Author

I believe this issue can be made clearer. Below is the IR I get after fixing the bug. If you have any questions, please let me know.

  gpu.module @gpu {
    gpu.func @gemm(%arg0: memref<128x32xf32>, %arg1: memref<32x64xf32>, %arg2: memref<128x64xf32>) kernel {
      %0 = gpu.dynamic_shared_memory : memref<?xi8, #gpu.address_space<workgroup>>
       .....
      %c1 = arith.constant 1 : index
      %dim = memref.dim %arg0, %c1 : memref<128x32xf32>
      %c0_3 = arith.constant 0 : index
      affine.for %arg3 = %c0_3 to %dim step 32 {
      }
      gpu.return
    }
  }

Member

@ftynse ftynse left a comment

Thank you. Let's first understand whether this is a bug or the intended behavior. This starts with adding a test. The test should show that something that wasn't accepted as a symbol becomes accepted as a symbol, i.e. does not emit an error, after the patch.

Also consider the fact that not all block arguments are function arguments. One can perfectly well have

func.func @foo(...) {
  cf.br ^bb1(...)

^bb1(%bbarg: memref<?xf32>):
  %dim = memref.dim %bbarg, %c0
  %new = call @memref_realloc(%bbarg, 2 * %dim)
  cf.cond_br ^bb1(%new), ^bb2

^bb2:
  return
}

where %bbarg is a block argument, but its dim cannot be used as a symbol because it changes.

Comment on lines +358 to +359
if (!llvm::isa<FunctionOpInterface>(
blockArgument.getParentRegion()->getParentOp())) {
Member

Functions may have blocks other than the entry block. Not all block arguments are function arguments, so this change looks suspicious to me.
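
The distinction can be illustrated with a small self-contained model. This is only a sketch: `ToyOp`, `ToyBlock`, `ToyBlockArgument`, and both helper functions are hypothetical stand-ins invented for illustration, not MLIR classes. It contrasts the check in the posted patch (the parent op is function-like) with a stricter one that also requires the owning block to be the entry block. In MLIR itself, the analogous query would presumably combine `BlockArgument::getOwner()` with `Block::isEntryBlock()`, but that is an assumption about how a fix might look, not what the patch does.

```cpp
#include <string>

// Toy stand-ins for IR concepts. These are NOT the MLIR classes.
struct ToyOp {
  std::string name;     // e.g. "func.func", for readability only
  bool isFunctionLike;  // models "implements FunctionOpInterface"
};

struct ToyBlock {
  ToyOp *parentOp;  // op whose region contains this block
  bool isEntry;     // true only for the first block of that region
};

struct ToyBlockArgument {
  ToyBlock *owner;  // block that defines this argument
};

// Models the check in the posted patch: any block argument whose
// enclosing op is function-like is let through.
bool acceptedByPostedPatch(const ToyBlockArgument &arg) {
  return arg.owner->parentOp->isFunctionLike;
}

// Stricter variant: the argument must also belong to the entry block,
// so arguments of ^bb1-style blocks inside a function are rejected.
bool isFunctionEntryArgument(const ToyBlockArgument &arg) {
  return arg.owner->isEntry && arg.owner->parentOp->isFunctionLike;
}
```

With a function that has an entry block and a second block `^bb1`, the first helper accepts the arguments of both blocks, while the second accepts only the entry block's arguments, which is the case the `%bbarg` example above is about.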

Member Author

You're right. Thanks for the advice; I think I know how to handle it now. Thanks!

Member Author

You can parse the following IR with mlir-opt to trigger the bug; I found that I can reproduce it using the generic IR form.
In that case, I can write tests too.

#map = affine_map<()[s0] -> (s0)>
"builtin.module"() ({
  "gpu.module"() <{sym_name = "gpu"}> ({
    "gpu.func"() <{function_type = (memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>) -> ()}> ({
    ^bb0(%arg3: memref<?x?xf32>, %arg4: memref<?x?xf32>, %arg5: memref<?x?xf32>):
      %16 = "arith.constant"() <{value = 1 : index}> : () -> index
      %17 = "memref.dim"(%arg3, %16) : (memref<?x?xf32>, index) -> index
      %18 = "arith.constant"() <{value = 0 : index}> : () -> index
      "affine.for"(%18, %17) <{lowerBoundMap = #map, operandSegmentSizes = array<i32: 1, 1, 0>, step = 32 : index, upperBoundMap = #map}> ({
      ^bb0(%arg6: index):
        "affine.yield"() : () -> ()
      }) : (index, index) -> ()
      "gpu.return"() : () -> ()
    }) {gpu.kernel, sym_name = "gemm", workgroup_attributions = 0 : i64} : () -> ()
  }) : () -> ()
  "func.func"() <{function_type = (memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>) -> f32, sym_name = "main"}> ({
  ^bb0(%arg0: memref<?x?xf32>, %arg1: memref<?x?xf32>, %arg2: memref<?x?xf32>):
    %0 = "arith.constant"() <{value = 0.000000e+00 : f32}> : () -> f32
    %1 = "arith.constant"() <{value = 1.000000e+00 : f32}> : () -> f32
    %2 = "arith.constant"() <{value = 2.000000e+00 : f32}> : () -> f32
    %3 = "arith.constant"() <{value = 0 : index}> : () -> index
    %4 = "memref.dim"(%arg0, %3) : (memref<?x?xf32>, index) -> index
    %5 = "arith.constant"() <{value = 1 : index}> : () -> index
    %6 = "memref.dim"(%arg0, %5) : (memref<?x?xf32>, index) -> index
    %7 = "arith.constant"() <{value = 1 : index}> : () -> index
    %8 = "memref.dim"(%arg1, %7) : (memref<?x?xf32>, index) -> index
    %9 = "arith.constant"() <{value = 128 : index}> : () -> index
    %10 = "arith.ceildivui"(%4, %9) : (index, index) -> index
    %11 = "arith.constant"() <{value = 64 : index}> : () -> index
    %12 = "arith.ceildivsi"(%6, %11) : (index, index) -> index
    %13 = "arith.constant"() <{value = 256 : index}> : () -> index
    %14 = "arith.constant"() <{value = 262144 : i32}> : () -> i32
    %15 = "arith.constant"() <{value = 1 : index}> : () -> index
    "gpu.launch_func"(%12, %10, %15, %13, %15, %15, %14, %arg0, %arg1, %arg2) <{kernel = @gpu::@gemm, operandSegmentSizes = array<i32: 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 3, 0>}> : (index, index, index, index, index, index, i32, memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>) -> ()
    "func.return"(%0) : (f32) -> ()
  }) : () -> ()
}) {gpu.container_module} : () -> ()

There is, however, another question I'd like to ask, one that I haven't fully thought through yet.

Member Author

@linuxlonelyeagle linuxlonelyeagle Nov 27, 2024

I'm a bit confused. For block bb0, if its parameter is a memref, its dimensions can change as well, but that shouldn't have an effect like the one in the example you gave. I'm not sure about this, and I think it needs to be confirmed. I'm also not quite sure how to fix it, so I'd appreciate some guidance. Thanks.

Member Author

I think I've figured it out, and I'll modify the patch later.

Member Author

There is a new development on this issue: I found that the real problem is that gpu.func doesn't have the AffineScope trait. I'll have to look into this further. @ftynse Thank you for the guidance you've given me. I think I'm still making progress.
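
A toy model may help show why a missing AffineScope trait would explain the difference between the hand-written func.func example and the gpu.func case. This is a rough sketch under stated assumptions, not the MLIR implementation: `ToyOp`, `closestAffineScope`, and `isTopLevelForUser` are hypothetical names, and which ops are flagged as affine scopes is chosen by the caller for illustration. The assumed rule is that a value defined directly in an op's body counts as top-level for a user only when that op is the closest enclosing affine scope of the user.

```cpp
#include <string>

// Toy stand-in for an operation. This is NOT the MLIR Operation class.
struct ToyOp {
  std::string name;     // e.g. "gpu.func", for readability only
  bool hasAffineScope;  // models carrying the AffineScope trait
  const ToyOp *parent;  // enclosing op, or nullptr at the top
};

// Walk outward from `op` and return the closest enclosing op that is
// flagged as an affine scope, or nullptr if there is none.
const ToyOp *closestAffineScope(const ToyOp *op) {
  for (const ToyOp *cur = op->parent; cur; cur = cur->parent)
    if (cur->hasAffineScope)
      return cur;
  return nullptr;
}

// A value defined directly in the body of `definedIn` is treated as
// top-level for `user` only if `definedIn` is the user's closest scope.
bool isTopLevelForUser(const ToyOp *definedIn, const ToyOp *user) {
  return closestAffineScope(user) == definedIn;
}
```

In this model, if the function op is not flagged as a scope, the lookup from the loop skips past it to an outer op, so a memref.dim result defined in the function body is no longer top-level and has to pass the stricter dim-specific checks instead. Flagging the function op as a scope makes the same value top-level again.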

ftynse pushed a commit that referenced this pull request Nov 29, 2024
This PR solves the problem described in #117721.
To efficiently implement the thread-to-data mapping relationship, I
introduced AffineScope in gpu.func (data or thread layout).