[mlir][linalg] Extend FuseElementwiseOps pattern to work with named ops
#144922
@@ -1017,9 +1017,75 @@ module {
// -----

func.func @map_ops(%in1: tensor<8xf32>, %in2: tensor<8xf32>) -> tensor<8xf32> {
  %fill = tensor.empty() : tensor<8xf32>
  %add = linalg.map {arith.addf} ins(%in1, %in2: tensor<8xf32>, tensor<8xf32>) outs(%fill: tensor<8xf32>)
  %mapped_65 = linalg.map { math.sqrt } ins(%add : tensor<8xf32>) outs(%fill : tensor<8xf32>)
  return %mapped_65 : tensor<8xf32>
}

// CHECK-LABEL: func @map_ops
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<8xf32>
// CHECK: %[[FUSED_OP:.+]] = linalg.generic
// CHECK-SAME: ins(%[[ARG0]], %[[ARG1]] : {{.*}}) outs(%[[EMPTY]] :
// CHECK-NEXT: ^bb0(%[[IN0:.*]]: f32, %[[IN1:.*]]: f32, %[[OUT:.*]]: f32):
// CHECK-NEXT: %[[ADD:.*]] = arith.addf %[[IN0]], %[[IN1]]
// CHECK-NEXT: %[[SQRT:.*]] = math.sqrt %[[ADD]]
// CHECK-NEXT: linalg.yield %[[SQRT]]
// CHECK-NOT: linalg.generic

// -----
func.func @map_matmul(%in1: tensor<8x10xf32>, %in2: tensor<10x12xf32>) -> tensor<8x12xf32> {
  %fill0 = tensor.empty() : tensor<8x10xf32>
  %exp = linalg.map {math.exp} ins(%in1 : tensor<8x10xf32>) outs(%fill0: tensor<8x10xf32>)
  %fill1 = tensor.empty() : tensor<8x12xf32>
  %matmul = linalg.matmul ins(%exp, %in2 : tensor<8x10xf32>, tensor<10x12xf32>) outs(%fill1 : tensor<8x12xf32>) -> tensor<8x12xf32>
  return %matmul : tensor<8x12xf32>
}

// CHECK-DAG: #[[$MAP0:.+]] = affine_map<(d0, d1, d2) -> (d0, d2)>
// CHECK-DAG: #[[$MAP1:.+]] = affine_map<(d0, d1, d2) -> (d2, d1)>
// CHECK-DAG: #[[$MAP2:.+]] = affine_map<(d0, d1, d2) -> (d0, d1)>
// CHECK-LABEL: func @map_matmul
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]: tensor<8x10xf32>
// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]: tensor<10x12xf32>
// CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<8x12xf32>
// CHECK: %[[FUSED_OP:.+]] = linalg.generic
// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]], #[[$MAP2]]]
// CHECK-SAME: iterator_types = ["parallel", "parallel", "reduction"]
// CHECK-SAME: ins(%[[ARG0]], %[[ARG1]] : {{.*}}) outs(%[[EMPTY]] :
// CHECK-NEXT: ^bb0(%[[IN0:.*]]: f32, %[[IN1:.*]]: f32, %[[OUT:.*]]: f32):
// CHECK-NEXT: %[[EXP:.*]] = math.exp %[[IN0]]
// CHECK-NEXT: %[[MUL:.*]] = arith.mulf %[[EXP]], %[[IN1]]
// CHECK-NEXT: %[[ADD:.*]] = arith.addf %[[OUT]], %[[MUL]]
// CHECK-NEXT: linalg.yield %[[ADD]]
// CHECK-NOT: linalg.generic

Review thread on this test:

Contributor (Author):
@rengolin Per your request I have added this and the test below. Please let me know if there's more you would like to see.

Member:
Yes! That's the kind of thing we need to be testing. If the [...]. This also applies to [...].

Contributor (Author):
Ah yes, I forgot.

Contributor (Author):
@rengolin Actually, why not? I think this is only conditionally true. One case I'm thinking of that involves transpose is valid for fusion: for example, [...] would fuse to [...]. Broadcast cases can also be valid: for example, [...] fuses to [...].
I'll still need to think about cases, with valid input IR, that should NOT result in fusion for the elementwise + matmul case, but I think I will need a more complex case than this to show that. I just wanted to make sure it is agreed that the above cases are valid fusion cases. And again, if you have a specific test case in mind that I'm not thinking of, I will certainly investigate/add it.

Contributor (Author):
The above were produced with the builtin [...].

Member:
You're right, I should have said "may not be fused" instead.

Contributor (Author):
Cool, thanks. BTW, it is not my intention to burden you with coming up with test cases for these changes; I completely accept that responsibility myself. I just mean that if you already have something in mind, please share, so this is more about hoping for help than expecting it. I will continue adding more cases that I think are illustrative enough of the scope of the changes here.
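For concreteness, here is a hedged sketch of the transpose and broadcast cases mentioned in the thread above. The function names, shapes, and the choice of linalg.transpose/linalg.broadcast producers are illustrative assumptions, not tests taken from this PR; whether a given input actually fuses still depends on the pattern's indexing-map checks.

// Illustrative only: a transpose producer feeding a matmul.
func.func @transpose_matmul(%in1: tensor<10x8xf32>, %in2: tensor<10x12xf32>) -> tensor<8x12xf32> {
  %e0 = tensor.empty() : tensor<8x10xf32>
  %t = linalg.transpose ins(%in1 : tensor<10x8xf32>) outs(%e0 : tensor<8x10xf32>) permutation = [1, 0]
  %e1 = tensor.empty() : tensor<8x12xf32>
  %matmul = linalg.matmul ins(%t, %in2 : tensor<8x10xf32>, tensor<10x12xf32>) outs(%e1 : tensor<8x12xf32>) -> tensor<8x12xf32>
  return %matmul : tensor<8x12xf32>
}

// Illustrative only: a broadcast producer feeding a matmul.
func.func @broadcast_matmul(%in1: tensor<10xf32>, %in2: tensor<10x12xf32>) -> tensor<8x12xf32> {
  %e0 = tensor.empty() : tensor<8x10xf32>
  %b = linalg.broadcast ins(%in1 : tensor<10xf32>) outs(%e0 : tensor<8x10xf32>) dimensions = [0]
  %e1 = tensor.empty() : tensor<8x12xf32>
  %matmul = linalg.matmul ins(%b, %in2 : tensor<8x10xf32>, tensor<10x12xf32>) outs(%e1 : tensor<8x12xf32>) -> tensor<8x12xf32>
  return %matmul : tensor<8x12xf32>
}

In both sketches the producer's indexing maps are permutations or broadcasts, so, per the discussion, they can in principle be composed into the matmul's maps to form a single linalg.generic.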
// -----

func.func @matmul_map(%in1: tensor<8x10xf32>, %in2: tensor<10x12xf32>) -> tensor<8x12xf32> {
  %fill1 = tensor.empty() : tensor<8x12xf32>
  %matmul = linalg.matmul ins(%in1, %in2 : tensor<8x10xf32>, tensor<10x12xf32>) outs(%fill1 : tensor<8x12xf32>) -> tensor<8x12xf32>
  %exp = linalg.map {math.exp} ins(%matmul : tensor<8x12xf32>) outs(%fill1: tensor<8x12xf32>)
  return %exp : tensor<8x12xf32>
}

// Should not fuse
// CHECK-LABEL: func @matmul_map
// CHECK-NEXT: tensor.empty
// CHECK-NEXT: linalg.matmul
// CHECK-NEXT: linalg.map
// CHECK-NEXT: return

// -----

// In this test we expect the first two linalg.generic operations to be fused into one, but the third one (the matmul) to remain separate.
// The reason is that when the pattern is applied the 1st time, the fusion of the first two operations produces a fused operation with
-// an additional result and ana dditional output indexing map that is not a permutation / not invertible.
+// an additional result and an additional output indexing map that is not a permutation / not invertible.
// The fused op will still produce also the original result (and its output indexing map), which is preserved because the new indexing map
// is not invertible. Thus the fused op will have 2 results, but only the 2nd one will be used by the following matmul op as an input argument.
// When trying to apply the fusion pattern again, the matmul op won't be fused because the operand to fuse was not produced with an invertible indexing map.
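As an aside on the "not a permutation / not invertible" wording above, a minimal illustration of the distinction (maps chosen purely for illustration, not taken from the test):

// A permutation of (d0, d1); it can be inverted when composing producer and consumer maps.
#map_permutation = affine_map<(d0, d1) -> (d1, d0)>
// Drops d1 entirely, so it is not a permutation and has no inverse.
#map_projection = affine_map<(d0, d1) -> (d0)>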
@@ -1079,4 +1145,4 @@ module {
// CHECK-NOT: linalg.generic
// CHECK: tensor.expand_shape
// CHECK: linalg.generic {{.*}}, iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel", "reduction"]}
// CHECK-SAME: ins(%[[ARG0]], %[[FUSED]]#1 : tensor<1x1x2x1xf32>, tensor<4x1x1x1xf32>)
Review thread on the fused-op construction in the pattern implementation:
This is wrong. Please use interface methods to do it correctly:
llvm-project/mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
Line 724 in cdf52a1
or
llvm-project/mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
Line 491 in cdf52a1
getRegionBuilder is wrong since that just gets a function ref, but I will use getBlock. Thanks for pointing it out.

I'm actually unsure how to apply this suggestion. The other one above makes sense, since there I'm just getting blocks that have already been created. However, this one is creating a block in the op's region, and I don't see an interface method for getting a region. If region 0 isn't guaranteed, shouldn't the interface have a method for that?

Anyway, fusedOp doesn't have to be a LinalgOp for these initial changes, since the function that calls this one explicitly creates a GenericOp (see https://github.com/llvm/llvm-project/pull/144922/files#diff-a7543973103a3f3abb605911ca6d141dc4ffd4782b2bc0ad57890d11ab72e2c1R422). So it's probably better to just declare fusedOp as GenericOp for this function and revert this line. Any thoughts?

@rengolin and I had a discussion back when this PR first went up about generating named ops post-fusion when possible. I think it was agreed that it makes sense to leave this as a TODO (see #144922 (comment)). So when we get there we can revisit how to do this, unless there's an obvious solution now.
Probably better to declare it as a GenericOp, yes, since the transformation always (today) returns a GenericOp anyway.

I think that is a different problem. My main concern is expecting something from an interface when the interface doesn't guarantee it.
It is a separate problem. I just meant that when we make the changes to fuse into named ops (when possible), LinalgOp might be re-introduced here and we would run into this issue again. But it is possible that more refactoring would need to happen anyway, so we wouldn't necessarily run back into this issue here specifically.
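To make the state of the discussion concrete, here is a hedged sketch, assuming the behavior described above, of what fusion currently produces for the @map_ops test earlier in this file: the fused result is always materialized as a linalg.generic today, even when the payload would still be expressible as a named op, and regenerating named ops after fusion is the TODO referenced above. The function name and SSA value names below are illustrative only.

#map = affine_map<(d0) -> (d0)>
func.func @fused_as_generic(%in1: tensor<8xf32>, %in2: tensor<8xf32>) -> tensor<8xf32> {
  %empty = tensor.empty() : tensor<8xf32>
  // The fused op: add followed by sqrt in a single generic payload.
  %res = linalg.generic
      {indexing_maps = [#map, #map, #map], iterator_types = ["parallel"]}
      ins(%in1, %in2 : tensor<8xf32>, tensor<8xf32>)
      outs(%empty : tensor<8xf32>) {
  ^bb0(%a: f32, %b: f32, %out: f32):
    %add = arith.addf %a, %b : f32
    %sqrt = math.sqrt %add : f32
    linalg.yield %sqrt : f32
  } -> tensor<8xf32>
  return %res : tensor<8xf32>
}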