Merged
@@ -2,9 +2,7 @@
// DEFINE: -transform-interpreter -test-transform-dialect-erase-schedule |\
// DEFINE: mlir-opt --test-linalg-transform-patterns="test-decompose-tensor-pack"\
// DEFINE: --test-transform-dialect-erase-schedule \
// DEFINE: -one-shot-bufferize="bufferize-function-boundaries" \
// DEFINE: -buffer-deallocation-pipeline="private-function-dynamic-ownership" \
// DEFINE: -cse -canonicalize -test-lower-to-llvm -o %t
// DEFINE: -test-lower-to-llvm -o %t
// DEFINE: %{entry_point} = main
// DEFINE: %{run} = mlir-cpu-runner %t -e %{entry_point} -entry-point-result=void \
// DEFINE: -shared-libs=%mlir_runner_utils,%mlir_c_runner_utils
@@ -84,12 +82,30 @@ func.func private @pack(%A: tensor<7x16xi32>) {
}

module @transforms attributes { transform.with_named_sequence } {
transform.named_sequence @__transform_main(%module: !transform.any_op {transform.readonly}) {
transform.named_sequence @__transform_main(%module: !transform.any_op {transform.consume}) {
%pack = transform.structured.match ops{["tensor.pack"]} in %module : (!transform.any_op) -> !transform.any_op

%tiled_linalg_op_p, %loops:2 = transform.structured.tile_using_for %pack tile_sizes [1, 1]
// 1. Tile so that we can decompose tensor.pack into tensor.pad,
// linalg.transpose, etc (see step 2).
Contributor


I found this in LinalgTransformOps.td, which answered my question:
Rewrite a tensor.pack into tensor.pad + tensor.expand_shape + linalg.transpose.

Using that wording would remove the need for the ambiguous "linalg.transpose, etc (see step 2)".

I don't know if it's too much to ask, and there may be a test example elsewhere, but what is

%A_pack = tensor.pack %A
    padding_value(%pad_val : i32)
    inner_dims_pos = [0, 1]
    inner_tiles = [%tile_size, 1]
    into %A_pack_empty : tensor<7x16xi32> -> tensor<?x16x?x1xi32>
  %A_cast = tensor.cast %A_pack : tensor<?x16x?x1xi32> to tensor<*xi32>

expected to become? Even a rough sketch in a comment would be super.

Contributor Author


This is a valid request - thank you for bringing it up, and please don’t hesitate to ask in the future. I've been working on this for so long that I may have lost perspective on what's obvious versus what could use more explanation.

I've added some additional comments to clarify what’s happening here. I’ve intentionally skipped some finer details to keep the explanation focused and easier to follow.
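
For readers following the thread, here is a rough reference model of what the pack computes, written as the pad + expand_shape + transpose sequence quoted from LinalgTransformOps.td above. It is only a sketch of the semantics in plain Python, not the IR the pass actually emits after tiling. The function name `pack`, the tile size 3, the pad value 123, and the input values are all assumptions made for illustration; in the test the tile size is dynamic.

```python
def pack(src, tile_rows, tile_cols, pad_val):
    """Model a 2-D pack with inner_dims_pos = [0, 1] on nested lists.

    Hypothetical helper for illustration only; not part of MLIR.
    """
    rows, cols = len(src), len(src[0])
    # Number of tiles along each dimension (ceiling division).
    n_r = -(-rows // tile_rows)
    n_c = -(-cols // tile_cols)

    # Step 1 (tensor.pad): pad on the high side up to a multiple of the tile size.
    padded = [
        [src[r][c] if r < rows and c < cols else pad_val
         for c in range(n_c * tile_cols)]
        for r in range(n_r * tile_rows)
    ]

    # Steps 2 and 3 (tensor.expand_shape + linalg.transpose): treat the padded
    # data as [n_r, tile_rows, n_c, tile_cols], then permute the dimensions to
    # [n_r, n_c, tile_rows, tile_cols].
    return [
        [
            [
                [padded[i * tile_rows + k][j * tile_cols + l]
                 for l in range(tile_cols)]
                for k in range(tile_rows)
            ]
            for j in range(n_c)
        ]
        for i in range(n_r)
    ]


# Assumed input: a 7x16 matrix whose entries encode their own position.
A = [[r * 16 + c for c in range(16)] for r in range(7)]
# Assumed tile size 3 for the dynamic dimension; inner_tiles = [3, 1].
packed = pack(A, 3, 1, 123)
```

With these assumed values the result has shape 3x16x3x1, which matches `tensor<?x16x?x1xi32>` with both dynamic sizes equal to 3. The last row tile holds one real row (row 6 of `A`) followed by two rows of padding.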

%tiled_pack_op_p, %loops:2 = transform.structured.tile_using_for %pack tile_sizes [1, 1]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)

// 2. Decompose the tiled Op into tensor.pad etc
%func_1 = transform.get_parent_op %tiled_pack_op_p {isolated_from_above} : (!transform.any_op) -> !transform.any_op
transform.apply_patterns to %func_1 {
transform.apply_patterns.linalg.decompose_pack_unpack
} : !transform.any_op

// 3. Bufferize before lowering to LLVM
%bufferize = transform.bufferization.one_shot_bufferize %module
{bufferize_function_boundaries=true} : (!transform.any_op) -> !transform.any_op

// 4. Canonicalize
%func_2 = transform.structured.match ops{["func.func"]} in %bufferize : (!transform.any_op) -> !transform.op<"func.func">
transform.apply_patterns to %func_2 {
transform.apply_patterns.canonicalization
} : !transform.op<"func.func">

transform.yield
}
}