Commit 29b1054

[mlir][linalg] Update pack and unpack documentation (#143903)
* Clarified the `inner_dims_pos` attribute in the case of high-dimensionality tensors.
* Added 5D examples to showcase the use cases that triggered this update.
* Added a reminder for `linalg.unpack` that the number of elements is not required to be the same between input and output, because padding is dropped.

I encountered some odd variations of `linalg.pack` and `linalg.unpack` while working on some TFLite models, and the definition in the documentation did not match what I saw pass IR verification. The following changes reconcile those differences.

---------

Signed-off-by: Christopher McGirr <[email protected]>
1 parent 33d2082 commit 29b1054

3 files changed, +162 -19 lines changed

mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td

Lines changed: 57 additions & 19 deletions
@@ -93,17 +93,21 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     tensor of rank `n + k` with a tiled and packed layout (maybe with padding)
     and optionally transposes the tiled source tensor dimensions.
 
-    `inner_dims_pos` (mandatory) specifies `k` source tensor dimensions that are
-    being tiled, where `0 < k <= n`. The order of the dimensions matters:
-     - The tiled dimensions (of size `inner_tiles`) are added to the end of the result
-     tensor in the order in which they appear in `inner_dims_pos`.
-     - `inner_dims_pos[i]` specifies the source tensor dimension tiled by
-     `inner_tiles[i]`.
-
     `inner_tiles` (mandatory) specifies `k` tile sizes. These tile sizes
     correspond to the least significant ("inner") result tensor dimension sizes,
     in the same order. Tile sizes can be static or dynamic.
 
+    `inner_dims_pos` (mandatory) specifies `k` source tensor dimensions that are
+    being tiled, where `0 <= k <= n`.
+     - `inner_dims_pos[i]` specifies the source tensor dimension tiled by
+     `inner_tiles[i]` where `0 <= i < k`. All the values in `inner_dims_pos` are
+     within [0, n).
+     - The tiled dimensions (of size `inner_tiles`) are added to the end of the
+     result tensor in the order in which they appear, i.e.
+     `shape(result)[rank(source) + i] = inner_tiles[i]` for `0 <= i < k`.
+     - The following relationship for the tiled dimensions holds:
+     `shape(result)[inner_dims_pos[i]] = shape(source)[inner_dims_pos[i]] / inner_tiles[i]`.
+
     Example: If `inner_tiles = [16, 32]`, the result tensor has a shape of
     `...x16x32`. If `inner_dims_pos = [0, 1]`, the 0th source dimension is tiled
     by 16 and the 1st source dimension is tiled by 32. Other source dimensions
@@ -116,7 +120,19 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     %0 = linalg.pack %source inner_dims_pos = [0, 1] inner_tiles = [8, 32]
         into %dest : tensor<128x256xf32> -> tensor<16x8 x 8x32 xf32>
     //                                             \  /   \  /
-    //                                         outer dims  inner dims
+    //                                 Outer Dims: 16x8   Inner Dims: 8x32
+
+    // CHW to CHWwh
+    %0 = linalg.pack %source inner_dims_pos = [2, 1] inner_tiles = [4, 2]
+        into %dest : tensor<3x20x24xf32> -> tensor<3x10x6 x 4x2 xf32>
+    //                                             \    /   \ /
+    //                                 Outer Dims: 3x10x6   Inner Dims: 4x2
+
+    // HCW to HCWwh
+    %0 = linalg.pack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2]
+        into %dest : tensor<18x3x32xf32> -> tensor<9x3x8 x 4x2 xf32>
+    //                                             \   /   \ /
+    //                                 Outer Dims: 9x3x8   Inner Dims: 4x2
     ```
 
     `outer_dims_perm` (optional) specifies a permutation for the outer
@@ -246,13 +262,6 @@ def Linalg_UnPackOp : Linalg_RelayoutOp<"unpack"> {
     The "unpack" operation converts a source tensor of rank `n` with a tiled and
     packed layout to a result tensor of rank `n - k`.
 
-    `inner_dims_pos` (mandatory) specifies `k` source tensor dimensions with
-    which the last `k` source tensor dimensions are combined, where
-    `0 < k <= n/2`. Each `inner_dims_pos` element must be `>= 0` and `< n - k`.
-    The order of the dimensions in `inner_dims_pos` matters: dimension
-    `inner_dims_pos[i]` is combined with dimension `n - k + i` (assuming that
-    `outer_dims_perm` is not specified).
-
     `inner_tiles` (mandatory) specifies `k` tile sizes. These tile sizes
     correspond to the least significant ("inner") source tensor dimension sizes.
     The behavior of this op is undefined if:
@@ -262,21 +271,50 @@ def Linalg_UnPackOp : Linalg_RelayoutOp<"unpack"> {
     `inner_dims_pos[i]` (assuming that `outer_dims_perm` is not specified)
     evenly.
 
+    `inner_dims_pos` (mandatory) specifies `k` result tensor (i.e. unpacked
+    tensor) dimensions that were tiled with the `inner_tiles` to create the
+    packed source tensor. The source tensor (i.e. packed tensor) dimensions can
+    be unpacked given `inner_dims_pos` as follows.
+     - For `0 <= i < k` the following relationship holds:
+     `shape(result)[inner_dims_pos[i]] <= shape(source)[n-k+i] * shape(source)[inner_dims_pos[i]]`.
+     - For `0 <= j < n-k` and `j` not in `inner_dims_pos` the following relationship holds:
+     `shape(result)[j] = shape(source)[j]`.
+
     `outer_dims_perm` (optional) specifies a permutation for the outer
     dimensions. If specified, it must have `n - k` elements. If specified, this
     permutation is applied before combining any dimensions.
 
-    Example:
+    Note, the unpack operation may drop any padding introduced by the pack
+    operation and hence the following holds
+    `NumElementsOf(source) >= NumElementsOf(result)`.
+
+    Examples:
 
     ```mlir
     // NCnc to NC:
     %0 = linalg.unpack %source inner_dims_pos = [0, 1] inner_tiles = [8, 32]
-        into %dest : tensor<16x8x8x32xf32> -> tensor<128x256xf32>
+        into %dest : tensor<16x8 x 8x32 xf32> -> tensor<128x256xf32>
+    //                      \  /   \  /
+    //          Outer Dims: 16x8   Inner Dims: 8x32
 
     // CK to KCck:
     %0 = linalg.unpack %source outer_dims_perm = [1, 0] inner_dims_pos = [0, 1]
-        inner_tiles = [8, 32] into %dest
-        : tensor<8x16x8x32xf32> -> tensor<128x256xf32>
+        inner_tiles = [8, 32]
+        into %dest : tensor<8x16 x 8x32 xf32> -> tensor<128x256xf32>
+    //                      \  /   \  /
+    //          Outer Dims: 8x16   Inner Dims: 8x32
+
+    // CHWwh to CHW:
+    %0 = linalg.unpack %source inner_dims_pos = [2, 1] inner_tiles = [4, 2]
+        into %dest : tensor<3x10x6 x 4x2 xf32> -> tensor<3x20x24xf32>
+    //                      \    /   \ /
+    //          Outer Dims: 3x10x6   Inner Dims: 4x2
+
+    // HCWwh to HCW:
+    %0 = linalg.unpack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2]
+        into %dest : tensor<9x3x8 x 4x2 xf32> -> tensor<18x3x32xf32>
+    //                      \   /   \ /
+    //          Outer Dims: 9x3x8   Inner Dims: 4x2
     ```
   }];
   let arguments = (ins AnyRankedTensor:$source,
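
The shape rules documented for `linalg.pack` above can be checked against the examples in this diff with a small Python sketch. `pack_result_shape` is a hypothetical helper written for illustration, not an MLIR API. It assumes static shapes and no `outer_dims_perm`, and it rounds the tiled extent up, which is an assumption inferred from the padded `tensor<1x5x7xf32> -> tensor<1x3x2x4x2xf32>` test added in this commit rather than from the wording of the formula.

```python
def pack_result_shape(source_shape, inner_dims_pos, inner_tiles):
    """Hypothetical helper: static result shape of a pack without outer_dims_perm."""
    n, k = len(source_shape), len(inner_tiles)
    if len(inner_dims_pos) != k:
        raise ValueError("inner_dims_pos and inner_tiles must have the same length")
    if any(not 0 <= pos < n for pos in inner_dims_pos):
        raise ValueError("every inner_dims_pos value must lie in [0, n)")

    outer = list(source_shape)
    for pos, tile in zip(inner_dims_pos, inner_tiles):
        # Ceiling division (assumption): a partial trailing tile still needs a full slot.
        outer[pos] = -(-source_shape[pos] // tile)
    # The k tile sizes are appended in order: result[n + i] = inner_tiles[i].
    return outer + list(inner_tiles)


examples = [
    ([128, 256], [0, 1], [8, 32]),   # 2D example from the documentation
    ([3, 20, 24], [2, 1], [4, 2]),   # tiles applied to dims 2 and 1
    ([18, 3, 32], [2, 0], [4, 2]),   # non-adjacent tiled dims
    ([1, 5, 7], [2, 1], [4, 2]),     # extents not divisible by the tiles
]
for shape, pos, tiles in examples:
    print(shape, "->", pack_result_shape(shape, pos, tiles))
```

The first three results match the `tensor<...>` result types in the documentation examples, and the last one matches the padded test case.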

mlir/test/Dialect/Linalg/invalid.mlir

Lines changed: 10 additions & 0 deletions
@@ -1824,6 +1824,16 @@ func.func @unpack_invalid_outer_dims_perm(%source: tensor<128x256xf32>, %dest: t
 
 // -----
 
+// The outer dims in the output tensor are incorrectly/unexpectedly transposed.
+// This could be fixed by adding `outer_dims_perm = [1, 0]` (the default value assumes no transpose).
+func.func @pack_invalid_result_shape(%input: tensor<256x128xf32>, %output: tensor<4x16x32x16xf32>) -> tensor<4x16x32x16xf32> {
+  // expected-error@+1 {{the shape of output is not large enough to hold the packed data. Expected at least 'tensor<16x4x32x16xf32>', got 'tensor<4x16x32x16xf32>'}}
+  %0 = linalg.pack %input inner_dims_pos = [1, 0] inner_tiles = [32, 16] into %output : tensor<256x128xf32> -> tensor<4x16x32x16xf32>
+  return %0 : tensor<4x16x32x16xf32>
+}
+
+// -----
+
 func.func @pack_invalid(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
   // expected-error@+1 {{the shape of output is not large enough to hold the packed data. Expected at least 'tensor<8x8x16x32xf32>', got 'tensor<8x8x32x16xf32>'}}
   %0 = linalg.pack %input inner_dims_pos = [1, 0] inner_tiles = [16, 32] into %output : tensor<256x128xf32> -> tensor<8x8x32x16xf32>
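
The two diagnostics above can be reproduced on paper with a short Python sketch. `expected_packed_shape` and `check_dest` are hypothetical names and this is not the verifier's implementation. The sketch assumes static shapes, compares for an exact match for simplicity (the real diagnostic says "Expected at least"), and assumes that outer dimension `i` of the result is taken from tiled source dimension `outer_dims_perm[i]`.

```python
def expected_packed_shape(source_shape, inner_dims_pos, inner_tiles, outer_dims_perm=None):
    """Hypothetical helper: smallest static destination shape for a pack."""
    outer = list(source_shape)
    for pos, tile in zip(inner_dims_pos, inner_tiles):
        outer[pos] = -(-outer[pos] // tile)  # ceiling division
    if outer_dims_perm is not None:
        # Assumed convention: result outer dim i comes from tiled source dim perm[i].
        outer = [outer[p] for p in outer_dims_perm]
    return outer + list(inner_tiles)


def check_dest(dest_shape, source_shape, inner_dims_pos, inner_tiles, outer_dims_perm=None):
    """Return None when dest matches the expected shape, else a short message."""
    expected = expected_packed_shape(source_shape, inner_dims_pos, inner_tiles, outer_dims_perm)
    if list(dest_shape) == expected:
        return None
    want = "x".join(str(d) for d in expected)
    got = "x".join(str(d) for d in dest_shape)
    return f"expected {want}, got {got}"


src, pos, tiles = [256, 128], [1, 0], [32, 16]
print(check_dest([4, 16, 32, 16], src, pos, tiles))          # mismatch: outer dims swapped
print(check_dest([4, 16, 32, 16], src, pos, tiles, [1, 0]))  # None once the permutation is given
```

Source dimension 1 (128) is tiled by 32 and dimension 0 (256) by 16, so the untransposed outer dims are `16x4`. That is why `4x16x32x16` is only valid with `outer_dims_perm = [1, 0]`, as the test comment suggests.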

mlir/test/Dialect/Linalg/named-ops.mlir

Lines changed: 95 additions & 0 deletions
@@ -2771,6 +2771,101 @@ func.func @pad_and_pack_partially_dynamic(%source: tensor<?x?xf32>, %dest: tenso
 
 // -----
 
+func.func @pack_transposed_inner_dims_with_padding(%source: tensor<1x5x7xf32>, %dest: tensor<1x3x2x4x2xf32>, %pad: f32) -> tensor<1x3x2x4x2xf32> {
+  %0 = linalg.pack %source padding_value(%pad : f32) inner_dims_pos = [2, 1] inner_tiles = [4, 2] into %dest : tensor<1x5x7xf32> -> tensor<1x3x2x4x2xf32>
+  return %0 : tensor<1x3x2x4x2xf32>
+}
+
+// CHECK-LABEL: func.func @pack_transposed_inner_dims_with_padding(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<1x5x7xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<1x3x2x4x2xf32>,
+// CHECK-SAME: %[[PAD:.*]]: f32)
+// CHECK: %{{.*}} = linalg.pack
+// CHECK-SAME: inner_dims_pos = [2, 1]
+// CHECK-SAME: inner_tiles = [4, 2]
+// CHECK-SAME: into %[[DEST]] : tensor<1x5x7xf32> -> tensor<1x3x2x4x2xf32>
+
+// -----
+
+// The function suffix "with_padding" refers to the padding that was introduced by the pack operation. Here,
+// the unpack drops that padding, creating a tensor with fewer elements than we started with.
+func.func @unpack_descending_inner_dims_with_padding(%source: tensor<1x3x2x4x2xf32>, %dest: tensor<1x5x7xf32>) -> tensor<1x5x7xf32> {
+  %0 = linalg.unpack %source inner_dims_pos = [2, 1] inner_tiles = [4, 2] into %dest : tensor<1x3x2x4x2xf32> -> tensor<1x5x7xf32>
+  return %0 : tensor<1x5x7xf32>
+}
+
+// CHECK-LABEL: func.func @unpack_descending_inner_dims_with_padding(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<1x3x2x4x2xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<1x5x7xf32>)
+// CHECK: %{{.*}} = linalg.unpack
+// CHECK-SAME: inner_dims_pos = [2, 1]
+// CHECK-SAME: inner_tiles = [4, 2]
+// CHECK-SAME: into %[[DEST]] : tensor<1x3x2x4x2xf32> -> tensor<1x5x7xf32>
+
+// -----
+
+func.func @pack_non_adjacent_inner_dims(%source: tensor<20x1x12xf32>, %dest: tensor<10x1x3x4x2xf32>) -> tensor<10x1x3x4x2xf32> {
+  %0 = linalg.pack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2] into %dest : tensor<20x1x12xf32> -> tensor<10x1x3x4x2xf32>
+  return %0 : tensor<10x1x3x4x2xf32>
+}
+
+// CHECK-LABEL: func.func @pack_non_adjacent_inner_dims(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<20x1x12xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<10x1x3x4x2xf32>)
+// CHECK: %{{.*}} = linalg.pack
+// CHECK-SAME: inner_dims_pos = [2, 0]
+// CHECK-SAME: inner_tiles = [4, 2]
+// CHECK-SAME: into %[[DEST]] : tensor<20x1x12xf32> -> tensor<10x1x3x4x2xf32>
+
+// -----
+
+func.func @unpack_non_adjacent_inner_dims(%source: tensor<10x1x3x4x2xf32>, %dest: tensor<20x1x12xf32>) -> tensor<20x1x12xf32> {
+  %0 = linalg.unpack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2] into %dest : tensor<10x1x3x4x2xf32> -> tensor<20x1x12xf32>
+  return %0 : tensor<20x1x12xf32>
+}
+
+// CHECK-LABEL: func.func @unpack_non_adjacent_inner_dims(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<10x1x3x4x2xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<20x1x12xf32>)
+// CHECK: %{{.*}} = linalg.unpack
+// CHECK-SAME: inner_dims_pos = [2, 0]
+// CHECK-SAME: inner_tiles = [4, 2]
+// CHECK-SAME: into %[[DEST]] : tensor<10x1x3x4x2xf32> -> tensor<20x1x12xf32>
+
+// -----
+
+func.func @pack_implementing_transpose(%source: tensor<3x5x7xf32>, %dest: tensor<3x7x5xf32>) -> tensor<3x7x5xf32> {
+  %0 = linalg.pack %source outer_dims_perm = [0, 2, 1] inner_dims_pos = [] inner_tiles = [] into %dest : tensor<3x5x7xf32> -> tensor<3x7x5xf32>
+  return %0 : tensor<3x7x5xf32>
+}
+
+// CHECK-LABEL: func.func @pack_implementing_transpose(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<3x5x7xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<3x7x5xf32>)
+// CHECK: %{{.*}} = linalg.pack
+// CHECK-SAME: outer_dims_perm = [0, 2, 1]
+// CHECK-SAME: inner_dims_pos = []
+// CHECK-SAME: inner_tiles = []
+// CHECK-SAME: into %[[DEST]] : tensor<3x5x7xf32> -> tensor<3x7x5xf32>
+
+// -----
+
+func.func @unpack_implementing_transpose(%source: tensor<3x7x5xf32>, %dest: tensor<3x5x7xf32>) -> tensor<3x5x7xf32> {
+  %0 = linalg.unpack %source outer_dims_perm = [0, 2, 1] inner_dims_pos = [] inner_tiles = [] into %dest : tensor<3x7x5xf32> -> tensor<3x5x7xf32>
+  return %0 : tensor<3x5x7xf32>
+}
+
+// CHECK-LABEL: func.func @unpack_implementing_transpose(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<3x7x5xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<3x5x7xf32>)
+// CHECK: %{{.*}} = linalg.unpack
+// CHECK-SAME: outer_dims_perm = [0, 2, 1]
+// CHECK-SAME: inner_dims_pos = []
+// CHECK-SAME: inner_tiles = []
+// CHECK-SAME: into %[[DEST]] : tensor<3x7x5xf32> -> tensor<3x5x7xf32>
+
+// -----
+
 func.func @unpack_fully_dynamic(%source: tensor<?x?x?x?xf32>, %dest: tensor<?x?xf32>, %tile_n : index, %tile_m : index) -> tensor<?x?xf32> {
   %0 = linalg.unpack %source inner_dims_pos = [0, 1] inner_tiles = [%tile_n, %tile_m] into %dest : tensor<?x?x?x?xf32> -> tensor<?x?xf32>
   return %0 : tensor<?x?xf32>
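
The unpack relations from the updated documentation can be checked against the padded and non-adjacent tests above with a minimal Python sketch. `unpack_shapes_consistent` is a hypothetical helper, not part of MLIR. It assumes static shapes and no `outer_dims_perm`, and it checks only the two documented relations, not everything the real verifier enforces.

```python
from math import prod


def unpack_shapes_consistent(source_shape, result_shape, inner_dims_pos):
    """Hypothetical check of the documented unpack relations (no outer_dims_perm)."""
    n, k = len(source_shape), len(inner_dims_pos)
    if len(result_shape) != n - k:
        return False
    for i, pos in enumerate(inner_dims_pos):
        # The unpacked extent may be smaller than outer * tile because padding is dropped.
        if result_shape[pos] > source_shape[n - k + i] * source_shape[pos]:
            return False
    # Dimensions that were never tiled keep their size.
    return all(
        result_shape[j] == source_shape[j]
        for j in range(n - k)
        if j not in inner_dims_pos
    )


packed, unpacked = [1, 3, 2, 4, 2], [1, 5, 7]
print(unpack_shapes_consistent(packed, unpacked, [2, 1]))
print(prod(packed), ">=", prod(unpacked))  # 48 >= 35
```

For `@unpack_descending_inner_dims_with_padding`, the packed source holds 48 elements while the result holds 35, which is the `NumElementsOf(source) >= NumElementsOf(result)` case called out in the documentation.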
