@@ -14,6 +14,7 @@
#include "mlir/Dialect/UB/IR/UBOps.h"
#include "mlir/Dialect/Utils/IndexingUtils.h"
#include "mlir/IR/Dominance.h"
#include "mlir/IR/TypeUtilities.h"
#include "llvm/ADT/SetOperations.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/TypeSwitch.h"
@@ -289,9 +290,11 @@ getOrCreatePackedViewOfOperand(OpBuilder &b, Location loc, PackInfo packInfo,

   auto empty = linalg::PackOp::createDestinationTensor(
       b, loc, opOperand->get(), innerTileSizes, innerDimsPos, outerDimsPerm);
-  auto packedOperand = linalg::PackOp::create(
-      b, loc, opOperand->get(), empty, innerDimsPos, innerTileSizes,
-      /*padding=*/std::nullopt, outerDimsPerm);
+  auto poison = ub::PoisonOp::create(
+      b, loc, getElementTypeOrSelf(opOperand->get().getType()));
Contributor (comment on lines +342 to +343):
Rather than having the user explicitly create the pad value, why not take an approach similar to #146088? (i.e., make the padding value "optional" and let the builder worry about it.)
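
A minimal sketch of that direction (a hypothetical helper; the name and the exact #146088 API are assumptions, not the actual patch): the caller passes no pad value, and the builder materializes a ub.poison only when requirePaddingValue says padding may occur.

static linalg::PackOp createPackOpWithDefaultPadding(
    OpBuilder &b, Location loc, Value source, Value dest,
    ArrayRef<int64_t> innerDimsPos, ArrayRef<OpFoldResult> innerTiles,
    ArrayRef<int64_t> outerDimsPerm) {
  auto srcType = cast<RankedTensorType>(source.getType());
  auto destType = cast<RankedTensorType>(dest.getType());
  // Leave the padding unset unless static information cannot rule it out.
  std::optional<Value> pad;
  if (linalg::PackOp::requirePaddingValue(srcType.getShape(), innerDimsPos,
                                          destType.getShape(), outerDimsPerm,
                                          innerTiles))
    pad = ub::PoisonOp::create(b, loc, getElementTypeOrSelf(srcType))
              .getResult();
  return linalg::PackOp::create(b, loc, source, dest, innerDimsPos,
                                innerTiles, pad, outerDimsPerm);
}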

Contributor Author:
I think it is okay to have a separate builder that does what you shared, and it makes sense. However, we don't do that in this builder, because users can decide, at their own risk, not to use padding values with dynamic shapes. I think it is okay to create a new builder with that behavior, but we should leave the current builder as it is. See the explanation below for some details. (One of the differences is that padding is optional for pack ops, but padding is required by the vector.transfer_read op.)
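
For illustration (IR sketched by hand, not from this patch; %t : tensor<16x64xf32> is assumed given): transfer_read has a mandatory padding operand even when the read is in bounds, while pack may simply omit the padding_value clause.

%c0 = arith.constant 0 : index
%pad = arith.constant 0.0 : f32
// The padding scalar %pad is a required operand here.
%v = vector.transfer_read %t[%c0, %c0], %pad {in_bounds = [true, true]}
    : tensor<16x64xf32>, vector<8x16xf32>
%dst = tensor.empty() : tensor<1x4x16x16xf32>
// padding_value(...) is optional syntax here and is simply omitted.
%p = linalg.pack %t inner_dims_pos = [0, 1] inner_tiles = [16, 16]
    into %dst : tensor<16x64xf32> -> tensor<1x4x16x16xf32>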

It is hard for the builder to check whether a padding value is required for dynamic shapes. To me, there is a soft check and a hard check for the padding-value requirement. E.g., the existing method is a soft check that returns false in dynamic cases. I'm proposing a hard check (in IREE's issue), to be used together with the control that @nirvedhmeshram is working on.

(The hard check version returns true in dynamic cases.)

bool PackOp::requirePaddingValue(ArrayRef<int64_t> inputShape,
                                 ArrayRef<int64_t> innerDimsPos,
                                 ArrayRef<int64_t> outputShape,
                                 ArrayRef<int64_t> outerDimsPerm,
                                 ArrayRef<OpFoldResult> innerTiles) {
  SmallVector<int64_t> outputTileSizes(
      outputShape.take_front(inputShape.size()));
  if (!outerDimsPerm.empty()) {
    assert(outerDimsPerm.size() == outputTileSizes.size() &&
           "expected output and outer_dims_perm to have same size");
    applyPermutationToVector(outputTileSizes,
                             invertPermutationVector(outerDimsPerm));
  }
  for (auto [pos, tileSize] : llvm::zip_equal(innerDimsPos, innerTiles)) {
    // Soft check: a dynamic input dim is skipped, i.e. treated as not
    // requiring a padding value.
    if (ShapedType::isDynamic(inputShape[pos]))
      continue;
    std::optional<int64_t> constantTile = getConstantIntValue(tileSize);

    if (!constantTile) {
      // Dynamic tile size: fall back to the static output tile size.
      if (ShapedType::isStatic(outputTileSizes[pos]) &&
          (inputShape[pos] % outputTileSizes[pos] != 0))
        return true;
    } else if (inputShape[pos] % (*constantTile) != 0) {
      return true;
    }
  }
  return false;
}
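
A sketch of the hard-check variant I have in mind (hypothetical; the real signature may differ): anything that cannot be statically proven divisible, including dynamic dims and non-constant tiles, is treated as requiring a padding value.

bool requirePaddingValueStrict(ArrayRef<int64_t> inputShape,
                               ArrayRef<int64_t> innerDimsPos,
                               ArrayRef<OpFoldResult> innerTiles) {
  for (auto [pos, tileSize] : llvm::zip_equal(innerDimsPos, innerTiles)) {
    // Hard check: a dynamic dim means divisibility is unknown, so pad.
    if (ShapedType::isDynamic(inputShape[pos]))
      return true;
    std::optional<int64_t> constantTile = getConstantIntValue(tileSize);
    // A non-constant tile size is likewise unprovable, so pad.
    if (!constantTile)
      return true;
    if (inputShape[pos] % (*constantTile) != 0)
      return true;
  }
  return false;
}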

Contributor Author:

For another example -- one that I haven't seen in practice so far, so it is just my hypothesis -- you may drop the padding value in this case (see the IR sketch after the list):

  • The inner tile size is 4.
  • You have integer range analysis that tells you the packing dimension size (a dynamic shape) is a multiple of 4.
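
Sketched in IR (a hypothetical example; %src and the analysis result are assumed): even though dim 0 is dynamic, the divisibility fact lets you build the pack without a padding_value.

%c0 = arith.constant 0 : index
%c4 = arith.constant 4 : index
%d0 = tensor.dim %src, %c0 : tensor<?x64xf32>
// Exact division, given the analysis result d0 % 4 == 0.
%outer = arith.divui %d0, %c4 : index
%dest = tensor.empty(%outer) : tensor<?x64x4xf32>
// No padding_value, despite the dynamic outer dim.
%packed = linalg.pack %src inner_dims_pos = [0] inner_tiles = [4]
    into %dest : tensor<?x64xf32> -> tensor<?x64x4xf32>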

+  auto packedOperand =
+      linalg::PackOp::create(b, loc, opOperand->get(), empty, innerDimsPos,
+                             innerTileSizes, poison, outerDimsPerm);
   return std::make_tuple(packedOperand, indexingMap);
 }

mlir/test/Dialect/Linalg/data-layout-propagation.mlir (31 additions, 4 deletions)
@@ -1450,6 +1450,33 @@ func.func @push_unpack_in_padded_domain_out_used(%arg0: tensor<8x8x4x8xf32>, %ar

 // -----
 
+#map = affine_map<(d0, d1) -> (d0, d1)>
+func.func @push_unpack_in_padded_domain_multiple_inputs(%arg0: tensor<1x4x16x16xf32>, %arg1: tensor<8x64xf32>, %arg2: tensor<8x64xf32>) -> tensor<8x64xf32> {
+  %0 = tensor.empty() : tensor<8x64xf32>
+  %unpack = linalg.unpack %arg0 inner_dims_pos = [0, 1] inner_tiles = [16, 16] into %0 : tensor<1x4x16x16xf32> -> tensor<8x64xf32>
+  %1 = linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg1, %unpack : tensor<8x64xf32>, tensor<8x64xf32>) outs(%arg2 : tensor<8x64xf32>) {
+  ^bb0(%in: f32, %in_0: f32, %out: f32):
+    %2 = arith.addf %in, %in_0 : f32
+    linalg.yield %2 : f32
+  } -> tensor<8x64xf32>
+  return %1 : tensor<8x64xf32>
+}
+// CHECK-LABEL: func.func @push_unpack_in_padded_domain_multiple_inputs
+// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]
+// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]
+// CHECK-SAME: %[[ARG2:[a-zA-Z0-9]+]]
+// CHECK-DAG: %[[POISON:.+]] = ub.poison : f32
+// CHECK: %[[PACK:.+]] = linalg.pack %[[ARG1]] padding_value(%[[POISON]] : f32)
+// CHECK-SAME: inner_dims_pos = [0, 1] inner_tiles = [16, 16]
+// CHECK: %[[ELEM:.+]] = linalg.generic
+// CHECK: ins(%[[PACK]], %[[ARG0]]
+// CHECK: %[[UNPACK:.+]] = linalg.unpack %[[ELEM]]
+// CHECK-SAME: inner_dims_pos = [0, 1] inner_tiles = [16, 16]
+// CHECK-SAME: into %[[ARG2]]
+// CHECK: return %[[UNPACK]]
+
+// -----
+
 module {
   func.func @push_extract_through_generic(%arg0: tensor<128x7x128xf32>, %arg1: tensor<?x5x3x128xf32>, %arg2: tensor<?x5x128xbf16>, %arg3: index) -> tensor<?x5x128xbf16> {
     %extracted_slice = tensor.extract_slice %arg0[0, 0, %arg3] [128, 7, %arg3] [1, 1, 1] : tensor<128x7x128xf32> to tensor<128x7x?xf32>
@@ -1473,7 +1500,7 @@ module {
 // CHECK: } : tensor<?x5x3x128xf32> to tensor<?x5x3x128xf32>
 // CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<128x5x128xbf16>
 // CHECK: %[[GENERIC:.+]] = linalg.generic
-// CHECK-SAME: ins(%[[ARG0]], %[[PADDED]]
+// CHECK-SAME: ins(%[[ARG0]], %[[PADDED]]
 // CHECK-SAME: outs(%[[EMPTY]]
 // CHECK: %[[EXTRACT:.+]] = tensor.extract_slice %3[%[[ARG3]], 0, 0] [%[[ARG3]], 5, 128] [1, 1, 1] : tensor<128x5x128xbf16> to tensor<?x5x128xbf16>
 // CHECK: return %[[EXTRACT]]
@@ -1492,7 +1519,7 @@ func.func @nopush_extract_through_generic_nodimexpr1(%arg0: tensor<128x7x128xf32

 // CHECK-LABEL: func.func @nopush_extract_through_generic_nodimexpr1
 // CHECK: %[[GENERIC:.+]] = linalg.generic
-// CHECK: return %[[GENERIC]]
+// CHECK: return %[[GENERIC]]
 
 // -----

@@ -1508,7 +1535,7 @@ func.func @nopush_extract_through_generic_nodimexpr2(%arg0: tensor<128x?x128xf32

 // CHECK-LABEL: func.func @nopush_extract_through_generic_nodimexpr2
 // CHECK: %[[GENERIC:.+]] = linalg.generic
-// CHECK: return %[[GENERIC]]
+// CHECK: return %[[GENERIC]]
 
 // -----

Expand Down Expand Up @@ -1575,7 +1602,7 @@ func.func @push_extract_through_generic_rank0_operand(%arg0: tensor<128x128xf32>

 // CHECK-LABEL: func.func @push_extract_through_generic_rank0_operand
 // CHECK: %[[GENERIC:.+]] = linalg.generic
-// CHECK: %[[EXTRACT:.+]] = tensor.extract_slice %[[GENERIC]]
+// CHECK: %[[EXTRACT:.+]] = tensor.extract_slice %[[GENERIC]]
 // CHECK: return %[[EXTRACT]]
 
 // -----