
@banach-space
Contributor

The semantics of `createReadOrMaskedRead` and `createWriteOrMaskedWrite`
are currently inconsistent and not fully documented:

  • The input vector sizes are passed as `readShape` and
    `inputVectorSizes`, respectively, so the naming is inconsistent.

  • Currently, the input vector sizes in `createWriteOrMaskedWrite` are
    not required to be complete: any missing trailing sizes are inferred
    from the destination tensor. This only works when the destination
    tensor is statically shaped.

  • Unlike `createReadOrMaskedRead`, the documentation for
    `createWriteOrMaskedWrite` does not specify that write offsets are
    hard-coded to 0.

This PR only updates the documentation and unifies the naming. As such,
it is NFC.

A follow-up PR will generalize and unify the implementation to support,
for example, dynamically shaped destination tensors — a requirement for
enabling scalable vectorization of linalg.pack and linalg.unpack.
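
The trailing-size inference described in the second bullet can be modeled outside of MLIR. The sketch below is a minimal Python illustration, not the actual C++ implementation; the helper name `infer_write_vector_sizes` and its contract are assumptions for illustration only.

```python
# Minimal model of the behavior described above: when the caller passes
# fewer vector sizes than the destination rank, the missing *trailing*
# sizes are taken from the (statically shaped) destination tensor.
# `infer_write_vector_sizes` is a hypothetical name, not an MLIR API.

DYNAMIC = -1  # stand-in for ShapedType::kDynamic


def infer_write_vector_sizes(input_vector_sizes, dest_shape):
    """Complete `input_vector_sizes` with the trailing dims of `dest_shape`."""
    missing = dest_shape[len(input_vector_sizes):]
    if any(d == DYNAMIC for d in missing):
        # Inference only works for statically shaped destinations.
        raise ValueError("cannot infer trailing sizes from dynamic dims")
    return list(input_vector_sizes) + list(missing)


# Leading dims provided by the caller, trailing dims inferred.
print(infer_write_vector_sizes([4, 1], [4, 1, 16, 2]))  # -> [4, 1, 16, 2]
```

This is exactly the limitation the follow-up PR aims to lift: once a trailing destination dim is dynamic, the sizes can no longer be inferred this way.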

@llvmbot
Member

llvmbot commented Apr 11, 2025

@llvm/pr-subscribers-mlir-vector
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-linalg

Author: Andrzej Warzyński (banach-space)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/135350.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h (+11-14)
  • (modified) mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp (+73-28)
  • (modified) mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp (+11-9)
diff --git a/mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h b/mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h
index 5f32aca88a273..6609b28d77b6c 100644
--- a/mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h
+++ b/mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h
@@ -211,21 +211,18 @@ struct MaskableOpRewritePattern : OpRewritePattern<SourceOp> {
 /// are not linearizable.
 bool isLinearizableVector(VectorType type);
 
-/// Create a TransferReadOp from `source` with static shape `readShape`. If the
-/// vector type for the read is not the same as the type of `source`, then a
-/// mask is created on the read, if use of mask is specified or the bounds on a
-/// dimension are different.
-///
-/// `useInBoundsInsteadOfMasking` if false, the inBoundsVal values are set
-/// properly, based on
-///   the rank dimensions of the source and destination tensors. And that is
-///   what determines if masking is done.
-///
-/// Note that the internal `vector::TransferReadOp` always read at indices zero
-/// for each dimension of the passed in tensor.
+/// Creates a TransferReadOp from `source`.
+///
+/// The shape of the vector to read is specified via `inputVectorSizes`. If the
+/// shape of the output vector differs from the shape of the value being read,
+/// masking is used to avoid out-of-bounds accesses. Set
+/// `useInBoundsInsteadOfMasking` to `true` to use the "in_bounds" attribute
+/// instead of explicit masks.
+///
+/// Note: all read offsets are set to 0.
 Value createReadOrMaskedRead(OpBuilder &builder, Location loc, Value source,
-                             ArrayRef<int64_t> readShape, Value padValue,
-                             bool useInBoundsInsteadOfMasking);
+                             ArrayRef<int64_t> inputVectorSizes, Value padValue,
+                             bool useInBoundsInsteadOfMasking = false);
 
 /// Returns success if `inputVectorSizes` is a valid masking configuration for
 /// given `shape`, i.e., it meets:
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 8c8b1b85ef5a3..534a53ffe7b83 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1506,19 +1506,48 @@ static SmallVector<int64_t> getTiledPackShape(linalg::PackOp packOp,
   return applyPermutation(destShape, linalg::getPackInverseDestPerm(packOp));
 }
 
-/// Given an input, the mixed destSizes, and the vector sizes for vectorization,
-/// create an empty destination tensor and create a TransferWriteOp from the
-/// input to the empty tensor. If the destination shape is not the same as the
-/// inputVectorSizes for the first rank(inputVectorSizes) dims, then create a
-/// mask for the write. If `useInBoundsInsteadOfMasking` is set, then update the
-/// inBounds attribute of the transfer write op instead of masking.
-static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
-                                           Value input,
-                                           SmallVector<OpFoldResult> destSizes,
-                                           ArrayRef<int64_t> inputVectorSizes,
-                                           bool useInBoundsInsteadOfMasking) {
+/// Creates a TransferWriteOp to write `input` into a newly initialized
+/// output tensor.
+///
+/// Given:
+/// - an input vector to write,
+/// - the mixed destination sizes for the output tensor,
+/// - and the vector sizes used for vectorization (i.e., the leading N dims),
+///
+/// this function generates the following sequence of ops:
+///
+///   %dest = tensor.empty(%destSizes)
+///   %res = vector.transfer_write %input into %dest
+///
+/// If the leading N dimensions of the destination tensor do not match
+/// `inputVecSizesForLeadingDims` (where N =
+/// rank(`inputVecSizesForLeadingDims`)), masking is applied to ensure
+/// correctness:
+///
+///   %dest = tensor.empty(%destSizes)
+///   %write = vector.transfer_write %input into %dest
+///   %mask = vector.create_mask(%destSizes)
+///   %res = vector.mask %mask { %write }
+///
+/// If `useInBoundsInsteadOfMasking` is set to `true`, the `in_bounds` attribute
+/// is used instead of masking:
+///
+///   %dest = tensor.empty(%destSizes)
+///   in_bounds_flags = (...)
+///   %res = vector.transfer_write %input into %dest
+///       {in_bounds = in_bounds_flags}
+///
+/// Note: all write offsets are set to 0.
+static Operation *
+createWriteOrMaskedWrite(OpBuilder &builder, Location loc, Value input,
+                         SmallVector<OpFoldResult> destSizes,
+                         ArrayRef<int64_t> inputVecSizesForLeadingDims,
+                         bool useInBoundsInsteadOfMasking = false) {
 
   auto inputType = cast<VectorType>(input.getType());
+  assert(inputType.getRank() == static_cast<int64_t>(destSizes.size()) &&
+         "Rank mismatch!");
+
   Value dest = builder.create<tensor::EmptyOp>(loc, destSizes,
                                                inputType.getElementType());
   int64_t rank = cast<ShapedType>(dest.getType()).getRank();
@@ -1526,9 +1555,13 @@ static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
   auto destShape = cast<ShapedType>(dest.getType()).getShape();
   SmallVector<bool> inBoundsVal(rank, true);
   if (useInBoundsInsteadOfMasking) {
+    // In this case, assume that all the required vector sizes have been
+    // provided.
+    assert(inputVecSizesForLeadingDims.size() == destSizes.size() &&
+           "Insufficient number of input vector sizes!");
     // Update the inBounds attribute.
     for (unsigned i = 0; i < rank; i++)
-      inBoundsVal[i] = (destShape[i] == inputVectorSizes[i]) &&
+      inBoundsVal[i] = (destShape[i] == inputVecSizesForLeadingDims[i]) &&
                        !ShapedType::isDynamic(destShape[i]);
   }
   Operation *write = builder.create<vector::TransferWriteOp>(
@@ -1538,17 +1571,20 @@ static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
       /*indices=*/SmallVector<Value>(rank, zero),
       /*inBounds=*/inBoundsVal);
   assert(llvm::none_of(
-             destShape.drop_front(inputVectorSizes.size()),
+             destShape.drop_front(inputVecSizesForLeadingDims.size()),
              [](int64_t size) { return size == ShapedType::kDynamic; }) &&
-         "Only dims aligned with inputVectorSizes may be dynamic");
+         "Only dims aligned with inputVecSizesForLeadingDims may be dynamic");
   if (useInBoundsInsteadOfMasking)
     return write;
-  bool needMaskForWrite = !llvm::equal(
-      inputVectorSizes, destShape.take_front(inputVectorSizes.size()));
+  bool needMaskForWrite =
+      !llvm::equal(inputVecSizesForLeadingDims,
+                   destShape.take_front(inputVecSizesForLeadingDims.size()));
   if (needMaskForWrite) {
     SmallVector<int64_t> writeMaskShape;
-    writeMaskShape.append(inputVectorSizes.begin(), inputVectorSizes.end());
-    writeMaskShape.append(destShape.begin() + inputVectorSizes.size(),
+    writeMaskShape.append(inputVecSizesForLeadingDims.begin(),
+                          inputVecSizesForLeadingDims.end());
+    writeMaskShape.append(destShape.begin() +
+                              inputVecSizesForLeadingDims.size(),
                           destShape.end());
     auto writeMaskType = VectorType::get(writeMaskShape, builder.getI1Type());
     Value maskForWrite =
@@ -1558,9 +1594,11 @@ static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
   return write;
 }
 
-/// Vectorize linalg::PackOp with (1) static innerTiles (2) constant
+/// Vectorize linalg::PackOp with (1) static inner_tiles (2) constant
 /// padding value and (3) input vector sizes into:
-/// masked_transfer_read->shape_cast->transpose->transfer_write_in_bounds
+///
+///   masked_transfer_read->shape_cast->transpose->transfer_write_in_bounds
+///
 /// As in the following example:
 /// %pack = tensor.pack %src inner_dims_pos = [2, 1] inner_tiles = [16, 2]
 ///     into %dst : tensor<32x8x16xf32> -> tensor<32x4x1x16x2xf32>
@@ -1582,8 +1620,11 @@ static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
 ///     : vector<32x4x1x16x2xf32>, tensor<32x4x1x16x2xf32>
 ///
 /// If the (3) input vector sizes are not provided, the vector sizes are
-/// determined by the result tensor shape. Also, we update the inBounds
-/// attribute instead of masking.
+/// determined by the result tensor shape and the `in_bounds`
+/// attribute is used instead of masking to mark out-of-bounds accesses.
+///
+/// NOTE: The input vector sizes specify the dimensions corresponding to the
+/// outer dimensions of the output tensor. The remaining dimensions are
+/// computed from the inner tiles.
 static LogicalResult
 vectorizeAsTensorPackOp(RewriterBase &rewriter, linalg::PackOp packOp,
                         ArrayRef<int64_t> inputVectorSizes,
@@ -1644,9 +1685,11 @@ vectorizeAsTensorPackOp(RewriterBase &rewriter, linalg::PackOp packOp,
       loc, shapeCastOp.getResult(), destPermutation);
 
   // Create TransferWriteOp.
-  Operation *write = createWriteOrMaskedWrite(
-      rewriter, loc, transposeOp.getResult(), reifiedReturnShapes[0],
-      inputVectorSizes, /*useInBoundsInsteadOfMasking=*/false);
+  Operation *write =
+      createWriteOrMaskedWrite(rewriter, loc, transposeOp.getResult(),
+                               /*destSizes=*/reifiedReturnShapes[0],
+                               /*inputVecSizesForLeadingDims=*/inputVectorSizes,
+                               /*useInBoundsInsteadOfMasking=*/false);
   newResults.push_back(write->getResult(0));
   return success();
 }
@@ -1780,8 +1823,9 @@ vectorizeAsTensorUnpackOp(RewriterBase &rewriter, linalg::UnPackOp unpackOp,
           ? vectorSizes
           : shapeCastOp.getResultVectorType().getShape());
   Operation *write = createWriteOrMaskedWrite(
-      rewriter, loc, shapeCastOp.getResult(), reifiedRetShapes[0],
-      writeVectorSizes, useInBoundsInsteadOfMasking);
+      rewriter, loc, shapeCastOp.getResult(), /*destSizes=*/reifiedRetShapes[0],
+      /*inputVecSizesForLeadingDims=*/writeVectorSizes,
+      useInBoundsInsteadOfMasking);
   newResults.push_back(write->getResult(0));
   return success();
 }
@@ -1810,7 +1854,8 @@ vectorizeAsTensorPadOp(RewriterBase &rewriter, tensor::PadOp padOp,
       rewriter, loc, padOp.getSource(), inputVectorSizes, padValue,
       /*useInBoundsInsteadOfMasking=*/false);
   Operation *write = createWriteOrMaskedWrite(
-      rewriter, loc, maskedRead, reifiedReturnShapes[0], inputVectorSizes,
+      rewriter, loc, maskedRead, reifiedReturnShapes[0],
+      /*inputVecSizesForLeadingDims=*/inputVectorSizes,
       /*useInBoundsInsteadOfMasking=*/false);
   newResults.push_back(write->getResult(0));
   return success();
diff --git a/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp b/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
index 7b56cd0cf0e91..024e8ccb901de 100644
--- a/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
+++ b/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
@@ -327,26 +327,28 @@ bool vector::isLinearizableVector(VectorType type) {
 }
 
 Value vector::createReadOrMaskedRead(OpBuilder &builder, Location loc,
-                                     Value source, ArrayRef<int64_t> readShape,
+                                     Value source,
+                                     ArrayRef<int64_t> inputVectorSizes,
                                      Value padValue,
                                      bool useInBoundsInsteadOfMasking) {
-  assert(llvm::none_of(readShape,
+  assert(llvm::none_of(inputVectorSizes,
                        [](int64_t s) { return s == ShapedType::kDynamic; }) &&
-         "expected static shape");
+         "invalid input vector sizes");
   auto sourceShapedType = cast<ShapedType>(source.getType());
   auto sourceShape = sourceShapedType.getShape();
-  assert(sourceShape.size() == readShape.size() && "expected same ranks.");
-  auto maskType = VectorType::get(readShape, builder.getI1Type());
-  auto vectorType = VectorType::get(readShape, padValue.getType());
+  assert(sourceShape.size() == inputVectorSizes.size() &&
+         "expected same ranks.");
+  auto maskType = VectorType::get(inputVectorSizes, builder.getI1Type());
+  auto vectorType = VectorType::get(inputVectorSizes, padValue.getType());
   assert(padValue.getType() == sourceShapedType.getElementType() &&
          "expected same pad element type to match source element type");
-  int64_t readRank = readShape.size();
+  int64_t readRank = inputVectorSizes.size();
   auto zero = builder.create<arith::ConstantIndexOp>(loc, 0);
   SmallVector<bool> inBoundsVal(readRank, true);
   if (useInBoundsInsteadOfMasking) {
     // Update the inBounds attribute.
     for (unsigned i = 0; i < readRank; i++)
-      inBoundsVal[i] = (sourceShape[i] == readShape[i]) &&
+      inBoundsVal[i] = (sourceShape[i] == inputVectorSizes[i]) &&
                        !ShapedType::isDynamic(sourceShape[i]);
   }
   auto transferReadOp = builder.create<vector::TransferReadOp>(
@@ -357,7 +359,7 @@ Value vector::createReadOrMaskedRead(OpBuilder &builder, Location loc,
       /*padding=*/padValue,
       /*inBounds=*/inBoundsVal);
 
-  if (llvm::equal(readShape, sourceShape) || useInBoundsInsteadOfMasking)
+  if (llvm::equal(inputVectorSizes, sourceShape) || useInBoundsInsteadOfMasking)
     return transferReadOp;
   SmallVector<OpFoldResult> mixedSourceDims =
       tensor::getMixedSizes(builder, loc, source);

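The masking decision that both helpers share can be sketched in plain Python as well. This is an illustrative model of the logic visible in the diff above (the `in_bounds` flag computation and the write-mask shape), not the MLIR implementation itself; the function names are hypothetical.

```python
# Illustrative model of the masking logic from the diff above (hypothetical
# function names; the real code builds MLIR ops rather than returning lists).

DYNAMIC = -1  # stand-in for ShapedType::kDynamic


def in_bounds_flags(dest_shape, vec_sizes_for_leading_dims):
    """Per-dim in_bounds flags used when masking is disabled: a dim is
    in-bounds iff it is static and matches the requested vector size."""
    flags = [True] * len(dest_shape)
    for i, size in enumerate(vec_sizes_for_leading_dims):
        flags[i] = dest_shape[i] == size and dest_shape[i] != DYNAMIC
    return flags


def write_mask_shape(dest_shape, vec_sizes_for_leading_dims):
    """Shape of the write mask: the caller-provided leading sizes followed
    by the remaining (trailing) destination dims. Returns None when the
    leading dims already match and no mask is needed."""
    n = len(vec_sizes_for_leading_dims)
    if list(vec_sizes_for_leading_dims) == list(dest_shape[:n]):
        return None  # shapes agree, a plain transfer_write suffices
    return list(vec_sizes_for_leading_dims) + list(dest_shape[n:])


# Destination 32x4x1x16x2, vectorized with leading sizes [32, 8]:
print(write_mask_shape([32, 4, 1, 16, 2], [32, 8]))  # -> [32, 8, 1, 16, 2]
print(write_mask_shape([32, 4, 1, 16, 2], [32, 4]))  # -> None (no mask)
```

Note how only the leading dims are compared when deciding whether a mask is needed, which is why the trailing destination dims must be static, as asserted in the diff.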
Contributor

@Max191 Max191 left a comment

Thanks! One more place where I think the comments may have been cut off. Otherwise LGTM.

Contributor

@Max191 Max191 left a comment

Thanks!

@banach-space banach-space merged commit f215a61 into llvm:main Apr 15, 2025
11 checks passed
@banach-space banach-space deleted the andrzej/vector/update_create_masked_read_write branch April 15, 2025 14:58