[mlir][linalg][vector] Refine create{Read|Write}OrMasked{Read|Write} (nfc) #135350
Conversation
The semantics of `createReadOrMaskedRead` and `createWriteOrMaskedWrite` are currently a bit inconsistent and not fully documented:

* The input vector sizes are passed as `readShape` and `inputVectorSizes`, respectively — inconsistent naming.
* Currently, the input vector sizes in `createWriteOrMaskedWrite` are not required to be complete: any missing trailing sizes are inferred from the destination tensor. This only works when the destination tensor is statically shaped.
* Unlike `createReadOrMaskedRead`, the documentation for `createWriteOrMaskedWrite` does not specify that write offsets are hard-coded to 0.

This PR only updates the documentation and unifies the naming. As such, it is NFC.

A follow-up PR will generalize and unify the implementation to support, for example, dynamically shaped destination tensors — a requirement for enabling scalable vectorization of `linalg.pack` and `linalg.unpack`.
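For illustration only (not part of this patch): a rough sketch of the masked-read IR that `createReadOrMaskedRead` produces when the requested vector shape does not match the source shape. The value names (`%src`, `%pad`), the `8x16` vector shape, and the dynamic leading dimension are assumptions made up for this example.

```mlir
// Assumed setup (hypothetical): %src : tensor<?x16xf32>, %pad : f32.
%c0 = arith.constant 0 : index
%c16 = arith.constant 16 : index
%dim0 = tensor.dim %src, %c0 : tensor<?x16xf32>
// The mask covers the valid extent of each dimension of the source.
%mask = vector.create_mask %dim0, %c16 : vector<8x16xi1>
// Note that the read offsets are hard-coded to 0.
%read = vector.mask %mask {
  vector.transfer_read %src[%c0, %c0], %pad {in_bounds = [true, true]}
    : tensor<?x16xf32>, vector<8x16xf32>
} : vector<8x16xi1> -> vector<8x16xf32>
```

When the vector shape matches the source shape, or when `useInBoundsInsteadOfMasking` is set, the plain `vector.transfer_read` is returned without the surrounding `vector.mask`.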
@llvm/pr-subscribers-mlir-vector @llvm/pr-subscribers-mlir-linalg

Author: Andrzej Warzyński (banach-space)

Full diff: https://github.com/llvm/llvm-project/pull/135350.diff

3 Files Affected:
diff --git a/mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h b/mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h
index 5f32aca88a273..6609b28d77b6c 100644
--- a/mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h
+++ b/mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h
@@ -211,21 +211,18 @@ struct MaskableOpRewritePattern : OpRewritePattern<SourceOp> {
/// are not linearizable.
bool isLinearizableVector(VectorType type);
-/// Create a TransferReadOp from `source` with static shape `readShape`. If the
-/// vector type for the read is not the same as the type of `source`, then a
-/// mask is created on the read, if use of mask is specified or the bounds on a
-/// dimension are different.
-///
-/// `useInBoundsInsteadOfMasking` if false, the inBoundsVal values are set
-/// properly, based on
-/// the rank dimensions of the source and destination tensors. And that is
-/// what determines if masking is done.
-///
-/// Note that the internal `vector::TransferReadOp` always read at indices zero
-/// for each dimension of the passed in tensor.
+/// Creates a TransferReadOp from `source`.
+///
+/// The shape of the vector to read is specified via `inputVectorSizes`. If the
+/// shape of the output vector differs from the shape of the value being read,
+/// masking is used to avoid out-of-bounds accesses. Set
+/// `useInBoundsInsteadOfMasking` to `true` to use the "in_bounds" attribute
+/// instead of explicit masks.
+///
+/// Note: all read offsets are set to 0.
Value createReadOrMaskedRead(OpBuilder &builder, Location loc, Value source,
- ArrayRef<int64_t> readShape, Value padValue,
- bool useInBoundsInsteadOfMasking);
+ ArrayRef<int64_t> inputVectorSizes, Value padValue,
+ bool useInBoundsInsteadOfMasking = false);
/// Returns success if `inputVectorSizes` is a valid masking configuraion for
/// given `shape`, i.e., it meets:
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 8c8b1b85ef5a3..534a53ffe7b83 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1506,19 +1506,48 @@ static SmallVector<int64_t> getTiledPackShape(linalg::PackOp packOp,
return applyPermutation(destShape, linalg::getPackInverseDestPerm(packOp));
}
-/// Given an input, the mixed destSizes, and the vector sizes for vectorization,
-/// create an empty destination tensor and create a TransferWriteOp from the
-/// input to the empty tensor. If the destination shape is not the same as the
-/// inputVectorSizes for the first rank(inputVectorSizes) dims, then create a
-/// mask for the write. If `useInBoundsInsteadOfMasking` is set, then update the
-/// inBounds attribute of the transfer write op instead of masking.
-static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
- Value input,
- SmallVector<OpFoldResult> destSizes,
- ArrayRef<int64_t> inputVectorSizes,
- bool useInBoundsInsteadOfMasking) {
+/// Creates a TransferWriteOp to write `input` into a newly initialized
+/// output tensor.
+///
+/// Given:
+/// - an input vector to write,
+/// - the mixed destination sizes for the output tensor,
+/// - and the vector sizes used for vectorization (i.e., the leading N dims),
+///
+/// this function generates the following sequence of ops:
+///
+/// %dest = tensor.empty(%destSizes)
+/// %res = vector.transfer_write %input into %dest
+///
+/// If the leading N dimensions of the destination tensor do not match
+/// `inputVecSizesForLeadingDims` (where N =
+/// rank(`inputVecSizesForLeadingDims`)), masking is applied to ensure
+/// correctness:
+///
+/// %dest = tensor.empty(%destSizes)
+/// %write = vector.transfer_write %input into %dest
+/// %mask = vector.create_mask(%destSizes)
+/// %res = vector.mask %mask { %write }
+///
+/// If `useInBoundsInsteadOfMasking` is set to `true`, the `in_bounds` attribute
+/// is used instead of masking:
+///
+/// %dest = tensor.empty(%destSizes)
+/// in_bounds_flags = (...)
+/// %res = vector.transfer_write %input into %dest
+/// {in_bounds = in_bounds_flags}
+///
+/// Note: all write offsets are set to 0.
+static Operation *
+createWriteOrMaskedWrite(OpBuilder &builder, Location loc, Value input,
+ SmallVector<OpFoldResult> destSizes,
+ ArrayRef<int64_t> inputVecSizesForLeadingDims,
+ bool useInBoundsInsteadOfMasking = false) {
auto inputType = cast<VectorType>(input.getType());
+ assert(inputType.getRank() == static_cast<int64_t>(destSizes.size()) &&
+ "Rank mismatch!");
+
Value dest = builder.create<tensor::EmptyOp>(loc, destSizes,
inputType.getElementType());
int64_t rank = cast<ShapedType>(dest.getType()).getRank();
@@ -1526,9 +1555,13 @@ static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
auto destShape = cast<ShapedType>(dest.getType()).getShape();
SmallVector<bool> inBoundsVal(rank, true);
if (useInBoundsInsteadOfMasking) {
+ // In this case, assume that all the required vector sizes have been
+ // provided.
+ assert(inputVecSizesForLeadingDims.size() == destSizes.size() &&
+ "Insufficient number of input vector sizes!");
// Update the inBounds attribute.
for (unsigned i = 0; i < rank; i++)
- inBoundsVal[i] = (destShape[i] == inputVectorSizes[i]) &&
+ inBoundsVal[i] = (destShape[i] == inputVecSizesForLeadingDims[i]) &&
!ShapedType::isDynamic(destShape[i]);
}
Operation *write = builder.create<vector::TransferWriteOp>(
@@ -1538,17 +1571,20 @@ static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
/*indices=*/SmallVector<Value>(rank, zero),
/*inBounds=*/inBoundsVal);
assert(llvm::none_of(
- destShape.drop_front(inputVectorSizes.size()),
+ destShape.drop_front(inputVecSizesForLeadingDims.size()),
[](int64_t size) { return size == ShapedType::kDynamic; }) &&
- "Only dims aligned with inputVectorSizes may be dynamic");
+ "Only dims aligned with inputVecSizesForLeadingDims may be dynamic");
if (useInBoundsInsteadOfMasking)
return write;
- bool needMaskForWrite = !llvm::equal(
- inputVectorSizes, destShape.take_front(inputVectorSizes.size()));
+ bool needMaskForWrite =
+ !llvm::equal(inputVecSizesForLeadingDims,
+ destShape.take_front(inputVecSizesForLeadingDims.size()));
if (needMaskForWrite) {
SmallVector<int64_t> writeMaskShape;
- writeMaskShape.append(inputVectorSizes.begin(), inputVectorSizes.end());
- writeMaskShape.append(destShape.begin() + inputVectorSizes.size(),
+ writeMaskShape.append(inputVecSizesForLeadingDims.begin(),
+ inputVecSizesForLeadingDims.end());
+ writeMaskShape.append(destShape.begin() +
+ inputVecSizesForLeadingDims.size(),
destShape.end());
auto writeMaskType = VectorType::get(writeMaskShape, builder.getI1Type());
Value maskForWrite =
@@ -1558,9 +1594,11 @@ static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
return write;
}
-/// Vectorize linalg::PackOp with (1) static innerTiles (2) constant
+/// Vectorize linalg::PackOp with (1) static inner_tiles (2) constant
/// padding value and (3) input vector sizes into:
-/// masked_transfer_read->shape_cast->transpose->transfer_write_in_bounds
+///
+/// masked_transfer_read->shape_cast->transpose->transfer_write_in_bounds
+///
/// As in the following example:
/// %pack = tensor.pack %src inner_dims_pos = [2, 1] inner_tiles = [16, 2]
/// into %dst : tensor<32x8x16xf32> -> tensor<32x4x1x16x2xf32>
@@ -1582,8 +1620,11 @@ static Operation *createWriteOrMaskedWrite(OpBuilder &builder, Location loc,
/// : vector<32x4x1x16x2xf32>, tensor<32x4x1x16x2xf32>
///
/// If the (3) input vector sizes are not provided, the vector sizes are
-/// determined by the result tensor shape. Also, we update the inBounds
-/// attribute instead of masking.
+/// determined by the result tensor shape and the `in_bounds`
+/// attribute is used instead of masking to mark
+///
+/// NOTE: The input vector sizes specify the dimensions corresponding to the
+/// outer dimensions of the output tensor. The remaining dimensions are
static LogicalResult
vectorizeAsTensorPackOp(RewriterBase &rewriter, linalg::PackOp packOp,
ArrayRef<int64_t> inputVectorSizes,
@@ -1644,9 +1685,11 @@ vectorizeAsTensorPackOp(RewriterBase &rewriter, linalg::PackOp packOp,
loc, shapeCastOp.getResult(), destPermutation);
// Create TransferWriteOp.
- Operation *write = createWriteOrMaskedWrite(
- rewriter, loc, transposeOp.getResult(), reifiedReturnShapes[0],
- inputVectorSizes, /*useInBoundsInsteadOfMasking=*/false);
+ Operation *write =
+ createWriteOrMaskedWrite(rewriter, loc, transposeOp.getResult(),
+ /*destSizes=*/reifiedReturnShapes[0],
+ /*inputVecSizesForLeadingDims=*/inputVectorSizes,
+ /*useInBoundsInsteadOfMasking=*/false);
newResults.push_back(write->getResult(0));
return success();
}
@@ -1780,8 +1823,9 @@ vectorizeAsTensorUnpackOp(RewriterBase &rewriter, linalg::UnPackOp unpackOp,
? vectorSizes
: shapeCastOp.getResultVectorType().getShape());
Operation *write = createWriteOrMaskedWrite(
- rewriter, loc, shapeCastOp.getResult(), reifiedRetShapes[0],
- writeVectorSizes, useInBoundsInsteadOfMasking);
+ rewriter, loc, shapeCastOp.getResult(), /*destSizes=*/reifiedRetShapes[0],
+ /*inputVecSizesForLeadingDims=*/writeVectorSizes,
+ useInBoundsInsteadOfMasking);
newResults.push_back(write->getResult(0));
return success();
}
@@ -1810,7 +1854,8 @@ vectorizeAsTensorPadOp(RewriterBase &rewriter, tensor::PadOp padOp,
rewriter, loc, padOp.getSource(), inputVectorSizes, padValue,
/*useInBoundsInsteadOfMasking=*/false);
Operation *write = createWriteOrMaskedWrite(
- rewriter, loc, maskedRead, reifiedReturnShapes[0], inputVectorSizes,
+ rewriter, loc, maskedRead, reifiedReturnShapes[0],
+ /*inputVecSizesForLeadingDims=*/inputVectorSizes,
/*useInBoundsInsteadOfMasking=*/false);
newResults.push_back(write->getResult(0));
return success();
diff --git a/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp b/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
index 7b56cd0cf0e91..024e8ccb901de 100644
--- a/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
+++ b/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
@@ -327,26 +327,28 @@ bool vector::isLinearizableVector(VectorType type) {
}
Value vector::createReadOrMaskedRead(OpBuilder &builder, Location loc,
- Value source, ArrayRef<int64_t> readShape,
+ Value source,
+ ArrayRef<int64_t> inputVectorSizes,
Value padValue,
bool useInBoundsInsteadOfMasking) {
- assert(llvm::none_of(readShape,
+ assert(llvm::none_of(inputVectorSizes,
[](int64_t s) { return s == ShapedType::kDynamic; }) &&
- "expected static shape");
+ "invalid input vector sizes");
auto sourceShapedType = cast<ShapedType>(source.getType());
auto sourceShape = sourceShapedType.getShape();
- assert(sourceShape.size() == readShape.size() && "expected same ranks.");
- auto maskType = VectorType::get(readShape, builder.getI1Type());
- auto vectorType = VectorType::get(readShape, padValue.getType());
+ assert(sourceShape.size() == inputVectorSizes.size() &&
+ "expected same ranks.");
+ auto maskType = VectorType::get(inputVectorSizes, builder.getI1Type());
+ auto vectorType = VectorType::get(inputVectorSizes, padValue.getType());
assert(padValue.getType() == sourceShapedType.getElementType() &&
"expected same pad element type to match source element type");
- int64_t readRank = readShape.size();
+ int64_t readRank = inputVectorSizes.size();
auto zero = builder.create<arith::ConstantIndexOp>(loc, 0);
SmallVector<bool> inBoundsVal(readRank, true);
if (useInBoundsInsteadOfMasking) {
// Update the inBounds attribute.
for (unsigned i = 0; i < readRank; i++)
- inBoundsVal[i] = (sourceShape[i] == readShape[i]) &&
+ inBoundsVal[i] = (sourceShape[i] == inputVectorSizes[i]) &&
!ShapedType::isDynamic(sourceShape[i]);
}
auto transferReadOp = builder.create<vector::TransferReadOp>(
@@ -357,7 +359,7 @@ Value vector::createReadOrMaskedRead(OpBuilder &builder, Location loc,
/*padding=*/padValue,
/*inBounds=*/inBoundsVal);
- if (llvm::equal(readShape, sourceShape) || useInBoundsInsteadOfMasking)
+ if (llvm::equal(inputVectorSizes, sourceShape) || useInBoundsInsteadOfMasking)
return transferReadOp;
SmallVector<OpFoldResult> mixedSourceDims =
tensor::getMixedSizes(builder, loc, source);
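As a companion to the pseudo-IR in the updated `createWriteOrMaskedWrite` comment above, here is a rough, hand-written sketch of the masked-write sequence it describes. The value names (`%input`, `%sz0`), the `8x16` vector shape, and the dynamic leading dimension are illustrative assumptions, not output generated by this patch.

```mlir
// Assumed setup (hypothetical): %input : vector<8x16xf32>, %sz0 : index.
%c0 = arith.constant 0 : index
%c16 = arith.constant 16 : index
// Destination tensor with a dynamic leading dimension.
%dest = tensor.empty(%sz0) : tensor<?x16xf32>
%mask = vector.create_mask %sz0, %c16 : vector<8x16xi1>
// Note that the write offsets are hard-coded to 0.
%res = vector.mask %mask {
  vector.transfer_write %input, %dest[%c0, %c0] {in_bounds = [true, true]}
    : vector<8x16xf32>, tensor<?x16xf32>
} : vector<8x16xi1> -> tensor<?x16xf32>
```

With `useInBoundsInsteadOfMasking = true`, the same write would instead carry per-dimension `in_bounds` flags and no `vector.mask`.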
[mlir][linalg][vector] Refine create{Read|Write}OrMasked{Read|Write} (nfc)
Add missing comments
Max191 left a comment
Thanks! One more place where I think the comments may have been cut off. Otherwise LGTM.
[mlir][linalg][vector] Refine create{Read|Write}OrMasked{Read|Write} (nfc)
Add missing comment
Max191 left a comment
Thanks!