[MLIR][Vector] Support vector reduction for xor/maxnumf/minnumf #163310
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-spirv @llvm/pr-subscribers-mlir-affine

Author: Lee Wei (leewei05)

Changes: Support vector reduction for xor/maxnumf/minnumf.

Full diff: https://github.com/llvm/llvm-project/pull/163310.diff

3 Files Affected:
- mlir/lib/Dialect/Affine/Analysis/AffineAnalysis.cpp
- mlir/lib/Dialect/Vector/IR/VectorOps.cpp
- mlir/test/Dialect/Affine/SuperVectorize/vectorize_reduction.mlir
diff --git a/mlir/lib/Dialect/Affine/Analysis/AffineAnalysis.cpp b/mlir/lib/Dialect/Affine/Analysis/AffineAnalysis.cpp
index 4d2d8738aa4ad..3d1a73417d1ea 100644
--- a/mlir/lib/Dialect/Affine/Analysis/AffineAnalysis.cpp
+++ b/mlir/lib/Dialect/Affine/Analysis/AffineAnalysis.cpp
@@ -66,9 +66,10 @@ static Value getSupportedReduction(AffineForOp forOp, unsigned pos,
.Case([](arith::MaxSIOp) { return arith::AtomicRMWKind::maxs; })
.Case([](arith::MinUIOp) { return arith::AtomicRMWKind::minu; })
.Case([](arith::MaxUIOp) { return arith::AtomicRMWKind::maxu; })
+ .Case([](arith::XOrIOp) { return arith::AtomicRMWKind::xori; })
+ .Case([](arith::MaxNumFOp) { return arith::AtomicRMWKind::maxnumf; })
+ .Case([](arith::MinNumFOp) { return arith::AtomicRMWKind::minnumf; })
.Default([](Operation *) -> std::optional<arith::AtomicRMWKind> {
- // TODO: AtomicRMW supports other kinds of reductions this is
- // currently not detecting, add those when the need arises.
return std::nullopt;
});
if (!maybeKind)
diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index 58256b0ade9f6..e74c72dbcd33e 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -717,7 +717,15 @@ Value mlir::vector::getVectorReductionOp(arith::AtomicRMWKind op,
case arith::AtomicRMWKind::ori:
return vector::ReductionOp::create(builder, vector.getLoc(),
CombiningKind::OR, vector);
- // TODO: Add remaining reduction operations.
+ case arith::AtomicRMWKind::minnumf:
+ return vector::ReductionOp::create(builder, vector.getLoc(),
+ CombiningKind::MINNUMF, vector);
+ case arith::AtomicRMWKind::maxnumf:
+ return vector::ReductionOp::create(builder, vector.getLoc(),
+ CombiningKind::MAXNUMF, vector);
+ case arith::AtomicRMWKind::xori:
+ return vector::ReductionOp::create(builder, vector.getLoc(),
+ CombiningKind::XOR, vector);
default:
(void)emitOptionalError(loc, "Reduction operation type not supported");
break;
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_reduction.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_reduction.mlir
index b616632a6fe24..a6db6ec719472 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_reduction.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_reduction.mlir
@@ -243,6 +243,106 @@ func.func @vecdim_reduction_ori(%in: memref<256x512xi32>, %out: memref<256xi32>)
// CHECK: affine.store %[[final_red]], %{{.*}} : memref<256xi32>
// CHECK: }
+// -----
+
+func.func @vecdim_reduction_xori(%in: memref<256x512xi32>, %out: memref<256xi32>) {
+ %cst = arith.constant 0 : i32
+ affine.for %i = 0 to 256 {
+ %final_red = affine.for %j = 0 to 512 iter_args(%red_iter = %cst) -> (i32) {
+ %ld = affine.load %in[%i, %j] : memref<256x512xi32>
+ %xor = arith.xori %red_iter, %ld : i32
+ affine.yield %xor : i32
+ }
+ affine.store %final_red, %out[%i] : memref<256xi32>
+ }
+ return
+}
+
+// CHECK-LABEL: func.func @vecdim_reduction_xori(
+// CHECK-SAME: %[[ARG0:.*]]: memref<256x512xi32>,
+// CHECK-SAME: %[[ARG1:.*]]: memref<256xi32>) {
+// CHECK: %[[VAL_0:.*]] = arith.constant 0 : i32
+// CHECK: affine.for %[[VAL_1:.*]] = 0 to 256 {
+// CHECK: %[[VAL_2:.*]] = arith.constant dense<0> : vector<128xi32>
+// CHECK: %[[VAL_3:.*]] = affine.for %[[VAL_4:.*]] = 0 to 512 step 128 iter_args(%[[VAL_5:.*]] = %[[VAL_2]]) -> (vector<128xi32>) {
+// CHECK: %[[VAL_6:.*]] = ub.poison : i32
+// CHECK: %[[VAL_7:.*]] = vector.transfer_read %[[ARG0]]{{\[}}%[[VAL_1]], %[[VAL_4]]], %[[VAL_6]] : memref<256x512xi32>, vector<128xi32>
+// CHECK: %[[VAL_8:.*]] = arith.xori %[[VAL_5]], %[[VAL_7]] : vector<128xi32>
+// CHECK: affine.yield %[[VAL_8]] : vector<128xi32>
+// CHECK: }
+// CHECK: %[[VAL_9:.*]] = vector.reduction <xor>, %[[VAL_3]] : vector<128xi32> into i32
+// CHECK: affine.store %[[VAL_9]], %[[ARG1]]{{\[}}%[[VAL_1]]] : memref<256xi32>
+// CHECK: }
+// CHECK: return
+// CHECK: }
+
+// -----
+
+func.func @vecdim_reduction_minnumf(%in: memref<256x512xf32>, %out: memref<256xf32>) {
+ %cst = arith.constant 0xFF800000 : f32
+ affine.for %i = 0 to 256 {
+ %final_red = affine.for %j = 0 to 512 iter_args(%red_iter = %cst) -> (f32) {
+ %ld = affine.load %in[%i, %j] : memref<256x512xf32>
+ %min = arith.minnumf %red_iter, %ld : f32
+ affine.yield %min : f32
+ }
+ affine.store %final_red, %out[%i] : memref<256xf32>
+ }
+ return
+}
+
+// CHECK-LABEL: func.func @vecdim_reduction_minnumf(
+// CHECK-SAME: %[[ARG0:.*]]: memref<256x512xf32>,
+// CHECK-SAME: %[[ARG1:.*]]: memref<256xf32>) {
+// CHECK: %[[VAL_0:.*]] = arith.constant 0xFF800000 : f32
+// CHECK: affine.for %[[VAL_1:.*]] = 0 to 256 {
+// CHECK: %[[VAL_2:.*]] = arith.constant dense<0x7FC00000> : vector<128xf32>
+// CHECK: %[[VAL_3:.*]] = affine.for %[[VAL_4:.*]] = 0 to 512 step 128 iter_args(%[[VAL_5:.*]] = %[[VAL_2]]) -> (vector<128xf32>) {
+// CHECK: %[[VAL_6:.*]] = ub.poison : f32
+// CHECK: %[[VAL_7:.*]] = vector.transfer_read %[[ARG0]]{{\[}}%[[VAL_1]], %[[VAL_4]]], %[[VAL_6]] : memref<256x512xf32>, vector<128xf32>
+// CHECK: %[[VAL_8:.*]] = arith.minnumf %[[VAL_5]], %[[VAL_7]] : vector<128xf32>
+// CHECK: affine.yield %[[VAL_8]] : vector<128xf32>
+// CHECK: }
+// CHECK: %[[VAL_9:.*]] = vector.reduction <minnumf>, %[[VAL_3]] : vector<128xf32> into f32
+// CHECK: %[[VAL_10:.*]] = arith.minnumf %[[VAL_9]], %[[VAL_0]] : f32
+// CHECK: affine.store %[[VAL_10]], %[[ARG1]]{{\[}}%[[VAL_1]]] : memref<256xf32>
+// CHECK: }
+// CHECK: return
+// CHECK: }
+
+// -----
+
+func.func @vecdim_reduction_maxnumf(%in: memref<256x512xf32>, %out: memref<256xf32>) {
+ %cst = arith.constant 0xFF800000 : f32
+ affine.for %i = 0 to 256 {
+ %final_red = affine.for %j = 0 to 512 iter_args(%red_iter = %cst) -> (f32) {
+ %ld = affine.load %in[%i, %j] : memref<256x512xf32>
+ %max = arith.maxnumf %red_iter, %ld : f32
+ affine.yield %max : f32
+ }
+ affine.store %final_red, %out[%i] : memref<256xf32>
+ }
+ return
+}
+
+// CHECK-LABEL: func.func @vecdim_reduction_maxnumf(
+// CHECK-SAME: %[[ARG0:.*]]: memref<256x512xf32>,
+// CHECK-SAME: %[[ARG1:.*]]: memref<256xf32>) {
+// CHECK: %[[VAL_0:.*]] = arith.constant 0xFF800000 : f32
+// CHECK: affine.for %[[VAL_1:.*]] = 0 to 256 {
+// CHECK: %[[VAL_2:.*]] = arith.constant dense<0xFFC00000> : vector<128xf32>
+// CHECK: %[[VAL_3:.*]] = affine.for %[[VAL_4:.*]] = 0 to 512 step 128 iter_args(%[[VAL_5:.*]] = %[[VAL_2]]) -> (vector<128xf32>) {
+// CHECK: %[[VAL_6:.*]] = ub.poison : f32
+// CHECK: %[[VAL_7:.*]] = vector.transfer_read %[[ARG0]]{{\[}}%[[VAL_1]], %[[VAL_4]]], %[[VAL_6]] : memref<256x512xf32>, vector<128xf32>
+// CHECK: %[[VAL_8:.*]] = arith.maxnumf %[[VAL_5]], %[[VAL_7]] : vector<128xf32>
+// CHECK: affine.yield %[[VAL_8]] : vector<128xf32>
+// CHECK: }
+// CHECK: %[[VAL_9:.*]] = vector.reduction <maxnumf>, %[[VAL_3]] : vector<128xf32> into f32
+// CHECK: %[[VAL_10:.*]] = arith.maxnumf %[[VAL_9]], %[[VAL_0]] : f32
+// CHECK: affine.store %[[VAL_10]], %[[ARG1]]{{\[}}%[[VAL_1]]] : memref<256xf32>
+// CHECK: }
+// CHECK: return
+// CHECK: }
// -----
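
One detail worth noting in the float tests above: the vectorized accumulator is seeded with a NaN splat (dense<0x7FC00000> for minnumf, dense<0xFFC00000> for maxnumf) rather than with the loop's scalar init value. NaN is the identity element for these combining kinds, because arith.minnumf and arith.maxnumf return the non-NaN operand; the original init (0xFF800000, i.e. -inf) is then folded back in by the trailing scalar minnumf/maxnumf after the vector.reduction. A small hedged illustration of the identity property (a standalone sketch, not part of the PR):

// minnumf treats NaN as "missing data": minnumf(NaN, x) == x for non-NaN x,
// which is why a NaN splat is a safe neutral seed for the vector accumulator.
func.func @minnumf_identity(%x: f32) -> f32 {
  %nan = arith.constant 0x7FC00000 : f32   // quiet NaN
  %r = arith.minnumf %nan, %x : f32        // yields %x
  return %r : f32
}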
Thanks for this contribution!
I am not familiar with the Affine part, but the tests that you added look like a pretty straightforward extension of the existing tests.
That said, for a change like this I would expect more comprehensive testing. Here are two places where new tests should be added:
llvm-project/mlir/test/Dialect/Vector/ops.mlir
Lines 609 to 629 in b228a18
// CHECK-LABEL: @reduce_fp
func.func @reduce_fp(%arg0: vector<16xf32>, %arg1: f32) -> f32 {
  // CHECK: vector.reduction <add>, %{{.*}} : vector<16xf32> into f32
  vector.reduction <add>, %arg0 : vector<16xf32> into f32
  // CHECK: vector.reduction <add>, %{{.*}}, %{{.*}} : vector<16xf32> into f32
  vector.reduction <add>, %arg0, %arg1 : vector<16xf32> into f32
  // CHECK: vector.reduction <mul>, %{{.*}} : vector<16xf32> into f32
  vector.reduction <mul>, %arg0 : vector<16xf32> into f32
  // CHECK: vector.reduction <mul>, %{{.*}}, %{{.*}} : vector<16xf32> into f32
  vector.reduction <mul>, %arg0, %arg1 : vector<16xf32> into f32
  // CHECK: vector.reduction <minnumf>, %{{.*}} : vector<16xf32> into f32
  vector.reduction <minnumf>, %arg0 : vector<16xf32> into f32
  // CHECK: %[[X0:.*]] = vector.reduction <maxnumf>, %{{.*}} : vector<16xf32> into f32
  %0 = vector.reduction <maxnumf>, %arg0 : vector<16xf32> into f32
  // CHECK: vector.reduction <minimumf>, %{{.*}} : vector<16xf32> into f32
  vector.reduction <minimumf>, %arg0 : vector<16xf32> into f32
  // CHECK: %[[X1:.*]] = vector.reduction <maximumf>, %{{.*}} : vector<16xf32> into f32
  %1 = vector.reduction <maximumf>, %arg0 : vector<16xf32> into f32
  // CHECK: return %[[X0]] : f32
  return %0 : f32
}

func.func @reduce_2x_f32(%arg0: vector<2xf32>) -> (f32, f32, f32, f32, f32, f32) {
  %0 = vector.reduction <add>, %arg0 : vector<2xf32> into f32
  %1 = vector.reduction <mul>, %arg0 : vector<2xf32> into f32
  %2 = vector.reduction <minnumf>, %arg0 : vector<2xf32> into f32
  %3 = vector.reduction <maxnumf>, %arg0 : vector<2xf32> into f32
  %4 = vector.reduction <minimumf>, %arg0 : vector<2xf32> into f32
  %5 = vector.reduction <maximumf>, %arg0 : vector<2xf32> into f32
  return %0, %1, %2, %3, %4, %5 : f32, f32, f32, f32, f32, f32
}
There's definitely more :) (*)
Thanks!
(*) To clarify, I am not suggesting that you add these new variants everywhere vector.reduction is used. However, please update tests that enumerate over all the existing options.
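
For example, a minimal sketch of the kind of enumerating round-trip test being suggested (the function name and exact placement are assumptions, not part of the PR):

// CHECK-LABEL: @reduce_int_xor
func.func @reduce_int_xor(%arg0: vector<16xi32>) -> i32 {
  // CHECK: vector.reduction <xor>, %{{.*}} : vector<16xi32> into i32
  %0 = vector.reduction <xor>, %arg0 : vector<16xi32> into i32
  return %0 : i32
}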
// CHECK-LABEL: func.func @vecdim_reduction_xori(
// CHECK-SAME: %[[ARG0:.*]]: memref<256x512xi32>,
// CHECK-SAME: %[[ARG1:.*]]: memref<256xi32>) {
// CHECK: %[[VAL_0:.*]] = arith.constant 0 : i32
// CHECK: affine.for %[[VAL_1:.*]] = 0 to 256 {
// CHECK: %[[VAL_2:.*]] = arith.constant dense<0> : vector<128xi32>
// CHECK: %[[VAL_3:.*]] = affine.for %[[VAL_4:.*]] = 0 to 512 step 128 iter_args(%[[VAL_5:.*]] = %[[VAL_2]]) -> (vector<128xi32>) {
// CHECK: %[[VAL_6:.*]] = ub.poison : i32
// CHECK: %[[VAL_7:.*]] = vector.transfer_read %[[ARG0]]{{\[}}%[[VAL_1]], %[[VAL_4]]], %[[VAL_6]] : memref<256x512xi32>, vector<128xi32>
// CHECK: %[[VAL_8:.*]] = arith.xori %[[VAL_5]], %[[VAL_7]] : vector<128xi32>
// CHECK: affine.yield %[[VAL_8]] : vector<128xi32>
// CHECK: }
// CHECK: %[[VAL_9:.*]] = vector.reduction <xor>, %[[VAL_3]] : vector<128xi32> into i32
// CHECK: affine.store %[[VAL_9]], %[[ARG1]]{{\[}}%[[VAL_1]]] : memref<256xi32>
// CHECK: }
// CHECK: return
// CHECK: }
Please use meaningful FileCheck variable names. See comprehensive guidelines here: https://mlir.llvm.org/getting_started/TestingGuide/
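
For illustration, a hedged sketch of the same @vecdim_reduction_xori checks with descriptive capture names (the specific names are suggestions, not taken from the PR):

// CHECK-LABEL: func.func @vecdim_reduction_xori(
// CHECK-SAME: %[[IN:.*]]: memref<256x512xi32>,
// CHECK-SAME: %[[OUT:.*]]: memref<256xi32>) {
// CHECK: %[[CST:.*]] = arith.constant 0 : i32
// CHECK: affine.for %[[I:.*]] = 0 to 256 {
// CHECK: %[[INIT:.*]] = arith.constant dense<0> : vector<128xi32>
// CHECK: %[[RED:.*]] = affine.for %[[J:.*]] = 0 to 512 step 128 iter_args(%[[ACC:.*]] = %[[INIT]]) -> (vector<128xi32>) {
// CHECK: %[[PAD:.*]] = ub.poison : i32
// CHECK: %[[LOAD:.*]] = vector.transfer_read %[[IN]]{{\[}}%[[I]], %[[J]]], %[[PAD]] : memref<256x512xi32>, vector<128xi32>
// CHECK: %[[XOR:.*]] = arith.xori %[[ACC]], %[[LOAD]] : vector<128xi32>
// CHECK: affine.yield %[[XOR]] : vector<128xi32>
// CHECK: }
// CHECK: %[[FINAL:.*]] = vector.reduction <xor>, %[[RED]] : vector<128xi32> into i32
// CHECK: affine.store %[[FINAL]], %[[OUT]]{{\[}}%[[I]]] : memref<256xi32>
// CHECK: }
// CHECK: return
// CHECK: }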
Thanks! Just to clarify: I'm new to MLIR, and I want to know what the suitable test coverage is for a single PR. Taking this PR as an example, should I focus on covering the tests in the Vector dialect and the places where vector.reduction is used?
LGTM. The SuperVectorizer is not under active development but pieces added to common infrastructure make sense to me.
That's a good question and the answer will vary from one PR to another.
Indeed. Specifically, when touching anything to do with the Op definition, I would expect to see tests in mlir/test/Dialect/Vector/ops.mlir (linked above), among others.
In addition, I would cover the conversion tests (unless that requires extra code that's not included in the PR). So yes, ideally you would add tests for LLVM and SPIR-V. Next, there are transformation tests; for those, I'd focus on tests that target … As for integration tests, these are (from https://mlir.llvm.org/getting_started/TestingGuide/#integration-tests): …
TBH, you are hitting a pretty rare case. In principle, most tests relevant to … Hopefully this makes sense and you don't mind the extra work :)
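
As a concrete illustration, a hedged sketch of what such a conversion test could look like, assuming the standard convert-vector-to-llvm pass and the existing lowering of <xor> to the LLVM vector-reduce intrinsic:

// RUN: mlir-opt %s -convert-vector-to-llvm | FileCheck %s

// CHECK-LABEL: @reduce_xor
// CHECK: llvm.intr.vector.reduce.xor
func.func @reduce_xor(%arg0: vector<16xi32>) -> i32 {
  %0 = vector.reduction <xor>, %arg0 : vector<16xi32> into i32
  return %0 : i32
}

And an integration-style sketch that actually executes the reduction (the RUN pipeline is elided, and vector.print of a scalar is assumed to be available); the expected output is 4, since 1 ^ 2 ^ 3 ^ 4 = 4:

func.func @entry() {
  %v = arith.constant dense<[1, 2, 3, 4]> : vector<4xi32>
  %r = vector.reduction <xor>, %v : vector<4xi32> into i32
  // CHECK: 4
  vector.print %r : i32
  return
}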
@banach-space Thank you for the reply! I was following commit 059ee5d, so I thought the tests were sufficient. Anyway, I'll spend some time covering enough tests for this PR. Thanks again!
Tests for … As for conversion tests to LLVM, they are already in place. For conversion to SPIR-V, …
Support vector reduction for xor/maxnumf/minnumf.