[RISCV][SLP][NFC] Add a test for satd-8x4 from x264 benchmark. #162542

mgudim · 2025-10-08T20:08:36Z

No description provided.

llvmbot · 2025-10-08T20:09:11Z

@llvm/pr-subscribers-backend-risc-v

Author: Mikhail Gudim (mgudim)

Changes

Patch is 32.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/162542.diff

1 Files Affected:

(added) llvm/test/Transforms/SLPVectorizer/RISCV/x264-satd-8x4.ll (+526)

diff --git a/llvm/test/Transforms/SLPVectorizer/RISCV/x264-satd-8x4.ll b/llvm/test/Transforms/SLPVectorizer/RISCV/x264-satd-8x4.ll
new file mode 100644
index 0000000000000..c1042f1842107
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/RISCV/x264-satd-8x4.ll
@@ -0,0 +1,526 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -mtriple=riscv64 -mattr=+m,+v,+unaligned-vector-mem \
+; RUN: -passes=slp-vectorizer -S < %s | FileCheck %s
+; Function Attrs: nounwind uwtable vscale_range(8,1024)
+define i32 @x264_pixel_satd_8x4(ptr %pix1, i32  %i_pix1, ptr  %pix2, i32  %i_pix2) {
+; CHECK-LABEL: define i32 @x264_pixel_satd_8x4(
+; CHECK-SAME: ptr [[PIX1:%.*]], i32 [[I_PIX1:%.*]], ptr [[PIX2:%.*]], i32 [[I_PIX2:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[IDX_EXT:%.*]] = sext i32 [[I_PIX1]] to i64
+; CHECK-NEXT:    [[IDX_EXT63:%.*]] = sext i32 [[I_PIX2]] to i64
+; CHECK-NEXT:    [[ARRAYIDX3:%.*]] = getelementptr inbounds nuw i8, ptr [[PIX1]], i64 4
+; CHECK-NEXT:    [[ARRAYIDX5:%.*]] = getelementptr inbounds nuw i8, ptr [[PIX2]], i64 4
+; CHECK-NEXT:    [[ADD_PTR:%.*]] = getelementptr inbounds i8, ptr [[PIX1]], i64 [[IDX_EXT]]
+; CHECK-NEXT:    [[ADD_PTR64:%.*]] = getelementptr inbounds i8, ptr [[PIX2]], i64 [[IDX_EXT63]]
+; CHECK-NEXT:    [[ARRAYIDX3_1:%.*]] = getelementptr inbounds nuw i8, ptr [[ADD_PTR]], i64 4
+; CHECK-NEXT:    [[ARRAYIDX5_1:%.*]] = getelementptr inbounds nuw i8, ptr [[ADD_PTR64]], i64 4
+; CHECK-NEXT:    [[ADD_PTR_1:%.*]] = getelementptr inbounds i8, ptr [[ADD_PTR]], i64 [[IDX_EXT]]
+; CHECK-NEXT:    [[ADD_PTR64_1:%.*]] = getelementptr inbounds i8, ptr [[ADD_PTR64]], i64 [[IDX_EXT63]]
+; CHECK-NEXT:    [[ARRAYIDX3_2:%.*]] = getelementptr inbounds nuw i8, ptr [[ADD_PTR_1]], i64 4
+; CHECK-NEXT:    [[ARRAYIDX5_2:%.*]] = getelementptr inbounds nuw i8, ptr [[ADD_PTR64_1]], i64 4
+; CHECK-NEXT:    [[ADD_PTR_2:%.*]] = getelementptr inbounds i8, ptr [[ADD_PTR_1]], i64 [[IDX_EXT]]
+; CHECK-NEXT:    [[ADD_PTR64_2:%.*]] = getelementptr inbounds i8, ptr [[ADD_PTR64_1]], i64 [[IDX_EXT63]]
+; CHECK-NEXT:    [[ARRAYIDX3_3:%.*]] = getelementptr inbounds nuw i8, ptr [[ADD_PTR_2]], i64 4
+; CHECK-NEXT:    [[ARRAYIDX5_3:%.*]] = getelementptr inbounds nuw i8, ptr [[ADD_PTR64_2]], i64 4
+; CHECK-NEXT:    [[TMP0:%.*]] = load <4 x i8>, ptr [[PIX1]], align 1
+; CHECK-NEXT:    [[TMP1:%.*]] = load <4 x i8>, ptr [[PIX2]], align 1
+; CHECK-NEXT:    [[TMP2:%.*]] = load <4 x i8>, ptr [[ARRAYIDX3]], align 1
+; CHECK-NEXT:    [[TMP3:%.*]] = load <4 x i8>, ptr [[ARRAYIDX5]], align 1
+; CHECK-NEXT:    [[TMP4:%.*]] = load <4 x i8>, ptr [[ADD_PTR]], align 1
+; CHECK-NEXT:    [[TMP5:%.*]] = load <4 x i8>, ptr [[ADD_PTR64]], align 1
+; CHECK-NEXT:    [[TMP6:%.*]] = load <4 x i8>, ptr [[ARRAYIDX3_1]], align 1
+; CHECK-NEXT:    [[TMP7:%.*]] = load <4 x i8>, ptr [[ARRAYIDX5_1]], align 1
+; CHECK-NEXT:    [[TMP8:%.*]] = load <4 x i8>, ptr [[ADD_PTR_1]], align 1
+; CHECK-NEXT:    [[TMP9:%.*]] = load <4 x i8>, ptr [[ADD_PTR64_1]], align 1
+; CHECK-NEXT:    [[TMP10:%.*]] = load <4 x i8>, ptr [[ARRAYIDX3_2]], align 1
+; CHECK-NEXT:    [[TMP11:%.*]] = load <4 x i8>, ptr [[ARRAYIDX5_2]], align 1
+; CHECK-NEXT:    [[TMP12:%.*]] = load <4 x i8>, ptr [[ADD_PTR_2]], align 1
+; CHECK-NEXT:    [[TMP13:%.*]] = shufflevector <4 x i8> [[TMP0]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP14:%.*]] = shufflevector <4 x i8> [[TMP4]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP15:%.*]] = shufflevector <4 x i8> [[TMP0]], <4 x i8> [[TMP4]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP16:%.*]] = shufflevector <4 x i8> [[TMP8]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP17:%.*]] = shufflevector <16 x i8> [[TMP15]], <16 x i8> [[TMP16]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 19, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP18:%.*]] = shufflevector <4 x i8> [[TMP12]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP19:%.*]] = shufflevector <16 x i8> [[TMP17]], <16 x i8> [[TMP18]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 16, i32 17, i32 18, i32 19>
+; CHECK-NEXT:    [[TMP20:%.*]] = zext <16 x i8> [[TMP19]] to <16 x i32>
+; CHECK-NEXT:    [[TMP21:%.*]] = load <4 x i8>, ptr [[ADD_PTR64_2]], align 1
+; CHECK-NEXT:    [[TMP22:%.*]] = shufflevector <4 x i8> [[TMP1]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP23:%.*]] = shufflevector <4 x i8> [[TMP5]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP24:%.*]] = shufflevector <4 x i8> [[TMP1]], <4 x i8> [[TMP5]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP25:%.*]] = shufflevector <4 x i8> [[TMP9]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP26:%.*]] = shufflevector <16 x i8> [[TMP24]], <16 x i8> [[TMP25]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 19, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP27:%.*]] = shufflevector <4 x i8> [[TMP21]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP28:%.*]] = shufflevector <16 x i8> [[TMP26]], <16 x i8> [[TMP27]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 16, i32 17, i32 18, i32 19>
+; CHECK-NEXT:    [[TMP29:%.*]] = zext <16 x i8> [[TMP28]] to <16 x i32>
+; CHECK-NEXT:    [[TMP30:%.*]] = sub nsw <16 x i32> [[TMP20]], [[TMP29]]
+; CHECK-NEXT:    [[TMP31:%.*]] = load <4 x i8>, ptr [[ARRAYIDX3_3]], align 1
+; CHECK-NEXT:    [[TMP32:%.*]] = shufflevector <4 x i8> [[TMP2]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP33:%.*]] = shufflevector <4 x i8> [[TMP6]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP34:%.*]] = shufflevector <4 x i8> [[TMP2]], <4 x i8> [[TMP6]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP35:%.*]] = shufflevector <4 x i8> [[TMP10]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP36:%.*]] = shufflevector <16 x i8> [[TMP34]], <16 x i8> [[TMP35]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 19, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP37:%.*]] = shufflevector <4 x i8> [[TMP31]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP38:%.*]] = shufflevector <16 x i8> [[TMP36]], <16 x i8> [[TMP37]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 16, i32 17, i32 18, i32 19>
+; CHECK-NEXT:    [[TMP39:%.*]] = zext <16 x i8> [[TMP38]] to <16 x i32>
+; CHECK-NEXT:    [[TMP40:%.*]] = load <4 x i8>, ptr [[ARRAYIDX5_3]], align 1
+; CHECK-NEXT:    [[TMP41:%.*]] = shufflevector <4 x i8> [[TMP3]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP42:%.*]] = shufflevector <4 x i8> [[TMP7]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP43:%.*]] = shufflevector <4 x i8> [[TMP3]], <4 x i8> [[TMP7]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP44:%.*]] = shufflevector <4 x i8> [[TMP11]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP45:%.*]] = shufflevector <16 x i8> [[TMP43]], <16 x i8> [[TMP44]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 19, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP46:%.*]] = shufflevector <4 x i8> [[TMP40]], <4 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+; CHECK-NEXT:    [[TMP47:%.*]] = shufflevector <16 x i8> [[TMP45]], <16 x i8> [[TMP46]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 16, i32 17, i32 18, i32 19>
+; CHECK-NEXT:    [[TMP48:%.*]] = zext <16 x i8> [[TMP47]] to <16 x i32>
+; CHECK-NEXT:    [[TMP49:%.*]] = sub nsw <16 x i32> [[TMP39]], [[TMP48]]
+; CHECK-NEXT:    [[TMP50:%.*]] = shl nsw <16 x i32> [[TMP49]], splat (i32 16)
+; CHECK-NEXT:    [[TMP51:%.*]] = add nsw <16 x i32> [[TMP50]], [[TMP30]]
+; CHECK-NEXT:    [[TMP52:%.*]] = shufflevector <16 x i32> [[TMP51]], <16 x i32> poison, <16 x i32> <i32 1, i32 0, i32 3, i32 2, i32 5, i32 4, i32 7, i32 6, i32 9, i32 8, i32 11, i32 10, i32 13, i32 12, i32 15, i32 14>
+; CHECK-NEXT:    [[TMP53:%.*]] = add nsw <16 x i32> [[TMP52]], [[TMP51]]
+; CHECK-NEXT:    [[TMP54:%.*]] = sub nsw <16 x i32> [[TMP52]], [[TMP51]]
+; CHECK-NEXT:    [[TMP55:%.*]] = shufflevector <16 x i32> [[TMP53]], <16 x i32> [[TMP54]], <16 x i32> <i32 0, i32 17, i32 2, i32 19, i32 4, i32 21, i32 6, i32 23, i32 8, i32 25, i32 10, i32 27, i32 12, i32 29, i32 14, i32 31>
+; CHECK-NEXT:    [[TMP56:%.*]] = shufflevector <16 x i32> [[TMP55]], <16 x i32> poison, <16 x i32> <i32 2, i32 3, i32 0, i32 1, i32 6, i32 7, i32 4, i32 5, i32 10, i32 11, i32 8, i32 9, i32 14, i32 15, i32 12, i32 13>
+; CHECK-NEXT:    [[TMP57:%.*]] = add nsw <16 x i32> [[TMP55]], [[TMP56]]
+; CHECK-NEXT:    [[TMP58:%.*]] = sub nsw <16 x i32> [[TMP55]], [[TMP56]]
+; CHECK-NEXT:    [[TMP59:%.*]] = shufflevector <16 x i32> [[TMP57]], <16 x i32> [[TMP58]], <16 x i32> <i32 16, i32 17, i32 2, i32 3, i32 20, i32 21, i32 6, i32 7, i32 24, i32 25, i32 10, i32 11, i32 28, i32 29, i32 14, i32 15>
+; CHECK-NEXT:    [[TMP60:%.*]] = shufflevector <16 x i32> [[TMP59]], <16 x i32> poison, <16 x i32> <i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 12, i32 13, i32 14, i32 15, i32 8, i32 9, i32 10, i32 11>
+; CHECK-NEXT:    [[TMP61:%.*]] = sub nsw <16 x i32> [[TMP59]], [[TMP60]]
+; CHECK-NEXT:    [[TMP62:%.*]] = add nsw <16 x i32> [[TMP59]], [[TMP60]]
+; CHECK-NEXT:    [[TMP63:%.*]] = shufflevector <16 x i32> [[TMP61]], <16 x i32> [[TMP62]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 20, i32 21, i32 22, i32 23, i32 8, i32 9, i32 10, i32 11, i32 28, i32 29, i32 30, i32 31>
+; CHECK-NEXT:    [[TMP64:%.*]] = shufflevector <16 x i32> [[TMP63]], <16 x i32> poison, <16 x i32> <i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP65:%.*]] = add nsw <16 x i32> [[TMP63]], [[TMP64]]
+; CHECK-NEXT:    [[TMP66:%.*]] = sub nsw <16 x i32> [[TMP63]], [[TMP64]]
+; CHECK-NEXT:    [[TMP67:%.*]] = shufflevector <16 x i32> [[TMP65]], <16 x i32> [[TMP66]], <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
+; CHECK-NEXT:    [[TMP68:%.*]] = lshr <16 x i32> [[TMP67]], splat (i32 15)
+; CHECK-NEXT:    [[TMP69:%.*]] = and <16 x i32> [[TMP68]], splat (i32 65537)
+; CHECK-NEXT:    [[TMP70:%.*]] = mul nuw <16 x i32> [[TMP69]], splat (i32 65535)
+; CHECK-NEXT:    [[TMP71:%.*]] = add <16 x i32> [[TMP70]], [[TMP67]]
+; CHECK-NEXT:    [[TMP72:%.*]] = xor <16 x i32> [[TMP71]], [[TMP70]]
+; CHECK-NEXT:    [[TMP73:%.*]] = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> [[TMP72]])
+; CHECK-NEXT:    [[CONV118:%.*]] = and i32 [[TMP73]], 65535
+; CHECK-NEXT:    [[SHR:%.*]] = lshr i32 [[TMP73]], 16
+; CHECK-NEXT:    [[ADD119:%.*]] = add nuw nsw i32 [[CONV118]], [[SHR]]
+; CHECK-NEXT:    [[SHR120:%.*]] = lshr i32 [[ADD119]], 1
+; CHECK-NEXT:    ret i32 [[SHR120]]
+;
+entry:
+  %idx.ext = sext i32 %i_pix1 to i64
+  %idx.ext63 = sext i32 %i_pix2 to i64
+  %0 = load i8, ptr %pix1, align 1
+  %conv = zext i8 %0 to i32
+  %1 = load i8, ptr %pix2, align 1
+  %conv2 = zext i8 %1 to i32
+  %sub = sub nsw i32 %conv, %conv2
+  %arrayidx3 = getelementptr inbounds nuw i8, ptr %pix1, i64 4
+  %2 = load i8, ptr %arrayidx3, align 1
+  %conv4 = zext i8 %2 to i32
+  %arrayidx5 = getelementptr inbounds nuw i8, ptr %pix2, i64 4
+  %3 = load i8, ptr %arrayidx5, align 1
+  %conv6 = zext i8 %3 to i32
+  %sub7 = sub nsw i32 %conv4, %conv6
+  %shl = shl nsw i32 %sub7, 16
+  %add = add nsw i32 %shl, %sub
+  %arrayidx8 = getelementptr inbounds nuw i8, ptr %pix1, i64 1
+  %4 = load i8, ptr %arrayidx8, align 1
+  %conv9 = zext i8 %4 to i32
+  %arrayidx10 = getelementptr inbounds nuw i8, ptr %pix2, i64 1
+  %5 = load i8, ptr %arrayidx10, align 1
+  %conv11 = zext i8 %5 to i32
+  %sub12 = sub nsw i32 %conv9, %conv11
+  %arrayidx13 = getelementptr inbounds nuw i8, ptr %pix1, i64 5
+  %6 = load i8, ptr %arrayidx13, align 1
+  %conv14 = zext i8 %6 to i32
+  %arrayidx15 = getelementptr inbounds nuw i8, ptr %pix2, i64 5
+  %7 = load i8, ptr %arrayidx15, align 1
+  %conv16 = zext i8 %7 to i32
+  %sub17 = sub nsw i32 %conv14, %conv16
+  %shl18 = shl nsw i32 %sub17, 16
+  %add19 = add nsw i32 %shl18, %sub12
+  %arrayidx20 = getelementptr inbounds nuw i8, ptr %pix1, i64 2
+  %8 = load i8, ptr %arrayidx20, align 1
+  %conv21 = zext i8 %8 to i32
+  %arrayidx22 = getelementptr inbounds nuw i8, ptr %pix2, i64 2
+  %9 = load i8, ptr %arrayidx22, align 1
+  %conv23 = zext i8 %9 to i32
+  %sub24 = sub nsw i32 %conv21, %conv23
+  %arrayidx25 = getelementptr inbounds nuw i8, ptr %pix1, i64 6
+  %10 = load i8, ptr %arrayidx25, align 1
+  %conv26 = zext i8 %10 to i32
+  %arrayidx27 = getelementptr inbounds nuw i8, ptr %pix2, i64 6
+  %11 = load i8, ptr %arrayidx27, align 1
+  %conv28 = zext i8 %11 to i32
+  %sub29 = sub nsw i32 %conv26, %conv28
+  %shl30 = shl nsw i32 %sub29, 16
+  %add31 = add nsw i32 %shl30, %sub24
+  %arrayidx32 = getelementptr inbounds nuw i8, ptr %pix1, i64 3
+  %12 = load i8, ptr %arrayidx32, align 1
+  %conv33 = zext i8 %12 to i32
+  %arrayidx34 = getelementptr inbounds nuw i8, ptr %pix2, i64 3
+  %13 = load i8, ptr %arrayidx34, align 1
+  %conv35 = zext i8 %13 to i32
+  %sub36 = sub nsw i32 %conv33, %conv35
+  %arrayidx37 = getelementptr inbounds nuw i8, ptr %pix1, i64 7
+  %14 = load i8, ptr %arrayidx37, align 1
+  %conv38 = zext i8 %14 to i32
+  %arrayidx39 = getelementptr inbounds nuw i8, ptr %pix2, i64 7
+  %15 = load i8, ptr %arrayidx39, align 1
+  %conv40 = zext i8 %15 to i32
+  %sub41 = sub nsw i32 %conv38, %conv40
+  %shl42 = shl nsw i32 %sub41, 16
+  %add43 = add nsw i32 %shl42, %sub36
+  %add44 = add nsw i32 %add19, %add
+  %sub45 = sub nsw i32 %add, %add19
+  %add46 = add nsw i32 %add43, %add31
+  %sub47 = sub nsw i32 %add31, %add43
+  %add48 = add nsw i32 %add46, %add44
+  %sub51 = sub nsw i32 %add44, %add46
+  %add55 = add nsw i32 %sub47, %sub45
+  %sub59 = sub nsw i32 %sub45, %sub47
+  %add.ptr = getelementptr inbounds i8, ptr %pix1, i64 %idx.ext
+  %add.ptr64 = getelementptr inbounds i8, ptr %pix2, i64 %idx.ext63
+  %16 = load i8, ptr %add.ptr, align 1
+  %conv.1 = zext i8 %16 to i32
+  %17 = load i8, ptr %add.ptr64, align 1
+  %conv2.1 = zext i8 %17 to i32
+  %sub.1 = sub nsw i32 %conv.1, %conv2.1
+  %arrayidx3.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 4
+  %18 = load i8, ptr %arrayidx3.1, align 1
+  %conv4.1 = zext i8 %18 to i32
+  %arrayidx5.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 4
+  %19 = load i8, ptr %arrayidx5.1, align 1
+  %conv6.1 = zext i8 %19 to i32
+  %sub7.1 = sub nsw i32 %conv4.1, %conv6.1
+  %shl.1 = shl nsw i32 %sub7.1, 16
+  %add.1 = add nsw i32 %shl.1, %sub.1
+  %arrayidx8.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 1
+  %20 = load i8, ptr %arrayidx8.1, align 1
+  %conv9.1 = zext i8 %20 to i32
+  %arrayidx10.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 1
+  %21 = load i8, ptr %arrayidx10.1, align 1
+  %conv11.1 = zext i8 %21 to i32
+  %sub12.1 = sub nsw i32 %conv9.1, %conv11.1
+  %arrayidx13.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 5
+  %22 = load i8, ptr %arrayidx13.1, align 1
+  %conv14.1 = zext i8 %22 to i32
+  %arrayidx15.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 5
+  %23 = load i8, ptr %arrayidx15.1, align 1
+  %conv16.1 = zext i8 %23 to i32
+  %sub17.1 = sub nsw i32 %conv14.1, %conv16.1
+  %shl18.1 = shl nsw i32 %sub17.1, 16
+  %add19.1 = add nsw i32 %shl18.1, %sub12.1
+  %arrayidx20.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 2
+  %24 = load i8, ptr %arrayidx20.1, align 1
+  %conv21.1 = zext i8 %24 to i32
+  %arrayidx22.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 2
+  %25 = load i8, ptr %arrayidx22.1, align 1
+  %conv23.1 = zext i8 %25 to i32
+  %sub24.1 = sub nsw i32 %conv21.1, %conv23.1
+  %arrayidx25.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 6
+  %26 = load i8, ptr %arrayidx25.1, align 1
+  %conv26.1 = zext i8 %26 to i32
+  %arrayidx27.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 6
+  %27 = load i8, ptr %arrayidx27.1, align 1
+  %conv28.1 = zext i8 %27 to i32
+  %sub29.1 = sub nsw i32 %conv26.1, %conv28.1
+  %shl30.1 = shl nsw i32 %sub29.1, 16
+  %add31.1 = add nsw i32 %shl30.1, %sub24.1
+  %arrayidx32.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 3
+  %28 = load i8, ptr %arrayidx32.1, align 1
+  %conv33.1 = zext i8 %28 to i32
+  %arrayidx34.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 3
+  %29 = load i8, ptr %arrayidx34.1, align 1
+  %conv35.1 = zext i8 %29 to i32
+  %sub36.1 = sub nsw i32 %conv33.1, %conv35.1
+  %arrayidx37.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 7
+  %30 = load i8, ptr %arrayidx37.1, align 1
+  %conv38.1 = zext i8 %30 to i32
+  %arrayidx39.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 7
+  %31 = load i8, ptr %arrayidx39.1, align 1
+  %conv40.1 = zext i8 %31 to i32
+  %sub41.1 = sub nsw i32 %conv38.1, %conv40.1
+  %shl42.1 = shl nsw i32 %sub41.1, 16
+  %add43.1 ...
[truncated]
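
For readers who do not know the kernel, the scalar IR in this test appears to compute an 8x4 SATD (sum of absolute transformed differences) while carrying two 16-bit partial results in each 32-bit value. The C sketch below is reconstructed from the IR and CHECK lines in the patch; it is not copied from the x264 sources. The names `pack_diffs`, `hadamard4`, `abs2` and `satd_8x4_sketch` are illustrative, and the second (column) butterfly pass is an assumption inferred from the vectorized CHECK lines, because the scalar body is truncated here.

```c
#include <stdint.h>

/* Pack two small signed differences into one 32-bit word: `lo` in bits
   0..15 and `hi` in bits 16..31 (the `shl ..., 16` followed by `add`). */
static uint32_t pack_diffs(int lo, int hi) {
    return (uint32_t)lo + ((uint32_t)hi << 16);
}

/* 4-point butterfly with the same add/sub pattern as %add44 .. %sub59. */
static void hadamard4(uint32_t d[4], uint32_t s0, uint32_t s1,
                      uint32_t s2, uint32_t s3) {
    uint32_t t0 = s0 + s1, t1 = s0 - s1;
    uint32_t t2 = s2 + s3, t3 = s2 - s3;
    d[0] = t0 + t2; d[2] = t0 - t2;
    d[1] = t1 + t3; d[3] = t1 - t3;
}

/* Absolute value of both 16-bit lanes at once. Mirrors the CHECK lines
   `lshr 15`, `and 65537`, `mul 65535`, `add`, `xor`: s holds 0xFFFF in
   each lane whose sign bit (bit 15 or bit 31) is set, and (a + s) ^ s
   negates exactly those lanes. */
static uint32_t abs2(uint32_t a) {
    uint32_t s = ((a >> 15) & 0x10001u) * 0xFFFFu;
    return (a + s) ^ s;
}

/* Hypothetical scalar reference; strides are in bytes, as in the test. */
int satd_8x4_sketch(const uint8_t *pix1, int i_pix1,
                    const uint8_t *pix2, int i_pix2) {
    uint32_t tmp[4][4];
    uint32_t sum = 0;
    /* Row pass: columns k and k+4 share one packed word. */
    for (int i = 0; i < 4; i++, pix1 += i_pix1, pix2 += i_pix2) {
        uint32_t a[4];
        for (int k = 0; k < 4; k++)
            a[k] = pack_diffs(pix1[k] - pix2[k], pix1[k + 4] - pix2[k + 4]);
        hadamard4(tmp[i], a[0], a[1], a[2], a[3]);
    }
    /* Column pass (assumed), then accumulate packed absolute values. */
    for (int i = 0; i < 4; i++) {
        uint32_t d[4];
        hadamard4(d, tmp[0][i], tmp[1][i], tmp[2][i], tmp[3][i]);
        sum += abs2(d[0]) + abs2(d[1]) + abs2(d[2]) + abs2(d[3]);
    }
    /* Fold the two 16-bit halves and halve, matching the final
       `and 65535`, `lshr 16`, `add`, `lshr 1` sequence. */
    return (int)(((sum & 0xFFFFu) + (sum >> 16)) >> 1);
}
```

The vectorized CHECK lines express the same steps on a single `<16 x i32>` value: each butterfly stage becomes one `add`, one `sub`, and a `shufflevector` that picks lanes from the two results.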

llvmbot · 2025-10-08T20:09:11Z

@llvm/pr-subscribers-llvm-transforms

+; CHECK-NEXT:    [[TMP64:%.*]] = shufflevector <16 x i32> [[TMP63]], <16 x i32> poison, <16 x i32> <i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP65:%.*]] = add nsw <16 x i32> [[TMP63]], [[TMP64]]
+; CHECK-NEXT:    [[TMP66:%.*]] = sub nsw <16 x i32> [[TMP63]], [[TMP64]]
+; CHECK-NEXT:    [[TMP67:%.*]] = shufflevector <16 x i32> [[TMP65]], <16 x i32> [[TMP66]], <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
+; CHECK-NEXT:    [[TMP68:%.*]] = lshr <16 x i32> [[TMP67]], splat (i32 15)
+; CHECK-NEXT:    [[TMP69:%.*]] = and <16 x i32> [[TMP68]], splat (i32 65537)
+; CHECK-NEXT:    [[TMP70:%.*]] = mul nuw <16 x i32> [[TMP69]], splat (i32 65535)
+; CHECK-NEXT:    [[TMP71:%.*]] = add <16 x i32> [[TMP70]], [[TMP67]]
+; CHECK-NEXT:    [[TMP72:%.*]] = xor <16 x i32> [[TMP71]], [[TMP70]]
+; CHECK-NEXT:    [[TMP73:%.*]] = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> [[TMP72]])
+; CHECK-NEXT:    [[CONV118:%.*]] = and i32 [[TMP73]], 65535
+; CHECK-NEXT:    [[SHR:%.*]] = lshr i32 [[TMP73]], 16
+; CHECK-NEXT:    [[ADD119:%.*]] = add nuw nsw i32 [[CONV118]], [[SHR]]
+; CHECK-NEXT:    [[SHR120:%.*]] = lshr i32 [[ADD119]], 1
+; CHECK-NEXT:    ret i32 [[SHR120]]
+;
+entry:
+  %idx.ext = sext i32 %i_pix1 to i64
+  %idx.ext63 = sext i32 %i_pix2 to i64
+  %0 = load i8, ptr %pix1, align 1
+  %conv = zext i8 %0 to i32
+  %1 = load i8, ptr %pix2, align 1
+  %conv2 = zext i8 %1 to i32
+  %sub = sub nsw i32 %conv, %conv2
+  %arrayidx3 = getelementptr inbounds nuw i8, ptr %pix1, i64 4
+  %2 = load i8, ptr %arrayidx3, align 1
+  %conv4 = zext i8 %2 to i32
+  %arrayidx5 = getelementptr inbounds nuw i8, ptr %pix2, i64 4
+  %3 = load i8, ptr %arrayidx5, align 1
+  %conv6 = zext i8 %3 to i32
+  %sub7 = sub nsw i32 %conv4, %conv6
+  %shl = shl nsw i32 %sub7, 16
+  %add = add nsw i32 %shl, %sub
+  %arrayidx8 = getelementptr inbounds nuw i8, ptr %pix1, i64 1
+  %4 = load i8, ptr %arrayidx8, align 1
+  %conv9 = zext i8 %4 to i32
+  %arrayidx10 = getelementptr inbounds nuw i8, ptr %pix2, i64 1
+  %5 = load i8, ptr %arrayidx10, align 1
+  %conv11 = zext i8 %5 to i32
+  %sub12 = sub nsw i32 %conv9, %conv11
+  %arrayidx13 = getelementptr inbounds nuw i8, ptr %pix1, i64 5
+  %6 = load i8, ptr %arrayidx13, align 1
+  %conv14 = zext i8 %6 to i32
+  %arrayidx15 = getelementptr inbounds nuw i8, ptr %pix2, i64 5
+  %7 = load i8, ptr %arrayidx15, align 1
+  %conv16 = zext i8 %7 to i32
+  %sub17 = sub nsw i32 %conv14, %conv16
+  %shl18 = shl nsw i32 %sub17, 16
+  %add19 = add nsw i32 %shl18, %sub12
+  %arrayidx20 = getelementptr inbounds nuw i8, ptr %pix1, i64 2
+  %8 = load i8, ptr %arrayidx20, align 1
+  %conv21 = zext i8 %8 to i32
+  %arrayidx22 = getelementptr inbounds nuw i8, ptr %pix2, i64 2
+  %9 = load i8, ptr %arrayidx22, align 1
+  %conv23 = zext i8 %9 to i32
+  %sub24 = sub nsw i32 %conv21, %conv23
+  %arrayidx25 = getelementptr inbounds nuw i8, ptr %pix1, i64 6
+  %10 = load i8, ptr %arrayidx25, align 1
+  %conv26 = zext i8 %10 to i32
+  %arrayidx27 = getelementptr inbounds nuw i8, ptr %pix2, i64 6
+  %11 = load i8, ptr %arrayidx27, align 1
+  %conv28 = zext i8 %11 to i32
+  %sub29 = sub nsw i32 %conv26, %conv28
+  %shl30 = shl nsw i32 %sub29, 16
+  %add31 = add nsw i32 %shl30, %sub24
+  %arrayidx32 = getelementptr inbounds nuw i8, ptr %pix1, i64 3
+  %12 = load i8, ptr %arrayidx32, align 1
+  %conv33 = zext i8 %12 to i32
+  %arrayidx34 = getelementptr inbounds nuw i8, ptr %pix2, i64 3
+  %13 = load i8, ptr %arrayidx34, align 1
+  %conv35 = zext i8 %13 to i32
+  %sub36 = sub nsw i32 %conv33, %conv35
+  %arrayidx37 = getelementptr inbounds nuw i8, ptr %pix1, i64 7
+  %14 = load i8, ptr %arrayidx37, align 1
+  %conv38 = zext i8 %14 to i32
+  %arrayidx39 = getelementptr inbounds nuw i8, ptr %pix2, i64 7
+  %15 = load i8, ptr %arrayidx39, align 1
+  %conv40 = zext i8 %15 to i32
+  %sub41 = sub nsw i32 %conv38, %conv40
+  %shl42 = shl nsw i32 %sub41, 16
+  %add43 = add nsw i32 %shl42, %sub36
+  %add44 = add nsw i32 %add19, %add
+  %sub45 = sub nsw i32 %add, %add19
+  %add46 = add nsw i32 %add43, %add31
+  %sub47 = sub nsw i32 %add31, %add43
+  %add48 = add nsw i32 %add46, %add44
+  %sub51 = sub nsw i32 %add44, %add46
+  %add55 = add nsw i32 %sub47, %sub45
+  %sub59 = sub nsw i32 %sub45, %sub47
+  %add.ptr = getelementptr inbounds i8, ptr %pix1, i64 %idx.ext
+  %add.ptr64 = getelementptr inbounds i8, ptr %pix2, i64 %idx.ext63
+  %16 = load i8, ptr %add.ptr, align 1
+  %conv.1 = zext i8 %16 to i32
+  %17 = load i8, ptr %add.ptr64, align 1
+  %conv2.1 = zext i8 %17 to i32
+  %sub.1 = sub nsw i32 %conv.1, %conv2.1
+  %arrayidx3.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 4
+  %18 = load i8, ptr %arrayidx3.1, align 1
+  %conv4.1 = zext i8 %18 to i32
+  %arrayidx5.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 4
+  %19 = load i8, ptr %arrayidx5.1, align 1
+  %conv6.1 = zext i8 %19 to i32
+  %sub7.1 = sub nsw i32 %conv4.1, %conv6.1
+  %shl.1 = shl nsw i32 %sub7.1, 16
+  %add.1 = add nsw i32 %shl.1, %sub.1
+  %arrayidx8.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 1
+  %20 = load i8, ptr %arrayidx8.1, align 1
+  %conv9.1 = zext i8 %20 to i32
+  %arrayidx10.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 1
+  %21 = load i8, ptr %arrayidx10.1, align 1
+  %conv11.1 = zext i8 %21 to i32
+  %sub12.1 = sub nsw i32 %conv9.1, %conv11.1
+  %arrayidx13.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 5
+  %22 = load i8, ptr %arrayidx13.1, align 1
+  %conv14.1 = zext i8 %22 to i32
+  %arrayidx15.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 5
+  %23 = load i8, ptr %arrayidx15.1, align 1
+  %conv16.1 = zext i8 %23 to i32
+  %sub17.1 = sub nsw i32 %conv14.1, %conv16.1
+  %shl18.1 = shl nsw i32 %sub17.1, 16
+  %add19.1 = add nsw i32 %shl18.1, %sub12.1
+  %arrayidx20.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 2
+  %24 = load i8, ptr %arrayidx20.1, align 1
+  %conv21.1 = zext i8 %24 to i32
+  %arrayidx22.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 2
+  %25 = load i8, ptr %arrayidx22.1, align 1
+  %conv23.1 = zext i8 %25 to i32
+  %sub24.1 = sub nsw i32 %conv21.1, %conv23.1
+  %arrayidx25.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 6
+  %26 = load i8, ptr %arrayidx25.1, align 1
+  %conv26.1 = zext i8 %26 to i32
+  %arrayidx27.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 6
+  %27 = load i8, ptr %arrayidx27.1, align 1
+  %conv28.1 = zext i8 %27 to i32
+  %sub29.1 = sub nsw i32 %conv26.1, %conv28.1
+  %shl30.1 = shl nsw i32 %sub29.1, 16
+  %add31.1 = add nsw i32 %shl30.1, %sub24.1
+  %arrayidx32.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 3
+  %28 = load i8, ptr %arrayidx32.1, align 1
+  %conv33.1 = zext i8 %28 to i32
+  %arrayidx34.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 3
+  %29 = load i8, ptr %arrayidx34.1, align 1
+  %conv35.1 = zext i8 %29 to i32
+  %sub36.1 = sub nsw i32 %conv33.1, %conv35.1
+  %arrayidx37.1 = getelementptr inbounds nuw i8, ptr %add.ptr, i64 7
+  %30 = load i8, ptr %arrayidx37.1, align 1
+  %conv38.1 = zext i8 %30 to i32
+  %arrayidx39.1 = getelementptr inbounds nuw i8, ptr %add.ptr64, i64 7
+  %31 = load i8, ptr %arrayidx39.1, align 1
+  %conv40.1 = zext i8 %31 to i32
+  %sub41.1 = sub nsw i32 %conv38.1, %conv40.1
+  %shl42.1 = shl nsw i32 %sub41.1, 16
+  %add43.1 ...
[truncated]
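
For readers unfamiliar with the kernel, the scalar IR in this test appears to compute an 8x4 sum of absolute transformed differences (SATD): each row packs two pixel differences into one 32-bit word (`shl ... 16` then `add`), runs add/sub butterflies horizontally and then vertically, takes a per-half absolute value (the `lshr 15` / `and 65537` / `mul 65535` / `add` / `xor` sequence in the CHECK lines), and finally folds the two 16-bit halves and halves the sum. The Python sketch below models that arithmetic under stated assumptions: it takes rows as lists instead of a pointer plus stride, emulates `i32` wraparound with a mask, and all function names are illustrative rather than taken from x264 or from this patch. An unpacked reference is included only to cross-check the packed version.

```python
MASK32 = 0xFFFFFFFF  # emulate i32 wraparound on Python integers


def abs2(a):
    """Absolute value of each 16-bit half of a packed 32-bit word."""
    # s has 0xFFFF in every half whose sign bit (bit 15 or bit 31) is set.
    s = (((a >> 15) & 0x00010001) * 0xFFFF) & MASK32
    return ((a + s) & MASK32) ^ s


def hadamard4(s0, s1, s2, s3):
    """4-point add/sub butterfly on packed words; returns (d0, d1, d2, d3)."""
    t0, t1 = (s0 + s1) & MASK32, (s0 - s1) & MASK32
    t2, t3 = (s2 + s3) & MASK32, (s2 - s3) & MASK32
    return ((t0 + t2) & MASK32, (t1 + t3) & MASK32,
            (t0 - t2) & MASK32, (t1 - t3) & MASK32)


def satd_8x4_packed(pix1, pix2):
    """pix1, pix2: 4 rows of 8 pixel values, each in 0..255."""
    rows = []
    for r1, r2 in zip(pix1, pix2):
        # Columns j and j+4 share one word: low half and high half.
        packed = [((r1[j] - r2[j]) + ((r1[j + 4] - r2[j + 4]) << 16)) & MASK32
                  for j in range(4)]
        rows.append(hadamard4(*packed))
    total = 0
    for col in range(4):
        for v in hadamard4(*(rows[r][col] for r in range(4))):
            total = (total + abs2(v)) & MASK32
    # Fold the two 16-bit partial sums, then halve.
    return ((total & 0xFFFF) + (total >> 16)) >> 1


def satd_8x4_reference(pix1, pix2):
    """Unpacked cross-check: two independent 4x4 transforms, no bit tricks."""
    def h4(a, b, c, d):
        t0, t1, t2, t3 = a + b, a - b, c + d, c - d
        return (t0 + t2, t1 + t3, t0 - t2, t1 - t3)

    total = 0
    for off in (0, 4):
        rows = [h4(*(r1[off + j] - r2[off + j] for j in range(4)))
                for r1, r2 in zip(pix1, pix2)]
        for col in range(4):
            total += sum(abs(v) for v in h4(*(rows[r][col] for r in range(4))))
    return total >> 1
```

Packing two lanes per word is what lets the scalar code do one add or subtract for two columns at once; the vectorized CHECK lines reflect the same packing in `<16 x i32>` form, with shuffles selecting between the add and sub results at each butterfly stage.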

topperc · 2025-10-08T20:16:54Z

Please put [SLP] in the title

llvm-ci · 2025-10-10T21:29:05Z

LLVM Buildbot has detected a new failure on builder hip-third-party-libs-test running on ext_buildbot_hw_05-hip-docker while building llvm at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/206/builds/7380

Here is the relevant piece of the build log for reference:

Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/hip-tpl.py --jobs=32' (failure)
...
[5499/8112] Linking CXX shared library lib/libMLIRIndexToLLVM.so.22.0git
[5500/8112] Creating library symlink lib/libMLIRIndexToLLVM.so
[5501/8112] Linking CXX shared library lib/libMLIRAMDGPUToROCDL.so.22.0git
[5502/8112] Creating library symlink lib/libMLIRAMDGPUToROCDL.so
[5503/8112] Linking CXX shared library lib/libMLIRComplexToLLVM.so.22.0git
[5504/8112] Linking CXX shared library lib/libMLIRMPIToLLVM.so.22.0git
[5505/8112] Creating library symlink lib/libMLIRComplexToLLVM.so
[5506/8112] Creating library symlink lib/libMLIRMPIToLLVM.so
[5507/8112] Creating library symlink lib/libMLIRAMDGPUTransforms.so
[5508/8112] Linking CXX shared library lib/libMLIRMathToXeVM.so.22.0git
FAILED: lib/libMLIRMathToXeVM.so.22.0git 
: && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-array-bounds -Wno-stringop-overread -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wundef -Wno-unused-but-set-parameter -Wno-deprecated-copy -O3 -DNDEBUG  -Wl,-z,defs -Wl,-z,nodelete   -Wl,-rpath-link,/home/botworker/bbot/hip-third-party-libs-test/build/./lib  -Wl,--gc-sections -shared -Wl,-soname,libMLIRMathToXeVM.so.22.0git -o lib/libMLIRMathToXeVM.so.22.0git tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o  -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/hip-third-party-libs-test/build/lib:"  lib/libMLIRMathDialect.so.22.0git  lib/libMLIRLLVMCommonConversion.so.22.0git  lib/libMLIRVectorDialect.so.22.0git  lib/libMLIRLLVMDialect.so.22.0git  lib/libLLVMCore.so.22.0git  lib/libMLIRPtrMemorySpaceInterfaces.so.22.0git  lib/libLLVMBinaryFormat.so.22.0git  lib/libMLIRTransforms.so.22.0git  lib/libMLIRTransformUtils.so.22.0git  lib/libMLIRSubsetOpInterface.so.22.0git  lib/libMLIRRewrite.so.22.0git  lib/libMLIRRewritePDL.so.22.0git  lib/libMLIRPDLToPDLInterp.so.22.0git  lib/libMLIRPass.so.22.0git  lib/libMLIRPDLInterpDialect.so.22.0git  lib/libMLIRPDLDialect.so.22.0git  lib/libMLIRIndexingMapOpInterface.so.22.0git  lib/libMLIRMaskableOpInterface.so.22.0git  lib/libMLIRMaskingOpInterface.so.22.0git  lib/libMLIRTensorDialect.so.22.0git  lib/libMLIRAffineDialect.so.22.0git  lib/libMLIRMemRefDialect.so.22.0git  lib/libMLIRMemorySlotInterfaces.so.22.0git  lib/libMLIRMemOpInterfaces.so.22.0git  lib/libMLIRRuntimeVerifiableOpInterface.so.22.0git  
lib/libMLIRArithUtils.so.22.0git  lib/libMLIRDialectUtils.so.22.0git  lib/libMLIRComplexDialect.so.22.0git  lib/libMLIRArithDialect.so.22.0git  lib/libMLIRUBDialect.so.22.0git  lib/libMLIRCastInterfaces.so.22.0git  lib/libMLIRInferIntRangeCommon.so.22.0git  lib/libMLIRShapedOpInterfaces.so.22.0git  lib/libMLIRDialect.so.22.0git  lib/libMLIRParallelCombiningOpInterface.so.22.0git  lib/libMLIRValueBoundsOpInterface.so.22.0git  lib/libMLIRAnalysis.so.22.0git  lib/libMLIRControlFlowInterfaces.so.22.0git  lib/libMLIRLoopLikeInterface.so.22.0git  lib/libMLIRFunctionInterfaces.so.22.0git  lib/libMLIRCallInterfaces.so.22.0git  lib/libMLIRSideEffectInterfaces.so.22.0git  lib/libMLIRDataLayoutInterfaces.so.22.0git  lib/libMLIRInferIntRangeInterface.so.22.0git  lib/libMLIRInferTypeOpInterface.so.22.0git  lib/libMLIRPresburger.so.22.0git  lib/libMLIRViewLikeInterface.so.22.0git  lib/libMLIRDestinationStyleOpInterface.so.22.0git  lib/libMLIRVectorInterfaces.so.22.0git  lib/libMLIRIR.so.22.0git  lib/libMLIRSupport.so.22.0git  lib/libLLVMSupport.so.22.0git  -Wl,-rpath-link,/home/botworker/bbot/hip-third-party-libs-test/build/lib && :
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o: in function `mlir::impl::ConvertMathToXeVMBase<(anonymous namespace)::ConvertMathToXeVMPass>::getDependentDialects(mlir::DialectRegistry&) const':
MathToXeVM.cpp:(.text._ZNK4mlir4impl21ConvertMathToXeVMBaseIN12_GLOBAL__N_121ConvertMathToXeVMPassEE20getDependentDialectsERNS_15DialectRegistryE+0xa3): undefined reference to `mlir::detail::TypeIDResolver<mlir::xevm::XeVMDialect, void>::id'
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o: in function `std::_Function_handler<mlir::Dialect* (mlir::MLIRContext*), mlir::DialectRegistry::insert<mlir::xevm::XeVMDialect>()::{lambda(mlir::MLIRContext*)#1}>::_M_invoke(std::_Any_data const&, mlir::MLIRContext*&&)':
MathToXeVM.cpp:(.text._ZNSt17_Function_handlerIFPN4mlir7DialectEPNS0_11MLIRContextEEZNS0_15DialectRegistry6insertINS0_4xevm11XeVMDialectEEEvvEUlS4_E_E9_M_invokeERKSt9_Any_dataOS4_[_ZNSt17_Function_handlerIFPN4mlir7DialectEPNS0_11MLIRContextEEZNS0_15DialectRegistry6insertINS0_4xevm11XeVMDialectEEEvvEUlS4_E_E9_M_invokeERKSt9_Any_dataOS4_]+0x13): undefined reference to `mlir::detail::TypeIDResolver<mlir::xevm::XeVMDialect, void>::id'
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o: in function `std::unique_ptr<mlir::Dialect, std::default_delete<mlir::Dialect> > llvm::function_ref<std::unique_ptr<mlir::Dialect, std::default_delete<mlir::Dialect> > ()>::callback_fn<mlir::MLIRContext::getOrLoadDialect<mlir::xevm::XeVMDialect>()::{lambda()#1}>(long)':
MathToXeVM.cpp:(.text._ZN4llvm12function_refIFSt10unique_ptrIN4mlir7DialectESt14default_deleteIS3_EEvEE11callback_fnIZNS2_11MLIRContext16getOrLoadDialectINS2_4xevm11XeVMDialectEEEPT_vEUlvE_EES6_l[_ZN4llvm12function_refIFSt10unique_ptrIN4mlir7DialectESt14default_deleteIS3_EEvEE11callback_fnIZNS2_11MLIRContext16getOrLoadDialectINS2_4xevm11XeVMDialectEEEPT_vEUlvE_EES6_l]+0x23): undefined reference to `mlir::xevm::XeVMDialect::XeVMDialect(mlir::MLIRContext*)'
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o: in function `ConvertNativeFuncPattern<mlir::arith::DivFOp>::matchAndRewrite(mlir::arith::DivFOp, mlir::arith::DivFOpAdaptor, mlir::ConversionPatternRewriter&) const':
MathToXeVM.cpp:(.text._ZNK24ConvertNativeFuncPatternIN4mlir5arith6DivFOpEE15matchAndRewriteES2_NS1_13DivFOpAdaptorERNS0_25ConversionPatternRewriterE[_ZNK24ConvertNativeFuncPatternIN4mlir5arith6DivFOpEE15matchAndRewriteES2_NS1_13DivFOpAdaptorERNS0_25ConversionPatternRewriterE]+0x606): undefined reference to `mlir::arith::convertArithFastMathAttrToLLVM(mlir::arith::FastMathFlagsAttr)'
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o: in function `ConvertNativeFuncPattern<mlir::math::TanOp>::matchAndRewrite(mlir::math::TanOp, mlir::math::TanOpAdaptor, mlir::ConversionPatternRewriter&) const':
MathToXeVM.cpp:(.text._ZNK24ConvertNativeFuncPatternIN4mlir4math5TanOpEE15matchAndRewriteES2_NS1_12TanOpAdaptorERNS0_25ConversionPatternRewriterE[_ZNK24ConvertNativeFuncPatternIN4mlir4math5TanOpEE15matchAndRewriteES2_NS1_12TanOpAdaptorERNS0_25ConversionPatternRewriterE]+0x606): undefined reference to `mlir::arith::convertArithFastMathAttrToLLVM(mlir::arith::FastMathFlagsAttr)'
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o: in function `ConvertNativeFuncPattern<mlir::math::SqrtOp>::matchAndRewrite(mlir::math::SqrtOp, mlir::math::SqrtOpAdaptor, mlir::ConversionPatternRewriter&) const':
MathToXeVM.cpp:(.text._ZNK24ConvertNativeFuncPatternIN4mlir4math6SqrtOpEE15matchAndRewriteES2_NS1_13SqrtOpAdaptorERNS0_25ConversionPatternRewriterE[_ZNK24ConvertNativeFuncPatternIN4mlir4math6SqrtOpEE15matchAndRewriteES2_NS1_13SqrtOpAdaptorERNS0_25ConversionPatternRewriterE]+0x606): undefined reference to `mlir::arith::convertArithFastMathAttrToLLVM(mlir::arith::FastMathFlagsAttr)'
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o: in function `ConvertNativeFuncPattern<mlir::math::SinOp>::matchAndRewrite(mlir::math::SinOp, mlir::math::SinOpAdaptor, mlir::ConversionPatternRewriter&) const':
MathToXeVM.cpp:(.text._ZNK24ConvertNativeFuncPatternIN4mlir4math5SinOpEE15matchAndRewriteES2_NS1_12SinOpAdaptorERNS0_25ConversionPatternRewriterE[_ZNK24ConvertNativeFuncPatternIN4mlir4math5SinOpEE15matchAndRewriteES2_NS1_12SinOpAdaptorERNS0_25ConversionPatternRewriterE]+0x606): undefined reference to `mlir::arith::convertArithFastMathAttrToLLVM(mlir::arith::FastMathFlagsAttr)'
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o: in function `ConvertNativeFuncPattern<mlir::math::RsqrtOp>::matchAndRewrite(mlir::math::RsqrtOp, mlir::math::RsqrtOpAdaptor, mlir::ConversionPatternRewriter&) const':
MathToXeVM.cpp:(.text._ZNK24ConvertNativeFuncPatternIN4mlir4math7RsqrtOpEE15matchAndRewriteES2_NS1_14RsqrtOpAdaptorERNS0_25ConversionPatternRewriterE[_ZNK24ConvertNativeFuncPatternIN4mlir4math7RsqrtOpEE15matchAndRewriteES2_NS1_14RsqrtOpAdaptorERNS0_25ConversionPatternRewriterE]+0x606): undefined reference to `mlir::arith::convertArithFastMathAttrToLLVM(mlir::arith::FastMathFlagsAttr)'
/usr/bin/ld: tools/mlir/lib/Conversion/MathToXeVM/CMakeFiles/obj.MLIRMathToXeVM.dir/MathToXeVM.cpp.o:MathToXeVM.cpp:(.text._ZNK24ConvertNativeFuncPatternIN4mlir4math6PowFOpEE15matchAndRewriteES2_NS1_13PowFOpAdaptorERNS0_25ConversionPatternRewriterE[_ZNK24ConvertNativeFuncPatternIN4mlir4math6PowFOpEE15matchAndRewriteES2_NS1_13PowFOpAdaptorERNS0_25ConversionPatternRewriterE]+0x606): more undefined references to `mlir::arith::convertArithFastMathAttrToLLVM(mlir::arith::FastMathFlagsAttr)' follow
collect2: error: ld returned 1 exit status
[5509/8112] Linking CXX shared library lib/libMLIRSCFToEmitC.so.22.0git
[5510/8112] Linking CXX shared library lib/libMLIRMemRefToLLVM.so.22.0git
[5511/8112] Linking CXX shared library lib/libMLIRPtrToLLVM.so.22.0git
[5512/8112] Building CXX object tools/mlir/lib/Conversion/SCFToGPU/CMakeFiles/obj.MLIRSCFToGPU.dir/SCFToGPU.cpp.o
[5513/8112] Linking CXX shared library lib/libMLIRUBToLLVM.so.22.0git
[5514/8112] Linking CXX shared library lib/libMLIRMathToLLVM.so.22.0git
[5515/8112] Linking CXX shared library lib/libMLIRVectorToArmSME.so.22.0git
[5516/8112] Linking CXX shared library lib/libMLIRXeVMToLLVM.so.22.0git
[5517/8112] Linking CXX shared library lib/libMLIRArmNeonTransforms.so.22.0git
[5518/8112] Linking CXX shared library lib/libMLIRAMXDialect.so.22.0git
[5519/8112] Building CXX object tools/mlir/lib/Dialect/Linalg/Transforms/CMakeFiles/obj.MLIRLinalgTransforms.dir/Transforms.cpp.o
[5520/8112] Building CXX object tools/mlir/lib/Dialect/Linalg/Transforms/CMakeFiles/obj.MLIRLinalgTransforms.dir/Vectorization.cpp.o
[5521/8112] Building CXX object tools/mlir/lib/Conversion/GPUCommon/CMakeFiles/obj.MLIRGPUToGPURuntimeTransforms.dir/GPUOpsLowering.cpp.o
[5522/8112] Building CXX object tools/mlir/test/lib/Dialect/ControlFlow/CMakeFiles/MLIRControlFlowTestPasses.dir/TestAssert.cpp.o
[5523/8112] Building CXX object tools/mlir/test/lib/Dialect/Linalg/CMakeFiles/MLIRLinalgTestPasses.dir/TestPadFusion.cpp.o
[5524/8112] Building CXX object tools/mlir/test/lib/Conversion/MemRefToLLVM/CMakeFiles/MLIRTestMemRefToLLVMWithTransforms.dir/TestMemRefToLLVMWithTransforms.cpp.o
[5525/8112] Building CXX object tools/mlir/lib/Dialect/GPU/Pipelines/CMakeFiles/obj.MLIRGPUPipelines.dir/GPUToNVVMPipeline.cpp.o
[5526/8112] Building CXX object tools/mlir/lib/Conversion/GPUToNVVM/CMakeFiles/obj.MLIRGPUToNVVMTransforms.dir/WmmaOpsToNvvm.cpp.o
[5527/8112] Building CXX object tools/mlir/lib/Conversion/GPUToLLVMSPV/CMakeFiles/obj.MLIRGPUToLLVMSPV.dir/GPUToLLVMSPV.cpp.o
Step 7 (build cmake config) failure: build cmake config (failure)

…62542) Precommit a test.

mgudim requested a review from alexey-bataev October 8, 2025 20:08

llvmbot added backend:RISC-V llvm:transforms labels Oct 8, 2025

mgudim mentioned this pull request Oct 8, 2025

Widen rt stride loads #162336 (Open)

mgudim changed the title ~~[RISCV] Add a test for satd-8x4 from x264 benchmark.~~ [RISCV][SLP]Add a test for satd-8x4 from x264 benchmark. Oct 8, 2025

[RISCV][SLP] Add a test for satd-8x4 from x264 benchmark.

9f0bfeb

mgudim force-pushed the x264_test branch from 612a4a3 to 9f0bfeb Compare October 8, 2025 20:20

alexey-bataev changed the title ~~[RISCV][SLP]Add a test for satd-8x4 from x264 benchmark.~~ [RISCV][SLP[NFC]]Add a test for satd-8x4 from x264 benchmark. Oct 8, 2025

alexey-bataev approved these changes Oct 8, 2025

alexey-bataev changed the title ~~[RISCV][SLP[NFC]]Add a test for satd-8x4 from x264 benchmark.~~ [RISCV][SLP][NFC]Add a test for satd-8x4 from x264 benchmark. Oct 8, 2025

Merge branch 'main' into x264_test

3d684a1

mgudim enabled auto-merge (squash) October 8, 2025 20:55

mgudim added 3 commits October 8, 2025 17:00

Merge branch 'main' into x264_test

9e7fabb

Merge branch 'main' into x264_test

a57a5b4

Merge branch 'main' into x264_test

6feacac

mgudim merged commit 004270d into llvm:main Oct 10, 2025
9 of 10 checks passed

DharuniRAcharya pushed a commit to DharuniRAcharya/llvm-project that referenced this pull request Oct 13, 2025

[RISCV][SLP][NFC]Add a test for satd-8x4 from x264 benchmark. (llvm#1…

562046f

…62542) Precommit a test.

akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025

[RISCV][SLP][NFC]Add a test for satd-8x4 from x264 benchmark. (llvm#1…

81f9ad9

…62542) Precommit a test.

mgudim commented Oct 8, 2025

llvmbot commented Oct 8, 2025

llvmbot commented Oct 8, 2025

topperc commented Oct 8, 2025

llvm-ci commented Oct 10, 2025

5 participants
