[SLP]Initial support for copyable elements #147366

Merged
1,208 changes: 1,040 additions & 168 deletions llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -191,12 +191,12 @@ define i32 @reorder_indices_1(float %0) {
; NON-POW2-NEXT: entry:
; NON-POW2-NEXT: [[NOR1:%.*]] = alloca [0 x [3 x float]], i32 0, align 4
; NON-POW2-NEXT: [[TMP1:%.*]] = load <3 x float>, ptr [[NOR1]], align 4
; NON-POW2-NEXT: [[TMP3:%.*]] = fneg <3 x float> [[TMP1]]
; NON-POW2-NEXT: [[TMP2:%.*]] = shufflevector <3 x float> [[TMP1]], <3 x float> poison, <3 x i32> <i32 1, i32 2, i32 0>
; NON-POW2-NEXT: [[TMP3:%.*]] = fneg <3 x float> [[TMP2]]
; NON-POW2-NEXT: [[TMP4:%.*]] = insertelement <3 x float> poison, float [[TMP0]], i32 0
; NON-POW2-NEXT: [[TMP5:%.*]] = shufflevector <3 x float> [[TMP4]], <3 x float> poison, <3 x i32> zeroinitializer
; NON-POW2-NEXT: [[TMP6:%.*]] = fmul <3 x float> [[TMP3]], [[TMP5]]
; NON-POW2-NEXT: [[TMP10:%.*]] = shufflevector <3 x float> [[TMP6]], <3 x float> poison, <3 x i32> <i32 1, i32 2, i32 0>
; NON-POW2-NEXT: [[TMP7:%.*]] = call <3 x float> @llvm.fmuladd.v3f32(<3 x float> [[TMP1]], <3 x float> zeroinitializer, <3 x float> [[TMP10]])
; NON-POW2-NEXT: [[TMP7:%.*]] = call <3 x float> @llvm.fmuladd.v3f32(<3 x float> [[TMP1]], <3 x float> zeroinitializer, <3 x float> [[TMP6]])
; NON-POW2-NEXT: [[TMP8:%.*]] = call <3 x float> @llvm.fmuladd.v3f32(<3 x float> [[TMP5]], <3 x float> [[TMP7]], <3 x float> zeroinitializer)
; NON-POW2-NEXT: [[TMP9:%.*]] = fmul <3 x float> [[TMP8]], zeroinitializer
; NON-POW2-NEXT: store <3 x float> [[TMP9]], ptr [[NOR1]], align 4
@@ -4,17 +4,14 @@
define void @test() {
; CHECK-LABEL: define void @test() {
; CHECK-NEXT: [[BB:.*:]]
; CHECK-NEXT: [[ICMP:%.*]] = icmp samesign ult i32 0, 0
; CHECK-NEXT: [[SELECT:%.*]] = select i1 [[ICMP]], i32 0, i32 0
; CHECK-NEXT: [[SELECT:%.*]] = select i1 false, i32 0, i32 0
; CHECK-NEXT: [[ZEXT:%.*]] = zext i32 [[SELECT]] to i64
; CHECK-NEXT: [[GETELEMENTPTR:%.*]] = getelementptr ptr addrspace(1), ptr addrspace(1) null, i64 [[ZEXT]]
; CHECK-NEXT: store ptr addrspace(1) null, ptr addrspace(1) [[GETELEMENTPTR]], align 8
; CHECK-NEXT: store volatile i32 0, ptr addrspace(1) null, align 4
; CHECK-NEXT: [[CALL:%.*]] = call i32 null(<2 x double> zeroinitializer)
; CHECK-NEXT: [[TMP2:%.*]] = insertelement <4 x i32> <i32 0, i32 0, i32 0, i32 poison>, i32 [[CALL]], i32 3
; CHECK-NEXT: [[TMP3:%.*]] = icmp eq <4 x i32> [[TMP2]], zeroinitializer
; CHECK-NEXT: [[TMP4:%.*]] = shufflevector <4 x i1> [[TMP3]], <4 x i1> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison>
; CHECK-NEXT: [[TMP5:%.*]] = shufflevector <8 x i1> [[TMP4]], <8 x i1> <i1 false, i1 false, i1 false, i1 false, i1 undef, i1 undef, i1 undef, i1 undef>, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 8, i32 9, i32 10, i32 11>
; CHECK-NEXT: ret void
;
bb:
@@ -19,14 +19,14 @@ define void @test(ptr %0, i32 %add651) {
; CHECK-NEXT: [[ARRAYIDX660:%.*]] = getelementptr i8, ptr [[TMP4]], i64 7800
; CHECK-NEXT: [[ARRAYIDX689:%.*]] = getelementptr i8, ptr [[TMP4]], i64 7816
; CHECK-NEXT: [[TMP6:%.*]] = add <2 x i32> [[TMP3]], splat (i32 1)
Collaborator

Why didn't this v2i32 add fold into the v4i32 add below as well?

Member Author

The profitability check considers it unprofitable. I tried to make it vectorizable, but that turned out to be unprofitable too: it requires too many buildvectors, which add cost and make the whole tree unprofitable for vectorization.
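
To make the reply above concrete, here is a toy sketch of how buildvector (gather) costs can tip a tree from profitable to unprofitable. It is not the SLPVectorizer's actual cost model: `NodeCost`, `treeCostDelta`, `isProfitable`, and all the cost numbers are hypothetical and chosen only for illustration.

```cpp
#include <vector>

// Hypothetical per-node cost record for illustration; not an LLVM type.
struct NodeCost {
  int ScalarCost; // cost of the scalar instructions the node would replace
  int VectorCost; // cost of the vector instruction(s) emitted instead
  int GatherCost; // cost of buildvector sequences (insertelement/shuffle) feeding the node
};

// Sums (vector + gather - scalar) over all nodes.
// A negative result means the vectorized form is estimated to be cheaper.
int treeCostDelta(const std::vector<NodeCost> &Tree) {
  int Delta = 0;
  for (const NodeCost &N : Tree)
    Delta += N.VectorCost + N.GatherCost - N.ScalarCost;
  return Delta;
}

// Assumed rule for this sketch: vectorize only if the estimated saving
// exceeds a threshold.
bool isProfitable(const std::vector<NodeCost> &Tree, int Threshold = 0) {
  return treeCostDelta(Tree) < -Threshold;
}
```

With made-up numbers, a tree such as `{{4, 1, 0}, {4, 1, 1}}` has a negative delta and would be accepted, while the same tree with heavier gathers, `{{4, 1, 3}, {4, 1, 4}}`, has a positive delta and would be rejected. That is the situation described in the reply: the extra buildvectors outweigh the savings from the wider add.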

; CHECK-NEXT: [[TMP8:%.*]] = add <2 x i32> [[TMP6]], [[TMP7]]
; CHECK-NEXT: [[TMP9:%.*]] = insertelement <2 x i32> <i32 1, i32 poison>, i32 [[TMP5]], i32 1
; CHECK-NEXT: [[TMP10:%.*]] = add <2 x i32> [[TMP8]], [[TMP9]]
; CHECK-NEXT: [[TMP10:%.*]] = add <2 x i32> [[TMP6]], [[TMP7]]
; CHECK-NEXT: [[TMP11:%.*]] = insertelement <4 x i32> poison, i32 [[ADD651]], i32 0
; CHECK-NEXT: [[TMP13:%.*]] = insertelement <4 x i32> [[TMP11]], i32 [[TMP2]], i32 1
; CHECK-NEXT: [[TMP19:%.*]] = shufflevector <2 x i32> [[TMP10]], <2 x i32> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
; CHECK-NEXT: [[TMP14:%.*]] = shufflevector <4 x i32> [[TMP13]], <4 x i32> [[TMP19]], <4 x i32> <i32 0, i32 1, i32 4, i32 5>
; CHECK-NEXT: [[TMP15:%.*]] = lshr <4 x i32> [[TMP14]], splat (i32 1)
; CHECK-NEXT: [[TMP20:%.*]] = insertelement <4 x i32> <i32 0, i32 0, i32 1, i32 poison>, i32 [[TMP5]], i32 3
; CHECK-NEXT: [[TMP21:%.*]] = add <4 x i32> [[TMP14]], [[TMP20]]
; CHECK-NEXT: [[TMP15:%.*]] = lshr <4 x i32> [[TMP21]], splat (i32 1)
; CHECK-NEXT: [[SHR685:%.*]] = lshr i32 [[TMP2]], 1
; CHECK-NEXT: [[TMP16:%.*]] = trunc <4 x i32> [[TMP15]] to <4 x i16>
; CHECK-NEXT: [[CONV686:%.*]] = trunc i32 [[SHR685]] to i16
38 changes: 19 additions & 19 deletions llvm/test/Transforms/SLPVectorizer/X86/pr35497.ll
@@ -87,29 +87,29 @@ define void @pr35497(ptr %p, i64 %c) {
; AVX-LABEL: @pr35497(
; AVX-NEXT: entry:
; AVX-NEXT: [[TMP0:%.*]] = load i64, ptr [[P:%.*]], align 1
; AVX-NEXT: [[TMP5:%.*]] = insertelement <2 x i64> poison, i64 [[C:%.*]], i32 0
; AVX-NEXT: [[TMP11:%.*]] = shufflevector <2 x i64> [[TMP5]], <2 x i64> poison, <2 x i32> zeroinitializer
; AVX-NEXT: [[TMP13:%.*]] = lshr <2 x i64> [[TMP11]], splat (i64 6)
; AVX-NEXT: [[TMP1:%.*]] = insertelement <2 x i64> poison, i64 [[C:%.*]], i32 0
; AVX-NEXT: [[TMP2:%.*]] = shufflevector <2 x i64> [[TMP1]], <2 x i64> poison, <2 x i32> zeroinitializer
; AVX-NEXT: [[TMP3:%.*]] = lshr <2 x i64> [[TMP2]], splat (i64 6)
; AVX-NEXT: [[ARRAYIDX2_2:%.*]] = getelementptr inbounds [0 x i64], ptr [[P]], i64 0, i64 4
; AVX-NEXT: [[ARRAYIDX2_5:%.*]] = getelementptr inbounds [0 x i64], ptr [[P]], i64 0, i64 1
; AVX-NEXT: [[TMP1:%.*]] = insertelement <2 x i64> [[TMP11]], i64 [[TMP0]], i32 1
; AVX-NEXT: [[TMP2:%.*]] = shl <2 x i64> [[TMP1]], splat (i64 2)
; AVX-NEXT: [[TMP3:%.*]] = and <2 x i64> [[TMP2]], splat (i64 20)
; AVX-NEXT: [[TMP14:%.*]] = shufflevector <2 x i64> [[TMP3]], <2 x i64> [[TMP1]], <2 x i32> <i32 1, i32 2>
; AVX-NEXT: [[TMP16:%.*]] = shufflevector <2 x i64> [[TMP13]], <2 x i64> [[TMP14]], <2 x i32> <i32 1, i32 3>
; AVX-NEXT: [[TMP6:%.*]] = add <2 x i64> [[TMP14]], [[TMP16]]
; AVX-NEXT: [[TMP17:%.*]] = extractelement <2 x i64> [[TMP6]], i32 1
; AVX-NEXT: store i64 [[TMP17]], ptr [[P]], align 1
; AVX-NEXT: [[TMP4:%.*]] = add nuw nsw <2 x i64> [[TMP3]], [[TMP13]]
; AVX-NEXT: [[TMP12:%.*]] = extractelement <2 x i64> [[TMP6]], i32 0
; AVX-NEXT: [[TMP4:%.*]] = insertelement <2 x i64> [[TMP2]], i64 [[TMP0]], i32 1
; AVX-NEXT: [[TMP5:%.*]] = shl <2 x i64> [[TMP4]], splat (i64 2)
; AVX-NEXT: [[TMP6:%.*]] = and <2 x i64> [[TMP5]], splat (i64 20)
; AVX-NEXT: [[TMP7:%.*]] = shufflevector <2 x i64> [[TMP6]], <2 x i64> [[TMP4]], <2 x i32> <i32 1, i32 2>
; AVX-NEXT: [[TMP8:%.*]] = shufflevector <2 x i64> [[TMP3]], <2 x i64> [[TMP7]], <2 x i32> <i32 1, i32 3>
; AVX-NEXT: [[TMP9:%.*]] = add <2 x i64> [[TMP7]], [[TMP8]]
; AVX-NEXT: [[TMP10:%.*]] = extractelement <2 x i64> [[TMP9]], i32 1
; AVX-NEXT: store i64 [[TMP10]], ptr [[P]], align 1
; AVX-NEXT: [[TMP11:%.*]] = add nuw nsw <2 x i64> [[TMP6]], [[TMP3]]
; AVX-NEXT: [[TMP12:%.*]] = extractelement <2 x i64> [[TMP9]], i32 0
; AVX-NEXT: store i64 [[TMP12]], ptr [[ARRAYIDX2_5]], align 1
; AVX-NEXT: [[TMP7:%.*]] = shl <2 x i64> [[TMP6]], splat (i64 2)
; AVX-NEXT: [[TMP8:%.*]] = and <2 x i64> [[TMP7]], splat (i64 20)
; AVX-NEXT: [[TMP15:%.*]] = extractelement <2 x i64> [[TMP4]], i32 0
; AVX-NEXT: [[TMP13:%.*]] = shl <2 x i64> [[TMP9]], splat (i64 2)
; AVX-NEXT: [[TMP14:%.*]] = and <2 x i64> [[TMP13]], splat (i64 20)
; AVX-NEXT: [[TMP15:%.*]] = extractelement <2 x i64> [[TMP11]], i32 0
; AVX-NEXT: store i64 [[TMP15]], ptr [[P]], align 1
; AVX-NEXT: [[TMP9:%.*]] = lshr <2 x i64> [[TMP4]], splat (i64 6)
; AVX-NEXT: [[TMP10:%.*]] = add nuw nsw <2 x i64> [[TMP8]], [[TMP9]]
; AVX-NEXT: store <2 x i64> [[TMP10]], ptr [[ARRAYIDX2_2]], align 1
; AVX-NEXT: [[TMP16:%.*]] = lshr <2 x i64> [[TMP11]], splat (i64 6)
; AVX-NEXT: [[TMP17:%.*]] = add nuw nsw <2 x i64> [[TMP14]], [[TMP16]]
; AVX-NEXT: store <2 x i64> [[TMP17]], ptr [[ARRAYIDX2_2]], align 1
; AVX-NEXT: ret void
;
entry:
@@ -5,9 +5,11 @@ define i1 @test() {
; CHECK-LABEL: define i1 @test() {
; CHECK-NEXT: [[ENTRY:.*:]]
; CHECK-NEXT: [[H_PROMOTED118_I_FR:%.*]] = freeze i32 1
; CHECK-NEXT: [[TMP3:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[H_PROMOTED118_I_FR]], i32 0
; CHECK-NEXT: [[TMP4:%.*]] = add <2 x i32> zeroinitializer, [[TMP3]]
; CHECK-NEXT: [[TMP2:%.*]] = shufflevector <2 x i32> [[TMP4]], <2 x i32> poison, <4 x i32> <i32 0, i32 0, i32 1, i32 0>
; CHECK-NEXT: [[TMP0:%.*]] = insertelement <4 x i32> <i32 0, i32 0, i32 poison, i32 0>, i32 [[H_PROMOTED118_I_FR]], i32 2
; CHECK-NEXT: [[TMP1:%.*]] = add <4 x i32> zeroinitializer, [[TMP0]]
; CHECK-NEXT: [[TMP2:%.*]] = shufflevector <4 x i32> [[TMP0]], <4 x i32> [[TMP1]], <4 x i32> <i32 2, i32 2, i32 7, i32 2>
; CHECK-NEXT: [[TMP5:%.*]] = add <4 x i32> [[TMP1]], [[TMP2]]
; CHECK-NEXT: [[TMP6:%.*]] = and <4 x i32> [[TMP5]], <i32 0, i32 1, i32 1, i32 1>
; CHECK-NEXT: [[TMP7:%.*]] = icmp eq <4 x i32> [[TMP6]], <i32 1, i32 0, i32 0, i32 0>
@@ -190,12 +190,12 @@ define i32 @reorder_indices_1(float %0) {
; NON-POW2-NEXT: entry:
; NON-POW2-NEXT: [[NOR1:%.*]] = alloca [0 x [3 x float]], i32 0, align 4
; NON-POW2-NEXT: [[TMP1:%.*]] = load <3 x float>, ptr [[NOR1]], align 4
; NON-POW2-NEXT: [[TMP3:%.*]] = fneg <3 x float> [[TMP1]]
; NON-POW2-NEXT: [[TMP2:%.*]] = shufflevector <3 x float> [[TMP1]], <3 x float> poison, <3 x i32> <i32 1, i32 2, i32 0>
; NON-POW2-NEXT: [[TMP3:%.*]] = fneg <3 x float> [[TMP2]]
; NON-POW2-NEXT: [[TMP4:%.*]] = insertelement <3 x float> poison, float [[TMP0]], i32 0
; NON-POW2-NEXT: [[TMP5:%.*]] = shufflevector <3 x float> [[TMP4]], <3 x float> poison, <3 x i32> zeroinitializer
; NON-POW2-NEXT: [[TMP6:%.*]] = fmul <3 x float> [[TMP3]], [[TMP5]]
; NON-POW2-NEXT: [[TMP10:%.*]] = shufflevector <3 x float> [[TMP6]], <3 x float> poison, <3 x i32> <i32 1, i32 2, i32 0>
; NON-POW2-NEXT: [[TMP7:%.*]] = call <3 x float> @llvm.fmuladd.v3f32(<3 x float> [[TMP1]], <3 x float> zeroinitializer, <3 x float> [[TMP10]])
; NON-POW2-NEXT: [[TMP7:%.*]] = call <3 x float> @llvm.fmuladd.v3f32(<3 x float> [[TMP1]], <3 x float> zeroinitializer, <3 x float> [[TMP6]])
; NON-POW2-NEXT: [[TMP8:%.*]] = call <3 x float> @llvm.fmuladd.v3f32(<3 x float> [[TMP5]], <3 x float> [[TMP7]], <3 x float> zeroinitializer)
; NON-POW2-NEXT: [[TMP9:%.*]] = fmul <3 x float> [[TMP8]], zeroinitializer
; NON-POW2-NEXT: store <3 x float> [[TMP9]], ptr [[NOR1]], align 4
@@ -6,20 +6,21 @@ define i32 @test(i8 %0) {
; CHECK-SAME: i8 [[TMP0:%.*]]) {
; CHECK-NEXT: [[ENTRY:.*:]]
; CHECK-NEXT: [[CMP13_NOT_5:%.*]] = icmp eq i64 0, 0
; CHECK-NEXT: [[TMP1:%.*]] = load i8, ptr addrspace(21) getelementptr inbounds (i8, ptr addrspace(21) null, i64 7), align 1
; CHECK-NEXT: [[TMP2:%.*]] = insertelement <2 x i8> <i8 0, i8 poison>, i8 [[TMP1]], i32 1
; CHECK-NEXT: [[TMP3:%.*]] = icmp eq <2 x i8> zeroinitializer, [[TMP2]]
; CHECK-NEXT: [[TMP4:%.*]] = load volatile i8, ptr null, align 8
; CHECK-NEXT: [[TMP5:%.*]] = load <2 x i8>, ptr addrspace(21) getelementptr inbounds (i8, ptr addrspace(21) null, i64 8), align 8
; CHECK-NEXT: [[TMP2:%.*]] = load i8, ptr addrspace(21) getelementptr inbounds (i8, ptr addrspace(21) null, i64 9), align 1
; CHECK-NEXT: [[TEST_STRUCTCOPY_14_S14_CM_COERCE_SROA_2_0_COPYLOAD:%.*]] = load i48, ptr addrspace(21) getelementptr inbounds (i8, ptr addrspace(21) null, i64 8), align 8
; CHECK-NEXT: [[TMP12:%.*]] = load i8, ptr addrspace(21) null, align 2
; CHECK-NEXT: [[TMP13:%.*]] = load volatile i8, ptr null, align 2
; CHECK-NEXT: [[TMP5:%.*]] = load <2 x i8>, ptr addrspace(21) getelementptr inbounds (i8, ptr addrspace(21) null, i64 7), align 1
; CHECK-NEXT: [[TMP32:%.*]] = shufflevector <2 x i8> <i8 0, i8 poison>, <2 x i8> [[TMP5]], <2 x i32> <i32 0, i32 2>
; CHECK-NEXT: [[TMP3:%.*]] = icmp eq <2 x i8> zeroinitializer, [[TMP32]]
; CHECK-NEXT: [[TMP6:%.*]] = shufflevector <2 x i8> [[TMP5]], <2 x i8> poison, <8 x i32> <i32 0, i32 1, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
; CHECK-NEXT: [[TMP7:%.*]] = shufflevector <8 x i8> <i8 0, i8 0, i8 poison, i8 0, i8 0, i8 poison, i8 0, i8 0>, <8 x i8> [[TMP6]], <8 x i32> <i32 0, i32 1, i32 8, i32 3, i32 4, i32 9, i32 6, i32 7>
; CHECK-NEXT: [[TMP33:%.*]] = shufflevector <8 x i8> <i8 0, i8 0, i8 poison, i8 0, i8 0, i8 poison, i8 0, i8 0>, <8 x i8> [[TMP6]], <8 x i32> <i32 0, i32 1, i32 9, i32 3, i32 4, i32 poison, i32 6, i32 7>
; CHECK-NEXT: [[TMP7:%.*]] = insertelement <8 x i8> [[TMP33]], i8 [[TMP2]], i32 5
; CHECK-NEXT: [[TMP8:%.*]] = icmp eq <8 x i8> zeroinitializer, [[TMP7]]
; CHECK-NEXT: [[TEST_STRUCTCOPY_14_S14_CM_COERCE_SROA_2_0_COPYLOAD:%.*]] = load i48, ptr addrspace(21) getelementptr inbounds (i8, ptr addrspace(21) null, i64 8), align 8
; CHECK-NEXT: [[TMP9:%.*]] = insertelement <4 x i48> <i48 poison, i48 0, i48 0, i48 0>, i48 [[TEST_STRUCTCOPY_14_S14_CM_COERCE_SROA_2_0_COPYLOAD]], i32 0
; CHECK-NEXT: [[TMP10:%.*]] = trunc <4 x i48> [[TMP9]] to <4 x i8>
; CHECK-NEXT: [[TMP11:%.*]] = icmp eq <4 x i8> zeroinitializer, [[TMP10]]
; CHECK-NEXT: [[TMP12:%.*]] = load i8, ptr addrspace(21) null, align 2
; CHECK-NEXT: [[TMP13:%.*]] = load volatile i8, ptr null, align 2
; CHECK-NEXT: [[TMP14:%.*]] = load <2 x i8>, ptr addrspace(21) getelementptr inbounds (i8, ptr addrspace(21) null, i64 8), align 8
; CHECK-NEXT: [[TMP15:%.*]] = insertelement <8 x i8> <i8 0, i8 poison, i8 0, i8 poison, i8 poison, i8 0, i8 0, i8 0>, i8 [[TMP12]], i32 1
; CHECK-NEXT: [[TMP16:%.*]] = shufflevector <2 x i8> [[TMP14]], <2 x i8> poison, <8 x i32> <i32 0, i32 1, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
12 changes: 5 additions & 7 deletions llvm/test/Transforms/SLPVectorizer/revec.ll
@@ -306,13 +306,11 @@ define void @test11(<2 x i64> %0, i64 %1, <2 x i64> %2) {
; CHECK-LABEL: @test11(
; CHECK-NEXT: entry:
; CHECK-NEXT: [[TMP3:%.*]] = insertelement <2 x i64> [[TMP0:%.*]], i64 [[TMP1:%.*]], i32 1
; CHECK-NEXT: [[TMP4:%.*]] = add <2 x i64> <i64 5, i64 0>, [[TMP2:%.*]]
; CHECK-NEXT: [[TMP5:%.*]] = trunc <2 x i64> [[TMP4]] to <2 x i16>
; CHECK-NEXT: [[TMP6:%.*]] = shufflevector <2 x i16> [[TMP5]], <2 x i16> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
; CHECK-NEXT: [[TMP7:%.*]] = trunc <2 x i64> [[TMP3]] to <2 x i16>
; CHECK-NEXT: [[TMP10:%.*]] = shufflevector <2 x i16> [[TMP7]], <2 x i16> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
; CHECK-NEXT: [[TMP8:%.*]] = shufflevector <4 x i16> [[TMP6]], <4 x i16> [[TMP10]], <4 x i32> <i32 0, i32 1, i32 4, i32 5>
; CHECK-NEXT: [[TMP9:%.*]] = trunc <4 x i16> [[TMP8]] to <4 x i8>
; CHECK-NEXT: [[TMP4:%.*]] = shufflevector <2 x i64> [[TMP2:%.*]], <2 x i64> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
; CHECK-NEXT: [[TMP5:%.*]] = shufflevector <2 x i64> [[TMP3]], <2 x i64> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
; CHECK-NEXT: [[TMP6:%.*]] = shufflevector <4 x i64> [[TMP4]], <4 x i64> [[TMP5]], <4 x i32> <i32 0, i32 1, i32 4, i32 5>
; CHECK-NEXT: [[TMP7:%.*]] = add <4 x i64> <i64 5, i64 0, i64 0, i64 0>, [[TMP6]]
; CHECK-NEXT: [[TMP9:%.*]] = trunc <4 x i64> [[TMP7]] to <4 x i8>
; CHECK-NEXT: [[TMP11:%.*]] = urem <4 x i8> [[TMP9]], zeroinitializer
; CHECK-NEXT: [[TMP12:%.*]] = icmp ne <4 x i8> [[TMP11]], zeroinitializer
; CHECK-NEXT: ret void