Description
| Bugzilla Link | 38792 |
| Version | trunk |
| OS | Linux |
| Attachments | reduced testcase |
| CC | @topperc,@efriedma-quic,@hfinkel,@RKSimon,@JonPsson,@rotateright,@uweigand |
Extended Description
The LoopVectorizer handles interleaved loads and stores. In this test case, the loaded elements should be reversed pairwise, like
[0 1 2 3] -> [1 0 3 2]
The vectorizer does not recognize this pattern. Instead, it works from the two interleave groups separately: it first builds a result for the load group, and then emits another shuffle for the store group:
%tmp6 = load <4 x i64>, <4 x i64>* %tmp5, align 8
%tmp7 = shufflevector <4 x i64> %tmp6, <4 x i64> undef, <2 x i32> <i32 0, i32 2>
%tmp8 = shufflevector <4 x i64> %tmp6, <4 x i64> undef, <2 x i32> <i32 1, i32 3>
%tmp9 = shufflevector <2 x i64> %tmp8, <2 x i64> %tmp7, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%tmp10 = shufflevector <4 x i64> %tmp9, <4 x i64> undef, <4 x i32> <i32 0, i32 2, i32 1, i32 3>
store <4 x i64> %tmp10, <4 x i64>* undef, align 8
This results in [1 0 3 2]. I had hoped that instcombine would fold this sequence into a single shufflevector, but that does not happen.
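As a sanity check on the claim above, here is a minimal Python sketch that models the shuffles on element positions. The `shufflevector` helper and the `UNDEF4` placeholder are hypothetical names introduced only for this illustration; the helper assumes the usual LLVM mask semantics (indices below the length of the first operand select from it, the rest select from the second operand) and is not part of any LLVM API.

```python
def shufflevector(a, b, mask):
    """Model of LLVM's shufflevector: mask indices address the
    concatenation of the two operands, first operand first."""
    joined = list(a) + list(b)
    return [joined[i] for i in mask]

# Stand-in for the undef operand; none of the masks below select from it.
UNDEF4 = [None] * 4

# Use element positions as values, so each result reads as a mask on %tmp6.
tmp6 = [0, 1, 2, 3]
tmp7 = shufflevector(tmp6, UNDEF4, [0, 2])        # even elements
tmp8 = shufflevector(tmp6, UNDEF4, [1, 3])        # odd elements
tmp9 = shufflevector(tmp8, tmp7, [0, 1, 2, 3])    # concatenate odd, even
tmp10 = shufflevector(tmp9, UNDEF4, [0, 2, 1, 3]) # re-interleave

# The single shuffle one would hope to get after combining.
single = shufflevector(tmp6, UNDEF4, [1, 0, 3, 2])

print(tmp10)   # [1, 0, 3, 2]
print(single)  # [1, 0, 3, 2]
```

Under these assumptions, the four shuffles compose to the mask `<i32 1, i32 0, i32 3, i32 2>` applied directly to `%tmp6`, which is the single-shufflevector form I was hoping for.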
There are comments in InstCombine saying that shuffle combining is purposely done very conservatively. It is clear, however, that this does not give good code on SystemZ.
Does anyone have an opinion on whether InstCombine should handle this case, and if not, where it should be done instead? A custom DAGCombine in the target?
To reproduce with the attached test case:
bin/opt ./tc_instcombine.ll -mtriple=systemz-unknown -mcpu=z13 -S -o out.opt.ll -instcombine