[InstCombine] Fails to combine three shufflevectors produced by LoopVectorizer #38140

@JonPsson
Bugzilla Link: 38792
Version: trunk
OS: Linux
Attachments: reduced testcase
CC: @topperc, @efriedma-quic, @hfinkel, @RKSimon, @JonPsson, @rotateright, @uweigand

Extended Description

The LoopVectorizer produces interleaved loads and stores for this loop. In effect, the loaded elements should be swapped pairwise:

[0 1 2 3] -> [1 0 3 2]

The vectorizer does not recognize this. Instead, working from the two interleave groups, it first generates the shuffles for the load group and then adds another shuffle for the store group:

%tmp6 = load <4 x i64>, <4 x i64>* %tmp5, align 8
%tmp7 = shufflevector <4 x i64> %tmp6, <4 x i64> undef, <2 x i32> <i32 0, i32 2>
%tmp8 = shufflevector <4 x i64> %tmp6, <4 x i64> undef, <2 x i32> <i32 1, i32 3>
%tmp9 = shufflevector <2 x i64> %tmp8, <2 x i64> %tmp7, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%tmp10 = shufflevector <4 x i64> %tmp9, <4 x i64> undef, <4 x i32> <i32 0, i32 2, i32 1, i32 3>
store <4 x i64> %tmp10, <4 x i64>* undef, align 8

This results in [1 0 3 2]. I had hoped that InstCombine would fold the whole sequence into a single shufflevector, but that does not happen.
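As a sanity check of the claim above, here is a minimal Python sketch that models the four shuffles on plain lists and derives the equivalent single mask. The helper names (`shufflevector`, `run_chain`, `UNDEF4`) are made up for illustration and are not LLVM APIs. `None` stands in for the `undef` operand, which none of these masks actually select from.

```python
def shufflevector(a, b, mask):
    """Model LLVM's shufflevector: each mask entry indexes into a ++ b."""
    src = list(a) + list(b)
    return [src[i] for i in mask]

# Hypothetical stand-in for the <4 x i64> undef operand.
UNDEF4 = [None] * 4

def run_chain(tmp6):
    """Replay the shuffle sequence from the reduced test case."""
    tmp7 = shufflevector(tmp6, UNDEF4, [0, 2])         # even lanes
    tmp8 = shufflevector(tmp6, UNDEF4, [1, 3])         # odd lanes
    tmp9 = shufflevector(tmp8, tmp7, [0, 1, 2, 3])     # concatenate odd, even
    tmp10 = shufflevector(tmp9, UNDEF4, [0, 2, 1, 3])  # re-interleave for the store
    return tmp10

# Feeding the lane indices through the chain yields the combined mask.
combined_mask = run_chain([0, 1, 2, 3])
print(combined_mask)
```

Under this model the whole sequence is equivalent to one shufflevector of %tmp6 with mask <i32 1, i32 0, i32 3, i32 2>, which is the form I hoped InstCombine would produce.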

There are comments in InstCombine saying that shuffle combining is deliberately done very conservatively. It is clear, however, that the result does not give good code on SystemZ.

Does anyone know whether InstCombine should handle this case and, if not, where it should be done instead? A custom DAGCombine in the target?

bin/opt ./tc_instcombine.ll -mtriple=systemz-unknown -mcpu=z13 -S -o out.opt.ll -instcombine

Labels: bugzilla (Issues migrated from bugzilla), llvm:instcombine (Covers the InstCombine, InstSimplify and AggressiveInstCombine passes)
