Skip to content

Conversation

@konstantinschwarz
Copy link
Collaborator

No description provided.

Copy link
Collaborator

@khallouh khallouh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good. Just a few nits. No comments on the tests yet because of the draft status. I have 2 questions:

  • Is the goal to match only half vector extract patterns or Sub-Vector extracts in general? (quarter extracts being also relevant for AIE's API)
  • How about the missing canonicalization combine for shufflevector in GlobalISel? I think this is needed if we want to support extracts from the second vector input with the current combine implementation in this PR.

const LLT Src1Ty = MRI.getType(Src1Reg);

if (!DstTy.isVector() || !Src1Ty.isVector())
return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? G_SHUFFLE_VECTOR only operates on vectors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, G_SHUFFLE_VECTOR can have scalar inputs. The opcode is a bit weird.

https://github.com/Xilinx/llvm-aie/blob/aie-public/llvm/test/CodeGen/AArch64/GlobalISel/legalize-shuffle-vector.mir#L119

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this misses the following edge case:

%0:_(<2 x s32>) = COPY $x0
%1:_(<2 x s32>) = UNDEF
%2:_(s32) = G_SHUFFLE_VECTOR %0, %1, shufflemask(0)

In this case, you could unmerge the first array into the destination. Since the target is not a vector, this would skip it. How useful this is? Who knows.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that case could even be matched to a G_EXTRACT_VECTOR_ELT. It might not be very useful for AIE's API since element extracts would be implemented using extractelement instead of shufflevector.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it is a bit obscure and the overhead it would cause is minimal.

assert(MI.getOpcode() == TargetOpcode::G_SHUFFLE_VECTOR);
const Register DstReg = MI.getOperand(0).getReg();
const Register Src1Reg = MI.getOperand(1).getReg();
ArrayRef<int> Mask = MI.getOperand(3).getShuffleMask();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: ShuffleMask?

Copy link
Contributor

@ValentijnvdBeek ValentijnvdBeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Konstantin, thanks for taking my design and generally unjunior/internifying. This looks very clean, great job. I am not able to run it a.t.m., but I will look at that whenever the PR is complete.

There are two small worries:

This check is really expensive, this replaces the basic design but makes it hard to add the optimizations that might be needed to merge upstream.

Having separate optimizer causes it to be missed by the G_EXTRACT->G_INSERT->G_BUILD sequences that might be valid representations. Upstream doesn't, for sensible reasons, canonize those into shufflevectors so you need to make sure that this optimizer is on those sequences as well.

See MR linked in: #41 (comment)

Comment on lines +1534 to +1538
def combine_shuffle_to_extract_vector : GICombineRule<
(defs root:$root, build_fn_matchinfo:$matchinfo),
(match (wip_match_opcode G_SHUFFLE_VECTOR):$root,
[{ return Helper.matchShuffleToExtractSubvector(*${root}, ${matchinfo}); }]),
(apply [{ Helper.applyBuildFn(*${root}, ${matchinfo}); }])>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using wip_match_opcode, upstream really doesn't like this since it causes imprecise patterns and slows down the compilation. In this case, it should be very little difference compared to just matching SHUFFLE_VECTOR directly.

https://llvm.org/docs/GlobalISel/MIRPatterns.html#gallery
See MR linked in: #41 (comment)

const LLT Src1Ty = MRI.getType(Src1Reg);

if (!DstTy.isVector() || !Src1Ty.isVector())
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, G_SHUFFLE_VECTOR can have scalar inputs. The opcode is a bit weird.

https://github.com/Xilinx/llvm-aie/blob/aie-public/llvm/test/CodeGen/AArch64/GlobalISel/legalize-shuffle-vector.mir#L119

fsub_to_fneg, commute_constant_to_rhs, match_ands, match_ors,
combine_concat_vector, double_icmp_zero_and_or_combine, match_addos,
combine_shuffle_concat]>;
combine_shuffle_concat, combine_shuffle_to_extract_vector]>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is a bit of an expensive combiner to run since it iterates through a decent chunk of elements twice. Previously, this was an explicitly opt in combiner that a target enabled by calling a specific function. This will now run it on all targets, this is probably worth it. Also, upstream will probably be running those checks now anyways, so it doesn't matter as much.

return Helper.tryCombineShuffleVector(MI);

See MR linked in: #41 (comment)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What upstream checks are you referring to? I think the current tryCombineShuffleVector only matches merge-like patterns, but we could definitely move our combine_shuffle_to_extract_vector combine there, would that make it less expensive?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, checking elements to the shuffle vector is just an O(n²) operations unless you use a hashmap. At the moment, this is done in tryCombineShuffleVector which a backend can enable. This is now run always in all backends. I don't know exactly why someone made the decision to have that combiner be run explicitly rather than using tablegen like the others

const LLT Src1Ty = MRI.getType(Src1Reg);

if (!DstTy.isVector() || !Src1Ty.isVector())
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this misses the following edge case:

%0:_(<2 x s32>) = COPY $x0
%1:_(<2 x s32>) = UNDEF
%2:_(s32) = G_SHUFFLE_VECTOR %0, %1, shufflemask(0)

In this case, you could unmerge the first array into the destination. Since the target is not a vector, this would skip it. How useful this is? Who knows.

Comment on lines +7278 to +7290
auto CheckExtractMask = [=](unsigned Start, unsigned NumElems) -> bool {
auto ExtractMask = createSequentialMask(Start, NumElems, 0);

for (unsigned I = 0; I < NumDstElems; I++) {
if (Mask[I] == -1)
continue;

if (Mask[I] != ExtractMask[I])
return false;
}

return true;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this is a lot, this is great.

if (NumDstElems * 2 != NumSrc1Elems)
return false;

auto CheckExtractMask = [=](unsigned Start, unsigned NumElems) -> bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can use by-reference capture and automatic return type deduction.

@konstantinschwarz
Copy link
Collaborator Author

We added a lot of helper functions for matching sub-vector extraction operations recently. Those will be extended to cover the combine in this PR.

@konstantinschwarz konstantinschwarz deleted the kschwarz.shufflevector.to.extract.subvector branch February 6, 2025 00:09
mgehre-amd pushed a commit that referenced this pull request Aug 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants