Commit de52d23

[AArch64][GlobalISel] Legalize more CTPOP vector types.
Similar to other operations, s8, s16, s32 and s64 vector elements are clamped to legal vector sizes, odd numbers of elements are widened to the next power of 2, and s128 is scalarized. This helps legalize cttz as well as ctpop.
1 parent 215c0d2 commit de52d23

4 files changed: +868 -429 lines changed

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp

Lines changed: 1 addition & 0 deletions
@@ -6139,6 +6139,7 @@ LegalizerHelper::moreElementsVector(MachineInstr &MI, unsigned TypeIdx,
   case TargetOpcode::G_FCANONICALIZE:
   case TargetOpcode::G_SEXT_INREG:
   case TargetOpcode::G_ABS:
+  case TargetOpcode::G_CTPOP:
     if (TypeIdx != 0)
       return UnableToLegalize;
     Observer.changingInstr(MI);
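
Adding G_CTPOP to this case list is reasonable because a population count is computed independently per lane, so extra lanes cannot affect the original ones. The standalone sketch below illustrates that property only. It does not use any LLVM API: `popcount32` and `ctpopViaWiderVector` are hypothetical names, and it assumes the shared code path pads the operand to the wider type and trims the result afterwards.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Population count of one 32-bit lane (portable, no C++20 <bit> required).
static unsigned popcount32(std::uint32_t x) {
  unsigned n = 0;
  while (x) {
    x &= x - 1; // clear the lowest set bit
    ++n;
  }
  return n;
}

// Hypothetical helper: pad `src` to `wideLanes` lanes, run the lane-wise
// popcount on the wide vector, then drop the padding lanes again.
std::vector<std::uint32_t>
ctpopViaWiderVector(const std::vector<std::uint32_t> &src,
                    std::size_t wideLanes) {
  if (wideLanes < src.size())
    wideLanes = src.size(); // never shrink below the original lane count
  std::vector<std::uint32_t> wide(src);
  // Arbitrary padding value; it stands in for lanes with undefined contents.
  wide.resize(wideLanes, 0xDEADBEEFu);
  for (std::uint32_t &lane : wide)
    lane = popcount32(lane);
  wide.resize(src.size()); // discard the results of the padding lanes
  return wide;
}
```

Calling `ctpopViaWiderVector({0x0, 0xFF, 0xFFFFFFFF}, 4)` models a three-lane input widened to four lanes; the three returned lanes depend only on the three input lanes, whatever the padding value is.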

llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp

Lines changed: 7 additions & 1 deletion
@@ -323,7 +323,13 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
       .clampScalar(0, s32, s128)
       .widenScalarToNextPow2(0)
       .minScalarEltSameAsIf(always, 1, 0)
-      .maxScalarEltSameAsIf(always, 1, 0);
+      .maxScalarEltSameAsIf(always, 1, 0)
+      .clampNumElements(0, v8s8, v16s8)
+      .clampNumElements(0, v4s16, v8s16)
+      .clampNumElements(0, v2s32, v4s32)
+      .clampNumElements(0, v2s64, v2s64)
+      .moreElementsToNextPow2(0)
+      .scalarizeIf(scalarOrEltWiderThan(0, 64), 0);
 
   getActionDefinitionsBuilder(G_CTLZ)
       .legalForCartesianProduct(
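
The effect of the six added rules on a vector type can be sketched with a small standalone model. This is a simplification and not the LLVM legalizer: `VecTy`, `nextStep` and `legalize` are hypothetical names, the earlier rules in the chain are left out, and a split is tracked only through the type of one resulting piece. The model assumes the first matching rule is applied and the result is checked again until no rule fires.

```cpp
#include <cassert>

// Hypothetical type model: elts == 1 stands for a scalar.
struct VecTy {
  unsigned elts;
  unsigned bits;
};

static bool isPow2(unsigned n) { return n != 0 && (n & (n - 1)) == 0; }

static unsigned nextPow2(unsigned n) {
  unsigned p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Models clampNumElements(0, v<minE>s<bits>, v<maxE>s<bits>): only vectors
// with a matching element size are affected. Returns true if `t` changed.
static bool clampElts(VecTy &t, unsigned bits, unsigned minE, unsigned maxE) {
  if (t.elts <= 1 || t.bits != bits)
    return false;
  if (t.elts < minE) { t.elts = minE; return true; } // more elements
  if (t.elts > maxE) { t.elts = maxE; return true; } // fewer elements
  return false;
}

// Applies the first matching rule, in the order the diff adds them.
// Returns false when no rule fires, i.e. the model is done with `t`.
static bool nextStep(VecTy &t) {
  if (clampElts(t, 8, 8, 16)) return true;
  if (clampElts(t, 16, 4, 8)) return true;
  if (clampElts(t, 32, 2, 4)) return true;
  if (clampElts(t, 64, 2, 2)) return true;
  if (t.elts > 1 && !isPow2(t.elts)) { // moreElementsToNextPow2(0)
    t.elts = nextPow2(t.elts);
    return true;
  }
  if (t.elts > 1 && t.bits > 64) { // scalarize vectors of wide elements
    t.elts = 1;
    return true;
  }
  return false;
}

// Iterates nextStep until the type stops changing.
VecTy legalize(VecTy t) {
  while (nextStep(t)) {
  }
  return t;
}
```

Under this model a three-element s8 vector grows to v8s8, a 32-element s8 vector is cut down to v16s8, a three-element s32 vector becomes v4s32, and a two-element s128 vector ends up as a scalar s128.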
