[X86] Vector 8-bit shifts by variable amounts should use power of two multiply

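Vector 8-bit shifts by a variable amount are currently implemented as three selections, one per bit of the shift amount. Each selection chooses between the running value shifted by a power of two (4, then 2, then 1) and the unshifted value, based on the corresponding amount bit: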

```asm
shiftVarU8_sse41:
        movdqa  xmm2, xmm0
        movdqa  xmm3, xmm0
        psllw   xmm3, 4
        pand    xmm3, xmmword ptr [rip + .LCPI1_0]
        psllw   xmm1, 5
        movdqa  xmm0, xmm1
        pblendvb        xmm2, xmm3, xmm0
        movdqa  xmm3, xmm2
        psllw   xmm3, 2
        pand    xmm3, xmmword ptr [rip + .LCPI1_1]
        paddb   xmm1, xmm1
        movdqa  xmm0, xmm1
        pblendvb        xmm2, xmm3, xmm0
        movdqa  xmm3, xmm2
        paddb   xmm3, xmm2
        paddb   xmm1, xmm1
        movdqa  xmm0, xmm1
        pblendvb        xmm2, xmm3, xmm0
        movdqa  xmm0, xmm2
        ret
```
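
For reference, the select chain above can be modelled one byte at a time in scalar C. This is only a sketch of what each lane computes, not the actual lowering code; the function name `shl_u8_select` is made up for illustration, and the model assumes the shift amount is in the range 0..7.

```c
#include <stdint.h>

/* Scalar model of one byte lane of the blend-based sequence.
   Hypothetical helper; assumes amt is in 0..7. */
static uint8_t shl_u8_select(uint8_t x, uint8_t amt)
{
    /* psllw 4 + pand, selected by amount bit 2 (moved to the sign bit by psllw 5) */
    if (amt & 4) x = (uint8_t)(x << 4);
    /* psllw 2 + pand, selected by amount bit 1 (after the first paddb on the amounts) */
    if (amt & 2) x = (uint8_t)(x << 2);
    /* paddb x, x, selected by amount bit 0 (after the second paddb on the amounts) */
    if (amt & 1) x = (uint8_t)(x + x);
    return x;
}
```

Each `if` corresponds to one `pblendvb`, so the cost is three dependent blends plus the shifts and masks that feed them.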

If SSSE3 is available, it is cheaper to multiply by a power of two instead. The power of two can be obtained by using a shuffle (`pshufb`) as a lookup table (although clang tends to implement multiplies less efficiently):

```asm
shiftVarU8_ideal:
        pand    xmm1, xmmword ptr [rip + .LCPI2_0]
        movq    xmm2, qword ptr [rip + .LCPI2_1]
        pshufb  xmm2, xmm1
        movdqa  xmm1, xmm2
        pmullw  xmm1, xmm0
        pand    xmm1, xmmword ptr [rip + .LCPI2_2]
        pand    xmm2, xmmword ptr [rip + .LCPI2_3]
        psrlw   xmm0, 8
        pmullw  xmm0, xmm2
        por     xmm0, xmm1
        ret
```

The first `pand` may be skipped if shift amounts greater than 7 are considered undefined. This method appears to be optimal on targets without AVX512BW.
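
The multiply-based sequence can be sketched the same way, here for a single 16-bit lane holding two bytes. The constants are not shown in the issue, so the table contents, the `& 7` on the amounts, and the `0x00FF` / `0xFF00` masks are assumptions inferred from the instruction sequence; `POW2_LUT` and `shl_u8x2_mul` are hypothetical names.

```c
#include <stdint.h>

/* Assumed contents of the 8-byte table loaded by movq: 1 << i for i in 0..7. */
static const uint8_t POW2_LUT[8] = {1, 2, 4, 8, 16, 32, 64, 128};

/* Scalar model of one 16-bit lane: x holds two bytes, amt holds their two
   shift amounts. Amounts are masked to 0..7 here (assumed role of the first pand). */
static uint16_t shl_u8x2_mul(uint16_t x, uint16_t amt)
{
    /* pshufb: look up a power of two for each byte of the lane */
    uint16_t pow2 = (uint16_t)(POW2_LUT[amt & 7] |
                               (POW2_LUT[(amt >> 8) & 7] << 8));

    /* pmullw + pand: the low byte of the 16-bit product depends only on the
       low bytes of both operands, so it equals x_lo << amt_lo (mod 256) */
    uint16_t lo = (uint16_t)(((uint32_t)pow2 * x) & 0x00FF);

    /* psrlw 8 + pand + pmullw: x_hi * (pow2_hi << 8), truncated to 16 bits,
       leaves x_hi << amt_hi (mod 256) in the high byte */
    uint16_t hi = (uint16_t)((uint32_t)(x >> 8) * (pow2 & 0xFF00));

    /* por: recombine the two bytes */
    return (uint16_t)(hi | lo);
}
```

The two multiplies are independent of each other, which is where the advantage over the three dependent blends comes from.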


https://godbolt.org/z/GabW4h6TG

[X86] Vector 8-bit shifts by variable amounts should use power of two multiply #165964