_mm256_alignr_epi8 is incorrect #328

@Cocalus

The _mm256_alignr_epi8 implementation and its test do not match the Intel specification. The instruction does not operate on the full 256-bit vector; annoyingly, it is split into two separate 128-bit lanes. In other words, it behaves like _mm_alignr_epi8 applied independently to the upper and lower 128-bit lanes.
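To make the difference concrete, here is a minimal scalar sketch in Rust of the two readings. It uses no intrinsics; the function names (alignr_128, alignr_256_per_lane, alignr_256_full_width) are hypothetical helpers made up for illustration, and the model assumes the usual description of alignr: concatenate a (high) and b (low), shift right by a byte count, keep the low half.

```rust
/// Scalar model of `_mm_alignr_epi8` (hypothetical helper): concatenate `a`
/// (high half) and `b` (low half) into 32 bytes, shift right by `shift`
/// bytes, and keep the low 16 bytes. Both slices must be 16 bytes long.
fn alignr_128(a: &[u8], b: &[u8], shift: usize) -> [u8; 16] {
    let mut concat = [0u8; 32];
    concat[..16].copy_from_slice(b);
    concat[16..].copy_from_slice(a);
    let mut out = [0u8; 16];
    for (i, byte) in out.iter_mut().enumerate() {
        // Bytes shifted in from beyond the 32-byte temporary are zero.
        *byte = concat.get(i + shift).copied().unwrap_or(0);
    }
    out
}

/// Per-lane model: the behaviour this issue describes for
/// `_mm256_alignr_epi8`. Each 128-bit lane is processed independently.
fn alignr_256_per_lane(a: [u8; 32], b: [u8; 32], shift: usize) -> [u8; 32] {
    let mut out = [0u8; 32];
    for lane in 0..2 {
        let r = lane * 16..lane * 16 + 16;
        let lane_out = alignr_128(&a[r.clone()], &b[r.clone()], shift);
        out[r].copy_from_slice(&lane_out);
    }
    out
}

/// Full-width model: the (mistaken) reading in which the two 256-bit inputs
/// are concatenated into 64 bytes and shifted as a single value.
fn alignr_256_full_width(a: [u8; 32], b: [u8; 32], shift: usize) -> [u8; 32] {
    let mut concat = [0u8; 64];
    concat[..32].copy_from_slice(&b);
    concat[32..].copy_from_slice(&a);
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = concat.get(i + shift).copied().unwrap_or(0);
    }
    out
}
```

With b holding bytes 0..32, a holding bytes 32..64, and a shift of 4, the per-lane model pulls bytes 32..36 into the top of the lower lane, whereas the full-width model continues with bytes 16..20 there. A test that only checks the full-width result would therefore disagree with the hardware.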

See https://software.intel.com/en-us/blogs/2015/01/13/programming-using-avx2-permutations for an explanation of how it is implemented and for workarounds for the split lanes in some AVX2 instructions. It is probably worth double-checking the other instructions covered there.

To make it more confusing, in AVX-512 the *_epi32 and *_epi64 variants of alignr do not split lanes, while the *_epi8 variants still do.
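For contrast, a scalar sketch of the cross-lane behaviour of the 512-bit *_epi32 variant might look like the following. The function name alignr_epi32_512 is a hypothetical helper, and the model assumes that only the low four bits of the immediate select the element shift; check the Intel intrinsics guide before relying on that detail.

```rust
/// Scalar model (hypothetical helper) of a cross-lane alignr on 32-bit
/// elements: concatenate `a` (high) and `b` (low) into 32 elements, shift
/// right by whole elements, and keep the low 16. No 128-bit lane split.
fn alignr_epi32_512(a: [u32; 16], b: [u32; 16], imm8: u8) -> [u32; 16] {
    // Assumption: only the low four bits of the immediate are used.
    let shift = (imm8 & 0x0f) as usize;
    let mut concat = [0u32; 32];
    concat[..16].copy_from_slice(&b);
    concat[16..].copy_from_slice(&a);
    let mut out = [0u32; 16];
    // shift is at most 15, so shift + 16 never exceeds 31.
    out.copy_from_slice(&concat[shift..shift + 16]);
    out
}
```

Here elements flow freely across what would be 128-bit lane boundaries in the *_epi8 variants, which is why the two families cannot share one test strategy.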

I may be mistaken, but I thought there were tests to make sure the correct instruction was generated. It seems unlikely that this would have passed that test by accident.
