
AArch64: Missed post-increment opportunity #137084

@MatzeB


Repro:

#include <arm_neon.h>
void partialWrite(uint8_t* p, uint16x8_t vec) {
    vst1q_lane_u16(reinterpret_cast<uint16_t*>(p), vec, 0);
    vst1q_lane_u8(p + 2, vreinterpretq_u8_u16(vec), 2);
}

Currently produces:

$ clang++ -target aarch64-redhat-linux-gnu -march=armv9-a+sve2+fp16 repro.cpp
...
        add     x8, x0, #2
        str     h0, [x0]
        st1     { v0.b }[2], [x8]
        ret

This misses the opportunity to use post-increment addressing on the first str. The output could be the following (as produced by GCC):

       str     h0, [x0], 2
       st1     {v0.b}[2], [x0]
       ret

(This mirrors Meta T222168293.)
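
For anyone reading along without an AArch64 toolchain, here is a rough portable model of the two instruction sequences above. It is only a sketch: the names Vec, partialWriteSeparateAdd and partialWritePostIncrement are made up for illustration, the vector register is modelled as a plain array, and the byte view assumes a little-endian layout as on typical AArch64 targets. It shows that storing and then advancing the base pointer writes the same bytes as computing p + 2 into a second register first.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for uint16x8_t: eight 16-bit lanes, 16 bytes total.
using Vec = std::array<std::uint16_t, 8>;

// Shape of the current Clang output: a separate add materialises p + 2.
inline void partialWriteSeparateAdd(std::uint8_t* p, const Vec& vec) {
    std::uint8_t bytes[16];
    std::memcpy(bytes, vec.data(), sizeof bytes);  // byte view of the vector
    std::uint8_t* q = p + 2;                       // add x8, x0, #2
    std::memcpy(p, &vec[0], 2);                    // str h0, [x0]
    *q = bytes[2];                                 // st1 { v0.b }[2], [x8]
}

// Shape of the GCC output: the first store also advances the base pointer.
inline void partialWritePostIncrement(std::uint8_t* p, const Vec& vec) {
    std::uint8_t bytes[16];
    std::memcpy(bytes, vec.data(), sizeof bytes);  // byte view of the vector
    std::memcpy(p, &vec[0], 2);                    // str h0, [x0], 2 (store...
    p += 2;                                        // ...then x0 += 2)
    *p = bytes[2];                                 // st1 {v0.b}[2], [x0]
}
```

Both variants write three bytes in total and leave the rest of the buffer untouched. The post-increment form is legal here because the original value of p is not needed after the first store, so no second address register is required.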
