Open
Description
Repro: https://godbolt.org/z/9bW58ax69
#include <string.h>
#include <arm_neon.h>
#include <arm_sve.h>
void foobar(uint16_t *p, uint8_t *p2, uint64x2_t vec, int x) {
  vec += (uint64x2_t){3, 4};
  if (x) {
    asm volatile(""); // side-effect so conditional move cannot be used.
    vec |= (uint64x2_t){0x12, 0x34};
    *p2 = vec[1];
    *p = vec[0];
  } else {
    // Commenting out the next two instructions will improve code in the "then" branch!
    vec ^= (uint64x2_t){0x34, 0x12};
    *p = vec[0];
  }
}
// clang++ -target aarch64-redhat-linux-gnu -O3 -S -o - test.cpp -march=armv9-a+sve2+fp16
https://godbolt.org/z/41MoofTzP
The compiler ends up compiling the "then" part of the branch as:
...
umov w8, v0.h[0]
st1 { v0.b }[8], [x1]
strh w8, [x0]
while this could be done without going through the w8 GP register:
st1 { v0.b }[8], [x1]
str h0, [x0]
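For reference, here is a small portable C sketch of why the two direct vector stores are equivalent to the truncating stores in the source. It assumes a little-endian lane layout (as on typical AArch64 Linux targets), and `vreg128`, `from_u64x2`, `store_h0`, `store_b8` and `check_lane_equivalence` are hypothetical names made up for illustration, not LLVM or ACLE APIs. It only models the register bytes; it does not exercise the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit vector register as 16 raw bytes.
   Assumption: little-endian layout, so 64-bit lane 0 occupies bytes 0..7
   and 64-bit lane 1 occupies bytes 8..15. */
typedef struct { uint8_t bytes[16]; } vreg128;

static vreg128 from_u64x2(uint64_t lane0, uint64_t lane1) {
    vreg128 v;
    memcpy(v.bytes, &lane0, 8);      /* lane 0 -> bytes 0..7  */
    memcpy(v.bytes + 8, &lane1, 8);  /* lane 1 -> bytes 8..15 */
    return v;
}

/* Models `str h0, [x0]`: the value stored is the low 16 bits of the register. */
static uint16_t store_h0(const vreg128 *v) {
    uint16_t out;
    memcpy(&out, v->bytes, sizeof out);
    return out;
}

/* Models `st1 { v0.b }[8], [x1]`: the value stored is byte lane 8. */
static uint8_t store_b8(const vreg128 *v) {
    return v->bytes[8];
}

/* Returns 1 when the direct stores match `*p = vec[0]` (truncated to
   uint16_t) and `*p2 = vec[1]` (truncated to uint8_t) from the repro. */
static int check_lane_equivalence(uint64_t lane0, uint64_t lane1) {
    vreg128 v = from_u64x2(lane0, lane1);
    return store_h0(&v) == (uint16_t)lane0 && store_b8(&v) == (uint8_t)lane1;
}

/* Small self-check on a little-endian host. */
void self_check(void) {
    assert(check_lane_equivalence(0x1122334455667788ULL, 0x99AABBCCDDEEFF12ULL));
}
```

Under that layout assumption, the `umov w8, v0.h[0]` plus `strh w8, [x0]` pair stores the same two bytes that `str h0, [x0]` would, which is why the round trip through `w8` looks avoidable.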