AArch64: uses store from GP reg where vectorized reg would be better #137086

@MatzeB

Description

Repro: https://godbolt.org/z/9bW58ax69

#include <string.h>
#include <arm_neon.h>
#include <arm_sve.h>

void foobar(uint16_t *p, uint8_t *p2, uint64x2_t vec, int x) {
  vec += (uint64x2_t){3, 4};
  if (x) {
    asm volatile(""); // side-effect so conditional move cannot be used.
    vec |= (uint64x2_t){0x12, 0x34};
    *p2 = vec[1];
    *p = vec[0];
  } else {
    // Commenting out the next two statements improves the code generated for the "then" branch!
    vec ^= (uint64x2_t){0x34, 0x12};
    *p = vec[0];
  }
}

// clang++ -target aarch64-redhat-linux-gnu -O3 -S -o - test.cpp -march=armv9-a+sve2+fp16

https://godbolt.org/z/41MoofTzP

The compiler currently compiles the "then" branch as:

...
       umov    w8, v0.h[0]
       st1     { v0.b }[8], [x1]
       strh    w8, [x0]

while this could be done without going through the w8 general-purpose register:

       st1     { v0.b }[8], [x1]
       str     h0, [x0]
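
For comparison, here is a minimal, hypothetical intrinsics sketch (not part of the original repro) showing that an explicit lane-0 store through the standard NEON intrinsic vst1q_lane_u16 is normally lowered to a direct str h0 / st1 lane store, with no umov round-trip through a general-purpose register:

#include <arm_neon.h>
#include <stdint.h>

// Hypothetical example: store lane 0 of a uint16x8_t directly.
// With clang -O3 for AArch64 this typically lowers to "str h0, [x0]"
// (or an st1 lane store), i.e. no umov into a GP register.
void store_lane0(uint16_t *p, uint16x8_t v) {
  vst1q_lane_u16(p, v, 0);
}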
