You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[AArch64] Support symmetric complex deinterleaving with higher factors (#151295)
For loops such as this:
```
struct foo {
double a, b;
};
void foo(struct foo *dst, struct foo *src, int n) {
for (int i = 0; i < n; i++) {
dst[i].a += src[i].a * 3.2;
dst[i].b += src[i].b * 3.2;
}
}
```
the complex deinterleaving pass will spot that the deinterleaving
associated with the structured loads cancels out the interleaving
associated with the structured stores. This happens even though
they are not truly "complex" numbers because the pass can handle
symmetric operations too. This is great because it means we can
then perform normal loads and stores instead. However, we can also
do the same for higher interleave factors, e.g. 4:
```
struct foo {
double a, b, c, d;
};
void foo(struct foo *dst, struct foo *src, int n) {
for (int i = 0; i < n; i++) {
dst[i].a += src[i].a * 3.2;
dst[i].b += src[i].b * 3.2;
dst[i].c += src[i].c * 3.2;
dst[i].d += src[i].d * 3.2;
}
}
```
This PR extends the pass to effectively treat such structures as
a set of complex numbers, i.e.
```
struct foo_alt {
std::complex<double> x, y;
};
```
with equivalence between members:
```
foo_alt.x.real == foo.a
foo_alt.x.imag == foo.b
foo_alt.y.real == foo.c
foo_alt.y.imag == foo.d
```
I've written the code to handle sets with arbitrary numbers of
complex values, but since we only support interleave factors
between 2 and 4 I've restricted the sets to 1 or 2 complex
numbers. Also, for now I've restricted support for interleave
factors of 4 to purely symmetric operations only. However, it
could also be extended to handle complex multiplications,
reductions, etc.
Fixes: #144795
0 commit comments