Commit 930915e

david-arm authored and github-actions[bot] committed
Automerge: [AArch64] Support symmetric complex deinterleaving with higher factors (#151295)
For loops such as this:

```
struct foo {
  double a, b;
};

void foo(struct foo *dst, struct foo *src, int n) {
  for (int i = 0; i < n; i++) {
    dst[i].a += src[i].a * 3.2;
    dst[i].b += src[i].b * 3.2;
  }
}
```

the complex deinterleaving pass will spot that the deinterleaving associated with the structured loads cancels out the interleaving associated with the structured stores. This happens even though they are not truly "complex" numbers because the pass can handle symmetric operations too. This is great because it means we can then perform normal loads and stores instead. However, we can also do the same for higher interleave factors, e.g. 4:

```
struct foo {
  double a, b, c, d;
};

void foo(struct foo *dst, struct foo *src, int n) {
  for (int i = 0; i < n; i++) {
    dst[i].a += src[i].a * 3.2;
    dst[i].b += src[i].b * 3.2;
    dst[i].c += src[i].c * 3.2;
    dst[i].d += src[i].d * 3.2;
  }
}
```

This PR extends the pass to effectively treat such structures as a set of complex numbers, i.e.

```
struct foo_alt {
  std::complex<double> x, y;
};
```

with equivalence between members:

```
foo_alt.x.real == foo.a
foo_alt.x.imag == foo.b
foo_alt.y.real == foo.c
foo_alt.y.imag == foo.d
```

I've written the code to handle sets with arbitrary numbers of complex values, but since we only support interleave factors between 2 and 4 I've restricted the sets to 1 or 2 complex numbers. Also, for now I've restricted support for interleave factors of 4 to purely symmetric operations only. However, it could also be extended to handle complex multiplications, reductions, etc.

Fixes: llvm/llvm-project#144795
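The member equivalence described in the commit message can be checked at the source level with a small sketch. This only illustrates why a factor-4 symmetric update behaves like the same update on two complex numbers; it does not reproduce what the pass does on LLVM IR. The function names `update_fields`, `update_complex` and `views_agree` are hypothetical and introduced here for illustration, and the first struct is kept as `foo` as in the commit message.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Factor-4 struct from the commit message.
struct foo { double a, b, c, d; };

// Alternative view from the commit message: a set of two complex numbers.
struct foo_alt { std::complex<double> x, y; };

// Assumption: holds on mainstream ABIs, where std::complex<double> is laid
// out as two adjacent doubles and neither struct needs padding.
static_assert(sizeof(foo) == sizeof(foo_alt), "views should share a layout");

// The symmetric loop from the commit message, written field by field.
void update_fields(foo *dst, const foo *src, int n) {
  for (int i = 0; i < n; i++) {
    dst[i].a += src[i].a * 3.2;
    dst[i].b += src[i].b * 3.2;
    dst[i].c += src[i].c * 3.2;
    dst[i].d += src[i].d * 3.2;
  }
}

// The same update on the complex view. Scaling by a real scalar and adding
// both act on the real and imaginary parts independently, i.e. symmetrically.
void update_complex(foo_alt *dst, const foo_alt *src, int n) {
  for (int i = 0; i < n; i++) {
    dst[i].x += src[i].x * 3.2;
    dst[i].y += src[i].y * 3.2;
  }
}

// Hypothetical helper: runs both loops on matching data and compares results
// member by member, using the mapping given in the commit message.
bool views_agree(int n) {
  std::vector<foo> src_f(n), dst_f(n);
  std::vector<foo_alt> src_c(n), dst_c(n);
  for (int i = 0; i < n; i++) {
    src_f[i] = {i + 0.5, i + 1.5, i + 2.5, i + 3.5};
    dst_f[i] = {1.0, 2.0, 3.0, 4.0};
    src_c[i] = {{src_f[i].a, src_f[i].b}, {src_f[i].c, src_f[i].d}};
    dst_c[i] = {{dst_f[i].a, dst_f[i].b}, {dst_f[i].c, dst_f[i].d}};
  }
  update_fields(dst_f.data(), src_f.data(), n);
  update_complex(dst_c.data(), src_c.data(), n);

  // A tolerance is used because the compiler may fuse multiply and add
  // differently in the two loops.
  const double tol = 1e-9;
  for (int i = 0; i < n; i++) {
    if (std::abs(dst_c[i].x.real() - dst_f[i].a) > tol) return false;
    if (std::abs(dst_c[i].x.imag() - dst_f[i].b) > tol) return false;
    if (std::abs(dst_c[i].y.real() - dst_f[i].c) > tol) return false;
    if (std::abs(dst_c[i].y.imag() - dst_f[i].d) > tol) return false;
  }
  return true;
}
```

Calling `views_agree(8)` from a `main` function should return `true`. Because every member receives the same operation, no data needs to move between the lanes that hold `a`, `b`, `c` and `d`, which is the property that lets the deinterleaving of the loads cancel against the interleaving of the stores.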
2 parents 3314b3c + 7f763d9 commit 930915e

4 files changed, with 616 additions and 143 deletions
