Commit eedc38a
FFHT: ARM NEON port (#5289)
Summary:
Pull Request resolved: #5289
Patch the code generator to be capable of generating NEON
code and leave it configured to do that since we already have the
checked-in generated AVX and SSE code. Generated code size was
a potential issue so I also patched the generator to 1) reuse
generated code for previous smaller sizes where applicable and 2)
choose the smallest code that isn't more than 10%
slower than the very fastest code.
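
The selection rule in point 2 can be sketched roughly as follows. This is an illustrative sketch only, not the generator's actual code: the `Candidate` class, the `pick_candidate` function, the variant names, and the sample numbers are all hypothetical, and it assumes "not more than 10% slower" means a running time of at most 1.1 times the fastest measured time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical label for one generated variant
    code_size: int   # size of the generated code (e.g. bytes or lines)
    seconds: float   # measured running time of that variant


def pick_candidate(candidates, tolerance=0.10):
    """Return the smallest candidate whose time is within `tolerance` of the fastest."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    fastest = min(c.seconds for c in candidates)
    allowed = fastest * (1.0 + tolerance)
    eligible = [c for c in candidates if c.seconds <= allowed]
    # Among eligible variants prefer the smallest code; break size ties by speed.
    return min(eligible, key=lambda c: (c.code_size, c.seconds))


# Made-up measurements, purely for illustration.
candidates = [
    Candidate("fully_unrolled", code_size=4000, seconds=1.00),
    Candidate("partly_unrolled", code_size=1500, seconds=1.08),
    Candidate("looped", code_size=600, seconds=1.30),
]
print(pick_candidate(candidates).name)  # partly_unrolled
```

With these made-up numbers the fastest variant is also the largest, so the rule trades 8% of speed for a much smaller variant, while the smallest variant is rejected because it is 30% slower.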
ghstack-source-id: 242230777
exported-using-ghexport
Reviewed By: kimishpatel
Differential Revision: D60194970
fbshipit-source-id: 37aab6813222c5a965c060286b5d5453ced22a0c
1 parent d3fb502 commit eedc38a
5 files changed, +3709 -372 lines changed
- extension/llm/custom_ops/spinquant/third-party/FFHT
Lines changed: 21 additions & 14 deletions
Lines changed: 13 additions & 6 deletions