Commit 2c3cdaf
Optimized BGEMV for NEOVERSEV1 target
- Adds bgemv T, based on the sbgemv T kernel
- Adds bgemv N, which is slightly altered to not use Y as an
accumulator, because the output is bf16 and accumulating in it
loses precision
- Enables BGEMM_GEMV_FORWARD to proxy BGEMM to BGEMV with the new kernels

1 parent 2f81d6e commit 2c3cdaf
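The precision concern behind the bgemv N change can be illustrated with a small scalar sketch. This is not the NEOVERSEV1 kernel from this commit. The helper names (`bf16_t`, `bf16_to_f32`, `f32_to_bf16`, `gemv_n_bf16_acc`, `gemv_n_f32_acc`) are hypothetical, alpha is assumed to be 1 and beta 0, and A is assumed column-major. It only contrasts rounding to bf16 after every update of Y with accumulating in fp32 and rounding once.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bf16 storage type: the upper 16 bits of an IEEE-754 float. */
typedef uint16_t bf16_t;

static float bf16_to_f32(bf16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Round to nearest, ties to even. NaN handling is omitted in this sketch. */
static bf16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (bf16_t)(bits >> 16);
}

/* y = A*x, walking A column by column and using the bf16 vector y itself
 * as the accumulator, so every partial sum is rounded to bf16. */
static void gemv_n_bf16_acc(size_t m, size_t n, const bf16_t *a, size_t lda,
                            const bf16_t *x, bf16_t *y) {
    for (size_t i = 0; i < m; i++) y[i] = f32_to_bf16(0.0f);
    for (size_t j = 0; j < n; j++) {
        float xj = bf16_to_f32(x[j]);
        for (size_t i = 0; i < m; i++) {
            float sum = bf16_to_f32(y[i]) + bf16_to_f32(a[i + j * lda]) * xj;
            y[i] = f32_to_bf16(sum); /* rounding happens on every update */
        }
    }
}

/* y = A*x, keeping each partial sum in fp32 and rounding to bf16 once. */
static void gemv_n_f32_acc(size_t m, size_t n, const bf16_t *a, size_t lda,
                           const bf16_t *x, bf16_t *y) {
    for (size_t i = 0; i < m; i++) {
        float acc = 0.0f;
        for (size_t j = 0; j < n; j++)
            acc += bf16_to_f32(a[i + j * lda]) * bf16_to_f32(x[j]);
        y[i] = f32_to_bf16(acc); /* single rounding at the end */
    }
}
```

With a 1 x 512 matrix of ones and a vector of ones, the bf16-accumulating version stalls at 256, because 256 + 1 is not representable in bf16 and rounds back to 256, while the fp32-accumulating version returns 512.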
File tree
7 files changed, +426 -42 lines changed
- benchmark
- kernel/arm64
- test