Commit c23ab46
feat: perf opt part4 (#43)
* wip
* refactor: rewrite dequantize_row_q4_0 by intrinsic
* log for debug
* fix q4 intrinsic
* small opt
* wip
* wip
* add vtcm_quota_size
* add perf log for hexagon-npu backend
* wip
* add log
* sync after a specific op
* increase worker thread priority
* fix unbalanced thread slice
* small slice to fit in vtcm cache
* limit the supported row element size
* opt 4_0 dequant
* fix q4 dequant
* add power_utils
* add rms_norm
* wip
* enable rms_norm f32
* fix rms_norm with param
* fix compiling flags
* use float
* fix small row size
* vectorized rms norm
* wip
* read 2 vectors
* rename
* add perf log on update
* set empty tensors handle also
* merge some rpc functions
* opt param update
* wip
* print more log
* add struct for update param config
* add npu_device_graph_set_tensor_with_param
* merge tensor and params update
* wip
* wip
* make as template to reuse
* vectorize dequantize_row_q8_0
* opt
* avoid using union to store q data
* wip
* wip
* wip

1 parent 2306f82 commit c23ab46
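For readers unfamiliar with the Q4_0 format that several entries above touch ("rewrite dequantize_row_q4_0 by intrinsic", "opt 4_0 dequant"), the following is a minimal scalar sketch of what such a dequantization computes. It is not the code from this commit: the struct `BlockQ4_0Sketch`, the constant `kBlockSize`, and the function `dequantize_row_q4_0_ref` are hypothetical names, and the per-block scale is stored as a plain `float` here, whereas ggml stores it as a 16-bit half float. The nibble ordering (low nibbles fill the first half of a block, high nibbles the second half) follows the ggml CPU reference as I understand it; an intrinsic version would be expected to produce the same values using vector instructions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified block layout for illustration only.
// Assumption: ggml keeps the scale as fp16; a float is used here so the
// sketch compiles without an fp16 conversion helper.
constexpr std::size_t kBlockSize = 32;  // elements per Q4_0 block

struct BlockQ4_0Sketch {
    float        d;                   // per-block scale
    std::uint8_t qs[kBlockSize / 2];  // 32 four-bit quants, two per byte
};

// Scalar reference: expands `count` elements (a multiple of kBlockSize)
// from `blocks` into `dst`. Each 4-bit value is unsigned in [0, 15] and is
// recentred by subtracting 8 before being multiplied by the block scale.
void dequantize_row_q4_0_ref(const BlockQ4_0Sketch* blocks, float* dst, std::size_t count) {
    const std::size_t nb = count / kBlockSize;
    for (std::size_t i = 0; i < nb; ++i) {
        const float d = blocks[i].d;
        for (std::size_t j = 0; j < kBlockSize / 2; ++j) {
            const int lo = (blocks[i].qs[j] & 0x0F) - 8;  // low nibble
            const int hi = (blocks[i].qs[j] >> 4) - 8;    // high nibble
            dst[i * kBlockSize + j]                  = static_cast<float>(lo) * d;
            dst[i * kBlockSize + j + kBlockSize / 2] = static_cast<float>(hi) * d;
        }
    }
}
```

A scalar version like this is mainly useful as a correctness baseline when comparing against a vectorized implementation.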
File tree
32 files changed, +1020 -403 lines changed
- ggml/src/ggml-qnn
- npu
- device
- host
- idl
- qnn