Commit e9e661b
CUDA: remove unnecessary warp reduce in FA (ggml/1032)
* kqmax_new_j in every thread within warp is same after operate at line 199,this reduce can be omit
* same problem in vec32
---------
Co-authored-by: ZhaoXiaoYu <[email protected]>1 parent efb6ae9 commit e9e661b
2 files changed
+0
-2
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
220 | 220 | | |
221 | 221 | | |
222 | 222 | | |
223 | | - | |
224 | 223 | | |
225 | 224 | | |
226 | 225 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
206 | 206 | | |
207 | 207 | | |
208 | 208 | | |
209 | | - | |
210 | 209 | | |
211 | 210 | | |
212 | 211 | | |
| |||
0 commit comments