Commit b9ce940
authored
vulkan: Fuse rope+set_rows (#16769)
This pattern appears in a lot of models, the rope operation is applied right
before storing into the KV cache (usually on the K tensor).
Add a path to some of the rope shaders that computes the destination address
based on the set_rows tensor. Compile variants of the shader with D_TYPE of
f16 (the usual KV cache type).
Add a src3 operand to ggml_vk_op_f32 - sometimes rope uses three srcs and needs
the fourth for the row indices.
Add fused_ops_write_mask to indicate which intermediate tensors need to write
their results to memory. Skipping writing the roped K value helps to allow more
nodes to run concurrently.
Add logic to ggml_vk_graph_optimize to make ROPE+VIEW+SET_ROWS consecutive. It
rarely starts out that way in the graph.
Add new backend tests.1 parent 3464bda commit b9ce940
File tree
6 files changed
+371
-117
lines changed- ggml/src/ggml-vulkan
- vulkan-shaders
- tests
6 files changed
+371
-117
lines changedLarge diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| 13 | + | |
13 | 14 | | |
14 | 15 | | |
15 | 16 | | |
| |||
27 | 28 | | |
28 | 29 | | |
29 | 30 | | |
| 31 | + | |
30 | 32 | | |
31 | 33 | | |
32 | 34 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | | - | |
| 19 | + | |
20 | 20 | | |
21 | 21 | | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
22 | 29 | | |
23 | | - | |
24 | | - | |
| 30 | + | |
| 31 | + | |
25 | 32 | | |
26 | 33 | | |
27 | 34 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | | - | |
| 19 | + | |
20 | 20 | | |
21 | 21 | | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
22 | 29 | | |
23 | | - | |
24 | | - | |
| 30 | + | |
| 31 | + | |
25 | 32 | | |
26 | 33 | | |
27 | 34 | | |
| |||
Lines changed: 4 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
842 | 842 | | |
843 | 843 | | |
844 | 844 | | |
| 845 | + | |
| 846 | + | |
845 | 847 | | |
846 | 848 | | |
847 | 849 | | |
848 | 850 | | |
| 851 | + | |
| 852 | + | |
849 | 853 | | |
850 | 854 | | |
851 | 855 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2125 | 2125 | | |
2126 | 2126 | | |
2127 | 2127 | | |
| 2128 | + | |
| 2129 | + | |
| 2130 | + | |
| 2131 | + | |
| 2132 | + | |
| 2133 | + | |
| 2134 | + | |
| 2135 | + | |
| 2136 | + | |
| 2137 | + | |
| 2138 | + | |
| 2139 | + | |
| 2140 | + | |
| 2141 | + | |
| 2142 | + | |
| 2143 | + | |
| 2144 | + | |
| 2145 | + | |
| 2146 | + | |
| 2147 | + | |
| 2148 | + | |
| 2149 | + | |
| 2150 | + | |
| 2151 | + | |
| 2152 | + | |
| 2153 | + | |
| 2154 | + | |
| 2155 | + | |
2128 | 2156 | | |
2129 | 2157 | | |
2130 | 2158 | | |
| |||
2168 | 2196 | | |
2169 | 2197 | | |
2170 | 2198 | | |
2171 | | - | |
2172 | | - | |
2173 | 2199 | | |
2174 | 2200 | | |
2175 | 2201 | | |
2176 | 2202 | | |
2177 | 2203 | | |
2178 | 2204 | | |
2179 | | - | |
2180 | | - | |
2181 | | - | |
2182 | | - | |
2183 | | - | |
2184 | | - | |
2185 | | - | |
2186 | | - | |
2187 | | - | |
2188 | | - | |
2189 | | - | |
2190 | | - | |
2191 | | - | |
2192 | | - | |
2193 | | - | |
2194 | | - | |
2195 | | - | |
2196 | | - | |
2197 | | - | |
2198 | | - | |
2199 | | - | |
2200 | | - | |
2201 | | - | |
| 2205 | + | |
2202 | 2206 | | |
2203 | 2207 | | |
2204 | 2208 | | |
| |||
2227 | 2231 | | |
2228 | 2232 | | |
2229 | 2233 | | |
| 2234 | + | |
| 2235 | + | |
| 2236 | + | |
| 2237 | + | |
| 2238 | + | |
| 2239 | + | |
| 2240 | + | |
| 2241 | + | |
| 2242 | + | |
| 2243 | + | |
| 2244 | + | |
| 2245 | + | |
| 2246 | + | |
| 2247 | + | |
| 2248 | + | |
| 2249 | + | |
| 2250 | + | |
| 2251 | + | |
| 2252 | + | |
| 2253 | + | |
| 2254 | + | |
| 2255 | + | |
| 2256 | + | |
| 2257 | + | |
| 2258 | + | |
| 2259 | + | |
| 2260 | + | |
| 2261 | + | |
| 2262 | + | |
| 2263 | + | |
| 2264 | + | |
| 2265 | + | |
| 2266 | + | |
| 2267 | + | |
| 2268 | + | |
| 2269 | + | |
| 2270 | + | |
| 2271 | + | |
| 2272 | + | |
| 2273 | + | |
| 2274 | + | |
| 2275 | + | |
| 2276 | + | |
| 2277 | + | |
| 2278 | + | |
| 2279 | + | |
| 2280 | + | |
| 2281 | + | |
| 2282 | + | |
| 2283 | + | |
| 2284 | + | |
| 2285 | + | |
| 2286 | + | |
| 2287 | + | |
| 2288 | + | |
| 2289 | + | |
| 2290 | + | |
| 2291 | + | |
| 2292 | + | |
| 2293 | + | |
| 2294 | + | |
2230 | 2295 | | |
2231 | 2296 | | |
2232 | 2297 | | |
| |||
6163 | 6228 | | |
6164 | 6229 | | |
6165 | 6230 | | |
| 6231 | + | |
| 6232 | + | |
| 6233 | + | |
| 6234 | + | |
| 6235 | + | |
| 6236 | + | |
| 6237 | + | |
6166 | 6238 | | |
6167 | 6239 | | |
6168 | 6240 | | |
| |||
0 commit comments