Commit 9e5e09d
committed
sampling : remove backend-dist option (wip)
This commit removes the `--backend-dist` option and instead uses the
configured --samplers chain to determine which samplers run on the
backend.
Backend sampling is still enabled using With `--backend_sampling`, and
the sampler chain, either explictly specified using `--samplers` or the
default, is automatically analyzed to determine which samplers can run
on the backend. The system finds the longest contiguous chain of
backend supported samplers from the start of the sampler sequence.
For example:
* If the chain is `top-k -> temperature -> top-p`, and both `top-k` and
`temperature` are backend-supported but `top-p` is not, then `top-k`
and `temperature` will run on the backend, while `top-p` and
subsequent samplers run on the CPU.
* If all configured samplers are supported, the final distribution
sampling will also happen on the backend, transferring only the
sampled token IDs back to the host.
* If the sampler chain starts with an unsupported sampler (e.g.,
`penalties`), all sampling runs on the CPU. Note that this is
currently the case with the default sampler so to use backend sampling
it is required to specify a sampler chain. See below for an example.
The following shows how llama-cli can be run with backend sampling:
```console
$ llama-cli -m models/Qwen2.5-VL-3B-Instruct-Q8_0.gguf \
--prompt 'What is the capital of Sweden?' \
-n 20 \
-no-cnv \
--verbose-prompt \
-ngl 40 \
--backend-sampling \
--samplers 'top_k;temperature'
```
In this case the all sampling will happen on the backend since both
`top_k` and `temperature` are supported backend samplers.
To enable a partial backend sampling (hybrid sampling), for example
running `top_k` and `temperature` on the backend and `typ_p` on the CPU
the following sampler chain could be specified:
```console
$ llama-cli -m models/Qwen2.5-VL-3B-Instruct-Q8_0.gguf \
--prompt 'What is the capital of Sweden?' \
-n 20 \
-no-cnv \
--verbose-prompt \
-ngl 40 \
--backend-sampling \
--samplers 'top_k;temperature;top_p'
```
If this looks good then I'll follow up with updates the llama-cli and
llama-server documentation to reflect these changes.1 parent 53dca56 commit 9e5e09d
File tree
15 files changed
+214
-96
lines changed- common
- examples/batched
- tools/server
- public
- webui/src/lib
- components/app/chat/ChatSettings
- constants
- services
- stores
- types
15 files changed
+214
-96
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1520 | 1520 | | |
1521 | 1521 | | |
1522 | 1522 | | |
1523 | | - | |
1524 | | - | |
1525 | | - | |
1526 | | - | |
1527 | | - | |
1528 | | - | |
1529 | | - | |
1530 | | - | |
1531 | 1523 | | |
1532 | 1524 | | |
1533 | 1525 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1021 | 1021 | | |
1022 | 1022 | | |
1023 | 1023 | | |
1024 | | - | |
1025 | | - | |
1026 | | - | |
| 1024 | + | |
| 1025 | + | |
| 1026 | + | |
| 1027 | + | |
| 1028 | + | |
| 1029 | + | |
| 1030 | + | |
| 1031 | + | |
| 1032 | + | |
| 1033 | + | |
1027 | 1034 | | |
1028 | | - | |
1029 | | - | |
1030 | 1035 | | |
1031 | 1036 | | |
1032 | 1037 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
213 | 213 | | |
214 | 214 | | |
215 | 215 | | |
216 | | - | |
217 | 216 | | |
218 | | - | |
219 | 217 | | |
220 | 218 | | |
221 | 219 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
105 | 105 | | |
106 | 106 | | |
107 | 107 | | |
108 | | - | |
| 108 | + | |
| 109 | + | |
109 | 110 | | |
110 | 111 | | |
111 | 112 | | |
| |||
118 | 119 | | |
119 | 120 | | |
120 | 121 | | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
121 | 125 | | |
122 | 126 | | |
123 | 127 | | |
| |||
165 | 169 | | |
166 | 170 | | |
167 | 171 | | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
168 | 186 | | |
169 | 187 | | |
170 | 188 | | |
| |||
249 | 267 | | |
250 | 268 | | |
251 | 269 | | |
252 | | - | |
253 | | - | |
254 | | - | |
255 | | - | |
256 | | - | |
257 | | - | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
258 | 277 | | |
259 | 278 | | |
260 | | - | |
261 | | - | |
262 | | - | |
263 | | - | |
264 | | - | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
| 315 | + | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
265 | 345 | | |
266 | 346 | | |
267 | | - | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
268 | 350 | | |
269 | 351 | | |
270 | 352 | | |
| |||
308 | 390 | | |
309 | 391 | | |
310 | 392 | | |
311 | | - | |
| 393 | + | |
| 394 | + | |
| 395 | + | |
| 396 | + | |
| 397 | + | |
| 398 | + | |
| 399 | + | |
312 | 400 | | |
313 | 401 | | |
314 | 402 | | |
| |||
323 | 411 | | |
324 | 412 | | |
325 | 413 | | |
326 | | - | |
| 414 | + | |
327 | 415 | | |
328 | 416 | | |
| 417 | + | |
329 | 418 | | |
330 | 419 | | |
| 420 | + | |
| 421 | + | |
| 422 | + | |
| 423 | + | |
| 424 | + | |
| 425 | + | |
| 426 | + | |
| 427 | + | |
| 428 | + | |
| 429 | + | |
| 430 | + | |
| 431 | + | |
| 432 | + | |
| 433 | + | |
| 434 | + | |
| 435 | + | |
| 436 | + | |
| 437 | + | |
| 438 | + | |
331 | 439 | | |
332 | 440 | | |
333 | 441 | | |
334 | 442 | | |
335 | 443 | | |
336 | | - | |
337 | | - | |
338 | | - | |
339 | | - | |
340 | | - | |
| 444 | + | |
| 445 | + | |
341 | 446 | | |
342 | 447 | | |
343 | 448 | | |
344 | 449 | | |
345 | 450 | | |
346 | 451 | | |
347 | | - | |
348 | | - | |
349 | | - | |
| 452 | + | |
| 453 | + | |
| 454 | + | |
| 455 | + | |
| 456 | + | |
350 | 457 | | |
351 | | - | |
352 | | - | |
| 458 | + | |
| 459 | + | |
| 460 | + | |
| 461 | + | |
| 462 | + | |
| 463 | + | |
| 464 | + | |
| 465 | + | |
| 466 | + | |
| 467 | + | |
| 468 | + | |
| 469 | + | |
| 470 | + | |
| 471 | + | |
| 472 | + | |
| 473 | + | |
353 | 474 | | |
354 | 475 | | |
355 | | - | |
| 476 | + | |
| 477 | + | |
| 478 | + | |
| 479 | + | |
| 480 | + | |
| 481 | + | |
356 | 482 | | |
357 | 483 | | |
358 | 484 | | |
| |||
362 | 488 | | |
363 | 489 | | |
364 | 490 | | |
365 | | - | |
366 | 491 | | |
367 | 492 | | |
| 493 | + | |
| 494 | + | |
| 495 | + | |
| 496 | + | |
368 | 497 | | |
369 | 498 | | |
370 | 499 | | |
| |||
387 | 516 | | |
388 | 517 | | |
389 | 518 | | |
390 | | - | |
391 | | - | |
392 | | - | |
393 | | - | |
394 | | - | |
395 | | - | |
| 519 | + | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
| 524 | + | |
| 525 | + | |
396 | 526 | | |
397 | 527 | | |
398 | 528 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
45 | 45 | | |
46 | 46 | | |
47 | 47 | | |
48 | | - | |
| 48 | + | |
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
52 | | - | |
53 | | - | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
54 | 57 | | |
55 | | - | |
56 | | - | |
57 | | - | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
58 | 61 | | |
59 | | - | |
60 | | - | |
61 | | - | |
62 | | - | |
63 | | - | |
64 | | - | |
65 | | - | |
66 | | - | |
67 | | - | |
68 | | - | |
69 | | - | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
Binary file not shown.
0 commit comments