Commit 470939d
authored
common : preallocate sampling token data vector (#8363)
`emplace_back` repeatedly-called is slower than preallocating the vector to the vocab size and directly inserting the data. Some rudimentary profiling with `chrono` improves the performance of this block of code from ~500us/op to ~40us/op.
Overall, this slightly improves the sampling performance which has a more substantial impact for the `examples/lookahead` implementation -- I am able to see a ~10% performance boost in lookahead inference.1 parent 6f0dbf6 commit 470939d
1 file changed
+3
-3
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
378 | 378 | | |
379 | 379 | | |
380 | 380 | | |
381 | | - | |
| 381 | + | |
382 | 382 | | |
383 | 383 | | |
384 | 384 | | |
| |||
391 | 391 | | |
392 | 392 | | |
393 | 393 | | |
394 | | - | |
| 394 | + | |
395 | 395 | | |
396 | 396 | | |
397 | | - | |
| 397 | + | |
398 | 398 | | |
399 | 399 | | |
400 | 400 | | |
| |||
0 commit comments