Commit c116a75
kvcache: Don't shift empty batches
When we context shift, we delete half the context and apply RoPE
with an offset to the other half. We used to RoPE across the entire
context in a single pass with a zero offset for the deleted
section. With the change to shifting in batches, we can skip any
batches where all of the offsets would be zero. This typically
reduces the number of operations by half.

1 parent 3515cc3 commit c116a75
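
The idea in the commit message can be illustrated with a minimal Go sketch. This is not the code from this commit: `batchRange`, `shiftOffsets`, and `batchesToShift` are hypothetical names, and the sketch assumes that discarded cells and cells before the discarded range carry a zero offset while every later cell moves back by the number of discarded cells.

```go
package main

// batchRange is a hypothetical half-open range [begin, end) of cache cells.
type batchRange struct{ begin, end int }

// shiftOffsets returns an illustrative RoPE position offset for each of n
// cells after the cells in [delBegin, delEnd) are discarded. Cells before
// delEnd keep a zero offset; cells at or after delEnd move back by the
// number of discarded cells.
func shiftOffsets(n, delBegin, delEnd int) []int32 {
	offsets := make([]int32, n)
	for i := delEnd; i < n; i++ {
		offsets[i] = int32(delBegin - delEnd)
	}
	return offsets
}

// batchesToShift splits offsets into batches of at most batchSize cells and
// returns only the batches that contain at least one non-zero offset.
// Batches whose offsets are all zero are skipped, since applying RoPE with
// a zero offset would not change them.
func batchesToShift(offsets []int32, batchSize int) []batchRange {
	if batchSize <= 0 {
		return nil
	}
	var out []batchRange
	for begin := 0; begin < len(offsets); begin += batchSize {
		end := begin + batchSize
		if end > len(offsets) {
			end = len(offsets)
		}
		for _, off := range offsets[begin:end] {
			if off != 0 {
				out = append(out, batchRange{begin, end})
				break
			}
		}
	}
	return out
}
```

Under these assumptions, discarding the first 4 of 8 cells with a batch size of 2 leaves only the batches `[4, 6)` and `[6, 8)` to shift, which is half of the four batches a full pass would touch.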
1 file changed, 17 additions and 4 deletions.

[Diff body not preserved in this capture. Two hunks: original lines 646-663 (new lines 646-676) and original lines 669-678 (new lines 682-691).]