Commit 85a7d86
memory : remove KV cache size padding (ggml-org#16812)
* memory : remove KV cache size padding
* cont : restore padding for n_kv tensor shape
* server : use slot context size instead of training context size
* server : simplify context limit logic

1 parent a8ca18b commit 85a7d86
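For readers unfamiliar with the term, "padding" in the first two bullets refers to rounding a size up to a multiple of some alignment value. The sketch below only illustrates that arithmetic. `pad_up` is a hypothetical helper and the multiple of 256 is an assumed example value; neither is taken from this commit's diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not taken from the commit: round n up to the next
// multiple of `multiple` (returns n unchanged if it is already a multiple).
inline uint32_t pad_up(uint32_t n, uint32_t multiple) {
    assert(multiple > 0);
    return ((n + multiple - 1) / multiple) * multiple;
}

// Size a cache would get if the requested size is padded.
// The multiple of 256 is an assumption made for illustration only.
inline uint32_t kv_size_padded(uint32_t requested) {
    return pad_up(requested, 256);
}

// Size a cache would get if the requested size is used as-is,
// which is the behavior the commit title describes.
inline uint32_t kv_size_unpadded(uint32_t requested) {
    return requested;
}
```

Under these assumptions, a requested size of 1000 cells becomes 1024 with padding and stays at 1000 without it. A caller comparing the reported context size against the requested one would therefore see matching values once padding is removed.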
File tree
6 files changed, +14 -54 lines changed
- src
- tools/server
- tests/unit