Commit b783ebf
feat: optimize hybrid radix cache buffer insertion strategy

Add mamba_model_match_len based optimization for buffer insertion:
- Only insert buffer at actual branch points instead of fixed intervals
- Use threshold (chunked_prefill_size // 2) to decide strategy
- Reduce buffer storage overhead while maintaining cache hit rate

1 parent fb6f960 commit b783ebf
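
The strategy described in the commit message can be illustrated with a minimal sketch. The diff body is not visible here, so this is not the commit's actual code: the helper name `plan_buffer_insert_positions`, its parameters, and in particular the choice to compare `mamba_model_match_len` against the threshold are assumptions made for illustration. Only the names `mamba_model_match_len` and `chunked_prefill_size` and the threshold expression `chunked_prefill_size // 2` come from the message.

```python
from typing import List


def plan_buffer_insert_positions(
    input_len: int,
    mamba_model_match_len: int,
    chunked_prefill_size: int,
) -> List[int]:
    """Hypothetical helper: return token offsets where a buffer would be inserted.

    Assumption (not confirmed by the commit): a matched prefix of at least
    chunked_prefill_size // 2 tokens is treated as a real branch point.
    """
    threshold = chunked_prefill_size // 2

    if mamba_model_match_len >= threshold:
        # Branch-point strategy: one buffer where the request diverges
        # from the cached prefix. Nothing to insert if the whole input matched.
        if mamba_model_match_len < input_len:
            return [mamba_model_match_len]
        return []

    # Fallback (assumed): fixed-interval insertion, one buffer per prefill chunk.
    return list(range(chunked_prefill_size, input_len + 1, chunked_prefill_size))


# Long cached prefix: a single buffer at the branch point.
branch_plan = plan_buffer_insert_positions(10000, 3000, 4096)
# Short cached prefix: fall back to fixed intervals.
interval_plan = plan_buffer_insert_positions(10000, 100, 4096)
```

Under these assumptions, the branch-point path stores at most one buffer per request, which is how such a scheme could reduce buffer storage compared with inserting at every interval.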
2 files changed (+55, -6 lines) under lightllm/server/router:
- dynamic_prompt
- model_infer

Lines changed in the first file: 32 additions & 4 deletions