Commit 08d5712
[nvbug/5374773] chore: Add a runtime flag to enable fail fast when attn window is too large to fit at least one sequence in KV cache (NVIDIA#5974)
Signed-off-by: moraxu <[email protected]>
1 parent c35c78f commit 08d5712
File tree
12 files changed, +154 -51 lines changed
- cpp
  - include/tensorrt_llm/executor
  - tensorrt_llm
    - batch_manager
    - executor
    - pybind/executor
- examples
- tensorrt_llm
  - commands
  - llmapi
  - runtime
- tests/unittest/api_stability/references
Lines changed: 22 additions & 10 deletions
Lines changed: 5 additions & 2 deletions