Skip to content

Commit e8944fc

Browse files
Merge branch 'main' into lschneider/autotune-nccl-symm
2 parents 67a647e + e8cceb0 commit e8944fc

File tree

68 files changed

+2193
-656
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

68 files changed

+2193
-656
lines changed
Lines changed: 21 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1,19 +1,23 @@
11
# Feature Combination Matrix
22

3-
| Feature | Overlap Scheduler | CUDA Graph | Attention Data Parallelism | Disaggregated Serving | Chunked Prefill | MTP | EAGLE-3(One Model Engine) | EAGLE-3(Two Model Engine) | Torch Sampler | TLLM C++ Sampler | KV Cache Reuse | Slide Window Attention | Logits Post Processor | Guided Decoding | LoRA |
4-
| -------------------------- | ----------------- | ---------- | -------------------------- | --------------------- | --------------- | -------- | ------------------------- | ------------------------- | ------------- | ---------------- | -------------- | ---------------------- | --------------------- | --------------- | ---- |
5-
| Overlap Scheduler | --- | | | | | | | | | | | | | | |
6-
| CUDA Graph | Yes | --- | | | | | | | | | | | | | |
7-
| Attention Data Parallelism | Yes | Yes | --- | | | | | | | | | | | | |
8-
| Disaggregated Serving | Yes | Yes | Yes | --- | | | | | | | | | | | |
9-
| Chunked Prefill | Yes | Yes | Yes | Yes | --- | | | | | | | | | | |
10-
| MTP | Yes | Yes | Yes | Yes | Yes | --- | | | | | | | | | |
11-
| EAGLE-3(One Model Engine) | Yes | Yes | Yes | Yes | Yes | No | --- | | | | | | | | |
12-
| EAGLE-3(Two Model Engine) | Yes | Yes | Yes | Yes | Yes | No | No | --- | | | | | | | |
13-
| Torch Sampler | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | | | | | | |
14-
| TLLM C++ Sampler | Yes | Yes | Yes | Yes | Yes | No | No | No | No | --- | | | | | |
15-
| KV Cache Reuse | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | | | | |
16-
| Slide Window Attention | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | | | |
17-
| Logits Post Processor | Yes | Yes | Yes | No | Yes | No | No | No | Yes | Yes | Yes | Yes | --- | | |
18-
| Guided Decoding | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | |
19-
| LoRA | Yes | No | Untested | Untested | Untested | Untested | Untested | Untested | Yes | Yes | Yes | Yes | Yes | Untested | --- |
3+
| Feature | Overlap Scheduler | CUDA Graph | Tensor Parallelism | Pipeline Parallelism | Expert Parallelism | Helix Parallelism | Attention Data Parallelism | Disaggregated Serving | Chunked Prefill | MTP | EAGLE-3(One Model Engine) | EAGLE-3(Two Model Engine) | Torch Sampler | TLLM C++ Sampler | KV Cache Reuse | Slide Window Attention | Logits Post Processor | Guided Decoding | LoRA |
4+
| -------------------------- | ----------------- | ---------- | ------------------ | -------------------- | ------------------ | ----------------- | -------------------------- | --------------------- | --------------- | -------- | ------------------------- | ------------------------- | ------------- | ---------------- | -------------- | ---------------------- | --------------------- | --------------- | -------- |
5+
| Overlap Scheduler | --- | | | | | | | | | | | | | | | | | | |
6+
| CUDA Graph | Yes | --- | | | | | | | | | | | | | | | | | |
7+
| Tensor Parallelism | Yes | Yes | --- | | | | | | | | | | | | | | | | |
8+
| Pipeline Parallelism | Yes | Yes | Yes | --- | | | | | | | | | | | | | | | |
9+
| Expert Parallelism | Yes | Yes | Yes | Yes | --- | | | | | | | | | | | | | | |
10+
| Helix Parallelism | Untested | Yes | Yes | Yes | Yes | --- | | | | | | | | | | | | | |
11+
| Attention Data Parallelism | Yes | Yes | Yes | Yes | Yes | Known issues | --- | | | | | | | | | | | | |
12+
| Disaggregated Serving | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | | | | | | | | | | | |
13+
| Chunked Prefill | Yes | Yes | Yes | Untested | Yes | Yes | Yes | Yes | --- | | | | | | | | | | |
14+
| MTP | Yes | Yes | Yes | No | Yes | No | Yes | Yes | Yes | --- | | | | | | | | | |
15+
| EAGLE-3(One Model Engine) | Yes | Yes | Yes | No | Yes | No | Yes | Yes | Yes | No | --- | | | | | | | | |
16+
| EAGLE-3(Two Model Engine) | Yes | Yes | Yes | No | Yes | No | Yes | Yes | Yes | No | No | --- | | | | | | | |
17+
| Torch Sampler | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | | | | | | |
18+
| TLLM C++ Sampler | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No | No | --- | | | | | |
19+
| KV Cache Reuse | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | | | | |
20+
| Slide Window Attention | Yes | Yes | Yes | Yes | Yes | Untested | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | | | |
21+
| Logits Post Processor | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | No | No | No | Yes | Yes | Yes | Yes | --- | | |
22+
| Guided Decoding | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | --- | |
23+
| LoRA | Yes | No | Yes | Yes | Untested | Untested | Untested | Untested | Yes | Untested | Untested | Untested | Yes | Yes | Yes | Yes | Yes | Untested | --- |

security_scanning/examples/apps/poetry.lock

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

security_scanning/examples/auto_deploy/poetry.lock

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

security_scanning/examples/draft_target_model/poetry.lock

Lines changed: 11 additions & 11 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)