@@ -3,14 +3,14 @@ SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All
 SPDX-License-Identifier: Apache-2.0
 -->
 
-# aiconfigurator
+# AIConfigurator
 Today, in disaggregated serving, it's quite difficult to find a proper config
 to get benefits from disaggregation such as how many prefill workers and decode workers
 do I need and what about the parallelism for each worker. Combined with SLA:
 TTFT(Time-To-First-Token) and TPOT(Time-Per-Output-Token), it becomes even more complicated
 to solve the throughput @ latency problem.
 
-We're introducing aiconfigurator to help you find a good reference to start with in your
+We're introducing AIConfigurator to help you find a good reference to start with in your
 disaggregated serving journey. The tool will try to search the space to get a good deployment config
 based on your requirement including which model you want to serve, how many GPUs you have and what's
 the GPU. Automatically generate the config files for you to deploy with Dynamo.
@@ -52,7 +52,7 @@ With **-h**, you can have more information about optional args to customize your
 
 ```
 ********************************************************************************
-* Dynamo aiconfigurator Final Results *
+* Dynamo AIConfigurator Final Results *
 ********************************************************************************
  ----------------------------------------------------------------------------
  Input Configuration & SLA Target:
@@ -205,6 +205,15 @@ TRTLLM Versions: 0.20.0, 1.0.0rc3
 Parallel modes: Tensor-parallel; Pipeline-parallel; Expert Tensor-parallel/Expert-parallell; Attention DP for DEEPSEEK and MoE
 Scheduling: Static; IFB(continuous batching); Disaggregated serving; MTP for DEEPSEEK
 
+### System Data Support Matrix
+
+| System | Framework(Version) | Status |
+| --------| -------------------| --------|
+| h100_sxm | TRTLLM(0.20.0, 1.0.0rc3) | ✅ |
+| h200_sxm | TRTLLM(0.20.0, 1.0.0rc3) | ✅ |
+| b200_sxm | TRTLLM(NA) | 🚧 |
+
+
 ## Data Collection
 Data collection is a standalone process for collecting the database for aiconfigurator. By default, you don't have to collect the data by yourself.
 Small versions of database will not introduce huge perf difference. Say, you can use 1.0.0rc3 data of trtllm on h200_sxm and deploy the generated