Commit 089dd63

lucaslie authored and dominicshanshan committed
[None][chore] AutoDeplopy: Update expert section on yaml configuration in README (NVIDIA#8370)
Signed-off-by: Lucas Liebenwein <[email protected]>
1 parent 6ca2f55 commit 089dd63

1 file changed: +13 -10 lines changed

examples/auto_deploy/README.md

Lines changed: 13 additions & 10 deletions
@@ -262,26 +262,27 @@ Then use these configurations:
 # Using single YAML config
 python build_and_run_ad.py \
 --model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
---yaml-configs my_config.yaml
+--yaml-extra my_config.yaml
 
 # Using multiple YAML configs (deep merged in order, later files have higher priority)
 python build_and_run_ad.py \
 --model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
---yaml-configs my_config.yaml production.yaml
+--yaml-extra my_config.yaml production.yaml
 
 # Targeting nested AutoDeployConfig with separate YAML
 python build_and_run_ad.py \
 --model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
---yaml-configs my_config.yaml \
---args.yaml-configs autodeploy_overrides.yaml
+--yaml-extra my_config.yaml \
+--args.yaml-extra autodeploy_overrides.yaml
 ```
 
 #### Configuration Precedence and Deep Merging
 
 The configuration system follows a strict precedence order where higher priority sources override lower priority ones:
 
 1. **CLI Arguments** (highest priority) - Direct command line arguments
-1. **YAML Configs** - Files specified via `--yaml-configs` and `--args.yaml-configs`
+1. **YAML Extra Configs** - Files specified via `--yaml-extra` and `--args.yaml-extra`
+1. **YAML Default Config** - (**do not change**) Files specified via `--yaml-default` and `--args.yaml-default`
 1. **Default Settings** (lowest priority) - Built-in defaults from the config classes
 
 **Deep Merging**: Unlike simple overwriting, deep merging intelligently combines nested dictionaries recursively. For example:
@@ -307,18 +308,18 @@ args:
 **Nested Config Behavior**: When using nested configurations, outer YAML configs become init settings for inner objects, giving them higher precedence:
 
 ```bash
-# The outer yaml-configs affects the entire ExperimentConfig
-# The inner args.yaml-configs affects only the AutoDeployConfig
+# The outer yaml-extra affects the entire ExperimentConfig
+# The inner args.yaml-extra affects only the AutoDeployConfig
 python build_and_run_ad.py \
 --model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
---yaml-configs experiment_config.yaml \
---args.yaml-configs autodeploy_config.yaml \
+--yaml-extra experiment_config.yaml \
+--args.yaml-extra autodeploy_config.yaml \
 --args.world-size=8 # CLI override beats both YAML configs
 ```
 
 #### Built-in Default Configuration
 
-Both [`AutoDeployConfig`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) and [`LlmArgs`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) classes automatically load a built-in [`default.yaml`](../../tensorrt_llm/_torch/auto_deploy/config/default.yaml) configuration file that provides sensible defaults for the AutoDeploy inference optimizer pipeline. This file is specified in the [`_get_config_dict()`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) function and defines default transform configurations for graph optimization stages.
+Both [`AutoDeployConfig`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) and [`LlmArgs`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) classes automatically load a built-in [`default.yaml`](../../tensorrt_llm/_torch/auto_deploy/config/default.yaml) configuration file that provides sensible defaults for the AutoDeploy inference optimizer pipeline. This file is specified via the [`yaml_default`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) field and defines default transform configurations for graph optimization stages.
 
 The built-in defaults are automatically merged with your configurations at the lowest priority level, ensuring that your custom settings always override the defaults. You can inspect the current default configuration to understand the baseline transform pipeline:
 
@@ -332,6 +333,8 @@ python build_and_run_ad.py \
 --args.transforms.export-to-gm.strict=true
 ```
 
+As indicated before, this can be overwritten via the `yaml_default` (`--yaml-default`) field but note that this will overwrite the entire Inference Optimizer pipeline.
+
 </details>
 
 ## Roadmap
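
Note on the precedence list changed in this commit: the four-level ordering (class defaults, YAML default config, YAML extra configs, CLI arguments) combined with deep merging can be illustrated with a small standalone sketch. This is not the AutoDeploy implementation. The `deep_merge` helper, the layer dictionaries, and the `stage` key are hypothetical and exist only to show how later layers override individual nested keys while leaving sibling keys in place. The `world_size` and `transforms.export_to_gm.strict` keys mirror the CLI flags shown in the README, but their values here are made up.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which `override` is merged into `base` recursively.

    Hypothetical helper for illustration only; not part of AutoDeploy.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, or mismatched types: the override wins outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical layers, listed from lowest to highest priority.
class_defaults = {"args": {"world_size": 1}}
yaml_default = {"args": {"transforms": {"export_to_gm": {"strict": False, "stage": "export"}}}}
yaml_extra = {"args": {"world_size": 4, "transforms": {"export_to_gm": {"strict": True}}}}
cli_args = {"args": {"world_size": 8}}

config: Dict[str, Any] = {}
for layer in (class_defaults, yaml_default, yaml_extra, cli_args):
    config = deep_merge(config, layer)

print(config)
```

In this sketch the CLI layer sets `world_size`, the extra YAML layer flips `strict`, and the `stage` entry from the default layer survives because no later layer mentions it. A plain `dict.update` would have replaced the whole `transforms` subtree instead.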
