`examples/auto_deploy/README.md`

Then use these configurations:

```bash
# Using a single YAML config
python build_and_run_ad.py \
--model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
--yaml-extra my_config.yaml

# Using multiple YAML configs (deep merged in order; later files have higher priority)
python build_and_run_ad.py \
--model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
--yaml-extra my_config.yaml production.yaml

# Targeting the nested AutoDeployConfig with a separate YAML file
python build_and_run_ad.py \
--model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
--yaml-extra my_config.yaml \
--args.yaml-extra autodeploy_overrides.yaml
```

#### Configuration Precedence and Deep Merging

The configuration system follows a strict precedence order in which higher-priority sources override lower-priority ones (see the example after the list):

1. **CLI Arguments** (highest priority) - Direct command line arguments
1. **YAML Extra Configs** - Files specified via `--yaml-extra` and `--args.yaml-extra`
1. **YAML Default Config** (**do not change**) - Files specified via `--yaml-default` and `--args.yaml-default`
1. **Default Settings** (lowest priority) - Built-in defaults from the config classes
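
For instance, here is a minimal sketch of how the precedence resolves a single setting; `my_config.yaml` reuses the file name from the examples above, and its assumed content (`args.world_size: 4`) is purely hypothetical:

```bash
# Assume my_config.yaml (hypothetical content) contains:
#   args:
#     world_size: 4
# The CLI flag below beats the YAML value, and both beat the built-in default.yaml.
python build_and_run_ad.py \
--model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
--yaml-extra my_config.yaml \
--args.world-size=8 # effective world_size is 8 (CLI wins)
```
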
**Deep Merging**: Unlike simple overwriting, deep merging intelligently combines nested dictionaries recursively. For example:
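
Consider two hypothetical files, `base.yaml` and `override.yaml` (the names and keys below are illustrative, not taken from the repository), passed together as `--yaml-extra base.yaml override.yaml`:

```yaml
# base.yaml (hypothetical)
args:
  model_kwargs:
    num_hidden_layers: 10
    hidden_size: 2048
---
# override.yaml (hypothetical): only touches one nested key
args:
  model_kwargs:
    num_hidden_layers: 2
---
# Deep-merged result: sibling keys survive while overlapping keys are overridden,
# so hidden_size is kept and num_hidden_layers comes from override.yaml.
args:
  model_kwargs:
    num_hidden_layers: 2
    hidden_size: 2048
```

A simple overwrite would instead replace the whole `args.model_kwargs` dictionary and drop `hidden_size`.
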

**Nested Config Behavior**: When using nested configurations, the outer YAML configs become init settings for the inner objects, giving them higher precedence:

```bash
# The outer yaml-extra affects the entire ExperimentConfig
# The inner args.yaml-extra affects only the AutoDeployConfig
python build_and_run_ad.py \
--model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
--yaml-extra experiment_config.yaml \
--args.yaml-extra autodeploy_config.yaml \
--args.world-size=8 # CLI override beats both YAML configs
```

#### Built-in Default Configuration

Both the [`AutoDeployConfig`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) and [`LlmArgs`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) classes automatically load a built-in [`default.yaml`](../../tensorrt_llm/_torch/auto_deploy/config/default.yaml) configuration file that provides sensible defaults for the AutoDeploy inference optimizer pipeline. This file is specified via the [`yaml_default`](../../tensorrt_llm/_torch/auto_deploy/llm_args.py) field and defines the default transform configurations for the graph optimization stages.

The built-in defaults are automatically merged with your configurations at the lowest priority level, ensuring that your custom settings always override the defaults. You can inspect the current default configuration to understand the baseline transform pipeline:
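
One quick way to do that is to print the bundled file (assuming you run this from the repository root):

```bash
# Show the baseline transform pipeline shipped with AutoDeploy
cat tensorrt_llm/_torch/auto_deploy/config/default.yaml
```

Individual transform settings can then be overridden directly from the CLI, for example: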

```bash
python build_and_run_ad.py \
--model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
--args.transforms.export-to-gm.strict=true
```

As indicated above, the default configuration can be overridden via the `yaml_default` field (`--yaml-default`); note, however, that doing so overwrites the entire Inference Optimizer pipeline.
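
A minimal sketch of such a replacement, assuming a hypothetical `my_pipeline.yaml` that defines the complete transform pipeline:

```bash
# Hypothetical: swap my_pipeline.yaml in for the built-in default.yaml.
# Because the file is replaced rather than merged, it must define the
# entire Inference Optimizer transform pipeline on its own.
python build_and_run_ad.py \
--model "meta-llama/Meta-Llama-3.1-8B-Instruct" \
--yaml-default my_pipeline.yaml
```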