* Initial set of changes to document quick search
* Changes based on review comments
* Fixing formatting
* Create brute and quick sections
* Creating links and descriptions for auto vs manual
* Some more details and cleanup
* fix typo
Co-authored-by: tgerdes <[email protected]>
 to test the model with different concurrency and batch sizes of requests. Using
-[Manual Config Search](docs/config_search.md#manual-configuration-search), you can create manual sweeps for every parameter that can be specified in the model configuration.
+[Manual Config Search](docs/config_search.md#manual-brute-search), you can create manual sweeps for every parameter that can be specified in the model configuration.
 
 * [Detailed and summary reports](docs/report.md): Model Analyzer is able to generate
 summarized and detailed reports that can help you better understand the trade-offs
docs/cli.md: 13 additions & 7 deletions
@@ -26,7 +26,7 @@ $ model-analyzer -h
 Options like `-q`, `--quiet` and `-v`, `--verbose` are global and apply to all
 model analyzer subcommands.
 
-## Model Analyze Modes
+## Model Analyzer Modes
 
 The `-m` or `--mode` flag is global and is accessible to all subcommands. It tells the model analyzer the context
 in which it is being run. Currently model analyzer supports 2 modes.
@@ -86,8 +86,8 @@ $ model-analyzer profile -h
 
 Depending on the command line or YAML config options provided, the `profile`
 subcommand will either perform a
-[manual](./config_search.md#manual-configuration-search) or [automatic
-search](./config_search.md#automatic-configuration-search) over perf analyzer
+[manual](./config_search.md#manual-brute-search), [automatic](./config_search.md#automatic-brute-search), or
+[quick](./config_search.md#quick-configuration-search) search over perf analyzer
 and model config file parameters. For each combination of [model config
 parameters](./config.md#model-config-parameters) (e.g. _max batch size_, _dynamic batching_, and _instance count_), it will run tritonserver and perf analyzer instances with
 all the specified run parameters (client request concurrency and static batch
@@ -112,19 +112,25 @@ Some example profile commands are shown here. For a full example see the
2. Run auto config search on 2 models called `resnet50_libtorch` and `vgg16_graphdef` located in `/home/model_repo` and save checkpoints to `checkpoints`
115
+
2. Run quick search on a model called `resnet50_libtorch` located in `/home/model_repo`
3. Run auto config search on 2 models called `resnet50_libtorch` and `vgg16_graphdef` located in `/home/model_repo` and save checkpoints to `checkpoints`
3. Run auto config search on a model called `resnet50_libtorch` located in `/home/model_repo`, but change the repository where model config variants are stored to `/home/output_repo`
127
+
4. Run auto config search on a model called `resnet50_libtorch` located in `/home/model_repo`, but change the repository where model config variants are stored to `/home/output_repo`
4. Run profile over manually defined configurations for a models `classification_malaria_v1` and `classification_chestxray_v1` located in `/home/model_repo` using the YAML config file
133
+
5. Run profile over manually defined configurations for a models `classification_malaria_v1` and `classification_chestxray_v1` located in `/home/model_repo` using the YAML config file
128
134
129
135
```
130
136
$ model-analyzer profile -f /path/to/config.yaml
@@ -157,7 +163,7 @@ profile_models:
 max_queue_delay_microseconds: [100]
 ```
 
-5. Apply objectives and constraints to sort and filter results in summary plots and tables using yaml config file.
+6. Apply objectives and constraints to sort and filter results in summary plots and tables using yaml config file.
docs/config_search.md: 46 additions & 34 deletions
@@ -16,20 +16,24 @@ limitations under the License.
 
 # Model Config Search
 
-Model Analyzer's `profile` subcommand supports **automatic** and **manual**
-sweeping through different configurations for Triton models.
+Model Analyzer's `profile` subcommand supports multiple modes when searching to find the best model configuration.
+* [Brute](config_search.md#brute-search-mode) is the default, and will do a brute-force sweep of the cross product of all possible configurations
+* [Quick](config_search.md#quick-search-mode) will use heuristics to try to find the optimal configuration much quicker than brute, and can be enabled via `--run-config-search-mode quick`
 
-## Automatic Configuration Search
+## Brute Search Mode
+
+Model Analyzer's brute search mode will do a brute-force sweep of the cross product of all possible configurations. You can [Manually](config_search.md#manual-brute-search) provide `model_config_parameters` to tell Model Analyzer what to sweep over, or you can
+let it [Automatically](config_search.md#automatic-brute-search) sweep through configurations expected to have the highest impact on performance for Triton models.
+
+### Automatic Brute Search
 
 Automatic configuration search is the default behavior when running Model
-Analyzer. This mode is enabled when there is not any parameters specified for the
-`model_config_parameters` section of the Model Analyzer Config. The parameters
+Analyzer without manually specifying what values to search. The parameters
-Additionally, [`dynamic_batching`](https://github.com/triton-inference-server/server/blob/master/docs/model_configuration.md#dynamic-batcher) will be enabled.
+Additionally, [`dynamic_batching`](https://github.com/triton-inference-server/server/blob/master/docs/model_configuration.md#dynamic-batcher) will be enabled if it is legal to do so.
 
 An example model analyzer config that performs automatic config search looks
 like below:
@@ -50,7 +54,7 @@ For each `instance_group`, Model Analyzer will sweep values 1 through 128 increa
-* If the model is using [ONNX](https://github.com/triton-inference-server/onnxruntime_backend) or [Tensorflow backend](https://github.com/triton-inference-server/tensorflow_backend), the "execution_accelerators" parameters. More information about this parameter is
-available in the [Triton Optimization Guide](https://github.com/triton-inference-server/server/blob/main/docs/optimization.md#framework-specific-optimization)
+- If the model is using [ONNX](https://github.com/triton-inference-server/onnxruntime_backend) or [Tensorflow backend](https://github.com/triton-inference-server/tensorflow_backend), the "execution_accelerators" parameters. More information about this parameter is
+available in the [Triton Optimization Guide](https://github.com/triton-inference-server/server/blob/main/docs/optimization.md#framework-specific-optimization)
+
+## Quick Search Mode
+
+Quick search can be enabled by adding the parameter `--run-config-search-mode quick` to the CLI.
+
+It uses a hill climbing algorithm to search the configuration space, looking for
+the maximal objective value within the specified constraints. In the majority of cases
+this will find greater than 95% of the maximum objective value (that could be found using a brute force search), while needing to search less than 10% of the configuration space.
+
+After it has found the best config(s), it will then sweep the top-N configurations found (specified by `--num-configs-per-model`) over the default concurrency range before generation of the summary reports.
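
To make the difference between the two modes described in this change concrete, here is a minimal Python sketch that contrasts a brute-force sweep of a cross product with a greedy hill climb over the same space. It is an illustration only and is not Model Analyzer's implementation: the search space, the `objective` function (a toy stand-in for a measured metric such as throughput), and every helper name are assumptions made for the example, and constraints are left out for brevity.

```python
from itertools import product

# Hypothetical search space; the names and ranges are illustrative only.
SPACE = {
    "instance_count": [1, 2, 3, 4, 5, 6, 7, 8],
    "max_batch_size": [1, 2, 4, 8, 16, 32, 64, 128],
}
NAMES = list(SPACE)


def objective(idx):
    """Toy stand-in for a measured objective; peaks at index (3, 4)."""
    i, b = idx
    return 100 - (i - 3) ** 2 - (b - 4) ** 2


def neighbors(idx):
    """Yield configurations that differ by one step in one parameter."""
    for dim, name in enumerate(NAMES):
        for step in (-1, 1):
            pos = idx[dim] + step
            if 0 <= pos < len(SPACE[name]):
                yield idx[:dim] + (pos,) + idx[dim + 1:]


def brute_force():
    """Evaluate the full cross product of all parameter values."""
    grid = product(*(range(len(SPACE[n])) for n in NAMES))
    scores = {idx: objective(idx) for idx in grid}
    best = max(scores, key=scores.get)
    return best, scores[best], len(scores)


def hill_climb(start):
    """Move to the best-scoring neighbor until no neighbor improves."""
    current = start
    scores = {current: objective(current)}
    while True:
        for cand in neighbors(current):
            if cand not in scores:
                scores[cand] = objective(cand)
        best = max(neighbors(current), key=scores.get)
        if scores[best] <= scores[current]:
            return current, scores[current], len(scores)
        current = best


def to_config(idx):
    """Translate an index tuple back into parameter values."""
    return {n: SPACE[n][i] for n, i in zip(NAMES, idx)}


brute_best, brute_score, brute_evals = brute_force()
quick_best, quick_score, quick_evals = hill_climb((0, 0))
print("brute:", to_config(brute_best), "evaluations:", brute_evals)
print("quick:", to_config(quick_best), "evaluations:", quick_evals)
```

On this toy objective, which has a single peak, both searches end at the same configuration and the hill climb evaluates fewer points than the full sweep. Real measurements are noisier and can have several local maxima, where a greedy climb may stop short of the global best. That is consistent with the docs describing quick search as finding most of the maximum objective value in the majority of cases rather than always.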