README.md (2 additions, 0 deletions)
@@ -33,6 +33,8 @@ Triton Model Analyzer is a CLI tool which can help you find a more optimal confi

### Search Modes

- [Optuna Search](docs/config_search.md#optuna-search-mode) **_-ALPHA RELEASE-_** allows you to search for every parameter that can be specified in the model configuration, using a hyperparameter optimization framework. Please see the [Optuna](https://optuna.org/) website if you are interested in specific details on how the algorithm functions.

- [Quick Search](docs/config_search.md#quick-search-mode) will **sparsely** search the [Max Batch Size](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#maximum-batch-size),
  [Dynamic Batching](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#dynamic-batcher), and
  [Instance Group](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups) spaces by utilizing a heuristic hill-climbing algorithm to help you quickly find a more optimal configuration
- **Search type:** Heuristic sweep using a hyperparameter optimization framework to find an optimal configuration
- **Command:** `--run-config-search-mode optuna`

---
@@ -276,6 +280,108 @@ profile_models:

---

## Optuna Search Mode

**-ALPHA RELEASE-**

_This mode has the following limitations:_

- **Ensemble, BLS or concurrent multi-model profiling is not supported**
- **Profiling with request rate is not supported**

This mode uses a hyperparameter optimization framework to search the configuration
space, looking for the maximal objective value within the specified constraints.
Please see the [Optuna](https://optuna.org/) website if you are interested in specific details on how the algorithm functions.

Optuna allows you to search over every parameter that can be specified in the model configuration. Each parameter can be given
either as a min/max range (using the run-config-search options) or as an explicit list of values to test, set in the
`parameters`/`model_config_parameters` fields.

After the Optuna search has found the best config(s), Model Analyzer sweeps the top-N configurations found (N is specified by `--num-configs-per-model`) over the default concurrency range before generating the summary reports.
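
The final sweep described above can be pictured with a small Python sketch. This is only an illustration, not Model Analyzer's implementation: `TrialResult`, `default_concurrency_range`, and `plan_final_sweep` are hypothetical names, and the default concurrency range is assumed here to be powers of two from 1 to 1024.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrialResult:
    """Hypothetical record of one Optuna trial."""
    config_name: str        # e.g. "model_A_config_6"
    objective_score: float  # score relative to the default configuration


def default_concurrency_range(start: int = 1, stop: int = 1024) -> List[int]:
    """Powers of two from start to stop (ASSUMED default sweep range)."""
    values = []
    concurrency = start
    while concurrency <= stop:
        values.append(concurrency)
        concurrency *= 2
    return values


def plan_final_sweep(
    results: List[TrialResult], num_configs_per_model: int
) -> List[Tuple[str, int]]:
    """Pick the top-N configs by score and pair each with every concurrency."""
    top_n = sorted(results, key=lambda r: r.objective_score, reverse=True)
    top_n = top_n[:num_configs_per_model]
    return [(r.config_name, c) for r in top_n for c in default_concurrency_range()]


results = [
    TrialResult("model_A_config_1", 12.5),
    TrialResult("model_A_config_2", -3.0),
    TrialResult("model_A_config_3", 40.0),
]
plan = plan_final_sweep(results, num_configs_per_model=2)
```

With `num_configs_per_model=2`, only `model_A_config_3` and `model_A_config_1` are swept; the configuration that scored below the default is dropped.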

---

_An example model analyzer YAML config that performs an Optuna Search:_

```yaml
model_repository: /path/to/model/repository/

run_config_search_mode: optuna
profile_models:
  - model_A
```

---

Several new configuration options let you tailor the Optuna search to your needs:

- `--min/max-percentage-of-search-space`: sets the minimum/maximum percentage of the search space that Optuna will search
- `--optuna-min/max-trials`: sets the minimum/maximum number of trials Optuna will attempt
- `--optuna-early-exit-threshold`: sets the number of trials without improvement before triggering early exit
- `--use-concurrency-formula`: sets concurrency using a formula (2 \* batch size \* instance group count), rather than sweeping concurrency
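
As a rough illustration of the last option, the concurrency formula can be written as a short Python function. The function name and the clamping to the configured min/max concurrency are assumptions made for this sketch; the text above only states the formula itself.

```python
def concurrency_from_formula(
    batch_size: int,
    instance_count: int,
    min_concurrency: int,
    max_concurrency: int,
) -> int:
    """Concurrency = 2 * batch size * instance group count.

    ASSUMPTION: the result is clamped to [min_concurrency, max_concurrency],
    mirroring the run_config_search_min/max_concurrency options.
    """
    concurrency = 2 * batch_size * instance_count
    return max(min_concurrency, min(concurrency, max_concurrency))


# Batch size 8 with 4 instances, concurrency limited to 32..256.
mid_range = concurrency_from_formula(8, 4, 32, 256)
# Large batch size and instance count exceed the upper bound.
capped = concurrency_from_formula(64, 8, 32, 256)
# Small values fall below the lower bound.
floored = concurrency_from_formula(1, 1, 32, 256)
```

The benefit of a formula is that each trial profiles a single concurrency value instead of a full sweep, which keeps the cost per trial low.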

---

_An example that performs an Optuna Search using these new configuration options:_

```yaml
model_repository: /path/to/model/repository/

run_config_search_mode: optuna
run_config_search_max_instance_count: 8
run_config_search_min_concurrency: 32
run_config_search_max_concurrency: 256

use_concurrency_formula: True
min_percentage_of_search_space: 10
optuna_max_trials: 200
optuna_early_exit_threshold: 15

profile_models:
  model_A:
    model_config_parameters:
      max_batch_size: [1, 4, 8, 32, 64, 128]
      dynamic_batching:
        max_queue_delay_microseconds: [100, 200, 300]
    parameters:
      batch_sizes: 1, 2, 4, 8, 16
```

_The debug output showing how the space will be searched:_

```yaml
Number of configs in search space: 720
batch_sizes: [1, 2, 4, 8, 16] (5)
max_batch_size: [1, 4, 8, 32, 64, 128] (6)
instance_group: 1 to 8 (8)
max_queue_delay_microseconds: [100, 200, 300] (3)

Minimum number of trials: 72 (10% of search space)
Maximum number of trials: 200 (set by max trials)
```
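
The numbers in this debug output can be reproduced by hand: the search space size is the product of the number of values per parameter, and the minimum trial count is 10% of that size. The following Python sketch shows the arithmetic; the helper names are hypothetical, and rounding down is an assumption that does not matter for this example.

```python
import math

# Values per parameter, copied from the debug output above.
search_space = {
    "batch_sizes": [1, 2, 4, 8, 16],
    "max_batch_size": [1, 4, 8, 32, 64, 128],
    "instance_group": list(range(1, 9)),  # 1 to 8
    "max_queue_delay_microseconds": [100, 200, 300],
}


def count_configs(space: dict) -> int:
    """Size of the search space: product of the per-parameter value counts."""
    return math.prod(len(values) for values in space.values())


def trials_from_percentage(total_configs: int, percentage: int) -> int:
    """Number of trials for a percentage of the space (ASSUMED to round down)."""
    return total_configs * percentage // 100


total_configs = count_configs(search_space)              # 5 * 6 * 8 * 3
min_trials = trials_from_percentage(total_configs, 10)   # min_percentage_of_search_space: 10
max_trials = 200                                          # optuna_max_trials: 200
```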

---

### Optuna Search in Detail

When performing an Optuna Search, Model Analyzer's goal is to maximize the configuration's `objective score`. First,
Model Analyzer (MA) profiles the default configuration and assigns it an `objective score` of zero. Every subsequent configuration
is also assigned an `objective score`: positive values indicate that the configuration performs better than the default
configuration, and negative values indicate that it performs worse.
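
One way to picture a score with these properties is a percentage gain relative to the default configuration, combined with the early-exit counter set by `--optuna-early-exit-threshold`. The Python sketch below is a simplified illustration under stated assumptions: it uses a single throughput objective, and both function names are hypothetical. It is not a description of how Model Analyzer actually computes the score.

```python
from typing import List


def objective_score(candidate_throughput: float, default_throughput: float) -> float:
    """Percentage gain over the default configuration (ASSUMED scoring rule).

    The default configuration itself scores 0; better configs are positive,
    worse configs are negative.
    """
    return 100.0 * (candidate_throughput - default_throughput) / default_throughput


def trials_until_early_exit(scores: List[float], threshold: int) -> int:
    """Return how many trials run before `threshold` consecutive trials
    fail to beat the best score seen so far (or all trials if that never happens)."""
    best = 0.0  # score of the default configuration
    trials_without_improvement = 0
    for trial_number, score in enumerate(scores, start=1):
        if score > best:
            best = score
            trials_without_improvement = 0
        else:
            trials_without_improvement += 1
        if trials_without_improvement >= threshold:
            return trial_number
    return len(scores)


same_as_default = objective_score(100.0, 100.0)
better = objective_score(150.0, 100.0)
worse = objective_score(80.0, 100.0)
stopped_at = trials_until_early_exit([5.0, 3.0, 4.0, 2.0, 9.0], threshold=3)
```

In the last call, the search stops after the fourth trial, so the fifth (and best) score is never reached; this is the trade-off a low early-exit threshold makes.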

_Here is an example debug output:_

```yaml
Trial 7 of 200:
Creating model config: model_A_config_6
Setting dynamic_batching to {'max_queue_delay_microseconds': 200}
Setting instance_group to [{'count': 4, 'kind': 'KIND_GPU'}]