Commit f039395

Add BLS support to main (#662)
* Add --bls-composing-models config option (#628)
* Added --bls-models to CLI
* Adding unit test
* Fixig typo and updating documentation
* Fix test description
* Adding YAML config BLS test
* Changing to bls_composing_models
* submodel -> composing
* Logic to create BLS composing models for default config (#637)
* Initial changes for profiling BLS
* Removing errant results check-in
* Fixing existing unit tests
* Added unit test to protect default config generation
* Create unchanged BLS composing model output directory (#639)
* Created original name/dir for BLS composing models
* Add clarifying comment
* Fixing typo
* Logic for non-default BLS profiling (#643)
* Initial changes to add non-default BLS configs
* Changing comment
* Adding missing check of top level model config
* BLS Summary Reporting (#648)
* Initial changes for BLS summary reporting
* Fixing and adding golden metrics for BLS
* Adding table manager unit testing for BLS
* BLS Detailed Reporting (#650)
* Add support for BLS detailed reporting
* Fixing copyright issue
* Ensemble -> BLS
* Checking for illegal cases with BLS (#651)
* Checking for illegal cases with BLS
* Fix typo, type checking, and unit test
* Refactor composing model arch (#653)
* Refactor of MRC
* Combined ensemble & bls composing models into a single list. All unit tests passing
* Fix type checking error
* Combining bls/ensemble composing models
* Refactored composing model creation
* Refactoring MRC
* Refactoring of get_next_model_run_config
* Partial refactor of report_manager
* Refactored summary sentence
* Refactored report manager
* Add missing newline
* Updates from review
* Added check to ensure composing models are not ensembles (#656)
* Cleaning up directory writing and ensemble model loading
* Fixing extra dimension bug
* Add BLS documentation (#661)
* Redoing BLS doc changes in new branch
* Changes based on Tim's review
* Getting CodeQL clean!
* Fixing more CodeQL issues
* Another codeQL fix
* Removing more assertTrues
* Fixing fall through (codeQL) warning
1 parent a6708ed commit f039395

32 files changed, +14376 -304 lines changed

README.md

Lines changed: 5 additions & 2 deletions
@@ -18,7 +18,7 @@ limitations under the License.
1818

1919
# Triton Model Analyzer
2020

21-
Triton Model Analyzer is a CLI tool which can help you find a more optimal configuration, on a given piece of hardware, for single, multiple, or ensemble models running on a [Triton Inference Server](https://github.com/triton-inference-server/server/). Model Analyzer will also generate reports to help you better understand the trade-offs of the different configurations along with their compute and memory requirements.
21+
Triton Model Analyzer is a CLI tool which can help you find a more optimal configuration, on a given piece of hardware, for single, multiple, ensemble, or BLS models running on a [Triton Inference Server](https://github.com/triton-inference-server/server/). Model Analyzer will also generate reports to help you better understand the trade-offs of the different configurations along with their compute and memory requirements.
2222
<br><br>
2323

2424
# Features
@@ -40,7 +40,10 @@ Triton Model Analyzer is a CLI tool which can help you find a more optimal confi
4040
### Model Types
4141

4242
- [Ensemble Model Search](docs/config_search.md#ensemble-model-search): Model Analyzer can help you find the optimal
43-
settings when profiling a non-BLS ensemble model, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm
43+
settings when profiling an ensemble model, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm
44+
45+
- [BLS Model Search](docs/config_search.md#bls-model-search): Model Analyzer can help you find the optimal
46+
settings when profiling a BLS model, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm
4447

4548
- [Multi-Model Search](docs/config_search.md#multi-model-search-mode): **EARLY ACCESS** - Model Analyzer can help you
4649
find the optimal settings when profiling multiple concurrent models, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm

docs/config.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -89,6 +89,9 @@ model_repository: <string>
8989
# List of the model names to be profiled
9090
profile_models: <comma-delimited-string-list>
9191
92+
# List of composing models for BLS models
93+
bls_composing_models: <comma-delimited-string-list>
94+
9295
# Full path to directory to which to read and write checkpoints and profile data
9396
[ checkpoint_directory: <string> | default: './checkpoints' ]
9497
@@ -252,6 +255,9 @@ The following config options are supported **only by the YAML** config file.
252255
# YAML config section for each model to be profiled
253256
profile_models: <comma-delimited-string-list|list|profile_model>
254257
258+
# List of composing models for BLS models
259+
bls_composing_models: <comma-delimited-string-list>
260+
255261
# List of constraints placed on the config search results
256262
[ constraints: <constraint> ]
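
Reviewer note: the new `bls_composing_models` option is documented as a `<comma-delimited-string-list>`. As a rough illustration of what that value shape means, here is a minimal Python sketch that splits such a string into names. The function name `parse_comma_delimited` and the model names are hypothetical, and trimming whitespace around entries is an assumption; Model Analyzer's own parsing may behave differently.

```python
from typing import List


def parse_comma_delimited(value: str) -> List[str]:
    """Split a comma-delimited string into a list of non-empty names.

    Hypothetical helper for illustration only; not part of Model Analyzer.
    """
    # Assumption: surrounding whitespace is ignored and empty entries dropped.
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical composing-model names for a BLS model
composing = parse_comma_delimited("preprocess, classifier")
print(composing)
```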
257263

docs/config_search.md

Lines changed: 24 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@ limitations under the License.
2323
- [Manual Brute Search](#manual-brute-search)
2424
- [Quick Search Mode](#quick-search-mode)
2525
- [Ensemble Model Search](#ensemble-model-search)
26+
- [BLS Model Search](#bls-model-search)
2627
- [Multi-Model Search Mode](#multi-model-search-mode)
2728

2829
<br>
@@ -36,13 +37,14 @@ Model Analyzer's `profile` subcommand supports multiple modes when searching to
3637
- [Brute Force Search](config_search.md#brute-search-mode)
3738
- **Search type:** Brute-force sweep of the cross product of all possible configurations
3839
- **Default for:**
39-
- Single non-ensemble models
40+
- Single models, which are not ensemble or BLS
4041
- Multiple models being profiled sequentially
4142
- **Command:** `--run-config-search-mode brute`
4243
- [Quick Search](config_search.md#quick-search-mode)
4344
- **Search type:** Heuristic sweep using a hill-climbing algorithm to find an optimal configuration
4445
- **Default for:**
4546
- Single ensemble models
47+
- Single BLS models
4648
- Multiple models being profiled concurrently
4749
- **Command:** `--run-config-search-mode quick`
4850

@@ -54,19 +56,19 @@ Model Analyzer's default search mode depends on the type of model and if you are
5456

5557
- [Sequential (single or multi-model) Search](config_search.md#brute-search-mode)
5658
- **Default Search type:** [Brute Force Search](config_search.md#brute-search-mode)
57-
- **Command:** N/A
5859
- [Concurrent / Multi-model Search](config_search.md#multi-model-search-mode)
5960
- **Default Search type:** [Quick Search](config_search.md#quick-search-mode)
6061
- **Command:** `--run-config-profile-models-concurrently-enable`
6162
- [Ensemble Model Search](config_search.md#ensemble-model-search):
6263
- **Default Search type:** [Quick Search](config_search.md#quick-search-mode)
63-
- **Command:** N/A
64+
- [BLS Model Search](config_search.md#bls-model-search):
65+
- **Default Search type:** [Quick Search](config_search.md#quick-search-mode)
6466

6567
---
6668

6769
## Brute Search Mode
6870

69-
**Default search mode when profiling non-ensemble models sequentially**
71+
**Default search mode when profiling non-ensemble/BLS models sequentially**
7072

7173
Model Analyzer's brute search mode will do a brute-force sweep of the cross product of all possible configurations. <br>
7274
It has two modes:
@@ -225,7 +227,7 @@ manual sweep:
225227

226228
## Quick Search Mode
227229

228-
**Default search mode when profiling ensemble models or multiple models concurrently**
230+
**Default search mode when profiling ensemble models, BLS models, or multiple models concurrently**
229231

230232
This mode uses a hill climbing algorithm to search the configuration space, looking for
231233
the maximal objective value within the specified constraints. In the majority of cases
@@ -278,8 +280,23 @@ _This mode has the following limitations:_
278280
- Can only be run in `quick` search mode
279281
- Only supports up to four composing models
280282
- Does not support `cpu_only` option for composing models
283+
- Composing models cannot be ensemble or BLS models
284+
285+
Ensemble models can be optimized using the Quick Search mode's hill climbing algorithm to search the composing models' configuration spaces in parallel, looking for the maximal objective value within the specified constraints. Model Analyzer has observed positive outcomes towards finding the maximum objective value; with runtimes under one hour (compared to the days it would take a brute force run to complete) for ensembles that contain up to four composing models.
286+
287+
After Model Analyzer has found the best config(s), it will then sweep the top-N configurations found (specified by `--num-configs-per-model`) over the concurrency range before generation of the summary reports.
288+
289+
---
290+
291+
## BLS Model Search
292+
293+
_This mode has the following limitations:_
294+
295+
- Can only be run in `quick` search mode
296+
- Only supports up to four composing models
297+
- Composing models cannot be ensemble or BLS models
281298

282-
Ensemble models can be optimized using the Quick Search mode's hill climbing algorithm to search the ensemble sub-model's configuration spaces in parallel, looking for the maximal objective value within the specified constraints. Model Analyzer has observed positive outcomes towards finding the maximum objective value; with runtimes under one hour (compared to the days it would take a brute force run to complete) for ensembles with up to four composing models.
299+
BLS models can be optimized using the Quick Search mode's hill climbing algorithm to search the BLS composing models' configuration spaces, as well as the BLS model's instance count, in parallel, looking for the maximal objective value within the specified constraints. Model Analyzer has observed positive outcomes towards finding the maximum objective value; with runtimes under one hour (compared to the days it would take a brute force run to complete) for BLS models that contain up to four composing models.
283300

284301
After Model Analyzer has found the best config(s), it will then sweep the top-N configurations found (specified by `--num-configs-per-model`) over the concurrency range before generation of the summary reports.
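
Reviewer note: the limitations listed for BLS Model Search (quick search mode only, at most four composing models, no ensemble composing models) can be sketched as a pre-flight check. This is an illustration under stated assumptions, not the checks added in #651 or #656: `check_bls_limits` and `MAX_COMPOSING_MODELS` are hypothetical names, and model configs are represented as plain dicts. Only the ensemble case is detected, using the same `ensemble_scheduling` key that `is_ensemble()` tests in this commit; detecting a BLS composing model is not attempted here.

```python
from typing import Dict, List

# Limit taken from the documentation text; the constant name is hypothetical.
MAX_COMPOSING_MODELS = 4


def check_bls_limits(search_mode: str,
                     composing_model_configs: List[Dict]) -> List[str]:
    """Return a list of violations of the documented BLS limits (empty if none)."""
    problems: List[str] = []

    if search_mode != "quick":
        problems.append("BLS models can only be profiled in quick search mode")

    if len(composing_model_configs) > MAX_COMPOSING_MODELS:
        problems.append("only up to four composing models are supported")

    for config in composing_model_configs:
        # Same key that is_ensemble() checks in model_profile_spec.py
        if "ensemble_scheduling" in config:
            problems.append(
                f"composing model '{config.get('name')}' is an ensemble")

    return problems


# Hypothetical composing models: one plain model and one ensemble
configs = [{"name": "preprocess"},
           {"name": "pipeline", "ensemble_scheduling": {}}]
for problem in check_bls_limits("brute", configs):
    print(problem)
```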
285302

@@ -318,7 +335,7 @@ profile_models:
318335

319336
### **Model Weighting**
320337

321-
In additon to setting a model's objectives or constraints, in multi-model search mode, you have the ability to set a model's weighting. By default each model is set for equal weighting (value of 1), but in the YAML you can specify `weighting: <int>` which will bias that model's objectives when evaluating for an optimal result.
338+
In addition to setting a model's objectives or constraints, in multi-model search mode, you have the ability to set a model's weighting. By default each model is set for equal weighting (value of 1), but in the YAML you can specify `weighting: <int>` which will bias that model's objectives when evaluating for an optimal result.
322339

323340
---
324341

model_analyzer/config/generate/base_model_config_generator.py

Lines changed: 16 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@
2222
from model_analyzer.constants import LOGGER_NAME
2323
from model_analyzer.triton.model.model_config import ModelConfig
2424
from .model_profile_spec import ModelProfileSpec
25+
from copy import deepcopy
2526
import abc
2627
import logging
2728

@@ -233,15 +234,6 @@ def make_ensemble_model_config(
233234
model_config_dict['name'] = variant_name
234235
model_config = ModelConfig.create_from_dictionary(model_config_dict)
235236

236-
for composing_model_config in ensemble_composing_model_configs:
237-
variant_name = composing_model_config.get_field("name")
238-
composing_model_name = BaseModelConfigGenerator.extract_model_name_from_variant_name(
239-
variant_name)
240-
241-
model_config.set_composing_model_variant_name(
242-
composing_model_name=composing_model_name,
243-
variant_name=variant_name)
244-
245237
return model_config
246238

247239
@staticmethod
@@ -283,6 +275,21 @@ def extract_model_name_from_variant_name(variant_name: str) -> str:
283275
"""
284276
return variant_name[:variant_name.find("_config_")]
285277

278+
@staticmethod
279+
def create_original_config_from_variant(
280+
variant_config: ModelConfig) -> ModelConfig:
281+
"""
282+
Removes 'config_#/default' from the variant config and returns
283+
a new model config
284+
"""
285+
original_config = deepcopy(variant_config)
286+
287+
original_config.set_model_name(
288+
BaseModelConfigGenerator.extract_model_name_from_variant_name(
289+
variant_config.get_field("name")))
290+
291+
return original_config
292+
286293
@staticmethod
287294
def _apply_value_to_dict(key: Any, value: Any, dict_in: Dict) -> None:
288295
"""
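
Reviewer note: to make the behavior of the new `create_original_config_from_variant` easier to follow, here is a small self-contained sketch. `FakeModelConfig` is a hypothetical stand-in for `ModelConfig` that exposes only the two methods the diff calls (`get_field` and `set_model_name`); the real class is backed by a protobuf and may differ. The name-extraction slicing mirrors the context lines shown in the diff.

```python
from copy import deepcopy


class FakeModelConfig:
    """Hypothetical stand-in for ModelConfig; stores fields in a dict."""

    def __init__(self, fields):
        self._fields = fields

    def get_field(self, name):
        return self._fields[name]

    def set_model_name(self, name):
        self._fields["name"] = name


def extract_model_name_from_variant_name(variant_name: str) -> str:
    # Same slicing as in the diff: keep everything before "_config_"
    return variant_name[:variant_name.find("_config_")]


def create_original_config_from_variant(variant_config):
    # Copy first so that renaming does not modify the variant itself
    original_config = deepcopy(variant_config)
    original_config.set_model_name(
        extract_model_name_from_variant_name(
            variant_config.get_field("name")))
    return original_config


variant = FakeModelConfig({"name": "my_model_config_3", "max_batch_size": 8})
original = create_original_config_from_variant(variant)
print(original.get_field("name"))  # my_model
print(variant.get_field("name"))   # my_model_config_3
```

Because of the `deepcopy`, the variant keeps its `_config_3` suffix while the returned copy carries the original model name; all other fields are carried over unchanged.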

model_analyzer/config/generate/model_profile_spec.py

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,3 +52,7 @@ def supports_dynamic_batching(self) -> bool:
5252
if "sequence_batching" in self._default_model_config:
5353
supports_dynamic_batching = False
5454
return supports_dynamic_batching
55+
56+
def is_ensemble(self) -> bool:
57+
""" Returns true if the model is an ensemble """
58+
return ("ensemble_scheduling" in self._default_model_config)

model_analyzer/config/generate/quick_plus_concurrency_sweep_run_config_generator.py

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -47,8 +47,8 @@ class QuickPlusConcurrencySweepRunConfigGenerator(ConfigGeneratorInterface):
4747
def __init__(self, search_config: SearchConfig,
4848
config: ConfigCommandProfile, gpus: List[GPUDevice],
4949
models: List[ModelProfileSpec],
50-
ensemble_composing_models: Dict[str, List[ModelProfileSpec]],
51-
client: TritonClient, result_manager: ResultManager,
50+
composing_models: List[ModelProfileSpec], client: TritonClient,
51+
result_manager: ResultManager,
5252
model_variant_name_manager: ModelVariantNameManager):
5353
"""
5454
Parameters
@@ -60,8 +60,8 @@ def __init__(self, search_config: SearchConfig,
6060
gpus: List of GPUDevices
6161
models: List of ModelProfileSpec
6262
List of models to profile
63-
ensemble_composing_models: Dict of List of ModelProfileSpec
64-
Dict indexed by model name of list of composing models to profile
63+
composing_models: List of ModelProfileSpec
64+
List of composing models that exist inside of the supplied models
6565
client: TritonClient
6666
result_manager: ResultManager
6767
The object that handles storing and sorting the results from the perf analyzer
@@ -74,7 +74,7 @@ def __init__(self, search_config: SearchConfig,
7474
self._config = config
7575
self._gpus = gpus
7676
self._models = models
77-
self._ensemble_composing_models = ensemble_composing_models
77+
self._composing_models = composing_models
7878
self._client = client
7979
self._result_manager = result_manager
8080
self._model_variant_name_manager = model_variant_name_manager
@@ -118,7 +118,7 @@ def _create_quick_run_config_generator(self) -> QuickRunConfigGenerator:
118118
config=self._config,
119119
gpus=self._gpus,
120120
models=self._models,
121-
ensemble_composing_models=self._ensemble_composing_models,
121+
composing_models=self._composing_models,
122122
client=self._client,
123123
model_variant_name_manager=self._model_variant_name_manager)
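
Reviewer note: this hunk changes the constructor argument from a `Dict[str, List[ModelProfileSpec]]` keyed by model name to a single flat `List[ModelProfileSpec]`. The sketch below only illustrates that change of shape; it is not code from this commit. `flatten_composing_models` is a hypothetical helper, and plain strings stand in for `ModelProfileSpec` objects.

```python
from typing import Dict, List


def flatten_composing_models(by_parent: Dict[str, List[str]]) -> List[str]:
    """Collapse {parent model: [composing models]} into one flat list.

    Hypothetical helper; strings stand in for ModelProfileSpec objects.
    """
    flat: List[str] = []
    for composing in by_parent.values():
        flat.extend(composing)
    return flat


# Old shape: composing models grouped under the model that contains them
old_style = {"my_ensemble": ["preprocess", "classifier"]}

# New shape: one list of composing models for the supplied models
new_style = flatten_composing_models(old_style)
print(new_style)
```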
124124
