Commit 3d242a9
Ashwin Ramesh authored and dzier committed
Updated Docs for 21.06 (#165)
1 parent 9598e4d commit 3d242a9

9 files changed: 47 additions and 5 deletions

docs/cli.md

Lines changed: 22 additions & 3 deletions
@@ -33,9 +33,28 @@ in which it is being run. Currently model analyzer supports 2 modes.

 ### Online Mode

-This is the default mode. When in this mode, Model Analyzer will operate to find the optimal model
-configuration for an online inference scenario. In this scenario, Triton server will receive requests
-on demand with an expectation that latency will be minimized.
+This is the default mode. When in this mode, Model Analyzer will operate to find
+the optimal model configuration for an online inference scenario. In this
+scenario, Triton server will receive requests on demand with an expectation that
+latency will be minimized.
+
+By default in online mode, the best model configuration will be the one that
+minimizes latency. If a latency budget is specified the best model configuration
+will be the one with the highest throughput in the given budget. The analyze and
+report subcommands also generate summaries specific to online inference. See the
+example [online summary](../examples/online_summary.pdf) and [detailed
+report](../examples/online_summary.pdf).
+
+### Offline Mode
+
+The offline mode `--mode=offline` tells Model Analyzer to set its defaults to
+find a model that maximizes throughput. In the offline scenario, Triton
+processes requests offline and therefore inference throughput is the priority. A
+minimum throughput can be specified using `--min-throughput` to ignore any
+configuration that does not exceed a minimum number of inferences per second.
+Both the summary and the detailed report will contain alternative graphs in the
+offline mode. See the [offline summary](../examples/offline_summary.pdf) and
+[detailed report](../examples/offline_detailed_report.pdf) examples.

 ## Model Analyzer Subcommands

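The selection rule described in the `docs/cli.md` change above can be sketched in Python. This is only an illustration of the documented behavior, not Model Analyzer's actual implementation: `Measurement`, `pick_best`, and their fields are hypothetical names. The sketch assumes a minimum throughput is a strict lower bound, following the "does not exceed" wording, and applies it in any mode when given.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Measurement:
    """Hypothetical record of one profiled model configuration."""
    name: str
    latency_ms: float
    throughput: float  # inferences per second


def pick_best(
    measurements: Iterable[Measurement],
    mode: str = "online",
    latency_budget: Optional[float] = None,
    min_throughput: Optional[float] = None,
) -> Optional[Measurement]:
    """Return the best configuration under the documented rules, or None."""
    candidates = list(measurements)

    # Drop configurations that violate the latency budget (in ms), if one is set.
    if latency_budget is not None:
        candidates = [m for m in candidates if m.latency_ms <= latency_budget]

    # Assumption: ignore configurations that do not exceed the minimum throughput.
    if min_throughput is not None:
        candidates = [m for m in candidates if m.throughput > min_throughput]

    if not candidates:
        return None

    # Offline mode, or online mode with a latency budget: maximize throughput.
    if mode == "offline" or latency_budget is not None:
        return max(candidates, key=lambda m: m.throughput)

    # Default online mode: minimize latency.
    return min(candidates, key=lambda m: m.latency_ms)


configs = [
    Measurement("config_a", latency_ms=5.0, throughput=100.0),
    Measurement("config_b", latency_ms=12.0, throughput=300.0),
    Measurement("config_c", latency_ms=30.0, throughput=500.0),
]
```

With these made-up measurements, `pick_best(configs)` picks the lowest-latency entry, `pick_best(configs, latency_budget=15)` picks the highest-throughput entry within 15 ms, and `pick_best(configs, mode="offline")` picks the highest-throughput entry overall.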
docs/config.md

Lines changed: 23 additions & 0 deletions
@@ -220,6 +220,9 @@ analysis_models: <comma-delimited-string-list>
 # Shorthand that allows a user to specify a max latency constraint in ms
 [ latency_budget: <int>]

+# Shorthand that allows a user to specify a min throughput constraint
+[ min_throughput: <int>]
+
 # Specify path to config yaml file
 [ config_file: <string> ]
 ```
@@ -563,6 +566,26 @@ and two instance group combinations). If both `model_config_parameters` and
 `parameters` keys are specified, the list of sweep configurations will be the
 cartesian product of both of the lists.

+### `<cpu_only>`
+
+This flag tells the model analyzer that, whether performing a search during profiling
+or generating reports, this model should use CPU instances only. In order to run a model on CPU only you must provide a value of `true` for this flag.
+
+#### Example
+
+```yaml
+model_repository: /path/to/model/repository/
+profile_models:
+  model_1:
+    cpu_only: true
+  model_2:
+    perf_analyzer_flags:
+      percentile: 95
+      latency_report_file: /path/to/latency/report/file
+```
+The above config tells model analyzer to profile `model_1` on CPU only,
+but profile `model_2` using GPU.
+
 ### `<perf-analyzer-flags>`

 This field allows fine-grained control over the behavior of the `perf_analyzer`

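As a rough illustration of how the `cpu_only` flag in the `docs/config.md` change above partitions models, the sketch below walks a dictionary that mirrors the YAML example after parsing. The helper `split_by_device` is hypothetical and is not part of Model Analyzer. It assumes a model without the flag is not restricted to CPU, since the text says a value of `true` must be provided.

```python
from typing import Dict, List, Optional, Tuple


def split_by_device(
    profile_models: Dict[str, Optional[dict]]
) -> Tuple[List[str], List[str]]:
    """Split model names into (cpu_only, gpu) lists based on the cpu_only flag."""
    cpu_only: List[str] = []
    gpu: List[str] = []
    for name, options in profile_models.items():
        options = options or {}  # a model may be listed with no options at all
        # Assumption: a missing flag means the model is not restricted to CPU.
        if options.get("cpu_only", False):
            cpu_only.append(name)
        else:
            gpu.append(name)
    return cpu_only, gpu


# Dictionary form of the YAML example from the diff above.
config = {
    "model_repository": "/path/to/model/repository/",
    "profile_models": {
        "model_1": {"cpu_only": True},
        "model_2": {
            "perf_analyzer_flags": {
                "percentile": 95,
                "latency_report_file": "/path/to/latency/report/file",
            }
        },
    },
}

cpu_models, gpu_models = split_by_device(config["profile_models"])
```

Here `cpu_models` contains `model_1` and `gpu_models` contains `model_2`, matching the explanation given in the documentation change.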
docs/report.md

Lines changed: 2 additions & 2 deletions
@@ -32,7 +32,7 @@ $ model-analyzer analyze --analysis-models <list of model names> --checkpoint-di

 The export directory will, by default, contain 3 subdirectories. The summary
 report for a model will be located in `[export-path]/reports/summaries/<model
-name>`. The report will look like the one shown [*here*](../examples/summary.pdf).
+name>`. The report will look like the one shown [*here*](../examples/online_summary.pdf).

 To disable summary report generation use `--summarize=false` or set the
 `summarize` yaml option to `false`.
@@ -54,7 +54,7 @@ particular model config with which the measurements are obtained, as well as
 extra configurable plots. The user can define the plots they would like to see
 in the detailed report using the YAML config file (See [**Configuring Model
 Analyzer**](./config.md) section for more details) The detailed report will
-look like the one shown [*here*](../examples/detailed_report.pdf).
+look like the one shown [*here*](../examples/online_detailed_report.pdf).


 See the [**quick start**](./quick_start.md#plots) and [**configuring model

examples/detailed_report.pdf

-131 Bytes
Binary file not shown.

examples/offline_detailed_report.pdf

131 Bytes
Binary file not shown.

examples/offline_summary.pdf

130 Bytes
Binary file not shown.

examples/online_detailed_report.pdf

131 Bytes
Binary file not shown.

examples/online_summary.pdf

131 Bytes
Binary file not shown.

examples/summary.pdf

-131 Bytes
Binary file not shown.
