@@ -34,27 +34,25 @@ in which it is being run. Currently model analyzer supports 2 modes.
 ### Online Mode
 
 This is the default mode. When in this mode, Model Analyzer will operate to find
-the optimal model configuration for an online inference scenario. In this
-scenario, Triton server will receive requests on demand with an expectation that
-latency will be minimized.
+the optimal model configuration for an online inference scenario. By default in
+online mode, the best model configuration will be the one that maximizes
+throughput. If a latency budget is specified to the [analyze subcommand](subcommand-analyze) via
+`--latency-budget`, then the best model configuration will be the one with the highest throughput in the given budget.
 
-By default in online mode, the best model configuration will be the one that
-minimizes latency. If a latency budget is specified the best model configuration
-will be the one with the highest throughput in the given budget. The analyze and
-report subcommands also generate summaries specific to online inference. See the
-example [online summary](../examples/online_summary.pdf) and [detailed
-report](../examples/online_summary.pdf).
+In online mode the analyze and report subcommands will generate summaries specific to online inference.
+See the example [online summary](../examples/online_summary.pdf) and [online detailed report](../examples/online_summary.pdf).
 
 ### Offline Mode
 
-The offline mode `--mode=offline` tells Model Analyzer to set its defaults to
-find a model that maximizes throughput. In the offline scenario, Triton
-processes requests offline and therefore inference throughput is the priority. A
-minimum throughput can be specified using `--min-throughput` to ignore any
-configuration that does not exceed a minimum number of inferences per second.
-Both the summary and the detailed report will contain alternative graphs in the
-offline mode. See the [offline summary](../examples/offline_summary.pdf) and
-[detailed report](../examples/offline_detailed_report.pdf) examples.
+The offline mode `--mode=offline` tells Model Analyzer to operate to find the
+optimal model configuration for an offline inference scenario. By default
+in offline mode, the best model configuration will be the one that maximizes throughput.
+A minimum throughput can be specified to the [analyze subcommand](subcommand-analyze)
+via `--min-throughput` to ignore any configuration that does not exceed a minimum number of inferences per second.
+
+In offline mode the analyze and report subcommands will generate reports specific to offline inference.
+See the example [offline summary](../examples/offline_summary.pdf) and
+[offline detailed report](../examples/offline_detailed_report.pdf) examples.
 
 ## Model Analyzer Subcommands
 