
Commit fe04987

Update README for 23.06 (#719)
1 parent 815ab67 commit fe04987

1 file changed: 85 additions, 3 deletions

README.md

Lines changed: 85 additions & 3 deletions
@@ -17,6 +17,88 @@ limitations under the License.
[![License](https://img.shields.io/badge/License-Apache_2.0-lightgrey.svg)](https://opensource.org/licenses/Apache-2.0)

# Triton Model Analyzer

**Note** <br>
You are currently on the r23.06 branch which tracks stabilization towards the next release.<br>
This branch is not usable during stabilization.

Triton Model Analyzer is a CLI tool which can help you find a more optimal configuration, on a given piece of hardware, for single, multiple, ensemble, or BLS models running on a [Triton Inference Server](https://github.com/triton-inference-server/server/). Model Analyzer will also generate reports to help you better understand the trade-offs of the different configurations along with their compute and memory requirements.
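As a sketch of how a profiling run is typically driven, Model Analyzer can read its options from a YAML config file; the paths and model name below are placeholders, not values from this README:

```yaml
# Hypothetical config.yaml for a Model Analyzer profiling run.
# Paths and the model name are placeholders -- substitute your own.
model_repository: /path/to/model/repository
profile_models:
  - my_model
```

A config like this would typically be passed to `model-analyzer profile` via its config-file flag; see [Configuring Model Analyzer](docs/config.md) for the authoritative list of options.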
<br><br>

# Features

### Search Modes

- [Quick Search](docs/config_search.md#quick-search-mode) will **sparsely** search the [Max Batch Size](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#maximum-batch-size),
  [Dynamic Batching](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#dynamic-batcher), and
  [Instance Group](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups) spaces by utilizing a heuristic hill-climbing algorithm to help you quickly find a more optimal configuration

- [Automatic Brute Search](docs/config_search.md#automatic-brute-search) will **exhaustively** search the
  [Max Batch Size](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#maximum-batch-size),
  [Dynamic Batching](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#dynamic-batcher), and
  [Instance Group](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups)
  parameters of your model configuration

- [Manual Brute Search](docs/config_search.md#manual-brute-search) allows you to create manual sweeps for every parameter that can be specified in the model configuration
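
The search mode is selected in the profile configuration; a hedged sketch, with the exact keys documented in [Model Config Search](docs/config_search.md) and the model name a placeholder:

```yaml
# Quick (hill-climbing) search -- sparse:
run_config_search_mode: quick

# Automatic brute search -- exhaustive:
# run_config_search_mode: brute

# Manual brute search: disable the automatic sweep and list the
# parameter values to try yourself:
# run_config_search_disable: true
# profile_models:
#   my_model:
#     model_config_parameters:
#       max_batch_size: [1, 4, 16]
```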

### Model Types

- [Ensemble Model Search](docs/config_search.md#ensemble-model-search): Model Analyzer can help you find the optimal
  settings when profiling an ensemble model, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm

- [BLS Model Search](docs/config_search.md#bls-model-search): Model Analyzer can help you find the optimal
  settings when profiling a BLS model, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm

- [Multi-Model Search](docs/config_search.md#multi-model-search-mode): **EARLY ACCESS** - Model Analyzer can help you
  find the optimal settings when profiling multiple concurrent models, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm

### Other Features

- [Detailed and summary reports](docs/report.md): Model Analyzer is able to generate
  summarized and detailed reports that can help you better understand the trade-offs
  between different model configurations that can be used for your model.

- [QoS Constraints](docs/config.md#constraint): Constraints can help you
  filter out the Model Analyzer results based on your QoS requirements. For
  example, you can specify a latency budget to filter out model configurations
  that do not satisfy the specified latency threshold.
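
A latency budget of this kind is expressed as a constraint in the config; a sketch assuming the `perf_latency_p99` metric key described in [docs/config.md#constraint](docs/config.md#constraint):

```yaml
# Keep only configurations whose p99 latency is at or under 100 ms
# (metric key and units per docs/config.md#constraint).
constraints:
  perf_latency_p99:
    max: 100
```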
<br><br>

# Examples and Tutorials

### **Single Model**

See the [Single Model Quick Start](docs/quick_start.md) for a guide on how to use Model Analyzer to profile, analyze and report on a simple PyTorch model.

### **Multi Model**

See the [Multi-model Quick Start](docs/mm_quick_start.md) for a guide on how to use Model Analyzer to profile, analyze and report on two models running concurrently on the same GPU.
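
In sketch form, profiling two models concurrently amounts to listing both under `profile_models` and enabling concurrent profiling; the model names are placeholders and the flag name is taken from the Model Analyzer docs rather than this README:

```yaml
# Hypothetical two-model profile; names are placeholders.
profile_models:
  - model_a
  - model_b
run_config_profile_models_concurrently_enable: true
run_config_search_mode: quick
```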
<br><br>

# Documentation

- [Installation](docs/install.md)
- [Model Analyzer CLI](docs/cli.md)
- [Launch Modes](docs/launch_modes.md)
- [Configuring Model Analyzer](docs/config.md)
- [Model Analyzer Metrics](docs/metrics.md)
- [Model Config Search](docs/config_search.md)
- [Checkpointing](docs/checkpoints.md)
- [Model Analyzer Reports](docs/report.md)
- [Deployment with Kubernetes](docs/kubernetes_deploy.md)
<br><br>

# Reporting problems, asking questions

We appreciate any feedback, questions or bug reports regarding this
project. When help with code is needed, follow the process outlined in
the Stack Overflow (https://stackoverflow.com/help/mcve)
document. Ensure posted examples are:

- minimal – use as little code as possible that still produces the
  same problem

- complete – provide all parts needed to reproduce the problem. Check
  if you can strip external dependencies and still show the problem. The
  less time we spend on reproducing problems, the more time we have to
  fix them

- verifiable – test the code you're about to provide to make sure it
  reproduces the problem. Remove all other problems that are not
  related to your request/question.
