Commit 815ab67

Update README and versions for 23.06 branch (#708)
* Update README and versions for 23.06 branch
* Update README.md for 23.06
1 parent 44707fc commit 815ab67

File tree

9 files changed: +16 −104 lines

Dockerfile

Lines changed: 4 additions & 4 deletions
@@ -12,11 +12,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.05-py3
-ARG TRITONSDK_BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.05-py3-sdk
+ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.06-py3
+ARG TRITONSDK_BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.06-py3-sdk
 
-ARG MODEL_ANALYZER_VERSION=1.29.0dev
-ARG MODEL_ANALYZER_CONTAINER_VERSION=23.06dev
+ARG MODEL_ANALYZER_VERSION=1.29.0
+ARG MODEL_ANALYZER_CONTAINER_VERSION=23.06
 
 FROM ${TRITONSDK_BASE_IMAGE} as sdk
 
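These build arguments select the Triton base images and stamp the release version, so they can be overridden at build time. A minimal sketch (the image tags come from this diff; the build context and the local image tag `model-analyzer:23.06` are illustrative, not part of the commit):

```
# Build the Model Analyzer image from the repository root, overriding the
# default build arguments; "model-analyzer:23.06" is a hypothetical local tag.
docker build \
    --build-arg BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.06-py3 \
    --build-arg TRITONSDK_BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.06-py3-sdk \
    -t model-analyzer:23.06 .
```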

README.md

Lines changed: 3 additions & 91 deletions
@@ -17,94 +17,6 @@ limitations under the License.
 [![License](https://img.shields.io/badge/License-Apache_2.0-lightgrey.svg)](https://opensource.org/licenses/Apache-2.0)
 
 # Triton Model Analyzer
-
-**LATEST RELEASE: You are currently on the main branch which tracks
-under-development progress towards the next release. The latest
-release of the Triton Model Analyzer is 1.28.0 and is available on
-branch
-[r23.05](https://github.com/triton-inference-server/model_analyzer/tree/r23.05).**
-
-Triton Model Analyzer is a CLI tool which can help you find a more optimal configuration, on a given piece of hardware, for single, multiple, ensemble, or BLS models running on a [Triton Inference Server](https://github.com/triton-inference-server/server/). Model Analyzer will also generate reports to help you better understand the trade-offs of the different configurations along with their compute and memory requirements.
-<br><br>
-
-# Features
-
-### Search Modes
-
-- [Quick Search](docs/config_search.md#quick-search-mode) will **sparsely** search the [Max Batch Size](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#maximum-batch-size),
-  [Dynamic Batching](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#dynamic-batcher), and
-  [Instance Group](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups) spaces by utilizing a heuristic hill-climbing algorithm to help you quickly find a more optimal configuration
-
-- [Automatic Brute Search](docs/config_search.md#automatic-brute-search) will **exhaustively** search the
-  [Max Batch Size](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#maximum-batch-size),
-  [Dynamic Batching](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#dynamic-batcher), and
-  [Instance Group](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups)
-  parameters of your model configuration
-
-- [Manual Brute Search](docs/config_search.md#manual-brute-search) allows you to create manual sweeps for every parameter that can be specified in the model configuration
-
-### Model Types
-
-- [Ensemble Model Search](docs/config_search.md#ensemble-model-search): Model Analyzer can help you find the optimal
-  settings when profiling an ensemble model, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm
-
-- [BLS Model Search](docs/config_search.md#bls-model-search): Model Analyzer can help you find the optimal
-  settings when profiling a BLS model, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm
-
-- [Multi-Model Search](docs/config_search.md#multi-model-search-mode): **EARLY ACCESS** - Model Analyzer can help you
-  find the optimal settings when profiling multiple concurrent models, utilizing the [Quick Search](docs/config_search.md#quick-search-mode) algorithm
-
-### Other Features
-
-- [Detailed and summary reports](docs/report.md): Model Analyzer is able to generate
-  summarized and detailed reports that can help you better understand the trade-offs
-  between different model configurations that can be used for your model.
-
-- [QoS Constraints](docs/config.md#constraint): Constraints can help you
-  filter out the Model Analyzer results based on your QoS requirements. For
-  example, you can specify a latency budget to filter out model configurations
-  that do not satisfy the specified latency threshold.
-<br><br>
-
-# Examples and Tutorials
-
-### **Single Model**
-
-See the [Single Model Quick Start](docs/quick_start.md) for a guide on how to use Model Analyzer to profile, analyze and report on a simple PyTorch model.
-
-### **Multi Model**
-
-See the [Multi-model Quick Start](docs/mm_quick_start.md) for a guide on how to use Model Analyzer to profile, analyze and report on two models running concurrently on the same GPU.
-<br><br>
-
-# Documentation
-
-- [Installation](docs/install.md)
-- [Model Analyzer CLI](docs/cli.md)
-- [Launch Modes](docs/launch_modes.md)
-- [Configuring Model Analyzer](docs/config.md)
-- [Model Analyzer Metrics](docs/metrics.md)
-- [Model Config Search](docs/config_search.md)
-- [Checkpointing](docs/checkpoints.md)
-- [Model Analyzer Reports](docs/report.md)
-- [Deployment with Kubernetes](docs/kubernetes_deploy.md)
-<br><br>
-
-# Reporting problems, asking questions
-
-We appreciate any feedback, questions or bug reporting regarding this
-project. When help with code is needed, follow the process outlined in
-the Stack Overflow (https://stackoverflow.com/help/mcve)
-document. Ensure posted examples are:
-
-- minimal – use as little code as possible that still produces the
-  same problem
-
-- complete – provide all parts needed to reproduce the problem. Check
-  if you can strip external dependency and still show the problem. The
-  less time we spend on reproducing problems the more time we have to
-  fix it
-
-- verifiable – test the code you're about to provide to make sure it
-  reproduces the problem. Remove all other problems that are not
-  related to your request/question.
+**Note** <br>
+You are currently on the r23.06 branch which tracks stabilization towards the next release.<br>
+This branch is not usable during stabilization.
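Release work in this project happens on versioned branches like r23.05 and r23.06. As a sketch of how to grab a specific release branch (branch names from this diff and the removed README text; note the warning above that r23.06 is not usable until stabilization completes):

```
# Clone a specific release branch; r23.06 per this commit, or r23.05 for
# the previous release while r23.06 is still stabilizing.
git clone -b r23.06 https://github.com/triton-inference-server/model_analyzer.git
```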

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-1.29.0dev
+1.29.0
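Dropping the `dev` suffix marks 1.29.0 as a release version. Assuming the package is published to PyPI under the name `triton-model-analyzer` (an assumption, not shown in this diff), the matching release could be installed with:

```
# Install the Model Analyzer release matching the VERSION file;
# the PyPI package name "triton-model-analyzer" is assumed, not shown in this diff.
pip install triton-model-analyzer==1.29.0
```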

docs/config.md

Lines changed: 1 addition & 1 deletion
@@ -153,7 +153,7 @@ cpu_only_composing_models: <comma-delimited-string-list>
 [ reload_model_disable: <bool> | default: false]
 
 # Triton Docker image tag used when launching using Docker mode
-[ triton_docker_image: <string> | default: nvcr.io/nvidia/tritonserver:23.05-py3 ]
+[ triton_docker_image: <string> | default: nvcr.io/nvidia/tritonserver:23.06-py3 ]
 
 # Triton Server HTTP endpoint url used by Model Analyzer client"
 [ triton_http_endpoint: <string> | default: localhost:8000 ]
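Because `triton_docker_image` is an ordinary config option, it can be pinned in a user config file instead of relying on this default. A minimal sketch (the option name and default come from the diff; the repository path and `add_sub` model name follow the quick-start examples and are illustrative):

```
# Hypothetical config.yaml pinning the Triton image used in Docker launch mode.
cat > config.yaml <<'EOF'
model_repository: /path/to/model/repository
triton_docker_image: nvcr.io/nvidia/tritonserver:23.06-py3
profile_models:
  - add_sub
EOF

model-analyzer profile --config-file config.yaml
```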

docs/kubernetes_deploy.md

Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ images:
 
   triton:
     image: nvcr.io/nvidia/tritonserver
-    tag: 23.05-py3
+    tag: 23.06-py3
 ```
 
 The model analyzer executable uses the config file defined in `helm-chart/templates/config-map.yaml`. This config can be modified to supply arguments to model analyzer. Only the content under the `config.yaml` section of the file should be modified.
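The same image tag can also be overridden at install time rather than by editing the values file. A hedged sketch (the `images.triton.tag` path mirrors the values file above; the release name and chart path are assumed, not part of this commit):

```
# Install the helm chart with the Triton image tag overridden on the command line;
# "model-analyzer" and ./helm-chart are assumed names for the release and chart path.
helm install model-analyzer ./helm-chart --set images.triton.tag=23.06-py3
```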

docs/mm_quick_start.md

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ git pull origin main
 **1. Pull the SDK container:**
 
 ```
-docker pull nvcr.io/nvidia/tritonserver:23.05-py3-sdk
+docker pull nvcr.io/nvidia/tritonserver:23.06-py3-sdk
 ```
 
 **2. Run the SDK container**
@@ -59,7 +59,7 @@ docker run -it --gpus all \
       -v /var/run/docker.sock:/var/run/docker.sock \
       -v $(pwd)/examples/quick-start:$(pwd)/examples/quick-start \
      -v <path-to-output-model-repo>:<path-to-output-model-repo> \
-      --net=host nvcr.io/nvidia/tritonserver:23.05-py3-sdk
+      --net=host nvcr.io/nvidia/tritonserver:23.06-py3-sdk
 ```
 
 **Replacing** `<path-to-output-model-repo>` with the
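For concreteness, a minimal sketch of step 2 with the placeholder filled in; `/tmp/output_models` is a hypothetical output model repository path, not something mandated by the guide:

```
# Hypothetical output model repository path.
export OUTPUT_REPO=/tmp/output_models
mkdir -p "$OUTPUT_REPO"

docker run -it --gpus all \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v "$(pwd)/examples/quick-start:$(pwd)/examples/quick-start" \
    -v "$OUTPUT_REPO:$OUTPUT_REPO" \
    --net=host nvcr.io/nvidia/tritonserver:23.06-py3-sdk
```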

docs/quick_start.md

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ git pull origin main
 **1. Pull the SDK container:**
 
 ```
-docker pull nvcr.io/nvidia/tritonserver:23.05-py3-sdk
+docker pull nvcr.io/nvidia/tritonserver:23.06-py3-sdk
 ```
 
 **2. Run the SDK container**
@@ -59,7 +59,7 @@ docker run -it --gpus all \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v $(pwd)/examples/quick-start:$(pwd)/examples/quick-start \
      -v <path-to-output-model-repo>:<path-to-output-model-repo> \
-      --net=host nvcr.io/nvidia/tritonserver:23.05-py3-sdk
+      --net=host nvcr.io/nvidia/tritonserver:23.06-py3-sdk
 ```
 
 **Replacing** `<path-to-output-model-repo>` with the
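After a tag bump like this one, it is easy to keep running a stale SDK image. One way to confirm what is actually pulled (a sketch; `docker image inspect` and its Go-template `--format` flag are standard Docker CLI):

```
# Pull the SDK image and list the tags it carries locally.
docker pull nvcr.io/nvidia/tritonserver:23.06-py3-sdk
docker image inspect nvcr.io/nvidia/tritonserver:23.06-py3-sdk \
    --format '{{join .RepoTags ", "}}'
```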

helm-chart/values.yaml

Lines changed: 1 addition & 1 deletion
@@ -41,4 +41,4 @@ images:
 
   triton:
     image: nvcr.io/nvidia/tritonserver
-    tag: 23.05-py3
+    tag: 23.06-py3

model_analyzer/config/input/config_defaults.py

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@
 DEFAULT_RUN_CONFIG_PROFILE_MODELS_CONCURRENTLY_ENABLE = False
 DEFAULT_REQUEST_RATE_SEARCH_ENABLE = False
 DEFAULT_TRITON_LAUNCH_MODE = 'local'
-DEFAULT_TRITON_DOCKER_IMAGE = 'nvcr.io/nvidia/tritonserver:23.05-py3'
+DEFAULT_TRITON_DOCKER_IMAGE = 'nvcr.io/nvidia/tritonserver:23.06-py3'
 DEFAULT_TRITON_HTTP_ENDPOINT = 'localhost:8000'
 DEFAULT_TRITON_GRPC_ENDPOINT = 'localhost:8001'
 DEFAULT_TRITON_METRICS_URL = 'http://localhost:8002/metrics'
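This Python default only applies when nothing overrides it; Model Analyzer config options are also exposed as CLI flags, so the server image can be pinned per run. A hedged sketch (flag names assume the usual option-to-flag mapping described in the CLI docs; the repository path and `add_sub` model name are illustrative):

```
# Launch Triton in Docker mode with an explicitly pinned server image;
# /path/to/model/repository and add_sub are illustrative values.
model-analyzer profile \
    --model-repository /path/to/model/repository \
    --profile-models add_sub \
    --triton-launch-mode docker \
    --triton-docker-image nvcr.io/nvidia/tritonserver:23.06-py3
```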
