
Commit 7cd311f

Author: Ashwin Ramesh
Updated docs and setup.py for publishing wheel to PyPI (#214)
1 parent d8faa52 commit 7cd311f

3 files changed (+144, -134 lines)

docs/install.md

Lines changed: 138 additions & 132 deletions
@@ -16,138 +16,144 @@ limitations under the License.

# Installation

-There are three ways to use Triton Model Analyzer:
-
-1. The recommended way to use Model Analyzer is with the Triton SDK docker
-container available on the [NVIDIA GPU Cloud
-Catalog](https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver). You
-can pull and run the SDK container with the following commands:
-
-```
-$ docker pull nvcr.io/nvidia/tritonserver:21.08-py3-sdk
-```
-
-If you are not planning to run Model Analyzer with
-`--triton-launch-mode=docker` you can run the container with the following
-command:
-
-```
-$ docker run -it --gpus all --net=host nvcr.io/nvidia/tritonserver:21.08-py3-sdk
-```
-
-If you intend to use `--triton-launch-mode=docker`, you will need to mount
-the following:
-* `-v /var/run/docker.sock:/var/run/docker.sock` allows running docker
-containers as sibling containers from inside the Triton SDK container.
-Model Analyzer will require this if run with
-`--triton-launch-mode=docker`.
-* `-v <path-to-output-model-repo>:<path-to-output-model-repo>` The
-***absolute*** path to the directory where the output model repository
-will be located (i.e. parent directory of the output model repository).
-This is so that the launched Triton container has access to the model
-config variants that Model Analyzer creates.
-
-```
-$ docker run -it --gpus all \
--v /var/run/docker.sock:/var/run/docker.sock \
--v <path-to-output-model-repo>:<path-to-output-model-repo> \
---net=host nvcr.io/nvidia/tritonserver:21.08-py3-sdk
-```
-
-Model Analyzer uses `pdfkit` for report generation. If you are running Model
-Analyzer inside the Triton SDK container, then you will need to download
-`wkhtmltopdf`.
-
-```
-$ sudo apt-get update && sudo apt-get install wkhtmltopdf
-```
-
-Once you do this, Model Analyzer will able to use `pdfkit` to generate
-reports.
-
-2. Building the Dockerfile:
-
-You can also build the Model Analyzer's dockerfile yourself. First, clone the
-Model Analyzer's git repository, then build the docker image.
-
-```
-$ git clone https://github.com/triton-inference-server/model_analyzer
-$ docker build --pull -t model-analyzer .
-```
-
-The above command will pull all the containers that model analyzer needs to
-run. The Model Analyzer's Dockerfile bases the container on the latest
-`tritonserver` containers from NGC. Now you can run the container with:
-
-```
-$ docker run -it --rm --gpus all \
--v /var/run/docker.sock:/var/run/docker.sock \
--v <path-to-triton-model-repository>:<path-to-triton-model-repository> \
--v <path-to-output-model-repo>:<path-to-output-model-repo> \
---net=host model-analyzer
-
-root@hostname:/opt/triton-model-analyzer#
-```
-
-3. Using `pip3`:
-
-You can install pip using:
-```
-$ sudo apt-get update && sudo apt-get install python3-pip
-```
-
-Model analyzer can be installed with:
-```
-$ pip3 install nvidia-pyindex
-$ pip3 install triton-model-analyzer
-```
-
-If you encounter any errors installing dependencies like `numba`, make sure
-that you have the latest version of `pip` using:
-
-```
-$ pip3 install --upgrade pip
-```
-
-You can then try installing model analyzer again.
-
-If you are using this approach you need to install DCGM on your machine.
-
-For installing DCGM on Ubuntu 20.04 you can use the following commands:
-```
-$ export DCGM_VERSION=2.0.13
-$ wget -q https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb && \
-dpkg -i datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb
-```
-
-4. Building from source:
-
-To build model analyzer form source, you'll need to install the same
-dependencies (tritonclient and DCGM) mentioned in the "Using pip section".
-After that, you can use the following commands:
-
-```
-$ git clone https://github.com/triton-inference-server/model_analyzer
-$ cd model_analyzer
-$ ./build_wheel.sh <path to perf_analyzer> true
-```
-
-In the final command above we are building the triton-model-analyzer wheel.
-You will need to provide the `build_wheel.sh` script with two arguments. The
-first is the path to the `perf_analyzer` binary that you would like Model
-Analyzer to use. The second is whether you want this wheel to be linux
-specific. Currently, this argument must be set to `true` as perf analyzer is
-supported only on linux. This will create a wheel file in the `wheels`
-directory named
-`triton-model-analyzer-<version>-py3-none-manylinux1_x86_64.whl`. We can now
-install this with:
-
-```
-$ pip3 install wheels/triton-model-analyzer-*.whl
-```
-
-After these steps, `model-analyzer` executable should be available in
-`$PATH`.
+There are 4 ways to use Triton Model Analyzer:
+
+## Triton SDK Container
+
+The recommended way to use Model Analyzer is with the Triton SDK docker
+container available on the [NVIDIA GPU Cloud
+Catalog](https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver). You can
+pull and run the SDK container with the following commands:
+
+```
+$ docker pull nvcr.io/nvidia/tritonserver:21.08-py3-sdk
+```
+
+If you are not planning to run Model Analyzer with
+`--triton-launch-mode=docker`, you can run the SDK container with the following
+command:
+
+```
+$ docker run -it --gpus all --net=host nvcr.io/nvidia/tritonserver:21.08-py3-sdk
+```
+
+You will need to build and install the Triton server binary inside the SDK
+container. See the Triton [Installation
+docs](https://github.com/triton-inference-server/server/blob/main/docs/build.md)
+for more details.
+
+If you intend to use `--triton-launch-mode=docker`, which is recommended with
+this method of using Model Analyzer, you will need to mount the
+following:
+* `-v /var/run/docker.sock:/var/run/docker.sock` allows running docker
+containers as sibling containers from inside the Triton SDK container.
+Model Analyzer will require this if run with
+`--triton-launch-mode=docker`.
+* `-v <path-to-output-model-repo>:<path-to-output-model-repo>` The
+***absolute*** path to the directory where the output model repository
+will be located (i.e. parent directory of the output model repository).
+This is so that the launched Triton container has access to the model
+config variants that Model Analyzer creates.
+
+```
+$ docker run -it --gpus all \
+-v /var/run/docker.sock:/var/run/docker.sock \
+-v <path-to-output-model-repo>:<path-to-output-model-repo> \
+--net=host nvcr.io/nvidia/tritonserver:21.08-py3-sdk
+```
+
+Model Analyzer uses `pdfkit` for report generation. If you are running Model
+Analyzer inside the Triton SDK container, then you will need to download
+`wkhtmltopdf`.
+
+```
+$ sudo apt-get update && sudo apt-get install wkhtmltopdf
+```
+
+Once you do this, Model Analyzer will be able to use `pdfkit` to generate reports.
+
+## Building the Dockerfile
+
+You can also build the Model Analyzer's Dockerfile yourself. First, clone the
+Model Analyzer's git repository, then build the docker image. This is the
+recommended installation method if you mainly intend to use
+`--triton-launch-mode=local`, as all the dependencies will be available.
+
+```
+$ git clone https://github.com/triton-inference-server/model_analyzer
+$ docker build --pull -t model-analyzer .
+```
+
+The above command will pull all the containers that Model Analyzer needs to run.
+The Model Analyzer's Dockerfile bases the container on the latest `tritonserver`
+containers from NGC. Now you can run the container with:
+
+```
+$ docker run -it --rm --gpus all \
+-v /var/run/docker.sock:/var/run/docker.sock \
+-v <path-to-triton-model-repository>:<path-to-triton-model-repository> \
+-v <path-to-output-model-repo>:<path-to-output-model-repo> \
+--net=host model-analyzer
+
+root@hostname:/opt/triton-model-analyzer#
+```
+
+## Using `pip3`
+
+You can install pip using:
+```
+$ sudo apt-get update && sudo apt-get install python3-pip
+```
+
+Model Analyzer can be installed with:
+```
+$ pip3 install triton-model-analyzer
+```
+
+If you encounter any errors installing dependencies like `numba`, make sure that
+you have the latest version of `pip` using:
+
+```
+$ pip3 install --upgrade pip
+```
+
+You can then try installing Model Analyzer again.
+
+If you are using this approach, you need to install DCGM on your machine.
+
+For installing DCGM on Ubuntu 20.04 you can use the following commands:
+```
+$ export DCGM_VERSION=2.0.13
+$ wget -q https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb && \
+dpkg -i datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb
+```
+
+## Building from source
+
+To build Model Analyzer from source, you'll need to install the same
+dependencies (tritonclient and DCGM) mentioned in the "Using `pip3`" section. After
+that, you can use the following commands:
+
+```
+$ git clone https://github.com/triton-inference-server/model_analyzer
+$ cd model_analyzer
+$ ./build_wheel.sh <path to perf_analyzer> true
+```
+
+In the final command above we are building the triton-model-analyzer wheel. You
+will need to provide the `build_wheel.sh` script with two arguments. The first
+is the path to the `perf_analyzer` binary that you would like Model Analyzer to
+use. The second is whether you want this wheel to be Linux specific. Currently,
+this argument must be set to `true` as perf analyzer is supported only on Linux.
+This will create a wheel file in the `wheels` directory named
+`triton-model-analyzer-<version>-py3-none-manylinux1_x86_64.whl`. We can now
+install this with:
+
+```
+$ pip3 install wheels/triton-model-analyzer-*.whl
+```
+
+After these steps, the `model-analyzer` executable should be available in `$PATH`.

**Notes:**
* Triton Model Analyzer supports all the GPUs supported by the DCGM library. See

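Every install path above ends in the same place: an environment where the `model-analyzer` executable is on `$PATH`. As a minimal sketch of what comes next (assuming the `profile` subcommand and flags documented in the project's quick start; the repository paths and the model name `add_sub` are placeholders, not values from this commit), a first profiling run from inside one of the containers described above might look like:

```
# Hypothetical usage sketch. Assumes the container was started with the
# docker.sock and output-model-repository mounts shown above, and that
# <path-to-model-repo> contains a model named add_sub (placeholder).
$ model-analyzer profile \
    --model-repository <path-to-model-repo> \
    --output-model-repository-path <path-to-output-model-repo>/output \
    --profile-models add_sub \
    --triton-launch-mode=docker
```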
docs/metrics.md

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ recorded and aggregated over fixed intervals during a perf analyzer run.
* `cpu_used_ram`: The total amount of memory used by all CPUs
* `cpu_available_ram`: The total amount of available CPU memory.

-**Warning**: Collecting CPU metrics might affect model inference metrics such as throughput and latency. By default, CPU metrics are not collected. To collect CPU metrics, set the `collect_cpu_metrics` flag to `true`; see [Configuring Model Analyzer](docs/config.md) for details.
+**Warning**: Collecting CPU metrics might affect model inference metrics such as throughput and latency. By default, CPU metrics are not collected. To collect CPU metrics, set the `collect_cpu_metrics` flag to `true`; see [Configuring Model Analyzer](./config.md) for details.

## Additional tags for output headers
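The `collect_cpu_metrics` flag referenced in the warning is a configuration option rather than a metric tag. A minimal sketch of enabling it through a YAML config file, assuming the keys shown here (verify them against the config documentation linked above; the model name and path are placeholders), could look like:

```
# Sketch only: opt in to CPU metric collection via a config file and pass it
# to model-analyzer with -f.
$ cat > profile_config.yaml <<'EOF'
model_repository: <path-to-model-repo>
profile_models:
  - add_sub
collect_cpu_metrics: true
EOF
$ model-analyzer profile -f profile_config.yaml
```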

setup.py

Lines changed: 5 additions & 1 deletion
@@ -74,7 +74,11 @@ def get_tag(self):
author='NVIDIA Inc.',
author_email='[email protected]',
description=
-"The Model Analyzer is a tool to analyze the runtime performance of one or more models on the Triton Inference Server",
+"Triton Model Analyzer is a tool to analyze the runtime performance of one or more models on the Triton Inference Server",
+long_description=
+"""See the Model Analyzer's [installation documentation](https://github.com/triton-inference-server/model_analyzer/blob/main/docs/install.md#using-pip3) """
+"""for package details. The [quick start](https://github.com/triton-inference-server/model_analyzer/blob/main/docs/quick_start.md) documentation """
+"""describes how to get started with profiling and analysis using Triton Model Analyzer.""",
license='BSD',
url='https://developer.nvidia.com/nvidia-triton-inference-server',
keywords=[
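Since the commit is about publishing the wheel to PyPI, the new `description` and `long_description` fields are what the package index will render for the project. A minimal sketch of checking that metadata locally before an upload, assuming `twine` as the release tool (an assumption about tooling, not something this commit adds), could be:

```
# Sketch only: build the wheel as described in docs/install.md, then validate
# the package metadata (including long_description) before uploading to PyPI.
$ ./build_wheel.sh <path to perf_analyzer> true
$ pip3 install twine
$ twine check wheels/triton-model-analyzer-*.whl
```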
