# Installation

There are four ways to use Triton Model Analyzer:

## Triton SDK Container

The recommended way to use Model Analyzer is with the Triton SDK docker
container available on the [NVIDIA GPU Cloud
Catalog](https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver). You can
pull and run the SDK container with the following commands:

```
$ docker pull nvcr.io/nvidia/tritonserver:21.08-py3-sdk
```

If you are not planning to run Model Analyzer with
`--triton-launch-mode=docker`, you can run the SDK container with the following
command:

```
$ docker run -it --gpus all --net=host nvcr.io/nvidia/tritonserver:21.08-py3-sdk
```

You will need to build and install the Triton server binary inside the SDK
container. See the Triton [Installation
docs](https://github.com/triton-inference-server/server/blob/main/docs/build.md)
for more details.

If you intend to use `--triton-launch-mode=docker`, which is recommended with
this method of using Model Analyzer, you will need to mount the following:
* `-v /var/run/docker.sock:/var/run/docker.sock` allows running docker
  containers as sibling containers from inside the Triton SDK container.
  Model Analyzer will require this if run with
  `--triton-launch-mode=docker`.
* `-v <path-to-output-model-repo>:<path-to-output-model-repo>` The
  ***absolute*** path to the directory where the output model repository
  will be located (i.e. the parent directory of the output model repository).
  This is so that the launched Triton container has access to the model
  config variants that Model Analyzer creates.

```
$ docker run -it --gpus all \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v <path-to-output-model-repo>:<path-to-output-model-repo> \
    --net=host nvcr.io/nvidia/tritonserver:21.08-py3-sdk
```
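Once inside the container, you can point Model Analyzer at your model repository and let it launch sibling Triton containers for you. The sketch below is illustrative only: the `profile` subcommand and the `--model-repository`/`--profile-models` flags are assumptions about the CLI, so confirm the exact options with `model-analyzer --help` in your version.

```
# Hedged sketch of a docker-launch-mode run from inside the SDK container.
# The subcommand and flag names are assumptions; verify them with
# `model-analyzer --help` before use.
$ model-analyzer profile \
    --model-repository <path-to-triton-model-repository> \
    --profile-models <model-name> \
    --triton-launch-mode=docker
```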
Model Analyzer uses `pdfkit` for report generation. If you are running Model
Analyzer inside the Triton SDK container, then you will need to install
`wkhtmltopdf`:

```
$ sudo apt-get update && sudo apt-get install wkhtmltopdf
```

Once you do this, Model Analyzer will be able to use `pdfkit` to generate reports.
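To confirm that the PDF toolchain is in place before generating any reports, a quick check like the following should be enough (assuming the command above installed `wkhtmltopdf` into your `PATH`):

```
# Sanity check: pdfkit needs the wkhtmltopdf binary to be discoverable.
$ which wkhtmltopdf
$ wkhtmltopdf --version
```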
## Building the Dockerfile

You can also build the Model Analyzer Dockerfile yourself. First, clone the
Model Analyzer git repository, then build the docker image. This is the
recommended installation method if you mainly intend to use
`--triton-launch-mode=local`, as all the dependencies will be available.

```
$ git clone https://github.com/triton-inference-server/model_analyzer
$ docker build --pull -t model-analyzer .
```

The above command will pull all the containers that Model Analyzer needs to run.
The Model Analyzer Dockerfile bases the container on the latest `tritonserver`
container from NGC. Now you can run the container with:

```
$ docker run -it --rm --gpus all \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v <path-to-triton-model-repository>:<path-to-triton-model-repository> \
    -v <path-to-output-model-repo>:<path-to-output-model-repo> \
    --net=host model-analyzer

root@hostname:/opt/triton-model-analyzer#
```
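Inside this container, `--triton-launch-mode=local` lets Model Analyzer start the bundled `tritonserver` binary directly. As with the earlier sketch, the subcommand and flag names below are assumptions for illustration rather than a definitive invocation; check `model-analyzer --help` for the options your version supports.

```
# Hedged sketch of a local-launch-mode run inside the built container.
# Subcommand and flag names are assumptions; confirm with `model-analyzer --help`.
$ model-analyzer profile \
    --model-repository <path-to-triton-model-repository> \
    --profile-models <model-name> \
    --triton-launch-mode=local
```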
## Using `pip3`

You can install pip using:
```
$ sudo apt-get update && sudo apt-get install python3-pip
```

Model Analyzer can be installed with:
```
$ pip3 install triton-model-analyzer
```

If you encounter any errors installing dependencies like `numba`, make sure that
you have the latest version of `pip` using:

```
$ pip3 install --upgrade pip
```

You can then try installing Model Analyzer again.

If you are using this approach, you need to install DCGM on your machine.

To install DCGM on Ubuntu 20.04, you can use the following commands:
```
$ export DCGM_VERSION=2.0.13
$ wget -q https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb && \
    sudo dpkg -i datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb
```
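Once the package is installed, you can optionally verify that the DCGM host engine starts and can see your GPUs. `nv-hostengine` and `dcgmi` ship with the DCGM package; the exact service setup can differ between DCGM versions, so treat this as a rough check rather than a required step:

```
# Optional sanity check that DCGM is working.
$ sudo nv-hostengine       # start the DCGM host engine if it is not already running
$ dcgmi discovery -l       # list the GPUs DCGM can see
```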
## Building from source

To build Model Analyzer from source, you'll need to install the same
dependencies (tritonclient and DCGM) mentioned in the "Using `pip3`" section
above. After that, you can use the following commands:

```
$ git clone https://github.com/triton-inference-server/model_analyzer
$ cd model_analyzer
$ ./build_wheel.sh <path to perf_analyzer> true
```

The final command above builds the triton-model-analyzer wheel. You will need to
provide the `build_wheel.sh` script with two arguments. The first is the path to
the `perf_analyzer` binary that you would like Model Analyzer to use. The second
is whether you want this wheel to be Linux-specific. Currently, this argument
must be set to `true`, as `perf_analyzer` is supported only on Linux. This will
create a wheel file in the `wheels` directory named
`triton-model-analyzer-<version>-py3-none-manylinux1_x86_64.whl`. You can then
install it with:

```
$ pip3 install wheels/triton-model-analyzer-*.whl
```

After these steps, the `model-analyzer` executable should be available in `$PATH`.
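A quick way to confirm the installation is to check that the executable resolves and prints its usage information:

```
# Verify the install: the executable should resolve on PATH and print help text.
$ which model-analyzer
$ model-analyzer --help
```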
**Notes:**
* Triton Model Analyzer supports all the GPUs supported by the DCGM library. See