
Commit 0659dc9

docs: Additional Improvements

Use yaml instead of proto for code blocks to improve readability. Add example instance_group configuration.

1 parent 1fd9392

1 file changed: README.md (22 additions, 13 deletions)
@@ -135,10 +135,12 @@ model_repository/
 
 The `model.pt` is the TorchScript model file.
 
-### Parameters
+## Configuration
 
 Triton exposes some flags to control the execution mode of the TorchScript models through the `Parameters` section of the model's `config.pbtxt` file.
 
+### Parameters
+
 * `DISABLE_OPTIMIZED_EXECUTION`:
 Boolean flag to disable the optimized execution of TorchScript models.
 By default, the optimized execution is always enabled.
@@ -154,7 +156,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 The section of model config file specifying this parameter will look like:
 
-```proto
+```yaml
 parameters: {
 key: "DISABLE_OPTIMIZED_EXECUTION"
 value: { string_value: "true" }
@@ -173,7 +175,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To enable inference mode, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "INFERENCE_MODE"
 value: { string_value: "true" }
@@ -193,7 +195,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To disable cuDNN, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "DISABLE_CUDNN"
 value: { string_value: "true" }
@@ -208,7 +210,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To enable weight sharing, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "ENABLE_WEIGHT_SHARING"
 value: { string_value: "true" }
@@ -226,7 +228,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To enable cleaning of the CUDA cache after every execution, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "ENABLE_CACHE_CLEANING"
 value: { string_value: "true" }
@@ -249,7 +251,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To set the inter-op thread count, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "INTER_OP_THREAD_COUNT"
 value: { string_value: "1" }
@@ -275,7 +277,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To set the intra-op thread count, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "INTRA_OP_THREAD_COUNT"
 value: { string_value: "1" }
@@ -291,9 +293,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 `ENABLE_JIT_PROFILING`
 
-### Support
-
-#### Model Instance Group Kind
+### Model Instance Group Kind
 
 The PyTorch backend supports the following kinds of
 [Model Instance Groups](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups)
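
The `ENABLE_JIT_PROFILING` flag appears in the hunk above as bare context, with no snippet of its own at this point in the README. As a minimal sketch (not part of this commit's diff), and assuming the flag takes the same boolean `string_value` form as the other parameters, its entry would look like:

```yaml
# Hypothetical entry; assumes ENABLE_JIT_PROFILING follows the same
# string-valued boolean pattern as the flags documented above.
parameters: {
  key: "ENABLE_JIT_PROFILING"
  value: { string_value: "true" }
}
```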
@@ -319,6 +319,15 @@ where the input tensors are placed as follows:
 > [!IMPORTANT]
 > If a device is not specified in the model, the backend uses the first available GPU device.
 
+To set the model instance group, use the configuration example below:
+
+```yaml
+instance_group {
+count: 2
+kind: KIND_GPU
+}
+```
+
 ### Customization
 
 The following PyTorch settings may be customized by setting parameters on the
@@ -347,7 +356,7 @@ The following PyTorch settings may be customized by setting parameters on the
 
 For example:
 
-```proto
+```yaml
 parameters: {
 key: "NUM_THREADS"
 value: { string_value: "4" }
@@ -358,7 +367,7 @@ parameters: {
 }
 ```
 
-### Important Notes
+## Important Notes
 
 * The execution of PyTorch model on GPU is asynchronous in nature.
 See
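
For reference, the pieces this commit touches would sit together in one `config.pbtxt` roughly as sketched below. This is not part of the diff; the model name, tensor names, shapes, and instance count are illustrative placeholders, while `backend: "pytorch"` and the `INPUT__0`/`OUTPUT__0` naming follow the backend's usual conventions.

```yaml
# Hypothetical config.pbtxt sketch; names, dims, and counts are placeholders.
name: "my_torchscript_model"
backend: "pytorch"
max_batch_size: 8

input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 16 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]

# Two model instances on the GPU, as in the instance_group example added above.
instance_group {
  count: 2
  kind: KIND_GPU
}

# Execution flags are passed as string-valued parameters, per the README.
parameters: {
  key: "INFERENCE_MODE"
  value: { string_value: "true" }
}
```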
