Commit 8f27edf (parent: 43000fe)

docs: Additional Improvements

Use yaml instead of proto for code blocks to improve readability. Add example instance_group configuration.

1 file changed: README.md (22 additions, 13 deletions)
@@ -135,10 +135,12 @@ model_repository/
 
 The `model.pt` is the TorchScript model file.
 
-### Parameters
+## Configuration
 
 Triton exposes some flags to control the execution mode of the TorchScript models through the `Parameters` section of the model's `config.pbtxt` file.
 
+### Parameters
+
 * `DISABLE_OPTIMIZED_EXECUTION`:
   Boolean flag to disable the optimized execution of TorchScript models.
   By default, the optimized execution is always enabled.
@@ -154,7 +156,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 The section of model config file specifying this parameter will look like:
 
-```proto
+```yaml
 parameters: {
 key: "DISABLE_OPTIMIZED_EXECUTION"
 value: { string_value: "true" }
@@ -173,7 +175,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To enable inference mode, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "INFERENCE_MODE"
 value: { string_value: "true" }
@@ -193,7 +195,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To disable cuDNN, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "DISABLE_CUDNN"
 value: { string_value: "true" }
@@ -208,7 +210,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To enable weight sharing, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "ENABLE_WEIGHT_SHARING"
 value: { string_value: "true" }
@@ -226,7 +228,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To enable cleaning of the CUDA cache after every execution, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "ENABLE_CACHE_CLEANING"
 value: { string_value: "true" }
@@ -249,7 +251,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To set the inter-op thread count, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "INTER_OP_THREAD_COUNT"
 value: { string_value: "1" }
@@ -270,7 +272,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 To set the intra-op thread count, use the configuration example below:
 
-```proto
+```yaml
 parameters: {
 key: "INTRA_OP_THREAD_COUNT"
 value: { string_value: "1" }
@@ -286,9 +288,7 @@ Triton exposes some flags to control the execution mode of the TorchScript model
 
 `ENABLE_JIT_PROFILING`
 
-### Support
-
-#### Model Instance Group Kind
+### Model Instance Group Kind
 
 The PyTorch backend supports the following kinds of
 [Model Instance Groups](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups)
@@ -314,6 +314,15 @@ where the input tensors are placed as follows:
 > [!IMPORTANT]
 > If a device is not specified in the model, the backend uses the first available GPU device.
 
+To set the model instance group, use the configuration example below:
+
+```yaml
+instance_group {
+  count: 2
+  kind: KIND_GPU
+}
+```
+
 ### Customization
 
 The following PyTorch settings may be customized by setting parameters on the
@@ -342,7 +351,7 @@ The following PyTorch settings may be customized by setting parameters on the
 
 For example:
 
-```proto
+```yaml
 parameters: {
 key: "NUM_THREADS"
 value: { string_value: "4" }
@@ -353,7 +362,7 @@ parameters: {
 }
 ```
 
-### Important Notes
+## Important Notes
 
 * The execution of PyTorch model on GPU is asynchronous in nature.
   See
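Taken together, the options documented in this commit's diff combine in a single `config.pbtxt`: each flag is its own `parameters` entry keyed by name, while `instance_group` is a separate top-level field. As a hedged sketch only (the model name `model_pt`, `max_batch_size`, and the chosen values are illustrative assumptions, not taken from this commit), a model configuration using several of these options might look like:

```yaml
# Hypothetical config.pbtxt sketch combining options documented above.
# Model name, batch size, and values are illustrative assumptions.
name: "model_pt"
backend: "pytorch"
max_batch_size: 8

# Two execution instances on GPU (see Model Instance Group Kind).
instance_group {
  count: 2
  kind: KIND_GPU
}

# Execution-mode flags from the Parameters section; note that every
# value is passed as a string_value, even for booleans and integers.
parameters: {
  key: "INFERENCE_MODE"
  value: { string_value: "true" }
}
parameters: {
  key: "INTRA_OP_THREAD_COUNT"
  value: { string_value: "1" }
}
```

Each parameter is an independent `parameters` entry rather than a single map, which is why the diff shows the same `parameters: { key: ... value: ... }` shape repeated for every flag.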
