Commit 121d9ba

update docs
1 parent bc9db63 commit 121d9ba

6 files changed: +86 −0 lines changed
docs/en/02-how-to-run/triton_server.md

Lines changed: 38 additions & 0 deletions

@@ -0,0 +1,38 @@
# Model serving

MMDeploy provides model serving based on Triton Inference Server.

## Supported tasks

The following tasks are currently supported:

- [image-classification](../../../demo/triton/image-classification/README.md)
- [instance-segmentation](../../../demo/triton/instance-segmentation)
- [keypoint-detection](../../../demo/triton/keypoint-detection)
- [object-detection](../../../demo/triton/object-detection)
- [oriented-object-detection](../../../demo/triton/oriented-object-detection)
- [semantic-segmentation](../../../demo/triton/semantic-segmentation)
- [text-detection](../../../demo/triton/text-detection)
- [text-recognition](../../../demo/triton/text-recognition)
- [text-ocr](../../../demo/triton/text-ocr)

## Run Triton

In order to use Triton Inference Server, we need to:

1. Compile the MMDeploy Triton backend
2. Prepare the model repository (including model files and configuration files)

### Compile MMDeploy Triton Backend

a) Using Docker images

For ease of use, we provide a Docker image to support the deployment of models converted by MMDeploy. The image supports TensorRT and ONNX Runtime as backends. If you need other backends, you can choose to build from source.
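A minimal launch sketch follows; `<mmdeploy-triton-image>` is a placeholder for the actual image tag, and the port mappings are Triton's defaults:

```bash
# Hypothetical invocation: replace <mmdeploy-triton-image> with the actual tag.
# Ports follow Triton's defaults: 8000 = HTTP, 8001 = gRPC, 8002 = metrics.
docker run --gpus all --rm \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v $(pwd)/model_repository:/models \
  <mmdeploy-triton-image> \
  tritonserver --model-repository=/models
```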
b) Build from source

You can refer to [build from source](../01-how-to-build/build_from_source.md) to build MMDeploy. To build the MMDeploy Triton backend, add `-DTRITON_MMDEPLOY_BACKEND=ON` to the cmake configure command. By default, the latest version of the Triton backend is used; if you want an older version, add `-DTRITON_TAG=r22.12` to the cmake configure command.
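For example, the configure step might look like this (illustrative only; combine with your usual options from the build-from-source guide):

```bash
# Enable the MMDeploy Triton backend; -DTRITON_TAG is optional and pins
# an older Triton backend release (the latest is used by default).
cmake .. \
  -DTRITON_MMDEPLOY_BACKEND=ON \
  -DTRITON_TAG=r22.12
```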
### Prepare the model repository

Triton Inference Server has its own model description rules. Models converted through `tools/deploy.py ... --dump-info` therefore need to be reformatted so that Triton can load them correctly. We have prepared templates for each task; you can use the `demo/triton/to_triton_model.py` script to reformat a model. For complete samples, please refer to the description of each demo.
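For reference, a Triton model repository follows this layout (the model name is illustrative; the version directory must be numeric):

```
model_repository/
└── resnet50/            # one directory per model
    ├── config.pbtxt     # Triton model configuration
    └── 1/               # version directory containing the model files
```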

docs/en/get_started.md

Lines changed: 4 additions & 0 deletions
@@ -330,6 +330,10 @@ We'll talk about them more in our next release.
 
 If you want to fuse preprocessing for acceleration, please refer to this [doc](./02-how-to-run/fuse_transform.md)
 
+## Model serving (triton)
+
+For server-side deployment, please read [model serving](02-how-to-run/triton_server.md) for more details.
+
 ## Evaluate Model
 
 You can test the performance of the deployed model using `tools/test.py`. For example,

docs/en/index.rst

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ You can switch between Chinese and English documents in the lower-left corner of
    02-how-to-run/profile_model.md
    02-how-to-run/quantize_model.md
    02-how-to-run/useful_tools.md
+   02-how-to-run/triton_server.md
 
 .. toctree::
    :maxdepth: 1
docs/zh_cn/02-how-to-run/triton_server.md

Lines changed: 38 additions & 0 deletions

@@ -0,0 +1,38 @@
# How to perform server-side deployment

After model conversion, MMDeploy provides server-side model deployment based on Triton Inference Server.

## Supported tasks

The following tasks are currently supported:

- [image-classification](../../../demo/triton/image-classification/README.md)
- [instance-segmentation](../../../demo/triton/instance-segmentation)
- [keypoint-detection](../../../demo/triton/keypoint-detection)
- [object-detection](../../../demo/triton/object-detection)
- [oriented-object-detection](../../../demo/triton/oriented-object-detection)
- [semantic-segmentation](../../../demo/triton/semantic-segmentation)
- [text-detection](../../../demo/triton/text-detection)
- [text-recognition](../../../demo/triton/text-recognition)
- [text-ocr](../../../demo/triton/text-ocr)

## How to deploy the Triton service

In order to use Triton Inference Server, we need to:

1. Compile the MMDeploy Triton backend
2. Prepare the model repository (including model files and configuration files)

### Compile the MMDeploy Triton backend

a) Using Docker images

For ease of use, we provide a Docker image that supports deployment of models converted by MMDeploy. The image supports TensorRT and ONNX Runtime as backends. If you need other backends, you can build from source.

b) Build from source

To build MMDeploy from source, refer to [build from source](../01-how-to-build/build_from_source.md). To build the MMDeploy Triton backend, add `-DTRITON_MMDEPLOY_BACKEND=ON` to the cmake configure command. By default, the latest version of the Triton backend is used; to use an older version, add `-DTRITON_TAG=r22.12` to the cmake configure command.

### Prepare the model repository

Triton Inference Server has its own model description rules. Models converted with `tools/deploy.py ... --dump-info` need to be reformatted before Triton can load them correctly. We have prepared templates for each task; you can run the `demo/triton/to_triton_model.py` script to reformat a model. For complete examples, please refer to the description of each demo.
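As a rough sketch of what such a formatted configuration contains (every field value below is hypothetical; the real names, backend, types, and shapes come from the per-task templates), a Triton `config.pbtxt` has this shape:

```
# Hypothetical example only; actual values come from the task templates.
name: "resnet50"            # must match the model directory name
backend: "mmdeploy"         # assumed backend name, for illustration
max_batch_size: 1
input [
  {
    name: "input"           # hypothetical tensor name
    data_type: TYPE_UINT8
    dims: [ -1, -1, 3 ]
  }
]
output [
  {
    name: "output"          # hypothetical tensor name
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```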

docs/zh_cn/get_started.md

Lines changed: 4 additions & 0 deletions
@@ -331,6 +331,10 @@ target_link_libraries(${name} PRIVATE mmdeploy ${OpenCV_LIBS})
 
 If you want to accelerate preprocessing, please refer to [this doc](./02-how-to-run/fuse_transform.md)
 
+## Server-side deployment (triton)
+
+If you need server-side deployment, please read [server-side deployment](02-how-to-run/triton_server.md) for more details.
+
 ## Evaluate model accuracy
 
 To test the accuracy and inference efficiency of the deployed model, we provide `tools/test.py` to help with this work. Taking the deployed model above as an example:

docs/zh_cn/index.rst

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@
    02-how-to-run/profile_model.md
    02-how-to-run/quantize_model.md
    02-how-to-run/useful_tools.md
+   02-how-to-run/triton_server.md
 
 .. toctree::
    :maxdepth: 1
