Commit 121d9ba

update docs
1 parent bc9db63 commit 121d9ba

6 files changed: +86 −0 lines changed
docs/en/02-how-to-run/triton_server.md

Lines changed: 38 additions & 0 deletions

@@ -0,0 +1,38 @@
# Model serving

MMDeploy provides model serving based on Triton Inference Server.

## Supported tasks

The following tasks are currently supported:

- [image-classification](../../../demo/triton/image-classification/README.md)
- [instance-segmentation](../../../demo/triton/instance-segmentation)
- [keypoint-detection](../../../demo/triton/keypoint-detection)
- [object-detection](../../../demo/triton/object-detection)
- [oriented-object-detection](../../../demo/triton/oriented-object-detection)
- [semantic-segmentation](../../../demo/triton/semantic-segmentation)
- [text-detection](../../../demo/triton/text-detection)
- [text-recognition](../../../demo/triton/text-recognition)
- [text-ocr](../../../demo/triton/text-ocr)

## Run Triton

In order to use Triton Inference Server, we need to:

1. Compile the MMDeploy Triton backend
2. Prepare the model repository (including model files and configuration files)

### Compile MMDeploy Triton Backend

a) Using Docker images

For ease of use, we provide a Docker image to support the deployment of models converted by MMDeploy. The image supports TensorRT and ONNX Runtime as backends. If you need other backends, you can choose to build from source.
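A minimal launch sketch follows; `<mmdeploy-triton-image>` is a placeholder for the actual image tag, and the port mappings are Triton's defaults:

```bash
# Hypothetical invocation: replace <mmdeploy-triton-image> with the actual tag.
# Ports follow Triton's defaults: 8000 = HTTP, 8001 = gRPC, 8002 = metrics.
docker run --gpus all --rm \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v $(pwd)/model_repository:/models \
  <mmdeploy-triton-image> \
  tritonserver --model-repository=/models
```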
b) Build from source

You can refer to [build from source](../01-how-to-build/build_from_source.md) to build MMDeploy. To build the MMDeploy Triton backend, add `-DTRITON_MMDEPLOY_BACKEND=ON` to the cmake configure command. By default, the latest version of the Triton backend is used; if you want an older version, add `-DTRITON_TAG=r22.12` to the cmake configure command.
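For example, the configure step might look like this (illustrative only; combine with your usual options from the build-from-source guide):

```bash
# Enable the MMDeploy Triton backend; -DTRITON_TAG is optional and pins
# an older Triton backend release (the latest is used by default).
cmake .. \
  -DTRITON_MMDEPLOY_BACKEND=ON \
  -DTRITON_TAG=r22.12
```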
### Prepare the model repository

Triton Inference Server has its own model description rules. Models converted through `tools/deploy.py ... --dump-info` therefore need to be reformatted so that Triton can load them correctly. We have prepared templates for each task; you can use the `demo/triton/to_triton_model.py` script to reformat a model. For complete samples, please refer to the description of each demo.
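For reference, a Triton model repository follows this layout (the model name is illustrative; the version directory must be numeric):

```
model_repository/
└── resnet50/            # one directory per model
    ├── config.pbtxt     # Triton model configuration
    └── 1/               # version directory containing the model files
```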

docs/en/get_started.md

Lines changed: 4 additions & 0 deletions
@@ -330,6 +330,10 @@ We'll talk about them more in our next release.
 
 If you want to fuse preprocessing for acceleration, please refer to this [doc](./02-how-to-run/fuse_transform.md)
 
+## Model serving (triton)
+
+For server-side deployment, please read [model serving](02-how-to-run/triton_server.md) for more details.
+
 ## Evaluate Model
 
 You can test the performance of the deployed model using `tools/test.py`. For example,

docs/en/index.rst

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ You can switch between Chinese and English documents in the lower-left corner of
    02-how-to-run/profile_model.md
    02-how-to-run/quantize_model.md
    02-how-to-run/useful_tools.md
+   02-how-to-run/triton_server.md
 
 .. toctree::
    :maxdepth: 1
docs/zh_cn/02-how-to-run/triton_server.md

Lines changed: 38 additions & 0 deletions

@@ -0,0 +1,38 @@
# How to perform server-side deployment

After model conversion, MMDeploy provides server-side model deployment based on Triton Inference Server.

## Supported tasks

The following tasks are currently supported:

- [image-classification](../../../demo/triton/image-classification/README.md)
- [instance-segmentation](../../../demo/triton/instance-segmentation)
- [keypoint-detection](../../../demo/triton/keypoint-detection)
- [object-detection](../../../demo/triton/object-detection)
- [oriented-object-detection](../../../demo/triton/oriented-object-detection)
- [semantic-segmentation](../../../demo/triton/semantic-segmentation)
- [text-detection](../../../demo/triton/text-detection)
- [text-recognition](../../../demo/triton/text-recognition)
- [text-ocr](../../../demo/triton/text-ocr)

## How to deploy the Triton service

In order to use Triton Inference Server, we need to:

1. Compile the MMDeploy Triton backend
2. Prepare the model repository (including model files and configuration files)

### Compile the MMDeploy Triton backend

a) Using Docker images

For ease of use, we provide a Docker image that supports deployment of models converted by MMDeploy. The image supports TensorRT and ONNX Runtime as backends. If you need other backends, you can build from source.

b) Build from source

To build MMDeploy from source, refer to [build from source](../01-how-to-build/build_from_source.md). To build the MMDeploy Triton backend, add `-DTRITON_MMDEPLOY_BACKEND=ON` to the cmake configure command. By default, the latest version of the Triton backend is used; to use an older version, add `-DTRITON_TAG=r22.12` to the cmake configure command.

### Prepare the model repository

Triton Inference Server has its own model description rules. Models converted with `tools/deploy.py ... --dump-info` need to be reformatted before Triton can load them correctly. We have prepared templates for each task; you can run the `demo/triton/to_triton_model.py` script to reformat a model. For complete examples, please refer to the description of each demo.
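As a rough sketch of what such a formatted configuration contains (every field value below is hypothetical; the real names, backend, types, and shapes come from the per-task templates), a Triton `config.pbtxt` has this shape:

```
# Hypothetical example only; actual values come from the task templates.
name: "resnet50"            # must match the model directory name
backend: "mmdeploy"         # assumed backend name, for illustration
max_batch_size: 1
input [
  {
    name: "input"           # hypothetical tensor name
    data_type: TYPE_UINT8
    dims: [ -1, -1, 3 ]
  }
]
output [
  {
    name: "output"          # hypothetical tensor name
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```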

docs/zh_cn/get_started.md

Lines changed: 4 additions & 0 deletions
@@ -331,6 +331,10 @@ target_link_libraries(${name} PRIVATE mmdeploy ${OpenCV_LIBS})
 
 If you want to accelerate preprocessing, please refer to [this doc](./02-how-to-run/fuse_transform.md)
 
+## Server-side deployment (triton)
+
+If you need server-side deployment, please read [server-side deployment](02-how-to-run/triton_server.md) for more details.
+
 ## Evaluate model accuracy
 
 To test the accuracy and inference efficiency of the deployed model, we provide `tools/test.py` to help with this work. Taking the deployed model above as an example:

docs/zh_cn/index.rst

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@
    02-how-to-run/profile_model.md
    02-how-to-run/quantize_model.md
    02-how-to-run/useful_tools.md
+   02-how-to-run/triton_server.md
 
 .. toctree::
    :maxdepth: 1
