Commit 60ebbea

update multidevices (#1100)
1 parent 14382c6 commit 60ebbea

File tree

3 files changed: +199 lines, -15 lines

deploy/python_infer/base.py

Lines changed: 3 additions & 3 deletions

@@ -38,7 +38,7 @@ class Predictor:
     Args:
         pdmodel_path (Optional[str]): Path to the PaddlePaddle model file. Defaults to None.
         pdiparams_path (Optional[str]): Path to the PaddlePaddle model parameters file. Defaults to None.
-        device (Literal["gpu", "cpu", "npu", "xpu", "sdaa"], optional): Device to use for inference. Defaults to "cpu".
+        device (Literal["cpu", "gpu", "npu", "xpu", "sdaa"], optional): Device to use for inference. Defaults to "cpu".
         engine (Literal["native", "tensorrt", "onnx", "mkldnn"], optional): Inference engine to use. Defaults to "native".
         precision (Literal["fp32", "fp16", "int8"], optional): Precision to use for inference. Defaults to "fp32".
         onnx_path (Optional[str], optional): Path to the ONNX model file. Defaults to None.
@@ -54,7 +54,7 @@ def __init__(
         pdmodel_path: Optional[str] = None,
         pdiparams_path: Optional[str] = None,
         *,
-        device: Literal["gpu", "cpu", "npu", "xpu", "sdaa"] = "cpu",
+        device: Literal["cpu", "gpu", "npu", "xpu", "sdaa"] = "cpu",
         engine: Literal["native", "tensorrt", "onnx", "mkldnn"] = "native",
         precision: Literal["fp32", "fp16", "int8"] = "fp32",
         onnx_path: Optional[str] = None,
@@ -214,7 +214,7 @@ def _create_onnx_predictor(
         return predictor, config

     def _check_device(self, device: str):
-        if device not in ["gpu", "cpu", "npu", "xpu"]:
+        if device not in ["cpu", "gpu", "npu", "xpu"]:
             raise ValueError(
                 "Inference only supports 'gpu', 'cpu', 'npu' and 'xpu' devices, "
                 f"but got {device}."
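One detail worth noting in the hunk above: the allowed devices appear both in the `Literal` annotation and in the runtime list inside `_check_device`, so the two lists must be kept in sync by hand. A minimal sketch of deriving the runtime check from the annotation instead (hypothetical helper names, not actual PaddleScience code):

``` python
from typing import Literal, get_args

# Hypothetical alias mirroring the annotation used by Predictor.
DeviceLiteral = Literal["cpu", "gpu", "npu", "xpu", "sdaa"]

def check_device(device: str) -> str:
    # Derive the allowed set from the annotation so the runtime check
    # cannot drift out of sync with the Literal.
    allowed = get_args(DeviceLiteral)
    if device not in allowed:
        raise ValueError(f"Inference only supports {allowed} devices, but got {device!r}.")
    return device

print(check_device("sdaa"))  # prints "sdaa" -- the newly added device passes
```

With this pattern, adding a device to the `Literal` automatically extends the runtime check as well.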

docs/zh/multi_device.md

Lines changed: 195 additions & 11 deletions

@@ -88,27 +88,211 @@
 | Seismic waveform inversion | [VelocityGAN seismic waveform inversion](./examples/velocity_gan.md) || | | |
 | Traffic prediction | [TGCN traffic flow prediction](./examples/tgcn.md) || | | |

-## 2. Contribution Guide
+## 2. Running Guide

-At the beginning of each public example document, we provide the reference accuracy from GPU training and the corresponding pretrained model weights. To run on a specific hardware, follow these steps:
+For each hardware platform already supported by PaddleScience, we provide a running example below, using [1D Euler beam deformation](./examples/euler_beam.md) as the demonstration case.

-1. Add one line of code at the beginning of the example to set the Paddle runtime device to the current hardware
+!!! note

-    ``` py hl_lines="3"
-    import paddle
+    Make sure you have correctly installed the PaddlePaddle build for your compute hardware; otherwise, refer to [PaddleCustomDevice](https://github.com/PaddlePaddle/PaddleCustomDevice) to integrate your hardware into Paddle first.

-    paddle.set_device("your_device_name")
+=== "NVIDIA"

-    # original example code
+    ``` sh
+    # Install PaddleScience
+    git clone -b develop https://github.com/PaddlePaddle/PaddleScience.git
+    # If cloning from GitHub is slow, clone from Gitee instead
+    # git clone -b develop https://gitee.com/paddlepaddle/PaddleScience.git
+
+    cd PaddleScience
+
+    # install paddlesci with editable mode
+    python -m pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple
+    cd examples/euler_beam
+    ```
+
+    === "Model training command"
+
+        ``` sh
+        python euler_beam.py
+        ```
+
+    === "Model evaluation command"
+
+        ``` sh
+        python euler_beam.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/euler_beam/euler_beam_pretrained.pdparams
+        ```
+
+    === "Model export command"
+
+        ``` sh
+        python euler_beam.py mode=export
+        ```
+
+    === "Model inference command"
+
+        ``` sh
+        python euler_beam.py mode=infer
+        ```
+
+=== "Hygon"
+
+    ``` sh
+    # Install PaddleScience
+    git clone -b develop https://github.com/PaddlePaddle/PaddleScience.git
+    # If cloning from GitHub is slow, clone from Gitee instead
+    # git clone -b develop https://gitee.com/paddlepaddle/PaddleScience.git
+
+    cd PaddleScience
+
+    # install paddlesci with editable mode
+    python -m pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple
+    cd examples/euler_beam
+    ```
+
+    === "Model training command"
+
+        ``` sh
+        python euler_beam.py
+        ```
+
+    === "Model evaluation command"
+
+        ``` sh
+        # Evaluate a model you trained yourself
+        python euler_beam.py mode=eval EVAL.pretrained_model_path=$YOUR_MODEL_PATH
+        # Evaluate the official pretrained model
+        python euler_beam.py mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/euler_beam/euler_beam_pretrained.pdparams
+        ```
+
+    === "Model export command"
+
+        ``` sh
+        python euler_beam.py mode=export
+        ```
+
+    === "Model inference command"
+
+        ``` sh
+        python euler_beam.py mode=infer
+        ```
+
+=== "Taichu"
+
+    ``` sh
+    # Install PaddleScience
+    git clone -b develop https://github.com/PaddlePaddle/PaddleScience.git
+    # If cloning from GitHub is slow, clone from Gitee instead
+    # git clone -b develop https://gitee.com/paddlepaddle/PaddleScience.git
+
+    cd PaddleScience
+
+    # install paddlesci with editable mode
+    python -m pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple
+    cd examples/euler_beam
     ```

-2. Following the example document, prepare the dataset and run full training on the target hardware; save the training logs and record the best model accuracy and weights, which are usually saved automatically in the example folder during training
+    === "Model training command"
+
+        ``` sh
+        python euler_beam.py device=sdaa
+        ```
+
+    === "Model evaluation command"
+
+        ``` sh
+        # Evaluate a model you trained yourself
+        python euler_beam.py device=sdaa mode=eval EVAL.pretrained_model_path=$YOUR_MODEL_PATH
+        # Evaluate the official pretrained model
+        python euler_beam.py device=sdaa mode=eval EVAL.pretrained_model_path=https://paddle-org.bj.bcebos.com/paddlescience/models/euler_beam/euler_beam_pretrained.pdparams
+        ```
+
+    === "Model export command"
+
+        ``` sh
+        python euler_beam.py mode=export
+        ```
+
+    === "Model inference command"
+
+        ``` sh
+        python euler_beam.py mode=infer INFER.device=sdaa
+        ```
+
+=== "MetaX"
+
+    ``` sh
+    # Install PaddleScience
+    git clone -b develop https://github.com/PaddlePaddle/PaddleScience.git
+    # If cloning from GitHub is slow, clone from Gitee instead
+    # git clone -b develop https://gitee.com/paddlepaddle/PaddleScience.git
+
+    cd PaddleScience
+
+    # install paddlesci with editable mode
+    python -m pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple
+    cd examples/euler_beam
+    ```
+
+    === "Model training command"
+
+        ``` sh
+        TODO
+        ```
+
+    === "Model evaluation command"
+
+        ``` sh
+        # Evaluate a model you trained yourself
+        TODO
+        # Evaluate the official pretrained model
+        TODO
+        ```
+
+    === "Model export command"
+
+        ``` sh
+        TODO
+        ```
+
+    === "Model inference command"
+
+        ``` sh
+        TODO
+        ```
+
+## 3. Contribution Guide
+
+At the beginning of each public example document, we provide the reference accuracy from NVIDIA CUDA training and the corresponding pretrained model weights. To add support for a specific hardware, follow these steps:
+
+1. If your hardware type is not yet integrated with PaddlePaddle, refer to the official [PaddleCustomDevice](https://github.com/PaddlePaddle/PaddleCustomDevice) documentation to integrate it into the Paddle framework. If your hardware type is already integrated with PaddlePaddle but is not yet in PaddleScience's supported-hardware list, add your hardware type in [ppsci/utils/config.py](https://github.com/PaddlePaddle/PaddleScience/blob/develop/ppsci/utils/config.py#L215) and [deploy/python_infer/base.py](https://github.com/PaddlePaddle/PaddleScience/blob/develop/deploy/python_infer/base.py#L217).
+
+2. Prepare the necessary datasets following the steps in the example document.
+
+3. If the model document provides a training command, run full training on your hardware, save the training logs, and record the best model accuracy and weights; these are usually saved automatically in the example folder during training.
+
+4. If the model document provides an evaluation command, evaluate the best model saved in step 3 on your hardware, save the evaluation logs, and record the evaluation accuracy; these are usually saved automatically in the example folder during evaluation.
+
+    !!! note
+
+        For full-training accuracy, the best accuracy is by default required to align with the NVIDIA CUDA reference. Specifically, if the example's metric is a relative error (such as L2 relative error), it must not deviate from the reference value by more than ±0.5%; if the metric is an error such as MSE/MAE, it should stay within the same order of magnitude as the reference value.

-3. If the model document provides export and inference commands, run them to verify that model export and inference execute correctly on the new hardware and that the results align with GPU inference
+5. If the model document provides export and inference commands, run them to verify that model export and inference execute correctly on the new hardware and that the results align with CUDA inference

-4. After completing the steps above, add a support mark (✅) for the model on the given hardware in the table of this document (`docs/zh/multi_device.md`), then submit a PR to PaddleScience
+6. After completing the steps above, add your hardware support mark (✅) for the corresponding model in the table in [1. Hardware Support List](#1), then submit a PR to PaddleScience. Your PR should include at least the following:
+    * A running guide for using models in your hardware environment, added to [2. Running Guide](#2)
+    * The best model weights saved from training (`.pdparams` file)
+    * Training/evaluation run logs (`.log` files)
+    * The software versions used to verify model accuracy, including but not limited to:
+        * PaddlePaddle version
+        * PaddleCustomDevice version (if any)
+    * The machine environment used to verify model accuracy, including but not limited to:
+        * Chip model
+        * OS version
+        * Hardware driver version
+        * Operator library version, etc.

-## 3. More Documentation
+## 4. More Documentation

 For more documentation on multi-hardware adaptation and usage with Paddle, see:
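The commands in this document configure runs through Hydra-style `key=value` overrides such as `device=sdaa` and `INFER.device=sdaa`. As a rough illustration of what such dotted overrides resolve to (the examples actually use Hydra; this parser is a hypothetical simplification):

``` python
def parse_overrides(args):
    """Parse Hydra-style key=value overrides into a nested dict.

    Illustrative only: PaddleScience examples rely on Hydra for this;
    this helper just shows the idea behind overrides like INFER.device=sdaa.
    """
    config = {}
    for arg in args:
        key, _, value = arg.partition("=")
        node = config
        *parents, leaf = key.split(".")
        for part in parents:
            # Walk/create nested sections for dotted keys like "INFER.device".
            node = node.setdefault(part, {})
        node[leaf] = value
    return config

print(parse_overrides(["mode=infer", "INFER.device=sdaa"]))
# {'mode': 'infer', 'INFER': {'device': 'sdaa'}}
```

A top-level key such as `device` lands at the config root, while a dotted key such as `INFER.device` targets only the inference section, which is why the export and inference commands above set them separately.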

ppsci/utils/config.py

Lines changed: 1 addition & 1 deletion

@@ -212,7 +212,7 @@ class InferConfig(BaseModel):
     pdmodel_path: Optional[str] = None
     pdiparams_path: Optional[str] = None
     onnx_path: Optional[str] = None
-    device: Literal["gpu", "cpu", "npu", "xpu", "sdaa"] = "cpu"
+    device: Literal["cpu", "gpu", "npu", "xpu", "sdaa"] = "cpu"
     engine: Literal["native", "tensorrt", "onnx", "mkldnn"] = "native"
     precision: Literal["fp32", "fp16", "int8"] = "fp32"
     ir_optim: bool = True
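`InferConfig` extends pydantic's `BaseModel`, so the `Literal` fields above are enforced when the config is validated. The same guarantee can be sketched without pydantic using a plain dataclass whose `__post_init__` checks every `Literal` field (illustrative code, not from the repository):

``` python
from dataclasses import dataclass, fields
from typing import Literal, Optional, get_args, get_origin, get_type_hints

@dataclass
class InferConfig:
    # Fields mirrored from ppsci/utils/config.py; the __post_init__ validation
    # below is an illustrative stand-in for what pydantic does automatically.
    pdmodel_path: Optional[str] = None
    pdiparams_path: Optional[str] = None
    onnx_path: Optional[str] = None
    device: Literal["cpu", "gpu", "npu", "xpu", "sdaa"] = "cpu"
    engine: Literal["native", "tensorrt", "onnx", "mkldnn"] = "native"
    precision: Literal["fp32", "fp16", "int8"] = "fp32"
    ir_optim: bool = True

    def __post_init__(self):
        # Reject any value outside the Literal choices, mimicking pydantic's
        # validation of Literal-typed fields.
        hints = get_type_hints(type(self))
        for f in fields(self):
            hint = hints[f.name]
            if get_origin(hint) is Literal:
                allowed = get_args(hint)
                value = getattr(self, f.name)
                if value not in allowed:
                    raise ValueError(f"{f.name} must be one of {allowed}, got {value!r}")

print(InferConfig(device="sdaa").device)  # prints "sdaa"
```

With either approach, a typo such as `device="sdda"` fails fast at config construction rather than deep inside predictor setup.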
