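Environment Setup
Base Environment Setup
Launching the Image
We recommend installing inside the container image, although you can also install on bare metal.
First, pull the image that matches your system architecture:
Start the image: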
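Installing a Newer CANN
The CANN suite shipped in the image is outdated, so you need to reinstall CANN Toolkit, CANN Kernels, and NNAL at version 8.1.RC1 or later. The three packages must be matching versions; 8.2.RC1 is recommended. Choose the correct CPU architecture, and note that CANN Kernels packages are hardware-specific, so pick the one for your device. After downloading, install them in the following order:
Configuring Environment Variables
Set the following environment variables before running: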
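In addition, the default device memory allocation strategy is naive_best_fit. You can optionally set Paddle's allocation strategy to auto_growth, which claims host and device memory only as the actual data requires it, at the cost of possible fragmentation; see the linked documentation for details. At present, for reasons that are not yet understood, running without auto_growth exhausts device memory, so please also set the following environment variable:
export FLAGS_allocator_strategy=auto_growth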
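Because a missing FLAGS_allocator_strategy only shows up later as an out-of-memory failure, it can help to check the variable in the shell that will launch the job. The following is a minimal sketch; check_allocator_strategy is a hypothetical helper written for this guide, not something provided by Paddle or CANN.

```shell
# Hypothetical helper (not part of Paddle): report whether the allocator
# strategy visible in the current shell is auto_growth.
check_allocator_strategy() {
  current_strategy="${FLAGS_allocator_strategy:-unset}"
  if [ "$current_strategy" = "auto_growth" ]; then
    echo "ok: auto_growth"
  else
    # A non-zero status lets a launch script stop early.
    echo "warning: FLAGS_allocator_strategy is ${current_strategy}"
    return 1
  fi
}

export FLAGS_allocator_strategy=auto_growth
check_allocator_strategy
```

Calling the helper at the top of a launch script makes the failure mode explicit instead of leaving it to surface as an allocation error at run time.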
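Python Environment Setup
Installing Paddle
Install with the following command (newer versions of paddlepaddle conflict with paddleformers, so version 3.1 is recommended here):
See the Ascend NPU installation guide for details.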
Installing Third-Party Libraries
Before compiling PaddleCustomDevice, install the third-party libraries spdlog and json:
Installing PaddleCustomDevice
git clone https://github.com/PaddlePaddle/PaddleCustomDevice.git
cd PaddleCustomDevice/backends/npu
bash tools/compile.sh
After the build finishes, run the following command to install the wheel:
pip install build/dist/paddle_custom_npu-*.whl --force-reinstall
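Manually install this PR:
If you later hit the error "please make sure you registered your op first and try again", go back after the manual installation and reinstall, over the top of it, the whl built from mainline PaddlePaddle/PaddleCustomDevice.
Installing PaddleNLP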
Clone from source:
Then, in the csrc/npu directory, follow README.md to install:
python setup.py build bdist_wheel
pip install dist/paddlenlp_ops*.whl
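Building FastDeploy
At run time you may hit the following error:
You can work around it by editing line 22 of /usr/local/lib/python3.10/dist-packages/paddleformers/utils/pdc_sdk.py, which reads "from distutils.dir_util import copy_tree", and changing it to:
Before running, add the corresponding FastDeploy directory to PYTHONPATH:
If you hit the error "libgomp cannot allocate memory in static TLS block", it can be resolved as follows:
If you hit a circular import problem and are not running multimodal models, you can temporarily uninstall opencv. Also note that numpy 2.0 is currently poorly supported, so as a final step force-install numpy 1.26.4:
If you encounter:
First look up the hostname:
Then add the hostname found above to /etc/hosts: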
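The exact commands for this last step were not preserved here, so the following is only a sketch of the idea under stated assumptions: the hostname is obtained with uname -n, and it is mapped to 127.0.0.1. Both choices, and the add_host_entry helper itself, are illustrative rather than taken from the original guide. The example runs against a scratch file so that nothing is written to the real /etc/hosts.

```shell
# Hypothetical helper: append "<ip> <hostname>" to a hosts file unless the
# hostname already appears in it. The default IP 127.0.0.1 is an assumption.
add_host_entry() {
  hosts_file="$1"
  host_name="$2"
  host_ip="${3:-127.0.0.1}"
  # -F treats the hostname literally, -w requires a whole-word match.
  if grep -qwF -- "$host_name" "$hosts_file"; then
    echo "already present: $host_name"
  else
    printf '%s %s\n' "$host_ip" "$host_name" >> "$hosts_file"
    echo "added: $host_ip $host_name"
  fi
}

# Dry run against a scratch file instead of the real /etc/hosts.
scratch_hosts="$(mktemp)"
printf '127.0.0.1 localhost\n' > "$scratch_hosts"
add_host_entry "$scratch_hosts" "$(uname -n)"
cat "$scratch_hosts"
rm -f "$scratch_hosts"
```

To apply it for real, the same function would be pointed at /etc/hosts from a shell with root privileges, after confirming that the address is the right one for your machine.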