@@ -20,16 +20,24 @@ For detailed environment setup, please refer to the [Ascend PyTorch installation
 ## Environment Preparation
 
 Experiment Environment: 8 * Ascend 910B3 64G
-
+### Environment Installation
 ```shell
 # Create a new conda virtual environment (optional)
 conda create -n swift-npu python=3.10 -y
 conda activate swift-npu
 
+# Note: before proceeding with the steps below, source the CANN toolkit environment first
+source /usr/local/Ascend/ascend-toolkit/set_env.sh
+
 # Set pip global mirror (optional, to speed up downloads)
 pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
 pip install ms-swift -U
 
+# Alternatively, install from source
+git clone https://github.com/modelscope/ms-swift.git
+cd ms-swift
+pip install -e .
+
 # Install torch-npu
 pip install torch-npu decorator
 # If you want to use deepspeed (to control memory usage, training speed might decrease)
@@ -41,8 +49,20 @@ pip install evalscope[opencompass]
 # If you need to use vllm-ascend for inference, please install the following packages
 pip install vllm==0.11.0
 pip install vllm-ascend==0.11.0rc3
+```
+
+Check that the environment is installed correctly and that the NPU can be loaded properly.
+```python
+from transformers.utils import is_torch_npu_available
+import torch
 
-# If you need to use MindSpeed (Megatron-LM), please install the following packages
+print(is_torch_npu_available())  # True
+print(torch.npu.device_count())  # 8
+print(torch.randn(10, device='npu:0'))
+```
+
+**If you need to use MindSpeed (Megatron-LM), please follow the guide below to install the necessary dependencies**
+```shell
 # 1. Obtain and switch Megatron-LM to core_v0.12.1
 git clone https://github.com/NVIDIA/Megatron-LM.git
 cd Megatron-LM
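The install block above sources the CANN setup script from its default prefix. As a side note, a small guard can make a missing toolkit fail loudly instead of being skipped silently; this is a sketch, not part of the commit — the `CANN_ENV` variable name is illustrative, and the path should be adjusted for custom installs:

```shell
# Source the CANN environment only if the toolkit is actually present
# at the default install prefix; otherwise report where it was expected.
CANN_ENV=/usr/local/Ascend/ascend-toolkit/set_env.sh
if [ -f "$CANN_ENV" ]; then
    . "$CANN_ENV"
    echo "CANN environment loaded from $CANN_ENV"
else
    echo "CANN toolkit not found at $CANN_ENV" >&2
fi
```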
@@ -60,17 +80,11 @@ cd ..
 export PYTHONPATH=$PYTHONPATH:<your_local_megatron_lm_path>
 export MEGATRON_LM_PATH=<your_local_megatron_lm_path>
 ```
-
-Check if the test environment is installed correctly and whether the NPU can be loaded properly.
-```python
-from transformers.utils import is_torch_npu_available
-import torch
-
-print(is_torch_npu_available())  # True
-print(torch.npu.device_count())  # 8
-print(torch.randn(10, device='npu:0'))
+Run the following command to verify that MindSpeed (Megatron-LM) is configured successfully:
+```shell
+python -c "import mindspeed.megatron_adaptor; from swift.megatron.init import init_megatron_env; init_megatron_env(); print('✓ NPU environment Megatron-SWIFT configuration verified successfully!')"
 ```
-
+### Checking the Environment
 Check the P2P connections of the NPU, where we can see that each NPU is interconnected through 7 HCCS links with other NPUs.
 ```shell
 (valle) root@valle:~/src# npu-smi info -t topo
@@ -95,7 +109,7 @@ Legend:
   NA = Unknown relationship.
 ```
 
-Check the status of the NPU. Detailed information about the `npu-smi` command can be found in the [official documentation](https://support.huawei.com/enterprise/zh/doc/EDOC1100079287/10dcd668).
+Check the status of the NPU. For detailed information about the `npu-smi` command, please refer to the [official documentation](https://support.huawei.com/enterprise/en/doc/EDOC1100079287/10dcd668).
 ```shell
 (valle) root@valle:~/src# npu-smi info
 +------------------------------------------------------------------------------------------------+
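Commands in this document pin a process to specific cards with `ASCEND_RT_VISIBLE_DEVICES` (for example, the `swift deploy` command below uses card 0). The variable takes a comma-separated list of NPU indices, analogous to `CUDA_VISIBLE_DEVICES` on GPU systems; a minimal sketch with an illustrative card list:

```shell
# Expose only NPUs 0-3 to the current process; other cards stay hidden.
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3
echo "$ASCEND_RT_VISIBLE_DEVICES"
```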
@@ -345,6 +359,6 @@ ASCEND_RT_VISIBLE_DEVICES=0 swift deploy \
 | Using sglang as inference engine |
 
 
-## NPU Wechat Group
+## NPU WeChat Group
 
 <img src="https://raw.githubusercontent.com/modelscope/ms-swift/main/docs/resources/wechat/npu.png" width="250">