|
| 1 | +# Ascend Quickstart |
| 2 | + |
| 3 | +Last updated: 2025-11-28 |
| 4 | + |
| 5 | +We have added support for Huawei Ascend devices in VeOmni. |
| 6 | + |
| 7 | +### Environment Requirements |
| 8 | + |
| 9 | +| Software  | Version         | |
| 10 | +| --------- | --------------- | |
| 11 | +| Python    | >= 3.10, < 3.12 | |
| 12 | +| CANN | == 8.3.RC1 | |
| 13 | +| torch | == 2.7.1 | |
| 14 | +| torch_npu | == 2.7.1 | |
| 15 | + |
| 16 | +Please refer to this [document](https://gitcode.com/Ascend/pytorch) for basic environment setup. |
| 17 | + |
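Before installing anything, it can help to confirm that the local interpreter falls inside the Python range listed in the table. The following is a minimal sketch, not part of VeOmni: `python_version_ok` is a hypothetical helper name, and the check assumes a `python3` executable is on the `PATH`.

```bash
# Hypothetical helper (not part of VeOmni): succeeds when a
# "major.minor[.patch]" version string lies in the range [3.10, 3.12).
python_version_ok() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second version component
    [ "$major" = "3" ] && [ "$minor" -ge 10 ] 2>/dev/null && [ "$minor" -lt 12 ]
}

# Assumes python3 is on PATH; an empty result is treated as "not OK".
current=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null)
if python_version_ok "$current"; then
    echo "Python $current is within the supported range"
else
    echo "Python must be >= 3.10 and < 3.12 (found: ${current:-none})" >&2
fi
```

The CANN, torch, and torch_npu versions are not covered by this check and still need to be verified separately.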
| 18 | +### Installing Dependencies with uv |
| 19 | + |
| 20 | +#### 1. Clone the repository and enter the VeOmni root directory |
| 21 | + |
| 22 | + git clone https://github.com/ByteDance-Seed/VeOmni.git |
| 23 | + cd VeOmni |
| 24 | + |
| 25 | +#### 2. Pin the Python version |
| 26 | + |
| 27 | + uv python pin 3.11 |
| 28 | + |
| 29 | +#### 3. (Optional) Set timeout |
| 30 | + |
| 31 | +If the network is unstable, increase the HTTP timeout (in seconds) to avoid download failures by setting the `UV_HTTP_TIMEOUT` environment variable: |
| 32 | + |
| 33 | + export UV_HTTP_TIMEOUT=60 |
| 34 | + |
| 35 | +#### 4. Install the environment using uv |
| 36 | + |
| 37 | + uv sync --extra npu --allow-insecure-host github.com --allow-insecure-host pythonhosted.org |
| 38 | + |
| 39 | +#### 5. Use the environment |
| 40 | + |
| 41 | +After installation, a `.venv` folder appears in the VeOmni project root. This is the virtual environment created by uv. |
| 42 | +Activate it with: |
| 43 | + |
| 44 | + source .venv/bin/activate |
| 45 | + |
| 46 | +Check installed dependencies: |
| 47 | + |
| 48 | + uv pip list |
| 49 | + |
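Beyond listing packages, one way to sanity-check the result is to compare the installed torch and torch_npu versions, which the requirements table pins to the same release. This is a rough sketch under stated assumptions: `base_version` is a hypothetical helper, the activated environment is assumed to expose `python`, and `torch_npu` is assumed to provide `__version__` and to register the `torch.npu` namespace on import.

```bash
# Hypothetical helper: reduce "2.7.1+cpu" or "2.7.1.post1" to "2.7.1".
base_version() {
    v=${1%%+*}                   # drop a local suffix such as "+cpu"
    echo "$v" | cut -d. -f1-3    # keep major.minor.patch only
}

if python -c "import torch, torch_npu" 2>/dev/null; then
    torch_v=$(python -c "import torch; print(torch.__version__)")
    npu_v=$(python -c "import torch_npu; print(torch_npu.__version__)")
    if [ "$(base_version "$torch_v")" = "$(base_version "$npu_v")" ]; then
        echo "torch $torch_v and torch_npu $npu_v match"
    else
        echo "version mismatch: torch $torch_v vs torch_npu $npu_v" >&2
    fi
    # Assumption: importing torch_npu makes torch.npu available.
    python -c "import torch, torch_npu; print('NPU available:', torch.npu.is_available())"
else
    echo "torch and torch_npu are not importable in this environment" >&2
fi
```

If the import fails, the script only reports it; rerunning the `uv sync` step is the likely next move.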
| 50 | +### Quick Start |
| 51 | + |
| 52 | +1. Prepare the model and dataset. |
| 53 | + |
| 54 | +2. Set the `NPROC_PER_NODE` parameter in `train.sh` according to the number of available NPUs. |
| 55 | + |
| 56 | +3. Run the training script: |
| 57 | + |
| 58 | +```bash |
| 59 | +# Set environment variables |
| 60 | +export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3 |
| 61 | +export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True |
| 62 | +export MULTI_STREAM_MEMORY_REUSE=2 |
| 63 | + |
| 64 | +bash train.sh tasks/train_torch.py configs/sft/qwen3_sft.yaml |
| 65 | +``` |
| 66 | + |
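The value for `NPROC_PER_NODE` in step 2 would normally equal the number of devices listed in `ASCEND_RT_VISIBLE_DEVICES`. A small sketch of deriving it is shown here; `count_visible_npus` is a hypothetical helper, and whether `train.sh` reads `NPROC_PER_NODE` from the environment is not confirmed by this guide, so the printed value may need to be copied into the script by hand.

```bash
# Hypothetical helper: count the comma-separated device IDs in a
# list such as "0,1,2,3".
count_visible_npus() {
    [ -z "$1" ] && { echo 0; return; }
    old_ifs=$IFS
    IFS=,
    set -- $1          # split the list on commas into positional parameters
    IFS=$old_ifs
    echo "$#"
}

export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3
NPROC_PER_NODE=$(count_visible_npus "$ASCEND_RT_VISIBLE_DEVICES")
echo "NPROC_PER_NODE=$NPROC_PER_NODE"
```

Keeping the two values in sync this way avoids launching more processes than there are visible NPUs.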
| 67 | +### Parallelism Support |
| 68 | + |
| 69 | +| Feature | Supported | |
| 70 | +| ---------------- | ----------- | |
| 71 | +| fsdp | ✅ | |
| 72 | +| fsdp2 | ✅ | |
| 73 | +| ulysses parallel | ✅ | |
| 74 | +| expert_parallel | In progress | |