Commit 97b78bf

Update moe_tune_script.sh (ROCm#507)
add RAY_EXPERIMENTAL_NOSET_ROCR_VISIBLE_DEVICES=1
1 parent 8826599 commit 97b78bf

1 file changed, +8 -2 lines changed

benchmarks/kernels/moe_tune_script.sh

Lines changed: 8 additions & 2 deletions
@@ -1,7 +1,7 @@
 #!/bin/bash
 
 export HIP_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
-
+export RAY_EXPERIMENTAL_NOSET_ROCR_VISIBLE_DEVICES=1
 
 ## ---- Mixtral fp8 tuning example ---- ##
 python benchmark_moe.py --model /data/models/mistral-ai-models/Mixtral-8x22B-Instruct-v0.1-FP8/ --tp-size 1 --tune --dtype fp8_w8a8
@@ -30,4 +30,10 @@ python benchmark_moe.py --model /data/models/mistral-ai-models/Mixtral-8x22B-v0.
 
 ## ---- Notes ---- ##
 # 1. The tuned file is specific for a TP size. This means a tuned file obtained for --tp-size 8 can only be used when running the model under TP=8 setting.
-# 2. The script uses Ray for multi-gpu tuning. Export HIP_VISIBLE_DEVICES accordingly to expose the required no. of GPUs and use multiple gpus for tuning.
+# 2. The script uses Ray for multi-gpu tuning. Export HIP_VISIBLE_DEVICES accordingly to expose the required no. of GPUs and use multiple gpus for tuning.
+# 3. RAY_EXPERIMENTAL_NOSET_ROCR_VISIBLE_DEVICES=1 resolves the following errors (depending on if HIP_VISIBLE_DEVICES is set or not):
+# - Error-1: RuntimeError: HIP error: invalid device ordinal
+# HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
+# For debugging consider passing AMD_SERIALIZE_KERNEL=3
+# - Error-2: RuntimeError: HIP_VISIBLE_DEVICES contains more devices than ROCR_VISIBLE_DEVICES
+
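
Notes 2 and 3 in the diff can be illustrated with a small preflight check run before the tuning commands. This is only a sketch, not part of the commit: `count_visible_devices` and `preflight_check` are hypothetical helpers introduced here, and the check assumes the only requirement is that `RAY_EXPERIMENTAL_NOSET_ROCR_VISIBLE_DEVICES` equals `1`, as the patched script sets it.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): count the entries in a
# comma-separated device list such as HIP_VISIBLE_DEVICES.
count_visible_devices() {
    _list="$1"
    _count=0
    _old_ifs=$IFS
    IFS=','
    # Split the list on commas and count the non-empty entries.
    for _id in $_list; do
        if [ -n "$_id" ]; then
            _count=$((_count + 1))
        fi
    done
    IFS=$_old_ifs
    echo "$_count"
}

# Hypothetical helper (not in the commit): return non-zero when the Ray
# variable added by this commit is missing or not set to 1.
preflight_check() {
    if [ "${RAY_EXPERIMENTAL_NOSET_ROCR_VISIBLE_DEVICES:-}" != "1" ]; then
        echo "warning: RAY_EXPERIMENTAL_NOSET_ROCR_VISIBLE_DEVICES is not set to 1" >&2
        return 1
    fi
    return 0
}

# The same two exports as the patched moe_tune_script.sh.
export HIP_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export RAY_EXPERIMENTAL_NOSET_ROCR_VISIBLE_DEVICES=1

if preflight_check; then
    echo "GPUs exposed for tuning: $(count_visible_devices "$HIP_VISIBLE_DEVICES")"
    # The benchmark_moe.py tuning commands from the script would follow here.
fi
```

To tune on fewer GPUs, shorten the `HIP_VISIBLE_DEVICES` list. The count is informational only and does not query the hardware.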
