Commit e2ea8c4

chore: Update training script for LLaVA-NeXT video models
1 parent acef4af commit e2ea8c4

3 files changed: +5 -3 lines changed

docs/LLaVA_Video_1003.md

Lines changed: 1 addition & 1 deletion
@@ -84,7 +84,7 @@ print(text_outputs)
 
 ## Training
 
-[[Scripts]](/Users/zhangyuanhan/Desktop/LLaVA-NeXT/scripts/video/train): Start training models on your single-image/multi-image/video data.
+[[Scripts]](https://github.com/LLaVA-VL/LLaVA-NeXT/blob/yhzhang/video_dev/scripts/video/train/SO400M_Qwen2_72B_ov_to_video_am9_aug6.sh): Start training models on your single-image/multi-image/video data.
 
 
 ## Evaluation Guidance

scripts/video/train/SO400M_Qwen2_72B_ov_to_video_am9.sh

Lines changed: 2 additions & 1 deletion
@@ -31,7 +31,8 @@ echo "PREV_STAGE_CHECKPOINT: ${PREV_STAGE_CHECKPOINT}"
 echo "MID_RUN_NAME: ${MID_RUN_NAME}"
 
 
-ACCELERATE_CPU_AFFINITY=1 torchrun --nproc_per_node="${ARNOLD_WORKER_GPU}" --nnodes="${ARNOLD_WORKER_NUM}" --node_rank="${ARNOLD_ID}" --master_addr="${METIS_WORKER_0_HOST}" --master_port="${port_in_cmd}" \
+# ACCELERATE_CPU_AFFINITY=1 torchrun --nproc_per_node="${ARNOLD_WORKER_GPU}" --nnodes="${ARNOLD_WORKER_NUM}" --node_rank="${ARNOLD_ID}" --master_addr="${METIS_WORKER_0_HOST}" --master_port="${port_in_cmd}" \
+deepspeed --master_port 30000 \
 llava/train/train_mem.py \
 --deepspeed scripts/zero3.json \
 --model_name_or_path $PREV_STAGE_CHECKPOINT \

scripts/video/train/SO400M_Qwen2_7B_ov_to_video_am9.sh

Lines changed: 2 additions & 1 deletion
@@ -31,7 +31,8 @@ echo "PREV_STAGE_CHECKPOINT: ${PREV_STAGE_CHECKPOINT}"
 echo "MID_RUN_NAME: ${MID_RUN_NAME}"
 
 
-ACCELERATE_CPU_AFFINITY=1 torchrun --nproc_per_node="${ARNOLD_WORKER_GPU}" --nnodes="${ARNOLD_WORKER_NUM}" --node_rank="${ARNOLD_ID}" --master_addr="${METIS_WORKER_0_HOST}" --master_port="${port_in_cmd}" \
+# ACCELERATE_CPU_AFFINITY=1 torchrun --nproc_per_node="${ARNOLD_WORKER_GPU}" --nnodes="${ARNOLD_WORKER_NUM}" --node_rank="${ARNOLD_ID}" --master_addr="${METIS_WORKER_0_HOST}" --master_port="${port_in_cmd}" \
+deepspeed --master_port 30000 \
 llava/train/train_mem.py \
 --deepspeed scripts/zero3.json \
 --model_name_or_path $PREV_STAGE_CHECKPOINT \
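
Both training scripts swap the multi-node `torchrun` launcher for a `deepspeed` launcher on a fixed port by commenting one line out. As a possible alternative, and not something this commit does, the launcher could be picked at run time. The sketch below assumes the cluster variables from the removed line (`ARNOLD_WORKER_GPU`, `ARNOLD_WORKER_NUM`, `ARNOLD_ID`, `METIS_WORKER_0_HOST`, `port_in_cmd`) are either all set or all absent; `launcher_cmd` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: choose the launcher at run time instead of commenting one out.
# launcher_cmd is a hypothetical helper, not part of the repository.

launcher_cmd() {
    # port_in_cmd is the variable used by the removed torchrun line;
    # 30000 is the port hard-coded by this commit.
    port="${port_in_cmd:-30000}"
    if [ -n "${ARNOLD_WORKER_NUM:-}" ] && [ -n "${METIS_WORKER_0_HOST:-}" ]; then
        # Multi-node path: same torchrun flags as the removed line.
        printf '%s\n' "torchrun --nproc_per_node=${ARNOLD_WORKER_GPU} --nnodes=${ARNOLD_WORKER_NUM} --node_rank=${ARNOLD_ID} --master_addr=${METIS_WORKER_0_HOST} --master_port=${port}"
    else
        # Fallback: the deepspeed launcher introduced by this commit.
        printf '%s\n' "deepspeed --master_port ${port}"
    fi
}

# Show which launcher would be used in the current environment.
echo "launcher: $(launcher_cmd)"
```

Inside a training script, the unquoted expansion `$(launcher_cmd) llava/train/train_mem.py --deepspeed scripts/zero3.json` would rely on word splitting to pass the flags, which holds as long as none of the values contain spaces. `ACCELERATE_CPU_AFFINITY=1` would need a separate `export`, since an assignment produced by command substitution is not treated as an environment prefix.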
