4 files changed: +303 -90 lines changed
examples/models/qwen3_omni
tests/test_align/test_template
# 2 * 60GiB
# mcore shell: https://github.com/modelscope/ms-swift/blob/main/examples/megatron/multimodal/omni/moe.sh
MAX_PIXELS=1003520 \
NPROC_PER_NODE=2 \
VIDEO_MAX_PIXELS=50176 \
FPS_MAX_FRAMES=12 \
CUDA_VISIBLE_DEVICES=0,1 \
swift sft \
    --model Qwen/Qwen3-Omni-30B-A3B-Instruct \
    --dataset 'AI-ModelScope/alpaca-gpt4-data-zh#10000' \
              'AI-ModelScope/LaTeX_OCR:human_handwrite#5000' \
              'swift/VideoChatGPT:Generic#2000' \
              'speech_asr/speech_asr_aishell1_trainsets:validation#5000' \
    --split_dataset_ratio 0.01 \
    --load_from_cache_file true \
    --train_type lora \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --attn_impl flash_attn \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --target_modules all-linear \
    --freeze_vit true \
    --freeze_aligner true \
    --padding_free true \
    --gradient_accumulation_steps 1 \
    --gradient_checkpointing true \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 2 \
    --logging_steps 5 \
    --max_length 4096 \
    --output_dir output \
    --warmup_ratio 0.05 \
    --dataset_num_proc 4 \
    --deepspeed zero3 \
    --dataloader_num_workers 4
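
As a rough sanity check on the flags above, the following sketch estimates the global batch size and the number of optimizer steps in the single epoch. It is not part of the changed files. The variable names are illustrative, and it assumes that every dataset yields the full sample count requested by its `#N` suffix, that no sample is dropped by `--max_length`, and that the 1% validation split rounds down.

```shell
#!/usr/bin/env bash
# Back-of-the-envelope estimate; values are copied from the flags in the script above.
NPROC_PER_NODE=2
PER_DEVICE_TRAIN_BATCH_SIZE=8
GRADIENT_ACCUMULATION_STEPS=1

# Samples requested through the '#N' suffixes of the four datasets (assumed fully available).
TOTAL_SAMPLES=$((10000 + 5000 + 2000 + 5000))

# --split_dataset_ratio 0.01 holds out about 1% for validation (integer rounding assumed).
VAL_SAMPLES=$((TOTAL_SAMPLES / 100))
TRAIN_SAMPLES=$((TOTAL_SAMPLES - VAL_SAMPLES))

# Samples consumed per optimizer step across both processes.
GLOBAL_BATCH=$((NPROC_PER_NODE * PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS))

# Ceiling division: the last, partially filled batch still counts as a step.
STEPS=$(( (TRAIN_SAMPLES + GLOBAL_BATCH - 1) / GLOBAL_BATCH ))

echo "global batch size: ${GLOBAL_BATCH}"
echo "train/val samples: ${TRAIN_SAMPLES}/${VAL_SAMPLES}"
echo "approx. optimizer steps per epoch: ${STEPS}"
```

Under these assumptions the run would take on the order of 1,360 steps, so `--eval_steps 50` and `--save_steps 50` trigger a few dozen evaluations and checkpoints, of which `--save_total_limit 2` keeps only two checkpoints.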