Qwen3-VL 8B training running excessively slow #9646
Hello everyone, I am trying to perform distributed LoRA training of the Qwen3-VL 8B model on my custom data. Here is my training config; I used a similar config for my Qwen2.5-VL 7B experiments and it ran perfectly fine. However, with Qwen3-VL the training is exorbitantly slow. Moreover, I have to set FORCE_TORCH_RUN=1 in order for the training to begin correctly. The issue I am observing is that VRAM utilisation is above 85% but GPU utilisation stays below 40% on both GPUs.

What I have tried

Happy to share more details if required :)
Replies: 1 comment
Check your torch version; make sure it is not 2.9.0 or newer.
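A minimal sketch of the suggested check, assuming only that the reply's ceiling of 2.9.0 applies; the helper names (`version_tuple`, `torch_too_new`) are hypothetical, and with torch installed you would pass `torch.__version__` instead of the literal strings:

```python
def version_tuple(v: str) -> tuple:
    """Parse a version string like '2.9.0+cu121' into a comparable tuple."""
    core = v.split("+")[0]  # drop a local build suffix such as '+cu121'
    return tuple(int(p) for p in core.split(".")[:3])

def torch_too_new(installed: str, ceiling: str = "2.9.0") -> bool:
    """Return True if the installed version is at or above the ceiling."""
    return version_tuple(installed) >= version_tuple(ceiling)

# With torch installed: torch_too_new(torch.__version__)
print(torch_too_new("2.9.0+cu121"))  # 2.9.0 or newer -> downgrade suggested
print(torch_too_new("2.8.1"))        # below 2.9.0 -> within suggested range
```

If this returns True, the reply's suggestion is to downgrade (e.g. `pip install "torch<2.9.0"`, exact spec depending on your CUDA build).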