Hey @sadhiin - you can likely use large-v3 instead of large-v2, since it's a stronger pre-trained model:

-    --model_name_or_path="openai/whisper-large-v2" \
+    --model_name_or_path="openai/whisper-large-v3" \
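
large-v3 is a drop-in swap for large-v2 on the Transformers side. As a quick sanity check, here's a minimal sketch, assuming your script loads the checkpoint through the standard Transformers classes rather than a custom path:

    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    checkpoint = "openai/whisper-large-v3"

    # The processor picks up the new feature-extractor settings automatically
    # (large-v3 uses 128 mel bins instead of the 80 used by large-v2).
    processor = WhisperProcessor.from_pretrained(checkpoint)
    model = WhisperForConditionalGeneration.from_pretrained(checkpoint)

    print(model.config.num_mel_bins)  # 128 for large-v3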

Assuming you've set up DeepSpeed correctly (which it looks like you have, based on your config), I would recommend reducing your per-device batch size:

    --per_device_train_batch_size="16" \
    --gradient_accumulation_steps="1" \

Note that your effective (global) batch size will be:

per_device_train_batch_size * gradient_accumulation_steps * num_gpus

Plugging in the numbers for your specific config:

16 * 1 * 4 = 64

Which should be sufficient for fine-tuning. If you still hit OOM, reduce…
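
To make the arithmetic explicit, here's the same calculation as a minimal Python sketch (the variable names mirror the training flags above; num_gpus is 4, as in your setup):

    # Effective (global) batch size per optimizer step.
    per_device_train_batch_size = 16
    gradient_accumulation_steps = 1
    num_gpus = 4

    effective_batch_size = (
        per_device_train_batch_size * gradient_accumulation_steps * num_gpus
    )
    print(effective_batch_size)  # 64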

Answer selected by sadhiin