Replies: 1 comment
@mingewang |
Hi,
I noticed a difference in model_config.yaml between Parakeet v2 and Parakeet v3 models:
Parakeet v2:
train_ds:
  max_duration: 40
validation_ds:
  max_duration: 40

Parakeet v3:
train_ds:
  max_duration: 10
validation_ds:
  max_duration: 30
Does this mean that Parakeet v3 was trained with training samples truncated to 10 seconds, while Parakeet v2 was trained with up to 40-second samples?
If so, what was the motivation for reducing the training max_duration in v3?
When I try to increase train_ds.max_duration for Parakeet v3 to 30 seconds, I easily run into CUDA OOM errors on an A100 (80GB).
Are there recommended settings (batch size, gradient accumulation, bucketing strategy, etc.) to safely train v3 with longer utterances?
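For reference, here is the back-of-the-envelope arithmetic behind the batch-size and gradient-accumulation part of the question. It is only a sketch under a simplifying assumption: that memory use scales roughly with the padded seconds of audio per batch (attention layers can grow faster than linearly, so this is optimistic). The helper names and the 320-second budget (32 clips of 10 s) are hypothetical and are not taken from NeMo or from the Parakeet configs.

```python
def padded_batch_seconds(durations, batch_size):
    """Padded audio seconds per batch, with each batch padded to its longest clip."""
    batches = [durations[i:i + batch_size] for i in range(0, len(durations), batch_size)]
    return [len(b) * max(b) for b in batches]


def batch_size_for_budget(max_duration, budget_seconds):
    """Largest batch size whose worst-case padded audio stays within the budget."""
    return max(1, int(budget_seconds // max_duration))


def accumulation_steps(target_batch, per_step_batch):
    """Gradient-accumulation steps needed to reach the target effective batch size."""
    return -(-target_batch // per_step_batch)  # ceiling division


# Hypothetical budget: whatever fits at max_duration=10 with 32 clips per batch.
budget = 32 * 10

per_step = batch_size_for_budget(max_duration=30, budget_seconds=budget)
steps = accumulation_steps(target_batch=32, per_step_batch=per_step)
print(f"batch size at 30 s: {per_step}, accumulation steps: {steps}")

# Mixing short and long clips in one batch wastes memory on padding,
# which is why grouping clips of similar length (bucketing) should help.
print(padded_batch_seconds([5, 10, 30, 8], batch_size=2))
```

Under these assumptions, raising the limit from 10 s to 30 s means cutting the per-step batch to about a third and making up the difference with accumulation, but I do not know whether that matches how v3 was actually trained.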
Any guidance on best practices for handling longer audio when fine-tuning Parakeet v3 would be appreciated.
Thanks!