Replies: 1 comment
It appears that the model is producing token 0 (which gets decoded into "!"). A simple workaround would be to stop sampling at the <|endoftext|> token.
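A quick way to confirm this (a minimal sketch; it assumes the openai/whisper-small checkpoint used in the blog, so adjust to whichever checkpoint you fine-tuned):

```python
from transformers import WhisperTokenizer

tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-small")

print(tokenizer.decode([0]))   # "!" -- token id 0 is the exclamation mark
print(tokenizer.eos_token)     # "<|endoftext|>", where generation should stop
print(tokenizer.eos_token_id)  # 50257 for the multilingual checkpoints
```

If generation keeps emitting token 0 instead of reaching <|endoftext|>, the decoded transcription comes out as a run of exclamation marks.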
Hi, after fine-tuning Whisper following the blog (https://huggingface.co/blog/fine-tune-whisper#load-whisperfeatureextractor), I found that the eval_wer and eval_cer scores are bad.
Because of that, I checked the pred_str results, and some of the predicted sentences are just strings of exclamation marks. I also checked the audio encoding, and the audio files are okay.
Is there any reason fine-tuning would return only exclamation marks?
These are my training arguments:
"training_args = Seq2SeqTrainingArguments(
output_dir="./whisper-small_output2", # change to a repo name of your choice
per_device_train_batch_size=16,
gradient_accumulation_steps=8, # increase by 2x for every 2x decrease in batch size
learning_rate=1e-5,
warmup_steps=50,
max_steps=4000,
gradient_checkpointing=False,
fp16=True,
tf32=True,
dataloader_num_workers=4,
evaluation_strategy="steps",
per_device_eval_batch_size=8,
predict_with_generate=True,
generation_max_length=225,
save_steps=50,
eval_steps=50,
logging_steps=1,
report_to=["tensorboard","wandb"], #, "wandb"
load_best_model_at_end=True,
metric_for_best_model="wer",
greater_is_better=False,
#push_to_hub=True,
)"