Finetune ErrorInfo: TypeError: decode() got an unexpected keyword argument 'skip_special_tokens' #2276
Unanswered
jxz2021114 asked this question in Q&A
It looks like you are using a Whisper tokenizer here (i.e., the one defined in this repository).
But
Should wtokenizer instead be a HuggingFace Whisper tokenizer?
When I used the official fine-tuning code sample to fine-tune the Whisper model, I got this error:
Traceback (most recent call last):
File "main.py", line 102, in
text = wtokenizer.decode(token, skip_special_tokens=False)
File "/ldata/crj/anaconda3/envs/whisper/lib/python3.8/site-packages/whisper/tokenizer.py", line 166, in decode
return self.encoding.decode(token_ids, **kwargs)
TypeError: decode() got an unexpected keyword argument 'skip_special_tokens'
Here is the code context.
86 # Data loader
87 woptions = whisper.DecodingOptions(language="ja", without_timestamps=True)
88 wmodel = whisper.load_model("base")
89 wtokenizer = whisper.tokenizer.get_tokenizer(True, language="ja", task=woptions.task)
90
91 # Confirm Dataloading
92 dataset = JvsSpeechDataset(eval_audio_transcript_pair_list, wtokenizer, SAMPLE_RATE)
93 loader = torch.utils.data.DataLoader(dataset, batch_size=2, collate_fn=WhisperDataCollatorWhithPadding())
94
95 for b in loader:
96     print(b["labels"].shape)
97     print(b["input_ids"].shape)
98     print(b["dec_input_ids"].shape)
99
100     for token, dec in zip(b["labels"], b["dec_input_ids"]):
101         token[token == -100] = wtokenizer.eot
102         text = wtokenizer.decode(token, skip_special_tokens=False)
103         print(text)
104
105         dec[dec == -100] = wtokenizer.eot
106         text = wtokenizer.decode(dec, skip_special_tokens=False)
107         print(text)
108
109
110     break
Has anyone seen this error before? How can I solve this?
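For anyone hitting the same traceback, here is a minimal sketch of what seems to be going on and one possible workaround. The traceback shows that the tokenizer's decode() forwards its keyword arguments straight to self.encoding.decode(), and that inner method does not accept skip_special_tokens. StubEncoding, StubTokenizer, the placeholder eot value, and the decode_labels helper are all hypothetical names made up for this illustration so that it runs without whisper installed; they are not part of the library.

```python
from typing import Iterable, List


class StubEncoding:
    """Hypothetical stand-in for the inner encoding; it knows no HF-style kwargs."""

    def decode(self, tokens: List[int], errors: str = "replace") -> str:
        # Fake "decoding": just join the ids so the result is easy to inspect.
        return " ".join(str(t) for t in tokens)


class StubTokenizer:
    """Hypothetical stand-in that forwards kwargs the way the traceback shows."""

    eot = 50257  # placeholder id chosen for illustration only

    def __init__(self) -> None:
        self.encoding = StubEncoding()

    def decode(self, token_ids: List[int], **kwargs) -> str:
        # Same shape as the failing line in the traceback.
        return self.encoding.decode(token_ids, **kwargs)


def decode_labels(tokenizer, token_ids: Iterable, ignore_index: int = -100) -> str:
    """Swap ignore_index padding for EOT, then decode without extra kwargs."""
    ids = [int(t) for t in token_ids]  # also works for 0-d tensor elements
    ids = [tokenizer.eot if t == ignore_index else t for t in ids]
    return tokenizer.decode(ids)


tok = StubTokenizer()

# Reproduce the failure: the unknown keyword reaches the inner decode().
try:
    tok.decode([1, 2, 3], skip_special_tokens=False)
except TypeError as exc:
    print("reproduced:", exc)

# Workaround: pass plain ints and no HuggingFace-only keyword.
print(decode_labels(tok, [1, 2, -100, -100]))
```

With the tokenizer from this repository, the equivalent change would presumably be to drop the keyword and pass a plain list, e.g. wtokenizer.decode(token.tolist()). Whether special tokens then show up in the decoded text depends on the installed whisper version, so that part is worth checking on a single sample first.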