Do I need to set model.config.forced_decoder_ids in training mode? #1853
Unanswered
avishai111
asked this question in
Q&A
Replies: 0 comments
Hi everyone, I have a Whisper model that I want to train so that it works in a specific language, and I have training data in that language.
It seems that during training the model.config.forced_decoder_ids field is always set to None. At inference time, it is set as follows:
model.config.forced_decoder_ids = processor.get_decoder_prompt_ids(language="language", task="transcribe", no_timestamps=True)
I want to verify that my training is proceeding as expected, so in the validation step I look at the predictions. Because this happens during training, model.config.forced_decoder_ids is None, and the predictions look wrong (random text in different languages). Setting
model.config.forced_decoder_ids = processor.get_decoder_prompt_ids(language="language", task="transcribe", no_timestamps=True)
during training yields sensible results (predictions in the target language). On the other hand, the examples I have seen online set it only at inference, not during training.
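To make the setup concrete, here is a minimal sketch of the workaround described above: the prompt ids are set only around the validation-time generation and the field goes back to None for the training steps. The helper name temporary_forced_decoder_ids, the stand-in model object, and the placeholder ids are all hypothetical, and whether generate() honors this config field may depend on the transformers version, so treat this as an illustration of the idea rather than a confirmed recipe.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_forced_decoder_ids(model, prompt_ids):
    """Hypothetical helper: set model.config.forced_decoder_ids, then restore it."""
    previous = model.config.forced_decoder_ids
    model.config.forced_decoder_ids = prompt_ids
    try:
        yield model
    finally:
        # Restore whatever was there before (None during training in my case).
        model.config.forced_decoder_ids = previous


# Stand-in for the Whisper model so the sketch runs without downloading weights.
model = SimpleNamespace(config=SimpleNamespace(forced_decoder_ids=None))

# With a real processor these would come from
# processor.get_decoder_prompt_ids(language=..., task="transcribe", no_timestamps=True).
# The (position, token_id) pairs below are placeholders, not real Whisper token ids.
prompt_ids = [(1, 111), (2, 222), (3, 333)]

with temporary_forced_decoder_ids(model, prompt_ids):
    # A validation-time model.generate(...) call would go here.
    inside = model.config.forced_decoder_ids

# Back in the training loop the field is None again.
after = model.config.forced_decoder_ids
```

With this pattern the training steps still run with the field unset, and only the generated validation predictions are affected.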
I could not find guidance on when to set this field during training, validation, or testing. I only found that it forces the model to generate in the language I choose.
I would appreciate it if someone could share some information on when to set this field and when to leave it as None.
Thank you all,
Avishai