-
Dear all, You have some results for Albanian language. However, in the notebook example Multilingual_ASR.ipynb the langauage code is missing. |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 7 replies
-
Those language codes are the ones used by the Fleurs dataset, which did not include Albanian. Those are slightly different from what's used in our tokenizer.py. In Whisper, you can specify |
Beta Was this translation helpful? Give feedback.
Those language codes are the ones used by the Fleurs dataset, which did not include Albanian. Those are slightly different from what's used in our tokenizer.py.
In Whisper, you can specify
--language Albanian
or--language sq
to tell the model to transcribe Albanian speech, but we unfortunately don't have quantitative metrics on Whisper's performance in Albanian. I'd be curious to know if you can share your experience.