-
I am new to ASR and saw Amharic in Whisper's list of languages. I tried it on a 39-second audio clip: it detected the language correctly but returned the transcript in Hebrew. Both are Semitic languages. In case it can be considered for future versions, here is a 20-hour Amharic speech dataset: https://github.com/getalp/ALFFA_PUBLIC/tree/master/ASR/AMHARIC
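A quick way to catch this kind of failure automatically is to check which writing system the returned transcript actually uses. The snippet below is only a minimal sketch using Python's standard library; the helper names `script_counts` and `dominant_script` are made up for illustration and are not part of Whisper. It assumes the first word of each character's Unicode name (for example `ETHIOPIC` or `HEBREW`) is a good enough stand-in for the script.

```python
import unicodedata
from collections import Counter


def script_counts(text: str) -> Counter:
    """Count alphabetic characters per script.

    Assumption: the first word of the Unicode character name
    (e.g. "ETHIOPIC SYLLABLE SA" -> "ETHIOPIC") approximates the script.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits and punctuation
        name = unicodedata.name(ch, "")
        script = name.split(" ")[0] if name else "UNKNOWN"
        counts[script] += 1
    return counts


def dominant_script(text: str):
    """Return the most frequent script label, or None if there are no letters."""
    counts = script_counts(text)
    return counts.most_common(1)[0][0] if counts else None


if __name__ == "__main__":
    # Hypothetical transcripts: one in Ethiopic script, one in Hebrew script.
    for transcript in ["ሰላም ዓለም", "שלום עולם"]:
        print(transcript, "->", dominant_script(transcript))
```

An Amharic transcript should come back as `ETHIOPIC`; anything else (such as `HEBREW`) suggests the model fell back to a script it has seen more of.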
-
Unfortunately, Whisper is really bad at Amharic and has seen little to no Amharic script. Appendix E of the paper shows that it has only seen a small amount of Amharic audio -> English caption pairs, and the speech translation evals in D.3.1 show that it is far from being useful.
-
Common Voice also has some data: https://commonvoice.mozilla.org/en/datasets. I am interested in having Amharic ASR. I am a native speaker, so if there is anything I can help with, I would like to know.
-
Unfortunately, Amharic ASR is pretty much unusable even with Whisper v3. As @bemnet4u pointed out, Common Voice has some Amharic data: the latest version (16.1, dated 4 January 2024) has 1,223 Amharic samples, which is not bad. This post explains how to fine-tune Whisper with Common Voice data (and even has a Google Colab notebook for doing so), but it's quite involved and I found it hard to follow. If anyone else has any luck, please let me know or post back here! Then we could add a working Amharic ASR model to this list and save future Amharic transcribers the trouble of retracing our steps.