Replies: 2 comments 6 replies
-
From this link, you can see that Whisper has a relatively high error rate for Afrikaans. This is because Whisper has not been trained on enough Afrikaans audio yet. People are working on fine-tuning Whisper for different languages, but Afrikaans has not been targeted yet. See for example

In the short term, since you are not a developer, you can:
-
Hi @AntiDotZA, regarding the "No Text found" error: is it coming from SubtitleEdit/WhisperCPP, or from the model itself? I used to get the "No Text found" error when trying to transcribe Japanese audio using SubtitleEdit. At the time I concluded that the error came from SubtitleEdit's implementation/integration of WhisperCPP, but I might be wrong, of course. The communities of both SE and WhisperCPP are quite helpful and might know a workaround.
-
As part of my job (both day and night) I often need to create English subtitles for Afrikaans TV shows. For this I use the open-source Subtitle Edit application. Very handy.

However, the subtitling process is tedious, and doing it manually takes a ridiculous amount of time. (For me, somewhere in the region of 30 minutes per minute of content.)
After some research online into transcription software, I found out that Subtitle Edit has integrated Whisper for an "Audio to Text" feature, and unlike other software options, Whisper has language support for Afrikaans.
So I spent the last day getting this feature to work (it involved running a lot of command lines well outside my comfort zone), and FINALLY it's working. With much anticipation I ran the transcription feature on an Afrikaans video to check the accuracy...
It was a complete dud. The results resembled Dutch more than Afrikaans, and are not usable at all. :(
I want to cry! Afrikaans is my home language, and even when it does get some representation in the software world, it is still confused with something that is not recognisable as Afrikaans. (Please see attached a few lines from the automatic transcription vs a manual transcription.)
So my question:
How can I help to improve the transcription results for Afrikaans? There are others who do the same work as I do, and we would all benefit from more accurate transcription of spoken Afrikaans.
Many thanks!!!