How do I fine-tune Whisper to transcribe numbers properly? #1982
Unanswered
nagatarunkumar asked this question in Q&A
Rather than fine-tuning Whisper for such a task, it would be easier to post-process a portion of the transcript using regular expressions. Here is how to handle "double" as an example:

>>> import re
>>> transcript = "seven double nine eight double five"
>>> re.sub(r"double (\w+)", r"\1 \1", transcript)
'seven nine nine eight five five'

To then convert the words into a number, you could use something like this package: https://github.com/ShailChoksi/text2digits

>>> from text2digits import text2digits
>>> t2d = text2digits.Text2Digits()
>>> t2d.convert("seven nine nine eight five five")
'799855'

And for variations where the caller groups digits:

>>> t2d.convert("eighteen eighty-five six zero seven")
'1885607'
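The same idea can be extended to "triple" and restricted to digit words, so that phrases such as "double the amount" are left alone. The following is only a minimal sketch using the standard library: the helper names (`expand_repeats`, `join_digit_words`, `normalize_spoken_number`) and the word table are illustrative assumptions, not part of Whisper or text2digits, and it handles single spoken digits only, not grouped forms such as "eighty-five".

```python
import re

# Repeat markers and how many times the following digit word is repeated.
# (Assumption: only "double" and "triple" occur in the recordings.)
REPEATS = {"double": 2, "triple": 3}

# Single digit words; treating "oh" as zero is an assumption about caller speech.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}

_REPEAT_ALT = "|".join(REPEATS)
_DIGIT_ALT = "|".join(DIGIT_WORDS)
# Match a repeat marker only when it is followed by a digit word.
_REPEAT_RE = re.compile(rf"\b({_REPEAT_ALT})\s+({_DIGIT_ALT})\b", re.IGNORECASE)


def expand_repeats(text: str) -> str:
    """Rewrite 'double nine' as 'nine nine' and 'triple nine' as 'nine nine nine'."""
    def _expand(match: re.Match) -> str:
        count = REPEATS[match.group(1).lower()]
        return " ".join([match.group(2)] * count)
    return _REPEAT_RE.sub(_expand, text)


def join_digit_words(text: str) -> str:
    """Collapse each run of consecutive digit words into one digit string.

    Punctuation attached to a digit word (e.g. 'four,') is dropped.
    """
    out, run = [], []
    for token in text.split():
        digit = DIGIT_WORDS.get(token.lower().strip(".,"))
        if digit is not None:
            run.append(digit)
            continue
        if run:
            out.append("".join(run))
            run = []
        out.append(token)
    if run:
        out.append("".join(run))
    return " ".join(out)


def normalize_spoken_number(text: str) -> str:
    """Expand repeat markers, then convert digit words to digits."""
    return join_digit_words(expand_repeats(text))


print(normalize_spoken_number(
    "seven double nine eight double five one double seven one"
))  # prints 7998551771
```

Note that this sketch converts every standalone digit word, so "one" used as a pronoun would also become "1". In practice it is probably safer to apply it only to the part of the transcript where a number is expected.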
I have my own dataset of insurance claim call recordings, in which users state various numbers such as a policy number or a phone number, and I am using Whisper to transcribe these recordings. The issue I am facing is that when a user speaks a number such as "7998551771" as "seven double nine eight double five...", Whisper transcribes it as words instead of converting it to digits as one might expect. How do I fine-tune Whisper so that, whenever a user says words like "double" or "triple" while dictating a number, the output is the number rather than the words?