Removing automatic grammar correction in Whisper #1631

Shikhs1989 · 2023-08-29T02:37:41Z

Whisper is correcting grammar in the transcription. How can we get raw output in the transcription? Also, because of this, it is not transcribing single-word utterances correctly. E.g., it is converting 'chemist' to 'i missed.'.

whicks1 · 2023-08-29T03:48:50Z

There is a built-in normalizer that you can’t do much about without probably forking the project. It is described in the paper and is fairly readable in the code, but in short it removes “ums”, standardizes contractions and various number formats, etc. Additionally, since it’s fairly standard practice for human transcribers to edit their outputs, and vendors charge extra for verbatim outputs, I would suspect that a significant portion of the training data is “cleaned”, so Whisper will probably follow those patterns as well. As with all other problems, someone will suggest you try the initial prompt argument, but your mileage will vary. You might also look at setting the temperature. Finally, as with any ASR system, the outputs aren’t perfect, and spelling and interpretation errors are going to happen; use the large model if you can. Otherwise, review the outputs and post-process ftw.
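For anyone who wants to try the initial prompt and temperature suggestions, here is a minimal sketch using the `openai-whisper` Python package. It passes `initial_prompt`, `temperature`, and `condition_on_previous_text` to `model.transcribe()`. The helper names, the prompt wording, and the file name in the usage note are made up for illustration. None of this switches off the model's learned tendency to tidy up speech, so treat it as something to experiment with and not as a fix.

```python
from typing import Any, Dict, Optional, Sequence


def build_verbatim_options(hint_words: Optional[Sequence[str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for transcribe() that lean toward verbatim output.

    Hypothetical helper: these settings only reduce some sources of drift,
    they do not disable any "grammar correction".
    """
    options: Dict[str, Any] = {
        # A single temperature of 0.0 means greedy decoding, with no
        # fallback to higher (more random) temperatures.
        "temperature": 0.0,
        # Do not feed the previous segment's text back in as context.
        "condition_on_previous_text": False,
    }
    if hint_words:
        # A prompt written in the style you want, mentioning expected words.
        # The wording here is an assumption, not a recommended recipe.
        options["initial_prompt"] = "Umm, so, like, " + ", ".join(hint_words) + "."
    return options


def transcribe_verbatim(audio_path: str, model_name: str = "large",
                        hint_words: Optional[Sequence[str]] = None) -> str:
    # Imported lazily so the helper above works without whisper installed.
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **build_verbatim_options(hint_words))
    return result["text"]
```

Usage would look like `transcribe_verbatim("clip.wav", hint_words=["chemist"])`, where `clip.wav` is a placeholder for your own recording. Whether the prompt actually helps with single-word clips is something you would have to check on your own audio.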

3 replies

Shikhs1989 Aug 29, 2023
Author

Hi whicks1, thanks for replying.
Is there any parameter we can disable to stop the corrections, so that it just gives the raw text as spoken in the audio?

whicks1 Aug 29, 2023

I think the simple answer is no.

this-duck Dec 17, 2023

Could you perhaps explain how adjusting the temperature would assist?
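On the temperature question, a rough illustration may help. Temperature rescales the model's token scores before a token is picked: at 0 the decoder simply takes the highest-scoring token, and higher values flatten the distribution so lower-ranked alternatives get sampled more often. As far as I know, `transcribe()` in the `openai-whisper` package starts at 0.0 by default and only retries a segment at higher temperatures when it fails its quality thresholds. The sketch below is a toy softmax in plain Python, not Whisper code, and the two scores are invented.

```python
import math
from typing import Dict


def softmax_with_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Turn raw scores into sampling probabilities at a given temperature."""
    if temperature <= 0:
        # Temperature 0 is treated as greedy decoding: all mass on the top score.
        best = max(logits, key=logits.get)
        return {token: 1.0 if token == best else 0.0 for token in logits}
    scaled = {token: score / temperature for token, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


# Made-up scores for two competing hypotheses; not real Whisper output.
toy_logits = {"i missed": 2.0, "chemist": 1.5}

for t in (0.0, 0.2, 1.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, {token: round(p, 3) for token, p in probs.items()})
```

With these toy numbers the lower-scored hypothesis gets no probability at temperature 0 and a growing share as the temperature rises. So raising the temperature can occasionally surface a different reading, but it makes the output less repeatable and does not specifically target grammar cleanup.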
