-
I’ve created a code-switched language dataset for fine-tuning Whisper, including audio data along with CSV and Parquet files, which I’ve stored on Hugging Face. After preparing the dataset, I fine-tuned the model for translation. You can explore the entire end-to-end project in my repo: https://github.com/pr0mila/MediBeng-Whisper-Tiny
For timestamps, you can use faster-whisper inference: https://github.com/SYSTRAN/faster-whisper
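In case it helps with the timestamp part, here is a minimal sketch of getting word-level timestamps out of faster-whisper; the model size, device/compute settings, and the audio path are placeholder assumptions, not values from the project above:

```python
from faster_whisper import WhisperModel

# "tiny" on CPU with int8 is illustrative; swap in your own model/device.
model = WhisperModel("tiny", device="cpu", compute_type="int8")

# word_timestamps=True makes each segment carry per-word start/end times.
segments, info = model.transcribe("audio.wav", word_timestamps=True)

for seg in segments:
    for word in seg.words:
        print(f"[{word.start:.2f}s -> {word.end:.2f}s]{word.word}")
```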
-
Hi,
I have used Whisper to transcribe audio from a few movies. I'm planning to correct the subtitles and then use them to fine-tune the Whisper model.
Since the fine-tuning setup doesn't use timestamps, I was wondering: what is the risk of Whisper learning a gap (misalignment) between the audio and the subtitles?
I know that behind the scenes the Python code breaks the longer audio into 30-second chunks, but how would it know where to cut the untimed subtitles at each 30-second boundary, given that Whisper cannot be trained on timed subtitles?
Thank you
P.S. I've seen that this project no longer works, which is a pity... https://github.com/jumon/whisper-finetuning
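One possible way around the untimed-subtitle problem described above is to let Whisper's own inference output supply the timing: `transcribe()` returns segment-level start/end times, which can be used to group the (corrected) text into chunks of at most 30 seconds. A minimal sketch, assuming the openai-whisper package, a placeholder `movie.wav`, and the 30-second limit of Whisper's input window:

```python
import whisper

model = whisper.load_model("tiny")
result = model.transcribe("movie.wav")  # each segment has start/end times

chunks, cur_text, cur_start, prev_end = [], [], 0.0, 0.0
for seg in result["segments"]:
    # close the current chunk once adding this segment would exceed 30 s
    if seg["end"] - cur_start > 30.0 and cur_text:
        chunks.append({"start": cur_start, "end": prev_end,
                       "text": " ".join(cur_text)})
        cur_text, cur_start = [], seg["start"]
    cur_text.append(seg["text"].strip())
    prev_end = seg["end"]
if cur_text:
    chunks.append({"start": cur_start, "end": prev_end,
                   "text": " ".join(cur_text)})
```

Each resulting chunk's `start`/`end` can then be used to slice the matching audio window, and the paired audio/text chunks fed to fine-tuning.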