30s Segmentation #490
catalwaysright
started this conversation in
General
-
You can read the `transcribe` function in transcribe.py. Because Whisper outputs timestamps, the model reports the end timestamp of the last full sentence in each window, and the audio is then cut from that point for the next decoding pass. For example, if the last full sentence ends at 25s, the next window is 25s-55s, not 30s-60s.
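To make the idea concrete, here is a minimal sketch of that sliding-window loop. It is not the actual code from transcribe.py: `decoding_windows` and the `last_end_in_window` callback are hypothetical names that stand in for the model, and the fallback to a full 30s step when no usable timestamp comes back is my own assumption.

```python
from typing import Callable, List, Tuple

WINDOW_S = 30.0  # Whisper decodes at most 30 s of audio per pass


def decoding_windows(
    duration_s: float,
    last_end_in_window: Callable[[float, float], float],
) -> List[Tuple[float, float]]:
    """Return the (start, end) times of each decoding window.

    `last_end_in_window(start, end)` is a hypothetical stand-in for the
    model: it returns, relative to the window start, the end time of the
    last complete sentence decoded in that window.
    """
    windows = []
    seek = 0.0
    while seek < duration_s:
        end = min(seek + WINDOW_S, duration_s)
        windows.append((seek, end))
        advance = last_end_in_window(seek, end)
        # Assumed fallback: with no usable timestamp, skip the whole window
        # so the loop always makes progress.
        if advance <= 0 or advance > end - seek:
            advance = end - seek
        # The next window starts where the last complete sentence ended,
        # not at a fixed 30 s boundary.
        seek += advance
    return windows


def fake_model(start: float, end: float) -> float:
    # Pretend the last complete sentence ends 25 s into every full window;
    # a shorter final window is consumed entirely.
    return 25.0 if end - start >= WINDOW_S else end - start


print(decoding_windows(70.0, fake_model))
# [(0.0, 30.0), (25.0, 55.0), (50.0, 70.0)]
```

The second window starts at 25s rather than 30s, so whatever was cut off at the end of the first window gets decoded again in full.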
-
The Whisper model segments the audio into 30s chunks and then transcribes them. I am curious how the model does this segmentation without cutting a word in the middle.