[SRT] How can I get Whisper to start at the sentence start, and end at the sentence end? #2191

Francoyy · 2024-05-27T12:48:41Z

I've been using Whisper for more than a year, trying many different options to generate SRT files.
But I have never managed to get stable output where Whisper respects the natural sentence flow.
I don't understand why, so I'm here asking for help, if anyone can offer some.

The command I mostly use is as such:

whisper interview_lukas_english.mp4 --language English --model large-v2 --output_format srt --word_timestamps True --highlight_words True --initial_prompt "Hello. My name is Tom. Welcome to my YouTube channel."

I'm using --word_timestamps True and --highlight_words True so that I can do some extra post-processing after Whisper (mainly to fix the issue I'm describing here).

The output I'm getting is the following:

90
00:08:54,320 --> 00:08:59,780
the relationship. It would literally not have been possible without the amazing execution from

91
00:08:59,780 --> 00:09:03,920
Peter over here. That was nothing. You were the mastermind behind it. We were talking about it

92
00:09:03,920 --> 00:09:09,080
with the coded messages. I was just doing the execution. I was very stressed. Honestly,

93
00:09:09,160 --> 00:09:14,800
the whole experience was like super stressful. So at the time when she like, when Monica said yes,

I feel like Whisper is trying hard to make every subtitle exactly the same length, but I don't need that. What I would prefer is that every sentence ends at its final period or, if that's not possible, at a comma; and if there is neither and the sentence is too long, that it is cut where it makes sense to cut.

What I would expect:

It would literally not have been possible without the amazing execution from Peter over here.
That was nothing. You were the mastermind behind it. 
We were talking about it with the coded messages.
I was just doing the execution. I was very stressed. 
Honestly, the whole experience was like super stressful.
So at the time when she like, when Monica said yes,

Note that I've also tried the more recent option --max_words_per_line, which doesn't work well for me because it chops sentences at unnatural places.

Any advice, anyone?

Francoyy · 2024-05-27T13:19:09Z

Author

Another case I often get is that it starts well, and then at some point it completely loses the natural sentence boundaries and creates very long blocks that don't stop at the end of sentences:

[11:04.960 --> 11:09.300]  And I got like 21 emails talking about like, you know,
[11:09.340 --> 11:11.140]  Confirmation for all the confetti.
[11:11.180 --> 11:12.700]  Your order has been confirmed.
[11:12.840 --> 11:13.900]  Your order is on the way.
[11:14.060 --> 11:15.100]  Your order has been received.
[11:15.320 --> 11:16.240]  Here is like the receipt.
[11:16.500 --> 11:17.920]  Please rate the experience.
[11:18.280 --> 11:20.340]  Here's a reminder to please rate the experience.
[11:20.800 --> 11:21.240]  Yeah.
[11:21.400 --> 11:24.700]  And sometimes you would even go into like my Shopee account,
[11:24.700 --> 11:31.040]  to order things and then you can see like previous orders and you can see like marry me sign and like
[11:31.040 --> 11:37.140]  I wonder what's he planning to do you know. Right now we are actually in the process of getting
[11:37.140 --> 11:44.180]  married on paper which also is like way more complicated than I ever thought. Right you told
[11:44.180 --> 11:50.840]  me a little bit. Basically I need to get a proof from the Swedish government saying that I am single
[11:50.840 --> 11:57.440]  that I'm not married already but in order for that to be approved by Taiwan it needs to be
[11:57.440 --> 12:02.820]  translated and then confirmed by like three stamps and this process is like you need to like
[12:02.820 --> 12:08.000]  mail it physically to like three different offices. So it's just a lot of back and forth.
[12:08.180 --> 12:14.760]  It's just a lot of back and forth yeah but hopefully a struggle worth struggling so
[12:14.760 --> 12:20.240]  we will hopefully get married one day. Wow do you realize like you know the

0 replies

glangford · 2024-05-27T13:37:17Z

I posted sample code for segmenting whisper output into sentences here:

#314 (reply in thread)
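
For readers who want the general idea without following the link, here is a minimal sketch of regrouping Whisper's word timestamps into sentence-level cues. It is not the code posted in #314 (which, judging from the install notes further down, relies on spaCy); it is a simplified stand-in that only looks at sentence-final punctuation. It assumes the result was produced with word timestamps enabled, so that each entry in `segments` carries a `words` list with `word`, `start` and `end` keys, and that each word string carries its own leading space. The names `Cue`, `iter_words` and `group_into_sentences` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

# Characters treated as sentence endings (an assumption; abbreviations such as
# "Mr." will be split incorrectly by this simple rule).
SENTENCE_ENDINGS = (".", "!", "?")


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def iter_words(result):
    """Yield word dicts from a Whisper result that includes word timestamps."""
    for segment in result.get("segments", []):
        for word in segment.get("words", []):
            yield word


def _make_cue(words):
    # Whisper word strings usually carry their own leading space,
    # so joining with "" and stripping gives readable text.
    text = "".join(w["word"] for w in words).strip()
    return Cue(start=words[0]["start"], end=words[-1]["end"], text=text)


def group_into_sentences(result):
    """Regroup words into cues that end at sentence-final punctuation."""
    cues = []
    current = []
    for word in iter_words(result):
        current.append(word)
        if word["word"].rstrip().endswith(SENTENCE_ENDINGS):
            cues.append(_make_cue(current))
            current = []
    if current:
        # Trailing words without closing punctuation form a final cue.
        cues.append(_make_cue(current))
    return cues
```

The `result` argument can be the dict returned by Whisper's Python API or the contents of a JSON file written with --output_format json, loaded with `json.load`. A punctuation-only rule breaks on abbreviations and decimal numbers, which is the kind of case a proper sentence segmenter is meant to handle.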

2 replies

Francoyy May 27, 2024
Author

Thanks, I thought Whisper was supposed to do just that, but your solution on top of Whisper seems to work very well for my needs!

Francoyy May 27, 2024
Author

Just a recap of how I installed the different tools in order to run your script (may vary depending on your setup):

# different projects may need different pythons,
# so i usually create a virtual environment.
# In this case, using python 3.11
brew install python@3.11
python3.11 -m venv venv 
source venv/bin/activate
pip install --upgrade pip setuptools wheel

# installing dependencies
pip install cython
# note: the PyPI package named "whisper" is an unrelated project; OpenAI's Whisper is "openai-whisper"
pip install spacy more-itertools openai-whisper

# downloading the spacy model
python -m spacy download en_core_web_lg

# run the script from the gist with 42 characters max and 1 line (my own use case)
python whisper_post_process.py input.json -m en_core_web_lg -w 42 -l 1
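
To illustrate what a character limit such as 42 implies for a sentence that is too long, here is a rough sketch of splitting text into pieces of at most `max_chars` characters, preferring a break after a comma and falling back to the last space. It is not taken from the script above; `split_long_text` and its parameters are hypothetical names, and the rule that a comma only counts if it falls in the second half of the window is an assumption made to avoid very short pieces. It works on text only, so timings for the resulting pieces would still have to come from the word timestamps.

```python
def split_long_text(text, max_chars=42):
    """Split text into pieces of at most max_chars characters.

    A break after a comma is preferred when the comma falls in the second
    half of the window; otherwise the last space is used. Text without any
    usable space is cut hard at max_chars.
    """
    pieces = []
    rest = text.strip()
    while len(rest) > max_chars:
        # One extra character so that a space right at the limit is usable.
        window = rest[: max_chars + 1]
        comma = window.rfind(", ", max_chars // 2)
        if comma != -1:
            cut = comma + 1  # keep the comma with the left-hand piece
        else:
            cut = window.rfind(" ")
            if cut <= 0:
                cut = max_chars  # no space found: hard cut
        pieces.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    if rest:
        pieces.append(rest)
    return pieces
```

For example, `split_long_text("So at the time when she like, when Monica said yes, I froze.")` breaks after "like," rather than in the middle of "when Monica said yes".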

chenjianjx · 2025-04-13T07:36:48Z

I hope this can be supported someday. It would be very useful in cases where full sentences are preferred.

0 replies
