Segment lengths are continuous/gapless #1036

p4-k4 · 2023-03-06T21:50:46Z

p4-k4
Mar 6, 2023

Currently, we get continuous/gapless segments, where segments are hard "butted up" against each other. This is indicated by the end time of one segment matching the start time of the following one: segment 1 ends at 00:00:09,600 and segment 2 starts at 00:00:09,600, when in fact there is a gap in the speech between them.

1
00:00:00,000 --> 00:00:09,600
New Zealand's first ever event

2
00:00:09,600 --> 00:00:15,720
Wingfoil World Tour event.

3
00:00:15,720 --> 00:00:20,520
Global competitors were welcomed at Whareroa Marae with a special opening ceremony to mark

Not sure if this is a work in progress, but I would love to see segment start/end times that are true to the respective speech segments.

On the other hand, it would be useful to be able to specify a duration such that, if the time between two segments is less than the specified time, they are butted up against each other. For subtitles, this would prevent "flashing", where the gap between two subtitles is so short that we should probably just extend the first one out to the start of the next.
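A rough sketch of the gap-closing behaviour requested above, written in Python for illustration. It assumes the segments already carry true speech start/end times in milliseconds; the `Segment` class, `close_short_gaps`, and the `min_gap_ms` parameter are hypothetical names and are not part of Whisper.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    """Hypothetical subtitle segment; times are in milliseconds."""
    start_ms: int
    end_ms: int
    text: str


def close_short_gaps(segments: List[Segment], min_gap_ms: int) -> List[Segment]:
    """Extend a segment's end to the next segment's start when the
    silence between them is shorter than min_gap_ms (hypothetical option)."""
    result: List[Segment] = []
    for current, following in zip(segments, segments[1:]):
        gap = following.start_ms - current.end_ms
        if 0 < gap < min_gap_ms:
            # Gap is too short to be worth showing: butt the segments together.
            current = replace(current, end_ms=following.start_ms)
        result.append(current)
    if segments:
        # The last segment has no successor, so it is kept unchanged.
        result.append(segments[-1])
    return result
```

With a threshold of 500 ms, a 200 ms gap would be closed while a 2-second gap would be left alone, so longer pauses still clear the subtitle from the screen.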

p4-k4 · 2023-03-16T09:55:17Z

p4-k4
Mar 16, 2023
Author

Ended up putting something together that converts Whisper's word-level SRTs to segment-level SRTs by condensing each sentence down to one segment and reassigning the start and end timestamps according to its first and last words.

When running Whisper with the --word_timestamps flag set to True, we get this:

1
00:00:00,000 --> 00:00:00,500
<u>If</u> you are driving around the Hastings or Napier, you will see plenty of low-vacancy signs.

------------ 2-17 ------------

18
00:00:06,560 --> 00:00:07,160
If you are driving around the Hastings or Napier, you will see plenty of low-vacancy signs.

Running subseg input.srt output.srt gets us this (note the start and end timestamps):

1
00:00:00,000 --> 00:00:07,160
If you are driving around the Hastings or Napier, you will see plenty of low-vacancy signs.

Install Dart, or run the compiled binary in /bin.
Clone: git clone https://github.com/p4-k4/subseg.git
Repo: https://github.com/p4-k4/subseg
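For readers who want to see the condensing idea without installing anything, here is a minimal Python sketch. It is not subseg's actual implementation; it only assumes, as in the example above, that every word-level cue repeats the full sentence text with one word wrapped in `<u>` tags. The function names are hypothetical.

```python
import re
from typing import List, Tuple

Cue = Tuple[str, str, str]  # (start timestamp, end timestamp, text)

TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")
UNDERLINE_TAG = re.compile(r"</?u>")


def parse_srt(srt: str) -> List[Cue]:
    """Parse SRT text into cues, stripping the <u> word highlights."""
    cues: List[Cue] = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # not a complete cue (index, timing, text)
        match = TIMING.match(lines[1].strip())
        if not match:
            continue
        text = UNDERLINE_TAG.sub("", " ".join(lines[2:])).strip()
        cues.append((match.group(1), match.group(2), text))
    return cues


def condense(cues: List[Cue]) -> List[Cue]:
    """Merge consecutive cues with identical text, keeping the first
    cue's start time and the last cue's end time."""
    merged: List[Cue] = []
    for start, end, text in cues:
        if merged and merged[-1][2] == text:
            merged[-1] = (merged[-1][0], end, text)
        else:
            merged.append((start, end, text))
    return merged


def to_srt(cues: List[Cue]) -> str:
    """Serialise cues back to SRT, renumbering from 1."""
    blocks = [f"{i}\n{s} --> {e}\n{t}" for i, (s, e, t) in enumerate(cues, 1)]
    return "\n\n".join(blocks) + "\n"
```

Something like `to_srt(condense(parse_srt(text)))` would then produce the segment-level file. One limitation of this sketch: two back-to-back sentences with exactly the same wording would be merged into a single cue.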

0 replies

ClaireCJS · 2023-07-06T08:58:57Z

ClaireCJS
Jul 6, 2023

Yeah, I kind of thought that if whisper was outputting an end time, it would be the actual end time of the speech, not... the beginning of the next speech.

0 replies
