How to combine small audio segments to make 30 seconds chunks in the dataset? #2028

omarabb315 · 2024-02-17T12:30:57Z

omarabb315
Feb 17, 2024

I am trying to prepare a dataset for whisper fine-tuning , and I have a lot of small segment clip , most of them less than 6 seconds, I read the paper, but didn’t understand this paragraph:

“ When a final tran- script segment is only partially included in the current 30- second audio chunk, we predict only its start time token for the segment when in timestamp mode, to indicate that the subsequent decoding should be performed on an au- dio window aligned with that time, otherwise we truncate the audio to not include the segment”

Anyone could help?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

How to combine small audio segments to make 30 seconds chunks in the dataset? #2028

Uh oh!

{{title}}

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

How to combine small audio segments to make 30 seconds chunks in the dataset? #2028

Uh oh!

omarabb315 Feb 17, 2024

Replies: 0 comments

omarabb315
Feb 17, 2024