How to combine small audio segments into 30-second chunks in the dataset? #2028
Unanswered
omarabb315
asked this question in
Q&A
Replies: 0 comments
I am trying to prepare a dataset for Whisper fine-tuning, and I have a lot of short clips, most of them less than 6 seconds long. I read the paper, but didn't understand this paragraph:
“When a final transcript segment is only partially included in the current 30-second audio chunk, we predict only its start time token for the segment when in timestamp mode, to indicate that the subsequent decoding should be performed on an audio window aligned with that time, otherwise we truncate the audio to not include the segment”
Could anyone help?
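For reference, here is one possible reading of that paragraph as a minimal Python sketch. It is not the Whisper training code. It assumes the short clips come from one longer recording and that each has a known start and end time, sorted and non-overlapping. The names `Segment`, `Chunk`, `make_chunks`, and `partial_start` are hypothetical, made up for illustration. Segments are packed greedily into a 30-second window. If the next segment starts inside the window but ends after it, then in timestamp mode only its start offset is recorded and the next window begins at that time; otherwise the audio is cut just before that segment.

```python
from dataclasses import dataclass
from typing import List, Optional

CHUNK_SECONDS = 30.0  # length of the audio window described in the quoted paragraph


@dataclass
class Segment:
    start: float  # seconds from the beginning of the recording
    end: float
    text: str


@dataclass
class Chunk:
    audio_start: float
    audio_end: float
    segments: List[Segment]  # segments fully contained in this chunk
    # Offset (relative to audio_start) of a segment that is cut off by the
    # window end; only set in timestamp mode. Hypothetical field name.
    partial_start: Optional[float] = None


def make_chunks(segments: List[Segment], timestamp_mode: bool = True) -> List[Chunk]:
    """Greedily pack sorted, non-overlapping segments into <= 30 s chunks."""
    chunks: List[Chunk] = []
    i = 0
    while i < len(segments):
        window_start = segments[i].start
        window_end = window_start + CHUNK_SECONDS

        # Collect every segment that ends inside the current window.
        j = i
        while j < len(segments) and segments[j].end <= window_end:
            j += 1
        included = segments[i:j]

        if not included:
            # A single segment longer than 30 s; this sketch simply skips it.
            i += 1
            continue

        # Does the next segment start inside the window but end outside it?
        cut_off = j < len(segments) and segments[j].start < window_end

        if cut_off and timestamp_mode:
            # Keep the full window and record only the start time of the
            # cut-off segment; the next window is aligned with that time.
            offset = segments[j].start - window_start
            chunks.append(Chunk(window_start, window_end, included, offset))
        elif cut_off:
            # Without timestamps, truncate the audio before the cut-off segment.
            chunks.append(Chunk(window_start, segments[j].start, included))
        else:
            # Nothing is cut off; padding to 30 s is left to the feature extractor.
            chunks.append(Chunk(window_start, included[-1].end, included))

        i = j  # the next window starts at the first segment not yet included
    return chunks


example = [
    Segment(0.0, 5.0, "a"), Segment(5.0, 12.0, "b"), Segment(12.0, 20.0, "c"),
    Segment(20.0, 28.0, "d"), Segment(28.0, 33.0, "e"), Segment(33.0, 40.0, "f"),
]
for chunk in make_chunks(example):
    print(chunk.audio_start, chunk.audio_end, len(chunk.segments), chunk.partial_start)
```

In this example the fifth segment (28 s to 33 s) crosses the 30-second boundary, so the first chunk keeps four complete segments and the second chunk starts at 28 s. Clips that do not share a common timeline would first need to be placed on one, for instance by concatenating them and accumulating their durations.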