Trys - Embedded Interjections #1315
jh-modjeski
started this conversation in
Show and tell
I've just implemented a new feature for my Trys project that uses Whisper's word_timestamps to tag interjections and embed them in the text of the primary speaker's transcript. If an interjection is shorter than the pause_len used for non-silence detection, it is embedded in the primary speaker's line. If it's longer than pause_len, it is treated as crosstalk and placed on a new line.
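To make the rule concrete, here is a minimal sketch of how the embed-or-crosstalk decision could look. This is not the actual Trys code: the `Word` and `Interjection` types, the `render_line` function, and the bracketed `[speaker: text]` format are all hypothetical, and it assumes the word timestamps and speaker labels have already been extracted. Treating an interjection exactly as long as pause_len as crosstalk is also my assumption.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Interjection:
    speaker: str
    text: str
    start: float  # seconds
    end: float    # seconds


def render_line(primary_speaker, words, interjections, pause_len):
    """Return the primary speaker's line followed by any crosstalk lines.

    Hypothetical helper: interjections shorter than pause_len are embedded
    in the primary line, all others become separate crosstalk lines.
    """
    embedded, crosstalk = [], []
    for itj in interjections:
        if itj.end - itj.start < pause_len:
            embedded.append(itj)
        else:
            crosstalk.append(itj)

    # Interleave the primary speaker's words and the embedded interjections
    # by start time, so each interjection lands where it was actually spoken.
    tokens = [(w.start, w.text) for w in words]
    tokens += [(i.start, f"[{i.speaker}: {i.text}]") for i in embedded]
    tokens.sort(key=lambda t: t[0])

    lines = [f"{primary_speaker}: " + " ".join(text for _, text in tokens)]
    for itj in sorted(crosstalk, key=lambda i: i.start):
        lines.append(f"{itj.speaker}: {itj.text}")
    return lines


words = [
    Word("So", 0.0, 0.2),
    Word("the", 0.3, 0.4),
    Word("model", 0.5, 0.9),
    Word("learns", 1.6, 2.0),
]
interjections = [
    Interjection("B", "Right.", 1.0, 1.3),
    Interjection("B", "Wait, can you explain that again?", 2.1, 4.5),
]
result = render_line("A", words, interjections, pause_len=0.5)
for line in result:
    print(line)
```

With pause_len set to 0.5 seconds, the 0.3-second "Right." stays inside speaker A's line, while the 2.4-second question is moved to its own line.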
Example Output (Lex Fridman podcast with Andrej Karpathy)
Trys uses non-silence detection to find the sections of audio that Whisper should transcribe. Each line represents a continuous audible section in which the speaker was never quiet for longer than pause_len. An interjection does not interrupt the primary speaker for longer than pause_len, so Trys does not break the transcribed line to give the interjection a line of its own. I believe this is far more readable, because we follow the primary speaker's chain of thought until the end.
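As a rough illustration of how pause_len shapes the lines, the sketch below merges audible intervals whose gaps do not exceed pause_len. It is an assumption-laden toy, not the Trys implementation: `group_into_lines` is a hypothetical name, and the input is assumed to be a list of (start, end) pairs in seconds produced by some non-silence detector.

```python
def group_into_lines(audible, pause_len):
    """Merge audible (start, end) intervals separated by gaps <= pause_len.

    Hypothetical helper: each merged interval corresponds to one transcript
    line, i.e. a stretch with no silence longer than pause_len.
    """
    lines = []
    for start, end in sorted(audible):
        if lines and start - lines[-1][1] <= pause_len:
            # The gap is short enough, so extend the current line.
            lines[-1] = (lines[-1][0], max(lines[-1][1], end))
        else:
            # The silence was longer than pause_len, so start a new line.
            lines.append((start, end))
    return lines


segments = [(0.0, 1.0), (1.2, 2.0), (3.5, 4.0)]
merged = group_into_lines(segments, pause_len=0.5)
print(merged)
```

Here the 0.2-second gap is absorbed into the first line, while the 1.5-second gap starts a new one. A short interjection that falls inside such a gap never splits the primary speaker's line.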