Real time live ASR? #925

SinanAkkoyun · 2023-02-04T01:11:24Z

SinanAkkoyun
Feb 4, 2023

Hello, I am curious about how to implement ASR chunking for real-time transcription (with interim results, so to speak). I am aware of VAD implementations, but I wanted to ask whether chunking with overlaps might also be possible.

Thank you so much for helping me out!

Beenyaa · 2023-02-07T20:37:45Z

Beenyaa
Feb 7, 2023

Hey, I've tried implementing ASR chunking with overlaps myself to get those interim results in real-time-ish transcription. The good news is, it's definitely possible! However, the results aren't always great.

I created a sliding window to make this work. Essentially, I concatenated the current recording with the tail of the previous recording, like this: concatenated_recording = np.concatenate([previous_recording[-overlap_length:], current_recording]). Then, I transcribed the concatenated numpy array and used the resulting text as the prompt for the next chunk.
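For anyone who wants to try this, here is a minimal sketch of the sliding window described above. It is not my exact code: the function names, the 16 kHz sample rate, and the `transcribe` callable are assumptions made for illustration. With openai/whisper, `transcribe` could be a thin wrapper around `model.transcribe(audio, initial_prompt=prompt)["text"]`.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumption: 16 kHz mono float32 audio


def sliding_windows(chunks, overlap_length):
    """Yield each chunk prefixed with the last `overlap_length` samples of the previous chunk."""
    previous = np.zeros(0, dtype=np.float32)
    for current in chunks:
        if overlap_length > 0:
            # Guarded because previous[-0:] would return the whole array.
            tail = previous[-overlap_length:]
        else:
            tail = previous[:0]
        yield np.concatenate([tail, current])
        previous = current


def stream_transcribe(chunks, transcribe, overlap_length):
    """Transcribe overlapping windows, passing each result as the prompt for the next.

    `transcribe` is a hypothetical callable taking (audio, prompt) and returning text.
    """
    prompt = ""
    for window in sliding_windows(chunks, overlap_length):
        text = transcribe(window, prompt)
        yield text
        prompt = text
```

A one-second overlap would be `overlap_length = SAMPLE_RATE`. Note that the overlapped audio is transcribed twice, so the text from consecutive windows will partly repeat and still needs to be merged or de-duplicated.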

Unfortunately, this method often leads to premature endings of sentences, which in my case can completely change the meaning of the speech being transcribed. If you manage to make this work better, I'd love to hear about it.

1 reply

Beenyaa Feb 7, 2023

Another thought: you might want to check out https://github.com/ggerganov/whisper.cpp/tree/master/examples/stream#sliding-window-mode-with-vad if openai/whisper doesn't work as you expected. This is where I got the sliding window idea from, and it is the project I plan to use going forward until someone makes a breakthrough with an openai/whisper implementation of real-time transcription.

SinanAkkoyun · 2023-02-07T22:50:29Z

SinanAkkoyun
Feb 7, 2023
Author

Hey, thank you for your reply! That is a nice approach, thank you for sharing!

1 reply

makaveli10 Jan 18, 2024

@SinanAkkoyun WhisperLive is real-time and uses the faster-whisper backend.

Gldkslfmsd · 2024-01-24T12:58:23Z

Gldkslfmsd
Jan 24, 2024

#1978 -- Whisper-Streaming has a real-time implementation and supports the faster-whisper backend.
