Adding realtime diarization to collabora/WhisperLive

I'm trying to add diarization to https://github.com/collabora/WhisperLive, which does transcription and also runs a VAD model before passing audio data to the transcriber. I had it working with pyannote-audio, but the VAD model and the diarization model both run on the CPU, so they slow each other down. I was also passing the whole audio file to the model every time, which is obviously not optimal. I was wondering how I can use diart instead of pyannote. Most of the examples I see read directly from the microphone. Can anyone please share an example of how to use diart when the data is a float32 NumPy array of mono audio instead of a stream from the microphone? Any help is appreciated.
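To make the "don't pass the whole file every time" part concrete, here is a minimal sketch of the buffering side only, using plain NumPy. It keeps a rolling buffer of incoming float32 mono samples and emits fixed-length overlapping windows, so a streaming diarization model only ever sees the most recent few seconds. The class name `RollingAudioBuffer` is hypothetical, and the 16 kHz sample rate, 5 s window, and 0.5 s step are assumptions that should be matched to whatever the diarization pipeline is configured with. No diart calls are shown, since the exact entry point depends on the installed version.

```python
import numpy as np

SAMPLE_RATE = 16_000   # assumption: 16 kHz mono, adjust to your audio
WINDOW_SECONDS = 5.0   # assumption: window length expected by the model
STEP_SECONDS = 0.5     # assumption: hop between consecutive windows


class RollingAudioBuffer:
    """Hypothetical helper: collects float32 mono audio and returns
    overlapping fixed-size windows as they become available."""

    def __init__(self, sample_rate=SAMPLE_RATE,
                 window=WINDOW_SECONDS, step=STEP_SECONDS):
        self.sample_rate = sample_rate
        self.window_size = int(round(window * sample_rate))
        self.step_size = int(round(step * sample_rate))
        self._buffer = np.zeros(0, dtype=np.float32)
        self._consumed = 0  # samples already dropped from the front

    def push(self, samples):
        """Append new samples; return a list of (start_time_s, window),
        where each window has shape (1, window_size)."""
        samples = np.asarray(samples, dtype=np.float32).reshape(-1)
        self._buffer = np.concatenate([self._buffer, samples])

        windows = []
        while self._buffer.size >= self.window_size:
            start_time = self._consumed / self.sample_rate
            chunk = self._buffer[: self.window_size].copy()
            windows.append((start_time, chunk.reshape(1, -1)))
            # Slide forward by one step, keeping the overlap for the next window.
            self._buffer = self._buffer[self.step_size:]
            self._consumed += self.step_size
        return windows
```

Each returned window can then be handed to the diarization model in place of the full recording, with `start_time` used to place the resulting speaker turns on the global timeline. Memory stays bounded because samples older than one window are discarded as the buffer slides.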

Adding realtime diarization to collabora/WhisperLive #177
