Speaker Diarization/Speaker Segmentation for call conversation #2493
Replies: 2 comments
-
Hi, we’ve just released DiCoW v2, an upgraded version of our Diarization-Conditioned Whisper model! 🚀 This version takes diarization as input and transcribes multi-talker speech, even when speakers are speaking different languages. Give it a try here: https://pccnect.fit.vutbr.cz/gradio-demo/ 🔗 Papers: Codebase: Let us know your feedback! 🚀
-
Hey @Navanit-nebula, we also ran into some discrepancies between Whisper and Pyannote, and found that, in general, Pyannote was more accurate than what we were seeing with Whisper itself. What has been your experience so far? If you want to compare, here's the app: WhisperScript
-
I have to create a simple speaker segmentation for a phone call conversation. I have to transcribe the call and segment it between the user and the agent.
For the time being I have downloaded the YouTube video https://www.youtube.com/watch?v=xbyEs7DJshw&t and converted it to WAV format, mono, at 16 kHz.
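For anyone reproducing the conversion step, here is a minimal sketch. It assumes `ffmpeg` is installed and on the PATH; the function names and file names are hypothetical and not taken from the original post. `-ac 1` requests a single (mono) channel and `-ar 16000` a 16 kHz sample rate.

```python
import subprocess
from typing import List


def build_ffmpeg_cmd(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that writes `src` as a mono WAV at `sample_rate` Hz."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-i", src,               # input file (any container ffmpeg can read)
        "-ac", "1",              # one audio channel (mono)
        "-ar", str(sample_rate), # output sample rate in Hz
        dst,
    ]


def convert_to_mono_wav(src: str, dst: str, sample_rate: int = 16000) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_cmd(src, dst, sample_rate), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_ffmpeg_cmd("call.mp4", "call_16k_mono.wav")))
```

Calling `convert_to_mono_wav("call.mp4", "call_16k_mono.wav")` would perform the actual conversion, provided the input file exists.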
So far I have used two different techniques.
But the timestamps from the two do not match.
If anyone wants to recreate this, I am attaching the code.
Now, how should I get the best result from this?
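One common way to reconcile timestamps that do not line up is to label each transcript segment with the speaker whose diarization turn overlaps it the most, rather than expecting the boundaries to match exactly. The sketch below is only an illustration under stated assumptions: `Turn`, `Segment`, and `assign_speakers` are hypothetical names, not part of Whisper or Pyannote, and the inputs are assumed to be plain `(start, end)` times in seconds already extracted from each tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Turn:
    """One diarization turn (hypothetical container): who spoke, and when."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """One transcript segment (hypothetical container): text with timestamps."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: List[Segment], turns: List[Turn]
) -> List[Tuple[Optional[str], Segment]]:
    """Pair each segment with the speaker that overlaps it longest in total.

    A segment that overlaps no diarization turn is paired with None.
    """
    labelled = []
    for seg in segments:
        totals: Dict[str, float] = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        speaker = max(totals, key=totals.get) if totals else None
        labelled.append((speaker, seg))
    return labelled


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    turns = [Turn(0.0, 4.2, "SPEAKER_00"), Turn(4.2, 9.0, "SPEAKER_01")]
    segments = [Segment(0.3, 3.9, "Hello, how can I help?"),
                Segment(3.9, 8.5, "Hi, I have a billing question.")]
    for speaker, seg in assign_speakers(segments, turns):
        print(f"[{seg.start:.1f}-{seg.end:.1f}] {speaker}: {seg.text}")
```

With this approach, small boundary disagreements between the two tools matter less, because a segment only needs to fall mostly inside one speaker's turn. Mapping the resulting labels to "user" and "agent" would still need a separate rule.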