-
First things first: are you looking for speaker diarization or speaker tracking/identification?
Your question makes me think that you are actually interested in speaker tracking and not speaker diarization.
That is expected as actual labels do not really matter in speaker diarization (see definition above).
Yes, it makes sense to fine-tune models/pipelines on a specific domain, and it is even recommended.
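To illustrate why the actual labels do not matter, here is a minimal sketch in plain Python. It does not use the pyannote.audio API, and the segments and the names `alice`, `bob`, and `carol` are made up for the example. It aligns arbitrary diarization labels (such as A, 101, 99) with known speaker names by maximizing the total overlap duration, which is roughly what diarization evaluation does before scoring.

```python
from collections import defaultdict
from itertools import permutations

# Each segment is (start_seconds, end_seconds, label). All values are illustrative.
hypothesis = [(0.0, 4.0, "A"), (4.0, 9.0, "101"), (9.0, 12.0, "99"), (12.0, 15.0, "A")]
reference = [(0.0, 4.0, "alice"), (4.0, 9.0, "bob"), (9.0, 12.0, "carol"), (12.0, 15.0, "alice")]


def overlap(a, b):
    """Duration (in seconds) shared by two segments."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def best_mapping(hyp, ref):
    """Map hypothesis labels to reference labels by maximizing total overlap.

    Brute force over permutations, which is only reasonable for a handful
    of speakers. Hypothesis labels without a counterpart map to None.
    """
    table = defaultdict(float)
    for h in hyp:
        for r in ref:
            table[(h[2], r[2])] += overlap(h, r)

    hyp_labels = sorted({h[2] for h in hyp})
    ref_labels = sorted({r[2] for r in ref})
    # Pad with None so every hypothesis label can be assigned something.
    ref_labels += [None] * max(0, len(hyp_labels) - len(ref_labels))

    best, best_score = {}, -1.0
    for perm in permutations(ref_labels, len(hyp_labels)):
        score = sum(table.get((h, r), 0.0) for h, r in zip(hyp_labels, perm))
        if score > best_score:
            best, best_score = dict(zip(hyp_labels, perm)), score
    return best


mapping = best_mapping(hypothesis, reference)

# Renaming the hypothesis labels changes nothing about the alignment itself.
rename = {"A": "x", "101": "y", "99": "z"}
renamed = [(s, e, rename[label]) for s, e, label in hypothesis]
renamed_mapping = best_mapping(renamed, reference)
```

Both calls recover the same speaker turns. Only the names on the left-hand side of the mapping differ, which is why the labels returned by a diarization pipeline carry no identity by themselves. Attaching real names requires reference material for each speaker, which is the speaker tracking/identification setting.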
-
Hello everyone! Thank you very much!!!
-
Firstly, thanks a lot for providing such a concise tool to play around with audio! Big ups! :)
I was exploring the pyannote project and got confused by the many repositories that provide different functionality. I wanted to know if my understanding of the way to use this package is correct.
Let's consider the following scenarios:
Suppose I have an audio clip (say, from a college debate competition) involving 3 speakers.
Does it make sense to use pretrained models from pyannote.audio.hub to identify who speaks when?
Or should I retrain/fine-tune all of the pipelines (speech activity detection, speaker change detection, overlapped speech detection, speaker embedding) on audio data in which these 3 people speak?
I ask this because when I used the pretrained diarization pipeline from pyannote.audio.hub, I received random speaker labels like A, 101, 99.
Does it make sense to train pipelines on data from specific domains and expect them to perform well on other data from the same domain?
For example, if I train the above-mentioned pipelines on rap music by Eminem, Lil Wayne, and Kanye West, am I correct to use these pipelines to try and perform speaker diarization on a rap song in which Pitbull and J Cole rap?
Or will the trained pipelines be applicable only to Eminem, Lil Wayne, and Kanye West's rap?
Please excuse me if these questions sound silly/stupid. I just want to understand the best way to use this fantastic tool that you have built.