Howto / Allow Inject initial embeddings inside pipeline cluster #1764

heralight · 2024-09-13T21:44:22Z


Help Needed: Reinjection of Previous Embeddings in `pyannote/speaker-diarization-3.1` Pipeline

Issue Type:
🆘 Support Request

Description

Yet another question on how to identify speakers across multiple sources...

Hello PyAnnote Team,

I am using the pyannote/speaker-diarization-3.1 pipeline for speaker diarization in my project. I aim to improve speaker consistency across multiple audio chunks by reinjecting embeddings from a previous chunk into the pipeline when processing the next chunk.

Attempts Made:

Modifying Clustering:
- Subclassed the SpeakerDiarization pipeline to store and utilize previous embeddings and cluster centroids.
- Overrode the compute_embeddings and cluster methods to concatenate previous embeddings and initialize KMeans with prior centroids.
Using Hooks:
- Tried to implement hooks to pass embeddings from one pipeline call to the next.
- Attempted to inject initial_embeddings during the second call to maintain speaker identity.
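
For illustration, the centroid-seeding idea from the first attempt can be sketched in isolation. This is a minimal pure-Python k-means that starts from the previous chunk's centroids, so cluster index `k` keeps referring to the same speaker. It is not pyannote's clustering code; `seeded_kmeans`, `previous_centroids`, and the toy 2-D points are hypothetical stand-ins for real speaker embeddings.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seeded_kmeans(points, initial_centroids, n_iter=10):
    """Lloyd's k-means started from given centroids instead of random ones.

    Returns (labels, centroids). Label k refers to initial_centroids[k],
    so cluster indices keep their meaning from the previous chunk.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for each point.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of assigned points; an empty cluster keeps
        # its previous centroid so the speaker is not forgotten.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical centroids carried over from the first chunk.
previous_centroids = [[0.0, 0.0], [10.0, 10.0]]
# Hypothetical embeddings extracted from the second chunk.
new_points = [[9.0, 11.0], [0.5, -0.5], [10.5, 9.5], [-0.2, 0.3]]
labels, centroids = seeded_kmeans(new_points, previous_centroids)
```

One limitation of this sketch: the number of clusters is fixed by the seed, so a speaker who first appears in the second chunk would be forced into an existing cluster.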

Example Use Case:

Ideally, I would like to be able to do the following:

# First call: process initial chunk and obtain embeddings
segments1, embeddings1 = pipeline(chunk_file1, return_embeddings=True)

# Second call: process next chunk using embeddings from the first call
segments2, embeddings2 = pipeline(chunk_file2, return_embeddings=True, initial_embeddings=embeddings1)
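
A possible workaround that needs no change to the pipeline is to diarize each chunk independently and then align speaker labels afterwards by comparing embeddings. The sketch below assumes each chunk yields one embedding vector per detected speaker; `match_speakers`, the `threshold` value, and the toy 3-D vectors are hypothetical, and the greedy matching is a simple stand-in for an optimal assignment.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_speakers(previous, current, threshold=0.5):
    """Greedily map each current speaker index to a previous speaker index.

    Returns {current_index: previous_index or None}. None means no previous
    speaker was similar enough, i.e. the speaker is treated as new.
    """
    # All (similarity, previous_index, current_index) pairs, best first.
    pairs = sorted(
        ((cosine_similarity(p, c), i, j)
         for i, p in enumerate(previous)
         for j, c in enumerate(current)),
        reverse=True,
    )
    mapping = {j: None for j in range(len(current))}
    used_previous = set()
    for score, i, j in pairs:
        if score < threshold:
            break  # remaining pairs are even less similar
        if i in used_previous or mapping[j] is not None:
            continue  # each speaker is matched at most once
        mapping[j] = i
        used_previous.add(i)
    return mapping


# Toy stand-ins for the per-speaker embeddings of two chunks.
embeddings1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
embeddings2 = [[0.1, 0.9, 0.0], [0.0, 0.0, 1.0], [0.9, 0.1, 0.0]]
mapping = match_speakers(embeddings1, embeddings2)
```

Here speaker 0 of the second chunk maps to previous speaker 1, speaker 2 maps to previous speaker 0, and speaker 1 is left unmatched as a new voice. The threshold would need tuning on real embeddings.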

Guidance on Reinjection Mechanism:

How can I effectively pass initial_embeddings from one pipeline call to the next to maintain speaker consistency?

Pipeline Customization:

Are there existing hooks or recommended methods within pyannote/speaker-diarization-3.1 to facilitate the reinjection of previous embeddings?
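
To make the hook question concrete, here is a minimal sketch of a callable that records whatever intermediate artifacts a pipeline reports. It assumes the pipeline invokes a user-supplied hook with a step name and an artifact, plus optional keyword arguments; that calling convention and the step names used below are assumptions to check against the installed pyannote.audio version. `ArtifactRecorder` is a hypothetical helper, and the calls at the bottom only simulate a run.

```python
class ArtifactRecorder:
    """Callable that stores the latest artifact reported for each step."""

    def __init__(self):
        self.artifacts = {}

    def __call__(self, step_name, step_artifact, **kwargs):
        # Progress-only calls may carry no artifact; keep the last real one.
        if step_artifact is not None:
            self.artifacts[step_name] = step_artifact


recorder = ArtifactRecorder()

# Simulated calls standing in for what a pipeline run might report.
recorder("segmentation", "segmentation-artifact")
recorder("embeddings", None, total=10, completed=3)  # progress update only
recorder("embeddings", [[0.1, 0.2]])

saved_embeddings = recorder.artifacts["embeddings"]
```

A recorder like this only observes a run. Feeding `saved_embeddings` back into the next call would still require changing the clustering step, which is the open part of this question.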

Example Implementation:

Could you provide a simple example or reference on how to modify the clustering process and use hooks for embedding reinjection?
If this is not implemented yet, do you have any ideas on how it could be done? I would be happy to investigate and help modify the library.

Environment

pyannote.audio version: 3.1
Python version: 3.11
Torch version: 2.4.1+cu121
GPU: NVIDIA CUDA-enabled device

Thank you for your assistance!

Best regards,

Alexandre

Replies: 0 comments
