How to interpret the output of the segmentation model? #1315

amitli1 · 2023-04-04T06:03:40Z


pyannote's speaker diarization is based on the following segmentation model:
End-to-end speaker segmentation for overlap-aware resegmentation

In the paper above, under "Implementation details", they write:

model input: sequences of 80000 samples
[i.e. 5 s audio chunks at a sampling rate of 16 kHz]
model output:
K_max-dimensional speaker activations between 0 and 1 every 16 ms.

Does this mean that the output shape is (K, 5000/16)?
The output values are between 0 and 1. How should they be interpreted?
How can we tell whether there is a new segment, how many segments there are in each output, and how many speakers are present? (An example would be very helpful.)
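One common way to read such activations is to treat each value as the probability that a given speaker slot is active in a given frame, and to binarize it with a threshold. The sketch below is only an illustration under stated assumptions: the 0.5 threshold, the 16 ms frame step (the paper's nominal figure), the helper name `active_segments`, and the toy array are all hypothetical and not taken from pyannote itself. It assumes one chunk laid out as (num_frames, num_speakers).

```python
import numpy as np

def active_segments(activations, threshold=0.5, frame_step=0.016):
    """Turn (num_frames, num_speakers) activations into per-speaker
    lists of (start, end) times in seconds.

    threshold and frame_step are assumptions made for this sketch.
    """
    active = activations > threshold
    segments = {}
    for speaker in range(active.shape[1]):
        # Pad with zeros so every run of active frames has both edges.
        padded = np.concatenate(([0], active[:, speaker].astype(int), [0]))
        edges = np.diff(padded)
        starts = np.flatnonzero(edges == 1)   # first active frame of each run
        ends = np.flatnonzero(edges == -1)    # first inactive frame after each run
        segments[speaker] = [
            (float(s * frame_step), float(e * frame_step))
            for s, e in zip(starts, ends)
        ]
    return segments

# Toy chunk: 10 frames, 3 speaker slots (made-up numbers, not model output).
toy = np.zeros((10, 3))
toy[2:6, 0] = 0.9  # slot 0 active on frames 2..5
toy[4:8, 1] = 0.8  # slot 1 active on frames 4..7, overlapping slot 0

segments = active_segments(toy)
# Number of simultaneously active speakers in each frame.
speakers_per_frame = (toy > 0.5).sum(axis=1)
# Number of speaker slots that are active at least once in the chunk.
num_active_speakers = sum(1 for s in segments.values() if s)
```

Here a new segment starts wherever a slot's activation crosses the threshold upward, and frames where `speakers_per_frame` is 2 or more are overlapped speech.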

hbredin · 2023-04-04T06:52:47Z

Maintainer

Did you read this? This should answer most of your questions about this model.

3 replies

amitli1 Apr 4, 2023
Author

Thanks,
I still don't understand the shape (11, 293, 3) for a 5 s sliding window.
In particular, I don't understand the value 293:
5000 ms / 16 ms = 312.5
How did they get 293?
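A plausible explanation for 293 is that the frame count comes from the strides and kernel sizes of the model's convolutional front end, not from dividing 5000 ms by a rounded 16 ms. The sketch below assumes a SincNet-style stack with unpadded layers; the (kernel, stride) values are an assumption about the architecture and are not stated anywhere in this thread, so treat it as a consistency check and not as the library's definition.

```python
def conv_out(n, kernel, stride=1):
    """Output length of an unpadded 1-D convolution or pooling layer."""
    return (n - kernel) // stride + 1

# (kernel, stride) per layer. ASSUMED SincNet-style front end.
ASSUMED_LAYERS = [
    (251, 10),  # sinc convolution
    (3, 3),     # max pooling
    (5, 1),     # convolution
    (3, 3),     # max pooling
    (5, 1),     # convolution
    (3, 3),     # max pooling
]

def num_frames(num_samples, layers=ASSUMED_LAYERS):
    """Propagate a sequence length through every layer in turn."""
    n = num_samples
    for kernel, stride in layers:
        n = conv_out(n, kernel, stride)
    return n

frames = num_frames(80000)  # 5 s at 16 kHz -> 293

# Overall stride in samples, and the resulting frame step in milliseconds.
total_stride = 1
for _, stride in ASSUMED_LAYERS:
    total_stride *= stride
frame_step_ms = 1000 * total_stride / 16000  # 270 samples -> 16.875 ms
```

Under these assumptions the frame step is about 16.9 ms (which the paper rounds to 16 ms), and the unpadded kernels drop a few frames at the chunk edges, which would give 293 instead of 312.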

amitli1 Apr 7, 2023
Author

@hbredin Can you please explain how we get the shape (11, 293, 3)
after running this code:

from pyannote.audio import Inference
# model and SAMPLE_WAV are assumed to be defined beforehand
inference = Inference(model, duration=5.0, step=2.5)
output = inference(SAMPLE_WAV)

Why 293?
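One reading of the shape (11, 293, 3) is (number of chunks, frames per chunk, speaker slots), where the last dimension would be the K_max of the paper. The first dimension then follows from the sliding window. The sketch below counts only full windows and assumes a 30 s file; the file duration is an assumption (it is not given in this thread), the helper name is hypothetical, and how the library treats a trailing partial chunk is not covered here.

```python
def num_full_chunks(total_duration, duration=5.0, step=2.5):
    """Number of full sliding windows of `duration` seconds, advanced by
    `step` seconds, that fit in `total_duration` seconds."""
    if total_duration < duration:
        return 0
    return int((total_duration - duration) // step) + 1

# ASSUMPTION: a 30 s input file.
chunks = num_full_chunks(30.0)                 # 11
starts = [i * 2.5 for i in range(chunks)]      # 0.0, 2.5, ..., 25.0
```

With duration=5.0 and step=2.5, consecutive chunks overlap by half, so most of the audio is covered by two chunks.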

benbot Oct 14, 2024

@amitli1 did you ever figure it out?
