MK-SGC-SC: Multiple Kernel Guided Sparse Graph Construction in Spectral Clustering for Unsupervised Speaker Diarization
This repository contains the implementation for the following work:
- Nikhil Raghav, Avisek Gupta, Swagatam Das, and Md Sahidullah, MK-SGC-SC: Multiple Kernel Guided Sparse Graph Construction in Spectral Clustering for Unsupervised Speaker Diarization. The full paper is available on arXiv.
Install SpeechBrain version 0.5.14.
Follow the installation guidelines provided in the SpeechBrain repository.
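After installing, it can help to confirm that the version in your environment is the one named above. The following is a minimal sketch, assuming SpeechBrain was installed as a package so that its metadata is visible to `importlib.metadata`. The helper names `is_required_version` and `installed_speechbrain_version` are hypothetical and are not part of this repository or of SpeechBrain.

```python
from importlib import metadata

REQUIRED_VERSION = "0.5.14"  # version named in this README


def is_required_version(installed, required=REQUIRED_VERSION):
    """Return True when the installed version string equals the required one."""
    return installed.strip() == required


def installed_speechbrain_version():
    """Return the installed SpeechBrain version string, or None if not installed."""
    try:
        return metadata.version("speechbrain")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_speechbrain_version()
    if found is None:
        print("SpeechBrain is not installed in this environment")
    elif is_required_version(found):
        print(f"SpeechBrain {found} matches the required version")
    else:
        print(f"SpeechBrain {found} found, but {REQUIRED_VERSION} is expected")
```

The comparison is an exact string match on purpose, since the instructions name a single version rather than a range.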
This repository provides scripts for the proposed MK-SGC-SC technique.
To run experiments on the DIHARD-III, AMI, or VoxConverse datasets, overwrite the following files in the SpeechBrain toolkit.
For example, for the AMI meeting corpus, replace the files as shown below:
| Original file (SpeechBrain) | Replace with (from this repo) |
|---|---|
| speechbrain/recipes/AMI/Diarization/experiment.py | experiment_ami.py |
| speechbrain/recipes/AMI/Diarization/hparams/ecapa_tdnn.yaml | ecapa_tdnn_ami.yaml |
| speechbrain/speechbrain/processing/diarization.py | diarization_ami.py |
Similarly, use the corresponding files for DIHARD-III and VoxConverse.
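The AMI replacements listed above can also be scripted. The sketch below is one possible way to do it, not a script shipped with this repository. It assumes that the replacement files sit at the top level of your copy of this repository, and that the SpeechBrain paths are relative to the directory containing your SpeechBrain clone. The function name `overwrite_files`, its arguments, and the `.orig` backup suffix are hypothetical choices made for illustration.

```python
import shutil
from pathlib import Path

# SpeechBrain files (relative to the directory holding the clone) mapped to
# the replacement files from this repository, as listed for the AMI corpus.
AMI_FILES = {
    "speechbrain/recipes/AMI/Diarization/experiment.py": "experiment_ami.py",
    "speechbrain/recipes/AMI/Diarization/hparams/ecapa_tdnn.yaml": "ecapa_tdnn_ami.yaml",
    "speechbrain/speechbrain/processing/diarization.py": "diarization_ami.py",
}


def overwrite_files(repo_dir, clone_parent, mapping=AMI_FILES, backup=True):
    """Copy each replacement file over its SpeechBrain counterpart.

    repo_dir: directory assumed to contain the replacement files.
    clone_parent: directory assumed to contain the `speechbrain` clone.
    Returns the list of destination paths that were written.
    """
    repo_dir, clone_parent = Path(repo_dir), Path(clone_parent)
    written = []
    for dest_rel, src_name in mapping.items():
        src = repo_dir / src_name
        dest = clone_parent / dest_rel
        if not src.is_file():
            raise FileNotFoundError(f"replacement file not found: {src}")
        if not dest.is_file():
            raise FileNotFoundError(f"file to overwrite not found: {dest}")
        if backup:
            # Keep a copy of the original next to it before overwriting.
            shutil.copy2(dest, dest.with_name(dest.name + ".orig"))
        shutil.copy2(src, dest)
        written.append(dest)
    return written
```

For DIHARD-III or VoxConverse, the same function could be called with a different `mapping` built from the corresponding files. Keeping a backup makes it easy to restore the stock recipe later.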
Our implementation is based on a modified version of the AMI recipe in SpeechBrain.
To launch an experiment (example: AMI meeting corpus), run the following command from the directory where your experiment_ami.py file is located:
    python experiment_ami.py hparams/ecapa_tdnn_ami.yaml

This project is licensed under the MIT License. The full terms of the MIT License can be found in the LICENSE.md file at the root of this project.