This repository was archived by the owner on Feb 12, 2022. It is now read-only.
README.md: 40 additions & 17 deletions
@@ -2,24 +2,29 @@
<img width="400" align="right" alt="screen shot 2017-11-21 at 12 35 28" src="https://user-images.githubusercontent.com/72940/33071669-be6c35b2-cebc-11e7-8822-9b998ad1ea09.png">
-
Estimating the number of concurrent speakers from single channel mixtures is a very challenging task that is a mandatory first step to address any realistic “cocktail-party” scenario. It has various audio-based applications such as blind source separation, speaker diarisation, and audio surveillance. Building upon powerful machine learning methodology and the possibility to generate large amounts of learning data, Deep Neural Network (DNN) architectures are well suited to directly estimate speaker counts.
+
_CountNet_ is a deep learning model to estimate the number of concurrent speakers from single channel mixtures. Estimating this count is a very challenging task and a mandatory first step to address any realistic “cocktail-party” scenario. It has various audio-based applications such as blind source separation, speaker diarisation, and audio surveillance.
-
## Publication
+
This repo provides pre-trained models.
-
#### Accepted for ICASSP 2018
+
## Publications
-
* __Title__: Classification vs. Regression in Supervised Learning for Single Channel
+
### 2019: IEEE/ACM Transactions on Audio, Speech, and Language Processing
+
* __Title__: CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning
<img width="360" align="right" alt="screen shot 2017-11-21 at 12 35 28" src="https://user-images.githubusercontent.com/72940/33072095-60d1929c-cebe-11e7-91de-1dff3fc50bde.png">
-
In this work a recurrent neural network was trained to generate speaker count estimates for 0 to 10 speakers. The model uses three Bi-LSTM layers inspired by a model for singing voice separation by [Leglaive15](https://hal.archives-ouvertes.fr/hal-01110035).
+
### 2018: ICASSP
+
* __Title__: Classification vs. Regression in Supervised Learning for Single Channel
@@ -34,22 +39,22 @@ This repository provides the [keras](https://keras.io/) model to be used from Py
[Docker](https://www.docker.com/) makes it easy to reproduce the results and install all requirements. If you have docker installed, run the following steps to predict a count from the provided test sample.
* Build the docker image: `docker build -t countnet .`
-
* Predict from example: `docker run -i countnet python predict_audio.py examples/5_speakers.wav`
+
* Predict from example: `docker run -i countnet python predict.py --model CRNN examples/5_speakers.wav`
### Manual Installation
-
Make sure you have Python 3.6, `libsndfile` and `libhdf5` installed on your system (e.g. through Anaconda). To install the requirements run
+
To install the requirements using Anaconda Python, run
-
`pip install -r requirements.txt`
+
`conda env create -f env.yml`
-
You can now run the command line script and process wav files
+
You can now run the command line script and process wav files using the pre-trained model `CRNN` (best performance).
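To process several wav files in one go, the script can be wrapped in a few lines of Python. This is only a minimal sketch: the `predict.py --model CRNN <file>` invocation is taken from the Docker example and is assumed to work the same way after a manual installation, and the helper names `build_command` and `predict_folder` are hypothetical. The output format of the script is not documented here, so the raw standard output is returned unparsed.

```python
import subprocess
import sys
from pathlib import Path


def build_command(wav_path, model="CRNN", script="predict.py"):
    """Return the argument list for one prediction run.

    Assumption: the script accepts `--model <name> <wav file>`, as in the
    Docker example. `sys.executable` reuses the active Python interpreter.
    """
    return [sys.executable, script, "--model", model, str(wav_path)]


def predict_folder(folder, model="CRNN"):
    """Run the script once per .wav file in `folder`.

    Returns a dict mapping file name to the script's raw stdout.
    """
    results = {}
    for wav_path in sorted(Path(folder).glob("*.wav")):
        completed = subprocess.run(
            build_command(wav_path, model),
            stdout=subprocess.PIPE,
            universal_newlines=True,  # decode stdout as text
            check=True,               # raise if the script exits with an error
        )
        results[wav_path.name] = completed.stdout.strip()
    return results
```

Calling `predict_folder("examples")` from the repository root would then run the model on every wav file in the `examples` directory.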
#### Is it possible to convert the model to run on a modern version of keras with tensorflow backend?
+
Yes, it's possible. But I was unable to get identical results when converting the model. I tried this [guide](https://github.com/keras-team/keras/wiki/Converting-convolution-kernels-from-Theano-to-TensorFlow-and-vice-versa) but it still didn't help to get to the same performance compared to keras 1.2.2 and theano.
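For background, the linked guide rests on one idea: Theano's convolution flips the kernel (true convolution), while TensorFlow computes a cross-correlation, so convolution kernels have to be rotated by 180 degrees when moving weights between backends. The sketch below illustrates that idea on a single 2-D kernel in plain Python. It is not the conversion script, the function names are hypothetical, and it does not explain the remaining performance gap mentioned above.

```python
def flip_kernel(kernel):
    """Rotate a 2-D kernel by 180 degrees (reverse rows and columns)."""
    return [row[::-1] for row in kernel[::-1]]


def cross_correlate(image, kernel):
    """'Valid' cross-correlation: slide the kernel over the image unflipped."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def convolve(image, kernel):
    """'Valid' true convolution: the kernel is indexed in reverse order."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[kh - 1 - a][kw - 1 - b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 2]]

# Convolving with a kernel equals cross-correlating with its flipped version.
same = convolve(image, kernel) == cross_correlate(image, flip_kernel(kernel))
```

With an asymmetric kernel like the one above, using the weights unflipped gives a different result, which is why skipping the flip changes a model's predictions.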