A classification and quantification approach to generate features in soundscape ecology using neural networks
This repo contains an implementation of the loss function proposed in our paper published in Neural Computing and Applications. The paper introduces a custom loss function that combines cross-entropy with quantification to train a simple CNN and a ResNet-50 to classify an audio dataset composed of natural sounds.
To run the code, you will need to install:
Note: the code was run with CUDA 9.0 and cuDNN 7.0.5.15.
We created two Python files. main.py contains the main functions we used: the tested architectures (my_CNN2D and my_ResNet50) and the custom loss (count_CC_loss). We do not claim this is the best way to implement the loss function, but it works.
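To illustrate the idea behind combining cross-entropy with quantification, here is a minimal NumPy sketch. This is not the repo's count_CC_loss implementation (which is in main.py and uses a specific weighting scheme); the `alpha` weight and the L1 prevalence penalty below are illustrative assumptions:

```python
import numpy as np

def quantified_ce_loss(y_true, y_pred, alpha=0.5, eps=1e-7):
    """Sketch: cross-entropy plus a quantification penalty.

    y_true: one-hot labels, shape (batch, classes)
    y_pred: predicted probabilities, shape (batch, classes)
    alpha:  weight of the quantification term (hypothetical knob)
    """
    # Standard categorical cross-entropy, averaged over the batch.
    ce = -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1))
    # Quantification term: compare the true class prevalences in the
    # batch with the prevalences implied by the predictions.
    true_prev = y_true.mean(axis=0)
    pred_prev = y_pred.mean(axis=0)
    quant = np.sum(np.abs(true_prev - pred_prev))
    return ce + alpha * quant
```

The quantification term pushes the model to match class proportions at the batch level, not only per-example correctness, which is the motivation discussed in the paper.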
utils.py contains auxiliary functions used to train and apply the models. We coded a simple generator function, but you can use your own generators.
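A generator in the spirit of the one in utils.py can be sketched as follows. This is an assumption-laden stand-in, not the repo's code: the CSV file name `labels.csv` and the use of `.npy` spectrogram arrays are illustrative, so adapt the loading step to your image format:

```python
import csv
import os
import numpy as np

def simple_generator(source_dir, batch_size, num_labels, csv_name="labels.csv"):
    """Sketch of a batch generator: reads a CSV with columns `file` and
    `label` and yields (batch_of_spectrograms, one_hot_labels) forever.
    """
    with open(os.path.join(source_dir, csv_name), newline="") as f:
        rows = [(r["file"], int(r["label"])) for r in csv.DictReader(f)]
    while True:
        np.random.shuffle(rows)  # reshuffle at each epoch
        for i in range(0, len(rows) - batch_size + 1, batch_size):
            batch = rows[i:i + batch_size]
            # Assumed format: each spectrogram stored as a .npy array.
            x = np.stack([np.load(os.path.join(source_dir, name))
                          for name, _ in batch])
            y = np.eye(num_labels)[[label for _, label in batch]]
            yield x, y
```

A generator like this can be passed directly to Keras's `fit`/`fit_generator`, with `steps_per_epoch = len(rows) // batch_size`.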
The main code has the following parameters:
-a Action to be executed: train a model (train) or apply a pretrained model (apply)
-s Source directory to load spectrogram images.
-t Target directory to save/load model. Default = current directory
-m Model index: 6 for the CNN, 18 for the ResNet-50
-l Number of labels the model needs to classify
-e Number of epochs. Default = 100
-b Batch size used for training, validation, and test. Default = 80
-quant [value] Use the custom loss function with quantification: no value after the parameter (or 1) selects the first weighting case, 2 the second weighting case, and 3 the third weighting case
-eval Generate model evaluation. Default = False

To train with your dataset, the generator looks for two directories inside the source path: train and validation. Inside these directories, put your spectrogram images and a CSV file with two columns, file and label, describing your spectrograms. The data directory has examples of these files and this structure.
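The expected CSV layout looks like the following (the file names and label values here are illustrative; see the data directory for real examples):

```
file,label
spectrogram_0001.png,0
spectrogram_0002.png,3
spectrogram_0003.png,3
```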
To apply our pre-trained models to your data, set the target parameter -t to a directory containing a model (the models directory has a link to Google Drive with our models) and set the source parameter -s to your spectrograms directory. The code will generate a CSV with the predicted labels in the current path, considering the 12-class case (described in the paper).
If you want other class scenarios (bird-class, anuran-class, or 2-class), just point to the specific model (see our models directory) and uncomment your desired case in the function utils.decode_labels.
Additionally, you can use a ground truth to evaluate the models. Put a CSV file with two columns (file and expected label) inside the source path and pass the -eval parameter.
Example calls:

```
python main.py -a apply -l 12 -m 6 -b 80 -quant -eval -s /home/user/Desktop/data/test/ -t /home/user/Desktop/model
python main.py -a train -l 12 -m 18 -e 50 -b 30 -quant -s /home/user/Desktop/data/ -t /home/user/Desktop/model
```

The complete database was collected by the LEEC lab, and subsets of it were used in other papers, such as [1], [2], and [3]. Our subset is labeled with animal species and will be available on the lab website as soon as possible.
- Fábio Felix Dias - e-mail: f_diasfabio@usp.br
- Moacir Antonelli Ponti - e-mail: moacir@icmc.usp.br
- Rosane Minghim - e-mail: rosane.minghim@ucc.ie