This project addresses the ESC-50 environmental sound classification challenge. The ESC-50 dataset is a labeled collection of 2000 environmental audio recordings intended for benchmarking environmental sound classification methods. Each recording is 5 seconds long, and the recordings are organized into 50 semantic classes (40 examples per class), loosely arranged into 5 major categories.
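As an illustration of how the dataset is organized, the sketch below parses a clip's metadata from its file name. It assumes the naming convention of the public ESC-50 release, `{fold}-{clip_id}-{take}-{target}.wav`. The `ClipInfo` type and `parse_esc50_filename` helper are hypothetical names introduced here and are not part of this repository.

```python
from typing import NamedTuple


class ClipInfo(NamedTuple):
    """Metadata encoded in an ESC-50 file name (hypothetical helper type)."""
    fold: int      # cross-validation fold, 1..5
    clip_id: str   # ID of the source recording
    take: str      # letter distinguishing fragments of the same source clip
    target: int    # class index, 0..49


def parse_esc50_filename(name: str) -> ClipInfo:
    # Assumed pattern: "{fold}-{clip_id}-{take}-{target}.wav"
    stem = name.rsplit(".", 1)[0]
    fold, clip_id, take, target = stem.split("-")
    return ClipInfo(int(fold), clip_id, take, int(target))


info = parse_esc50_filename("1-100032-A-0.wav")
print(info.fold, info.target)
```

Reading the fold from the file name (or from the dataset's metadata file) keeps fragments of the same source recording in the same fold, which is why a predefined split is generally preferable to a random one.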
- Data Generation: The `dataset_ESC50.py` file is used to generate the data.
- Training Pipeline: The `Train_crossval.py` file provides the training pipeline, in which 5-fold cross-validation training is conducted.
- Model Storage: The trained models are stored in the `result` folder.
- Testing: The `test_crossval.py` file tests all 5 folds and calculates a mean accuracy.
- Results: The average accuracy of the model is 83.2%.
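
The 5-fold procedure described above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in `Train_crossval.py` or `test_crossval.py`. The names `fold_splits`, `accuracy`, `cross_validate`, and the `train_and_predict` callback are assumptions made for the example. In each round one fold is held out for testing, the remaining four are used for training, and the per-fold accuracies are averaged.

```python
from statistics import mean


def fold_splits(folds, n_folds=5):
    """Yield (held_out_fold, train_idx, test_idx) for folds 1..n_folds."""
    for k in range(1, n_folds + 1):
        train_idx = [i for i, f in enumerate(folds) if f != k]
        test_idx = [i for i, f in enumerate(folds) if f == k]
        yield k, train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cross_validate(folds, labels, train_and_predict, n_folds=5):
    """Run one train/test round per fold; return per-fold scores and their mean.

    train_and_predict(train_idx, test_idx) is a hypothetical callback that
    trains a fresh model on train_idx and returns predictions for test_idx.
    """
    scores = []
    for _, train_idx, test_idx in fold_splits(folds, n_folds):
        y_pred = train_and_predict(train_idx, test_idx)
        y_true = [labels[i] for i in test_idx]
        scores.append(accuracy(y_true, y_pred))
    return scores, mean(scores)


# Toy usage: 20 clips, 5 folds, and a dummy "model" that always predicts class 0.
toy_folds = [1, 2, 3, 4, 5] * 4
toy_labels = [i % 2 for i in range(20)]
toy_scores, toy_mean = cross_validate(
    toy_folds, toy_labels, lambda train_idx, test_idx: [0] * len(test_idx)
)
print(toy_scores, toy_mean)
```

In a real run, the callback would train a network on the four training folds and evaluate it on the held-out fold. The reported figure is then the mean of the five held-out accuracies.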

Dependencies:
- PyTorch (`torch`)
- scikit-learn (`sklearn`)
- librosa