2nd place solution for the ID R&D Voice Antispoofing Challenge.
Binary classification of sound recordings: the task was to distinguish human voice from spoofed speech.
All data files (human and spoof) are expected to be in the same folder, `data/train`.
Main files:
- `src/prepare_metadata.py` - prepare a CSV file with metadata
- `src/train.py` - train all models
- `src/predict.py` - prediction on the test set (not reproducible; it should be run on the competition platform)
Trained models go to `models/`, the log file goes to `logs/`.
- Blending (a simple average) of 5 models, each trained with different preprocessing.
- Preprocessing: mel-spectrograms and constant-Q transforms computed with several different parameter sets.
- Small and simple 2D and 1D CNN models: several convolutional blocks with batch normalization, followed by global max pooling and a dense layer (see the model sketch below).
- There was a 100 MB limit on solution size, so using custom lightweight models made it possible to blend more models (and to run cross-validation and blend the models trained on several folds).
- All spectrograms are prepared in advance (not on the fly), and only then are the models trained; this is especially useful for the constant-Q transform, which is quite slow (see the preprocessing sketch after this list).
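
As an illustration of the precomputation step, here is a minimal sketch using librosa. The cache folder, file layout, and all parameter values (`sr`, `n_fft`, `n_mels`, `n_bins`, ...) are assumptions for illustration only; the actual values are defined in `src/config.yaml`.

```python
# Minimal sketch of precomputing spectrograms in advance.
# All parameter values here are assumptions; the real ones live in src/config.yaml.
from pathlib import Path

import librosa
import numpy as np

DATA_DIR = Path("data/train")           # assumed location of the audio files
CACHE_DIR = Path("data/spectrograms")   # hypothetical cache folder
CACHE_DIR.mkdir(parents=True, exist_ok=True)

def precompute(path: Path, sr: int = 16000) -> None:
    y, sr = librosa.load(path, sr=sr)

    # Log-mel spectrogram (fast to compute).
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                         hop_length=512, n_mels=128)
    mel_db = librosa.power_to_db(mel, ref=np.max)

    # Constant-Q transform (slow, hence the caching).
    cqt = np.abs(librosa.cqt(y=y, sr=sr, hop_length=512, n_bins=84))
    cqt_db = librosa.amplitude_to_db(cqt, ref=np.max)

    np.save(CACHE_DIR / f"{path.stem}_mel.npy", mel_db)
    np.save(CACHE_DIR / f"{path.stem}_cqt.npy", cqt_db)

for wav in sorted(DATA_DIR.glob("*.wav")):
    precompute(wav)
```

Caching to `.npy` files means the expensive transforms run once, and every training epoch (and every fold) only pays the cost of loading arrays from disk.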
All parameters of the trained models and of the data preprocessing are in `src/config.yaml`.
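
For concreteness, a 2D CNN of the kind described above might look like the following. This is only a sketch: the framework (PyTorch) and all layer sizes are assumptions, not the repository's exact architecture, whose parameters come from `src/config.yaml`.

```python
# Hypothetical sketch of a small 2D CNN: convolutional blocks with batch
# normalization, global max pooling, and a single dense layer at the end.
# Channel counts and block count are assumptions.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, in_channels: int = 1, n_blocks: int = 4):
        super().__init__()
        blocks, channels = [], in_channels
        for i in range(n_blocks):
            out_channels = 32 * 2 ** i  # 32, 64, 128, 256
            blocks += [
                nn.Conv2d(channels, out_channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            ]
            channels = out_channels
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveMaxPool2d(1)  # global max pooling
        self.head = nn.Linear(channels, 1)   # single logit: human vs. spoof

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(self.features(x)).flatten(1)
        return self.head(x)

# Example: a batch of 8 single-channel spectrograms, 128 bins x 256 frames.
logits = SmallCNN()(torch.randn(8, 1, 128, 256))
```

At inference time, the "simple average" blending described above would amount to averaging the probability outputs of all trained models, e.g. `np.mean(np.stack(per_model_probs), axis=0)` (where `per_model_probs` is a hypothetical list of per-model prediction arrays).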