By ASVspoof2021 challenge organizers
With the release of the full set of keys and meta-labels (see ASVspoof.org), we provide this updated evaluation package to compute min t-DCF and EER.
Compared with the previous evaluation package (archived-package-stage-1), this evaluation package
- downloads and uses the full set of keys and meta-labels,
- computes not only pooled but also decomposed min t-DCFs and EERs on specified conditions,
- allows the users to provide their own ASV scores.
Users are encouraged to use this evaluation package rather than package-stage-1.
| Track | Link | MD5 |
|---|---|---|
| LA | https://www.asvspoof.org/asvspoof2021/LA-keys-full.tar.gz | |
| PA | https://www.asvspoof.org/asvspoof2021/PA-keys-full.tar.gz | a639ea472cf4fb564a62fbc7383c24cf |
| DF | https://www.asvspoof.org/asvspoof2021/DF-keys-full.tar.gz | dabbc5628de4fcef53036c99ac7ab93a |
(LA package is updated to remove an unnecessary file called trial_list.txt, 2023/04/13)
You can download them manually.
On a Linux system, you may also use download.sh to download them.
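After a manual download, the MD5 values in the table can be checked with a few lines of Python. The following is only a minimal sketch: the helper names `md5_of_file` and `check_packages` are made up for illustration, and it assumes the archives sit in the current directory under their original file names. No MD5 is listed for the LA package, so it is not checked.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# MD5 values copied from the table (none is listed there for LA).
EXPECTED_MD5 = {
    "PA-keys-full.tar.gz": "a639ea472cf4fb564a62fbc7383c24cf",
    "DF-keys-full.tar.gz": "dabbc5628de4fcef53036c99ac7ab93a",
}


def check_packages(directory="."):
    """Map each package name to True/False (checksum match) or None (file missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, status in check_packages().items():
        print(f"{name}: {labels[status]}")
```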
The key and meta-label file is in text format; each line contains the key and meta-labels for one trial. Details of the meta-labels are explained in the ASVspoof 2021 long summary paper.
We also provide a short explanation at the end of this page.
You can use the Python scripts to compute EERs and min t-DCFs.
pip install numpy
pip install pandas
pip install matplotlib
Either use
bash download.sh
or manually download and untar them.
A directory called ./keys will be available. It contains:
keys
|- LA                               # Files for LA track
|  |- CM
|  |   |- trial_metadata.txt        # CM protocol with keys and meta-labels
|  |   |- LFCC-GMM
|  |   |    |- score.txt            # Score file from a baseline LFCC-GMM
|  |   |- ...
|  |
|  |- ASV
|      |- trial_metadata.txt        # ASV protocol with keys and meta-labels
|      |- ASVtorch_kaldi
|           |- score.txt            # Score file from the ASV system
|- DF ...
|- PA ...
A help message can be found by
python main.py --help
Here are some example use cases. Let's assume we have a CM score file score.txt for LA, and we want to get the results on the eval subset (i.e., the evaluation subset, which is disjoint from the progress and hidden subsets).
Compute results using pre-computed t-DCF C012 coefficients provided by the organizers:
python main.py --cm-score-file score.txt --track LA --subset eval
Recompute C012 using official ASV scores, save it to ./LA-c012.npy, and use the C012 coefficients to compute min t-DCFs:
python main.py --cm-score-file score.txt --track LA --subset eval --recompute-c012 --c012-path ./LA-c012.npy
Recompute C012 using my own ASV scores, save it to ./LA-c012.npy, and use the new C012 to compute min t-DCFs:
python main.py --cm-score-file score.txt --track LA --subset eval --recompute-c012 --c012-path ./LA-c012.npy --asv-score-file ./asv-score.txt
Compute min t-DCF using my own pre-computed C012 coefficients ./LA-c012.npy:
python main.py --cm-score-file score.txt --track LA --subset eval --c012-path ./LA-c012.npy
You may play with the code using baseline CM score files.
They are available in the downloaded key and meta-label file packages:
ls keys/*/CM/*/score.txt
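To try every baseline in one go, the same file pattern can be turned into a list of `main.py` invocations. This is a rough sketch, not part of the package: the helper `baseline_commands` is hypothetical, it assumes the `keys/<track>/CM/<system>/score.txt` layout shown earlier, and it uses only the `--cm-score-file`, `--track`, and `--subset` options from the examples. It prints the commands rather than running them.

```python
import glob
import os


def baseline_commands(keys_dir="keys", subset="eval"):
    """Build one main.py command (as an argument list) per baseline CM score file."""
    commands = []
    pattern = os.path.join(keys_dir, "*", "CM", "*", "score.txt")
    for score_path in sorted(glob.glob(pattern)):
        # keys/<track>/CM/<system>/score.txt: the track is the first directory below keys_dir.
        track = os.path.relpath(score_path, keys_dir).split(os.sep)[0]
        commands.append([
            "python", "main.py",
            "--cm-score-file", score_path,
            "--track", track,
            "--subset", subset,
        ])
    return commands


if __name__ == "__main__":
    for command in baseline_commands():
        print(" ".join(command))
```

Each argument list could be passed to `subprocess.run` if the commands should be executed directly.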
Based on the Python scripts, this interactive notebook shows the details of min t-DCF and EER computation.
It also includes an API that lets users upload a score file and get the min t-DCF and EER tables.
You can open it directly in Google Colab by clicking the badge.
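For readers who want a feel for the EER before opening the notebook, here is a small standard-library sketch. It is not the package's implementation and does not cover min t-DCF. It assumes that higher CM scores mean "more likely bona fide" and that bona fide and spoof scores have already been separated into two lists; the function name `compute_eer` is chosen for illustration.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold) at the operating point where the miss and
    false-alarm rates are closest. Assumes higher score = more bona fide."""
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    trials = [(score, True) for score in bonafide_scores]
    trials += [(score, False) for score in spoof_scores]
    trials.sort(key=lambda trial: trial[0])

    n_bona, n_spoof = len(bonafide_scores), len(spoof_scores)
    rejected_bona = 0
    rejected_spoof = 0

    # Start with a threshold below every score: nothing is rejected,
    # so the miss rate is 0 and the false-alarm rate is 1.
    best_gap, best_eer, best_threshold = 1.0, 0.5, trials[0][0] - 1e-3

    # Sweep upward; after each step, all trials seen so far count as rejected.
    # Tied scores are stepped through one at a time in this simple version.
    for score, is_bonafide in trials:
        if is_bonafide:
            rejected_bona += 1
        else:
            rejected_spoof += 1
        miss_rate = rejected_bona / n_bona
        false_alarm_rate = (n_spoof - rejected_spoof) / n_spoof
        gap = abs(miss_rate - false_alarm_rate)
        if gap < best_gap:
            best_gap = gap
            best_eer = (miss_rate + false_alarm_rate) / 2
            best_threshold = score
    return best_eer, best_threshold


# Toy scores, invented only to exercise the function.
eer, threshold = compute_eer([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])
print(eer)  # 0.25
```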
Here we briefly explain the meanings of meta-labels, using the first line in LA/CM/trial_metadata.txt, PA/CM/trial_metadata.txt, and DF/CM/trial_metadata.txt.
LA_0009 LA_E_9332881 alaw ita_tx A07 spoof notrim eval
- LA_0009: speaker ID
- LA_E_9332881: trial ID
- alaw: name of codec. It can be:
  - none: LA-C1
  - alaw: LA-C2
  - pstn: LA-C3
  - g722: LA-C4
  - ulaw: LA-C5
  - gsm: LA-C6
  - opus: LA-C7
- ita_tx: name of transmission condition. It can be:
  - ita_tx: FR-IT, transmission between France and Italy
  - sin_tx: FR-SG, transmission between France and Singapore
  - loc_tx: local transmission
  - mad_tx: transmission through PSTN to Spain
- A07: name of spoofing attack. It can be:
  - A07 - A19, which are defined in the ASVspoof 2019 LA database
- spoof: key. It can be:
  - bonafide: bona fide
  - spoof: spoof
- notrim: whether non-speech frames are trimmed. It can be:
  - notrim: not trimmed
  - trim: trimmed
- eval: name of subset. It can be:
  - eval: evaluation subset
  - progress: progress subset
  - hidden: hidden subset (where non-speech frames are trimmed)
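The LA fields above can be read with a plain whitespace split. The sketch below is only illustrative: the field names in `LATrial` and the helper `parse_la_line` are labels invented here, following the column order of the example line, and are not names used by the evaluation package.

```python
from collections import namedtuple

# Field names are illustrative, in the column order of the example line.
LATrial = namedtuple(
    "LATrial",
    "speaker_id trial_id codec transmission attack key trim subset",
)

# Codec name -> condition ID, as listed for the LA track.
LA_CODEC_CONDITION = {
    "none": "LA-C1", "alaw": "LA-C2", "pstn": "LA-C3", "g722": "LA-C4",
    "ulaw": "LA-C5", "gsm": "LA-C6", "opus": "LA-C7",
}


def parse_la_line(line):
    """Split one line of LA/CM/trial_metadata.txt into named fields."""
    fields = line.split()
    if len(fields) != len(LATrial._fields):
        raise ValueError(f"expected {len(LATrial._fields)} fields, got {len(fields)}")
    return LATrial(*fields)


trial = parse_la_line("LA_0009 LA_E_9332881 alaw ita_tx A07 spoof notrim eval")
print(trial.trial_id, LA_CODEC_CONDITION[trial.codec], trial.key)
# LA_E_9332881 LA-C2 spoof
```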
PA_0010 PA_E_1000001 R3 M3 d4 r1 m1 s4 c4 spoof notrim eval
- PA_0010: speaker ID
- PA_E_1000001: trial ID
- Environment factors:
  - R1 - R9: ASV Room IDs
  - M1 - M3: ASV microphone IDs
  - D1 - D6: Talker-to-ASV distances
- Attacker factors:
  - r1 - r9: Attacker Room IDs
  - m1 - m3: Attacker microphone IDs
  - c2 - c4: Attacker to talker distances
  - s2 - s4: Attacker replay device IDs
  - d1 - d6: Attacker-replay-device-to-ASV distances
- spoof: key. It can be:
  - bonafide: bona fide
  - spoof: spoof
- notrim: whether non-speech frames are trimmed. It can be:
  - notrim: not trimmed
  - trim: trimmed
- eval: name of subset. It can be:
  - eval: evaluation subset
  - progress: progress subset
  - hidden: hidden subsets
Note that hidden subsets contain:
- notrim hidden: hidden subset 1, which contains simulated trials without trimming
- trim hidden: hidden subset 2, which contains real-replayed but trimmed trials
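The two PA hidden subsets can only be told apart by combining the trim flag with the subset name. A possible way to do that is sketched below. The helpers `parse_pa_line` and `hidden_subset_number` are hypothetical, and the sketch assumes that the last three fields are always key, trim flag, and subset, with the factor labels in between.

```python
def parse_pa_line(line):
    """Split one PA CM metadata line into IDs, factor labels, key, trim flag, and subset."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"too few fields: {line!r}")
    key, trim, subset = fields[-3:]
    # Factor labels sit between the two IDs and the last three fields;
    # they are indexed here by their leading letter (R, M, d, r, m, s, c, ...).
    factors = {label[0]: label for label in fields[2:-3]}
    return {
        "speaker_id": fields[0],
        "trial_id": fields[1],
        "factors": factors,
        "key": key,
        "trim": trim,
        "subset": subset,
    }


def hidden_subset_number(trim, subset):
    """Return 1 or 2 for the PA hidden subsets, or None for eval/progress trials."""
    if subset != "hidden":
        return None
    if trim == "notrim":
        return 1  # simulated trials without trimming
    if trim == "trim":
        return 2  # real-replayed but trimmed trials
    raise ValueError(f"unexpected trim flag: {trim!r}")


trial = parse_pa_line("PA_0010 PA_E_1000001 R3 M3 d4 r1 m1 s4 c4 spoof notrim eval")
print(trial["factors"]["s"], hidden_subset_number(trial["trim"], trial["subset"]))
# s4 None
```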
Note that, compared with the key file released previously, these PA meta-labels are slightly updated:
- Old notation:
  - d2 - d4: Attacker to talker distances
  - D1 - D6: Attacker-replay-device-to-ASV distances
- New notation in the full set of keys and meta-labels:
  - c2 - c4: Attacker to talker distances
  - d1 - d6: Attacker-replay-device-to-ASV distances
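If labels taken from the older key file need to be compared with the new ones, the renaming can be applied label by label. The function below is a hypothetical helper, not part of the package, and it should only be given labels already known to be attacker factors, since this page does not describe how the old file marked environment distances. Each label is converted in a single step, so an old `D4` becomes `d4` and is not then turned into `c4`.

```python
def update_attacker_label(label):
    """Convert one attacker-factor label from the old PA notation to the new one."""
    prefix, index = label[:1], label[1:]
    if prefix == "d":
        # Old attacker-to-talker distance (d2 - d4) is now c2 - c4.
        return "c" + index
    if prefix == "D":
        # Old attacker-replay-device-to-ASV distance (D1 - D6) is now d1 - d6.
        return "d" + index
    return label  # other attacker labels (r, m, s) are unchanged


print([update_attacker_label(label) for label in ["r1", "m1", "s4", "d4", "D2"]])
# ['r1', 'm1', 's4', 'c4', 'd2']
```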
LA_0023 DF_E_2000011 nocodec asvspoof A14 spoof notrim progress traditional_vocoder - - - -
- LA_0023: speaker ID
- DF_E_2000011: trial ID
- nocodec: name of codec for compression. It can be:
  - nocodec: DF-C1
  - low_mp3: DF-C2
  - high_mp3: DF-C3
  - low_m4a: DF-C4
  - high_m4a: DF-C5
  - low_ogg: DF-C6
  - high_ogg: DF-C7
  - mp3m4a: DF-C8
  - oggm4a: DF-C9
- asvspoof: source of data. It can be:
  - asvspoof: from ASVspoof 2019
  - vcc2018: from VCC 2018
  - vcc2020: from VCC 2020
- A14: name of spoofing attack
  - A07 - A19 are defined in the ASVspoof 2019 LA database
- spoof: key
  - bonafide: bona fide
  - spoof: spoof
- notrim: whether non-speech frames are trimmed
  - notrim: not trimmed
  - trim: trimmed
- progress: name of subset, which can be
  - eval: evaluation subset
  - progress: progress subset
  - hidden: hidden subset (where non-speech frames are trimmed)
- traditional_vocoder: type of vocoder
  - bonafide: this is a bona fide trial
  - neural_vocoder_autoregressive: spoofed trial using neural AR vocoder
  - neural_vocoder_nonautoregressive: spoofed trial using neural non-AR vocoder
  - traditional_vocoder: spoofed trial using traditional DSP-based vocoder
  - unknown: spoofed trial with an unknown/unannotated vocoder
  - waveform_concatenation: spoofed trial by waveform concatenation
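A DF line can be read in the same spirit. In this sketch the field names and the helper `parse_df_line` are illustrative choices rather than names from the package, and the trailing fields that are not explained on this page (the four `-` entries in the example) are kept untouched in an `extra` list instead of being interpreted.

```python
# Codec name -> condition ID, as listed for the DF track.
DF_CODEC_CONDITION = {
    "nocodec": "DF-C1", "low_mp3": "DF-C2", "high_mp3": "DF-C3",
    "low_m4a": "DF-C4", "high_m4a": "DF-C5", "low_ogg": "DF-C6",
    "high_ogg": "DF-C7", "mp3m4a": "DF-C8", "oggm4a": "DF-C9",
}

# Illustrative names for the nine fields explained on this page, in column order.
DF_FIELDS = (
    "speaker_id", "trial_id", "codec", "source", "attack",
    "key", "trim", "subset", "vocoder",
)


def parse_df_line(line):
    """Split one line of DF/CM/trial_metadata.txt into a dict of named fields."""
    fields = line.split()
    if len(fields) < len(DF_FIELDS):
        raise ValueError(f"expected at least {len(DF_FIELDS)} fields, got {len(fields)}")
    trial = dict(zip(DF_FIELDS, fields))
    # Remaining columns are not documented here, so they are passed through as-is.
    trial["extra"] = fields[len(DF_FIELDS):]
    return trial


trial = parse_df_line(
    "LA_0023 DF_E_2000011 nocodec asvspoof A14 spoof notrim progress traditional_vocoder - - - -"
)
print(DF_CODEC_CONDITION[trial["codec"]], trial["subset"], trial["vocoder"])
# DF-C1 progress traditional_vocoder
```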
End