Repository for Finland Swedish Mispronunciation Detection Without L2 Pronunciation Dataset

aalto-speech/FinSwedish

Finland Swedish Mispronunciation Detection

Code for the paper
“Mispronunciation Detection Without L2 Pronunciation Dataset in Low-Resource Setting: A Case Study in Finland Swedish” (Interspeech 2025).


Contents

| File | Purpose |
| --- | --- |
| temperature_scaling.ipynb | Main notebook: temperature scaling and top-k normalization |
| environment.yaml | Exact Conda environment (optional) |

Quick start

The algorithm is simple enough to run without any specific requirements. If you prefer a reproducible setup, you can install the full environment from the YAML file:

# optional: reproducible environment
conda env create -n FinSwe -f environment.yaml
conda activate FinSwe

Then run the notebook temperature_scaling.ipynb. You will still need a wav2vec 2.0 model and an audio file to analyze.
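For orientation, here is a minimal sketch of the two operations the notebook is named after. It is not taken from the notebook: the function names `softmax_with_temperature` and `top_k_normalize`, the example logits, and the reading of "top-k normalization" as "keep the k most probable symbols and renormalize them" are all assumptions made for illustration. See the paper and the notebook for the exact formulation.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by a temperature.

    A temperature above 1 flattens the distribution; below 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_normalize(probs, k):
    """Keep the k largest probabilities and rescale them to sum to 1.

    Returns a dict mapping the original index to the renormalized value.
    (Assumed interpretation of "top-k normalization".)
    """
    if not 1 <= k <= len(probs):
        raise ValueError("k must be between 1 and len(probs)")
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}


# Hypothetical logits for one audio frame over a 4-symbol vocabulary.
frame_logits = [4.0, 2.0, 1.0, -1.0]

sharp = softmax_with_temperature(frame_logits, temperature=1.0)
soft = softmax_with_temperature(frame_logits, temperature=3.0)
top2 = top_k_normalize(soft, k=2)
```

In this sketch, raising the temperature lowers the confidence assigned to the most likely symbol, and the top-k step then redistributes the probability mass among the k strongest candidates only.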


Word lists used in our experiments (Table 2 in the paper)

'fara', 'göra'
fara, göra
/rt/ and /u:/
bort, kors, kort, korta, borta  
telefon, telefonen, telefoner
Others
sju, sjuk, sjuka, stjärna  
sport  
köpa  
kyrka, kyrkan, kyrkas  
kina  
kök  
tjära, tjärn, tjugo, tjugofem  
kjol, kjolar  
tjena, tjejen, tjejet  
domare, domaren  
döma, dömas  
skjorta, skjortan, skjortor  
djur, djuren, djuret  
djup, djupa, djupt, djupare  
djärvare

Citation

Phan, N., Kuronen, M., Kautonen, M., Ullakonoja, R., von Zansen, A., Getman, Y., Voskoboinik, E., Grósz, T., Kurimo, M. (2025). Mispronunciation Detection Without L2 Pronunciation Dataset in Low-Resource Setting: A Case Study in Finland Swedish. Accepted at Interspeech 2025.

@inproceedings{phan25,
  title     = {Mispronunciation Detection Without L2 Pronunciation Dataset in Low-Resource Setting: A Case Study in Finland Swedish},
  author    = {Nhan Phan and Mikko Kuronen and Maria Kautonen and Riikka Ullakonoja and Anna {von Zansen} and Yaroslav Getman and Ekaterina Voskoboinik and Tamás Grósz and Mikko Kurimo},
  booktitle = {Interspeech 2025},
  year      = {2025},
}

License

Our work is shared under the Creative Commons Attribution 4.0 International license (CC-BY-4.0).
