This repository collects resources for developing and testing speech detection, recognition, and analysis tools for the Dutch language, including Flemish. These include:
| Title | Description | Size | Link |
|---|---|---|---|
| **Wav2Vec2** | | | | |
| XLSR Wav2Vec2 Dutch by Jonatas Grosman | "Fine-tuned facebook/wav2vec2-large-xlsr-53 on Dutch using the train and validation splits of Common Voice 6.1 and CSS10." | - | HuggingFace | |
| Dutch XLSR Wav2Vec2 Large 53 by Wietse de Vries | "Fine-tuned facebook/wav2vec2-large-xlsr-53 on Dutch using the Common Voice dataset. When using this model, make sure that your speech input is sampled at 16kHz." | - | HuggingFace | |
| wav2vec2-large-xls-r-300m-nl | "This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset." | - | HuggingFace | |
| Wav2Vec2-Large-XLSR-53-Dutch | "Fine-tuned facebook/wav2vec2-large-xlsr-53 on Dutch using the Common Voice. When using this model, make sure that your speech input is sampled at 16kHz." | - | HuggingFace | |
| wav2vec2-large-xlsr-53-Dutch by Mehdi Hosseini Moghadam | "Fine-tuned facebook/wav2vec2-large-xlsr-53 in Dutch using the Common Voice. When using this model, make sure that your speech input is sampled at 16kHz." | - | HuggingFace | |
| simonsr wav2vec2-large-xlsr-dutch | "Fine-tuned facebook/wav2vec2-large-xlsr-53 on Dutch using the Common Voice. When using this model, make sure that your speech input is sampled at 16kHz." | - | HuggingFace | |
| facebook wav2vec2 large xlsr-53-dutch model | "The model facebook wav2vec2 large xlsr-53-dutch is a Natural Language Processing (NLP) Model implemented in Transformer library, generally using the Python programming language." | - | HuggingFace | |
| GroNLP/wav2vec2-dutch-large-ft-cgn | "A Dutch Wav2Vec2 model. This model is created by further pre-training the original English facebook/wav2vec2-large model on Dutch speech from Het Corpus Gesproken Nederlands. Subsequently, the model is fine-tuned on the same Dutch speech using CTC." | - | HuggingFace | |
| Wav2Vec2-Large-XLSR-53-ft-CGN | "This model is created by fine-tuning the facebook/wav2vec2-large-xlsr-53 model on Dutch speech from Het Corpus Gesproken Nederlands using CTC." | - | HuggingFace | |
| **openai** | | | | |
| openai/whisper-large | "The Whisper model was proposed in Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever." | - | HuggingFace | |
| **s2t** | | | | |
| facebook/s2t-medium-mustc-multilingual-st | "s2t-medium-mustc-multilingual-st is a Speech to Text Transformer (S2T) model trained for end-to-end Multilingual Speech Translation (ST)." | - | HuggingFace | |
| **speechbrain** | | | | |
| speechbrain/lang-id-commonlanguage_ecapa | "This repository provides all the necessary tools to perform language identification from speech recordings with SpeechBrain. The system uses a model pretrained on the CommonLanguage dataset (45 languages)." | - | HuggingFace | |
| Coming soon... |

| Title | Description | Size | Link |
|---|---|---|---|
| Common Voice NL | "The Common Voice dataset consists of a unique MP3 and corresponding text file. Many of the 9,283 recorded hours in the dataset also include demographic metadata like age, sex, and accent that can help train the accuracy of speech recognition engines. The dataset currently consists of 7,335 validated hours in 60 languages" | 63 hours | HuggingFace | |
| LibriVox | "Free public domain audiobooks" | 295 books | LibriVox | |
| MuST-C | "Created by Di Gangi et al. in 2019, the MuST-C dataset is a speech translation corpus containing 385 hours of TED talks for speech translation from English into several languages: Dutch, French, German, Italian, Portuguese, Romanian, Russian, and Spanish." Access requires filling in a request form. | 385 hours | Fondazione Bruno Kessler | |
| dutch-vl-tts | "This dataset contains 15,000 audio fragments of a male Flemish Dutch voice; the sentences read are extracted from the Mozilla Common Voice project." | 15,000 audio recordings | GitHub | |
| Corpus Gesproken Nederlands | "Between 1998 and 2004, the Corpus Gesproken Nederlands (CGN) project built a database of contemporary Dutch as spoken by adults in the Netherlands and Flanders. The results of this project became available in March 2004." | - | CGN | |
| IFA Spoken Language Corpus | "The IFA Spoken Language corpus is a free (GPL) database of hand-segmented Dutch speech. It was constructed with off-the-shelf software using speech from 8 speakers in a variety of speaking styles. For a total of 50,000 words (41 minutes/speaker), speech acquisition and preparation took around 3 person-weeks per speaker." | 4 hours | IFA | |
| CSS10 | "CSS10 is a collection of single speaker speech datasets for 10 languages. Each of them consists of audio files recorded by a single volunteer and their aligned text sourced from LibriVox." | - | Kaggle | |
| Spoken Wikipedia Corpus (Dutch) | "The Spoken Wikipedia project unites volunteer readers of Wikipedia articles. Hundreds of spoken articles in multiple languages are available to users who are – for one reason or another – unable or unwilling to consume the written version of the article." | - | Kaggle | |
| Corpus Gesproken Nederlands (CGN) | "The Corpus Gesproken Nederlands (CGN) is a collection of 900 hours (almost 9 million words) of contemporary Dutch speech from speakers in Flanders and the Netherlands." | 900 hours | Instituut voor de Nederlandse Taal | |
| Coming soon... |

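Several of the Wav2Vec2 model descriptions in the first table ask for speech input sampled at 16 kHz. The snippet below is a minimal sketch of how a recording could be checked before it is passed to such a model. It assumes the audio is an uncompressed WAV file and uses only Python's standard `wave` module; the function name `check_wav_sample_rate` and the constant `TARGET_SAMPLE_RATE` are hypothetical and not part of this repository or of any model's API.

```python
import wave

# Sample rate requested in several of the model descriptions (assumption:
# the recording is an uncompressed WAV file readable by the `wave` module).
TARGET_SAMPLE_RATE = 16_000


def check_wav_sample_rate(path, target_rate=TARGET_SAMPLE_RATE):
    """Hypothetical helper: return (matches, actual_rate) for a WAV file.

    `matches` is True when the file's sample rate equals `target_rate`.
    """
    with wave.open(path, "rb") as wav_file:
        actual_rate = wav_file.getframerate()
    return actual_rate == target_rate, actual_rate


def describe(path):
    """Return a short human-readable message about the file's sample rate."""
    matches, actual_rate = check_wav_sample_rate(path)
    if matches:
        return f"{path}: {actual_rate} Hz, matches the expected rate"
    return f"{path}: {actual_rate} Hz, resample to {TARGET_SAMPLE_RATE} Hz first"
```

This sketch only detects a mismatch. Recordings at another rate, or in a compressed format such as MP3, still need to be decoded and resampled with an audio library of your choice before inference.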