language:
- nl
Dutch (Netherlands) Entities Scripted Monologue Smartphone speech dataset. It covers several entity domains, including person, phone number, address, alphanumeric sequence, email, product model, product serial number, and money, and mirrors real-world interactions. The recordings are transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers, which helps improve model performance on real and complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. All of our datasets are GDPR, CCPA, and PIPL compliant.
For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog?source=Github
16 kHz, 16-bit, WAV, mono channel
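As a quick sanity check on downloaded audio, the stated format can be verified with Python's standard `wave` module. This is a minimal sketch, not part of the dataset tooling: the function name `matches_spec` and the file path in the usage note are hypothetical, and it assumes the files are uncompressed PCM WAV.

```python
import wave


def matches_spec(path, rate=16000, sample_width_bytes=2, channels=1):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono.

    Hypothetical helper: the defaults mirror the format listed above.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == rate                   # 16 kHz sample rate
            and wav.getsampwidth() == sample_width_bytes  # 2 bytes = 16 bit
            and wav.getnchannels() == channels            # mono
        )
```

For example, `matches_spec("sample.wav")` (a placeholder path) returns `False` for a stereo or 44.1 kHz file, which makes it easy to flag files that need resampling before training.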
Quiet indoor environment; normal environment (contains noise that does not affect recognition)
Speakers read and record the given texts. Each text contains at least one type of specified entity: person, phone number, address, alphanumeric sequence, email, product model, product serial number, or money.
Netherlands (NLD)
nl-NL
Dutch
Word Accuracy Rate (WAR) of 98%. Punctuation, tags, and non-speech annotations are subjective, so they are excluded from the accuracy statistics.
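The exact scoring procedure behind the accuracy figure is not described here. A common way to compute word accuracy is one minus the word-level edit distance divided by the number of reference words, and the sketch below follows that definition as an assumption. The function names and the normalization (lowercasing and dropping ASCII punctuation, in line with punctuation being excluded from the statistics) are illustrative choices, not the vendor's method.

```python
import string


def normalize(text):
    """Lowercase, drop ASCII punctuation, split into words (assumed normalization)."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def word_accuracy(reference, hypothesis):
    """Return 1 - (word-level edit distance / number of reference words)."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr

    return 1.0 - prev[-1] / len(ref)
```

With this definition, one substituted word in a four-word reference gives an accuracy of 0.75. Note that the value can drop below zero when the hypothesis contains many inserted words.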
Android phone, iPhone
Commercial License