Phonetic Lexicon for German ASR

This repository contains scripts for building a phonetic lexicon of German words, along with scripts for training a Sequitur G2P (grapheme-to-phoneme) model. If you want to use the lexicon or the G2P model, please check the licenses of the data sources listed in the table below.

Lexicon

The lexicon is built from two data sources.

Name	Num. Words	URL
MaryTTS	26199	https://github.com/marytts/marytts-lexicon-de
Wiktionary	501179	https://dumps.wikimedia.org/dewiktionary

The final lexicon combines MaryTTS and Wiktionary and uses the MaryTTS phone set (SAMPA). Because stress marks and glottal stops differ considerably between the two data sources, both are omitted. The final lexicon is available in the releases.
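To illustrate the merging step, here is a minimal sketch and not the repository's actual script. It assumes each lexicon is a mapping from a word to a list of phone sequences, that stress is marked with `'` or `,`, and that the glottal stop is written `?`. The function names and sample entries are hypothetical; check the marker symbols against the actual phone set before relying on them.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed marker symbols (not confirmed by the repository).
STRESS_MARKERS = {"'", ","}
GLOTTAL_STOP = "?"

Pron = Tuple[str, ...]


def normalize(phones: Iterable[str]) -> Pron:
    """Drop stress markers and glottal stops from a phone sequence."""
    return tuple(
        p for p in phones if p not in STRESS_MARKERS and p != GLOTTAL_STOP
    )


def merge_lexicons(*lexicons: Dict[str, List[List[str]]]) -> Dict[str, List[Pron]]:
    """Merge lexicons, keeping each normalized pronunciation once per word."""
    merged: Dict[str, List[Pron]] = {}
    for lexicon in lexicons:
        for word, prons in lexicon.items():
            entries = merged.setdefault(word, [])
            for pron in prons:
                norm = normalize(pron)
                # Skip empty results and pronunciations already recorded.
                if norm and norm not in entries:
                    entries.append(norm)
    return merged


# Hypothetical sample entries, for illustration only.
mary = {"Abend": [["'", "?", "a:", "b", "@", "n", "t"]]}
wiki = {"Abend": [["a:", "b", "@", "n", "t"]], "Haus": [["h", "aU", "s"]]}
merged = merge_lexicons(mary, wiki)
```

Once the markers are removed, the two transcriptions of "Abend" become identical and collapse into a single entry, which is the reason for ignoring features the sources disagree on.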

G2P

The G2P models are trained with Sequitur on the lexicon described above, for 8 iterations. The models are available in the releases.
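As a rough sketch of what such a training run might look like, the snippet below builds the command lines for Sequitur's `g2p.py`. It assumes that "8 iterations" means eight successive models, each ramped up from the previous one. That reading, the 5% held-out split, and the file names `lexicon.txt` and `model-N` are assumptions, not taken from the repository's scripts.

```python
from typing import List


def sequitur_commands(
    lexicon: str,
    iterations: int = 8,
    devel: str = "5%",
    prefix: str = "model",
) -> List[List[str]]:
    """Build one g2p.py training command per iteration."""
    commands: List[List[str]] = []
    for i in range(1, iterations + 1):
        cmd = [
            "g2p.py",
            "--train", lexicon,
            "--devel", devel,
            "--write-model", f"{prefix}-{i}",
        ]
        if i > 1:
            # Later iterations start from the previous model and ramp up.
            cmd[1:1] = ["--model", f"{prefix}-{i - 1}", "--ramp-up"]
        commands.append(cmd)
    return commands


# Hypothetical lexicon file name, for illustration only.
commands = sequitur_commands("lexicon.txt")
```

Each command can then be run in order, for example with `subprocess.run(cmd, check=True)`, provided Sequitur is installed and `g2p.py` is on the path.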
