Summary

Simple IPA-based pronouncing and rhyming dictionary.

Dependencies

Required: pudzu-utils.

Documentation

Nouncer

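A simple pronouncing dictionary that supports IPA input and output and can import entries from the CMU Pronouncing Dictionary (CMUdict) and from pronunciation lists. Assigning a string to a word adds a pronunciation; assigning a set replaces the existing ones.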

>> pdict = Nouncer()
>> pdict.import_cmudict("cmudict.0.7a")
>> pdict
<Nouncer: 123691 entries>
>> pdict["polish"]
{'pˈoʊlɪʃ', 'pˈɑlɪʃ'}
>> pdict["polish"] = 'ˈpɒlɪʃ'
>> pdict["polish"]
{'pˈoʊlɪʃ', 'pˈɑlɪʃ', 'pˈɒlɪʃ'}
>> pdict["polish"] = {'ˈpɒlɪʃ'}
>> pdict["polish"]
{'pˈɒlɪʃ'}
>> del pdict["polish"]
>> pdict.save("unpolished")
>> pdict = Nouncer("unpolished")
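The add-versus-replace behaviour in the session above can be illustrated with a small stand-alone sketch. `TinyPronouncer` is a hypothetical toy class, not the Nouncer implementation. It assumes only what the example shows (a string adds, a set replaces) and does not reproduce the stress-mark normalisation seen above, where 'ˈpɒlɪʃ' comes back as 'pˈɒlɪʃ'.

```python
class TinyPronouncer:
    """Hypothetical toy stand-in for illustration; not the Nouncer class."""

    def __init__(self):
        self._entries = {}  # word -> set of pronunciation strings

    def __getitem__(self, word):
        # Return a copy so callers cannot mutate the stored set.
        return set(self._entries[word])

    def __setitem__(self, word, value):
        if isinstance(value, str):
            # A single string is added to any existing pronunciations.
            self._entries.setdefault(word, set()).add(value)
        else:
            # Any other collection replaces the existing pronunciations.
            self._entries[word] = set(value)

    def __delitem__(self, word):
        del self._entries[word]

    def __len__(self):
        return len(self._entries)


toy = TinyPronouncer()
toy["polish"] = "pˈoʊlɪʃ"
toy["polish"] = "pˈɑlɪʃ"
added = toy["polish"]        # both pronunciations are present

toy["polish"] = {"pˈɒlɪʃ"}
replaced = toy["polish"]     # only the new pronunciation remains
```

Dispatching on `isinstance(value, str)` keeps the common "add one more pronunciation" case short while still allowing a full overwrite.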

pronunciations: generator yielding individual (word, pronunciation) pairs. The word and pronunciation filters can each be a regex or a function.

>> next(pdict.pronunciations(word_filter="^k", pronunciation_filter="^[^k]"))
('kneller', 'nˈɛlɝ')
>> next(pdict.pronunciations(word_filter=lambda w: len(w) > 15))
("representative's", 'rˌɛprɪzˈɛnətɪvz')
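One way to accept either a regex or a function as a filter is to normalise both into a predicate first. The sketch below is an assumption about how this could work, not the library's code. `as_predicate` and `iter_pronunciations` are hypothetical names, the regex is applied with `re.search`, and the entries are toy data.

```python
import re


def as_predicate(filter_spec):
    """Turn None, a regex string, or a callable into a str -> bool function."""
    if filter_spec is None:
        return lambda s: True
    if callable(filter_spec):
        return filter_spec
    pattern = re.compile(filter_spec)
    return lambda s: pattern.search(s) is not None


def iter_pronunciations(entries, word_filter=None, pronunciation_filter=None):
    """Yield (word, pronunciation) pairs that pass both filters."""
    word_ok = as_predicate(word_filter)
    pron_ok = as_predicate(pronunciation_filter)
    for word, prons in entries.items():
        if not word_ok(word):
            continue
        for pron in sorted(prons):
            if pron_ok(pron):
                yield word, pron


# Toy data for illustration only.
toy_entries = {
    "kneller": {"nˈɛlɝ"},
    "keller": {"kˈɛlɝ"},
    "miller": {"mˈɪlɝ"},
}

# Words spelled with an initial k whose pronunciation does not start with k.
silent_k = list(iter_pronunciations(toy_entries, word_filter="^k",
                                    pronunciation_filter="^[^k]"))

# A function filter: words longer than six letters.
long_words = list(iter_pronunciations(toy_entries,
                                      word_filter=lambda w: len(w) > 6))
```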

syllables: the number of syllables in each pronunciation of a given word. Accepts an optional syllable counter as a fallback for words missing from the dictionary; its result is keyed by the word in parentheses. A simple heuristic counter for English is provided as english_syllables, though it is only around 90% accurate.

>> pdict.syllables("resume")
{'rizˈum': 2, 'rɪzˈum': 2, 'rˈɛzəmˌeɪ': 3}
>> pdict.syllables("wugging", english_syllables)
{'(wugging)': 2}
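To give a feel for what a spelling-based fallback counter might look like, here is a rough sketch. It is not the library's english_syllables. `rough_english_syllables` and `syllables_with_fallback` are hypothetical names, and the heuristic (count vowel groups, discount a silent final "e") is a common simplification that will miscount many words.

```python
import re


def rough_english_syllables(word):
    """Very rough heuristic: count runs of vowel letters in the spelling."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final "e" as silent, except in "-le" endings such as "table".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def syllables_with_fallback(known, word, counter=None):
    """Look up per-pronunciation counts, falling back to a spelling heuristic.

    `known` maps word -> {pronunciation: syllable count} (toy structure).
    """
    if word in known:
        return dict(known[word])
    if counter is None:
        raise KeyError(word)
    # Mirror the output format shown above: the key is the word in parentheses.
    return {f"({word})": counter(word)}


guess = syllables_with_fallback({}, "wugging", rough_english_syllables)
```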

rhymes: words that rhyme with a given word. Options include identirhyme (allow the same consonant before the stress: e.g. head/behead), multirhyme (allow arbitrary internal consonants: e.g. beheading/depressing) and cutrhyme (return truncated rhymes: e.g. beheading/bread).

>> pdict.rhymes("inconceivable")
{'ˌɪnkənsˈivəbəl': ['unbelievable', 'believable', 'achievable']}
>> pdict.rhymes("inconceivable", identirhyme=True)
{'ˌɪnkənsˈivəbəl': ['inconceivable', 'receivable', 'unbelievable', 'believable', 'achievable', 'conceivable']}
>> pdict.rhymes("inconceivable", multirhyme=True)
{'ˌɪnkənsˈivəbəl': ['learonal', 'impeachable', 'amenable', 'unreasonable', ... ]}
>> pdict.rhymes("beheading", cutrhyme=True)
{'bɪhˈɛdɪŋ': ['dreading', 'treading', 'read', 'said', ... ]}
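The default and identirhyme behaviour can be approximated by comparing everything from the primary stress onward. The sketch below is a guess at the idea, not the library's algorithm, and all function names are hypothetical. It assumes the stress mark sits directly before the stressed vowel, as in the outputs above, and that the single character before it is the preceding consonant. It does not cover multirhyme or cutrhyme.

```python
PRIMARY_STRESS = "ˈ"
SECONDARY_STRESS = "ˌ"


def rhyme_part(pron):
    """Everything after the last primary stress mark, secondary marks removed."""
    idx = pron.rfind(PRIMARY_STRESS)
    tail = pron if idx == -1 else pron[idx + 1:]
    return tail.replace(SECONDARY_STRESS, "")


def sound_before_stress(pron):
    """The character just before the primary stress mark, or '' if none."""
    idx = pron.rfind(PRIMARY_STRESS)
    return pron[idx - 1] if idx > 0 else ""


def is_rhyme(a, b, identirhyme=False):
    """True if two pronunciations share a rhyme part.

    Unless identirhyme is set, the sounds before the stress must differ.
    """
    if rhyme_part(a) != rhyme_part(b):
        return False
    if identirhyme:
        return True
    return sound_before_stress(a) != sound_before_stress(b)


# Toy pronunciations written in the same notation as the outputs above.
head, behead, bread = "hˈɛd", "bɪhˈɛd", "brˈɛd"

strict = is_rhyme(head, behead)                     # same consonant: rejected
loose = is_rhyme(head, behead, identirhyme=True)    # allowed with identirhyme
plain = is_rhyme(head, bread)                       # different consonant: rhyme
```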