Format Specification: .translit

Alexei Baboulevitch edited this page Jan 22, 2014 · 1 revision

My core transliteration library loads transliteration mappings from a .translit file. (At the moment, RU.translit is loaded automatically, but my apps could easily be modified to accept arbitrary .translit files.) A .translit file is simply a JSON dictionary with input strings as keys and output strings as values. For example, here's an abridged Russian transliterator:

{
    "a"  :  "а",
    "A"  :  "А",
    "b"  :  "б",
    "B"  :  "Б",
    "ju" :  "ю",
    "Ju" :  "Ю",
    "JU" :  "Ю",
    "ja" :  "я",
    "Ja" :  "Я",
    "JA" :  "Я"
}

As you can see, capital letters aren't handled automatically: every capitalization variant of a key (for example "ju", "Ju", and "JU") needs its own entry. I wanted my format to be as "dumb" as possible, especially since there's some ambiguity with regard to the rules.
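To make the format concrete, here is a minimal Python sketch of how a consumer might read a .translit file and apply it. It is not the library's actual code: the function names `load_translit` and `transliterate` are hypothetical, and the greedy longest-match strategy is an assumption, since the format itself only specifies the key/value pairs and says nothing about how overlapping keys such as "j…" sequences are resolved.

```python
import json

# The abridged Russian mapping from the example above, as .translit text.
SAMPLE_TRANSLIT = """
{
    "a"  :  "а",
    "A"  :  "А",
    "b"  :  "б",
    "B"  :  "Б",
    "ju" :  "ю",
    "Ju" :  "Ю",
    "JU" :  "Ю",
    "ja" :  "я",
    "Ja" :  "Я",
    "JA" :  "Я"
}
"""


def load_translit(text):
    """Parse .translit content (a JSON dictionary of string -> string)."""
    mapping = json.loads(text)
    if not isinstance(mapping, dict):
        raise ValueError("a .translit file must contain a JSON dictionary")
    return mapping


def transliterate(text, mapping):
    """Replace input strings with output strings, longest key first.

    Assumption: matching is greedy (longest key wins). Characters with no
    matching key are copied through unchanged.
    """
    max_len = max((len(key) for key in mapping), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk at this position, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in mapping:
                out.append(mapping[chunk])
                i += size
                break
        else:
            # No key matched here; keep the original character.
            out.append(text[i])
            i += 1
    return "".join(out)


mapping = load_translit(SAMPLE_TRANSLIT)
print(transliterate("Ja baba", mapping))  # Я баба
```

Because the mapping is applied literally, a variant that has no entry is not matched as a unit: with the sample above, "jA" leaves the "j" untouched and converts only the "A".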
