This Python script translates text using an open-source GPT model with a custom prompt.
- No limits.
- Fully free.
- Supports all 7,151 languages.

Install the dependencies:

    pip3 install -r requirements.txt

The model requires the PyTorch backend. If it is not present in your environment, install it by following the instructions at https://pytorch.org/.

On Apple Silicon machines (e.g., M1/M2 Macs), install a PyTorch build with MPS support to offload computation to the GPU. The translators in this project automatically use MPS when it is available, which reduces CPU usage.

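
The MPS fallback described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `pick_device` is hypothetical, and the sketch assumes that falling back to the CPU is acceptable when PyTorch or MPS is missing.

```python
def pick_device() -> str:
    """Return "mps" when PyTorch reports MPS support, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to offload to.
        return "cpu"
    # torch.backends.mps exists in PyTorch 1.12 and later; guard for older builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

The returned string can be passed to `torch.device(...)` or to a tensor's or model's `.to(...)` call.
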
On Python 3.14, sentencepiece does not ship prebuilt wheels, so pip tries to build it from source and can fail unless your system has build tools installed. This project uses the fast tokenizer path by default, so sentencepiece is optional on 3.14. If you want it anyway, install the build tools and then install sentencepiece from source (the commands below assume macOS with Homebrew):

    brew install cmake pkg-config sentencepiece coreutils
    export PKG_CONFIG_PATH="$(brew --prefix sentencepiece)/lib/pkgconfig:$PKG_CONFIG_PATH"
    python3 -m pip install sentencepiece --no-binary=:all:

| Full Name | Language Code |
|---|---|
| Afar | aa |
| Abkhazian | ab |
| Avestan | ae |
| Afrikaans | af |
| Akan | ak |
| Amharic | am |
| Aragonese | an |
| Arabic | ar |
| Assamese | as |
| Avaric | av |
| Aymara | ay |
| Azerbaijani | az |
| Bashkir | ba |
| Belarusian | be |
| Bulgarian | bg |
| Bihari languages | bh |
| Bislama | bi |
| Bambara | bm |
| Bengali | bn |
| Tibetan | bo |
| Breton | br |
| Bosnian | bs |
| Catalan; Valencian | ca |
| Chechen | ce |
| Chamorro | ch |
| Corsican | co |
| Cree | cr |
| Czech | cs |
| Church Slavic; Slavonic; Old Bulgarian | cu |
| Chuvash | cv |
| Welsh | cy |
| Danish | da |
| German | de |
| Divehi; Dhivehi; Maldivian | dv |
| Dzongkha | dz |
| Ewe | ee |
| Greek, Modern (1453-) | el |
| English | en |
| Esperanto | eo |
| Spanish; Castilian | es |
| Estonian | et |
| Basque | eu |
| Persian | fa |
| Fulah | ff |
| Finnish | fi |
| Fijian | fj |
| Faroese | fo |
| French | fr |
| Western Frisian | fy |
| Irish | ga |
| Gaelic; Scottish Gaelic | gd |
| Galician | gl |
| Guarani | gn |
| Gujarati | gu |
| Manx | gv |
| Hausa | ha |
| Hebrew | he |
| Hindi | hi |
| Hiri Motu | ho |
| Croatian | hr |
| Haitian; Haitian Creole | ht |
| Hungarian | hu |
| Armenian | hy |
| Herero | hz |
| Interlingua | ia |
| Indonesian | id |
| Interlingue; Occidental | ie |
| Igbo | ig |
| Sichuan Yi; Nuosu | ii |
| Inupiaq | ik |
| Ido | io |
| Icelandic | is |
| Italian | it |
| Inuktitut | iu |
| Japanese | ja |
| Javanese | jv |
| Georgian | ka |
| Kongo | kg |
| Kikuyu; Gikuyu | ki |
| Kuanyama; Kwanyama | kj |
| Kazakh | kk |
| Kalaallisut; Greenlandic | kl |
| Central Khmer | km |
| Kannada | kn |
| Korean | ko |
| Kanuri | kr |
| Kashmiri | ks |
| Kurdish | ku |
| Komi | kv |
| Cornish | kw |
| Kirghiz; Kyrgyz | ky |
| Latin | la |
| Luxembourgish; Letzeburgesch | lb |
| Ganda | lg |
| Limburgan; Limburger; Limburgish | li |
| Lingala | ln |
| Lao | lo |
| Lithuanian | lt |
| Luba-Katanga | lu |
| Latvian | lv |
| Malagasy | mg |
| Marshallese | mh |
| Maori | mi |
| Macedonian | mk |
| Malayalam | ml |
| Mongolian | mn |
| Marathi | mr |
| Malay | ms |
| Maltese | mt |
| Burmese | my |
| Nauru | na |
| Norwegian Bokmål | nb |
| Ndebele, North; North Ndebele | nd |
| Nepali | ne |
| Ndonga | ng |
| Dutch; Flemish | nl |
| Norwegian Nynorsk | nn |
| Norwegian | no |
| Ndebele, South; South Ndebele | nr |
| Navajo; Navaho | nv |
| Chichewa; Chewa; Nyanja | ny |
| Occitan (post 1500) | oc |
| Ojibwa | oj |
| Oromo | om |
| Oriya | or |
| Ossetian; Ossetic | os |
| Panjabi; Punjabi | pa |
| Pali | pi |
| Polish | pl |
| Pushto; Pashto | ps |
| Portuguese | pt |
| Quechua | qu |
| Romansh | rm |
| Rundi | rn |
| Romanian; Moldavian; Moldovan | ro |
| Russian | ru |
| Kinyarwanda | rw |
| Sanskrit | sa |
| Sardinian | sc |
| Sindhi | sd |
| Northern Sami | se |
| Sango | sg |
| Sinhala; Sinhalese | si |
| Slovak | sk |
| Slovenian | sl |
| Samoan | sm |
| Shona | sn |
| Somali | so |
| Albanian | sq |
| Serbian | sr |
| Swati | ss |
| Sotho, Southern | st |
| Sundanese | su |
| Swedish | sv |
| Swahili | sw |
| Tamil | ta |
| Telugu | te |
| Tajik | tg |
| Thai | th |
| Tigrinya | ti |
| Turkmen | tk |
| Tagalog | tl |
| Tswana | tn |
| Tonga (Tonga Islands) | to |
| Turkish | tr |
| Tsonga | ts |
| Tatar | tt |
| Twi | tw |
| Tahitian | ty |
| Uighur; Uyghur | ug |
| Ukrainian | uk |
| Urdu | ur |
| Uzbek | uz |
| Venda | ve |
| Vietnamese | vi |
| Volapük | vo |
| Walloon | wa |
| Wolof | wo |
| Xhosa | xh |
| Yiddish | yi |
| Yoruba | yo |
| Zhuang; Chuang | za |
| Chinese | zh |
| Zulu | zu |
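
Before passing a code from the table to the translator, it can help to normalize and check it. The sketch below is only an illustration: `LANGUAGE_CODES` and `normalize_dest` are hypothetical names, the dictionary holds a small hand-copied subset of the table, and whether the project itself validates codes is not stated here.

```python
# A small subset of the table above; extend it with the codes you need.
LANGUAGE_CODES = {
    "en": "English",
    "fr": "French",
    "de": "German",
    "es": "Spanish; Castilian",
    "ja": "Japanese",
    "zh": "Chinese",
}


def normalize_dest(code: str) -> str:
    """Return the trimmed, lowercase code, or raise ValueError if it is unknown."""
    key = code.strip().lower()
    if key not in LANGUAGE_CODES:
        raise ValueError(f"Unknown language code: {code!r}")
    return key


print(normalize_dest(" FR "))  # fr
print(LANGUAGE_CODES[normalize_dest("ja")])  # Japanese
```
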

Download the NLTK punkt data:

    python <<< 'import nltk; nltk.download("punkt")'

Use the translator from Python:

    import module

    tr = module.PromptTranslator(text='Hello, World!', dest='fr')
    translated_text = tr.translated_text

    if __name__ == '__main__':
        print(f'Result: {translated_text}')

The result will be:

    Result: Bonjour le monde!

You can run the provided example.py script from the command line:

    python example.py "Hello, world!" --dest fr

You can also use an open chat model that accepts custom prompts:

    from module import PromptTranslator

    tr = PromptTranslator(text="Hello, World!", dest="fr")
    print(tr.translated_text)

Loading the default chat model (openai-community/openai-gpt) is lightweight and can run on modest hardware. The model has a 512-token context window, so large prompts are automatically truncated.
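
To make the truncation concrete, here is a minimal sketch of clipping a token sequence to the 512-token window. It is an assumption-laden illustration rather than the project's implementation: `truncate_tokens` is a hypothetical helper, the ids are stand-ins for real tokenizer output, and the choice to drop tokens from the end (rather than the start) is assumed.

```python
MAX_TOKENS = 512  # context window of the default model, as stated above


def truncate_tokens(token_ids: list[int], max_tokens: int = MAX_TOKENS) -> list[int]:
    """Keep at most max_tokens ids, dropping any overflow from the end."""
    return token_ids[:max_tokens]


prompt_ids = list(range(600))  # stand-in for real tokenizer output
kept = truncate_tokens(prompt_ids)
print(len(kept))  # 512
```

Because overflow is dropped silently in this sketch, splitting long inputs into smaller chunks before translating them avoids losing text.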