A small CLI translator that uses LLM tool calls and always reads from stdin.
- Overview
- Installation
- Quickstart
- Alias (t) for CLI usage
- Usage
- Model selection & cache
- Settings
- Language Packs
- Environment variables
- Options
- Server mode
- MCP mode
- FFI (C ABI)
- Notes
- Reads input from stdin and prints the translated text.
- Uses tool-calling JSON only (no free-form output).
- Providers: OpenAI, Gemini, Claude.
- Model list is fetched via each provider's Models API and cached for 24 hours.
Choose one of the following:
Release artifacts are available on the GitHub Releases page.
Each asset is named llm-translator-rust-<os>-<arch> (e.g. llm-translator-rust-macos-aarch64).
cargo install --git https://github.com/m3m0r7/llm-translator-rust --locked
git clone https://github.com/m3m0r7/llm-translator-rust
cd llm-translator-rust
make
sudo make install
Binary will be available at:
./target/release/llm-translator-rust
Notes:
- macOS/Linux default: $XDG_RUNTIME_DIR if set, otherwise /usr/local/bin (use sudo make install if needed)
- Windows (MSYS/Git Bash): %USERPROFILE%/.cargo/bin
- make writes build/build_env.toml and embeds it into the binary (portable builds don’t need the file at runtime).
- You can override paths via env vars passed to make, e.g. DATA_DIRECTORY=$XDG_DATA_HOME/llm-translator-rust CONFIG_DIRECTORY=$XDG_CONFIG_HOME/llm-translator-rust RUNTIME_DIRECTORY=$XDG_RUNTIME_DIR BIN_DIRECTORY=target/release BUILD_ENV_PATH=build/build_env.toml make
- make install copies settings.toml to the configured settings path (XDG config dir by default) if it does not exist.
- make install copies headers from ext/ into $XDG_DATA_HOME/llm-translator-rust if they do not exist.
export OPENAI_API_KEY="..."
./target/release/llm-translator-rust <<< "ねこ"
alias t="/path/to/llm-translator-rust/target/release/llm-translator-rust"
echo ねこ | t
echo Cat | llm-translator-rust
echo Cat | llm-translator-rust -l en
echo Cat | llm-translator-rust --source-lang en -l ja
# Output examples
echo Cat | llm-translator-rust
# Cat
echo Cat | llm-translator-rust -l en
# Cat
echo Cat | llm-translator-rust -l kor
# 고양이
echo Cat | llm-translator-rust -l zho-hans
# 猫
echo Cat | llm-translator-rust -l zho-hant
# 貓
echo Cat | llm-translator-rust -l ja --formal academic
# 猫
echo Awesome | llm-translator-rust -l ja --slang
# ヤバい
# Dictionary (part of speech/inflections)
echo 猫 | llm-translator-rust --pos -l en
echo play | llm-translator-rust --pos noun,verb -l en
# File translation
cat foobar.txt | llm-translator-rust -l en
# File attachment translation (image/doc/docx/pptx/xlsx/pdf/txt/md/html/json/yaml/po/xml/js/ts/tsx/mermaid/audio)
llm-translator-rust --data ./slides.pptx --data-mime pptx -l en
llm-translator-rust --data ./scan.png -l ja
llm-translator-rust --data ./voice.mp3 -l en
# Attachment via stdin (auto-detect or with --data-mime)
cat ./scan.png | llm-translator-rust -l ja
cat ./report.pdf | llm-translator-rust --data-mime pdf -l en
# Image/PDF attachments are re-rendered with numbered overlays (path is returned).
# The image height is extended and a footer list is added:
# (N) original (reading): translated
# - reading is a Latin-script pronunciation for non-Latin text (e.g., romaji/pinyin).
# - identical translations share the same number.
# When using --data with a file path (and without --overwrite), a sibling file is written.
# The suffix comes from settings.toml [system].translated_suffix (default: _translated).
# When --data points to a directory, a sibling output directory is created with the same suffix.
When --data points to a directory, the CLI walks it recursively and translates each supported file.
The relative directory structure is preserved in the output directory.
llm-translator-rust --data ./docs -l ja
# Output: ./docs_translated (default suffix; configurable via settings.toml)
Notes:
- --data-mime applies to every file in the directory; leave it as auto for mixed file types.
- Files that cannot be read or whose mime cannot be detected are reported as failures; files that are detected but not supported by the translator are skipped.
- Use --force to treat unknown/low-confidence detections as text.
- Directory translation runs concurrently (default 3 threads). Use --directory-translation-threads or settings.toml to change it.
- Files can be excluded with --ignore-translation-file or an ignore file (default: .llm-translation-rust-ignore, configurable via settings.toml). Patterns follow .gitignore rules (*, **, !, comments).
- Ignore rules apply only when --data points to a directory.
- Use --out to choose the output directory for directory translation.
- When a file fails to translate during directory translation, the original file is copied to the output directory.
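As an illustration of the ignore-file note above, the sketch below writes a sample ignore file into a directory before translating it. Only the default file name .llm-translation-rust-ignore and the .gitignore-style syntax come from the notes; the helper name write_ignore_file and the patterns themselves are hypothetical.

```shell
# Hypothetical helper: write a sample ignore file into the directory to translate.
# The patterns are illustrative only; adapt them to your project.
write_ignore_file() {
  dir="$1"
  cat > "$dir/.llm-translation-rust-ignore" <<'EOF'
# Skip build output and vendored code
build/
vendor/**
# Skip minified assets, but keep one of them
*.min.js
!app.min.js
EOF
}
```

Typical use would be write_ignore_file ./docs followed by llm-translator-rust --data ./docs -l ja.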
--overwrite writes results in place for files or directories passed via --data.
Before writing, each file is backed up to $XDG_DATA_HOME/llm-translator-rust/backup.
Retention is controlled by settings.toml [system].backup_ttl_days (default: 30).
llm-translator-rust --data ./docs --overwrite -l ja
llm-translator-rust --data ./slide.pdf --overwrite -l en
--out sets the output path for file or directory translations.
It cannot be used with --overwrite.
llm-translator-rust --data ./docs -l ja --out ./outdir
llm-translator-rust --data ./slide.pdf -l en --out ./translated.pdf
--pos returns dictionary-style details for the input term. You can optionally filter by part of speech (comma-separated). English POS names are always accepted and mapped to the source language when possible.
Usage:
echo 猫 | llm-translator-rust --pos -l en
echo play | llm-translator-rust --pos noun,verb -l en
Example output (labels follow the source language):
訳語: cat
読み: キャット
品詞: 名詞
属性: 動物, ペット
別訳: kitty (キティ), tomcat (トムキャット), feline (フィーライン)
複数形: cats
三人称単数: cats
過去形: -
現在分詞: -
用法: 一般的な猫を指す最も基本的な言葉。ペットや動物全般として広く使われる。
用例:
- I have a cat. (私は猫を飼っています。)
- The black cat is sleeping. (黒い猫が眠っています。)
- Many people love cats. (多くの人が猫を愛しています。)
--correction proofreads the input and points out corrections in the source language.
Usage:
echo "This is pen" | llm-translator-rust --correction --source-lang en
Example output:
This is a pen
-
Correction reasons:
- English requires a/an before a countable noun
- Labels are localized to the source language.
- Reading is the translation’s pronunciation rendered in the source language’s typical script (e.g., Japanese=katakana, Chinese=pinyin with tone marks, Korean=hangul).
- Alternatives lists other plausible translations with readings.
- Usage and example source sentences are in the source language.
- Examples include the translation or one of the alternatives.
--report builds a translation report from histories (default limit: 512).
It aggregates language pairs, content types, models, daily volume, and LLM-based clusters/keywords.
Output defaults to ./report.<format> in the current directory.
Usage:
llm-translator-rust --report html
llm-translator-rust --report json --report-out ./report.json
--details generates a detailed translation report for all styles defined in settings.toml [formally].
Each style block includes the translation, reasoning bullets, and extra example sentences.
The --formal style (if present) is shown first.
Usage:
echo "This is a cat, however we do not eat the cat because the cat is so cute." | llm-translator-rust --details -l ja
Audio files are transcribed with whisper-rs, translated by the LLM, then re-synthesized.
- Supported audio: mp3, wav, m4a, flac, ogg
- Requires ffmpeg
- Requires a Whisper model (auto-downloaded on first run)
- TTS uses macOS say or Linux espeak
Choose a model:
llm-translator-rust --show-whisper-models
llm-translator-rust --whisper-model small -d ./voice.mp3 -l en
You can also set LLM_TRANSLATOR_WHISPER_MODEL to a model name or file path.
settings.toml [whisper] model or --whisper-model overrides this.
macOS (Homebrew):
brew install tesseract ffmpeg
Ubuntu/Debian:
sudo apt-get install tesseract-ocr ffmpeg espeak
Windows (Chocolatey):
choco install tesseract ffmpeg
- Default provider priority: OpenAI → Gemini → Claude (first API key found).
- -m/--model accepts:
  - Provider only: openai, gemini, claude (uses provider defaults below, if available)
  - Provider + model: openai:MODEL_ID
  - When specifying a model, always include the provider prefix.
- Defaults use provider defaults below; if unavailable, the first chat-capable model is used.
- Source/target languages use ISO 639-1 or ISO 639-2/3 codes (e.g., ja, en, jpn, eng). Source can be auto.
- For Chinese variants, use zho-hans (Simplified) or zho-hant (Traditional).
- Language validation uses the ISO 639 list from Wikipedia: https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes
- Model list is fetched from each provider’s Models API and cached for 24 hours.
- Cache path: ~/.llm-translator/.cache/meta.json (fallback: ./.llm-translator/.cache/meta.json)
- --show-models-list prints the cached list as provider:model per line.
- --show-whisper-models prints available whisper model names.
- --pos [noun,verb] returns dictionary-style details (translation + reading, POS, alternatives, inflections, usage/examples). Omit the list for all POS.
- --correction returns proofreading corrections and reasons in the source language.
- --whisper-model selects the whisper model name or path for audio transcription.
- When --model is omitted, lastUsingModel in meta.json is preferred (falls back to default resolution if missing or invalid).
- Histories are stored in meta.json. Dest files are written to $XDG_DATA_HOME/llm-translator-rust/.cache/dest/<md5>.
- Image/PDF attachments use OCR (tesseract), normalize OCR text with LLMs, and re-render a numbered overlay plus a footer list.
- Office files (docx/xlsx/pptx) are rewritten by translating text nodes in the XML.
- Output mime matches the input mime (e.g. png stays png, pdf stays pdf).
- OCR languages are inferred from --source-lang and --lang.
- Use tesseract --list-langs to see installed OCR language codes.
- PDF OCR requires a PDF renderer (mutool or pdftoppm from poppler).
- PDF output is rasterized (text is no longer selectable).
Provider defaults:
- OpenAI: openai:gpt-5.2
- Gemini: gemini:gemini-2.5-flash
- Claude: claude:claude-sonnet-4-5-20250929
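Because --show-models-list prints one provider:model pair per line, its output is easy to post-process. The following is a minimal sketch that assumes only that line format; models_for_provider is a hypothetical helper name, not part of the CLI.

```shell
# Hypothetical helper: read provider:model lines on stdin and print only the
# model IDs that belong to the given provider.
models_for_provider() {
  provider="$1"
  # cut -f2- keeps everything after the first colon, so model IDs that
  # themselves contain a colon are preserved.
  grep "^${provider}:" | cut -d: -f2-
}
```

For example, llm-translator-rust --show-models-list | models_for_provider openai would list the cached OpenAI model IDs. Remember to add the provider prefix back when passing one of them to -m/--model.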
Provider model APIs:
Settings files are loaded with the following precedence (highest first):
- $XDG_CONFIG_HOME/llm-translator-rust/settings.local.toml (fallback: ~/.config/llm-translator-rust/settings.local.toml)
- $XDG_CONFIG_HOME/llm-translator-rust/settings.toml (fallback: ~/.config/llm-translator-rust/settings.toml)
- ./settings.local.toml
- ./settings.toml
You can also pass -r/--read-settings to load an additional local TOML file (highest priority).
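The lookup order above can be written out as a small shell function that lists the candidate paths from highest to lowest priority. This is only a sketch of the documented order, not the tool's implementation. The names settings_candidates and existing_settings are hypothetical, and treating an empty XDG_CONFIG_HOME like an unset one is an assumption.

```shell
# Hypothetical helper: print the documented settings paths, highest priority first.
settings_candidates() {
  # Assumption: an empty XDG_CONFIG_HOME falls back to ~/.config, like an unset one.
  config_home="${XDG_CONFIG_HOME:-$HOME/.config}"
  printf '%s\n' \
    "$config_home/llm-translator-rust/settings.local.toml" \
    "$config_home/llm-translator-rust/settings.toml" \
    "./settings.local.toml" \
    "./settings.toml"
}

# Print only the candidate files that actually exist on this machine.
existing_settings() {
  settings_candidates | while IFS= read -r candidate; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
    fi
  done
}
```

Running existing_settings from a project directory shows which files would take part in the lookup. A file passed with -r/--read-settings would rank above all of them.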
settings.toml uses the following format:
system.languages should be ISO 639-3 codes.
[system]
languages = ["jpn", "eng", ...]
histories = 10
directory_translation_threads = 3
translation_ignore_file = ".llm-translation-rust-ignore"
[formally]
casual = "Use casual, natural everyday speech."
formal = "Use polite, formal register suitable for professional contexts."
...
[ocr]
text_color = "#c40000"
stroke_color = "#c40000"
fill_color = "#ffffff"
normalize = true
# font_size = 18
# font_family = "Hiragino Sans"
# font_path = "/System/Library/Fonts/Hiragino Sans W3.ttc"
Language packs live in src/languages/<iso-639-3>.toml. The first entry in system.languages is used for label display in --show-enabled-languages.
Example (Japanese):
[translate.iso_country_lang.jpn]
jpn = "日本語"
eng = "英語"
- OpenAI: OPENAI_API_KEY
- Gemini: GEMINI_API_KEY or GOOGLE_API_KEY
- Claude: ANTHROPIC_API_KEY
-k/--key overrides environment variables.
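To check which provider would be picked by default, the documented priority (OpenAI, then Gemini, then Claude, first API key found) can be sketched in shell. This mirrors the description only and is not the tool's own code. detect_provider is a hypothetical name, and treating an empty variable as unset is an assumption. The document does not say which of GEMINI_API_KEY and GOOGLE_API_KEY wins when both are set, so the sketch does not distinguish them.

```shell
# Sketch of the documented priority: OpenAI, then Gemini, then Claude.
# Prints the first provider whose API key variable is non-empty.
detect_provider() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo openai
  elif [ -n "${GEMINI_API_KEY:-}" ] || [ -n "${GOOGLE_API_KEY:-}" ]; then
    echo gemini
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo claude
  else
    # No key set: report on stderr and signal failure to the caller.
    echo "no API key found" >&2
    return 1
  fi
}
```

The printed name matches the provider-only form accepted by -m/--model, so echo Cat | llm-translator-rust -m "$(detect_provider)" -l ja would make the default choice explicit.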
| Flag | Long | Description | Default |
|---|---|---|---|
| -l | --lang | Target language | en |
| -m | --model | Provider/model selector | (auto) |
| -k | --key | API key override | (env) |
| -f | --formal | Formality key (from settings.toml [formally]) | formal |
| -L | --source-lang | Source language (ISO 639-1/2/3 or auto) | auto |
| -s | --slang | Include slang keywords when appropriate | false |
| -d | --data | File attachment (image/doc/docx/pptx/xlsx/pdf/txt/md/html/json/yaml/po/xml/js/ts/tsx/mermaid/audio) | |
| -M | --data-mime | Mime type for --data (or stdin) (auto, image/*, pdf, doc, docx, docs, pptx, xlsx, txt, md, markdown, html, json, yaml, po, xml, js, ts, tsx, mermaid, mp3, wav, m4a, flac, ogg) | auto |
| | --with-commentout | Translate comment-out text (HTML/YAML/PO) | |
| | --show-enabled-languages | Show enabled translation languages | |
| | --show-enabled-styles | Show enabled style keys | |
| | --show-models-list | Show cached model list (provider:model) | |
| | --show-whisper-models | Show available whisper model names | |
| | --pos [noun,verb] | Dictionary output (part of speech/inflections) | |
| | --correction | Proofread input text and point out corrections | |
| | --details | Detailed translations across all formal styles | |
| | --report | Generate a translation report (html/xml/json) | |
| | --report-out | Report output path | |
| | --show-histories | Show translation histories | |
| | --show-trend | Show translation trend (categories/keywords) | |
| | --with-using-tokens | Append token usage to output | |
| | --with-using-model | Append model name to output | |
| | --force | Force translation when mime detection is uncertain (treat as text) | |
| | --debug-ocr | Output OCR debug overlays/JSON for attachments | |
| | --whisper-model | Whisper model name or path | |
| | --overwrite | Overwrite input files in place (backups stored in $XDG_DATA_HOME/llm-translator-rust/backup) | |
| | --directory-translation-threads | Directory translation concurrency | |
| | --ignore-translation-file | Ignore patterns for directory translation (gitignore-like) | |
| -o | --out | Output path for translated file or directory | |
| | --verbose | Verbose logging | |
| -i | --interactive | Interactive mode | |
| -r | --read-settings | Read extra settings TOML file | |
| | --server | Start HTTP server (ADDR defaults to settings or 0.0.0.0:11223) | |
| | --client | Start web client (requires --server, ADDR defaults to settings or 0.0.0.0:11222) | |
| | --mcp | Start MCP server over stdio | |
| -h | --help | Show help | |
Start the HTTP server:
llm-translator-rust --server
llm-translator-rust --server 0.0.0.0:11223
Start the web client (requires --server):
llm-translator-rust --server --client
llm-translator-rust --server 0.0.0.0:11223 --client 0.0.0.0:11222
Server settings are configurable in settings.toml under [server]:
[server]
host = "0.0.0.0"
port = 11223
tmp_dir = "/tmp/llm-translator-rust"
Client settings are configurable in settings.toml under [client]:
[client]
host = "0.0.0.0"
port = 11222
Additional endpoints (used by the web client):
- GET /histories
- GET /trend
- GET /settings
Send requests as JSON to POST /translate, with either text or a data path:
{
  "text": "Hello",
  "lang": "ja"
}
{
  "data": "/path/to/file-or-dir",
  "data_mime": "auto",
  "lang": "ja",
  "force_translation": false
}
Correction request:
{
  "text": "This is pen",
  "correction": true,
  "source_lang": "en"
}
Response (text):
{
  "contents": [
    {
      "mime": "text/plain",
      "format": "raw",
      "original": "Hello",
      "translated": "こんにちは"
    }
  ]
}
Correction response (text):
{
  "contents": [
    {
      "mime": "text/plain",
      "format": "raw",
      "original": "This is pen",
      "translated": "This is a pen",
      "correction": {
        "markers": " -",
        "reasons": ["English requires a/an before a countable noun"],
        "source_language": "en"
      }
    }
  ]
}
Start the MCP server over stdio:
llm-translator-rust --mcp
Tools:
- translate
- translate_details
- correction
- pos
Response (binary):
{
  "contents": [
    {
      "mime": "image/png",
      "format": "path",
      "translated": "/tmp/llm-translator-rust/llm-translator-xxxx.png"
    }
  ]
}
When data is a directory, multiple entries are returned in contents.
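A minimal sketch of calling POST /translate from a shell, assuming the server was started with llm-translator-rust --server and is reachable at http://127.0.0.1:11223 (the default port above). The helper names and the LLM_TRANSLATOR_URL variable are hypothetical, and sending a Content-Type header is an assumption about what the server expects. json_escape only handles backslashes and double quotes, so text containing newlines or control characters needs a real JSON encoder.

```shell
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
# It does not handle newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the body for POST /translate from a text and a target language.
build_translate_payload() {
  printf '{"text":"%s","lang":"%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the request; LLM_TRANSLATOR_URL is an assumed variable for the base URL.
translate_via_server() {
  curl -sS -X POST "${LLM_TRANSLATOR_URL:-http://127.0.0.1:11223}/translate" \
    -H 'Content-Type: application/json' \
    -d "$(build_translate_payload "$1" "$2")"
}
```

With the server running, translate_via_server "Hello" ja should return a JSON document in the text response shape shown above, with the result in contents[].translated.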
- C header is at ext/llm_translator_rust.h.
- Functions return heap strings; free them with llm_ext_free_string.
- When a call fails, retrieve a message with llm_ext_last_error_message.
- API errors (including insufficient quota) are surfaced with provider error messages.
- Use -h/--help to see the latest options.
- casual: casual everyday tone
- formal: polite formal tone
- loose: relaxed, loose phrasing
- academic: academic, paper-like phrasing
- gal: playful gyaru/gal tone
- yankee: rough delinquent style
- otaku: otaku-friendly diction and nuance
- elderly: gentle, elder register
- aristocrat: refined aristocratic tone
- samurai: archaic samurai-style wording
- braille: output in Unicode Braille patterns
- morse: output in International Morse code
- engineer: precise technical tone

