
ScreenWhisper - ScreenText Translator

A small pipeline that captures screen regions (like UI text in games such as The Sims), performs OCR via Tesseract, and translates the recognized text using LibreTranslate.

Inspired by the fact that I accidentally installed The Sims in Dutch and did not want to reinstall it.


Architecture Overview

Unfortunately, I couldn't make the entire app fully containerized: when I ran the screen capturing from WSL, it didn't work (most likely because Docker has no access to the screen framebuffer).

Because of that, the architecture is split into two parts.

Local

Within the local folder you can find main.py, which takes the screenshot. It depends on mss, pyautogui, Pillow, python-dotenv, requests, keyboard, and colorama. The Python version used locally was 3.13.0.

After you run the script in a terminal, you can take a screenshot with the Ctrl + Shift + T shortcut. The translation is shown in the same terminal:

Running... Press Ctrl+Shift+T to capture screen region.
Hotkey pressed: Capturing and sending image...
Capturing at (1706, 752)...

==================================================
📝 OCR Text:

Hallo wereld!
--------------------------------------------------
🌍 Translated Text:

Hello, world!
==================================================
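The capture-and-send loop can be sketched as below. This is a minimal sketch, not the actual main.py: the `/ocr` endpoint path, the JSON field names, and the example capture region are all assumptions. Third-party imports (mss, keyboard, requests) are kept inside functions so the payload helper stands on its own.

```python
# Sketch of a hotkey-driven capture client: grab a screen region with mss,
# base64-encode it, and POST it to the OCR container.
import base64


def build_payload(png_bytes: bytes, source: str = "nl", target: str = "en") -> dict:
    """Wrap raw PNG bytes into a JSON body for the OCR container (assumed schema)."""
    return {
        "image": base64.b64encode(png_bytes).decode("ascii"),
        "source": source,
        "target": target,
    }


def capture_region(left: int, top: int, width: int, height: int) -> bytes:
    """Grab a screen region with mss and return it as PNG bytes."""
    import mss
    import mss.tools

    with mss.mss() as sct:
        shot = sct.grab({"left": left, "top": top, "width": width, "height": height})
        return mss.tools.to_png(shot.rgb, shot.size)


def main() -> None:
    import keyboard
    import requests

    def on_hotkey() -> None:
        png = capture_region(1400, 600, 612, 304)  # example region, adjust to taste
        resp = requests.post("http://localhost:5000/ocr", json=build_payload(png))
        print(resp.json())

    keyboard.add_hotkey("ctrl+shift+t", on_hotkey)
    print("Running... Press Ctrl+Shift+T to capture screen region.")
    keyboard.wait()


if __name__ == "__main__":
    main()
```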

Docker containers

In the remote directory you can find the two containers that are used: one runs Tesseract to detect text, while the other is a LibreTranslate container. The local app sends the image (base64) to the Tesseract container, which detects the text and forwards it to the LibreTranslate container for translation. LibreTranslate then responds to the Tesseract container, which finally returns the result to the local app.
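The request flow above can be sketched as a single handler inside the OCR container. This is an illustrative sketch, not the repository's actual code: the field names are assumptions, and the OCR and translation steps are passed in as callables so the orchestration can be exercised without Tesseract or a running LibreTranslate instance.

```python
# Sketch of the OCR container's request flow: decode base64 image -> OCR ->
# forward text to LibreTranslate -> return both results to the local app.
import base64
from typing import Callable


def handle_request(
    body: dict,
    ocr: Callable[[bytes], str],
    translate: Callable[[str, str, str], str],
) -> dict:
    """Run the full pipeline for one request body (assumed schema)."""
    image_bytes = base64.b64decode(body["image"])
    text = ocr(image_bytes)  # in production: e.g. pytesseract.image_to_string
    translated = translate(text, body.get("source", "nl"), body.get("target", "en"))
    return {"ocr_text": text, "translated_text": translated}
```

In the real containers, `ocr` would invoke Tesseract and `translate` would POST to the LibreTranslate service; here both are injectable stubs.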


Requirements

Host (Windows)

  • Anaconda (or any Python 3.12+ environment)
  • The following packages: mss, pyautogui, Pillow, python-dotenv, requests, keyboard, colorama
  • Docker + Docker Compose

Getting Started

1. Clone the repository

git clone https://github.com/v-stamenova/screen-whisper.git
cd screen-whisper

2. Run Docker Services

Build and run OCR API and LibreTranslate:

docker compose up
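For orientation, the compose file has roughly the shape below. This is a hypothetical sketch: the service names and build path are assumptions, and only the host ports (5000 for OCR, 5010 for LibreTranslate) and the LT_LOAD_ONLY variable come from this README.

```yaml
# Hypothetical docker-compose.yml shape; see the remote directory for the real one.
services:
  ocr-api:
    build: ./remote          # assumed build context for the Tesseract container
    ports:
      - "5000:5000"
  libretranslate:
    image: libretranslate/libretranslate
    environment:
      - LT_LOAD_ONLY=nl,en   # load only the language pairs you need
    ports:
      - "5010:5000"
```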

3. Configure & Run Local Client (Windows)

The client runs locally on Windows to capture your screen.

Install dependencies

pip install mss pyautogui pillow python-dotenv requests keyboard colorama

Run the capture client

cd local
python main.py

Tips & Troubleshooting

  • If LibreTranslate gives "nl is not supported": make sure LT_LOAD_ONLY includes nl.
  • Use localhost:5000 for OCR server and localhost:5010 for LibreTranslate in development.
  • Test OCR output with simple screenshots before using complex UI.
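To isolate translation problems (such as the "nl is not supported" error) from OCR problems, you can hit LibreTranslate's standard /translate endpoint directly. The URL below assumes the 5010 port mapping from this README; the request body follows LibreTranslate's documented API.

```python
# Query LibreTranslate's POST /translate endpoint directly, bypassing OCR.
import json
import urllib.request


def build_translate_request(text: str, source: str = "nl", target: str = "en") -> dict:
    """JSON body for LibreTranslate's /translate API."""
    return {"q": text, "source": source, "target": target, "format": "text"}


def translate(text: str, url: str = "http://localhost:5010/translate") -> str:
    body = json.dumps(build_translate_request(text)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["translatedText"]


if __name__ == "__main__":
    print(translate("Hallo wereld!"))
```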
