A sophisticated universal translator bot built with FastAPI, aiogram, and transformers. The bot automatically detects input language, identifies target language from user prompts, and provides high-quality translations using Helsinki-NLP models.
- Automatic Language Detection using `langdetect` with a `lingua-language-detector` fallback
- Smart Target Language Parsing from user messages
- Multi-language Support with 50+ language pairs
- Interactive Language Selection via inline keyboards
- Internationalization (i18n) with support for multiple languages
- Async Architecture built with FastAPI and aiogram for high performance
- Lightweight Translation Models using Helsinki-NLP/opus-mt
- CPU-Optimized PyTorch installation (no NVIDIA dependencies)
- Model Caching for improved response times
- Batch Processing for efficient translation
- Error Handling with fallback translation through intermediate languages
- Docker Support with optimized containerization
- Poetry for dependency management
- Comprehensive Testing with pytest
- Code Quality with black, isort, and pre-commit hooks
- Health Checks and monitoring endpoints
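The fallback behavior above can be sketched in a few lines: try the direct language pair first, and if no model covers it, pivot through an intermediate language (typically English). `translate_direct` is a stand-in for the real Helsinki-NLP pipeline call, with a toy pair table so the fallback path can be exercised; the function names are ours, not the project's API.

```python
# Sketch of fallback translation through an intermediate (pivot) language.
# SUPPORTED_PAIRS and translate_direct are placeholders for the real
# Helsinki-NLP model lookup and pipeline invocation.
SUPPORTED_PAIRS = {("ru", "en"), ("en", "de")}

def translate_direct(text: str, src: str, tgt: str) -> str:
    if (src, tgt) not in SUPPORTED_PAIRS:
        raise ValueError(f"no direct model for {src}-{tgt}")
    return f"[{src}->{tgt}] {text}"  # placeholder for model output

def translate_with_fallback(text: str, src: str, tgt: str, pivot: str = "en") -> str:
    """Try the direct pair first; on failure, pivot through `pivot`."""
    try:
        return translate_direct(text, src, tgt)
    except ValueError:
        intermediate = translate_direct(text, src, pivot)
        return translate_direct(intermediate, pivot, tgt)
```

With the toy table, a ru→de request has no direct model, so the text is routed ru→en→de transparently to the caller.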
```
app/
├── main.py                        # FastAPI entrypoint with retry logic
├── bot.py                         # aiogram Bot & Dispatcher with custom session
├── api/routes.py                  # Webhook endpoints
├── core/
│   ├── config.py                  # Environment configuration
│   └── webhook.py                 # Webhook registration
├── handlers/
│   ├── translate.py               # Main translation handler
│   ├── pair_select.py             # Language pair selection
│   ├── start.py                   # Bot start command
│   └── admin.py                   # Admin commands
├── ml/
│   ├── translation_service.py     # Helsinki-NLP translation pipeline
│   ├── language_service.py        # Language detection service
│   ├── language_parser.py         # Target language extraction
│   ├── nllb_codes.py              # Language code mappings
│   └── language_sentence_split.py # Text preprocessing
├── i18n/
│   ├── messages.py                # Internationalized messages
│   └── resources.py               # Runtime i18n strings
├── keyboards/
│   └── pairs.py                   # Inline keyboard layouts
└── services/
    └── message_filter.py          # Message filtering utilities
```
- Python 3.11+
- Telegram Bot Token from @BotFather
- Docker (optional, for containerized deployment)
```bash
# Clone the repository
git clone https://github.com/yourusername/telegramTranslator.git
cd telegramTranslator

# Install dependencies
poetry install

# Configure environment
cp .env.example .env
# Edit .env with your BOT_TOKEN

# Run the bot
poetry run uvicorn app.main:app --reload
```

```bash
# Build and run with Docker Compose
docker-compose up --build

# Or build manually
docker build -t telegram-translator .
docker run -p 8000:8000 telegram-translator
```

The bot supports webhook mode for production deployment:
```bash
# Set webhook URL in .env
WEBHOOK_URL=https://your-domain.com/webhook

# Webhook endpoint
POST /webhook
```

For local testing, use ngrok to expose your local server.
The bot supports 50+ language pairs including:
- European: English, Russian, German, French, Spanish, Italian, Portuguese, Dutch, Polish
- Asian: Chinese, Japanese, Korean, Hindi, Thai, Vietnamese
- Middle Eastern: Arabic, Turkish, Hebrew, Persian
- African: Swahili, Amharic, Afrikaans
- And many more...
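Direct pairs map to per-pair opus-mt checkpoints on the Hugging Face Hub, whose model ids follow a fixed naming convention. A minimal helper (the function name is ours, and not every constructed id actually exists on the Hub):

```python
def opus_mt_model_name(src: str, tgt: str) -> str:
    """Build the Hugging Face model id for a Helsinki-NLP opus-mt pair,
    e.g. ("en", "de") -> "Helsinki-NLP/opus-mt-en-de".

    Not every pair is published, so callers should be prepared for a
    load failure and fall back to pivoting through English.
    """
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"
```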
```bash
# Run all tests
poetry run pytest

# Run with coverage
poetry run pytest --cov=app

# Run specific test files
poetry run pytest tests/test_translation_service.py
```

```bash
# Format code
poetry run black app/ tests/

# Sort imports
poetry run isort app/ tests/

# Run pre-commit hooks
poetry run pre-commit run --all-files
```

- Update `app/i18n/languages.py` with new language mappings
- Add translations to `app/i18n/messages.py`
- Test with `poetry run pytest tests/test_language_service.py`
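The steps above boil down to registering strings under a language key and letting lookup fall back to English when a key or language is missing. A minimal sketch, assuming a dict-based layout (the `MESSAGES` structure and `get_message` helper are illustrative, not the real `messages.py`):

```python
# Illustrative i18n lookup with English fallback; the MESSAGES layout
# is an assumption for this sketch, not the project's actual data.
MESSAGES = {
    "en": {"start": "Hi! Send me text to translate."},
    "ru": {"start": "Privet! Send me text to translate."},
}

def get_message(key: str, lang: str, default_lang: str = "en") -> str:
    """Return the string for (key, lang), falling back to default_lang."""
    return MESSAGES.get(lang, {}).get(key) or MESSAGES[default_lang][key]
```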
- FastAPI (>=0.115.12) - Web framework
- aiogram (>=3.20.0) - Telegram Bot API
- transformers (>=4.52.4) - HuggingFace models
- torch (CPU-only) - PyTorch for inference
- langdetect (>=1.0.9) - Language detection
- lingua-language-detector (>=2.1.1) - Advanced language detection
- spacy (>=3.8.7) - NLP processing
- pytest - Testing framework
- black - Code formatting
- isort - Import sorting
- pre-commit - Git hooks
The project includes optimized Docker configuration:
- Multi-stage builds for smaller images
- CPU-only PyTorch to avoid NVIDIA dependencies
- Non-root user for security
- Health checks for monitoring
- Volume mounts for model caching
- Model Caching: Translation models are cached in memory
- Batch Processing: Multiple sentences processed together
- Async Architecture: Non-blocking I/O operations
- CPU Optimization: Lightweight models for faster inference
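The in-memory model cache above can be expressed with a standard memoization decorator: the first request for a language pair pays the model-load cost, later requests reuse the cached pipeline. `load_pipeline` is a stand-in for the real transformers pipeline construction, and the cache size is an illustrative choice, not the project's setting.

```python
from functools import lru_cache

LOAD_COUNT = {"n": 0}  # instrumentation so the sketch can show cache hits

def load_pipeline(src: str, tgt: str) -> str:
    """Stand-in for the expensive Helsinki-NLP model load."""
    LOAD_COUNT["n"] += 1
    return f"pipeline:{src}-{tgt}"

@lru_cache(maxsize=8)  # keep the most recently used pairs in memory
def get_pipeline(src: str, tgt: str) -> str:
    return load_pipeline(src, tgt)
```

Repeated calls with the same pair hit the cache, so the underlying load runs once per pair until eviction.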
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
This project is licensed under the GNU Affero General Public License v3.0 - see the LICENSE file for details.
- Copyleft Protection: Derivative works must remain open source
- Network Clause: Source code must be offered even when the bot is only accessed over a network
- Community Driven: Ensures improvements benefit everyone
- Commercial Friendly: Allows you to monetize your own instance
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Check the `/docs` folder for detailed guides
- Helsinki-NLP for the lightweight translation models
- HuggingFace for the transformers library
- aiogram team for the excellent Telegram Bot framework
- FastAPI team for the modern web framework
Made with ❤️ for the open-source community