Multilingual AI Dubbing System


🎧 Choose your mode, dub in any language, and enjoy crystal-clear vocals.


Bluez-Dubbing is a modular, production-ready pipeline for automatic video dubbing and subtitle generation. It integrates state-of-the-art models for ASR (Automatic Speech Recognition), translation, and TTS (Text-to-Speech), supporting features like:

  • audio source separation
  • VAD-based duration alignment
  • sophisticated dubbing strategies
  • customizable subtitle styles

🚀 Features

  • End-to-End Dubbing: From video/audio input to fully dubbed output with burned-in subtitles.
  • Multiple Modes: Video dubbing (with or without subtitles), audio translation, or subtitling only.
  • REST API & CLI: FastAPI endpoints and command-line tools for automation.
  • Independent Web UI: A dedicated app offering an intuitive experience and live progress tracking. See Web UI for details.
  • Modular Services: Easily plug, swap, or extend ASR, translation, and TTS models.
  • Flexible Translation: Segment-wise or full-text translation with smart synchronization.
  • Advanced Audio Synchronization: Multiple algorithms for seamless and natural voice replacement.
  • Subtitle Generation: Netflix-style, bold-desktop, or mobile-optimized SRT/VTT/ASS output.

🗂️ Project Structure

bluez-dubbing/
├── apps/
│   ├── backend/
│   │   ├── cache/              # Cached audio/background/intermediate data
│   │   ├── libs/
│   │   │   └── common-schemas/ # Shared Pydantic models & utilities
│   │   ├── models_cache/       # Downloaded model weights/configs
│   │   ├── outs/               # Output workspaces per job
│   │   ├── services/
│   │   │   ├── asr/            # ASR (WhisperX, etc.)
│   │   │   ├── orchestrator/   # Main API & pipeline logic
│   │   │   ├── translation/    # Translation service
│   │   │   └── tts/            # TTS service
│   │   └── uploads/            # Uploaded media from the UI
│   └── frontend/
│       ├── assets/             # UI icons and branding
│       ├── scripts/            # JS modules for the Web UI
│       ├── styles/             # Stylesheets
│       └── index.html          # Web application entry
├── Makefile
└── README.md

📽️ Demo

Original Video (Chinese)

Original video thumbnail

Dubbed (English) Without Subtitles

Dubbed English thumbnail

Dubbed (French) With Subtitles

Dubbed French thumbnail

⚡ Quickstart

1. Clone the Repository

git clone https://github.com/Globluez/bluez-dubbing.git
cd bluez-dubbing

2. Install Dependencies (via uv)

Ensure ffmpeg and uv are installed. Linux example:

sudo apt update && sudo apt install ffmpeg -y
curl -LsSf https://astral.sh/uv/install.sh | sh   # official uv installer

Note: Some tokenizers (e.g. mecab-python3 for Japanese) require a JVM to be installed.

To install dependencies for any service:

cd apps/backend/services/<serviceName>
uv sync

Or for all at once:

make install-dep

This sets up .venv environments for each service (ASR, translation, TTS, orchestrator).

Dependency notes:

  • If onnx and ml_dtypes conflict, run:

    uv lock --upgrade-package ml_dtypes==0.5.3 && uv sync
  • Chatterbox pins torch==2.6.0 / torchaudio==2.6.0. If your hardware needs newer versions (e.g., RTX 5080 GPUs require ≥ 2.8.0):

    uv pip uninstall torch torchaudio
    uv pip install torch==2.8.0 torchaudio==2.8.0

    For CUDA wheels (Windows or manual install):

    uv pip install torch==2.8.0 torchaudio==2.8.0 \
      --index-url https://download.pytorch.org/whl/cu12x

    ⚠️ Don’t re-run uv sync afterwards, as it will downgrade again.

3. Configure Environment

  • Copy .env.example to .env
  • Set required variables (HF_TOKEN, ORCHESTRATOR_ALLOWED_ORIGINS, etc.), as in the sketch below
  • Place model weights in models_cache/
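
A minimal .env sketch, assuming only the two variables named above; both values are placeholders and your setup may require more:

# Hugging Face access token (placeholder value)
HF_TOKEN=hf_xxxxxxxxxxxxxxxx
# Origins allowed to call the orchestrator API (placeholder value)
ORCHESTRATOR_ALLOWED_ORIGINS=http://localhost:5173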

4. Run the Backend Stack

make start-api      # Launch orchestrator only
make stack-up       # Launch ASR, translation, TTS, orchestrator
make stop           # Stop all services
make restart        # Restart everything

5. Serve the Frontend UI

make start-ui

Default URL: http://localhost:5173. The UI connects to the backend at http://localhost:8000/api. To point it at another host, set the override in the browser console:

localStorage.setItem("bluez-backend-base", "https://your-host/api");

Restart or stop with:

make restart-ui
make stop

🛠️ Usage

See CONTRIBUTING.md for a full explanation of parameters and tuning guidance. Defaults work for most cases, and the models adjust automatically when needed.

Web UI

Web UI screenshot

After serving the frontend:

  • Upload a file or paste a video link (YouTube, Instagram, TikTok…)
  • Adjust model and dubbing parameters (or use auto-selection), hit Run Dubbing Pipeline, and that's it!
  • Watch live logs (ASR → Translation → TTS → Merge)
  • Preview or download results
  • Choose Lazy Mode (fully automatic) or Involve Mode (manual fine-tuning)
  • Toggle “Keep Intermediate Artefacts” to retain separated tracks or transcripts

API Example

curl -X POST -G 'http://localhost:8000/v1/dub' \
  --data-urlencode 'video_url=/path/to/video.mp4' \
  --data-urlencode 'target_work=dub' \
  --data-urlencode 'target_langs=fr' \
  --data-urlencode 'asr_model=whisperx' \
  --data-urlencode 'tr_model=deep_translator' \
  --data-urlencode 'tts_model=edge_tts' \
  --data-urlencode 'perform_vad_trimming=true' \
  --data-urlencode 'dubbing_strategy=full_replacement' \
  --data-urlencode 'sophisticated_dub_timing=true' \
  --data-urlencode 'subtitle_style=netflix_mobile' \
  --data-urlencode 'persist_intermediate=false'

Outputs are saved to apps/backend/outs/<workspace_id>/.
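
The same request from Python, as a minimal sketch; it assumes the requests package is installed and simply mirrors the curl call above, with every parameter sent as a query-string value:

import requests

# Same parameters as the curl example above.
params = {
    "video_url": "/path/to/video.mp4",
    "target_work": "dub",
    "target_langs": "fr",
    "asr_model": "whisperx",
    "tr_model": "deep_translator",
    "tts_model": "edge_tts",
    "perform_vad_trimming": "true",
    "dubbing_strategy": "full_replacement",
    "sophisticated_dub_timing": "true",
    "subtitle_style": "netflix_mobile",
    "persist_intermediate": "false",
}

response = requests.post("http://localhost:8000/v1/dub", params=params)
response.raise_for_status()
print(response.json())  # the response schema is defined by the orchestrator; printed as-is here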


💻 CLI Tools

Each microservice has its own CLI for debugging or running isolated stages:

# ASR
uv run python -m services.asr.cli /path/to/audio.wav --output-json asr.json

# Translation
uv run python -m services.translation.cli asr.json --target-lang fr --output-json translation.json

# TTS
uv run python -m services.tts.cli translation.json --workspace ./tts_out --output-json tts.json

Run --help on any CLI for available flags.
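
A minimal sketch chaining the three documented stages from Python; it assumes each command is invoked from wherever you would run it manually (so the right .venv and the services package are picked up) and uses only the commands and flags shown above:

import subprocess

def run(cmd):
    # Run one pipeline stage and stop immediately if it exits with an error.
    print(">>", " ".join(cmd))
    subprocess.run(cmd, check=True)

# ASR: audio file -> asr.json
run(["uv", "run", "python", "-m", "services.asr.cli",
     "/path/to/audio.wav", "--output-json", "asr.json"])

# Translation: asr.json -> translation.json
run(["uv", "run", "python", "-m", "services.translation.cli",
     "asr.json", "--target-lang", "fr", "--output-json", "translation.json"])

# TTS: translation.json -> synthesized audio in ./tts_out
run(["uv", "run", "python", "-m", "services.tts.cli",
     "translation.json", "--workspace", "./tts_out", "--output-json", "tts.json"])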


🧪 Tests

Run tests via:

make test

Includes:

  • unit tests for service CLIs
  • registry validation (ensures all registered models run properly)
  • end-to-end integration test for the orchestrator pipeline

⚙️ Continuous Integration

GitHub Actions workflow (.github/workflows/ci.yml) automatically:

  • sets up Python 3.11 + uv
  • runs make test
  • validates model registries and pipeline integration

Ensure your PRs keep all tests green.


🧩 Supported Models

  • ASR: WhisperX out of the box; extend via services/asr/app/registry.py.
  • Translation: deep_translator, M2M100, and pluggable custom translators.
  • TTS: Edge TTS, Chatterbox, plus any custom registry entry.

See libs/common-schemas/config/ for model configs and supported languages.


🧠 Extending

Add new models via each service's registry.py and model folder; see CONTRIBUTING.md for more details.
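
Purely as an illustration, here is a hypothetical shape for a new TTS registry entry; the real interface is whatever the service's registry.py defines, so every name below (CustomTTS, synthesize, REGISTRY) is a placeholder rather than the project's confirmed API:

# Hypothetical sketch only; consult the actual registry.py of the service you extend.
class CustomTTS:
    """Placeholder adapter around a custom text-to-speech backend."""

    def synthesize(self, text: str, voice: str, out_path: str) -> str:
        # Generate speech for `text`, write it to `out_path`, and return the path.
        raise NotImplementedError

# A registry typically maps a model id (the value passed as tts_model=... in the API/CLI)
# to the class or factory that implements it.
REGISTRY = {"custom_tts": CustomTTS}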


🤝 Contributing

Contributions are welcome! Please read CONTRIBUTING.md before submitting PRs or issues.


📄 License

Licensed under the Apache License 2.0.


🙏 Acknowledgements

Thanks to the open-source projects this system builds on, including WhisperX, FFmpeg, FastAPI, uv, Edge TTS, Chatterbox, deep_translator, and M2M100.


Contact: 📧 contactglobluez@gmail.com