Kokoro TTS Container

A Docker container for running the Kokoro Text-to-Speech engine v1, providing high-quality speech synthesis with 54 voices across 9 languages.

Container Features

High-quality text-to-speech synthesis
Multiple voice and language options
Voice blending capabilities
Adjustable speech speed
Support for .mp3 and .wav output files

Quick Start Using Docker Hub Image

You can directly pull and run the pre-built container from Docker Hub without building locally:

# Pull the latest image
docker pull usrbinbrain/kokoro-tts-container:latest

# Run a basic example
docker run --rm -v "$(pwd):/app/shared" usrbinbrain/kokoro-tts-container \
    "Hello world!" \
    output.mp3 \
    --voice "af_sarah" \
    --speed 1.0 \
    --lang "en-us"

This way you can use Kokoro-TTS instantly without worrying about setup or build steps.

Local Setup && Build

Building your kokoro-tts Docker image:

# Install requirements for setup
pip3 install -r requirements.txt

# Run setup to download the model and generate the voices bin file
python3 setup.py

# Build your kokoro-tts image
docker build -t kokoro-tts-container .

Usage

Basic Usage Examples

Run the container with a single voice.

The command below generates an output.mp3 file in which the af_sarah voice says "Hello my friend!" in English (US) at speed 1.2:

docker run --rm -v "$(pwd):/app/shared" kokoro-tts-container \
    "Hello my friend!" \
    output.mp3 \
    --voice "af_sarah" \
    --speed 1.2 \
    --lang "en-us"

Voice Blending

Kokoro-TTS supports voice blending, allowing you to mix multiple voices with different weights.

The command below generates an output.wav file with a blended voice, where af_sarah contributes 40% and am_adam contributes 60%, saying "Hasta la vista!" in Spanish at speed 0.8:

docker run --rm -v "$(pwd):/app/shared" kokoro-tts-container \
    "Hasta la vista!" \
    output.wav \
    --voice "af_sarah:40,am_adam:60" \
    --speed 0.8 \
    --lang "es"
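
To make the blend format concrete, here is a minimal Python sketch of how a `voice1:weight,voice2:weight` string could be turned into fractional weights. It is an illustration only, not the container's actual parsing code: the function name `parse_voice_blend` is hypothetical, and treating a bare voice ID as weight 1 and normalizing the weights to sum to 1.0 are assumptions.

```python
from __future__ import annotations


def parse_voice_blend(spec: str) -> dict[str, float]:
    """Parse 'voice1:weight,voice2:weight' into weights that sum to 1.0.

    Hypothetical helper: the normalization rule is an assumption,
    not taken from the container's source.
    """
    weights: dict[str, float] = {}
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        name, sep, raw = part.partition(":")
        name = name.strip()
        if not name:
            raise ValueError(f"missing voice ID in {part!r}")
        # Assumption: a bare voice ID (no ':weight') counts as weight 1.
        weight = float(raw) if sep else 1.0
        if weight <= 0:
            raise ValueError(f"weight must be positive in {part!r}")
        weights[name] = weights.get(name, 0.0) + weight
    if not weights:
        raise ValueError("no voices given")
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    print(parse_voice_blend("af_sarah:40,am_adam:60"))
```

Under these assumptions, `af_sarah:40,am_adam:60` maps to 0.4 for af_sarah and 0.6 for am_adam, matching the 40%/60% split described above.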

Container Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| `input_text` | The text to synthesize | Required |
| `output_file` | Output audio filename (`.wav` or `.mp3`) | Required |
| `--voice` | Voice ID or blend (format: `voice1:weight,voice2:weight`) | `af_sarah` |
| `--speed` | Speech rate multiplier, allows `0.5` to `2.0` | `1.0` |
| `--lang` | Language code | `en-us` |
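
If you call the container from a script, it can help to check the parameters before starting Docker. The Python sketch below builds the `docker run` argument list from the parameters listed above and rejects an out-of-range speed or an unsupported file extension early. The function `build_command` and its `image` and `shared_dir` arguments are hypothetical names for this illustration; the container may apply its own, different validation.

```python
from __future__ import annotations

import os
import shlex

ALLOWED_EXTENSIONS = (".mp3", ".wav")


def build_command(
    text: str,
    output_file: str,
    voice: str = "af_sarah",
    speed: float = 1.0,
    lang: str = "en-us",
    image: str = "kokoro-tts-container",
    shared_dir: str | None = None,
) -> list[str]:
    """Return the docker argument list for one synthesis run (hypothetical helper)."""
    if not text.strip():
        raise ValueError("input_text must not be empty")
    if not output_file.lower().endswith(ALLOWED_EXTENSIONS):
        raise ValueError("output_file must end in .mp3 or .wav")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    # Mount the host directory at /app/shared, as in the examples above.
    host_dir = os.path.abspath(shared_dir or os.getcwd())
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:/app/shared",
        image,
        text,
        output_file,
        "--voice", voice,
        "--speed", str(speed),
        "--lang", lang,
    ]


if __name__ == "__main__":
    cmd = build_command("Hello my friend!", "output.mp3", speed=1.2)
    # Print a copy-pasteable shell command; use subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Passing the arguments as a list (rather than a single shell string) avoids quoting problems when the text or the host path contains spaces.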

Supported Languages and Codes

en-us: English (US)
en-gb: English (British)
fr-fr: French
ja: Japanese
hi: Hindi
cmn: Mandarin Chinese
es: Spanish
pt-br: Brazilian Portuguese
it: Italian
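
For scripts that accept a language from user input, the list above can be kept as a small lookup table. The following Python sketch is a client-side convenience only: `normalize_lang` is a hypothetical helper, and whether the container itself accepts uppercase codes or underscores is not documented here, so the sketch normalizes to the exact codes listed above before passing them to `--lang`.

```python
# Codes and names copied from the list above.
SUPPORTED_LANGUAGES = {
    "en-us": "English (US)",
    "en-gb": "English (British)",
    "fr-fr": "French",
    "ja": "Japanese",
    "hi": "Hindi",
    "cmn": "Mandarin Chinese",
    "es": "Spanish",
    "pt-br": "Brazilian Portuguese",
    "it": "Italian",
}


def normalize_lang(code: str) -> str:
    """Map user input such as 'EN_US' to a listed code, or raise ValueError."""
    key = code.strip().lower().replace("_", "-")
    if key not in SUPPORTED_LANGUAGES:
        choices = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"unsupported language {code!r}; choose one of: {choices}")
    return key


if __name__ == "__main__":
    code = normalize_lang("PT_BR")
    print(code, "->", SUPPORTED_LANGUAGES[code])
```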

Available Voices

The container includes multiple voices for different languages. For a complete list of voices, or for other help, run:

docker run --rm kokoro-tts-container --help

Thanks

Built with ❤️ on top of Kokoro ONNX. Special thanks to thewh1teagle and hexgrad for providing the fast TTS engine that made this container project possible.
