slabstech/parler-tts-server

 
 

Parler-TTS-Server

This repository provides a server that exposes Parler-TTS through an OpenAI-compatible API.

Quick Start

Prerequisites

  • Docker
  • cURL

Docker

Run the server with the default model:

docker run --detach --volume ~/.cache/huggingface:/root/.cache/huggingface --publish 8000:8000 slabstech/parler-tts-server

Run the server with a fine-tuned model:

docker run --detach --volume ~/.cache/huggingface:/root/.cache/huggingface --publish 8000:8000 --env MODEL="ai4bharat/indic-parler-tts" slabstech/parler-tts-server

Docker Compose

Download the compose.yaml file and start the server:

curl -sO https://raw.githubusercontent.com/sachinsshetty/parler-tts-server/refs/heads/master/compose.yaml
docker compose up --detach parler-tts-server
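
Both commands start the container in the background, so the server may not be reachable right away (on first start it presumably has to fetch model weights into the mounted Hugging Face cache, which is an assumption based on the volume mapping above). A minimal sketch for waiting until port 8000 accepts connections is shown below. The helper name `wait_for_port` is hypothetical, and an open port only shows that the process is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait a little and retry.
            time.sleep(interval)
    return False


# Example call (port 8000 matches the --publish 8000:8000 flag above):
# wait_for_port("localhost", 8000)
```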

Usage

Examples

Kannada

curl -s -H "content-type: application/json" localhost:8000/v1/audio/speech -d '{"input": "ಉದ್ಯಾನದಲ್ಲಿ ಮಕ್ಕಳ ಆಟವಾಡುತ್ತಿದ್ದಾರೆ ಮತ್ತು ಪಕ್ಷಿಗಳು ಚಿಲಿಪಿಲಿ ಮಾಡುತ್ತಿವೆ."}' -o audio_kannada.mp3

Hindi

curl -s -H "content-type: application/json" localhost:8000/v1/audio/speech -d '{"input": "अरे, तुम आज कैसे हो?"}' -o audio_hindi.mp3

Saving to File

curl -s -H "content-type: application/json" localhost:8000/v1/audio/speech -d '{"input": "Hey, how are you?"}' -o audio.mp3

Specifying a Different Format

curl -s -H "content-type: application/json" localhost:8000/v1/audio/speech -d '{"input": "Hey, how are you?", "response_format": "wav"}' -o audio.wav

Playing Back the Audio

curl -s -H "content-type: application/json" localhost:8000/v1/audio/speech -d '{"input": "Hey, how are you?"}' | ffplay -hide_banner -autoexit -nodisp -loglevel quiet -

Describing the Voice

curl -s -H "content-type: application/json" localhost:8000/v1/audio/speech -d '{"input": "Hey, how are you?", "voice": "Feminine, speedy, and cheerful"}' | ffplay -hide_banner -autoexit -nodisp -loglevel quiet -
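
The curl calls above can also be issued from Python using only the standard library. The sketch below sends the same `input` and `voice` fields to the same endpoint and writes the response body to a file. The function names `build_speech_request` and `synthesize_to_file` are hypothetical, and the code assumes the server returns raw audio bytes in the response body, as the `-o audio.mp3` examples suggest.

```python
import json
import urllib.request
from typing import Optional

# Endpoint used by the curl examples above.
SPEECH_URL = "http://localhost:8000/v1/audio/speech"


def build_speech_request(text: str, voice: Optional[str] = None, url: str = SPEECH_URL) -> urllib.request.Request:
    """Build the POST request; `voice` is a free-text description of the speaker."""
    payload = {"input": text}
    if voice is not None:
        payload["voice"] = voice
    # ensure_ascii=False keeps Kannada or Hindi text as UTF-8 instead of \u escapes.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, path: str, voice: Optional[str] = None) -> int:
    """Send the request, write the returned audio to `path`, return the byte count."""
    request = build_speech_request(text, voice)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        audio = response.read()
        out.write(audio)
    return len(audio)


# Example call (requires a running server):
# synthesize_to_file("Hey, how are you?", "audio.mp3", voice="Feminine, speedy, and cheerful")
```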

OpenAI SDK Usage Example

An example of using the OpenAI SDK can be found here.
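
As a rough illustration of what such a call might look like, the sketch below points the `openai` Python package (version 1.x) at the local server. This is not the repository's own example. The helper names are hypothetical, the placeholder API key assumes the server does not validate it, and passing `ai4bharat/indic-parler-tts` as `model` assumes the server accepts the name of the model it was started with.

```python
def speech_base_url(host: str = "localhost", port: int = 8000) -> str:
    """Base URL for the SDK, which appends /audio/speech to it."""
    return f"http://{host}:{port}/v1"


def synthesize_with_sdk(
    text: str,
    voice: str,
    path: str,
    model: str = "ai4bharat/indic-parler-tts",  # assumption: matches the MODEL the server runs
) -> None:
    """Request speech through the OpenAI SDK and stream the audio to `path`."""
    # Imported lazily so the helper above works without the `openai` package installed.
    from openai import OpenAI

    # The SDK requires an API key; a placeholder is used on the assumption
    # that the local server ignores it.
    client = OpenAI(base_url=speech_base_url(), api_key="not-needed")
    with client.audio.speech.with_streaming_response.create(
        model=model,
        voice=voice,  # free-text voice description, as in the curl example
        input=text,
    ) as response:
        response.stream_to_file(path)


# Example call (requires `pip install openai` and a running server):
# synthesize_with_sdk("Hey, how are you?", "Feminine, speedy, and cheerful", "audio.mp3")
```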

Citations

@misc{lacombe-etal-2024-parler-tts,
  author = {Yoach Lacombe and Vaibhav Srivastav and Sanchit Gandhi},
  title = {Parler-TTS},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/parler-tts}}
}
@misc{lyth2024natural,
  title = {Natural language guidance of high-fidelity text-to-speech with synthetic annotations},
  author = {Dan Lyth and Simon King},
  year = {2024},
  eprint = {2402.01912},
  archivePrefix = {arXiv},
  primaryClass = {cs.SD}
}

License

This project is licensed under the MIT License. See the LICENSE file for details.
