Description
As Coqui says in their docs:
Due to its open-source nature, relatively high-quality voices and fast synthesis speed, Mary-TTS was a popular choice in the past, and many tools implemented API support over the years, such as screen readers (NVDA + SpeechHub), smart-home hubs (openHAB, Home Assistant) and voice assistants (Rhasspy, Mycroft, SEPIA). A compatibility layer for Coqui-TTS will ensure that these tools can use Coqui as a drop-in replacement and get even better voices right away.
Mary-TTS can run as an HTTP server to allow access to the API via HTTP GET and POST calls. The best documentation of this API is probably the web page served by your self-hosted Mary-TTS server, along with the Java docs page.
Mary-TTS offers a large number of endpoints to load styles, audio effects, examples etc., but compatible tools often require only 3 of them to work:
- `/locales` (GET): returns a list of supported locales in the format `[locale]\n...`, for example "en_US", "de_DE" or simply "en".
- `/voices` (GET): returns a list of supported voices in the format `[name] [locale] [gender]\n...`; 'name' can be anything without spaces(!) and 'gender' is traditionally `f` or `m`.
- `/process?INPUT_TEXT=[my text]&INPUT_TYPE=TEXT&LOCALE=[locale]&VOICE=[name]&OUTPUT_TYPE=AUDIO&AUDIO=WAVE_FILE` (GET/POST): processes the input text and returns a wav file. INPUT_TYPE, OUTPUT_TYPE and AUDIO support additional values, but are usually static in compatible tools.
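To make the three endpoints concrete, here is a minimal client-side sketch in Python. It only parses the two plain-text list formats and builds the `/process` URL; it does not talk to a server. The function names, the `Voice` type and the `http://localhost:59125` base URL are assumptions for illustration, not part of Coqui-TTS or Mary-TTS.

```python
from typing import List, NamedTuple
from urllib.parse import urlencode


class Voice(NamedTuple):
    """One line of a /voices response (hypothetical helper type)."""
    name: str
    locale: str
    gender: str


def parse_locales(body: str) -> List[str]:
    # /locales returns one locale per line, e.g. "en_US\nde_DE\n".
    return [line.strip() for line in body.splitlines() if line.strip()]


def parse_voices(body: str) -> List[Voice]:
    # /voices returns "[name] [locale] [gender]" per line, separated by spaces.
    voices = []
    for line in body.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            # Any columns after the third are ignored in this sketch.
            voices.append(Voice(parts[0], parts[1], parts[2]))
    return voices


def build_process_url(base_url: str, text: str, locale: str, voice: str) -> str:
    # INPUT_TYPE, OUTPUT_TYPE and AUDIO are kept static, as compatible tools
    # usually do; only the text, locale and voice vary per request.
    query = urlencode({
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "LOCALE": locale,
        "VOICE": voice,
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
    })
    return f"{base_url.rstrip('/')}/process?{query}"
```

The URL returned by `build_process_url` could then be fetched with, for example, `urllib.request.urlopen(url).read()` to obtain the wav bytes, assuming a compatible server is listening at the chosen base URL.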