AhaTTS is an open-source, production-ready text-to-speech service with an OpenAI-compatible API and a built-in web demo.
- OpenAI-compatible API for drop-in integration
- Built-in `api/webdemo` UI
- CPU / CUDA / MPS support
- Multi-language and multi-voice (Kokoro)
- Streaming output with high-quality audio
./scripts/install.sh --device cpu # or gpu / mac
./scripts/dev.sh
Visit:
Request URL: http(s)://<server-address>:<port>/v1/audio/speech (POST).
Parameters:
- `model` (string, required): one of `tts-1` or `tts-1-hd`.
- `input` (string, required): text to generate audio from. Max length 4096 characters.
- `voice` (string, required): one of `alloy`, `ash`, `coral`, `echo`, `fable`, `onyx`, `nova`, `sage`, `shimmer`.
- `response_format` (string, optional): audio format, default `mp3`. Supported: `mp3`, `opus`, `aac`, `flac`.
- `speed` (number, optional): audio speed, default `1.0`. Range `0.5` to `2.0`.
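
The parameters above can be exercised with a small client. The following is a minimal sketch using only the Python standard library, not an official AhaTTS client. It assumes the endpoint accepts a JSON body, as the OpenAI speech endpoint does. The helper names `build_speech_request` and `synthesize` are hypothetical, and so is the base URL `http://localhost:8000`; substitute your own server address and port.

```python
import json
import urllib.request

# Allowed values copied from the parameter list above.
MODELS = {"tts-1", "tts-1-hd"}
VOICES = {"alloy", "ash", "coral", "echo", "fable", "onyx", "nova", "sage", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac"}


def build_speech_request(base_url, text, voice="alloy", model="tts-1",
                         response_format="mp3", speed=1.0):
    """Validate the parameters and build (but do not send) the POST request.

    Hypothetical helper; assumes the server expects a JSON body.
    """
    if model not in MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if not text or len(text) > 4096:
        raise ValueError("input must be 1 to 4096 characters")
    if voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(base_url, text, out_path, **options):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(base_url, text, **options)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

With a server running, a call such as `synthesize("http://localhost:8000", "Hello from AhaTTS", "hello.mp3", voice="nova")` would write the audio to `hello.mp3`. Validating on the client side is a design choice that surfaces out-of-range values before any network round trip; the server may apply its own checks as well.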
Special thanks to the following projects for inspiration and reference implementations:
- Kokoro-FastAPI: https://github.com/remsky/Kokoro-FastAPI
- kokoro: https://github.com/hexgrad/kokoro
