| 1 | +--- |
| 2 | +slug: echokit-30-days-day-18-groq-playai-tts |
| 3 | +title: "Day 18: Switching EchoKit to Groq PlayAI TTS | The First 30 Days with EchoKit" |
| 4 | +tags: [echokit30days, tts] |
| 5 | +--- |
| 6 | + |
| 7 | + |
| 8 | +Over the past two weeks, we’ve built almost every core component of a voice AI agent on EchoKit: |
| 9 | + |
| 10 | +* ASR to turn speech into text. |
| 11 | +* LLMs to reason, chat, and call tools. |
| 12 | +* [System prompts to shape personality](https://echokit.dev/docs/dev/echokit-30-days-day-14-personality). |
| 13 | +* [MCP servers to let the agent take real actions](https://echokit.dev/docs/dev/echokit-30-days-day-15-mcp-web-search). |
| 14 | +* [TTS to give EchoKit a voice](https://echokit.dev/docs/dev/echokit-30-days-day-17-elevenlabs). |
| 15 | + |
| 16 | +Today, we close the loop again, this time with a **new voice engine**. |
| 17 | + |
| 18 | +We’re switching EchoKit’s TTS backend to **[Groq’s PlayAI TTS](https://console.groq.com/docs/model/playai-tts)**. |
| 19 | + |
| 20 | +### Why change TTS? |
| 21 | + |
| 22 | +Text-to-speech is often treated as the “last step” in a voice pipeline, but in practice, it’s the part users feel the most. |
| 23 | + |
| 24 | +Latency, voice stability, and natural prosody directly affect whether a voice agent feels responsive or awkward. Since Groq already powers our ASR and LLM experiments with very low latency, it made sense to test their TTS offering as well. |
| 25 | + |
| 26 | +PlayAI TTS fits EchoKit’s design goals nicely: |
| 27 | +it’s fast, simple to integrate, and exposed through an OpenAI-compatible API. |
| 28 | + |
| 29 | +That means **no special SDK**, and no changes to EchoKit’s core architecture. |
| 30 | + |
| 31 | +### Switching EchoKit to Groq PlayAI TTS |
| 32 | + |
| 33 | +On EchoKit, swapping TTS providers is mostly a configuration change. |
| 34 | + |
| 35 | +To use Groq PlayAI TTS, we update the `tts` section in `config.toml` like this: |
| 36 | + |
| 37 | +```toml |
| 38 | +[tts] |
| 39 | +platform = "openai" |
| 40 | +url = "https://api.groq.com/openai/v1/audio/speech" |
| 41 | +model = "playai-tts" |
| 42 | +api_key = "gsk_xxx" |
| 43 | +voice = "Fritz-PlayAI" |
| 44 | +``` |
| 45 | + |
| 46 | +A few things worth calling out: |
| 47 | + |
| 48 | +* The `platform` stays as `openai` because Groq exposes an OpenAI-compatible endpoint. |
| 49 | +* The `url` points directly to Groq’s audio speech API. |
| 50 | +* The `model` is set to `playai-tts`. |
| 51 | +* The `voice` field selects the voice; here we’re using `Fritz-PlayAI`. |
| 52 | + |
| 53 | +Once this is in place, no other code changes are required. |
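
Before restarting the server, it can help to confirm that the key, model, and voice are accepted by the endpoint. The sketch below is not EchoKit's own code: it is a standalone Python check that builds an OpenAI-style speech request from the same values as the `[tts]` table. The `response_format` value, the `User-Agent` string, the output filename, and the helper names `build_speech_request` and `synthesize` are assumptions made for illustration.

```python
import json
import urllib.request

# Values mirror the [tts] table in config.toml; replace api_key with a real key.
TTS_CONFIG = {
    "url": "https://api.groq.com/openai/v1/audio/speech",
    "model": "playai-tts",
    "api_key": "gsk_xxx",
    "voice": "Fritz-PlayAI",
}


def build_speech_request(cfg: dict, text: str) -> urllib.request.Request:
    """Build an OpenAI-style speech request from the [tts] settings."""
    payload = {
        "model": cfg["model"],
        "voice": cfg["voice"],
        "input": text,
        "response_format": "wav",  # assumption: ask for WAV output
    }
    return urllib.request.Request(
        cfg["url"],
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {cfg['api_key']}",
            "Content-Type": "application/json",
            # Hypothetical client name; some gateways reject urllib's default.
            "User-Agent": "echokit-tts-check/0.1",
        },
        method="POST",
    )


def synthesize(cfg: dict, text: str, out_path: str) -> int:
    """Send the request, save the audio, and return the bytes written."""
    request = build_speech_request(cfg, text)
    with urllib.request.urlopen(request, timeout=30) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)


# Example (needs a real key and network access):
# synthesize(TTS_CONFIG, "Hello from EchoKit!", "groq_tts_check.wav")
```

If the call succeeds and the saved file plays back, the same values should work in `config.toml`. An HTTP error here usually points at the key, the model name, or the voice name rather than at EchoKit.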
| 54 | + |
| 55 | +Restart the EchoKit server, reconnect the EchoKit device to the new server, and the agent speaks with a new voice. |
| 56 | + |
| 57 | +### The bigger picture |
| 58 | + |
| 59 | +Most importantly, switching between TTS providers reinforces one of EchoKit’s core ideas: |
| 60 | +**every part of the voice pipeline should be swappable.** |
| 61 | + |
| 62 | + |
| 63 | +The point is to treat voice as a first-class system component, something you can experiment with, replace, and optimize just like models or prompts. |
| 64 | + |
| 65 | +EchoKit doesn’t lock you into one vendor or one voice. |
| 66 | +If tomorrow you want to try a different TTS engine, or even run one locally, the architecture already supports that. |
| 67 | + |
| 68 | +--- |
| 69 | + |
| 70 | +Want to get your own EchoKit device and make it unique? |
| 71 | + |
| 72 | +* [EchoKit Box](https://echokit.dev/echokit_box.html) |
| 73 | +* [EchoKit DIY](https://echokit.dev/echokit_diy.html) |
| 74 | + |
| 75 | +Join the [EchoKit Discord](https://discord.gg/Fwe3zsT5g3) to share your welcome voices and see how others are personalizing their voice AI agents! |
| 76 | + |