Commit bbee42d

Merge pull request #118 from second-state/alabulei1-patch-8
Add documentation for switching to Groq PlayAI TTS
2 parents c8941f2 + 36010e0 commit bbee42d


1 file changed: +76 -0 lines changed

@@ -0,0 +1,76 @@
---
slug: echokit-30-days-day-18-groq-playai-tts
title: "Day 18: Switching EchoKit to Groq PlayAI TTS | The First 30 Days with EchoKit"
tags: [echokit30days, tts]
---

Over the past two weeks, we’ve built almost every core component of a voice AI agent on EchoKit:

* ASR to turn speech into text.
* LLMs to reason, chat, and call tools.
* [System prompts to shape personality](https://echokit.dev/docs/dev/echokit-30-days-day-14-personality).
* [MCP servers to let the agent take real actions](https://echokit.dev/docs/dev/echokit-30-days-day-15-mcp-web-search).
* [TTS to give EchoKit a voice](https://echokit.dev/docs/dev/echokit-30-days-day-17-elevenlabs).

Today, we close the loop again, this time with a **new voice engine**.

We’re switching EchoKit’s TTS backend to **[Groq’s PlayAI TTS](https://console.groq.com/docs/model/playai-tts)**.

### Why change TTS?

Text-to-speech is often treated as the “last step” in a voice pipeline, but in practice, it’s the part users feel the most.

Latency, voice stability, and natural prosody directly affect whether a voice agent feels responsive or awkward. Since Groq already powers our ASR and LLM experiments with very low latency, it made sense to test their TTS offering as well.

PlayAI TTS fits EchoKit’s design goals nicely:
It’s fast, simple to integrate, and exposed through an OpenAI-compatible API.

That means **no special SDK**, and no changes to EchoKit’s core architecture.

### Switching EchoKit to Groq PlayAI TTS

On EchoKit, swapping TTS providers is mostly a configuration change.

To use Groq PlayAI TTS, we update the `tts` section in `config.toml` like this:

```toml
[tts]
platform = "openai"
url = "https://api.groq.com/openai/v1/audio/speech"
model = "playai-tts"
api_key = "gsk_xxx"
voice = "Fritz-PlayAI"
```

A few things worth calling out:

* The `platform` stays as `openai` because Groq exposes an OpenAI-compatible endpoint.
* We point the `url` directly to Groq’s audio speech API.
* The model is set to `playai-tts`.
* Voices are selected via the `voice` field; here we’re using `Fritz-PlayAI`.

Once this is in place, no code changes are required.

Restart the EchoKit server, reconnect the EchoKit device to it, and the agent speaks with a new voice.

### The bigger picture

Most importantly, switching between TTS providers reinforces one of EchoKit’s core ideas:
**every part of the voice pipeline should be swappable.**

The point is to treat voice as a first-class system component: something you can experiment with, replace, and optimize just like models or prompts.

EchoKit doesn’t lock you into one vendor or one voice.
If tomorrow you want to try a different TTS engine, or even run one locally, the architecture already supports that.

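
To make the swappable idea concrete, here is a small Python sketch of config-driven backend selection. It is an illustration only and does not reflect EchoKit's actual implementation. Every name in it (`TtsBackend`, `OpenAICompatibleTts`, `LocalTts`, `BACKENDS`, `make_tts`), the `local` platform, and the local endpoint URL are hypothetical. The idea it shows is that the rest of the pipeline depends on one small interface, so changing the `platform` value is enough to change the engine.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class TtsBackend(Protocol):
    """The only surface the rest of the pipeline would depend on."""

    def request_body(self, text: str) -> dict: ...


@dataclass
class OpenAICompatibleTts:
    url: str
    model: str
    voice: str

    def request_body(self, text: str) -> dict:
        # Field names follow the OpenAI-style speech request shape.
        return {"model": self.model, "voice": self.voice, "input": text}


@dataclass
class LocalTts:
    """Stand-in for an engine running on the same machine (hypothetical)."""

    url: str

    def request_body(self, text: str) -> dict:
        return {"text": text}


# Registry mapping a `platform` value to a factory for that backend.
BACKENDS: dict[str, Callable[[dict], TtsBackend]] = {
    "openai": lambda cfg: OpenAICompatibleTts(cfg["url"], cfg["model"], cfg["voice"]),
    "local": lambda cfg: LocalTts(cfg["url"]),
}


def make_tts(cfg: dict) -> TtsBackend:
    """Pick a backend from the config; callers never name a vendor directly."""
    try:
        factory = BACKENDS[cfg["platform"]]
    except KeyError:
        raise ValueError(f"unsupported TTS platform: {cfg.get('platform')!r}") from None
    return factory(cfg)


groq = make_tts({
    "platform": "openai",
    "url": "https://api.groq.com/openai/v1/audio/speech",
    "model": "playai-tts",
    "voice": "Fritz-PlayAI",
})
local = make_tts({"platform": "local", "url": "http://localhost:9000/tts"})
```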

---

Want to get your own EchoKit device and make it unique?

* [EchoKit Box](https://echokit.dev/echokit_box.html)
* [EchoKit DIY](https://echokit.dev/echokit_diy.html)

Join the [EchoKit Discord](https://discord.gg/Fwe3zsT5g3) to share your welcome voices and see how others are personalizing their voice AI agents!
