| 1 | +--- |
| 2 | +slug: echokit-30-days-day-19-fish-audio-tts |
| 3 | +title: "Day 19: Switching EchoKit’s TTS Provider to Fish.audio | The First 30 Days with EchoKit" |
| 4 | +tags: [echokit30days, tts] |
| 5 | +--- |
| 6 | + |
| 7 | + |
| 8 | +Over the past few days, we’ve been iterating on different parts of EchoKit’s voice pipeline — ASR, LLMs, system prompts, and TTS (including ElevenLabs and Groq). |
| 9 | + |
| 10 | +On Day 19, we switch EchoKit’s **Text-to-Speech provider to Fish.audio**, purely through a configuration change. |
| 11 | + |
| 12 | +No code changes are required. |
| 13 | + |
| 14 | +## What Is Fish.audio |
| 15 | + |
| 16 | +Fish.audio is a modern text-to-speech platform focused on **high-quality, expressive voices** and **fast iteration for developers**. |
| 17 | + |
| 18 | +One notable aspect of Fish.audio is the breadth of available voices. It offers a wide range of voice styles, including voices inspired by public figures, pop culture, and anime culture references, which makes it easy to experiment with playful or character-driven agents. |
| 19 | + |
| 20 | +In addition to preset voices, Fish.audio also supports voice cloning, allowing developers to generate speech in a customized voice when needed. |
| 21 | + |
| 22 | +These features make it particularly interesting for conversational and personality-driven voice AI systems. |
| 23 | + |
| 24 | +EchoKit is designed to be provider-agnostic. As long as a TTS service matches the expected interface, it can be plugged into the system without affecting the rest of the pipeline. |
| 25 | + |
| 26 | +## The Exact Change in `config.toml` |
| 27 | + |
| 28 | +Switching to Fish.audio in EchoKit only requires updating the TTS section in the `config.toml` file: |
| 29 | + |
| 30 | +```toml |
| 31 | +[tts] |
| 32 | +platform = "fish" |
| 33 | +speaker = "03397b4c4be74759b72533b663fbd001" |
| 34 | +api_key = "YOUR_FISH_AUDIO_API_KEY" |
| 35 | +``` |
| 36 | + |
| 37 | +A brief explanation of each field: |
| 38 | + |
| 39 | +* `platform` set to `"fish"` tells EchoKit to use Fish.audio as the TTS provider. |
| 40 | +* `speaker` specifies the ID of the Fish.audio voice model to use, which you can copy from that model's detail page on Fish.audio. |
| 41 | +* `api_key` is the API key used to authenticate with the Fish.audio service. |
| 42 | + |
| 43 | +After restarting the EchoKit server and reconnecting the device, all voice output is generated by Fish.audio. |
| 44 | + |
| 45 | +Everything else remains unchanged: |
| 46 | + |
| 47 | +* ASR stays the same |
| 48 | +* The LLM and system prompts stay the same |
| 49 | +* Conversation flow and tool calls stay the same |
| 50 | + |
| 51 | +With Fish.audio added to the list of supported TTS providers, EchoKit’s voice layer becomes more flexible: you can experiment with different voices without reworking the rest of the system. |
| 52 | + |
| 53 | +--- |
| 54 | + |
| 55 | +Want to get your own EchoKit device and make it unique? |
| 56 | + |
| 57 | +* [EchoKit Box](https://echokit.dev/echokit_box.html) |
| 58 | +* [EchoKit DIY](https://echokit.dev/echokit_diy.html) |
| 59 | + |
| 60 | +Join the [EchoKit Discord](https://discord.gg/Fwe3zsT5g3) to share your welcome voices and see how others are personalizing their voice AI agents! |