Description
Happy already has excellent voice input — you can speak commands and they execute. But the loop isn’t fully closed because you still have to look at the screen to read Claude’s responses.
Proposed feature: Add optional text-to-speech playback of Claude Code responses in the mobile app, so you can both speak commands and hear responses — truly hands-free.
Why this makes sense for Happy specifically:
∙ Happy already integrates ElevenLabs for the voice agent (see RealtimeVoiceSession.tsx). The TTS infrastructure is partially there.
∙ The mobile app already receives and decrypts the full response text; it just needs the option to pipe that text through TTS before or alongside rendering (see the sketch after this list).
∙ The core value prop of Happy is using Claude Code away from your desk. TTS would extend that to situations where you can’t look at a screen at all — walking, driving, cooking, doing laundry.
∙ Other projects in the ecosystem (agent-tts, AgentVibes, claude-code-tts, VoiceMode MCP) prove there’s demand, but they all run on the desktop side. Happy is uniquely positioned to do TTS on the mobile side where it’s most useful.
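A minimal sketch of what piping the decrypted response text through TTS alongside rendering could look like, assuming an Expo/React Native setup where expo-speech is available (the `useResponseTts` hook name and the `ttsEnabled` flag are placeholders, not existing Happy code; the ElevenLabs integration could be swapped in behind the same interface):

```ts
// Hypothetical hook: speak a decrypted response aloud when the (proposed)
// TTS setting is enabled. Uses device-native TTS via expo-speech.
import { useEffect } from 'react';
import * as Speech from 'expo-speech';

export function useResponseTts(responseText: string | null, ttsEnabled: boolean) {
  useEffect(() => {
    if (!ttsEnabled || !responseText) return;
    Speech.stop();              // don't talk over a previous response
    Speech.speak(responseText); // rendering proceeds as usual; speech runs alongside
    return () => {
      Speech.stop();            // stop playback when a newer response arrives or the view unmounts
    };
  }, [responseText, ttsEnabled]);
}
```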
Possible implementation:
∙ A toggle in settings to enable TTS for responses
∙ Use the existing ElevenLabs integration, or device-native TTS (iOS AVSpeechSynthesizer / Android TextToSpeech) as a zero-cost option (sketched after this list)
∙ Optionally summarize long responses before speaking (similar to the TTS_SUMMARY approach from claude-code-tts) so it doesn’t read 200 lines of code aloud
∙ Interrupt playback when the user starts speaking a new command
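A rough sketch of how the settings toggle, long-response trimming, and interruption could fit together. The `TtsSettings` shape, the `toSpeakableText` helper, and the character-limit stand-in for summarization are illustrative assumptions, not Happy's actual code:

```ts
import * as Speech from 'expo-speech';

// Illustrative settings shape; Happy's real settings store will differ.
type TtsSettings = {
  speakResponses: boolean; // the proposed settings toggle
  maxSpokenChars: number;  // crude stand-in for a real summarization step
};

// Drop fenced code blocks and truncate very long responses so playback
// never reads 200 lines of code aloud. A fuller version could request a
// TTS_SUMMARY-style digest from the model instead.
function toSpeakableText(response: string, maxChars: number): string {
  const withoutCode = response.replace(/`{3}[\s\S]*?`{3}/g, ' (code omitted) ');
  return withoutCode.length > maxChars
    ? withoutCode.slice(0, maxChars) + ' ... response truncated.'
    : withoutCode;
}

export function speakResponse(response: string, settings: TtsSettings): void {
  if (!settings.speakResponses) return;
  Speech.stop(); // interrupt any response still playing
  Speech.speak(toSpeakableText(response, settings.maxSpokenChars));
}

// Called when the user starts dictating a new command, so TTS playback
// never talks over voice input.
export function stopSpeaking(): void {
  Speech.stop();
}
```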