@@ -80,49 +80,36 @@ then go to chrome://extensions/ and load unpacked the extensions/chrome/ dir

 ## Demo Video

-[![TalkiTo Demo](https://img.youtube.com/vi/FJdYTYZK_0U/0.jpg)](https://youtu.be/FJdYTYZK_0U)
+[![TalkiTo Demo](https://img.youtube.com/vi/pf8jFt0smqs/0.jpg)](https://youtu.be/pf8jFt0smqs)

 ## AI Assistant Compatibility

-| AI Assistant | Method | Status |
-|------------------------------|---------------|---------------------|
-| **Claude Code** | Terminal | **Fully Supported** |
-| **Codex Cli** | Terminal | **Fully Supported** |
-| bolt.new | Web Extension | Output Only |
-| v0.dev | Web Extension | Output Only |
-| replit.com | Web Extension | Output Only |
-| Gemini CLI | Terminal | In Progress |
-| Aider | Terminal | In Progress |
-| Cursor | Terminal | In Progress |
-| Continue | Terminal | In Progress |
+| AI Assistant | Method | Status |
+|-----------------|---------------|---------------------|
+| **Claude Code** | Terminal | **Fully Supported** |
+| **Codex Cli** | Terminal | **Fully Supported** |
+| bolt.new | Web Extension | Output Only |
+| v0.dev | Web Extension | Output Only |
+| replit.com | Web Extension | Output Only |
+| Other agents | Terminal | In Progress |



-### Voice Mode
+### Run with Claude Code

-When you run `talkito claude`, voice mode is enabled by default:
+run `talkito claude`

-1. **Automatic voice interaction**: Claude will:
-   - Speak all responses using TTS
-   - Listen for your voice input after speaking
-   - Process your speech as the next user message
-   - Continue this loop automatically
+### Run with Codex Cli

-2. **Control voice mode**:
-   - Voice mode starts ON by default
-   - Say or type "turn off talkito" to disable voice interaction
-   - Say or type "turn on talkito" to re-enable if turned off
+run `talkito codex`

-3. **Unified input handling**: All inputs are processed as user messages:
-   - Voice dictation: Your spoken words
-   - Slack messages: From configured channels
-   - WhatsApp messages: From configured numbers
+### Run as an MCP server

-4. **Communication modes**:
-   - Say "start slack mode #channel-name" to auto-send responses to Slack
-   - Say "start whatsapp mode +1234567890" to auto-send responses to WhatsApp
-   - Say "stop slack/whatsapp mode" to disable
+run `talkito --mcp-server`

+### Run the TalkiTo configuration menu
+
+run `talkito`

 #### Advanced Options

@@ -131,13 +118,16 @@ When you run `talkito claude`, voice mode is enabled by default:
 talkito --dont-auto-skip-tts claude

 # Use different TTS providers
-talkito --tts-provider openai --tts-voice nova echo "Hello with OpenAI"
 talkito --tts-provider polly --tts-voice Matthew --tts-region us-west-2 echo "Hello with AWS"
 talkito --tts-provider azure --tts-voice en-US-JennyNeural echo "Hello with Azure"
 talkito --tts-provider gcloud --tts-voice en-US-Journey-F echo "Hello with Google"
+talkito --tts-provider kittentts --tts-voice expr-voice-3-f echo "Hello with KittenTTS"
+talkito --tts-provider kokoro --tts-voice af_heart echo "Hello with Kokoro (local)"

 # Use different ASR providers
 talkito --asr-provider gcloud --asr-language en-US claude
+AZURE_SPEECH_KEY=... AZURE_SPEECH_REGION=eastus talkito --asr-provider azure claude
+WHISPER_MODEL=small WHISPER_COMPUTE_TYPE=int8 talkito --asr-provider local_whisper claude
 talkito --asr-language es-ES echo "Hola mundo" # Spanish recognition

 # Enable remote communication (configure via environment variables)
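
Review note on the hunk above: the new ASR examples pass provider settings as per-run environment variables in front of the `talkito` command. A minimal Python sketch of the same pattern follows, in case the launch needs to be scripted. The helper names `talkito_launch_spec` and `as_shell_line` are hypothetical and not part of TalkiTo; only the `--asr-provider` flag and the variable names come from the diff.

```python
import os
import shlex


def talkito_launch_spec(asr_provider, overrides, target="claude", base_env=None):
    """Return (argv, env) for one talkito run with per-run env overrides.

    Hypothetical helper: it only assembles the command line shown in the
    README examples and does not call into TalkiTo itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update(overrides)  # overrides win over inherited values
    argv = ["talkito", "--asr-provider", asr_provider, target]
    return argv, env


def as_shell_line(argv, overrides):
    """Render the spec in the one-line VAR=value form used in the README."""
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in overrides.items())
    return f"{prefix} {shlex.join(argv)}".strip()


if __name__ == "__main__":
    overrides = {"WHISPER_MODEL": "small", "WHISPER_COMPUTE_TYPE": "int8"}
    argv, env = talkito_launch_spec("local_whisper", overrides)
    print(as_shell_line(argv, overrides))
    # To launch for real (assumes talkito is on PATH):
    #   import subprocess; subprocess.run(argv, env=env, check=True)
```

Passing `env=` to `subprocess.run` keeps the overrides scoped to the child process, which mirrors what the inline `VAR=value command` shell form does.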
@@ -191,26 +181,6 @@ except KeyboardInterrupt:
     asr.stop_dictation()
 ```

-### MCP Server Usage
-
-Talkito includes an MCP (Model Context Protocol) server that allows AI applications to use TTS and ASR capabilities:
-
-```bash
-# Install TalkiTo (includes MCP support)
-pip install talkito
-
-# Run as MCP server
-talkito --mcp-server
-```
-
-The MCP server provides tools for:
-- **Core**: `turn_on`/`turn_off` (enable voice mode), `get_talkito_status`
-- **TTS**: `enable_tts`/`disable_tts`, `speak_text`, `skip_current_speech`, `configure_tts`
-- **ASR**: `enable_asr`/`disable_asr`, `start_voice_input`/`stop_voice_input`, `get_dictated_text`
-- **Communication**: `start_whatsapp_mode`/`stop_whatsapp_mode`, `start_slack_mode`/`stop_slack_mode`, `send_whatsapp`, `send_slack`, `get_messages`
-
-Configure your AI application to connect to the talkito MCP server for voice capabilities.
-
 ## Provider Configuration

 ### Text-to-Speech (TTS) Providers
@@ -255,6 +225,18 @@ Configure your AI application to connect to the talkito MCP server for voice cap
 - **Voices**: aura-asteria-en, aura-luna-en, aura-stella-en, and more
 - **Usage**: `--tts-provider deepgram --tts-voice aura-asteria-en`

+#### KittenTTS (Local / Offline)
+- **Install**: `pip install https://github.com/KittenML/KittenTTS/releases/download/0.1/kittentts-0.1.0-py3-none-any.whl soundfile phonemizer`
+- **Setup**: No API key required. First run prompts to download the selected model (default `kitten-tts-nano-0.2`) into the Hugging Face cache. Configure `KITTENTTS_MODEL` and `KITTENTTS_VOICE` to pick different quality/voice options.
+- **Best for**: Ultra-lightweight CPU-only voices that stay on-device.
+- **Usage**: `KITTENTTS_MODEL=kitten-tts-nano-0.2 talkito --tts-provider kittentts --tts-voice expr-voice-3-f`
+
+#### Kokoro (Local / Offline)
+- **Install**: `pip install 'kokoro>=0.9.4' soundfile phonemizer`
+- **Setup**: No API key required. TalkiTo will download Kokoro weights the first time you run it (set `KOKORO_LANGUAGE`, `KOKORO_VOICE`, `KOKORO_SPEED` to control defaults).
+- **Best for**: High-quality multilingual voices without sending audio to a cloud provider.
+- **Usage**: `talkito --tts-provider kokoro --tts-voice af_heart --tts-language en-US`
+
 ### Automatic Speech Recognition (ASR) Providers

 #### Google Speech Recognition (Default)
@@ -292,6 +274,17 @@ Configure your AI application to connect to the talkito MCP server for voice cap
 - **Features**: Streaming transcription
 - **Usage**: `--asr-provider aws --aws-region us-west-2`

+#### Azure Speech Services
+- **Get API Key**: https://azure.microsoft.com/en-us/services/cognitive-services/speech-services/
+- **Setup**: Set `AZURE_SPEECH_KEY` and `AZURE_SPEECH_REGION`, then `pip install azure-cognitiveservices-speech`
+- **Features**: Low-latency streaming dictation with automatic punctuation
+- **Usage**: `AZURE_SPEECH_KEY=... AZURE_SPEECH_REGION=eastus talkito --asr-provider azure`
+
+#### Local Whisper (On-Device)
+- **Install**: `pip install faster-whisper` (default) or `WHISPER_COREML=1 pip install pywhispercpp` for Apple Silicon/CoreML acceleration
+- **Setup**: No API key required. Configure `WHISPER_MODEL` (e.g., `small`, `medium`), `WHISPER_DEVICE` (`cpu`, `cuda`, or `mps`), and `WHISPER_COMPUTE_TYPE` (`int8`, `int8_float16`, etc.). Models are cached locally and TalkiTo will prompt before downloading unless `TALKITO_AUTO_APPROVE_DOWNLOADS=1`.
+- **Usage**: `WHISPER_MODEL=small WHISPER_COMPUTE_TYPE=int8 talkito --asr-provider local_whisper`
+
 ### Communication Providers (Remote Interaction)

 #### Twilio SMS
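
Review note on the Local Whisper hunk: the setup bullet names four environment variables but the diff does not show how they are read. The sketch below is one plausible way to collect and sanity-check them before launching. The function `read_whisper_settings` is hypothetical and is not TalkiTo's implementation. Unset variables are returned as `None` because the defaults TalkiTo applies are not stated in this diff, and the device check uses only the three values the README lists.

```python
import os

# Device values listed in the README text; TalkiTo may accept others.
ALLOWED_DEVICES = {"cpu", "cuda", "mps"}


def read_whisper_settings(env=None):
    """Collect the Local Whisper variables named in the README.

    Hypothetical helper for illustration. Values that are not set come
    back as None rather than a guessed default.
    """
    env = os.environ if env is None else env
    device = env.get("WHISPER_DEVICE")
    if device is not None and device not in ALLOWED_DEVICES:
        raise ValueError(
            f"WHISPER_DEVICE={device!r} is not one of {sorted(ALLOWED_DEVICES)}"
        )
    return {
        "model": env.get("WHISPER_MODEL"),
        "device": device,
        "compute_type": env.get("WHISPER_COMPUTE_TYPE"),
        # The README says downloads are prompted unless this is set to 1.
        "auto_approve_downloads": env.get("TALKITO_AUTO_APPROVE_DOWNLOADS") == "1",
    }


if __name__ == "__main__":
    print(read_whisper_settings({"WHISPER_MODEL": "small", "WHISPER_COMPUTE_TYPE": "int8"}))
```

Accepting the environment as a parameter keeps the check testable without touching the real process environment.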