zferentz/voice2tmux

Voice to Claude Code / tmux

TL;DR

A small proof-of-concept toy that demonstrates how to send voice commands (and translated voice commands) from a webpage to a terminal program. I used it to voice-communicate with Claude Code :)

Background

Sometimes I need to run a program, locally or remotely, that requires typing a lot of prompts or commands. Recently, that program was Claude Code, which I run remotely over SSH. I wanted a quick way to issue commands by voice from my Windows machine, so I could “talk” to Claude Code instead of typing (yes, I'm lazy). Of course, there are other solutions out there, but I wanted to see how I could build something myself.

I tried a few approaches (including Claude Code Hooks and MCP notifications), but none worked out. Eventually, I fell back on the good old tmux trick. For the uninitiated, tmux (terminal multiplexer) is a command-line tool for Linux and other Unix-like systems that manages multiple terminal sessions in a single window. More importantly for me, it can inject keystrokes into a session with a simple command, which is perfect for faking typed input.

The core idea:

  1. Start a terminal program inside tmux. For example, to run Claude Code: # tmux new -A -s claude-session claude
  2. Open a webpage that captures voice input and sends it to a simple backend; the backend forwards it to the program via the tmux send-keys command.
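
The second step can be sketched roughly as follows. This is a minimal sketch, not the repository's actual code: the function names build_send_keys_commands and send_to_tmux are hypothetical, and it assumes tmux is installed and the target session already exists. It types the text with one tmux send-keys call and presses Enter with a second one.

```python
import subprocess


def build_send_keys_commands(session: str, text: str) -> list[list[str]]:
    """Return the two tmux invocations that 'type' text and then press Enter.

    Hypothetical helper for illustration; not taken from the repository.
    """
    return [
        # -l sends the text literally instead of interpreting it as key names;
        # "--" ends option parsing so text starting with "-" is not read as a flag.
        ["tmux", "send-keys", "-t", session, "-l", "--", text],
        # A separate call sends the Enter key to submit the line.
        ["tmux", "send-keys", "-t", session, "Enter"],
    ]


def send_to_tmux(session: str, text: str) -> None:
    """Run the commands. Requires tmux and an existing session named `session`."""
    for argv in build_send_keys_commands(session, text):
        # Passing a list (no shell=True) keeps the text away from shell parsing.
        subprocess.run(argv, check=True)
```

With the session from step 1 running, send_to_tmux("claude-session", "list the files in this repo") would make the text appear in Claude Code as if it had been typed.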

I began with a simple prompt (initial-prompt.txt) and built a basic system that accepts text or voice input in the browser, sends it over HTTP to a server, and uses tmux send-keys to feed it into Claude Code. Later, I extended it:

  • Users can speak in their native language; the system uses Google Translate to convert the speech to English and then sends the translated text to Claude Code.
  • Users can specify the tmux session name, which lets me use it with multiple Claude Code and Bash sessions.
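
To make the two extensions concrete, here is a rough sketch of how an incoming utterance might be prepared before it is handed to tmux. Everything here is an assumption for illustration: prepare_command, its parameters, and the returned dictionary shape are hypothetical, and the translation step is passed in as a plain callable rather than calling Google Translate directly.

```python
from typing import Callable, Optional


def prepare_command(
    spoken_text: str,
    session: str = "claude-session",
    translate: Optional[Callable[[str], str]] = None,
) -> dict:
    """Turn recognized speech into a (session, text) pair ready to send.

    Hypothetical helper: `translate` stands in for whatever translation
    call the real system makes; when it is None the text is sent as-is.
    """
    text = spoken_text.strip()
    if translate is not None:
        # Optional step: convert native-language speech to English first.
        text = translate(text)
    return {"session": session, "text": text}


# Usage with a stand-in translator (a real one would call a translation service).
fake_translate = lambda s: {"hola mundo": "hello world"}.get(s, s)
command = prepare_command("  hola mundo ", session="bash-1", translate=fake_translate)
```

Keeping the session name as a per-request value is what allows one page to drive several tmux sessions.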

And yes, because I’m lazy, I used Claude Code to help write the very system that lets me talk to Claude Code. Which is… amusingly recursive.

Some implementation notes:

  • The default port is _PORT = 8099 (it could easily become a command-line parameter if needed).
  • The backend is a simple FastAPI app with a single endpoint, /listen.
  • This approach is not safe outside your localhost/local network: the backend accepts input from the web and passes it to subprocess.run.
  • The frontend is pretty minimal (and was created entirely by Claude, so I'm not sure about its quality ;) ).
  • To record/listen, we use webkitSpeechRecognition, which is supported by the Chrome browser. Not sure about other browsers.
  • To translate, we use the free Google Translate API.
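
Regarding the safety note above, one possible way to reduce the risk is to validate each request before it reaches subprocess.run. This is a sketch under my own assumptions, not something the repository does: validate_request, the allowed session-name pattern, and the length limit are all hypothetical choices, and this does not make the tool safe to expose publicly.

```python
import re

# Assumed allowlist: letters, digits, underscore, and hyphen only.
_SESSION_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")
# Arbitrary cap chosen for illustration.
MAX_TEXT_LENGTH = 2000


def validate_request(session: str, text: str) -> None:
    """Raise ValueError if the request looks unsafe to forward to tmux."""
    if not _SESSION_NAME.fullmatch(session):
        raise ValueError("invalid tmux session name")
    if not text or len(text) > MAX_TEXT_LENGTH:
        raise ValueError("text is empty or too long")
    if any(ord(ch) < 32 for ch in text):
        # Control characters (including newlines) could act as extra keystrokes.
        raise ValueError("text contains control characters")
```

Note that validation only limits what can be typed; whatever program runs in the session still executes the text, so keeping the server bound to localhost or a trusted network remains the main protection.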
