Releases · aperepel/claude-mlx-tts
v1.3.0
New Features
Streaming TTS (79% faster time-to-first-audio)
- Audio playback now starts almost immediately instead of waiting for full generation
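The streaming approach above can be sketched as a producer/consumer pipeline; `generate_chunks` and `play_chunk` are hypothetical stand-ins for the project's actual generation and playback functions:

```python
import queue
import threading

def stream_play(generate_chunks, play_chunk):
    """Overlap generation and playback: play the first audio chunk
    as soon as it is ready instead of waiting for the full clip."""
    q = queue.Queue(maxsize=4)  # small buffer bounds memory and latency

    def producer():
        for chunk in generate_chunks():
            q.put(chunk)
        q.put(None)  # sentinel: generation finished

    threading.Thread(target=producer, daemon=True).start()
    while (chunk := q.get()) is not None:
        play_chunk(chunk)
```

Time-to-first-audio then depends only on generating one chunk, not the whole utterance.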
Dynamic Audio Compression
- Added professional-grade compressor/limiter for consistent volume levels
- Default "notification punch" preset for clear, punchy TTS output
- Prevents audio clipping and sudden volume spikes
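A minimal sketch of the compression idea: the release uses the pedalboard library, so this pure-Python version only illustrates the threshold/ratio/limiter behavior, with made-up default values:

```python
import math

def compress(samples, threshold_db=-18.0, ratio=4.0, limit_db=-1.0):
    """Static compression above a threshold, then a brick-wall limiter.
    Defaults here are illustrative, not the "notification punch" preset."""
    threshold = 10 ** (threshold_db / 20)  # dBFS -> linear amplitude
    limit = 10 ** (limit_db / 20)
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            # reduce the portion above the threshold by the ratio
            mag = threshold + (mag - threshold) / ratio
        mag = min(mag, limit)  # limiter: no peak exceeds limit_db
        out.append(math.copysign(mag, s))
    return out
```

Quiet samples pass through unchanged, loud ones are squeezed toward the threshold, and nothing can clip past the limiter ceiling.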
TTFT Metrics
- Time-to-first-token measurements now logged for performance monitoring
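TTFT logging can be illustrated with a wrapper generator like this (the names and log format are illustrative, not the project's):

```python
import time

def timed_first_chunk(chunk_iter):
    """Pass chunks through, logging time-to-first-token/audio (TTFT) once."""
    start = time.monotonic()
    first = True
    for chunk in chunk_iter:
        if first:
            ttft_ms = (time.monotonic() - start) * 1000
            print(f"TTFT: {ttft_ms:.1f} ms")
            first = False
        yield chunk
```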
Breaking Changes
Dependency change: pyloudnorm → pedalboard
- If upgrading, run `uv sync --extra mlx` to install the new dependency
v1.2.0
- Voice embeddings caching — Voice cloning now runs once at server startup.
Subsequent requests load cached embeddings from disk, reducing per-request
overhead by ~99% (1.5s → <10ms)
- Permission prompt notifications — Get an audio alert when Claude needs
tool permission approval, so you don't miss prompts while away from the terminal
- Separate logging for generation vs playback time for clearer performance metrics
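The embeddings caching above can be sketched like this; the cache path and the `compute_fn` hook are assumptions for illustration, not the project's actual layout:

```python
import hashlib
import pickle
from pathlib import Path

# hypothetical cache location
CACHE_DIR = Path.home() / ".cache" / "claude-mlx-tts"

def load_voice_embedding(ref_audio: Path, compute_fn):
    """Return a cached embedding if present; otherwise compute once
    (the expensive ~1.5s step) and persist it for later requests."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    # key the cache on the reference audio contents
    key = hashlib.sha256(ref_audio.read_bytes()).hexdigest()[:16]
    cache_file = CACHE_DIR / f"{key}.pkl"
    if cache_file.exists():
        return pickle.loads(cache_file.read_bytes())
    embedding = compute_fn(ref_audio)
    cache_file.write_bytes(pickle.dumps(embedding))
    return embedding
```

Keying on file contents means editing the reference audio invalidates the cache automatically.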
v1.1.1
v1.1.0
v1.0.0
v0.1.0
Working implementation using subprocess calls for TTS:
- macOS 'say' command as default TTS backend
- MLX voice cloning via 'python -m mlx_audio.tts.generate' subprocess
- Claude CLI subprocess for summarization
- Threshold-based triggering (duration, tool calls, thinking keywords)
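Threshold-based triggering might look roughly like this; the threshold values and keywords are placeholders, not the script's actual settings:

```python
def should_notify(duration_s, tool_calls, transcript,
                  min_duration=30.0, min_tools=3,
                  keywords=("thinking", "analyzing")):
    """Fire a TTS notification only for long or complex turns.
    All thresholds here are illustrative placeholders."""
    if duration_s >= min_duration or tool_calls >= min_tools:
        return True
    # fall back to scanning the transcript for "thinking" keywords
    return any(k in transcript.lower() for k in keywords)
```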
Architecture: scripts/tts-notify.py (single script, ~290 lines)
- Hook fires on Claude stop event
- Checks thresholds against transcript
- Summarizes via claude -p subprocess
- Speaks via say or mlx_audio subprocess
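The subprocess architecture above can be sketched as follows; the `mlx_audio` flags shown are assumptions, so check the module's own help output before relying on them:

```python
import subprocess

def build_tts_command(text, backend="say", voice=None):
    """Return the argv for the chosen TTS backend.
    The mlx_audio flags are assumptions, not verified against the module."""
    if backend == "say":
        cmd = ["say"]
        if voice:
            cmd += ["-v", voice]
        return cmd + [text]
    if backend == "mlx":
        # assumed invocation; see `python -m mlx_audio.tts.generate --help`
        return ["python", "-m", "mlx_audio.tts.generate", "--text", text]
    raise ValueError(f"unknown TTS backend: {backend}")

def speak(text, backend="say"):
    # blocking call; the hook script runs this once per notification
    subprocess.run(build_tts_command(text, backend), check=True)
```

Each `speak()` call for the MLX backend spawns a fresh Python process, which is exactly why the model reload limitation below exists.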
Known limitation: Each MLX TTS call loads ~4GB model from scratch (5-10s latency)
Next: Direct Python API integration with background daemon for sub-second response