Successfully integrated Elevenlabs API text-to-speech (TTS) functionality into the hyper-robco plugin following the "agent vibes" philosophy of short, relevant announcements.
-
Non-blocking TTS Integration
- Uses
setImmediate()for async execution - TTS runs in background without delaying terminal output
- No blocking calls to Elevenlabs API
- Uses
-
Smart Summarization
- Extracts short, relevant summaries (10-50 characters)
- Follows "agent vibes" philosophy - concise, not verbose
- Pattern-based extraction for different event types:
- CRITICAL: Error messages (shortened)
- COMPLETION: Success messages (e.g., "Build succeeded", "42 tests passed")
- APPROVAL: Action needed messages (e.g., "Ready for review")
-
Pattern Detection
- Monitors streaming text for completion patterns
- 5-second cooldown between announcements
- 500-character rolling buffer for pattern matching
- Three categories: CRITICAL, COMPLETION, APPROVAL
-
Platform-Aware Audio Playback
- macOS: Uses
afplay(built-in) - Windows: Uses Windows Media Player COM object
- Linux: Checks for
mpg123,sox, orffplay - Browser: Uses Web Audio API
- Graceful degradation with helpful error messages
- macOS: Uses
-
Security Hardened
- Uses
execFileinstead ofexec(prevents shell injection) - Arguments passed as array, not string interpolation
- Proper escaping in PowerShell commands
- API key via environment variable or config
- Timeout protection (10 seconds)
- Uses
Added TTS section to .claude-plugin/config.json:
{
"tts": {
"enabled": false, // Disabled by default
"apiKey": "", // Or use ELEVENLABS_API_KEY env var
"voiceId": "21m00Tcm4TlvDq8ikWAM", // Default: Rachel
"modelId": "eleven_monolingual_v1",
"stability": 0.5,
"similarityBoost": 0.75,
"volume": 0.3
}
}-
.claude-plugin/elevenlabs-tts.js(280 lines)- Core TTS module
- Elevenlabs API integration
- Smart summarization logic
- Platform-aware audio playback
- Error handling and cleanup
-
.claude-plugin/TTS_README.md(120 lines)- Comprehensive documentation
- Configuration guide
- Voice selection guide
- Troubleshooting section
- Privacy and security notes
- Cost considerations
-
test-tts.js(140 lines)- 9 summarization test cases
- Module export validation
- Hook integration tests
- Configuration validation
- All tests passing (100%)
-
.claude-plugin/hook.js- Added TTS module import
- Added pattern detection logic
- Added
onCompletionhook - Added
processStreamingTextfunction - Integrated TTS into streaming response handler
-
.claude-plugin/config.json- Added TTS configuration section
-
.claude-plugin/plugin.json- Added
onCompletionhook declaration
- Added
-
package.json- Updated test script to include TTS tests
- Added TTS-related keywords
-
README.md- Added TTS feature section
- Updated configuration example
- Added reference to TTS_README.md
Followed the established pattern of keeping TTS announcements short and relevant:
- Extract only essential information
- Avoid verbose descriptions
- Focus on actionable status updates
- Skip boilerplate and code
- Used
setImmediate()for background execution - TTS failures don't block or crash the plugin
- Warnings logged but not thrown
- Terminal output never delayed
- Disabled by default
- Requires explicit user configuration
- API key never committed to git
- Environment variable support
- Only summaries sent to API (no code/secrets)
- Proper command detection on each platform
- Graceful degradation if audio player not found
- Clear error messages guide users to install tools
- Browser fallback with Web Audio API
- Uses
execFileto prevent shell injection - Proper argument passing (array, not strings)
- Timeout protection
- Temp file cleanup
- No eval or dynamic code execution
- ✅ 9/9 summarization test cases passing
- ✅ Module exports validated
- ✅ Hook integration verified
- ✅ Configuration structure validated
- ✅ JavaScript syntax validation
- ✅ Code review: No issues found
- ✅ CodeQL security scan: 0 vulnerabilities
| Input | Category | Output |
|---|---|---|
| "Build succeeded with 247 tests passed" | COMPLETION | "Build succeeded" |
| "All 42 tests passed successfully!" | COMPLETION | "42 tests passed" |
| "Error: Cannot find module 'missing'" | CRITICAL | "Error: Cannot find module 'missing'" |
| "Code review completed successfully" | COMPLETION | "Code review completed" |
| "Ready for review - please approve" | APPROVAL | "Ready for review" |
- Get an Elevenlabs API key from https://elevenlabs.io
- Either:
- Set environment variable:
export ELEVENLABS_API_KEY="your-key" - Or add to config:
.claude-plugin/config.json
- Set environment variable:
- Set
"tts.enabled": truein config - Restart Claude Code
Browse voices at https://elevenlabs.io/voice-library and update voiceId in config.
Popular choices:
- Rachel (default): Calm, professional
- Domi: Strong, authoritative
- Bella: Soft, friendly
# Recommended
sudo apt install mpg123
# Or alternatives
sudo apt install sox libsox-fmt-mp3
sudo apt install ffmpeg- TTS Latency: 500-2000ms (Elevenlabs API call)
- Non-blocking: Does not delay terminal output
- Cooldown: 5 seconds between announcements
- Buffer Size: 500 characters
- Timeout: 10 seconds max per TTS request
- Average Summary Length: 10-50 characters
- API Cost: ~10-30 characters per announcement
✅ Shell Injection Risk: Fixed by using execFile instead of exec
✅ Command String Interpolation: Fixed by using argument arrays
✅ Path Escaping: Proper escaping in PowerShell commands
- CodeQL: 0 vulnerabilities
- Code Review: No security issues
- Manual Review: All patterns validated
- ✅ No shell string interpolation
- ✅ Proper argument passing
- ✅ Timeout protection
- ✅ Error boundary handling
- ✅ Temp file cleanup
- ✅ API key via environment variable
- ✅ No sensitive data sent to API
- ✅ Input sanitization
Potential improvements for future PRs:
- Caching of TTS audio for repeated messages
- Custom voice upload support
- Per-pattern volume control
- User-configurable patterns
- Offline TTS option (e.g., piper, espeak)
- Multi-language support
- Voice rotation for variety
Complete documentation provided:
- ✅ README.md updated
- ✅ TTS_README.md created
- ✅ Config examples
- ✅ Troubleshooting guide
- ✅ Security notes
- ✅ Cost considerations
- ✅ Code comments
Successfully implemented a production-ready, secure, non-blocking Elevenlabs TTS integration that follows the "agent vibes" philosophy of short, relevant announcements. All tests pass, security scans clear, and comprehensive documentation provided.
The feature is:
- ✅ Non-blocking
- ✅ Optional
- ✅ Smart (short summaries)
- ✅ Secure (no vulnerabilities)
- ✅ Platform-aware
- ✅ Well-tested
- ✅ Well-documented