@Nsuccess

Description

This PR adds a new TTS extension for the Speechmatics TTS API, providing low-latency, high-quality speech synthesis optimized for voice agents.

Features

  • Low-latency streaming synthesis (sub-150ms)
  • HTTP REST API integration with aiohttp
  • 4 voice options: sarah, theo, megan, jack (UK and US English)
  • Support for WAV and MP3 output formats
  • Configurable sample rates
  • Comprehensive error handling and retry logic
  • TTFB metrics support
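To make the retry and TTFB bullets above more concrete, here is a minimal sketch of how the two could fit together. It is not the extension's actual code: `TransientError`, `with_retry`, `collect_with_ttfb`, and `fake_stream` are hypothetical names, the backoff schedule is an assumption, and `fake_stream` stands in for the chunked HTTP response body so the example runs with the standard library only.

```python
import asyncio
import time
from typing import AsyncIterator, Awaitable, Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, HTTP 5xx)."""


async def with_retry(
    op: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 0.1,
) -> T:
    """Run op, retrying on TransientError with exponential backoff (assumed policy)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await op()
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted, surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


async def collect_with_ttfb(
    chunks: AsyncIterator[bytes],
) -> Tuple[bytes, Optional[float]]:
    """Drain an audio stream and report time to first non-empty chunk in ms."""
    start = time.monotonic()
    ttfb_ms: Optional[float] = None
    audio = bytearray()
    async for chunk in chunks:
        if ttfb_ms is None and chunk:
            ttfb_ms = (time.monotonic() - start) * 1000.0
        audio.extend(chunk)
    return bytes(audio), ttfb_ms


async def fake_stream() -> AsyncIterator[bytes]:
    """Stand-in for a streamed HTTP response body (not a real API call)."""
    await asyncio.sleep(0.01)  # simulated network latency
    yield b"RIFF"
    yield b"data"


if __name__ == "__main__":
    async def synthesize_once() -> Tuple[bytes, Optional[float]]:
        return await collect_with_ttfb(fake_stream())

    audio, ttfb = asyncio.run(with_retry(synthesize_once))
    print(f"received {len(audio)} bytes, TTFB {ttfb:.1f} ms")
```

In a real integration the async iterator would presumably come from the HTTP client's chunked response reader, and only errors judged transient would be mapped to the retryable exception; both points are assumptions here.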

Testing

  • 20 test cases covering configuration and extension functionality
  • Follows TTS2 HTTP interface pattern used by other TTS extensions
  • Compatible with existing TEN Framework infrastructure

Documentation

  • Complete README with setup instructions and examples
  • Configuration reference with all options
  • Voice options table
  • API details and architecture explanation
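As a rough illustration of the options listed under Features, the sketch below validates a configuration object. The voice names and output formats come from this description; the class name `TTSConfig`, the field names, the default voice, and the default sample rate are assumptions made for the example and may not match the extension's real configuration schema.

```python
from dataclasses import dataclass

# Values taken from the Features list in this PR description.
SUPPORTED_VOICES = ("sarah", "theo", "megan", "jack")
SUPPORTED_FORMATS = ("wav", "mp3")


@dataclass(frozen=True)
class TTSConfig:
    """Hypothetical config container; field names are illustrative only."""

    api_key: str
    voice: str = "sarah"          # assumed default, not confirmed by the PR
    output_format: str = "wav"    # assumed default
    sample_rate: int = 16000      # assumed default, in Hz

    def __post_init__(self) -> None:
        if not self.api_key:
            raise ValueError("api_key must not be empty")
        if self.voice not in SUPPORTED_VOICES:
            raise ValueError(
                f"unknown voice {self.voice!r}; expected one of {SUPPORTED_VOICES}"
            )
        if self.output_format not in SUPPORTED_FORMATS:
            raise ValueError(
                f"unknown format {self.output_format!r}; expected one of {SUPPORTED_FORMATS}"
            )
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be a positive integer")
```

Validating at construction time means a misspelled voice or unsupported format fails at startup rather than on the first synthesis request.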

Closes #1965

Nsuccess added 2 commits January 14, 2026 16:46
- Implements text-to-speech using NVIDIA Riva Speech Skills
- Supports streaming synthesis with gRPC
- Includes comprehensive tests and documentation
- Follows TTS2 interface pattern

Closes TEN-framework#1964

- Implements text-to-speech using Speechmatics TTS API
- Supports low-latency streaming synthesis (sub-150ms)
- Includes 4 voice options (UK and US English)
- Comprehensive tests and documentation
- Follows TTS2 HTTP interface pattern

Closes TEN-framework#1965

Development

Successfully merging this pull request may close these issues.

[2026NewYearChallenge 🏅] Create a Speechmatics TTS Extension
