taresh18/livekit-kyutai


LiveKit Kyutai TTS Plugin

A text-to-speech plugin for LiveKit Agents that integrates Kyutai TTS, using a streaming implementation for real-time voice synthesis.

Features

  • High-Quality Audio: Premium voice synthesis with natural-sounding speech
  • Streaming Implementation: Real-time audio generation with low-latency streaming
  • Excellent Performance: ~260ms time-to-first-byte (TTFB) on RTX 4090

Requirements

  • LiveKit Agents v1.2 or higher
  • Kyutai TTS server instance

Installation

  1. Clone or download this plugin into the root directory of your LiveKit Agents project
  2. Set up the Kyutai TTS server using taresh18/delayed-streams-modeling
  3. Ensure your server is running and accessible
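One way to confirm step 3 before starting an agent is a plain TCP connection check against the server address. The sketch below is not part of this plugin: the helper name `is_server_reachable` is hypothetical. It only verifies that something is listening on the host and port, not that it is a working Kyutai TTS server.

```python
import socket
from urllib.parse import urlparse


def is_server_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host and port in base_url succeeds.

    Hypothetical helper for illustration; it does not speak the TTS protocol.
    """
    parsed = urlparse(base_url)
    host = parsed.hostname
    if host is None:
        # No host could be parsed from the URL (e.g. missing scheme).
        return False
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme in ("https", "wss") else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures.
        return False
```

For example, `is_server_reachable("http://localhost:8000")` should return `True` once a server is listening on that port.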

Server Setup

Use the delayed streams modeling server for optimal performance:

Repository: taresh18/delayed-streams-modeling

Follow the setup instructions in the repository to get your Kyutai TTS server running with streaming capabilities.

Usage

Initialize your agent session with the Kyutai TTS plugin:

from livekit.agents import AgentSession

# Replace your_plugin_path with the location where you placed this plugin
from your_plugin_path import kyutTTS

session = AgentSession(
    # ... other configuration
    tts=kyutTTS(
        base_url="<kyutai_server_url>",  # e.g., "http://localhost:8000"
        voice="expresso/ex04-ex02_happy_001_channel2_140s.wav",  # voice file path
    )
)

Performance Metrics

  • Latency: ~260ms TTFB on RTX 4090 GPU
  • Quality: High-quality voice synthesis with natural prosody
  • Streaming: Real-time audio generation and playback
  • Efficiency: Optimized for production use with streaming implementation
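To check the TTFB figure on your own hardware, you can time how long a stream takes to deliver its first non-empty audio chunk. The sketch below is generic and assumes only that the audio arrives as an iterable of `bytes`. The helper name `measure_ttfb` is hypothetical, and the code does not use this plugin's API.

```python
import time
from typing import Iterable, Optional


def measure_ttfb(chunks: Iterable[bytes]) -> Optional[float]:
    """Return seconds elapsed until the first non-empty chunk, or None if none arrives.

    Hypothetical helper: the timer starts when this function is called, so
    create the stream immediately before calling it.
    """
    start = time.perf_counter()
    for chunk in chunks:
        if chunk:
            return time.perf_counter() - start
    return None
```

Pass in whatever iterable yields the synthesized audio in your setup. Results will vary with GPU, network path, and voice.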
