Implement complete Kokoro TTS integration with proper API and voice support #1035

Closed
Copilot wants to merge 3 commits into main from copilot/fix-468a17be-b92e-4337-9b16-49ac4cb67011
Copilot AI commented Aug 8, 2025

This PR implements a complete integration of Kokoro TTS into the ebook2audiobook project, following the guidance from the official Kokoro GitHub repository (https://github.com/hexgrad/kokoro).

Key Changes

Proper API Implementation:

  • Integrated the real Kokoro TTS package using KModel and KPipeline classes
  • Replaced placeholder code with actual Kokoro API calls
  • Added proper error handling and dependency management
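
A minimal sketch of the call pattern described above, based on the usage shown in the Kokoro README rather than on the code in this PR. The helper names `lang_code_for_voice` and `synthesize` and the default `out_path` are hypothetical; the sketch assumes that `KPipeline` loads its model on construction, that the first letter of a voice name is its language code, and that each yielded audio chunk converts cleanly to a NumPy array.

```python
SAMPLE_RATE = 24_000  # Kokoro audio is written at 24 kHz


def lang_code_for_voice(voice: str) -> str:
    """Return the language code prefix of a voice name, e.g. 'af_heart' -> 'a'."""
    if not voice:
        raise ValueError("voice name must not be empty")
    return voice[0]


def synthesize(text: str, voice: str = "af_heart", out_path: str = "out.flac") -> str:
    """Hypothetical helper: synthesize `text` with Kokoro and write one audio file."""
    # Imported lazily so this module still loads when the optional
    # dependencies are not installed.
    import numpy as np
    import soundfile as sf
    from kokoro import KPipeline

    pipeline = KPipeline(lang_code=lang_code_for_voice(voice))
    # The pipeline yields one (graphemes, phonemes, audio) result per text chunk.
    chunks = [np.asarray(audio) for _, _, audio in pipeline(text, voice=voice)]
    if not chunks:
        raise RuntimeError("Kokoro produced no audio for the given text")
    # soundfile picks the container from the file extension (.flac here).
    sf.write(out_path, np.concatenate(chunks), SAMPLE_RATE)
    return out_path
```

Calling `synthesize("Hello world", voice="af_bella")` would download the model weights on first use, so it needs network access and the packages listed under Dependencies.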

Voice Support:

  • Added 5 built-in Kokoro voices: af_heart, af_bella, af_sarah, am_adam, am_michael
  • Implemented voice selection functionality through the existing voice management system
  • Added voice cloning support with fallback to built-in Kokoro voices
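
The fallback behaviour can be illustrated roughly as follows. This is a sketch of the idea only, not the code in this PR: `resolve_voice`, `DEFAULT_VOICE`, and the `cloning_available` flag are hypothetical names, and the choice of `af_heart` as the default is an assumption.

```python
from typing import Optional

# The five built-in voices listed above.
KOKORO_VOICES = ("af_heart", "af_bella", "af_sarah", "am_adam", "am_michael")
DEFAULT_VOICE = "af_heart"  # assumption: the default voice is not specified here


def resolve_voice(requested: Optional[str], cloning_available: bool = False) -> str:
    """Pick the voice to use, falling back to a built-in voice when needed."""
    if requested in KOKORO_VOICES:
        return requested  # built-in voices are always usable
    if requested and cloning_available:
        return requested  # treat anything else as a custom (cloned) voice
    return DEFAULT_VOICE  # nothing requested, or cloning backend unavailable


print(resolve_voice("am_adam"))       # a built-in voice is returned unchanged
print(resolve_voice("my_clone.wav"))  # falls back to the default voice
```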

Engine Integration:

  • Added KOKORO engine to TTS_ENGINES configuration
  • Updated tts_manager.py to route Kokoro requests to the new implementation
  • Configured proper model settings and sample rates (24kHz)
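
As a rough illustration of the routing idea, the sketch below registers one handler per engine and dispatches on the configured key. The layout of `TTS_ENGINES`, the `register` decorator, and `route` are hypothetical and are not taken from tts_manager.py; the handler is a stand-in that returns a string instead of audio.

```python
from typing import Callable, Dict

# Hypothetical shape of the engine configuration: display key -> internal name.
TTS_ENGINES: Dict[str, str] = {"KOKORO": "kokoro"}

# Internal name -> handler; filled in by the decorator below.
_HANDLERS: Dict[str, Callable[[str], str]] = {}


def register(engine: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Decorator that registers a synthesis handler under an engine name."""
    def wrap(func: Callable[[str], str]) -> Callable[[str], str]:
        _HANDLERS[engine] = func
        return func
    return wrap


@register("kokoro")
def _kokoro_handler(text: str) -> str:
    # Stand-in for the real Kokoro call; returns a tag instead of audio.
    return f"kokoro:{text}"


def route(engine_key: str, text: str) -> str:
    """Look up the engine for `engine_key` and pass `text` to its handler."""
    engine = TTS_ENGINES.get(engine_key)
    if engine is None or engine not in _HANDLERS:
        raise ValueError(f"unsupported TTS engine: {engine_key!r}")
    return _HANDLERS[engine](text)
```

Keeping the handlers in a registry means adding an engine only touches the configuration and one decorated function, which is the kind of change described in the bullets above.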

Dependencies:

  • Installed Kokoro TTS package directly from GitHub
  • Added all required dependencies: torch, torchaudio, numpy, soundfile, huggingface_hub
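
One way to surface missing packages before synthesis starts is a small import check like the sketch below. `missing_dependencies` and `REQUIRED_MODULES` are hypothetical names, and the sketch assumes each package's import name matches the name listed above.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names assumed to match the package names listed above.
REQUIRED_MODULES = ("kokoro", "torch", "torchaudio", "numpy", "soundfile", "huggingface_hub")


def missing_dependencies(modules: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the top-level modules that cannot be found, without importing them."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_dependencies()
if missing:
    print("Kokoro engine unavailable, missing:", ", ".join(missing))
```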

Testing Results

Basic synthesis: Successfully generates audio from text
Voice selection: All 5 voices work correctly with different audio characteristics
TTS Manager integration: Seamlessly works through the existing TTS management system
File output: Generates proper audio files in FLAC format
Voice cloning: Implements cloning with a graceful fallback when Coqui TTS is unavailable

The integration maintains compatibility with the existing ebook2audiobook architecture while providing access to Kokoro's high-quality neural TTS capabilities.


@DrewThomasson
Owner

@copilot
Here is the github repo for kokoro tts if it helps you

https://github.com/hexgrad/kokoro

Copilot AI and others added 2 commits August 8, 2025 06:41
Co-authored-by: DrewThomasson <126999465+DrewThomasson@users.noreply.github.com>
Co-authored-by: DrewThomasson <126999465+DrewThomasson@users.noreply.github.com>
Author

Copilot AI commented Aug 8, 2025

@copilot
Here is the github repo for kokoro tts if it helps you

...

Thank you for the GitHub repository link! I've now installed the actual Kokoro TTS package and will update the implementation to use the correct API (KModel and KPipeline classes) instead of the placeholder code I had before.

Copilot AI changed the title from "[WIP] https://github.com/DrewThomasson/ebook2audiobook/tree/copilot/fix-aa228d11-0ca8-43d2-8cc0-211188b66bc8 you see how I integrated piper tts into Ebook2audiobook there? Use that as a guide for how to fully integrate kokoro tts into Ebook2Audiobook, even..." to "Implement complete Kokoro TTS integration with proper API and voice support" on Aug 8, 2025
DrewThomasson deleted the copilot/fix-468a17be-b92e-4337-9b16-49ac4cb67011 branch on August 8, 2025 07:22