Releases: Sinapsis-AI/sinapsis-speech

sinapsis-speech v0.4.0

05 Jun 21:26

Sinapsis-speech v0.4.0

The sinapsis-speech package continues to evolve, bringing broader compatibility and more powerful speech capabilities. This release introduces two new integrations, Orpheus-CPP for TTS and Parakeet-TDT for STT, along with extended support for ElevenLabs 2.0+.

🚀 New Integrations
Sinapsis Orpheus-CPP
Enables text-to-speech (TTS) using the Orpheus-TTS engine, providing high-quality neural voice synthesis.

OrpheusTTS

Converts text to speech using Orpheus.

Accepts text packets from an input container and returns synthesized audio.

Includes memory-safe error handling for GPU-intensive workloads.
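
As an illustration of the flow described above (text packets in, audio out, with failures that do not abort the run), here is a minimal sketch. It does not use the real OrpheusTTS template: `TTSContainer`, `fake_synthesize`, and `run_tts` are hypothetical names, and the simulated `MemoryError` only stands in for whatever out-of-memory condition the actual template guards against.

```python
import gc
from dataclasses import dataclass, field


@dataclass
class TTSContainer:
    """Hypothetical stand-in for the framework's data container."""
    texts: list[str] = field(default_factory=list)
    audios: list[bytes] = field(default_factory=list)


def fake_synthesize(text: str) -> bytes:
    """Placeholder for a real TTS call; long inputs simulate an out-of-memory failure."""
    if len(text) > 40:
        raise MemoryError("simulated out-of-memory during synthesis")
    return text.encode("utf-8")  # pretend these bytes are audio samples


def run_tts(container: TTSContainer, synthesize=fake_synthesize) -> TTSContainer:
    """Synthesize every text packet, skipping packets that exhaust memory."""
    for text in container.texts:
        try:
            container.audios.append(synthesize(text))
        except MemoryError:
            # Free what can be freed, then continue with the next packet.
            gc.collect()
    return container


result = run_tts(TTSContainer(texts=["hello", "x" * 100, "world"]))
print(len(result.audios))  # the oversized packet is skipped
```

Skipping the failed packet rather than raising keeps the rest of the batch usable; a real pipeline might instead log the failure or retry with a shorter input.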

📄 See the full setup in the README.

Sinapsis Parakeet-TDT
Brings speech-to-text (STT) capabilities using NVIDIA’s Parakeet TDT 0.6B model.

ParakeetTDTInference

Transcribes audio input from containers or files.

Supports timestamp prediction.

Adds the resulting text packets back into the container.
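
The round trip described above (audio in, timestamped text added back to the container) can be sketched as follows. This is an assumption-laden illustration, not the ParakeetTDTInference API: `STTContainer`, `TextPacket`, `fake_transcribe`, and `run_stt` are hypothetical, and the stub returns canned words instead of running a model.

```python
from dataclasses import dataclass, field


@dataclass
class TextPacket:
    """Hypothetical text packet: transcript plus (word, start, end) timestamps."""
    content: str
    timestamps: list[tuple[str, float, float]] = field(default_factory=list)


@dataclass
class STTContainer:
    """Hypothetical stand-in for the framework's data container."""
    audios: list[list[float]] = field(default_factory=list)
    texts: list[TextPacket] = field(default_factory=list)


def fake_transcribe(samples, sample_rate=16000):
    """Stub STT: return canned words with evenly spaced timestamps in seconds."""
    words = ["hello", "world"]
    duration = len(samples) / sample_rate
    step = duration / len(words)
    return [(word, i * step, (i + 1) * step) for i, word in enumerate(words)]


def run_stt(container, transcribe=fake_transcribe):
    """Transcribe each audio packet and append the result as a text packet."""
    for samples in container.audios:
        segments = transcribe(samples)
        text = " ".join(word for word, _, _ in segments)
        container.texts.append(TextPacket(content=text, timestamps=segments))
    return container


stt_result = run_stt(STTContainer(audios=[[0.0] * 32000]))
packet = stt_result.texts[0]
print(packet.content)
print(packet.timestamps)
```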

📄 See the full setup in the README.

ElevenLabs 2.0+ Support
We now offer seamless compatibility with ElevenLabs v2.0 and above, unlocking improved voice fidelity and additional model options.

ElevenLabsTTS

Text-to-speech using ElevenLabs voice models.

ElevenLabsVoiceGeneration

Generates synthetic voices from text descriptions.

📄 Setup instructions available in the package README.

🔧 Full Package Overview
Sinapsis ElevenLabs – TTS + voice generation via ElevenLabs

Sinapsis F5-TTS – TTS with voice cloning

Sinapsis Kokoro – TTS with Kokoro 82M

Sinapsis Zonos – TTS and voice cloning using Zonos

Sinapsis Orpheus-CPP – NEW: TTS via Orpheus

Sinapsis Parakeet-TDT – NEW: STT via Parakeet TDT

sinapsis-speech v0.3.0

25 Apr 22:30

Sinapsis-speech v0.3.0

We are excited to introduce the sinapsis-kokoro package into the sinapsis-speech monorepo. sinapsis-kokoro integrates high-quality text-to-speech (TTS) capabilities into your applications. Built on the Kokoro model, it offers a lightweight and efficient way to generate synthetic speech for a wide range of use cases.

Note: This release also upgrades the webapp to follow the sinapsis design.

Key Features
Text-to-Speech Synthesis

High-Quality Speech Generation: Kokoro delivers speech output comparable to larger models, ensuring clear and natural voice synthesis.
Versatile Use Cases: Perfect for applications like audiobooks, voice assistants, and accessibility tools.

Voice Customization

Multiple Voice Options: Choose from a variety of voices to suit different content types and user preferences.
Customizable Output: Adjust pitch, speed, and tone to create the desired auditory experience.

Deployment Flexibility

Apache-Licensed Weights: Easily deploy Kokoro in any environment, from production servers to personal projects.
Cross-Platform Compatibility: Works seamlessly across various platforms and devices.

Performance and Efficiency

Lightning-Fast Processing: Kokoro's lightweight architecture ensures quick synthesis, reducing latency in real-time applications.
Cost-Effective: Lower computational requirements make it an economical choice for continuous use.

sinapsis-speech v0.2.2

24 Apr 20:58

This release includes a minor fix to the text-to-speech webapps of the different packages.

sinapsis-speech v0.2.0

14 Apr 15:55

We present sinapsis-speech v0.2.0, which adds two new packages: sinapsis-zonos and sinapsis-f5tts. They expand the functionality provided by sinapsis-elevenlabs and showcase how easy it is to integrate new functionality within the sinapsis framework.

sinapsis-zonos Package

High-Quality Voice Synthesis
    Generate natural-sounding speech with customizable voice characteristics.
    Supports multiple languages and voice styles for diverse use cases.

Advanced Voice Customization
    Fine-tune voice attributes such as pitch, speed, and tone.
    Create unique synthetic voices tailored to specific applications.

Efficient Workflow Integration
    Seamlessly integrate with existing workflows for text-to-speech tasks.
    Compatible with the core template system for flexible pipeline construction.

sinapsis-f5tts Package

Real-Time Text-to-Speech Synthesis
    Generate speech on-the-fly with low latency.
    Ideal for applications requiring immediate audio output.

Emotional Modulation
    Add emotional tone to synthetic speech, such as happiness, sadness, or neutrality.
    Enhance the expressiveness of generated speech for more engaging interactions.

Multi-Language Support
    Synthesize speech in multiple languages and dialects.
    Handle long-form text inputs for extended speech generation.

This release also includes two new webapps for testing the functionality of these packages.

sinapsis-speech v0.1.0

19 Mar 21:47
d675752

The sinapsis-speech monorepo provides the sinapsis-elevenlabs package, whose templates offer a flexible and reusable framework for text-to-speech and voice generation workflows. This release introduces a core template system that allows developers to:

  1. Define Custom Templates: Create reusable blueprints for text-to-speech and voice synthesis tasks.
  2. Compose Complex Workflows: Combine multiple templates to build sophisticated data processing pipelines.
  3. Modify Attributes Dynamically: Update template attributes through a dedicated method while preserving metadata.
  4. Integrate with Data Containers: Process and transform data within a unified data container system.
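
The four capabilities above can be pictured with a small, self-contained sketch. It is not the sinapsis API: `Template`, `DataContainer`, `update_attributes`, `Uppercase`, `FakeTTS`, and `run_pipeline` are hypothetical names chosen only to illustrate defining templates, chaining them, changing attributes while leaving metadata alone, and passing one container through every step.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataContainer:
    """Hypothetical shared container passed from template to template."""
    texts: list[str] = field(default_factory=list)
    audios: list[bytes] = field(default_factory=list)


class Template:
    """Hypothetical base class: one reusable processing step."""

    def __init__(self, **attributes: Any) -> None:
        self.attributes = dict(attributes)
        self.metadata = {"name": type(self).__name__}

    def update_attributes(self, **changes: Any) -> None:
        """Change attributes in place; metadata is left untouched."""
        self.attributes.update(changes)

    def execute(self, container: DataContainer) -> DataContainer:
        raise NotImplementedError


class Uppercase(Template):
    """Toy preprocessing step that upper-cases every text packet."""

    def execute(self, container: DataContainer) -> DataContainer:
        container.texts = [text.upper() for text in container.texts]
        return container


class FakeTTS(Template):
    """Toy TTS step: encodes 'voice:text' as bytes instead of real audio."""

    def execute(self, container: DataContainer) -> DataContainer:
        voice = self.attributes.get("voice", "default")
        container.audios = [f"{voice}:{text}".encode() for text in container.texts]
        return container


def run_pipeline(templates: list[Template], container: DataContainer) -> DataContainer:
    """Compose templates by feeding each one the previous step's container."""
    for template in templates:
        container = template.execute(container)
    return container


tts = FakeTTS(voice="alice")
tts.update_attributes(voice="bob")  # dynamic attribute change
out = run_pipeline([Uppercase(), tts], DataContainer(texts=["hi"]))
print(out.audios)
```

Keeping attributes and metadata in separate dictionaries is one simple way to let an update method change behavior without touching identifying information; the real framework may organize this differently.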

The sinapsis-elevenlabs package includes templates for:

  • Text-to-speech: Template for converting text into speech using ElevenLabs' voice models.
  • Voice generation: Template for generating custom synthetic voices based on user-provided descriptions.