TTS and STS Models to port to MLX-Audio (Roadmap)

## Overview
This issue outlines our roadmap for integrating additional text-to-speech (TTS) and speech-to-speech (STS) models into the MLX-Audio library to expand our offerings beyond the current Kokoro model.

## Text-to-Speech (TTS) Models

### Planned TTS Models
- [x] **Nari Labs Dia 1.6B**
- [x] **OuteTTS v1**
- [x] **Orpheus**
- [x] **BARK**
- [x] **SparkTTS 0.5B**
- [x] **Sesame CSM-1B**
- [x] **IndexTTS**
- [x] **ChatterBox**
- [x] **VibeVoice** 
- [x] **VyoTTS**
- [ ] **MegaTTS**
- [ ] **Zonos**
- [ ] **CosyVoice2**
- [ ] **StyleTTS2**
- [ ] **Parler TTS**
- [ ] **ibm-granite/granite-speech-3.2-8b**
- [ ] **LLMVoX**
- [ ] **MeloTTS**
- [ ] **bosonai/higgs-audio-v2**

## Speech-to-Speech (STS) Models

### Planned STS Models
- [ ] **Kyutai-Labs Moshi**
- [ ] **Kyutai-Labs Moshi-vis**
 
## Speech-to-Text (STT) Models
- [x] Whisper 
- [x] Parakeet
- [x] Wav2vec
- [x] Voxtral
- [ ] Canary

## Technical Considerations
- All models will need MLX-specific optimizations
- Quantization support should be implemented for each model
- Documentation and examples will be created for each new model
- Performance benchmarks will be established
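
As one possible starting point for the benchmark item above, a common speech-synthesis metric is the real-time factor (RTF): wall-clock seconds spent per second of generated audio. The sketch below is only an illustration and is not part of MLX-Audio. The `real_time_factor` helper and the `synthesize` callable are hypothetical names, and the callable is assumed to return fully computed audio samples (MLX evaluates lazily, so a wrapper around an MLX model would need to force evaluation before returning).

```python
import time
from typing import Callable, Sequence


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],  # hypothetical model wrapper
    text: str,
    sample_rate: int,
    runs: int = 3,
) -> float:
    """Return the mean wall-clock seconds spent per second of generated audio.

    Values below 1.0 mean the model generates audio faster than real time.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")

    ratios = []
    for _ in range(runs):
        start = time.perf_counter()
        # Assumption: the returned samples are already materialized,
        # so the elapsed time covers the full generation.
        samples = synthesize(text)
        elapsed = time.perf_counter() - start

        audio_seconds = len(samples) / sample_rate
        if audio_seconds == 0:
            raise ValueError("synthesize returned no audio samples")
        ratios.append(elapsed / audio_seconds)

    return sum(ratios) / len(ratios)
```

Reporting the same metric for each ported model, at each quantization level, would make results comparable across pull requests. Which metric the project adopts is still open.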

## Instructions
1. Select a model and comment below with your selection.
2. Create a draft PR titled "Add support for X", where X is the model name.
3. Read the [contribution guide](https://github.com/Blaizzy/mlx-audio/blob/main/CONTRIBUTING.md).
4. Check the existing [models](https://github.com/Blaizzy/mlx-audio/tree/main/mlx_audio) for reference.
5. Tag @Blaizzy for code reviews and questions.

## Community Input
We welcome community feedback on prioritization and additional model suggestions. Please comment on this issue with your thoughts.
