LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
A simple, high-quality voice conversion tool focused on ease of use and performance.
Real-time AI voice agents running SoTA multimodal AI models on Arduino ESP32, supporting more than 15 minutes of uninterrupted conversation globally, for AI toys, AI companions, AI devices, and more.
A lightning-fast, cross-platform AI Assistant App built with React Native.
The world's first open-source real-time end-to-end spoken dialogue model with personalized voice cloning.
High-quality, streaming speech-to-speech interactive agent in a single file. A streaming, full-duplex spoken-interaction prototype agent implemented in just one file!
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. Features low-latency audio streaming, dynamic visual feedback, and works with local LLM/TTS services via OpenAI-compatible endpoints.
Your faithful, impartial partner for audio evaluation: honest benchmarks, so you know yourself and know your rivals.
A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2, and Kokoro-82M.
An on-CPU, real-time conversational system for two-way speech communication with AI models, using a continuous streaming architecture for fluid conversations with immediate responses and natural interruption handling.
A modular Swift SDK for audio processing with MLX on Apple Silicon
MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering, but not limited to, end-to-end speech interaction, end-to-end speech translation, and speech recognition.
AI-powered YouTube video dubbing pipeline. Automatically transcribes (Whisper), translates (Google), and generates neural dubbing (Edge-TTS) with smart audio-video synchronization and background music preservation.
FreeSWITCH module that streams audio to a WebSocket and receives the response.
Samantha OS1 is a conversational AI assistant powered by the Realtime API from OpenAI
X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech interaction with a lightweight, pure-Python, production-ready architecture.
Cascading voice assistant combining real-time speech recognition, AI reasoning, and neural text-to-speech capabilities.