This repository contains the implementation of a Speech-to-Speech (S2S) Dialogue System.
The pre-trained and fine-tuned checkpoints for this dialogue system are available on Hugging Face:
Download Link: tranquangchung/qwen2-audio-dialogue
You can clone the model using:
git lfs install
git clone https://huggingface.co/tranquangchung/qwen2-audio-dialogue
Prerequisites
Ensure you have Python 3.10+ and the necessary audio processing libraries installed:
pip install torch torchaudio transformers accelerate librosa
To test the dialogue system with real-world audio samples ("in-the-wild"), run the provided inference script:
python test_dialogue_inthewild.py
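Before running the inference script, it can help to confirm that the prerequisites are actually in place. The following is a minimal sketch and not part of this repository: the helper names `missing_packages` and `check_environment` are hypothetical, and the package list simply mirrors the pip command above, assuming each package's import name matches its install name.

```python
import importlib.util
import sys

# Packages taken from the pip command in the prerequisites.
# Assumption: each import name is the same as its pip name.
REQUIRED = ["torch", "torchaudio", "transformers", "accelerate", "librosa"]
MIN_PYTHON = (3, 10)


def missing_packages(names):
    """Return the names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment():
    """Return a list of readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.10+ required, found {found}")
    for name in missing_packages(REQUIRED):
        problems.append(f"missing package: {name}")
    return problems


# Print one line per problem; prints nothing when everything is available.
for problem in check_environment():
    print(problem)
```

The check only looks up whether each package can be found; it does not import the packages, so it stays fast even when torch is installed.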