Commit 5a00aef

update
1 parent 35abb1f commit 5a00aef

2 files changed, 41 additions and 0 deletions

README.md

Lines changed: 41 additions & 0 deletions
@@ -1,6 +1,9 @@
 [![SVG Banners](https://svg-banners.vercel.app/api?type=origin&text1=CosyVoice🤠&text2=Text-to-Speech%20💖%20Large%20Language%20Model&width=800&height=210)](https://github.com/Akshay090/svg-banners)

 ## 👉🏻 CosyVoice 👈🏻
+
+**CosyVoice 3.0**: [Demos](https://funaudiollm.github.io/cosyvoice3/); [Paper](https://arxiv.org/abs/2505.17589); [CV3-Eval](https://github.com/FunAudioLLM/CV3-Eval)
+
 **CosyVoice 2.0**: [Demos](https://funaudiollm.github.io/cosyvoice2/); [Paper](https://arxiv.org/abs/2412.10117); [Modelscope](https://www.modelscope.cn/studios/iic/CosyVoice2-0.5B); [HuggingFace](https://huggingface.co/spaces/FunAudioLLM/CosyVoice2-0.5B)

 **CosyVoice 1.0**: [Demos](https://fun-audio-llm.github.io); [Paper](https://funaudiollm.github.io/pdf/CosyVoice_v1.pdf); [Modelscope](https://www.modelscope.cn/studios/iic/CosyVoice-300M)
@@ -26,6 +29,10 @@

 ## Roadmap

+- [x] 2025/07
+
+- [x] release cosyvoice 3.0 eval set
+
 - [x] 2025/05

 - [x] add cosyvoice 2.0 vllm support
@@ -251,5 +258,39 @@ You can also scan the QR code to join our official Dingding chat group.
 4. We borrowed a lot of code from [AcademiCodec](https://github.com/yangdongchao/AcademiCodec).
 5. We borrowed a lot of code from [WeNet](https://github.com/wenet-e2e/wenet).

+## Citations
+
+``` bibtex
+@article{du2024cosyvoice,
+title={Cosyvoice: A scalable multilingual zero-shot text-to-speech synthesizer based on supervised semantic tokens},
+author={Du, Zhihao and Chen, Qian and Zhang, Shiliang and Hu, Kai and Lu, Heng and Yang, Yexin and Hu, Hangrui and Zheng, Siqi and Gu, Yue and Ma, Ziyang and others},
+journal={arXiv preprint arXiv:2407.05407},
+year={2024}
+}
+
+@article{du2024cosyvoice,
+title={Cosyvoice 2: Scalable streaming speech synthesis with large language models},
+author={Du, Zhihao and Wang, Yuxuan and Chen, Qian and Shi, Xian and Lv, Xiang and Zhao, Tianyu and Gao, Zhifu and Yang, Yexin and Gao, Changfeng and Wang, Hui and others},
+journal={arXiv preprint arXiv:2412.10117},
+year={2024}
+}
+
+@article{du2025cosyvoice,
+title={CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training},
+author={Du, Zhihao and Gao, Changfeng and Wang, Yuxuan and Yu, Fan and Zhao, Tianyu and Wang, Hao and Lv, Xiang and Wang, Hui and Shi, Xian and An, Keyu and others},
+journal={arXiv preprint arXiv:2505.17589},
+year={2025}
+}
+
+@inproceedings{lyu2025build,
+title={Build LLM-Based Zero-Shot Streaming TTS System with Cosyvoice},
+author={Lyu, Xiang and Wang, Yuxuan and Zhao, Tianyu and Wang, Hao and Liu, Huadai and Du, Zhihao},
+booktitle={ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+pages={1--2},
+year={2025},
+organization={IEEE}
+}
+```
+
 ## Disclaimer
 The content provided above is for academic purposes only and is intended to demonstrate technical capabilities. Some examples are sourced from the internet. If any content infringes on your rights, please contact us to request its removal.

asset/dingding.png

46 Bytes
