Commit 1b6c974

Update README.md
1 parent a877a42 commit 1b6c974

1 file changed: +3 -2 lines changed

README.md

Lines changed: 3 additions & 2 deletions
@@ -6,7 +6,7 @@
 </p>
 <a href="https://arxiv.org/abs/2509.14946"><img src="https://img.shields.io/badge/arXiv-2509.14946-b31b1b.svg?logo=arxiv&logoColor=white" alt="arXiv:2509.14946"></a>
 <a href="https://shawnpi233.github.io/SynParaSpeech"><img src="https://img.shields.io/badge/Demos-🌐-blue" alt="Demos"></a>
-<a href="https://huggingface.co/datasets/shawnpi/SynParaSpeech"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Dataset%20-Demo%20Access-orange" alt="Dataset Access(Coming Soon)"></a>
+<a href="https://huggingface.co/datasets/shawnpi/SynParaSpeech"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Dataset%20-%20Access-orange" alt="Dataset Access"></a>
 <!-- <a href="README_zh.md"><img src="https://img.shields.io/badge/语言-简体中文-green" alt="简体中文"></a> -->
 <a href="https://creativecommons.org/licenses/by-nc-nd/4.0/"><img src="https://img.shields.io/badge/License-CC%20BY--NC--ND%204.0-blue.svg" alt="License: CC BY-NC-ND 4.0"></a>
 </div>
@@ -16,11 +16,12 @@
 - **[2025-09-18]** 🎉 Initial release of arxiv paper.
 - **[2025-09-20]** 🎉 Initial release of demo page.
 - **[2025-09-22]** 🎉 Initial release of HuggingFace dataset demo.
+- **[2025-11-02]** 🎉 Initial release of HuggingFace full dataset (compliance-vetted core data).
 
 ### 📅 Release Plan
 - [x] Demo page
 - [x] SynParaSpeech demo dataset
-- [ ] SynParaSpeech full dataset
+- [x] SynParaSpeech full dataset
 - [ ] Fine-tuned TTS model checkpoints and inference codes
 
 SynParaSpeech is the **first automated syntheis framework** designed for constructing large-scale paralinguistic datasets, enabling more realistic speech synthesis and speech understanding. It addresses critical issues in existing resources by generating high-quality data with paralinguistic sounds (e.g., laughter, sigh, throat clearing) that are fully aligned with speech, text, and precise timestamps.
