Commit 232b2cb

Merge pull request #6007 from goergenj/jagoerge-speech-hotfixes
Adding note on new VAD model rolled out in July 2025 as hotfix to Wha…
2 parents feed5d6 + 21a713f commit 232b2cb

1 file changed

+9
-3
lines changed

articles/ai-services/speech-service/includes/release-notes/release-notes-stt.md

Lines changed: 9 additions & 3 deletions
@@ -7,21 +7,27 @@ ms.author: eur
 ms.custom: references_regions
 ---
 
+### July 2025 release
+
+#### Improved speech to text models
+
+The English models (all `en-*` models except for `en-IN`) were updated to incorporate a new VAD (voice activity detector) which helps reduce the latency by 100 ms or more. It can affect the accuracy and silence segmentation both positively and negatively, with the aim of reducing latency. Further language expansion is coming in the next few months.
+
 ### June 2025 release
 
 #### Improved pronunciation assessment model
 
-We've rolled out significant upgrades to the pronunciation assessment models for `ta-IN` and `ms-MY`. You'll see a noticeable jump in Pearson Correlation Coefficients (PCC), which means more precise and dependable evaluations.
+We rolled out significant upgrades to the pronunciation assessment models for `ta-IN` and `ms-MY`. You're seeing a noticeable jump in Pearson Correlation Coefficients (PCC), which means more precise and dependable evaluations.
 
 These updated models are ready to use through the API and the Azure AI Foundry playground, just like before.
 
 #### Improved speech to text models
-Accuracy of speech to text models in [fast transcription](../../fast-transcription-create.md) for `de-DE`, `en-US`, `en-GB`, `es-ES`, `es-MX`, `fr-FR`, `it-IT`, `ja-JP`, `ko-KR`, `pt-BR`, and `zh-CN` locales are improved by 10%-25% percent respectively, particularly with improved readaibility and recognition on entities.
+Accuracy of speech to text models in [fast transcription](../../fast-transcription-create.md) for `de-DE`, `en-US`, `en-GB`, `es-ES`, `es-MX`, `fr-FR`, `it-IT`, `ja-JP`, `ko-KR`, `pt-BR`, and `zh-CN` locales improving by 10%-25% percent respectively, particularly with improved readability and recognition on entities.
 
 ### May 2025 release
 
 #### Improved speech to text models
-Accuracy of speech to text models for `ta-IN`, `te-IN`, `en-IN`, and `hu-HU` locales are improved by 5-10 percent respectively. We also approximate a 20x reduction in ghost words for the `ta-IN` and `te-IN` models.
+Accuracy of speech to text models for `ta-IN`, `te-IN`, `en-IN`, and `hu-HU` locales improving by 5-10 percent respectively. We also approximate a 20x reduction in ghost words for the `ta-IN` and `te-IN` models.
 
 #### Fast transcription API - Multi-lingual speech transcription
 