You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-services/speech-service/includes/release-notes/release-notes-stt.md
+9-3Lines changed: 9 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -7,21 +7,27 @@ ms.author: eur
7
7
ms.custom: references_regions
8
8
---
9
9
10
+
### July 2025 release
11
+
12
+
#### Improved speech to text models
13
+
14
+
The English models (all `en-*` models except for `en-IN`) were updated to incorporate a new VAD (voice activity detector) which helps reduce the latency by 100 ms or more. It can affect the accuracy and silence segmentation both positively and negatively, with the aim of reducing latency. Further language expansion is coming in the next few months.
15
+
10
16
### June 2025 release
11
17
12
18
#### Improved pronunciation assessment model
13
19
14
-
We've rolled out significant upgrades to the pronunciation assessment models for `ta-IN` and `ms-MY`. You'll see a noticeable jump in Pearson Correlation Coefficients (PCC), which means more precise and dependable evaluations.
20
+
We rolled out significant upgrades to the pronunciation assessment models for `ta-IN` and `ms-MY`. You're seeing a noticeable jump in Pearson Correlation Coefficients (PCC), which means more precise and dependable evaluations.
15
21
16
22
These updated models are ready to use through the API and the Azure AI Foundry playground, just like before.
17
23
18
24
#### Improved speech to text models
19
-
Accuracy of speech to text models in [fast transcription](../../fast-transcription-create.md) for `de-DE`, `en-US`, `en-GB`, `es-ES`, `es-MX`, `fr-FR`, `it-IT`, `ja-JP`, `ko-KR`, `pt-BR`, and `zh-CN` locales are improved by 10%-25% percent respectively, particularly with improved readaibility and recognition on entities.
25
+
Accuracy of speech to text models in [fast transcription](../../fast-transcription-create.md) for `de-DE`, `en-US`, `en-GB`, `es-ES`, `es-MX`, `fr-FR`, `it-IT`, `ja-JP`, `ko-KR`, `pt-BR`, and `zh-CN` locales improving by 10%-25% percent respectively, particularly with improved readability and recognition on entities.
20
26
21
27
### May 2025 release
22
28
23
29
#### Improved speech to text models
24
-
Accuracy of speech to text models for `ta-IN`, `te-IN`, `en-IN`, and `hu-HU` locales are improved by 5-10 percent respectively. We also approximate a 20x reduction in ghost words for the `ta-IN` and `te-IN` models.
30
+
Accuracy of speech to text models for `ta-IN`, `te-IN`, `en-IN`, and `hu-HU` locales improving by 5-10 percent respectively. We also approximate a 20x reduction in ghost words for the `ta-IN` and `te-IN` models.
25
31
26
32
#### Fast transcription API - Multi-lingual speech transcription
0 commit comments