Commit 031a435

Update how-to-pronunciation-assessment.md
1 parent a1c66f5 commit 031a435

1 file changed (+22, -2 lines changed)

articles/ai-services/speech-service/how-to-pronunciation-assessment.md

Lines changed: 22 additions & 2 deletions
@@ -680,7 +680,7 @@ You can get pronunciation assessment scores for:
 - Syllable groups
 - Phonemes in [SAPI](/previous-versions/windows/desktop/ee431828(v=vs.85)#american-english-phoneme-table) or [IPA](https://en.wikipedia.org/wiki/IPA) format

-### Supported features per locale
+## Supported features per locale

 The following table summarizes which features each locale supports. For more specifics, see the following sections. If a locale you require isn't listed in the following table for a supported feature, fill out this [intake form](https://aka.ms/speechpa/intake) for further assistance.

@@ -783,7 +783,7 @@ pronunciationAssessmentConfig?.phonemeAlphabet = "IPA"

 ::: zone-end

-## Assess spoken phonemes
+### Assess spoken phonemes

 With spoken phonemes, you can get confidence scores that indicate how likely it is that the spoken phonemes matched the expected phonemes.

@@ -1029,6 +1029,26 @@ pronunciationAssessmentConfig?.nbestPhonemeCount = 5

 ::: zone-end

+## Other tips on configuration and SDK usage
+
+- Pronunciation assessment uses a fixed version of the speech to text model, which differs from the standard speech to text model.
+- The pronunciation score is a weighted combination of the accuracy, prosody, fluency, and completeness scores, with separate formulas for the reading and speaking scenarios.
+
+Sort the available accuracy, prosody, fluency, and completeness scores from low to high, and denote them s0 (lowest) through s3 (highest). The pronunciation score is then calculated as follows:
+
+For the reading scenario:
+- With prosody score: PronScore = 0.4 * s0 + 0.2 * s1 + 0.2 * s2 + 0.2 * s3
+- Without prosody score: PronScore = 0.6 * s0 + 0.2 * s1 + 0.2 * s2
+
+For the speaking scenario (the completeness score doesn't apply):
+- With prosody score: PronScore = 0.6 * s0 + 0.2 * s1 + 0.2 * s2
+- Without prosody score: PronScore = 0.6 * s0 + 0.4 * s1
+
+In every case the lowest score carries the largest weight, so the weakest component has the most influence on the pronunciation score.
+
+- Currently, only `en-US` is supported for topics in pronunciation assessment. There is no length limit for the topic parameter.
+- If your audio file exceeds 30 seconds, use continuous mode for processing. To learn how to use streaming mode, see [Use pronunciation assessment in streaming mode](#use-pronunciation-assessment-in-streaming-mode).
+
 ## Related content

 - Learn about quality [benchmark](https://aka.ms/pronunciationassessment/techblog).
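
The PronScore formulas added in this commit can be checked by hand with a small helper. The following is a minimal sketch rather than the service's implementation: the function name `pron_score`, its parameters, and the `_WEIGHTS` table are hypothetical and not part of the Speech SDK. It assumes that the available scores are sorted from low to high and that the weights depend only on how many scores are available, which is how the four formulas read.

```python
from typing import Optional

# Hypothetical weight table, keyed by the number of available scores.
# Weights are listed lowest score first (s0, s1, ...), as in the formulas.
_WEIGHTS = {
    4: (0.4, 0.2, 0.2, 0.2),  # reading, with prosody
    3: (0.6, 0.2, 0.2),       # reading without prosody, or speaking with prosody
    2: (0.6, 0.4),            # speaking, without prosody
}


def pron_score(
    accuracy: float,
    fluency: float,
    completeness: Optional[float] = None,
    prosody: Optional[float] = None,
    scenario: str = "reading",
) -> float:
    """Combine component scores into a pronunciation score (illustrative only)."""
    if scenario not in ("reading", "speaking"):
        raise ValueError("scenario must be 'reading' or 'speaking'")

    scores = [accuracy, fluency]
    if prosody is not None:
        scores.append(prosody)
    if scenario == "reading":
        # Completeness only contributes in the reading scenario.
        if completeness is None:
            raise ValueError("the reading scenario needs a completeness score")
        scores.append(completeness)

    scores.sort()  # s0 is the lowest score, so it receives the largest weight
    weights = _WEIGHTS[len(scores)]
    return sum(w * s for w, s in zip(weights, scores))
```

For example, a reading assessment with accuracy 90, fluency 80, completeness 100, and prosody 70 sorts to 70, 80, 90, 100 and gives 0.4 * 70 + 0.2 * 80 + 0.2 * 90 + 0.2 * 100 = 82. In practice the service returns the pronunciation score directly, so a helper like this is only useful for understanding or verifying the result.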
