Merge pull request #227722 from sally-baolian/patch-100

v-dirichards · web-flow · commit 9c2b041b3aa4 · 2023-02-23T09:58:49.000-06:00
Update pronunciation-assessment-tool.md
diff --git a/articles/cognitive-services/Speech-Service/how-to-pronunciation-assessment.md b/articles/cognitive-services/Speech-Service/how-to-pronunciation-assessment.md
@@ -678,7 +678,53 @@ Pronunciation assessment results for the spoken word "hello" are shown as a JSON
 }
 ```
 
+## Pronunciation assessment in streaming mode
+
+Pronunciation assessment supports uninterrupted streaming mode. Through the Speech SDK, the recording time can be unlimited. As long as you don't stop recording, the evaluation process doesn't finish, and you can pause and resume evaluation conveniently. In streaming mode, the `AccuracyScore`, `FluencyScore`, and `CompletenessScore` vary over time throughout the recording and evaluation process.
+
+::: zone pivot="programming-language-csharp"
+
+For how to use Pronunciation Assessment in streaming mode in your own application, see [sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/speech_recognition_samples.cs#:~:text=PronunciationAssessmentWithStream).
+
+::: zone-end
+
+::: zone pivot="programming-language-cpp"
+
+For how to use Pronunciation Assessment in streaming mode in your own application, see [sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/cpp/windows/console/samples/speech_recognition_samples.cpp#:~:text=PronunciationAssessmentWithStream).
+
+::: zone-end
+
+::: zone pivot="programming-language-java"
+
+For how to use Pronunciation Assessment in streaming mode in your own application, see [sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/java/android/sdkdemo/app/src/main/java/com/microsoft/cognitiveservices/speech/samples/sdkdemo/MainActivity.java#L548).
+
+::: zone-end
+
+::: zone pivot="programming-language-python"
+
+
+::: zone-end
+
+::: zone pivot="programming-language-javascript"
+
+For how to use Pronunciation Assessment in streaming mode in your own application, see [sample code](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/js/node/pronunciationAssessment.js).
+
+::: zone-end
+
+::: zone pivot="programming-language-objectivec"
+
+::: zone-end
+
+::: zone pivot="programming-language-swift"
+
+::: zone-end
+
+::: zone pivot="programming-language-go"
+
+::: zone-end
+
 ## Next steps
 
+- Learn about our quality [benchmark](https://techcommunity.microsoft.com/t5/ai-cognitive-services-blog/speech-service-update-hierarchical-transformer-for-pronunciation/ba-p/3740866)
 - Try out [pronunciation assessment in Speech Studio](pronunciation-assessment-tool.md)
-- Try out the [pronunciation assessment demo](https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/PronunciationAssessment/BrowserJS) and watch the [video tutorial](https://www.youtube.com/watch?v=zFlwm7N4Awc) of pronunciation assessment.
+- Check out the easy-to-deploy Pronunciation Assessment [demo](https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/PronunciationAssessment/BrowserJS) and watch the [video tutorial](https://www.youtube.com/watch?v=zFlwm7N4Awc) of pronunciation assessment.
diff --git a/articles/cognitive-services/Speech-Service/media/pronunciation-assessment/initial-recording.png b/articles/cognitive-services/Speech-Service/media/pronunciation-assessment/initial-recording.png
diff --git a/articles/cognitive-services/Speech-Service/pronunciation-assessment-tool.md b/articles/cognitive-services/Speech-Service/pronunciation-assessment-tool.md
@@ -19,7 +19,7 @@ Pronunciation assessment uses the Speech-to-Text capability to provide subjectiv
 Pronunciation assessment provides various assessment results in different granularities, from individual phonemes to the entire text input. 
 - At the full-text level, pronunciation assessment offers additional Fluency and Completeness scores: Fluency indicates how closely the speech matches a native speaker's use of silent breaks between words, and Completeness indicates how many words are pronounced in the speech to the reference text input. An overall score aggregated from Accuracy, Fluency and Completeness is then given to indicate the overall pronunciation quality of the given speech.  
 - At the word-level, pronunciation assessment can automatically detect miscues and provide accuracy score simultaneously, which provides more detailed information on omission, repetition, insertions, and mispronunciation in the given speech.
-- Syllable-level accuracy scores are currently only available via the [JSON file](?tabs=json#scores-within-words) or [Speech SDK](how-to-pronunciation-assessment.md).
+- Syllable-level accuracy scores are currently available via the [JSON file](?tabs=json#pronunciation-assessment-results) or [Speech SDK](how-to-pronunciation-assessment.md). 
 - At the phoneme level, pronunciation assessment provides accuracy scores of each phoneme, helping learners to better understand the pronunciation details of their speech.
 
 This article describes how to use the pronunciation assessment tool through the [Speech Studio](https://speech.microsoft.com). You can get immediate feedback on the accuracy and fluency of your speech without writing any code. For information about how to integrate pronunciation assessment in your speech applications, see [How to use pronunciation assessment](how-to-pronunciation-assessment.md).
@@ -56,27 +56,12 @@ Follow these steps to assess your pronunciation of the reference text:
 
    :::image type="content" source="media/pronunciation-assessment/pa-upload.png" alt-text="Screenshot of uploading recorded audio to be assessed.":::
 
-
 ## Pronunciation assessment results
 
 Once you've recorded the reference text or uploaded the recorded audio, the **Assessment result** will be output. The result includes your spoken audio and the feedback on the accuracy and fluency of spoken audio, by comparing a machine generated transcript of the input audio with the reference text. You can listen to your spoken audio, and download it if necessary.
 
 You can also check the pronunciation assessment result in JSON. The word-level, syllable-level, and phoneme-level accuracy scores are included in the JSON file. 
 
-### Overall scores 
-
-Pronunciation Assessment evaluates three aspects of pronunciation: accuracy, fluency, and completeness. At the bottom of **Assessment result**, you can see **Pronunciation score**, **Accuracy score**, **Fluency score**, and **Completeness score**. The **Accuracy score** and the **Fluency score** will vary over time throughout the recording process. The **Completeness score** is only calculated at the end of the evaluation. The **Pronunciation score** is overall score indicating the pronunciation quality of the given speech. During recording, the **Pronunciation score** is aggregated from **Accuracy score** and **Fluency score** with weight. Once completing recording, this overall score is aggregated from **Accuracy score**, **Fluency score**, and **Completeness score** with weight.
-
-**During recording**
-
-:::image type="content" source="media/pronunciation-assessment/pa-recording-display-score.png" alt-text="Screenshot of overall assessment scores when recording." lightbox="media/pronunciation-assessment/pa-recording-display-score.png":::
-
-**Completing recording**
-
-:::image type="content" source="media/pronunciation-assessment/pa-after-recording-display-score.png" alt-text="Screenshot of overall assessment scores after recording." lightbox="media/pronunciation-assessment/pa-after-recording-display-score.png":::
-
-### Scores within words
-
 ### [Display](#tab/display)
 
 The complete transcription is shown in the **Display** window. If a word is omitted, inserted, or mispronounced compared to the reference text, the word will be highlighted according to the error type. While hovering over each word, you can see accuracy scores for the whole word or specific phonemes. 
@@ -233,7 +218,31 @@ The complete transcription is shown in the `text` attribute. You can see accurac
 
 ---
 
+### Assessment scores in streaming mode
+
+Pronunciation Assessment supports uninterrupted streaming mode. The Speech Studio demo allows for up to 60 minutes of recording in streaming mode for evaluation. As long as you do not press the stop recording button, the evaluation process does not finish, and you can pause and resume evaluation conveniently.
+
+Pronunciation Assessment evaluates three aspects of pronunciation: accuracy, fluency, and completeness. At the bottom of **Assessment result**, you can see **Pronunciation score**, an aggregated overall score that combines three sub-aspects: **Accuracy score**, **Fluency score**, and **Completeness score**. In streaming mode, because the scores vary over time throughout the recording process, Speech Studio demonstrates one approach: before the end of the evaluation, it incrementally displays an approximate overall score that is weighted from only the **Accuracy score** and **Fluency score**. The **Completeness score** is only calculated at the end of the evaluation, after you press the stop button, so the final overall score is a weighted aggregate of the **Accuracy score**, **Fluency score**, and **Completeness score**.
 
+Refer to the demo examples below for the whole process of evaluating pronunciation in streaming mode.
+
+**Start recording**
+
+As you start recording, the scores at the bottom begin to change from 0.
+
+:::image type="content" source="media/pronunciation-assessment/initial-recording.png" alt-text="Screenshot of overall assessment scores when starting to record." lightbox="media/pronunciation-assessment/initial-recording.png":::
+
+**During recording**
+
+While recording a long paragraph, you can pause recording at any time. You can continue to evaluate your recording as long as you don't press the stop button.
+
+:::image type="content" source="media/pronunciation-assessment/pa-recording-display-score.png" alt-text="Screenshot of overall assessment scores when recording." lightbox="media/pronunciation-assessment/pa-recording-display-score.png":::
+
+**Finish recording**
+
+After you press the stop button, you can see **Pronunciation score**, **Accuracy score**, **Fluency score**, and **Completeness score** at the bottom.
+
+:::image type="content" source="media/pronunciation-assessment/pa-after-recording-display-score.png" alt-text="Screenshot of overall assessment scores after recording." lightbox="media/pronunciation-assessment/pa-after-recording-display-score.png":::
 
 ## Next steps