
Commit 0524199

Merge pull request #264568 from TimShererWithAquent/us200722d

Freshness update: Azure AI Speech service

2 parents 7984ed9 + 41ac651

File tree: 7 files changed, +132 -91 lines


articles/ai-services/speech-service/get-started-stt-diarization.md (5 additions, 5 deletions)

@@ -1,19 +1,20 @@
 ---
 title: "Real-time diarization quickstart - Speech service"
 titleSuffix: Azure AI services
-description: In this quickstart, you convert speech to text continuously from a file. The service transcribes the speech and identifies one or more speakers.
+description: In this quickstart, you convert speech to text continuously from a file. The Speech service transcribes the speech and identifies one or more speakers.
 author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.custom: devx-track-extended-java, devx-track-go, devx-track-js, devx-track-python
 ms.topic: quickstart
-ms.date: 7/27/2023
+ms.date: 01/30/2024
 ms.author: eur
 zone_pivot_groups: programming-languages-speech-services
 keywords: speech to text, speech to text software
+#customer intent: As a developer, I want to create speech to text applications that use diarization to improve readability of multiple person conversations.
 ---

-# Quickstart: Real-time diarization (Preview)
+# Quickstart: Create real-time diarization (Preview)

 ::: zone pivot="programming-language-csharp"
 [!INCLUDE [C# include](includes/quickstarts/stt-diarization/csharp.md)]
@@ -55,8 +56,7 @@ keywords: speech to text, speech to text software
 [!INCLUDE [CLI include](includes/quickstarts/stt-diarization/cli.md)]
 ::: zone-end

-
-## Next steps
+## Next step

 > [!div class="nextstepaction"]
 > [Learn more about speech recognition](how-to-recognize-speech.md)

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/cpp.md (24 additions, 17 deletions)

@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 7/27/2023
+ms.date: 01/30/2024
 ms.author: eur
 ---

@@ -15,23 +15,27 @@ ms.author: eur
 [!INCLUDE [Prerequisites](../../common/azure-prerequisites.md)]

 ## Set up the environment
+
 The Speech SDK is available as a [NuGet package](https://www.nuget.org/packages/Microsoft.CognitiveServices.Speech) and implements .NET Standard 2.0. You install the Speech SDK later in this guide, but first check the [SDK installation guide](../../../quickstarts/setup-platform.md?pivots=programming-language-cpp) for any more requirements.

 ### Set environment variables

 [!INCLUDE [Environment variables](../../common/environment-variables.md)]

-## Diarization from file with conversation transcription
+## Implement diarization from file with conversation transcription
+
+Follow these steps to create a console application and install the Speech SDK.

-Follow these steps to create a new console application and install the Speech SDK.
+1. Create a new C++ console project in [Visual Studio Community 2022](https://visualstudio.microsoft.com/downloads/) named `ConversationTranscription`.

-1. Create a new C++ console project in Visual Studio Community 2022 named `ConversationTranscription`.
-1. Install the Speech SDK in your new project with the NuGet package manager.
-```powershell
+1. Select **Tools** > **Nuget Package Manager** > **Package Manager Console**. In the **Package Manager Console**, run this command:
+
+```console
 Install-Package Microsoft.CognitiveServices.Speech
 ```
-1. Replace the contents of `ConversationTranscription.cpp` with the following code:
-
+
+1. Replace the contents of `ConversationTranscription.cpp` with the following code.
+
 ```cpp
 #include <iostream>
 #include <stdlib.h>
@@ -134,20 +138,23 @@ Follow these steps to create a new console application and install the Speech SD
 }
 ```

-1. Replace `katiesteve.wav` with the filepath and filename of your `.wav` file. The intent of this quickstart is to recognize speech from multiple participants in the conversation. Your audio file should contain multiple speakers. For example, you can use the [sample audio file](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/sampledata/audiofiles/katiesteve.wav) provided in the Speech SDK samples repository on GitHub.
-> [!NOTE]
-> The service performs best with at least 7 seconds of continuous audio from a single speaker. This allows the system to differentiate the speakers properly. Otherwise the Speaker ID is returned as `Unknown`.
-1. To change the speech recognition language, replace `en-US` with another [supported language](~/articles/cognitive-services/speech-service/supported-languages.md). For example, `es-ES` for Spanish (Spain). The default language is `en-US` if you don't specify a language. For details about how to identify one of multiple languages that might be spoken, see [language identification](~/articles/cognitive-services/speech-service/language-identification.md).
+1. Get the [sample audio file](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/sampledata/audiofiles/katiesteve.wav) or use your own `.wav` file. Replace `katiesteve.wav` with the path and name of your `.wav` file.
+
+The application recognizes speech from multiple participants in the conversation. Your audio file should contain multiple speakers.
+
+> [!NOTE]
+> The service performs best with at least 7 seconds of continuous audio from a single speaker. This allows the system to differentiate the speakers properly. Otherwise the Speaker ID is returned as `Unknown`.

+1. To change the speech recognition language, replace `en-US` with another [supported language](~/articles/cognitive-services/speech-service/supported-languages.md). For example, `es-ES` for Spanish (Spain). The default language is `en-US` if you don't specify a language. For details about how to identify one of multiple languages that might be spoken, see [language identification](~/articles/cognitive-services/speech-service/language-identification.md).

-[Build and run](/cpp/build/vscpp-step-2-build) your application to start conversation transcription:
+1. [Build and run](/cpp/build/vscpp-step-2-build) your application to start conversation transcription:

-> [!IMPORTANT]
-> Make sure that you set the `SPEECH_KEY` and `SPEECH_REGION` environment variables as described [above](#set-environment-variables). If you don't set these variables, the sample will fail with an error message.
+> [!IMPORTANT]
+> Make sure that you set the `SPEECH_KEY` and `SPEECH_REGION` [environment variables](#set-environment-variables). If you don't set these variables, the sample fails with an error message.

-The transcribed conversation should be output as text:
+The transcribed conversation should be output as text:

-```console
+```output
 TRANSCRIBED: Text=Good morning, Steve. Speaker ID=Unknown
 TRANSCRIBED: Text=Good morning. Katie. Speaker ID=Unknown
 TRANSCRIBED: Text=Have you tried the latest real time diarization in Microsoft Speech Service which can tell you who said what in real time? Speaker ID=Guest-1
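The `TRANSCRIBED:` lines in the expected output above are plain text, so they're easy to post-process. As an illustrative aside (not part of the quickstart), here's a minimal Python sketch that parses such lines into (speaker, text) pairs; the line format is taken from the sample output, and the function name is hypothetical:

```python
import re

# Matches the output format shown above, for example:
# TRANSCRIBED: Text=Good morning, Steve. Speaker ID=Unknown
LINE = re.compile(r"^TRANSCRIBED: Text=(?P<text>.*) Speaker ID=(?P<speaker>\S+)$")

def parse_transcript(lines):
    """Return (speaker, text) pairs from TRANSCRIBED output lines.

    Lines that don't match the expected format are skipped.
    """
    turns = []
    for line in lines:
        m = LINE.match(line.strip())
        if m:
            turns.append((m.group("speaker"), m.group("text")))
    return turns
```

This is only a sketch of working with the console output; in a real application you would read the speaker ID and text directly from the SDK's transcribed event instead of parsing printed lines.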

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/csharp.md (27 additions, 17 deletions)

@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 7/27/2023
+ms.date: 01/30/2024
 ms.author: eur
 ---

@@ -15,25 +15,32 @@ ms.author: eur
 [!INCLUDE [Prerequisites](../../common/azure-prerequisites.md)]

 ## Set up the environment
+
 The Speech SDK is available as a [NuGet package](https://www.nuget.org/packages/Microsoft.CognitiveServices.Speech) and implements .NET Standard 2.0. You install the Speech SDK later in this guide, but first check the [SDK installation guide](../../../quickstarts/setup-platform.md?pivots=programming-language-csharp) for any more requirements.

 ### Set environment variables

 [!INCLUDE [Environment variables](../../common/environment-variables.md)]

-## Diarization from file with conversation transcription
+## Implement diarization from file with conversation transcription
+
+Follow these steps to create a console application and install the Speech SDK.

-Follow these steps to create a new console application and install the Speech SDK.
+1. Open a command prompt window in the folder where you want the new project. Run this command to create a console application with the .NET CLI.

-1. Open a command prompt where you want the new project, and create a console application with the .NET CLI. The `Program.cs` file should be created in the project directory.
 ```dotnetcli
 dotnet new console
 ```
+
+This command creates the *Program.cs* file in your project directory.
+
 1. Install the Speech SDK in your new project with the .NET CLI.
+
 ```dotnetcli
 dotnet add package Microsoft.CognitiveServices.Speech
 ```
-1. Replace the contents of `Program.cs` with the following code.
+
+1. Replace the contents of `Program.cs` with the following code.

 ```csharp
 using Microsoft.CognitiveServices.Speech;
@@ -110,23 +117,27 @@ Follow these steps to create a new console application and install the Speech SD
 }
 ```

-1. Replace `katiesteve.wav` with the filepath and filename of your `.wav` file. The intent of this quickstart is to recognize speech from multiple participants in the conversation. Your audio file should contain multiple speakers. For example, you can use the [sample audio file](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/sampledata/audiofiles/katiesteve.wav) provided in the Speech SDK samples repository on GitHub.
-> [!NOTE]
-> The service performs best with at least 7 seconds of continuous audio from a single speaker. This allows the system to differentiate the speakers properly. Otherwise the Speaker ID is returned as `Unknown`.
-1. To change the speech recognition language, replace `en-US` with another [supported language](~/articles/cognitive-services/speech-service/supported-languages.md). For example, `es-ES` for Spanish (Spain). The default language is `en-US` if you don't specify a language. For details about how to identify one of multiple languages that might be spoken, see [language identification](~/articles/cognitive-services/speech-service/language-identification.md).
+1. Get the [sample audio file](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/sampledata/audiofiles/katiesteve.wav) or use your own `.wav` file. Replace `katiesteve.wav` with the path and name of your `.wav` file.

-Run your new console application to start conversation transcription:
+The application recognizes speech from multiple participants in the conversation. Your audio file should contain multiple speakers.

-```console
-dotnet run
-```
+> [!NOTE]
+> The service performs best with at least 7 seconds of continuous audio from a single speaker. This allows the system to differentiate the speakers properly. Otherwise the Speaker ID is returned as `Unknown`.
+
+1. To change the speech recognition language, replace `en-US` with another [supported language](~/articles/cognitive-services/speech-service/supported-languages.md). For example, `es-ES` for Spanish (Spain). The default language is `en-US` if you don't specify a language. For details about how to identify one of multiple languages that might be spoken, see [language identification](~/articles/cognitive-services/speech-service/language-identification.md).
+
+1. Run your console application to start conversation transcription:
+
+```dotnetcli
+dotnet run
+```

 > [!IMPORTANT]
-> Make sure that you set the `SPEECH_KEY` and `SPEECH_REGION` environment variables as described [above](#set-environment-variables). If you don't set these variables, the sample will fail with an error message.
+> Make sure that you set the `SPEECH_KEY` and `SPEECH_REGION` [environment variables](#set-environment-variables). If you don't set these variables, the sample fails with an error message.

-The transcribed conversation should be output as text:
+The transcribed conversation should be output as text:

-```console
+```output
 TRANSCRIBED: Text=Good morning, Steve. Speaker ID=Unknown
 TRANSCRIBED: Text=Good morning. Katie. Speaker ID=Unknown
 TRANSCRIBED: Text=Have you tried the latest real time diarization in Microsoft Speech Service which can tell you who said what in real time? Speaker ID=Guest-1
@@ -142,4 +153,3 @@ Speakers are identified as Guest-1, Guest-2, and so on, depending on the number
 ## Clean up resources

 [!INCLUDE [Delete resource](../../common/delete-resource.md)]
-
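The IMPORTANT note in each language's steps says the sample reads `SPEECH_KEY` and `SPEECH_REGION` from the environment and fails if they're missing. As a language-neutral illustration (the variable names come from the article; the helper function itself is hypothetical), the same fail-fast guard looks like this in Python:

```python
import os

def require_speech_config():
    """Fail fast if the Speech resource key/region variables are missing,
    mirroring the quickstart's IMPORTANT note."""
    key = os.environ.get("SPEECH_KEY")
    region = os.environ.get("SPEECH_REGION")
    if not key or not region:
        raise RuntimeError(
            "Set the SPEECH_KEY and SPEECH_REGION environment variables "
            "before running the sample."
        )
    return key, region
```

Checking both variables up front gives a clear error message instead of an opaque authentication failure from the service later.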
articles/ai-services/speech-service/includes/quickstarts/stt-diarization/intro.md (5 additions, 5 deletions)

@@ -2,16 +2,16 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 05/08/2023
+ms.date: 01/30/2024
 ms.author: eur
 ---

-In this quickstart, you run an application for speech to text transcription with real-time diarization. Here, diarization is distinguishing between the different speakers participating in the conversation. The Speech service provides information about which speaker was speaking a particular part of transcribed speech.
+In this quickstart, you run an application for speech to text transcription with real-time diarization. Diarization distinguishes between the different speakers who participate in the conversation. The Speech service provides information about which speaker was speaking a particular part of transcribed speech.

 > [!NOTE]
-> Real-time diarization is currently in public preview.
+> Real-time diarization is currently in public preview.

-The speaker information is included in the result in the speaker ID field. The speaker ID is a generic identifier assigned to each conversation participant by the service during the recognition as different speakers are being identified from the provided audio content.
+The speaker information is included in the result in the speaker ID field. The speaker ID is a generic identifier assigned to each conversation participant by the service during the recognition as different speakers are being identified from the provided audio content.

 > [!TIP]
-> You can try real-time speech to text in [Speech Studio](https://aka.ms/speechstudio/speechtotexttool) without signing up or writing any code. However, the Speech Studio doesn't yet support diarization.
+> You can try real-time speech to text in [Speech Studio](https://aka.ms/speechstudio/speechtotexttool) without signing up or writing any code. However, the Speech Studio doesn't yet support diarization.

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/java.md (25 additions, 16 deletions)

@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 7/27/2023
+ms.date: 01/30/2024
 ms.author: eur
 ---

@@ -16,10 +16,12 @@ ms.author: eur

 ## Set up the environment

-Before you can do anything, you need to install the Speech SDK. The sample in this quickstart works with the [Java Runtime](~/articles/cognitive-services/speech-service/quickstarts/setup-platform.md?pivots=programming-language-java&tabs=jre).
+To set up your environment, [install the Speech SDK](~/articles/ai-services/speech-service/quickstarts/setup-platform.md?pivots=programming-language-java&tabs=jre). The sample in this quickstart works with the [Java Runtime](~/articles/cognitive-services/speech-service/quickstarts/setup-platform.md?pivots=programming-language-java&tabs=jre).

 1. Install [Apache Maven](https://maven.apache.org/install.html). Then run `mvn -v` to confirm successful installation.
+
 1. Create a new `pom.xml` file in the root of your project, and copy the following into it:
+
 ```xml
 <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 <modelVersion>4.0.0</modelVersion>
@@ -48,7 +50,9 @@ Before you can do anything, you need to install the Speech SDK. The sample in th
 </dependencies>
 </project>
 ```
+
 1. Install the Speech SDK and dependencies.
+
 ```console
 mvn clean dependency:copy-dependencies
 ```
@@ -57,11 +61,12 @@ Before you can do anything, you need to install the Speech SDK. The sample in th

 [!INCLUDE [Environment variables](../../common/environment-variables.md)]

-## Diarization from file with conversation transcription
+## Implement diarization from file with conversation transcription

-Follow these steps to create a new console application for conversation transcription.
+Follow these steps to create a console application for conversation transcription.

 1. Create a new file named `ConversationTranscription.java` in the same project root directory.
+
 1. Copy the following code into `ConversationTranscription.java`:

 ```java
@@ -139,24 +144,28 @@ Follow these steps to create a new console application for conversation transcri
 }
 ```

-1. Replace `katiesteve.wav` with the filepath and filename of your `.wav` file. The intent of this quickstart is to recognize speech from multiple participants in the conversation. Your audio file should contain multiple speakers. For example, you can use the [sample audio file](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/sampledata/audiofiles/katiesteve.wav) provided in the Speech SDK samples repository on GitHub.
-> [!NOTE]
-> The service performs best with at least 7 seconds of continuous audio from a single speaker. This allows the system to differentiate the speakers properly. Otherwise the Speaker ID is returned as `Unknown`.
-1. To change the speech recognition language, replace `en-US` with another [supported language](~/articles/cognitive-services/speech-service/supported-languages.md). For example, `es-ES` for Spanish (Spain). The default language is `en-US` if you don't specify a language. For details about how to identify one of multiple languages that might be spoken, see [language identification](~/articles/cognitive-services/speech-service/language-identification.md).
+1. Get the [sample audio file](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/sampledata/audiofiles/katiesteve.wav) or use your own `.wav` file. Replace `katiesteve.wav` with the path and name of your `.wav` file.

-Run your new console application to start conversation transcription:
+The application recognizes speech from multiple participants in the conversation. Your audio file should contain multiple speakers.

-```console
-javac ConversationTranscription.java -cp ".;target\dependency\*"
-java -cp ".;target\dependency\*" ConversationTranscription
-```
+> [!NOTE]
+> The service performs best with at least 7 seconds of continuous audio from a single speaker. This allows the system to differentiate the speakers properly. Otherwise the Speaker ID is returned as `Unknown`.
+
+1. To change the speech recognition language, replace `en-US` with another [supported language](~/articles/cognitive-services/speech-service/supported-languages.md). For example, `es-ES` for Spanish (Spain). The default language is `en-US` if you don't specify a language. For details about how to identify one of multiple languages that might be spoken, see [language identification](~/articles/cognitive-services/speech-service/language-identification.md).
+
+1. Run your new console application to start conversation transcription:
+
+```console
+javac ConversationTranscription.java -cp ".;target\dependency\*"
+java -cp ".;target\dependency\*" ConversationTranscription
+```

 > [!IMPORTANT]
-> Make sure that you set the `SPEECH_KEY` and `SPEECH_REGION` environment variables as described [above](#set-environment-variables). If you don't set these variables, the sample will fail with an error message.
+> Make sure that you set the `SPEECH_KEY` and `SPEECH_REGION` [environment variables](#set-environment-variables). If you don't set these variables, the sample fails with an error message.

-The transcribed conversation should be output as text:
+The transcribed conversation should be output as text:

-```console
+```output
 TRANSCRIBED: Text=Good morning, Steve. Speaker ID=Unknown
 TRANSCRIBED: Text=Good morning. Katie. Speaker ID=Unknown
 TRANSCRIBED: Text=Have you tried the latest real time diarization in Microsoft Speech Service which can tell you who said what in real time? Speaker ID=Guest-1
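The docs note that speakers are identified as Guest-1, Guest-2, and so on, and the commit's stated customer intent is improving the readability of multi-person conversations. As a hedged, illustrative sketch of that idea (not part of the quickstart; it assumes transcription turns are already available as (speaker ID, text) pairs), consecutive turns from the same speaker can be merged into readable blocks:

```python
from itertools import groupby

def format_by_speaker(turns):
    """Merge consecutive turns from the same speaker into one block.

    `turns` is a list of (speaker_id, text) pairs in conversation order.
    Returns lines like "Guest-1: Hello. How are you?".
    """
    blocks = []
    for speaker, group in groupby(turns, key=lambda t: t[0]):
        text = " ".join(t[1] for t in group)
        blocks.append(f"{speaker}: {text}")
    return blocks
```

Because `groupby` only merges adjacent items, alternating speakers stay in distinct blocks while back-to-back turns from one speaker collapse into a single line.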

Comments (0)