> [!NOTE]
> This feature is currently in public preview. This preview is provided without a service-level agreement, and is not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
This article shows how to get asynchronous conversation transcription results by using the **RemoteMeetingTranscriptionClient** API. If you configured conversation transcription for asynchronous transcription and have a `meetingId`, you can use this API to obtain the transcription associated with that `meetingId`.
> [!IMPORTANT]
> Conversation transcription multichannel diarization (preview) is retiring on March 28, 2025. For more information about migrating to other speech to text features, see [Migrate away from conversation transcription multichannel diarization](#migrate-away-from-conversation-transcription-multichannel-diarization).
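The **RemoteMeetingTranscriptionClient** API does the retrieval for you in supported SDK languages, but the underlying poll-until-complete pattern is worth seeing on its own. The following Python sketch illustrates only that pattern; `fetch_status` and the result dictionary shape are hypothetical stand-ins for the client's status call, not the real API surface:

```python
import time

def wait_for_transcription(meeting_id, fetch_status, poll_interval_s=1.0, timeout_s=30.0):
    """Poll until the asynchronous transcription for meeting_id is ready.

    fetch_status(meeting_id) is a caller-supplied (hypothetical) callable that
    returns a dict such as {"status": "InProgress"} or
    {"status": "Completed", "phrases": [...]}.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = fetch_status(meeting_id)
        if result["status"] == "Completed":
            return result
        if result["status"] == "Failed":
            raise RuntimeError(f"Transcription failed for meeting {meeting_id}")
        time.sleep(poll_interval_s)
    raise TimeoutError(f"Transcription for meeting {meeting_id} not ready in time")
```

In the real client you would pass your `meetingId` and speech configuration instead of a callable; the sketch only shows why a timeout and a terminal-failure check belong in any polling loop.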
## Asynchronous vs. real-time + asynchronous
::: zone-end
## Related content
- [Try the real-time diarization quickstart](get-started-stt-diarization.md)
- [Try batch transcription with diarization](batch-transcription.md)
title: Real-time conversation transcription quickstart - Speech service
titleSuffix: Azure AI services
description: In this quickstart, learn how to transcribe meetings. You can add, remove, and identify multiple participants by streaming audio to the Speech service.
> [!NOTE]
> This feature is currently in public preview. This preview is provided without a service-level agreement, and is not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
You can transcribe meetings with the ability to add, remove, and identify multiple participants by streaming audio to the Speech service. You first create voice signatures for each participant using the REST API, and then use the voice signatures with the Speech SDK to transcribe meetings. See the conversation transcription [overview](meeting-transcription.md) for more information.
> [!IMPORTANT]
> Conversation transcription multichannel diarization (preview) is retiring on March 28, 2025. For more information about migrating to other speech to text features, see [Migrate away from conversation transcription multichannel diarization](#migrate-away-from-conversation-transcription-multichannel-diarization).
## Limitations
* Only available in the following subscription regions: `centralus`, `eastasia`, `eastus`, `westeurope`
* Requires a 7-mic circular multi-microphone array. The microphone array should meet [our specification](./speech-sdk-microphone.md).
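Because the feature is limited to the four regions above, it can help to validate your Speech resource's region before creating a session. A minimal sketch, using the region list from the limitations above (`check_region` is an illustrative helper, not part of the Speech SDK):

```python
# Regions where this preview feature is available, per the limitations above.
SUPPORTED_REGIONS = {"centralus", "eastasia", "eastus", "westeurope"}

def check_region(region: str) -> str:
    """Validate a Speech resource region before starting transcription."""
    normalized = region.strip().lower()
    if normalized not in SUPPORTED_REGIONS:
        raise ValueError(
            f"Region '{region}' doesn't support this feature; "
            f"use one of: {', '.join(sorted(SUPPORTED_REGIONS))}"
        )
    return normalized
```

Failing fast with a clear message is cheaper than debugging an opaque service error after connecting.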
> [!IMPORTANT]
> For the conversation transcription multichannel diarization feature, use `MeetingTranscriber` instead of `ConversationTranscriber`, and use `CreateMeetingAsync` instead of `CreateConversationAsync`. A newer "conversation transcription" feature is available that doesn't use user profiles or voice signatures. For more information, see the [release notes](releasenotes.md?tabs=speech-sdk).
description: You use the conversation transcription feature for meetings. It combines recognition, speaker ID, and diarization to provide transcription of any meeting.
# What is conversation transcription multichannel diarization? (preview)
> [!NOTE]
> This feature is currently in public preview. This preview is provided without a service-level agreement, and is not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
Conversation transcription multichannel diarization is a [speech to text](speech-to-text.md) solution that provides real-time or asynchronous transcription of any meeting. This feature combines speech recognition, speaker identification, and sentence attribution to determine who said what, and when, in a meeting.
> [!IMPORTANT]
> Conversation transcription multichannel diarization (preview) is retiring on March 28, 2025. For more information about migrating to other speech to text features, see [Migrate away from conversation transcription multichannel diarization](#migrate-away-from-conversation-transcription-multichannel-diarization).
## Migrate away from conversation transcription multichannel diarization
Conversation transcription multichannel diarization (preview) is retiring on March 28, 2025.
To continue using speech to text with diarization, use the following features instead:
- [Real-time speech to text with diarization](get-started-stt-diarization.md)
- [Batch transcription with diarization](batch-transcription.md)
These speech to text features only support diarization for single-channel audio. Multichannel audio that you used with conversation transcription multichannel diarization isn't supported.
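For the batch transcription migration path, the job is created with a single REST call. The sketch below assembles the kind of request URL and JSON body that the Speech to text REST API (v3.1 `transcriptions` endpoint) accepts; the endpoint path and property names are taken from memory of that API, so verify them against the batch transcription reference before relying on them:

```python
import json

def build_batch_transcription_request(region, content_urls, locale="en-US",
                                      display_name="diarized transcription"):
    """Assemble the URL and JSON body for creating a batch transcription job.

    The endpoint and property names mirror the Speech to text REST API (v3.1);
    confirm them against the current batch transcription docs before use.
    """
    url = f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions"
    body = {
        "contentUrls": list(content_urls),   # audio files reachable by the service
        "locale": locale,
        "displayName": display_name,
        "properties": {
            # Enable speaker separation (diarization) on single-channel audio.
            "diarizationEnabled": True,
        },
    }
    return url, json.dumps(body)
```

You would POST this body with your `Ocp-Apim-Subscription-Key` header, then poll the returned transcription resource until it completes.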
## Key features
You might find the following features of conversation transcription useful:
- **Timestamps:** Each speaker utterance has a timestamp, so that you can easily find when a phrase was said.
- **Readable transcripts:** Transcripts have formatting and punctuation added automatically to ensure the text closely matches what was said.
- **Asynchronous transcription:** Provide transcripts with higher accuracy by using a multichannel audio stream.
> [!NOTE]
> Although conversation transcription doesn't put a limit on the number of speakers in the room, it's optimized for 2-10 speakers per session.
## Use cases
To make meetings inclusive for everyone, including participants who are deaf or hard of hearing, it's important to have transcription in real time. Conversation transcription in real-time mode takes meeting audio and determines who is saying what, so all meeting participants can follow the transcript and participate in the meeting without delay.
Meeting participants can focus on the meeting and leave note-taking to conversation transcription. Participants can actively engage in the meeting and quickly follow up on next steps, using the transcript instead of taking notes and potentially missing something during the meeting.
## How it works
The following diagram shows a high-level overview of how the feature works.

## Expected inputs
Conversation transcription uses two types of inputs:
- **Multi-channel audio stream:** For specification and design details, see [Microphone array recommendations](./speech-sdk-microphone.md).
- **User voice samples:** Conversation transcription needs user profiles in advance of the conversation for speaker identification. Collect audio recordings from each user, and then send the recordings to the [signature generation service](https://aka.ms/cts/signaturegenservice) to validate the audio and generate user profiles.
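The signature generation call itself is a plain HTTPS POST of the user's audio. This Python sketch shows the shape of that call under stated assumptions: the endpoint path is recalled from the meeting transcription samples and should be verified against the signature generation service docs, and the transport is injected as a `post` callable so the sketch stays library-agnostic:

```python
def generate_voice_signature(region, subscription_key, wav_bytes, post):
    """Send one user's audio sample to the signature generation service.

    Assumption: the endpoint path below (GenerateVoiceSignatureFromFormData)
    matches the meeting transcription samples; confirm it before relying on it.
    `post` is an injected callable post(url, headers, data) -> parsed response,
    so any HTTP library (urllib, requests, ...) can be plugged in.
    """
    url = (f"https://signature.{region}.cts.speech.microsoft.com"
           "/api/v1/Signature/GenerateVoiceSignatureFromFormData")
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    # The service validates the audio and returns a JSON voice signature;
    # pass that signature to the Speech SDK when creating a participant.
    return post(url, headers=headers, data=wav_bytes)
```

Injecting the transport also makes the helper trivially testable without network access.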
User voice samples for voice signatures are required for speaker identification. Speakers who don't have voice samples are recognized as *unidentified*. Unidentified speakers can still be differentiated when the `DifferentiateGuestSpeakers` property is enabled (see the following example). The transcription output then shows speakers as, for example, *Guest_0* and *Guest_1*, instead of recognizing them as pre-enrolled specific speaker names.
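The `Guest_0`/`Guest_1` convention described above is simple to reason about: each distinct unidentified speaker gets a stable placeholder in order of first appearance. The following Python sketch reproduces that labeling over plain `(speaker_id, text)` pairs; it's an illustration of the output convention, not Speech SDK code:

```python
def label_speakers(utterances, enrolled):
    """Label each utterance with an enrolled name or a Guest_N placeholder.

    `utterances` is a list of (speaker_id, text) pairs standing in for
    transcription results; `enrolled` maps speaker IDs to pre-enrolled names.
    The Guest_0/Guest_1 naming mirrors the output described above when
    `DifferentiateGuestSpeakers` is enabled.
    """
    guest_labels = {}
    labeled = []
    for speaker_id, text in utterances:
        if speaker_id in enrolled:
            name = enrolled[speaker_id]
        else:
            # Unidentified speakers are still differentiated from one another:
            # each new unknown ID gets the next Guest_N label.
            if speaker_id not in guest_labels:
                guest_labels[speaker_id] = f"Guest_{len(guest_labels)}"
            name = guest_labels[speaker_id]
        labeled.append((name, text))
    return labeled
```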
The following sections provide more detail about transcription modes you can choose.
Audio data is processed live to return the speaker identifier and transcript, and, in addition, requests a high-accuracy transcript through asynchronous processing. Select this mode if your application has a need for real-time transcription, and also requires a higher accuracy transcript for use after the meeting occurred.
## Language and region support
Currently, conversation transcription supports [all speech to text languages](language-support.md?tabs=stt) in the following regions: `centralus`, `eastasia`, `eastus`, `westeurope`.