
Commit 362ce99

Team aligned on real-time hyphen
1 parent 93bfa53 commit 362ce99

30 files changed, +38 −38 lines changed

articles/cognitive-services/Speech-Service/audio-processing-overview.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ The Microsoft Audio Stack also powers a wide range of Microsoft products:
 ## Speech SDK integration
 
 The Speech SDK integrates Microsoft Audio Stack (MAS), allowing any application or product to use its audio processing capabilities on input audio. Some of the key Microsoft Audio Stack features available via the Speech SDK include:
-* **Realtime microphone input & file input** - Microsoft Audio Stack processing can be applied to real-time microphone input, streams, and file-based input.
+* **Real-time microphone input & file input** - Microsoft Audio Stack processing can be applied to real-time microphone input, streams, and file-based input.
 * **Selection of enhancements** - To allow for full control of your scenario, the SDK allows you to disable individual enhancements like dereverberation, noise suppression, automatic gain control, and acoustic echo cancellation. For example, if your scenario does not include rendering output audio that needs to be suppressed from the input audio, you have the option to disable acoustic echo cancellation.
 * **Custom microphone geometries** - The SDK allows you to provide your own custom microphone geometry information, in addition to supporting preset geometries like linear two-mic, linear four-mic, and circular 7-mic arrays (see more information on supported preset geometries at [Microphone array recommendations](speech-sdk-microphone.md#microphone-geometry)).
 * **Beamforming angles** - Specific beamforming angles can be provided to optimize audio input originating from a predetermined location, relative to the microphones.
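For context on this file's subject: the MAS features listed above are switched on through the Speech SDK's audio processing options. A minimal C# sketch that enables the default enhancement set on real-time microphone input, assuming placeholder credentials:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class MasSketch
{
    static async Task Main()
    {
        // Placeholder credentials; substitute your own key and region.
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // Enable Microsoft Audio Stack with the default enhancement set
        // (noise suppression, echo cancellation, gain control, dereverberation).
        var processingOptions = AudioProcessingOptions.Create(
            AudioProcessingConstants.AUDIO_INPUT_PROCESSING_ENABLE_DEFAULT);

        // Apply MAS to real-time microphone input.
        var audioConfig = AudioConfig.FromDefaultMicrophoneInput(processingOptions);

        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine($"Recognized: {result.Text}");
    }
}
```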

articles/cognitive-services/Speech-Service/batch-synthesis.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ The Batch synthesis API (Preview) can synthesize a large volume of text input (l
 > [!IMPORTANT]
 > The Batch synthesis API is currently in public preview. Once it's generally available, the Long Audio API will be deprecated. For more information, see [Migrate to batch synthesis API](migrate-to-batch-synthesis.md).
 
-The batch synthesis API is asynchronous and doesn't return synthesized audio in real time. You submit text files to be synthesized, poll for the status, and download the audio output when the status indicates success. The text inputs must be plain text or [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md) text.
+The batch synthesis API is asynchronous and doesn't return synthesized audio in real-time. You submit text files to be synthesized, poll for the status, and download the audio output when the status indicates success. The text inputs must be plain text or [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md) text.
 
 This diagram provides a high-level overview of the workflow.
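The workflow this file describes follows the familiar submit, poll, download pattern for long-running operations. A rough C# sketch of that pattern; the endpoint path, JSON field names, and status values here are illustrative assumptions, not the documented contract, so consult the batch synthesis REST reference for the real shapes:

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class BatchSynthesisSketch
{
    static async Task Main()
    {
        // Hypothetical endpoint for illustration only.
        var endpoint = "https://YourRegion.example/batchsynthesis";

        using var http = new HttpClient();
        http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YourSubscriptionKey");

        // 1. Submit the text (plain text or SSML) to be synthesized.
        var body = JsonSerializer.Serialize(new { displayName = "sample", inputs = new[] { new { text = "Hello, world." } } });
        var created = await http.PostAsync(endpoint, new StringContent(body, Encoding.UTF8, "application/json"));
        var jobUrl = created.Headers.Location?.ToString() ?? endpoint + "/YourJobId";

        // 2. Poll for status until the job leaves its in-progress states
        //    (the state names here are assumptions).
        string? status;
        do
        {
            await Task.Delay(TimeSpan.FromSeconds(10));
            using var doc = JsonDocument.Parse(await http.GetStringAsync(jobUrl));
            status = doc.RootElement.GetProperty("status").GetString();
        } while (status is "NotStarted" or "Running");

        // 3. When the status indicates success, download the audio output
        //    from the results URL the service returns.
        Console.WriteLine($"Job finished with status: {status}");
    }
}
```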

articles/cognitive-services/Speech-Service/captioning-concepts.md

Lines changed: 4 additions & 4 deletions
@@ -36,7 +36,7 @@ The following are aspects to consider when using captioning:
 >
 > Try the [Azure Video Indexer](../../azure-video-indexer/video-indexer-overview.md) as a demonstration of how you can get captions for videos that you upload.
 
-Captioning can accompany real time or pre-recorded speech. Whether you're showing captions in real time or with a recording, you can use the [Speech SDK](speech-sdk.md) or [Speech CLI](spx-overview.md) to recognize speech and get transcriptions. You can also use the [Batch transcription API](batch-transcription.md) for pre-recorded video.
+Captioning can accompany real-time or pre-recorded speech. Whether you're showing captions in real-time or with a recording, you can use the [Speech SDK](speech-sdk.md) or [Speech CLI](spx-overview.md) to recognize speech and get transcriptions. You can also use the [Batch transcription API](batch-transcription.md) for pre-recorded video.
 
 ## Caption output format
 
@@ -68,13 +68,13 @@ Welcome to applied Mathematics course 201.
 
 ## Input audio to the Speech service
 
-For real time captioning, use a microphone or audio input stream instead of file input. For examples of how to recognize speech from a microphone, see the [Speech to text quickstart](get-started-speech-to-text.md) and [How to recognize speech](how-to-recognize-speech.md) documentation. For more information about streaming, see [How to use the audio input stream](how-to-use-audio-input-streams.md).
+For real-time captioning, use a microphone or audio input stream instead of file input. For examples of how to recognize speech from a microphone, see the [Speech to text quickstart](get-started-speech-to-text.md) and [How to recognize speech](how-to-recognize-speech.md) documentation. For more information about streaming, see [How to use the audio input stream](how-to-use-audio-input-streams.md).
 
 For captioning of a prerecording, send file input to the Speech service. For more information, see [How to use compressed input audio](how-to-use-codec-compressed-audio-input-streams.md).
 
 ## Caption and speech synchronization
 
-You'll want to synchronize captions with the audio track, whether it's done in real time or with a prerecording.
+You'll want to synchronize captions with the audio track, whether it's done in real-time or with a prerecording.
 
 The Speech service returns the offset and duration of the recognized speech.
 
@@ -91,7 +91,7 @@ Consider when to start displaying captions, and how many words to show at a time
 
 For captioning of prerecorded speech or wherever latency isn't a concern, you could wait for the complete transcription of each utterance before displaying any words. Given the final offset and duration of each word in an utterance, you know when to show subsequent words at pace with the soundtrack.
 
-Real time captioning presents tradeoffs with respect to latency versus accuracy. You could show the text from each `Recognizing` event as soon as possible. However, if you can accept some latency, you can improve the accuracy of the caption by displaying the text from the `Recognized` event. There's also some middle ground, which is referred to as "stable partial results".
+Real-time captioning presents tradeoffs with respect to latency versus accuracy. You could show the text from each `Recognizing` event as soon as possible. However, if you can accept some latency, you can improve the accuracy of the caption by displaying the text from the `Recognized` event. There's also some middle ground, which is referred to as "stable partial results".
 
 You can request that the Speech service return fewer `Recognizing` events that are more accurate. This is done by setting the `SpeechServiceResponse_StablePartialResultThreshold` property to a value between `0` and `2147483647`. The value that you set is the number of times a word has to be recognized before the Speech service returns a `Recognizing` event. For example, if you set the `SpeechServiceResponse_StablePartialResultThreshold` property value to `5`, the Speech service will affirm recognition of a word at least five times before returning the partial results to you with a `Recognizing` event.
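The stable partial results behavior covered in this file is driven by a single property. A minimal C# sketch with placeholder credentials, showing the threshold alongside the `Recognizing` and `Recognized` events it affects:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class StablePartialsSketch
{
    static async Task Main()
    {
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // Require each word to be affirmed five times before a Recognizing
        // event is returned: fewer, more stable partial captions.
        speechConfig.SetProperty(PropertyId.SpeechServiceResponse_StablePartialResultThreshold, "5");

        var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Lower latency, lower accuracy: partial captions.
        recognizer.Recognizing += (s, e) => Console.WriteLine($"RECOGNIZING: {e.Result.Text}");
        // Higher latency, higher accuracy: the final caption for the utterance.
        recognizer.Recognized += (s, e) => Console.WriteLine($"RECOGNIZED: {e.Result.Text}");

        await recognizer.StartContinuousRecognitionAsync();
        Console.WriteLine("Speak into the microphone; press Enter to stop.");
        Console.ReadLine();
        await recognizer.StopContinuousRecognitionAsync();
    }
}
```

Raising the threshold trades latency for stability, as the article describes: a higher value means fewer `Recognizing` events, each less likely to be revised later.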

articles/cognitive-services/Speech-Service/conversation-transcription.md

Lines changed: 2 additions & 2 deletions
@@ -41,7 +41,7 @@ See the real-time conversation transcription [quickstart](how-to-use-conversatio
 
 ## Use cases
 
-To make meetings inclusive for everyone, such as participants who are deaf and hard of hearing, it's important to have transcription in real time. Conversation transcription in real-time mode takes meeting audio and determines who is saying what, allowing all meeting participants to follow the transcript and participate in the meeting, without a delay.
+To make meetings inclusive for everyone, such as participants who are deaf and hard of hearing, it's important to have transcription in real-time. Conversation transcription in real-time mode takes meeting audio and determines who is saying what, allowing all meeting participants to follow the transcript and participate in the meeting, without a delay.
 
 Meeting participants can focus on the meeting and leave note-taking to conversation transcription. Participants can actively engage in the meeting and quickly follow up on next steps, using the transcript instead of taking notes and potentially missing something during the meeting.
 
@@ -90,4 +90,4 @@ Currently, conversation transcription supports [all speech-to-text languages](la
 ## Next steps
 
 > [!div class="nextstepaction"]
-> [Transcribe conversations in real time](how-to-use-conversation-transcription.md)
+> [Transcribe conversations in real-time](how-to-use-conversation-transcription.md)

articles/cognitive-services/Speech-Service/custom-neural-voice.md

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ Here's an overview of the steps to create a custom neural voice in Speech Studio
 1. [Test your voice](how-to-custom-voice-create-voice.md#test-your-voice-model). Prepare test scripts for your voice model that cover the different use cases for your apps. It’s a good idea to use scripts within and outside the training dataset, so you can test the quality more broadly for different content.
 1. [Deploy and use your voice model](how-to-deploy-and-use-endpoint.md) in your apps.
 
-You can tune, adjust, and use your custom voice, similarly as you would use a prebuilt neural voice. Convert text into speech in real time, or generate audio content offline with text input. You can do this by using the [REST API](./rest-text-to-speech.md), the [Speech SDK](./get-started-text-to-speech.md), or the [Speech Studio](https://speech.microsoft.com/audiocontentcreation).
+You can tune, adjust, and use your custom voice, similarly as you would use a prebuilt neural voice. Convert text into speech in real-time, or generate audio content offline with text input. You can do this by using the [REST API](./rest-text-to-speech.md), the [Speech SDK](./get-started-text-to-speech.md), or the [Speech Studio](https://speech.microsoft.com/audiocontentcreation).
 
 The style and the characteristics of the trained voice model depend on the style and the quality of the recordings from the voice talent used for training. However, you can make several adjustments by using [SSML (Speech Synthesis Markup Language)](./speech-synthesis-markup.md?tabs=csharp) when you make the API calls to your voice model to generate synthetic speech. SSML is the markup language used to communicate with the text-to-speech service to convert text into audio. The adjustments you can make include change of pitch, rate, intonation, and pronunciation correction. If the voice model is built with multiple styles, you can also use SSML to switch the styles.
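The SSML adjustments this file mentions (pitch, rate, and so on) are passed straight to the synthesizer. A minimal C# sketch; the voice name, endpoint ID, and prosody values are placeholders:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class CustomVoiceSketch
{
    static async Task Main()
    {
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        // For a deployed custom neural voice, point the config at your endpoint.
        speechConfig.EndpointId = "YourCustomVoiceEndpointId";

        // "YourCustomVoiceName" and the prosody values are placeholders.
        var ssml = @"<speak version='1.0' xml:lang='en-US'>
  <voice name='YourCustomVoiceName'>
    <prosody pitch='+5%' rate='-10%'>
      This sentence is rendered slightly higher and slower than the default.
    </prosody>
  </voice>
</speak>";

        using var synthesizer = new SpeechSynthesizer(speechConfig);
        var result = await synthesizer.SpeakSsmlAsync(ssml);
        Console.WriteLine($"Synthesis result: {result.Reason}");
    }
}
```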

articles/cognitive-services/Speech-Service/gaming-concepts.md

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ It's not unusual that players in the same game session natively speak different
 For an example, see the [Speech translation quickstart](get-started-speech-translation.md).
 
 > [!NOTE]
-> Besides the Speech service, you can also use the [Translator service](../translator/translator-overview.md). To execute text translation between supported source and target languages in real time see [Text translation](../translator/text-translation-overview.md).
+> Besides the Speech service, you can also use the [Translator service](../translator/translator-overview.md). To execute text translation between supported source and target languages in real-time see [Text translation](../translator/text-translation-overview.md).
 
 ## Next steps
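The translation flow this file references comes down to configuring source and target languages on a translation recognizer. A minimal C# sketch with placeholder credentials:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Translation;

class GameChatTranslationSketch
{
    static async Task Main()
    {
        var translationConfig = SpeechTranslationConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        translationConfig.SpeechRecognitionLanguage = "en-US";

        // Translate each utterance into two target languages at once.
        translationConfig.AddTargetLanguage("de");
        translationConfig.AddTargetLanguage("ja");

        using var recognizer = new TranslationRecognizer(translationConfig);
        var result = await recognizer.RecognizeOnceAsync();

        Console.WriteLine($"Heard: {result.Text}");
        foreach (var (language, translation) in result.Translations)
            Console.WriteLine($"{language}: {translation}");
    }
}
```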

articles/cognitive-services/Speech-Service/how-to-async-conversation-transcription.md

Lines changed: 2 additions & 2 deletions
@@ -19,9 +19,9 @@ In this article, asynchronous Conversation Transcription is demonstrated using t
 
 ## Asynchronous vs. real-time + asynchronous
 
-With asynchronous transcription, you stream the conversation audio, but don't need a transcription returned in real time. Instead, after the audio is sent, use the `conversationId` of `Conversation` to query for the status of the asynchronous transcription. When the asynchronous transcription is ready, you'll get a `RemoteConversationTranscriptionResult`.
+With asynchronous transcription, you stream the conversation audio, but don't need a transcription returned in real-time. Instead, after the audio is sent, use the `conversationId` of `Conversation` to query for the status of the asynchronous transcription. When the asynchronous transcription is ready, you'll get a `RemoteConversationTranscriptionResult`.
 
-With real-time plus asynchronous, you get the transcription in real time, but also get the transcription by querying with the `conversationId` (similar to asynchronous scenario).
+With real-time plus asynchronous, you get the transcription in real-time, but also get the transcription by querying with the `conversationId` (similar to asynchronous scenario).
 
 Two steps are required to accomplish asynchronous transcription. The first step is to upload the audio, choosing either asynchronous only or real-time plus asynchronous. The second step is to get the transcription results.
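The second step (fetching results by `conversationId`) looks roughly like the sketch below. It assumes the remote conversation client from the Speech SDK's remote conversation package; the method names follow the article's sample, but verify the exact signatures against the SDK reference:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.RemoteConversation;

class AsyncTranscriptionSketch
{
    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // Query with the conversationId captured when the audio was uploaded.
        var client = new RemoteConversationTranscriptionClient(config);
        RemoteConversationTranscriptionOperation operation =
            client.GetTranscriptionOperation("YourConversationId");

        // Poll until the asynchronous transcription is ready.
        await operation.WaitForCompletionAsync(TimeSpan.FromSeconds(10), CancellationToken.None);

        // operation.Value is the RemoteConversationTranscriptionResult.
        foreach (var result in operation.Value.ConversationTranscriptionResults)
        {
            Console.WriteLine($"{result.UserId}: {result.Text}");
        }
    }
}
```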

articles/cognitive-services/Speech-Service/how-to-audio-content-creation.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ You can use the [Audio Content Creation](https://speech.microsoft.com/portal/aud
 
 Build highly natural audio content for a variety of scenarios, such as audiobooks, news broadcasts, video narrations, and chat bots. With Audio Content Creation, you can efficiently fine-tune text-to-speech voices and design customized audio experiences.
 
-The tool is based on [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md). It allows you to adjust text-to-speech output attributes in real time or batch synthesis, such as voice characters, voice styles, speaking speed, pronunciation, and prosody.
+The tool is based on [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md). It allows you to adjust text-to-speech output attributes in real-time or batch synthesis, such as voice characters, voice styles, speaking speed, pronunciation, and prosody.
 
 - No-code approach: You can use the Audio Content Creation tool for text-to-speech synthesis without writing any code. The output audio might be the final deliverable that you want. For example, you can use the output audio for a podcast or a video narration.
 - Developer-friendly: You can listen to the output audio and adjust the SSML to improve speech synthesis. Then you can use the [Speech SDK](speech-sdk.md) or [Speech CLI](spx-basics.md) to integrate the SSML into your applications. For example, you can use the SSML for building a chat bot.
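Per the developer-friendly bullet, SSML tuned in the tool can be played back through the Speech SDK unchanged. A short C# sketch; the voice name and speaking style below are examples:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class AudioContentCreationSketch
{
    static async Task Main()
    {
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // SSML exported from the Audio Content Creation tool can be used as-is.
        // The voice name and the express-as style here are examples.
        var ssml = @"<speak version='1.0' xmlns:mstts='http://www.w3.org/2001/mstts' xml:lang='en-US'>
  <voice name='en-US-JennyNeural'>
    <mstts:express-as style='cheerful'>
      Welcome back! Here is today's news summary.
    </mstts:express-as>
  </voice>
</speak>";

        using var synthesizer = new SpeechSynthesizer(speechConfig);
        var result = await synthesizer.SpeakSsmlAsync(ssml);
        Console.WriteLine($"Synthesis result: {result.Reason}");
    }
}
```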

articles/cognitive-services/Speech-Service/includes/how-to/conversation-transcription/real-time-csharp.md

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ Running the function `GetVoiceSignatureString()` returns a voice signature strin
 
 ## Transcribe conversations
 
-The following sample code demonstrates how to transcribe conversations in real time for two speakers. It assumes you've already created voice signature strings for each speaker as shown above. Substitute real information for `subscriptionKey`, `region`, and the path `filepath` for the audio you want to transcribe.
+The following sample code demonstrates how to transcribe conversations in real-time for two speakers. It assumes you've already created voice signature strings for each speaker as shown above. Substitute real information for `subscriptionKey`, `region`, and the path `filepath` for the audio you want to transcribe.
 
 If you don't use pre-enrolled user profiles, it will take a few more seconds to complete the first recognition of unknown users as speaker1, speaker2, etc.
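In outline, the sample this file documents joins a `ConversationTranscriber` to a `Conversation` and adds participants with their voice signatures. A condensed C# sketch; `subscriptionKey`, `region`, `filepath`, and the signature strings are the placeholders the page itself calls out:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Transcription;

class TranscribeConversationSketch
{
    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription("subscriptionKey", "region");

        using var audioInput = AudioConfig.FromWavFileInput("filepath");
        using var conversation = await Conversation.CreateConversationAsync(config, Guid.NewGuid().ToString());
        using var transcriber = new ConversationTranscriber(audioInput);
        await transcriber.JoinConversationAsync(conversation);

        // Voice signature strings created as shown earlier on the page.
        await conversation.AddParticipantAsync(Participant.From("User1", "en-US", "voiceSignatureUser1"));
        await conversation.AddParticipantAsync(Participant.From("User2", "en-US", "voiceSignatureUser2"));

        // Each Transcribed event carries the recognized text and the speaker.
        transcriber.Transcribed += (s, e) =>
            Console.WriteLine($"{e.Result.UserId}: {e.Result.Text}");

        await transcriber.StartTranscribingAsync();
        await Task.Delay(TimeSpan.FromSeconds(30)); // let the file play through
        await transcriber.StopTranscribingAsync();
    }
}
```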

articles/cognitive-services/Speech-Service/includes/how-to/conversation-transcription/real-time-javascript.md

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ Running this script returns a voice signature string in the variable `voiceSigna
 
 ## Transcribe conversations
 
-The following sample code demonstrates how to transcribe conversations in real time for two speakers. It assumes you've already created voice signature strings for each speaker as shown above. Substitute real information for `subscriptionKey`, `region`, and the path `filepath` for the audio you want to transcribe.
+The following sample code demonstrates how to transcribe conversations in real-time for two speakers. It assumes you've already created voice signature strings for each speaker as shown above. Substitute real information for `subscriptionKey`, `region`, and the path `filepath` for the audio you want to transcribe.
 
 If you don't use pre-enrolled user profiles, it will take a few more seconds to complete the first recognition of unknown users as speaker1, speaker2, etc.
