articles/ai-services/speech-service/gaming-concepts.md
Lines changed: 6 additions & 6 deletions
@@ -6,38 +6,38 @@ author: eric-urban
manager: nitinme
ms.service: azure-ai-speech
ms.topic: conceptual
-
ms.date: 1/18/2024
+
ms.date: 9/18/2024
ms.author: eur
---
# Game development with Azure AI Speech
-
Azure AI services for Speech can be used to improve various gaming scenarios, both in- and out-of-game.
+
Azure AI Speech can be used to improve various gaming scenarios, both in-game and out-of-game.
Here are a few Speech features to consider for flexible and interactive game experiences:
- Bring everyone into the conversation by synthesizing audio from text. Or by displaying text from audio.
- Make the game more accessible for players who are unable to read text in a particular language, including young players who don't read or write. Players can listen to storylines and instructions in their preferred language.
-
- Create game avatars and non-playable characters (NPC) that can initiate or participate in a conversation in-game.
+
- Create game avatars and nonplayable characters (NPC) that can initiate or participate in a conversation in-game.
- Prebuilt neural voice can provide highly natural out-of-box voices with leading voice variety in terms of a large portfolio of languages and voices.
- Custom neural voice for creating a voice that stays on-brand with consistent quality and speaking style. You can add emotions, accents, nuances, laughter, and other para linguistic sounds and expressions.
- Use game dialogue prototyping to shorten the amount of time and money spent in product to get the game to market sooner. You can rapidly swap lines of dialog and listen to variations in real-time to iterate the game content.
-
You can use the [Speech SDK](speech-sdk.md) or [Speech CLI](spx-overview.md) for real-time low latency speech to text, text to speech, language identification, and speech translation. You can also use the [Batch transcription API](batch-transcription.md) to transcribe pre-recorded speech to text. To synthesize a large volume of text input (long and short) to speech, use the [Batch synthesis API](batch-synthesis.md).
+
You can use the [Speech SDK](speech-sdk.md) or [Speech CLI](spx-overview.md) for real-time low latency speech to text, text to speech, language identification, and speech translation. You can also use the [Batch transcription API](batch-transcription.md) to transcribe prerecorded speech to text. To synthesize a large volume of text input (long and short) to speech, use the [Batch synthesis API](batch-synthesis.md).
For information about locale and regional availability, see [Language and voice support](language-support.md) and [Region support](regions.md).
## Text to speech
-
Help bring everyone into the conversation by converting text messages to audio using [Text to speech](text-to-speech.md) for scenarios, such as game dialogue prototyping, greater accessibility, or non-playable character (NPC) voices. Text to speech includes [prebuilt neural voice](language-support.md?tabs=tts#prebuilt-neural-voices) and [custom neural voice](language-support.md?tabs=tts#custom-neural-voice) features. Prebuilt neural voice can provide highly natural out-of-box voices with leading voice variety in terms of a large portfolio of languages and voices. Custom neural voice is an easy-to-use self-service for creating a highly natural custom voice.
+
Help bring everyone into the conversation by converting text messages to audio using [Text to speech](text-to-speech.md) for scenarios, such as game dialogue prototyping, greater accessibility, or nonplayable character (NPC) voices. Text to speech includes [prebuilt neural voice](language-support.md?tabs=tts#prebuilt-neural-voices) and [custom neural voice](language-support.md?tabs=tts#custom-neural-voice) features. Prebuilt neural voice can provide highly natural out-of-box voices with leading voice variety in terms of a large portfolio of languages and voices. Custom neural voice is an easy-to-use self-service for creating a highly natural custom voice.
When enabling this functionality in your game, keep in mind the following benefits:
- Voices and languages supported - A large portfolio of [locales and voices](language-support.md?tabs=tts#supported-languages) are supported. You can also [specify multiple languages](speech-synthesis-markup-voice.md#adjust-speaking-languages) for Text to speech output. For [custom neural voice](custom-neural-voice.md), you can [choose to create](professional-voice-train-voice.md?tabs=neural#choose-a-training-method) different languages from single language training data.
- Emotional styles supported - [Emotional tones](language-support.md?tabs=tts#voice-styles-and-roles), such as cheerful, angry, sad, excited, hopeful, friendly, unfriendly, terrified, shouting, and whispering. You can [adjust the speaking style](speech-synthesis-markup-voice.md#use-speaking-styles-and-roles), style degree, and role at the sentence level.
- Visemes supported - You can use visemes during real-time synthesizing to control the movement of 2D and 3D avatar models, so that the mouth movements are perfectly matched to synthetic speech. For more information, see [Get facial position with viseme](how-to-speech-synthesis-viseme.md).
- Fine-tuning Text to speech output with Speech Synthesis Markup Language (SSML) - With SSML, you can customize Text to speech outputs, with richer voice tuning supports. For more information, see [Speech Synthesis Markup Language (SSML) overview](speech-synthesis-markup.md).
-
- Audio outputs - Each prebuilt neural voice model is available at 24 kHz and high-fidelity 48 kHz. If you select 48-kHz output format, the high-fidelity voice model with 48 kHz will be invoked accordingly. The sample rates other than 24 kHz and 48 kHz can be obtained through upsampling or downsampling when synthesizing. For example, 44.1 kHz is downsampled from 48 kHz. Each audio format incorporates a bitrate and encoding type. For more information, see the [supported audio formats](rest-text-to-speech.md?tabs=streaming#audio-outputs). For more information on 48-kHz high-quality voices, see [this introduction blog](https://techcommunity.microsoft.com/t5/ai-cognitive-services-blog/azure-neural-tts-voices-upgraded-to-48khz-with-hifinet2-vocoder/ba-p/3665252).
+
- Audio outputs - Each prebuilt neural voice model is available at 24 kHz and high-fidelity 48 kHz. If you select 48-kHz output format, the high-fidelity voice model with 48 kHz is invoked accordingly. The sample rates other than 24 kHz and 48 kHz can be obtained through upsampling or downsampling when synthesizing. For example, 44.1 kHz is downsampled from 48 kHz. Each audio format incorporates a bitrate and encoding type. For more information, see the [supported audio formats](rest-text-to-speech.md?tabs=streaming#audio-outputs). For more information on 48-kHz high-quality voices, see [this introduction blog](https://techcommunity.microsoft.com/t5/ai-cognitive-services-blog/azure-neural-tts-voices-upgraded-to-48khz-with-hifinet2-vocoder/ba-p/3665252).
For an example, see the [text to speech quickstart](get-started-text-to-speech.md).
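As a rough illustration of the speaking-style and SSML bullets in the changed article, the sketch below builds an SSML document for a single NPC line using only the Python standard library. The function name `build_npc_ssml`, the default voice `en-US-JennyNeural`, and the 0.01 to 2 bound on `styledegree` are assumptions made for this example and are not part of this change; check the linked voice styles and SSML pages for the voices and styles actually available.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_npc_ssml(text, voice="en-US-JennyNeural", style="cheerful",
                   style_degree=1.0, lang="en-US"):
    """Return an SSML string that speaks `text` in the given style.

    Hypothetical helper: the voice name, style, and degree bounds are
    assumptions for illustration, not values taken from the article.
    """
    if not 0.01 <= style_degree <= 2:
        raise ValueError("style_degree is assumed to lie between 0.01 and 2")

    ssml = (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f'xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<mstts:express-as style={quoteattr(style)} '
        f'styledegree={quoteattr(str(style_degree))}>'
        f'{escape(text)}'  # escape &, <, > so dialogue cannot break the markup
        '</mstts:express-as>'
        '</voice>'
        '</speak>'
    )

    # Fail fast if the result is not well-formed XML.
    ET.fromstring(ssml)
    return ssml


if __name__ == "__main__":
    print(build_npc_ssml("Halt! Who goes there?", style="angry", style_degree=2))
```

The resulting string could then be handed to a synthesizer, for example through the Speech SDK's SSML synthesis call, which keeps dialogue text and voice direction separate so that lines can be swapped quickly during prototyping.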