
Commit 9c9ec8f

Fixing Acrolinx score
1 parent b0275d7 commit 9c9ec8f

1 file changed (+14 −14 lines):

articles/ai-services/speech-service/voice-live-language-support.md

@@ -18,13 +18,13 @@ ms.custom: languages
 
 ## Introduction
 
-The voice live API supports multiple languages and configuration options. In this document you will which languages are supported by the voice live API and how to configure them.
+The voice live API supports multiple languages and configuration options. In this document, you learn which languages the voice live API supports and how to configure them.
 
 ## [Speech input](#tab/speechinput)
 
-Depending on which model is being used voice live speech input is processed either by one of the multimodal models (e.g. `gpt-4o-realtime-preview`, `gpt-4o-mini-realtime-preview`, and `phi4-mm-realtime`) or by `azure speech to text` models.
+Depending on which model is being used, voice live speech input is processed either by one of the multimodal models (for example, `gpt-4o-realtime-preview`, `gpt-4o-mini-realtime-preview`, and `phi4-mm-realtime`) or by `azure speech to text` models.
 
-### azure speech to text supported languages
+### Azure speech to text supported languages
 
 Azure speech to text is used for all configuration where a non-multimodal model is being used and for speech input transcriptions with `phi4-mm-realtime`.
 It supports all languages documented on the [Language and voice support for the Speech service - Speech to text](./language-support.md?tabs=stt) tab.
@@ -51,7 +51,7 @@ The current multi-lingual model supports the following languages:
 - Spanish (Mexico) [es-MX]
 - Spanish (Spain) [es-ES]
 
-To use **Automatic multilingual configuration using multilingual model** no additional configuration is required. If you do add the `language` string to the session`session.update` message, make sure to leave it empty.
+To use **Automatic multilingual configuration using multilingual model**, no extra configuration is required. If you do add the `language` string to the `session.update` message, make sure to leave it empty.
 
 ```json
 {
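For illustration, a minimal sketch of a `session.update` message that leaves the `language` string empty. The nesting of `language` under `input_audio_transcription` and the `azure-speech` model name are assumptions based on the surrounding text, not the exact snippet from the source file:

```json
{
  "type": "session.update",
  "session": {
    "input_audio_transcription": {
      "model": "azure-speech",
      "language": ""
    }
  }
}
```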
@@ -64,9 +64,9 @@ To use **Automatic multilingual configuration using multilingual model** no addi
 ```
 
 > [!NOTE]
-> The multilingual model will also generate results for unsupported languages, if no language is defined. In these cases transcription quality will be low. Ensure to configure defined languages, if you are setting up application with languages unsupported by the multilingual model.
+> The multilingual model generates results for unsupported languages if no language is defined. In these cases, transcription quality is low. Make sure to configure defined languages if you're setting up an application with languages that the multilingual model doesn't support.
 
-To configure a single or multiple languages not supported by the multimodal model you must add them to the `language` string in the session`session.update` message. A maximum of 10 languages are supported.
+To configure a single language or multiple languages not supported by the multilingual model, you must add them to the `language` string in the `session.update` message. A maximum of 10 languages is supported.
 
 ```json
 {
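A comparable sketch for a fixed set of languages, again with the field nesting assumed; the comma-separated list format and the locales themselves are illustrative (up to 10 languages are allowed per the text above):

```json
{
  "type": "session.update",
  "session": {
    "input_audio_transcription": {
      "model": "azure-speech",
      "language": "en-US,de-DE,ja-JP"
    }
  }
}
```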
@@ -80,7 +80,7 @@ To configure a single or multiple languages not supported by the multimodal mode
 
 ### gpt-4o-realtime-preview and gpt-4o-mini-realtime-preview supported languages
 
-While the underlying model was trained on 98 languages, OpenAI only lists the languages that exceeded <50% word error rate (WER) which is an industry standard benchmark for speech to text model accuracy. The model will return results for languages not listed below but the quality will be low.
+While the underlying model was trained on 98 languages, OpenAI only lists the languages with less than a 50% word error rate (WER), which is an industry-standard benchmark for speech to text model accuracy. The model returns results for languages not listed, but the quality is low.
 
 The following languages are supported by `gpt-4o-realtime-preview` and `gpt-4o-mini-realtime-preview`:
 - Afrikaans
@@ -141,7 +141,7 @@ The following languages are supported by `gpt-4o-realtime-preview` and `gpt-4o-m
 - Vietnamese
 - Welsh
 
-Multimodal models do not require a language configuration for the general processing. If you configure input audio transcription you can provide the transcription models with a language hint to improve transcription quality. In this case you need to add the `language`string to the session`session.update` message.
+Multimodal models don't require a language configuration for general processing. If you configure input audio transcription, you can provide the transcription models with a language hint to improve transcription quality. In this case, you need to add the `language` string to the `session.update` message.
 
 ```json
 {
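As a sketch of the language hint described above, assuming the hint is passed alongside the transcription model in `input_audio_transcription`; the `whisper-1` model name comes from the note that follows, and the two-letter language code is illustrative:

```json
{
  "type": "session.update",
  "session": {
    "input_audio_transcription": {
      "model": "whisper-1",
      "language": "de"
    }
  }
}
```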
@@ -154,7 +154,7 @@ Multimodal models do not require a language configuration for the general proces
 ```
 
 > [!NOTE]
-> Multimodal gpt models only support the following transcription models: `whisper-1`, `gpt-4o-transcribe` and `gpt-4o-mini-transcribe`.
+> Multimodal gpt models only support the following transcription models: `whisper-1`, `gpt-4o-transcribe`, and `gpt-4o-mini-transcribe`.
 
 ### phi4-mm-realtime supported languages
 
@@ -168,7 +168,7 @@ The following languages are supported by `phi4-mm-realtime`:
 - Portuguese
 - Spanish
 
-Multimodal models do not require a language configuration for the general processing. If you configure input audio transcription for `phi4-mm-realtime` you need to use the same configuration as for all non-mulitmodal model configuration where azure-speech is used for transcription as described above.
+Multimodal models don't require a language configuration for general processing. If you configure input audio transcription for `phi4-mm-realtime`, use the same configuration as for all non-multimodal model configurations where `azure-speech` is used for transcription, as described earlier.
 
 > [!NOTE]
 > Multimodal phi models only support the following transcription models: `azure-speech`.
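A sketch of that shared configuration, assuming the same `input_audio_transcription` shape as in the azure speech to text examples above, with an illustrative locale:

```json
{
  "type": "session.update",
  "session": {
    "input_audio_transcription": {
      "model": "azure-speech",
      "language": "en-US"
    }
  }
}
```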
@@ -177,7 +177,7 @@ Multimodal models do not require a language configuration for the general proces
 
 Depending on which model is being used voice live speech output is processed either by one of the multimodal OpenAI voices integrated into `gpt-4o-realtime-preview` and `gpt-4o-mini-realtime-preview` or by `azure text to speech` voices.
 
-### azure text to speech supported languages
+### Azure text to speech supported languages
 
 Azure text to speech is used by default for all configuration where a non-multimodal OpenAI model is being used and can be configured in all configurations manually.
 It supports all voices documented on the [Language and voice support for the Speech service - Text to speech](./language-support.md?tabs=tts) tab.
@@ -187,7 +187,7 @@ The following types of voices are supported:
 1. Multilingual voices
 1. Custom voices
 
-The supported language is tied to the voice used. To configure specific Azure text to speech voices you need to add the `voice` configuration to the session`session.update` message.
+The supported language is tied to the voice used. To configure specific Azure text to speech voices, you need to add the `voice` configuration to the `session.update` message.
 
 ```json
 {
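A minimal sketch of such a `voice` configuration; the voice name and the `azure-standard` type value are illustrative assumptions rather than the exact snippet from the source file:

```json
{
  "type": "session.update",
  "session": {
    "voice": {
      "name": "en-US-AvaNeural",
      "type": "azure-standard"
    }
  }
}
```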
@@ -201,9 +201,9 @@ The supported language is tied to the voice used. To configure specific Azure te
 }
 ```
 
-For more details see how to configure [Audio output through Azure text to speech](./voice-live-how-to.md#audio-output-through-azure-text-to-speech).
+For more information, see how to configure [Audio output through Azure text to speech](./voice-live-how-to.md#audio-output-through-azure-text-to-speech).
 
-In case of *Multilingual Voices* the language output can optionally be controlled by setting specific SSML tags. You can learn more about this in the [Customize voice and sound with SSML](./speech-synthesis-markup-voice.md#lang-examples) how to.
+If *Multilingual Voices* are used, the language output can optionally be controlled by setting specific SSML tags. You can learn more about SSML tags in the [Customize voice and sound with SSML](./speech-synthesis-markup-voice.md#lang-examples) how-to guide.
 
 ## Related content
 