articles/ai-foundry/openai/how-to/content-filters.md (1 addition & 1 deletion)

@@ -22,7 +22,7 @@ The default content filtering configuration is set to filter at the medium severity…

  Prompt shields and protected material text and code models are optional and on by default. For prompt shields and protected material text and code models, the configurability feature allows all customers to turn the models on and off. The models are on by default and can be turned off per your scenario. Some models are required to be on for certain scenarios to retain coverage under the [Customer Copyright Commitment](/azure/ai-foundry/responsible-ai/openai/customer-copyright-commitment).

  > [!NOTE]
- > All customers have the ability to modify the content filters and configure the severity thresholds (low, medium, high). Approval is required for turning the content filters partially or fully off. Only managed customers may apply for full content filtering control via this form: [Azure OpenAI Limited Access Review: Modified Content Filters](https://ncv.microsoft.com/uEfCgnITdR). At this time, it is not possible to become a managed customer.
+ > All customers have the ability to modify the content filters and configure the severity thresholds (low, medium, high). Approval is required for turning the content filters partially or fully off. Only managed customers may apply for full content filtering control via this form: [Limited Access Review: Modified Content Filters](https://ncv.microsoft.com/uEfCgnITdR). At this time, it is not possible to become a managed customer.

  > [!IMPORTANT]
  > The GPT-image-1 model does not support content filtering configuration: only the default content filter is used.
articles/ai-foundry/openai/how-to/dall-e.md (38 additions & 2 deletions)

@@ -71,8 +71,6 @@ The following is a sample request body. You specify a number of options, defined…

  }
  ```

-
-
  #### [DALL-E 3](#tab/dalle-3)

  Send a POST request to:
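The endpoint and sample request body referenced here aren't visible in this diff preview. Purely as an illustration of the kind of POST the article describes — the resource name, deployment name, API version, and field values below are placeholders and assumptions, not taken from the article — a DALL-E 3 generation request could be sent like this:

```python
# Hypothetical illustration of the POST described above. Resource name, deployment
# name, and api-version are placeholders; adjust them to your environment.
import os
import requests

endpoint = "https://YOUR-RESOURCE-NAME.openai.azure.com"
deployment = "dall-e-3"  # your DALL-E 3 deployment name
url = f"{endpoint}/openai/deployments/{deployment}/images/generations?api-version=2024-02-01"

body = {
    "prompt": "A watercolor painting of a lighthouse at dawn",
    "n": 1,
    "size": "1024x1024",
    "quality": "hd",
    "style": "vivid",
}

response = requests.post(
    url,
    headers={"api-key": os.environ["AZURE_OPENAI_API_KEY"]},
    json=body,
    timeout=60,
)
response.raise_for_status()
print(response.json()["data"][0]["url"])  # URL of the generated image
```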
@@ -147,8 +145,46 @@ The response from a successful image generation API call looks like the following…

  ]
  }
  ```
+
  ---

+ ### Streaming
+
+ You can stream image generation requests to `gpt-image-1` by setting the `stream` parameter to `true` and setting the `partial_images` parameter to a value between 0 and 3.
+
+ ```python
+ from openai import OpenAI
+ from azure.identity import DefaultAzureCredential, get_bearer_token_provider
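The preview of this diff ends after the import lines. As a rough sketch only of what a streaming call might look like — assuming a recent `openai` package whose `images.generate` accepts `stream` and `partial_images`, Microsoft Entra ID auth through `AzureOpenAI` rather than the `OpenAI` client shown above, and placeholder resource, deployment, and API-version values — it could resemble the following:

```python
# Hedged sketch, not the article's full sample. Resource name, deployment name,
# and api_version are placeholders; adjust them to your environment.
import base64

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE-NAME.openai.azure.com/",
    azure_ad_token_provider=token_provider,
    api_version="2025-04-01-preview",  # assumed preview version; check the article
)

stream = client.images.generate(
    model="gpt-image-1",   # your gpt-image-1 deployment name
    prompt="A glass greenhouse on a cliff at sunset",
    stream=True,
    partial_images=2,      # request up to 2 partial frames before the final image
)

for event in stream:
    # Partial frames arrive first, followed by the completed image.
    if event.type == "image_generation.partial_image":
        with open(f"partial_{event.partial_image_index}.png", "wb") as f:
            f.write(base64.b64decode(event.b64_json))
    elif event.type == "image_generation.completed":
        with open("final.png", "wb") as f:
            f.write(base64.b64decode(event.b64_json))
```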
articles/ai-foundry/openai/how-to/responses.md (2 additions & 42 deletions)

@@ -1265,14 +1265,9 @@ Compared to the standalone Image API, the Responses API offers several advantages…

  * **Flexible inputs**: Accept image File IDs as inputs, in addition to raw image bytes.

  > [!NOTE]
- > The image generation tool in the Responses API is only supported by the `gpt-image-1` model. You can, however, call it from this list of supported models: `gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `o3`.
+ > The image generation tool in the Responses API is only supported by the `gpt-image-1` model. You can, however, call it from this list of supported models: `gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `o3`, and `gpt-5` series models.<br><br>The Responses API image generation tool does not currently support streaming mode. To use streaming mode and generate partial images, call the [image generation API](./dall-e.md) directly, outside of the Responses API.
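For orientation, here is a hypothetical sketch of the tool call this note describes; the deployment name, API version, and auth setup are placeholders and assumptions, not values from the article:

```python
# Hedged sketch of invoking the built-in image generation tool via the Responses API.
# Deployment name and api_version are placeholders; adjust them to your environment.
import base64

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE-NAME.openai.azure.com/",
    azure_ad_token_provider=token_provider,
    api_version="2025-04-01-preview",  # assumed preview version
)

response = client.responses.create(
    model="gpt-4.1-mini",                  # a supported caller model; gpt-image-1 renders the image
    input="Generate an isometric pixel-art coffee shop",
    tools=[{"type": "image_generation"}],  # built-in image generation tool
)

# The generated image is returned as a base64-encoded output item.
for item in response.output:
    if item.type == "image_generation_call":
        with open("coffee_shop.png", "wb") as f:
            f.write(base64.b64decode(item.result))
```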
articles/ai-foundry/openai/includes/content-filter-configurability.md (1 addition & 1 deletion)

@@ -23,7 +23,7 @@ All customers can also configure content filters and create custom content policies…

  | No filters | If approved<sup>1</sup> | If approved<sup>1</sup> | No content is filtered regardless of severity level detected. Requires approval<sup>1</sup>. |
  | Annotate only | If approved<sup>1</sup> | If approved<sup>1</sup> | Disables the filter functionality, so content will not be blocked, but annotations are returned via API response. Requires approval<sup>1</sup>. |

- <sup>1</sup> For Azure OpenAI models, only customers who have been approved for modified content filtering have full content filtering control and can turn off content filters. Apply for modified content filters via this form: [Azure OpenAI Limited Access Review: Modified Content Filters](https://ncv.microsoft.com/uEfCgnITdR). For Azure Government customers, apply for modified content filters via this form: [Azure Government - Request Modified Content Filtering for Azure OpenAI](https://aka.ms/AOAIGovModifyContentFilter).
+ <sup>1</sup> For Azure OpenAI models, only customers who have been approved for modified content filtering have full content filtering control and can turn off content filters. Apply for modified content filters via this form: [Limited Access Review: Modified Content Filters](https://ncv.microsoft.com/uEfCgnITdR). For Azure Government customers, apply for modified content filters via this form: [Azure Government - Request Modified Content Filtering](https://aka.ms/AOAIGovModifyContentFilter).

  Configurable content filters for inputs (prompts) and outputs (completions) are available for all Azure OpenAI models.
articles/ai-foundry/responsible-ai/speech-service/text-to-speech/transparency-note.md (6 additions & 6 deletions)

@@ -99,13 +99,13 @@ In addition to the common terms from prebuilt neural voice, custom neural voice,…

  |---|---|
  | Avatar talent | Custom text to speech avatar model building requires training on a video recording of a real human speaking. This person is the avatar talent. Customers must get sufficient consent under all relevant laws and regulations from the avatar talent to use their image/likeness to create a custom avatar. |

- #### [Video translation (preview)](#tab/video)
+ #### [Video translation](#tab/video)

  ### Introduction

  Video translation can efficiently localize your video content to cater to diverse audiences around the globe. This service empowers you to create immersive, localized content efficiently and effectively across various use cases such as vlogs, education, news, advertising, and more.

- Video translation using prebuilt neural voices is available in preview for all users. Video translation with personal voice is a Limited Access feature in preview and is subject to use case and eligibility restrictions.
+ Video translation using prebuilt neural voices is available for all users.

  ### Key terms

@@ -169,7 +169,7 @@ Text to speech avatar adopts Coalition for Content Provenance and Authenticity (C2PA)…

  In addition, avatar outputs are automatically watermarked. Watermarks allow approved users to identify whether a video is synthesized using the avatar feature of Azure AI Speech. To request watermark detection, please contact [avatarvoice[at]microsoft.com](mailto:avatarvoice@microsoft.com).

- ### Video translation (preview)
+ ### Video translation

  Video translation can efficiently localize your video content to cater to diverse audiences around the globe. Video translation will automatically extract dialogue audio, transcribe, translate and dub the content with prebuilt or personal voice to the target language, with accurate subtitles for better accessibility. Multi-speaker features will help identify the number of individuals speaking and recommend suitable voices. Content editing with human in the loop allows for precise alignment with customer preference. Enhanced translation quality ensures precise audio and video alignment with GPT integration. Video translation enables authentic and personalized dubbing experiences with personal voice.

@@ -209,7 +209,7 @@ All other uses of custom neural voice, including Custom Neural Voice Pro, Custom…

  Prebuilt neural voice may also be used for the custom neural voice use cases above, as well as additional use cases selected by customers and consistent with the Azure Acceptable Use Policy and the [Code of conduct for Azure AI Speech text to speech](/legal/ai-code-of-conduct?context=%2Fazure%2Fai-services%2Fspeech-service%2Fcontext%2Fcontext). No registration or pre-approval is required for additional use cases for prebuilt neural voice that meet all applicable terms and conditions.

- ### Intended use cases for video translation (preview)
+ ### Intended use cases for video translation

  Video translation could be used for films, TV, and other visual (including but not limited to video or animation) and audio applications, where customers maintain sole control over the creation of, access to, and use of the voice models and their output. Personal voice and lip syncing are subject to the Limited Access framework, and eligible customers may use these capabilities with Video translation. The following are the approved use cases for Video translation service:
  - **Education & learning**: To translate audio in educational visuals, online courses, training modules, simulation-based learning, or guided museum tour visuals for multilingual learners.

@@ -301,7 +301,7 @@ Technical limitations to consider are the accuracy of lip sync alignment with the…

  - **Gestures**: Avatars may use hand gestures during speaking to deliver a natural speaking experience, but the gestures are not pre-programmed. Instead, they are learned from video clips in the training data and are included in synthetic video regardless of the input text. Also, avatars cannot make gestures that were not made by the avatar talent and captured in the training data. Avatars are not able to tailor gestures according to contextual information and emotions, so customers should be mindful of the avatar system’s inability to automatically play a gesture appropriate for the context.
  - **Privacy and data protection**: When utilizing text to speech avatars, customers should adhere to all applicable privacy laws and regulations and ensure that sensitive or personal information is handled securely. It is important to be cautious when processing and storing data, and to follow best practices for data protection and consent management.

- #### [Video translation (preview)](#tab/video)
+ #### [Video translation](#tab/video)

  * **Translation quality**: Translation quality will depend on the transcription accuracy and translation accuracy. If the input video is mixed with background music or noise, this will impact the quality of the translation. Translation results will be dependent on context.
  * **Dubbing voice similarity and intonation**: When you choose prebuilt neural voices for dubbing, the voice output characteristics may not be similar to the original voice characteristics. If you use the personal voice feature, the voice output will more closely resemble the original voice, but the speaking style may not closely resemble the user’s speaking style including tones and prosodies. It’s also possible the voice output will not sound equally natural across all supported languages.

@@ -395,7 +395,7 @@ The quality of the resulting avatar heavily depends on the recorded video used for…

  The appearance and performance of the avatar talent are also key factors impacting the system performance; please see our guidance [How to record video samples for custom text to speech avatar](/azure/ai-services/speech-service/text-to-speech-avatar/custom-avatar-record-video-samples).
articles/ai-services/speech-service/how-to-custom-speech-transcription-editor.md (6 additions & 3 deletions)

@@ -2,17 +2,20 @@

  title: How to use the online transcription editor for custom speech - Speech service
  titleSuffix: Azure AI services
  description: The online transcription editor allows you to create or edit audio + human-labeled transcriptions for custom speech.
- author: PatrickFarley
+ author: goergenj
  manager: nitinme
  ms.service: azure-ai-speech
  ms.topic: how-to
- ms.date: 5/19/2025
- ms.author: pafarley
+ ms.date: 9/11/2025
+ ms.author: jagoerge
  #Customer intent: As a developer, I need to understand how to use the online transcription editor for custom speech so that I can create or edit audio + human-labeled transcriptions for custom speech.

  The online transcription editor allows you to create or edit audio + human-labeled transcriptions for custom speech. The main use cases of the editor are as follows:

  * You only have audio data, but want to build accurate audio + human-labeled datasets from scratch to use in model training.