You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
| Low, medium, high | Yes | Yes | Strictest filtering configuration. Content detected at severity levels low, medium and high is filtered.|
60
-
| Medium, high | Yes | Yes | Default setting. Content detected at severity level low is not filtered, content at medium and high is filtered.|
60
+
| Medium, high | Yes | Yes | Default setting. Content detected at severity level low isn't filtered, content at medium and high is filtered.|
61
61
| High | Yes| Yes | Content detected at severity levels low and medium isn't filtered. Only content at severity level high is filtered.|
62
62
| No filters | If approved<sup>\*</sup>| If approved<sup>\*</sup>| No content is filtered regardless of severity level detected. Requires approval<sup>\*</sup>.|
63
63
64
-
<sup>\*</sup> Only customers who have been approved for modified content filtering have full content filtering control and can turn content filters partially or fully off. Content filtering control does not apply to content filters for DALL-E (preview) or GPT-4 Turbo with Vision (preview). Apply for modified content filters using this form: [Azure OpenAI Limited Access Review: Modified Content Filtering (microsoft.com)](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUMlBQNkZMR0lFRldORTdVQzQ0TEI5Q1ExOSQlQCN0PWcu).
64
+
<sup>\*</sup> Only customers who have been approved for modified content filtering have full content filtering control and can turn content filters partially or fully off. Content filtering control doesn't apply to content filters for DALL-E (preview) or GPT-4 Turbo with Vision (preview). Apply for modified content filters using this form: [Azure OpenAI Limited Access Review: Modified Content Filtering (microsoft.com)](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUMlBQNkZMR0lFRldORTdVQzQ0TEI5Q1ExOSQlQCN0PWcu).
65
65
66
66
Customers are responsible for ensuring that applications integrating Azure OpenAI comply with the [Code of Conduct](/legal/cognitive-services/openai/code-of-conduct?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext).
67
67
@@ -635,13 +635,13 @@ This section describes the Azure OpenAI content streaming experience and options
635
635
636
636
### Default
637
637
638
-
The content filtering system is integrated and enabled by default for all customers. In the default streaming scenario, completion content is buffered, the content filtering system runs on the buffered content, and – depending on the content filtering configuration – content is either returned to the user if it does not violate the content filtering policy (Microsoft's default or a custom user configuration), or it’s immediately blocked and returns a content filtering error, without returning the harmful completion content. Thisprocess is repeated until the end of the stream. Content is fully vetted according to the content filtering policy before it's returned to the user. Content is not returned token-by-token in this case, but in “content chunks” of the respective buffer size.
638
+
The content filtering system is integrated and enabled by default for all customers. In the default streaming scenario, completion content is buffered, the content filtering system runs on the buffered content, and – depending on the content filtering configuration – content is either returned to the user if it doesn't violate the content filtering policy (Microsoft's default or a custom user configuration), or it’s immediately blocked and returns a content filtering error, without returning the harmful completion content. This process is repeated until the end of the stream. Content is fully vetted according to the content filtering policy before it's returned to the user. Contentisn't returned token-by-token in this case, but in “content chunks” of the respective buffer size.
639
639
640
640
### Asynchronous modified filter
641
641
642
642
Customers who have been approved for modified content filters can choose the asynchronous modified filter as an additional option, providing a new streaming experience. In this case, content filters are run asynchronously, and completion content is returned immediately with a smooth token-by-token streaming experience. No content is buffered, which allows for zero latency.
643
643
644
-
Customers must be aware that while the feature improves latency, it is a trade-off against the safety and real-time vetting of smaller sections of model output. Because content filters are run asynchronously, content moderation messages and policy violation signals are delayed, which means some sections of harmful content that would otherwise have been filtered immediately could be displayed to the user.
644
+
Customers must be aware that while the feature improves latency, it's a trade-off against the safety and real-time vetting of smaller sections of model output. Because content filters are run asynchronously, content moderation messages and policy violation signals are delayed, which means some sections of harmful content that would otherwise have been filtered immediately could be displayed to the user.
645
645
646
646
**Annotations**: Annotations and content moderation messages are continuously returned during the stream. We strongly recommend you consume annotations in your app and implement additional AI content safety mechanisms, such as redacting content or returning additional safety information to the user.
647
647
@@ -713,7 +713,7 @@ The text field will always be an empty string, indicating no new tokens. Annotat
713
713
714
714
`"start_offset"` and `"end_offset"` are low-granularity offsets intext (with0 at beginning of prompt) to mark which text the annotation is relevant to.
715
715
716
-
`"check_offset"` represents how much text has been fully moderated. It is an exclusive lower bound on the `"end_offset"` values of future annotations. It is non-decreasing.
716
+
`"check_offset"` represents how much text has been fully moderated. It's an exclusive lower bound on the `"end_offset"` values of future annotations. It's non-decreasing.
717
717
718
718
```json
719
719
data: {
@@ -741,7 +741,7 @@ data: {
741
741
742
742
#### Sample response stream (passes filters)
743
743
744
-
Below is a real chat completion response using asynchronous modified filter. Note how the prompt annotations are not changed, completion tokens are sent without annotations, and new annotation messages are sent without tokens—they are instead associated with certain content filter offsets.
744
+
Below is a real chat completion response using asynchronous modified filter. Note how the prompt annotations aren't changed, completion tokens are sent without annotations, and new annotation messages are sent without tokens—they are instead associated with certain content filter offsets.
0 commit comments