Skip to content

Commit fdc884d

Browse files
authored
Merge pull request #264982 from PatrickFarley/openai-updates
Openai updates
2 parents b441bb3 + 59693ad commit fdc884d

File tree

1 file changed

+24
-25
lines changed

1 file changed

+24
-25
lines changed

articles/ai-services/openai/concepts/content-filter.md

Lines changed: 24 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -56,11 +56,11 @@ The default content filtering configuration is set to filter at the medium sever
5656
| Severity filtered | Configurable for prompts | Configurable for completions | Descriptions |
5757
|-------------------|--------------------------|------------------------------|--------------|
5858
| Low, medium, high | Yes | Yes | Strictest filtering configuration. Content detected at severity levels low, medium and high is filtered.|
59-
| Medium, high | Yes | Yes | Default setting. Content detected at severity level low is not filtered, content at medium and high is filtered.|
59+
| Medium, high | Yes | Yes | Default setting. Content detected at severity level low isn't filtered, content at medium and high is filtered.|
6060
| High | Yes| Yes | Content detected at severity levels low and medium isn't filtered. Only content at severity level high is filtered.|
6161
| No filters | If approved<sup>\*</sup>| If approved<sup>\*</sup>| No content is filtered regardless of severity level detected. Requires approval<sup>\*</sup>.|
6262

63-
<sup>\*</sup> Only customers who have been approved for modified content filtering have full content filtering control and can turn content filters partially or fully off. Content filtering control does not apply to content filters for DALL-E (preview) or GPT-4 Turbo with Vision (preview). Apply for modified content filters using this form: [Azure OpenAI Limited Access Review: Modified Content Filtering (microsoft.com)](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUMlBQNkZMR0lFRldORTdVQzQ0TEI5Q1ExOSQlQCN0PWcu).
63+
<sup>\*</sup> Only customers who have been approved for modified content filtering have full content filtering control and can turn content filters partially or fully off. Content filtering control doesn't apply to content filters for DALL-E (preview) or GPT-4 Turbo with Vision (preview). Apply for modified content filters using this form: [Azure OpenAI Limited Access Review: Modified Content Filtering (microsoft.com)](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUMlBQNkZMR0lFRldORTdVQzQ0TEI5Q1ExOSQlQCN0PWcu).
6464

6565
Customers are responsible for ensuring that applications integrating Azure OpenAI comply with the [Code of Conduct](/legal/cognitive-services/openai/code-of-conduct?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext).
6666

@@ -628,40 +628,39 @@ For details on the inference REST API endpoints for Azure OpenAI and how to crea
628628
}
629629
```
630630
631-
## Streaming
631+
## Content streaming
632632
633-
Azure OpenAI Service includes a content filtering system that works alongside core models. The following section describes the AOAI streaming experience and options in the context of content filters.
633+
This section describes the Azure OpenAI content streaming experience and options. With approval, you have the option to receive content from the API as it's generated, instead of waiting for chunks of content that have been verified to pass your content filters.
634634
635635
### Default
636636
637-
The content filtering system is integrated and enabled by default for all customers. In the default streaming scenario, completion content is buffered, the content filtering system runs on the buffered content, and – depending on content filtering configuration – content is either returned to the user if it does not violate the content filtering policy (Microsoft default or custom user configuration), or it’s immediately blocked which returns a content filtering error, without returning harmful completion content. This process is repeated until the end of the stream. Content was fully vetted according to the content filtering policy before returned to the user. Content is not returned token-by-token in this case, but in “content chunks” of the respective buffer size.
637+
The content filtering system is integrated and enabled by default for all customers. In the default streaming scenario, completion content is buffered, the content filtering system runs on the buffered content, and – depending on the content filtering configuration – content is either returned to the user if it doesn't violate the content filtering policy (Microsoft's default or a custom user configuration), or it’s immediately blocked and returns a content filtering error, without returning the harmful completion content. This process is repeated until the end of the stream. Content is fully vetted according to the content filtering policy before it's returned to the user. Content isn't returned token-by-token in this case, but in “content chunks” of the respective buffer size.
638638
639639
### Asynchronous modified filter
640640
641-
Customers who have been approved for modified content filters can choose Asynchronous Modified Filter as an additional option, providing a new streaming experience. In this case, content filters are run asynchronously, completion content is returned immediately with a smooth token-by-token streaming experience. No content is buffered, the content filters run asynchronously, which allows for zero latency in this context.
641+
Customers who have been approved for modified content filters can choose the asynchronous modified filter as an additional option, providing a new streaming experience. In this case, content filters are run asynchronously, and completion content is returned immediately with a smooth token-by-token streaming experience. No content is buffered, which allows for zero latency.
642642
643-
> [!NOTE]
644-
> Customers must be aware that while the feature improves latency, it can bring a trade-off in terms of the safety and real-time vetting of smaller sections of model output. Because content filters are run asynchronously, content moderation messages and the content filtering signal in case of a policy violation are delayed, which means some sections of harmful content that would otherwise have been filtered immediately could be displayed to the user.
643+
Customers must be aware that while the feature improves latency, it's a trade-off against the safety and real-time vetting of smaller sections of model output. Because content filters are run asynchronously, content moderation messages and policy violation signals are delayed, which means some sections of harmful content that would otherwise have been filtered immediately could be displayed to the user.
645644
646-
**Annotations**: Annotations and content moderation messages are continuously returned during the stream. We strongly recommend to consume annotations and implement additional AI content safety mechanisms, such as redacting content or returning additional safety information to the user.
645+
**Annotations**: Annotations and content moderation messages are continuously returned during the stream. We strongly recommend you consume annotations in your app and implement additional AI content safety mechanisms, such as redacting content or returning additional safety information to the user.
647646
648-
**Content filtering signal**: The content filtering error signal is delayed; in case of a policy violation, it’s returned as soon as it’s available, and the stream is stopped. The content filtering signal is guaranteed within ~1,000-character windows in case of a policy violation.
647+
**Content filtering signal**: The content filtering error signal is delayed. In case of a policy violation, it’s returned as soon as it’s available, and the stream is stopped. The content filtering signal is guaranteed within a ~1,000-character window of the policy-violating content.
649648
650-
Approval for Modified Content Filtering is required for access to Streaming – Asynchronous Modified Filter. The application can be found [here](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xURE01NDY1OUhBRzQ3MkQxMUhZSE1ZUlJKTiQlQCN0PWcu). To enable it via Azure OpenAI Studio please follow the instructions [here](/azure/ai-services/openai/how-to/content-filters) to create a new content filtering configuration, and select Asynchronous Modified Filter in the Streaming section, as shown in the below screenshot.
649+
Approval for modified content filtering is required for access to the asynchronous modified filter. The application can be found [here](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xURE01NDY1OUhBRzQ3MkQxMUhZSE1ZUlJKTiQlQCN0PWcu). To enable it in Azure OpenAI Studio, follow the [Content filter how-to guide](/azure/ai-services/openai/how-to/content-filters) to create a new content filtering configuration, and select **Asynchronous Modified Filter** in the Streaming section.
651650
652-
### Overview
651+
### Comparison of content filtering modes
653652
654-
| Category | Streaming - Default | Streaming - Asynchronous Modified Filter |
653+
| | Streaming - Default | Streaming - Asynchronous Modified Filter |
655654
|---|---|---|
656655
|Status |GA |Public Preview |
657-
| Access | Enabled by default, no action needed |Customers approved for Modified Content Filtering can configure directly via Azure OpenAI Studio (as part of a content filtering configuration; applied on deployment-level) |
658-
| Eligibility |All customers |Customers approved for Modified Content Filtering |
659-
|Modality and Availability |Text; all GPT-models |Text; all GPT-models except gpt-4-vision |
656+
| Eligibility |All customers |Customers approved for modified content filtering |
657+
| How to enable | Enabled by default, no action needed |Customers approved for modified content filtering can configure it directly in Azure OpenAI Studio (as part of a content filtering configuration, applied at the deployment level) |
658+
|Modality and availability |Text; all GPT models |Text; all GPT models except gpt-4-vision |
660659
|Streaming experience |Content is buffered and returned in chunks |Zero latency (no buffering, filters run asynchronously) |
661-
|Content filtering signal |Immediate filtering signal |Delayed filtering signal (in up to ~1,000 char increments) |
662-
|Content filtering configurations |Supports default and any customer-defined filter setting (including optional models) |Supports default and any customer-defined filter setting (including optional models) |
660+
|Content filtering signal |Immediate filtering signal |Delayed filtering signal (in up to ~1,000-character increments) |
661+
|Content filtering configurations |Supports default and any customer-defined filter setting (including optional models) |Supports default and any customer-defined filter setting (including optional models) |
663662
664-
### Annotations and sample response stream
663+
### Annotations and sample responses
665664
666665
#### Prompt annotation message
667666
@@ -709,11 +708,11 @@ data: {
709708
710709
#### Annotation message
711710
712-
The text field will always be an empty string, indicating no new tokens. Annotations will only be relevant to already-sent tokens. There may be multiple Annotation Messages referring to the same tokens.
711+
The text field will always be an empty string, indicating no new tokens. Annotations will only be relevant to already-sent tokens. There may be multiple annotation messages referring to the same tokens.
713712
714-
start_offset and end_offset are low-granularity offsets in text (with 0 at beginning of prompt) which the annotation is relevant to.
713+
`"start_offset"` and `"end_offset"` are low-granularity offsets in text (with 0 at beginning of prompt) to mark which text the annotation is relevant to.
715714
716-
check_offset represents how much text has been fully moderated. It is an exclusive lower bound on the end_offsets of future annotations. It is nondecreasing.
715+
`"check_offset"` represents how much text has been fully moderated. It's an exclusive lower bound on the `"end_offset"` values of future annotations. It's non-decreasing.
717716
718717
```json
719718
data: {
@@ -739,9 +738,9 @@ data: {
739738
```
740739
741740
742-
### Sample response stream
741+
#### Sample response stream (passes filters)
743742
744-
Below is a real chat completion response using Asynchronous Modified Filter. Note how prompt annotations are not changed; completion tokens are sent without annotations; and new annotation messages are sent without tokens, instead associated with certain content filter offsets.
743+
Below is a real chat completion response using asynchronous modified filter. Note how the prompt annotations aren't changed, completion tokens are sent without annotations, and new annotation messages are sent without tokens&mdash;they are instead associated with certain content filter offsets.
745744
746745
`{"temperature": 0, "frequency_penalty": 0, "presence_penalty": 1.0, "top_p": 1.0, "max_tokens": 800, "messages": [{"role": "user", "content": "What is color?"}], "stream": true}`
747746
@@ -769,7 +768,7 @@ data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_
769768
data: [DONE]
770769
```
771770
772-
### Sample response stream (blocking)
771+
#### Sample response stream (blocked by filters)
773772
774773
`{"temperature": 0, "frequency_penalty": 0, "presence_penalty": 1.0, "top_p": 1.0, "max_tokens": 800, "messages": [{"role": "user", "content": "Tell me the lyrics to \"Hey Jude\"."}], "stream": true}`
775774

0 commit comments

Comments
 (0)