Skip to content

Commit cfe12de

Browse files
committed
acrolinx
1 parent c6696cd commit cfe12de

File tree

1 file changed

+20
-20
lines changed

1 file changed

+20
-20
lines changed

articles/ai-services/openai/concepts/content-filter.md

Lines changed: 20 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -41,12 +41,12 @@ Text and image models support Drugs as an additional classification. This catego
4141
|--------|-----------|
4242
| Hate and Fairness | Hate and fairness-related harms refer to any content that attacks or uses discriminatory language with reference to a person or Identity group based on certain differentiating attributes of these groups. <br><br>This includes, but is not limited to:<ul><li>Race, ethnicity, nationality</li><li>Gender identity groups and expression</li><li>Sexual orientation</li><li>Religion</li><li>Personal appearance and body size</li><li>Disability status</li><li>Harassment and bullying</li></ul> |
4343
| Sexual | Sexual describes language related to anatomical organs and genitals, romantic relationships and sexual acts, acts portrayed in erotic or affectionate terms, including those portrayed as an assault or a forced sexual violent act against one’s will. <br><br> This includes but is not limited to:<ul><li>Vulgar content</li><li>Prostitution</li><li>Nudity and Pornography</li><li>Abuse</li><li>Child exploitation, child abuse, child grooming</li></ul> |
44-
| Violence | Violence describes language related to physical actions intended to hurt, injure, damage, or kill someone or something; describes weapons, guns and related entities. <br><br>This includes, but is not limited to: <ul><li>Weapons</li><li>Bullying and intimidation</li><li>Terrorist and violent extremism</li><li>Stalking</li></ul> |
45-
| Self-Harm | Self-harm describes language related to physical actions intended to purposely hurt, injure, damage one’s body or kill oneself. <br><br> This includes, but is not limited to: <ul><li>Eating Disorders</li><li>Bullying and intimidation</li></ul> |
44+
| Violence | Violence describes language related to physical actions intended to hurt, injure, damage, or kill someone or something; describes weapons, guns and related entities. <br><br>This includes, but isn't limited to: <ul><li>Weapons</li><li>Bullying and intimidation</li><li>Terrorist and violent extremism</li><li>Stalking</li></ul> |
45+
| Self-Harm | Self-harm describes language related to physical actions intended to purposely hurt, injure, damage one’s body or kill oneself. <br><br> This includes, but isn't limited to: <ul><li>Eating Disorders</li><li>Bullying and intimidation</li></ul> |
4646
| Protected Material for Text<sup>*</sup> | Protected material text describes known text content (for example, song lyrics, articles, recipes, and selected web content) that can be outputted by large language models.
4747
| Protected Material for Code | Protected material code describes source code that matches a set of source code from public repositories, which can be outputted by large language models without proper citation of source repositories.
4848

49-
<sup>*</sup> If you are an owner of text material and want to submit text content for protection, please [file a request](https://aka.ms/protectedmaterialsform).
49+
<sup>*</sup> If you're an owner of text material and want to submit text content for protection, [file a request](https://aka.ms/protectedmaterialsform).
5050

5151
## Prompt Shields
5252

@@ -91,16 +91,16 @@ The default content filtering configuration for the GPT model series is set to f
9191

9292
| Severity filtered | Configurable for prompts | Configurable for completions | Descriptions |
9393
|-------------------|--------------------------|------------------------------|--------------|
94-
| Low, medium, high | Yes | Yes | Strictest filtering configuration. Content detected at severity levels low, medium and high is filtered.|
94+
| Low, medium, high | Yes | Yes | Strictest filtering configuration. Content detected at severity levels low, medium, and high is filtered.|
9595
| Medium, high | Yes | Yes | Content detected at severity level low isn't filtered, content at medium and high is filtered.|
9696
| High | Yes| Yes | Content detected at severity levels low and medium isn't filtered. Only content at severity level high is filtered. Requires approval<sup>1</sup>.|
9797
| No filters | If approved<sup>1</sup>| If approved<sup>1</sup>| No content is filtered regardless of severity level detected. Requires approval<sup>1</sup>.|
9898

99-
<sup>1</sup> For Azure OpenAI models, only customers who have been approved for modified content filtering have full content filtering control and can turn content filters off. Apply for modified content filters via this form: [Azure OpenAI Limited Access Review: Modified Content Filters](https://ncv.microsoft.com/uEfCgnITdR)
99+
<sup>1</sup> For Azure OpenAI models, only customers who have been approved for modified content filtering have full content filtering control and can turn off content filters. Apply for modified content filters via this form: [Azure OpenAI Limited Access Review: Modified Content Filters](https://ncv.microsoft.com/uEfCgnITdR)
100100

101101
This preview feature is available for the following Azure OpenAI models:
102102
* GPT model series (text)
103-
* GPT-4 Turbo Vision 2024-04-09 (multi-modal text/image)
103+
* GPT-4 Turbo Vision `2024-04-09` (multi-modal text/image)
104104
* DALL-E 2 and 3 (image)
105105

106106
Content filtering configurations are created within a Resource in Azure AI Studio, and can be associated with Deployments. [Learn more about configurability here](../how-to/content-filters.md).
@@ -112,8 +112,8 @@ Customers are responsible for ensuring that applications integrating Azure OpenA
112112
When the content filtering system detects harmful content, you receive either an error on the API call if the prompt was deemed inappropriate, or the `finish_reason` on the response will be `content_filter` to signify that some of the completion was filtered. When building your application or system, you'll want to account for these scenarios where the content returned by the Completions API is filtered, which might result in content that is incomplete. How you act on this information will be application specific. The behavior can be summarized in the following points:
113113

114114
- Prompts that are classified at a filtered category and severity level will return an HTTP 400 error.
115-
- Non-streaming completions calls won't return any content when the content is filtered. The `finish_reason` value will be set to content_filter. In rare cases with longer responses, a partial result can be returned. In these cases, the `finish_reason` will be updated.
116-
- For streaming completions calls, segments will be returned back to the user as they're completed. The service will continue streaming until either reaching a stop token, length, or when content that is classified at a filtered category and severity level is detected.
115+
- Non-streaming completions calls won't return any content when the content is filtered. The `finish_reason` value is set to content_filter. In rare cases with longer responses, a partial result can be returned. In these cases, the `finish_reason` is updated.
116+
- For streaming completions calls, segments are returned back to the user as they're completed. The service continues streaming until either reaching a stop token, length, or when content that is classified at a filtered category and severity level is detected.
117117

118118
### Scenario: You send a non-streaming completions call asking for multiple outputs; no content is classified at a filtered category and severity level
119119

@@ -224,7 +224,7 @@ The table below outlines the various ways content filtering can appear:
224224

225225
|**HTTP Response Code** | **Response behavior**|
226226
|------------|------------------------|
227-
|200|In this case, the call will stream back with the full generation and `finish_reason` will be either 'length' or 'stop' for each generated response.|
227+
|200|In this case, the call streams back with the full generation and `finish_reason` will be either 'length' or 'stop' for each generated response.|
228228

229229
**Example request payload:**
230230

@@ -337,7 +337,7 @@ The table below outlines the various ways content filtering can appear:
337337

338338
When annotations are enabled as shown in the code snippet below, the following information is returned via the API for the categories hate and fairness, sexual, violence, and self-harm:
339339
- content filtering category (hate, sexual, violence, self_harm)
340-
- the severity level (safe, low, medium or high) within each content category
340+
- the severity level (safe, low, medium, or high) within each content category
341341
- filtering status (true or false).
342342

343343
### Optional models
@@ -732,7 +732,7 @@ violence : @{filtered=False; severity=safe}
732732
733733
---
734734
735-
For details on the inference REST API endpoints for Azure OpenAI and how to create Chat and Completions please follow [Azure OpenAI Service REST API reference guidance](../reference.md). Annotations are returned for all scenarios when using any preview API version starting from `2023-06-01-preview`, as well as the GA API version `2024-02-01`.
735+
For details on the inference REST API endpoints for Azure OpenAI and how to create Chat and Completions, follow [Azure OpenAI Service REST API reference guidance](../reference.md). Annotations are returned for all scenarios when using any preview API version starting from `2023-06-01-preview`, as well as the GA API version `2024-02-01`.
736736
737737
### Example scenario: An input prompt containing content that is classified at a filtered category and severity level is sent to the completions API
738738
@@ -781,7 +781,7 @@ For enhanced detection capabilities, prompts should be formatted according to th
781781
782782
The Chat Completion API is structured by definition. It consists of a list of messages, each with an assigned role.
783783
784-
The safety system will parse this structured format and apply the following behavior:
784+
The safety system parses this structured format and apply the following behavior:
785785
- On the latest “user” content, the following categories of RAI Risks will be detected:
786786
- Hate
787787
- Sexual
@@ -800,7 +800,7 @@ This is an example message array:
800800
801801
### Embedding documents in your prompt
802802
803-
In addition to detection on last user content, Azure OpenAI also supports the detection of specific risks inside context documents via Prompt Shields – Indirect Prompt Attack Detection. You should identify parts of the input that are a document (e.g. retrieved website, email, etc.) with the following document delimiter.
803+
In addition to detection on last user content, Azure OpenAI also supports the detection of specific risks inside context documents via Prompt Shields – Indirect Prompt Attack Detection. You should identify parts of the input that are a document (for example, retrieved website, email, etc.) with the following document delimiter.
804804
805805
```
806806
<documents>
@@ -812,7 +812,7 @@ When you do so, the following options are available for detection on tagged docu
812812
- On each tagged “document” content, detect the following categories:
813813
- Indirect attacks (optional)
814814
815-
Here is an example chat completion messages array:
815+
Here's an example chat completion messages array:
816816
817817
```json
818818
{"role": "system", "content": "Provide some context and/or instructions to the model, including document context. \"\"\" <documents>\n*insert your document content here*\n<\\documents> \"\"\""},
@@ -848,21 +848,21 @@ The escaped text in a chat completion context would read:
848848
849849
## Content streaming
850850
851-
This section describes the Azure OpenAI content streaming experience and options. Customers have the option to receive content from the API as it's generated, instead of waiting for chunks of content that have been verified to pass your content filters.
851+
This section describes the Azure OpenAI content streaming experience and options. Customers can receive content from the API as it's generated, instead of waiting for chunks of content that have been verified to pass your content filters.
852852
853853
### Default
854854
855855
The content filtering system is integrated and enabled by default for all customers. In the default streaming scenario, completion content is buffered, the content filtering system runs on the buffered content, and – depending on the content filtering configuration – content is either returned to the user if it doesn't violate the content filtering policy (Microsoft's default or a custom user configuration), or it’s immediately blocked and returns a content filtering error, without returning the harmful completion content. This process is repeated until the end of the stream. Content is fully vetted according to the content filtering policy before it's returned to the user. Content isn't returned token-by-token in this case, but in “content chunks” of the respective buffer size.
856856
857857
### Asynchronous Filter
858858
859-
Customers can choose the Asynchronous Filter as an additional option, providing a new streaming experience. In this case, content filters are run asynchronously, and completion content is returned immediately with a smooth token-by-token streaming experience. No content is buffered, which allows for a fast streaming experience with zero latency associated with content safety.
859+
Customers can choose the Asynchronous Filter as an extra option, providing a new streaming experience. In this case, content filters are run asynchronously, and completion content is returned immediately with a smooth token-by-token streaming experience. No content is buffered, which allows for a fast streaming experience with zero latency associated with content safety.
860860
861-
Customers must be aware that while the feature improves latency, it's a trade-off against the safety and real-time vetting of smaller sections of model output. Because content filters are run asynchronously, content moderation messages and policy violation signals are delayed, which means some sections of harmful content that would otherwise have been filtered immediately could be displayed to the user.
861+
Customers must understand that while the feature improves latency, it's a trade-off against the safety and real-time vetting of smaller sections of model output. Because content filters are run asynchronously, content moderation messages and policy violation signals are delayed, which means some sections of harmful content that would otherwise have been filtered immediately could be displayed to the user.
862862
863-
**Annotations**: Annotations and content moderation messages are continuously returned during the stream. We strongly recommend you consume annotations in your app and implement additional AI content safety mechanisms, such as redacting content or returning additional safety information to the user.
863+
**Annotations**: Annotations and content moderation messages are continuously returned during the stream. We strongly recommend you consume annotations in your app and implement other AI content safety mechanisms, such as redacting content or returning other safety information to the user.
864864
865-
**Content filtering signal**: The content filtering error signal is delayed. In case of a policy violation, it’s returned as soon as it’s available, and the stream is stopped. The content filtering signal is guaranteed within a ~1,000-character window of the policy-violating content.
865+
**Content filtering signal**: The content filtering error signal is delayed. If there is a policy violation, it’s returned as soon as it’s available, and the stream is stopped. The content filtering signal is guaranteed within a ~1,000-character window of the policy-violating content.
866866
867867
**Customer Copyright Commitment**: Content that is retroactively flagged as protected material may not be eligible for Customer Copyright Commitment coverage.
868868
@@ -960,7 +960,7 @@ data: {
960960
961961
#### Sample response stream (passes filters)
962962
963-
Below is a real chat completion response using Asynchronous Filter. Note how the prompt annotations aren't changed, completion tokens are sent without annotations, and new annotation messages are sent without tokens&mdash;they are instead associated with certain content filter offsets.
963+
Below is a real chat completion response using Asynchronous Filter. Note how the prompt annotations aren't changed, completion tokens are sent without annotations, and new annotation messages are sent without tokens&mdash;they're instead associated with certain content filter offsets.
964964
965965
`{"temperature": 0, "frequency_penalty": 0, "presence_penalty": 1.0, "top_p": 1.0, "max_tokens": 800, "messages": [{"role": "user", "content": "What is color?"}], "stream": true}`
966966

0 commit comments

Comments
 (0)