You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-studio/ai-services/content-safety-overview.md
+2-18Lines changed: 2 additions & 18 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -23,7 +23,7 @@ You can use Azure AI Content Safety for many scenarios:
23
23
**Text content**:
24
24
- Moderate text content: This feature scans and moderates text content, identifying and categorizing it based on different levels of severity to ensure appropriate responses.
25
25
- Groundedness detection: This filter determines if the AI's responses are based on trusted, user-provided sources, ensuring that the answers are "grounded" in the intended material. Groundedness detection is helpful for improving the reliability and factual accuracy of responses.
26
-
- Protected material detection for text: This feature identifies protected text material, such as known song lyrics, articles, or other content, ensuring that the AI doesn’t output this content without permission.
26
+
- Protected material detection for text: This feature identifies protected text material, such as known song lyrics, articles, or other content, ensuring that the AI doesn't output this content without permission.
27
27
- Protected material detection for code: Detects code segments in the model's output that match known code from public repositories, helping to prevent uncredited or unauthorized reproduction of source code.
28
28
- Prompt shields: This feature provides a unified API to address "Jailbreak" and "Indirect Attacks":
29
29
- Jailbreak Attacks: Attempts by users to manipulate the AI into bypassing its safety protocols or ethical guidelines. Examples include prompts designed to trick the AI into giving inappropriate responses or performing tasks it was programmed to avoid.
@@ -39,23 +39,7 @@ You can use Azure AI Content Safety for many scenarios:
39
39
40
40
## Understand harm categories
41
41
42
-
### Harm categories
43
-
44
-
| Category | Description |API term |
45
-
| --------- | ------------------- | --- |
46
-
| Hate and Fairness | Hate and fairness harms refer to any content that attacks or uses discriminatory language with reference to a person or identity group based on certain differentiating attributes of these groups. <br><br>This includes, but is not limited to:<ul><li>Race, ethnicity, nationality</li><li>Gender identity groups and expression</li><li>Sexual orientation</li><li>Religion</li><li>Personal appearance and body size</li><li>Disability status</li><li>Harassment and bullying</li></ul> |`Hate`|
47
-
| Sexual | Sexual describes language related to anatomical organs and genitals, romantic relationships and sexual acts, acts portrayed in erotic or affectionate terms, including those portrayed as an assault or a forced sexual violent act against one’s will. <br><br> This includes but is not limited to:<ul><li>Vulgar content</li><li>Prostitution</li><li>Nudity and Pornography</li><li>Abuse</li><li>Child exploitation, child abuse, child grooming</li></ul> |`Sexual`|
48
-
| Violence | Violence describes language related to physical actions intended to hurt, injure, damage, or kill someone or something; describes weapons, guns, and related entities. <br><br>This includes, but isn't limited to: <ul><li>Weapons</li><li>Bullying and intimidation</li><li>Terrorist and violent extremism</li><li>Stalking</li></ul> |`Violence`|
49
-
| Self-Harm | Self-harm describes language related to physical actions intended to purposely hurt, injure, damage one’s body or kill oneself. <br><br> This includes, but isn't limited to: <ul><li>Eating Disorders</li><li>Bullying and intimidation</li></ul> |`SelfHarm`|
50
-
51
-
### Severity levels
52
-
53
-
| Level | Description |
54
-
| --- | ---|
55
-
|Safe |Content might be related to violence, self-harm, sexual, or hate categories but the terms are used in general, journalistic, scientific, medical, and similar professional contexts, which are appropriate for most audiences. |
56
-
|Low |Content that expresses prejudiced, judgmental, or opinionated views, includes offensive use of language, stereotyping, use cases exploring a fictional world (for example, gaming, literature) and depictions at low intensity.|
57
-
|Medium |Content that uses offensive, insulting, mocking, intimidating, or demeaning language towards specific identity groups, includes depictions of seeking and executing harmful instructions, fantasies, glorification, promotion of harm at medium intensity. |
58
-
|High |Content that displays explicit and severe harmful instructions, actions, damage, or abuse; includes endorsement, glorification, or promotion of severe harmful acts, extreme or illegal forms of harm, radicalization, or nonconsensual power exchange or abuse. |
Copy file name to clipboardExpand all lines: articles/ai-studio/concepts/model-catalog-content-safety.md
+7-7Lines changed: 7 additions & 7 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -4,7 +4,7 @@ titleSuffix: Azure AI Foundry
4
4
description: Learn about content safety for models deployed using serverless APIs, using Azure AI Foundry.
5
5
manager: scottpolly
6
6
ms.service: azure-ai-foundry
7
-
ms.topic: how-to
7
+
ms.topic: conceptual
8
8
ms.date: 01/21/2025
9
9
ms.author: mopeakande
10
10
author: msakande
@@ -22,10 +22,10 @@ In this article, learn about content safety capabilities for models from the mod
22
22
23
23
Azure AI uses a default configuration of [Azure AI Content Safety](/azure/ai-services/content-safety/overview) content filters that detect harmful content across four categories hate and fairness, self-harm, sexual, and violence for models deployed via serverless APIs. To learn more about content filtering (preview), see [Harm categories in Azure AI Content Safety](/azure/ai-services/content-safety/concepts/harm-categories).
24
24
25
-
The default content filtering configuration for text models is set to filter at the medium severity threshold, filtering any detected content at this level or higher. For image models, the default content filtering configuration is set at the low configuration threshold, filtering at this level or higher. Models deployed using the [Azure AI model inference service](/articles/ai-foundry/model-inference/how-to/configure-content-filters.md)can create configurable filters by clicking the **Content filters** tab within the **Safety + security** page.
25
+
The default content filtering configuration for text models is set to filter at the medium severity threshold, filtering any detected content at this level or higher. For image models, the default content filtering configuration is set at the low configuration threshold, filtering at this level or higher. Models deployed using the [Azure AI model inference service](../../ai-foundry/model-inference/how-to/configure-content-filters.md)can create configurable filters by clicking the **Content filters** tab within the **Safety + security** page.
26
26
27
27
> [!TIP]
28
-
> Content filtering (preview) is not available for certain model types that are deployed via serverless APIs. These model types include embedding models and time series models.
28
+
> Content filtering (preview) isn't available for certain model types that are deployed via serverless APIs. These model types include embedding models and time series models.
29
29
30
30
Content filtering (preview) occurs synchronously as the service processes prompts to generate content. You might be billed separately according to [Azure AI Content Safety pricing](https://azure.microsoft.com/pricing/details/cognitive-services/content-safety/) for such use. You can disable content filtering (preview) for individual serverless endpoints either:
31
31
@@ -34,14 +34,14 @@ Content filtering (preview) occurs synchronously as the service processes prompt
34
34
35
35
Suppose you decide to use an API other than the [Azure AI Model Inference API](/azure/ai-studio/reference/reference-model-inference-api) to work with a model that is deployed via a serverless API. In such a situation, content filtering (preview) isn't enabled unless you implement it separately by using Azure AI Content Safety. To get started with Azure AI Content Safety, see [Quickstart: Analyze text content](/azure/ai-services/content-safety/quickstart-text). You run a higher risk of exposing users to harmful content if you don't use content filtering (preview) when working with models that are deployed via serverless APIs.
Pricing details are viewable at [Azure AI Content Safety pricing](https://azure.microsoft.com/pricing/details/cognitive-services/content-safety/). Charges are incurred when the Azure AI Content Safety validates the prompt or completion. If Azure AI Content Safety blocks the prompt or completion, you're charged for both the evaluation of the content and the inference calls.
42
42
43
43
## Related content
44
44
45
-
-[How to configure content filters (preview) for models in Azure AI services](/articles/ai-foundry/model-inference/how-to/configure-content-filters.md)
46
-
-[Azure AI Content Safety Overview](/articles/ai-services/content-safety/overview.md)
47
-
-[Model catalog and collections in Azure AI Foundry portal](/articles/ai-studio/how-to/model-catalog-overview.md)
45
+
-[How to configure content filters (preview) for models in Azure AI services](../../ai-foundry/model-inference/how-to/configure-content-filters.md)
46
+
-[What is Azure AI Content Safety?](../../ai-services/content-safety/overview.md)
47
+
-[Model catalog and collections in Azure AI Foundry portal](../how-to/model-catalog-overview.md)
Copy file name to clipboardExpand all lines: articles/ai-studio/includes/content-safety-harm-categories.md
+5-5Lines changed: 5 additions & 5 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -14,16 +14,16 @@ ms.custom: include
14
14
15
15
| Category | Description |API term |
16
16
| --------- | ------------------- | --- |
17
-
| Hate and Fairness | Hate and fairness harms refer to any content that attacks or uses discriminatory language with reference to a person or identity group based on certain differentiating attributes of these groups. <br><br>This includes, but is not limited to:<ul><li>Race, ethnicity, nationality</li><li>Gender identity groups and expression</li><li>Sexual orientation</li><li>Religion</li><li>Personal appearance and body size</li><li>Disability status</li><li>Harassment and bullying</li></ul> |`Hate`|
18
-
| Sexual | Sexual describes language related to anatomical organs and genitals, romantic relationships and sexual acts, acts portrayed in erotic or affectionate terms, including those portrayed as an assault or a forced sexual violent act against one’s will. <br><br> This includes but is not limited to:<ul><li>Vulgar content</li><li>Prostitution</li><li>Nudity and Pornography</li><li>Abuse</li><li>Child exploitation, child abuse, child grooming</li></ul> |`Sexual`|
17
+
| Hate and Fairness | Hate and fairness harms refer to any content that attacks or uses discriminatory language with reference to a person or identity group based on certain differentiating attributes of these groups. <br><br>This includes, but isn't limited to:<ul><li>Race, ethnicity, nationality</li><li>Gender identity groups and expression</li><li>Sexual orientation</li><li>Religion</li><li>Personal appearance and body size</li><li>Disability status</li><li>Harassment and bullying</li></ul> |`Hate`|
18
+
| Sexual | Sexual describes language related to anatomical organs and genitals, romantic relationships and sexual acts, acts portrayed in erotic or affectionate terms, including those portrayed as an assault or a forced sexual violent act against one's will. <br><br> This includes but isn't limited to:<ul><li>Vulgar content</li><li>Prostitution</li><li>Nudity and Pornography</li><li>Abuse</li><li>Child exploitation, child abuse, child grooming</li></ul> |`Sexual`|
19
19
| Violence | Violence describes language related to physical actions intended to hurt, injure, damage, or kill someone or something; describes weapons, guns, and related entities. <br><br>This includes, but isn't limited to: <ul><li>Weapons</li><li>Bullying and intimidation</li><li>Terrorist and violent extremism</li><li>Stalking</li></ul> |`Violence`|
20
-
| Self-Harm | Self-harm describes language related to physical actions intended to purposely hurt, injure, damage one’s body or kill oneself. <br><br> This includes, but isn't limited to: <ul><li>Eating Disorders</li><li>Bullying and intimidation</li></ul> |`SelfHarm`|
20
+
| Self-Harm | Self-harm describes language related to physical actions intended to purposely hurt, injure, damage one's body or kill oneself. <br><br> This includes, but isn't limited to: <ul><li>Eating Disorders</li><li>Bullying and intimidation</li></ul> |`SelfHarm`|
21
21
22
22
### Severity levels
23
23
24
24
| Level | Description |
25
25
| --- | ---|
26
-
|Safe |Content might be related to violence, self-harm, sexual, or hate categories but the terms are used in general, journalistic, scientific, medical, and similar professional contexts, which are appropriate for most audiences. |
27
-
|Low |Content that expresses prejudiced, judgmental, or opinionated views, includes offensive use of language, stereotyping, usecases exploring a fictional world (for example, gaming, literature) and depictions at low intensity.|
26
+
|Safe |Content might be related to violence, self-harm, sexual, or hate categories. However, the terms are used in general, journalistic, scientific, medical, and similar professional contexts, which are appropriate for most audiences. |
27
+
|Low |Content that expresses prejudiced, judgmental, or opinionated views, includes offensive use of language, stereotyping, use-cases exploring a fictional world (for example, gaming, literature) and depictions at low intensity.|
28
28
|Medium |Content that uses offensive, insulting, mocking, intimidating, or demeaning language towards specific identity groups, includes depictions of seeking and executing harmful instructions, fantasies, glorification, promotion of harm at medium intensity. |
29
29
|High |Content that displays explicit and severe harmful instructions, actions, damage, or abuse; includes endorsement, glorification, or promotion of severe harmful acts, extreme or illegal forms of harm, radicalization, or nonconsensual power exchange or abuse. |
Copy file name to clipboardExpand all lines: articles/ai-studio/includes/content-safety-serverless-models.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -13,7 +13,7 @@ ms.custom: include file
13
13
# Also used in Azure Machine Learning documentation
14
14
---
15
15
16
-
For language models deployed via serverless APIs, Azure AI implements a default configuration of [Azure AI Content Safety](/azure/ai-services/content-safety/overview) text moderation filters that detect harmful content such as hate, self-harm, sexual, and violent content. To learn more about content filtering (preview), see [Content Safety for serverless APIs](/articles/ai-studio/how-to/model-catalog-content-safety.md).
16
+
For language models deployed via serverless APIs, Azure AI implements a default configuration of [Azure AI Content Safety](../../ai-services/content-safety/overview.md) text moderation filters that detect harmful content such as hate, self-harm, sexual, and violent content. To learn more about content filtering (preview), see [Content Safety for serverless APIs](../concepts/model-catalog-content-safety.md).
17
17
18
18
> [!TIP]
19
19
> Content filtering (preview) is not available for certain model types that are deployed via serverless APIs. These model types include embedding models and time series models.
0 commit comments