---
title: Azure API Management policy reference - llm-content-safety
description: Reference for the llm-content-safety policy available for use in Azure API Management. Provides policy usage, settings, and examples.
services: api-management
author: dlepow

ms.service: azure-api-management
ms.collection: ce-skilling-ai-copilot
ms.custom:
ms.topic: article
ms.date: 03/04/2025
ms.author: danlep
---

# Enforce content safety checks on LLM requests

[!INCLUDE [api-management-availability-premium-dev-standard-basic-premiumv2-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-premiumv2-standardv2-basicv2.md)]

The `llm-content-safety` policy enforces content safety checks on large language model (LLM) requests (prompts) by transmitting them to the [Azure AI Content Safety](/azure/ai-services/content-safety/overview) service before forwarding them to the backend LLM API. When the policy is enabled and Azure AI Content Safety detects malicious content, API Management blocks the request and returns a `403` error code.

Use the policy in scenarios such as the following:

* Block requests that contain predefined categories of harmful content or hate speech
* Apply custom blocklists to prevent specific content from being sent
* Shield against prompts that match attack patterns

[!INCLUDE [api-management-policy-generic-alert](../../includes/api-management-policy-generic-alert.md)]

## Prerequisites

* An [Azure AI Content Safety](/azure/ai-services/content-safety/) resource.
* An API Management [backend](backends.md) that routes content safety API calls to the Azure AI Content Safety service and authenticates to it, with a runtime URL in the form `https://<content-safety-service-name>.cognitiveservices.azure.com`. A managed identity with the Cognitive Services User role is recommended for authentication.

## Policy statement

```xml
<llm-content-safety backend-id="name of backend entity" shield-prompt="true | false">
    <categories output-type="FourSeverityLevels | EightSeverityLevels">
        <category name="Hate | SelfHarm | Sexual | Violence" threshold="integer" />
        <!-- If there are multiple categories, add more category elements -->
        [...]
    </categories>
    <blocklists>
        <id>blocklist-identifier</id>
        <!-- If there are multiple blocklists, add more id elements -->
        [...]
    </blocklists>
</llm-content-safety>
```

## Attributes

| Attribute | Description | Required | Default |
| --------- | ----------- | -------- | ------- |
| backend-id | Identifier (name) of the Azure AI Content Safety backend to route content safety API calls to. Policy expressions are allowed. | Yes | N/A |
| shield-prompt | If set to `true`, content is checked for user attacks. Otherwise, this check is skipped. Policy expressions are allowed. | No | `false` |

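Because both attributes accept policy expressions, their values can be computed at runtime. As a minimal sketch (the `X-Enable-Shield` header name is hypothetical, and `content-safety-backend` is assumed to be an existing backend), the following enables prompt shielding only when the caller sends that header:

```xml
<llm-content-safety backend-id="content-safety-backend"
    shield-prompt="@(context.Request.Headers.GetValueOrDefault("X-Enable-Shield", "false") == "true")">
    <categories output-type="FourSeverityLevels">
        <category name="Sexual" threshold="4" />
    </categories>
</llm-content-safety>
```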
| 59 | + |
## Elements

| Element | Description | Required |
| ------- | ----------- | -------- |
| categories | A list of `category` elements that specify settings for blocking requests when the category is detected. | No |
| blocklists | A list of [blocklist](/azure/ai-services/content-safety/how-to/use-blocklist) `id` elements from the Azure AI Content Safety instance for which detection causes the request to be blocked. Policy expressions are allowed. See the second example later in this article. | No |

### categories attributes

| Attribute | Description | Required | Default |
| --------- | ----------- | -------- | ------- |
| output-type | Specifies how severity levels are returned by Azure AI Content Safety. The attribute must have one of the following values.<br /><br />- `FourSeverityLevels`: Output severities in four levels: 0, 2, 4, 6.<br/>- `EightSeverityLevels`: Output severities in eight levels: 0, 1, 2, 3, 4, 5, 6, 7.<br/><br/>Policy expressions are allowed. | No | `FourSeverityLevels` |

| 73 | + |
### category attributes

| Attribute | Description | Required | Default |
| --------- | ----------- | -------- | ------- |
| name | Specifies the name of this category. The attribute must have one of the following values: `Hate`, `SelfHarm`, `Sexual`, `Violence`. Policy expressions are allowed. | Yes | N/A |
| threshold | Specifies the threshold value for this category at which requests are blocked. Requests with content severities less than the threshold aren't blocked. The value must be between 0 and 7. Policy expressions are allowed. | Yes | N/A |

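Because severities below the threshold pass through, the threshold marks the first blocked level. For example, with the default `FourSeverityLevels` output (severities 0, 2, 4, 6), the following fragment blocks content that Azure AI Content Safety scores at severity 4 or 6, while content scored 0 or 2 is allowed:

```xml
<categories output-type="FourSeverityLevels">
    <!-- Blocks Violence content scored at severity 4 or 6; severity 0 or 2 passes -->
    <category name="Violence" threshold="4" />
</categories>
```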
| 81 | + |
## Usage

- [**Policy sections:**](./api-management-howto-policies.md#sections) inbound
- [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API
- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted, workspace

### Usage notes

* The policy runs on a concatenation of all text content in a completion or chat completion request.
* If the request exceeds the character limit of Azure AI Content Safety, a `403` error is returned.
* This policy can be used multiple times per policy definition.

## Example

The following example enforces content safety checks on LLM requests using the Azure AI Content Safety service. The policy blocks requests that contain content in the `Hate` or `Violence` category with a severity level of 4 or higher. The `shield-prompt` attribute is set to `true` to check for adversarial attacks.

```xml
<policies>
    <inbound>
        <llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
            <categories output-type="EightSeverityLevels">
                <category name="Hate" threshold="4" />
                <category name="Violence" threshold="4" />
            </categories>
        </llm-content-safety>
    </inbound>
</policies>
```
| 111 | + |
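The following variant of the preceding policy also applies a custom blocklist. The blocklist ID `contoso-blocklist` is hypothetical; substitute the ID of a blocklist defined in your Azure AI Content Safety resource:

```xml
<policies>
    <inbound>
        <llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
            <categories output-type="FourSeverityLevels">
                <category name="Hate" threshold="4" />
            </categories>
            <blocklists>
                <!-- Detection against this blocklist causes the request to be blocked -->
                <id>contoso-blocklist</id>
            </blocklists>
        </llm-content-safety>
    </inbound>
</policies>
```
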
## Related policies

* [Content validation](api-management-policies.md#content-validation)
* [llm-token-limit](llm-token-limit-policy.md) policy
* [llm-emit-token-metric](llm-emit-token-metric-policy.md) policy

[!INCLUDE [api-management-policy-ref-next-steps](../../includes/api-management-policy-ref-next-steps.md)]