Commit fe70a60

[WAF] Update Firewall for AI

1 parent b6f9f3f

File tree

2 files changed: +54 −9 lines

src/content/docs/waf/detections/firewall-for-ai.mdx

Lines changed: 16 additions & 9 deletions
@@ -12,11 +12,14 @@ sidebar:
 
 import { Tabs, TabItem, Details } from "~/components";
 
-Firewall for AI is a detection that can help protect your services powered by large language models (LLMs) against abuse. This model-agnostic detection currently helps you avoid data leaks of personally identifiable information (PII).
+Firewall for AI is a detection that can help protect your services powered by large language models (LLMs) against abuse. This model-agnostic detection currently helps you do the following:
 
-When enabled, the detection runs on incoming traffic, searching for any LLM prompts attempting to exploit the model in order to extract data.
+- Prevent data leaks of personally identifiable information (PII) — for example, phone numbers, email addresses, social security numbers, and credit card numbers.
+- Detect and moderate unsafe or harmful prompts — for example, prompts potentially related to violent crimes.
 
-Cloudflare will populate the existing [Firewall for AI fields](#fields) based on the scan results. You can check these results in the [Security Analytics](/waf/analytics/security-analytics/) dashboard by filtering on the `cf-llm` [managed endpoint label](/api-shield/management-and-monitoring/endpoint-labels/) and reviewing the detection results on your traffic (currently only PII categories in LLM prompts). Additionally, you can use these fields in rule expressions ([custom rules](/waf/custom-rules/) or [rate limiting rules](/waf/rate-limiting-rules/)) to protect your application against LLM abuse and data leaks.
+When enabled, the detection runs on incoming traffic, searching for any LLM prompts attempting to exploit the model.
+
+Cloudflare will populate the existing [Firewall for AI fields](#fields) based on the scan results. You can check these results in the [Security Analytics](/waf/analytics/security-analytics/) dashboard by filtering on the `cf-llm` [managed endpoint label](/api-shield/management-and-monitoring/endpoint-labels/) and reviewing the detection results on your traffic. Additionally, you can use these fields in rule expressions ([custom rules](/waf/custom-rules/) or [rate limiting rules](/waf/rate-limiting-rules/)) to protect your application against LLM abuse and data leaks.
 
 ## Availability
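The rule expressions mentioned above can be sketched directly against these fields. For example, a minimal custom rule expression that matches whenever either detection fires (a sketch; the action you pair it with, such as block or challenge, is up to your configuration):

```txt
(cf.llm.prompt.pii_detected or cf.llm.prompt.unsafe_topic_detected)
```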
@@ -61,7 +64,7 @@ curl "https://<YOUR_HOSTNAME>/api/v1/" \
 
 The PII category for this request would be `EMAIL_ADDRESS`.
 
-Then, use [Security Analytics](/waf/analytics/security-analytics/) in the new application security dashboard to validate that the WAF is correctly detecting prompts leaking PII data in incoming requests. Filter data by the `cf-llm` managed endpoint label and review the detection results on your traffic.
+Then, use [Security Analytics](/waf/analytics/security-analytics/) in the new application security dashboard to validate that the WAF is correctly detecting potentially harmful prompts in incoming requests. Filter data by the `cf-llm` managed endpoint label and review the detection results on your traffic.
 
 Alternatively, create a custom rule like the one described in the next step using a _Log_ action. This rule will generate [security events](/waf/analytics/security-events/) that will allow you to validate your configuration.
@@ -114,10 +117,14 @@ You can combine the previous expression with other [fields](/ruleset-engine/rule
 
 When enabled, Firewall for AI populates the following fields:
 
-| Field name in the dashboard | Field |
-| --------------------------- | ----- |
-| LLM PII Detected | [`cf.llm.prompt.pii_detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_detected/) |
-| LLM PII Categories | [`cf.llm.prompt.pii_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/) |
-| LLM Content Detected | [`cf.llm.prompt.detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.detected/) |
+| Field name in the dashboard | Data type | Field |
+| --------------------------- | --------------- | ----- |
+| LLM PII Detected | `Boolean` | [`cf.llm.prompt.pii_detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_detected/) |
+| LLM PII Categories | `Array<String>` | [`cf.llm.prompt.pii_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/) |
+| LLM Content Detected | `Boolean` | [`cf.llm.prompt.detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.detected/) |
+| LLM Unsafe topic detected | `Boolean` | [`cf.llm.prompt.unsafe_topic_detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_detected/) |
+| LLM Unsafe topic categories | `Array<String>` | [`cf.llm.prompt.unsafe_topic_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_categories/) |
 
 For a list of PII categories, refer to the [`cf.llm.prompt.pii_categories` field reference](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/).
+
+For a list of unsafe topic categories, refer to the [`cf.llm.prompt.unsafe_topic_categories` field reference](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_categories/).
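Following the same pattern as the PII example expression already in the fields index, a sketch of an expression that matches only specific unsafe topic categories (the categories `S1` and `S2` here are illustrative choices, not a recommendation):

```txt
(cf.llm.prompt.unsafe_topic_detected and any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S2"}))
```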

src/content/fields/index.yaml

Lines changed: 38 additions & 0 deletions
@@ -1202,6 +1202,44 @@ entries:
       # Matches requests where PII categorized as "EMAIL_ADDRESS" or "IBAN_CODE" was detected:
       (cf.llm.prompt.pii_detected and any(cf.llm.prompt.pii_categories[*] in {"EMAIL_ADDRESS" "IBAN_CODE"}))
 
+  - name: cf.llm.prompt.unsafe_topic_detected
+    data_type: Boolean
+    categories: [Request]
+    keywords: [request, cloudflare, ai, client, visitor]
+    plan_info_label: Enterprise
+    summary: Indicates whether the incoming request includes any unsafe topic category in the LLM prompt.
+    description: |-
+      Equivalent to checking if the [`cf.llm.prompt.unsafe_topic_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_categories/) field is not empty.
+
+      Requires a Cloudflare Enterprise plan. You must also enable [Firewall for AI](/waf/detections/firewall-for-ai/).
+
+  - name: cf.llm.prompt.unsafe_topic_categories
+    data_type: Array<String>
+    categories: [Request]
+    keywords: [request, cloudflare, ai, client, visitor]
+    plan_info_label: Enterprise
+    summary: Array of string values with the types of unsafe topics detected in the LLM prompt.
+    description: |-
+      The possible values are the following:
+
+      | Value | Category name             | Description |
+      | ----- | ------------------------- | ----------- |
+      | `S1`  | Violent Crimes            | Violent crimes against people or animals. |
+      | `S2`  | Non-Violent Crimes        | Non-violent offenses such as fraud, theft, drug creation, or hacking. |
+      | `S3`  | Sex-Related Crimes        | Sex-related crimes, including trafficking, assault, and harassment. |
+      | `S4`  | Child Sexual Exploitation | Sexual exploitation of children. |
+      | `S5`  | Defamation                | False statements likely to damage a living person's reputation. |
+      | `S6`  | Specialized Advice        | Specialized financial, medical, or legal advice, or content that misrepresents dangerous things as safe. |
+      | `S7`  | Privacy                   | Sensitive, nonpublic personal information that could endanger an individual. |
+      | `S8`  | Intellectual Property     | Content that violates a third party's intellectual property rights. |
+      | `S9`  | Indiscriminate Weapons    | Creation of indiscriminate weapons such as chemical, biological, or nuclear arms. |
+      | `S10` | Hate                      | Content that demeans or dehumanizes people based on their race, religion, sexual orientation, or other personal characteristics. |
+      | `S11` | Suicide & Self-Harm       | Content that encourages or endorses suicide, self-injury, or disordered eating. |
+      | `S12` | Sexual Content            | Erotic content. |
+      | `S13` | Elections                 | False information about the time, place, or manner of voting in elections. |
+
+      Requires a Cloudflare Enterprise plan. You must also enable [Firewall for AI](/waf/detections/firewall-for-ai/).
+
 - name: cf.worker.upstream_zone
   data_type: String
   categories: [Request]
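The new fields combine with existing request fields in the usual way. For instance, a sketch of a rate limiting rule expression that throttles unsafe prompts on one endpoint (the `/api/` path prefix is a hypothetical example, not part of this commit):

```txt
(cf.llm.prompt.unsafe_topic_detected and starts_with(http.request.uri.path, "/api/"))
```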
