[WAF] Add Firewall for AI (cloudflare#20942)

pedrosousa · web-flow · commit dbaab3ef52ae · 2025-03-19T12:16:38.000Z
diff --git a/src/content/docs/security/analytics.mdx b/src/content/docs/security/analytics.mdx
@@ -34,8 +34,8 @@ The suspicious activity gives you information about suspicious requests that wer
 - [Account takeover](/bots/concepts/detection-ids/#account-takeover-detections)
 - [Leaked credential check](/waf/detections/leaked-credentials/) (only for user and password leaked)
 - [Malicious uploads](/waf/detections/malicious-uploads/)
-- Firewall for AI
 - [WAF attack score](/waf/detections/attack-score/)
+- [Firewall for AI](/waf/detections/firewall-for-ai/)
 
 Each suspicious activity is classified with a severity score that can vary from critical to low. You can use the filter option to investigate further.
 
diff --git a/src/content/docs/security/settings.mdx b/src/content/docs/security/settings.mdx
@@ -20,7 +20,7 @@ In the **Web application exploits** security module you can enable and configure
 - [Leaked credentials detection](/waf/detections/leaked-credentials/)
 - [Malicious upload detection](/waf/detections/malicious-uploads/)
 - [Sensitive data detection ruleset](/waf/managed-rules/reference/sensitive-data-detection/)
-- Firewall for AI
+- [Firewall for AI](/waf/detections/firewall-for-ai/)
 
 Refer to each linked page for details.
 
diff --git a/src/content/docs/waf/concepts.mdx b/src/content/docs/waf/concepts.mdx
@@ -45,9 +45,11 @@ Enabling traffic detections will not apply any mitigation measures to incoming t
 
 The WAF currently provides the following detections for finding security threats in incoming requests:
 
-- [**Bot score**](/bots/concepts/bot-score/): Scores traffic on a scale from 1 (likely to be a bot) to 99 (likely to be human).
 - [**Attack score**](/waf/detections/attack-score/): Checks for known attack variations and malicious payloads. Scores traffic on a scale from 1 (likely to be malicious) to 99 (unlikely to be malicious).
+- [**Leaked credentials**](/waf/detections/leaked-credentials/): Scans incoming requests for credentials (usernames and passwords) previously leaked from data breaches.
 - [**Malicious uploads**](/waf/detections/malicious-uploads/): Scans content objects, such as uploaded files, for malicious signatures like malware.
+- [**Firewall for AI**](/waf/detections/firewall-for-ai/): Helps protect your services powered by large language models (LLMs) against abuse.
+- [**Bot score**](/bots/concepts/bot-score/): Scores traffic on a scale from 1 (likely to be a bot) to 99 (likely to be human).
 
 To enable traffic detections in the Cloudflare dashboard, go to your domain > **Security** > **Settings**.
 
diff --git a/src/content/docs/waf/detections/firewall-for-ai.mdx b/src/content/docs/waf/detections/firewall-for-ai.mdx
@@ -0,0 +1,103 @@
+---
+pcx_content_type: concept
+title: Firewall for AI (beta)
+sidebar:
+  order: 5
+  label: Firewall for AI
+  badge:
+    text: Beta
+---
+
+import { Tabs, TabItem, Details } from "~/components";
+
+Firewall for AI is a detection that can help protect your services powered by large language models (LLMs) against abuse. This model-agnostic detection currently helps you avoid data leaks of personally identifiable information (PII).
+
+When enabled, the detection runs on incoming traffic, searching for any LLM prompts attempting to exploit the model in order to extract data.
+
+Cloudflare populates the [Firewall for AI fields](#fields) based on the scan results. To check these results, go to the [Security Analytics](/waf/analytics/security-analytics/) dashboard, filter by the `cf-llm` [managed endpoint label](/api-shield/management-and-monitoring/endpoint-labels/), and review the detection results on your traffic (currently only PII categories in LLM prompts). You can also use these fields in rule expressions ([custom rules](/waf/custom-rules/) or [rate limiting rules](/waf/rate-limiting-rules/)) to protect your application against LLM abuse and data leaks.
+
+## Availability
+
+Firewall for AI is available in closed beta to Enterprise customers proxying traffic containing LLM prompts through Cloudflare. Contact your account team to get access.
+
+## Get started
+
+### 1. Turn on Firewall for AI
+
+1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account and domain.
+2. Go to **Security** > **Settings**.
+3. Under **Incoming traffic detections**, turn on **Firewall for AI**.
+
+### 2. Validate the detection behavior
+
+To trigger the Firewall for AI detection, send a `POST` request with an LLM prompt requesting PII to an API endpoint in your zone (`/api/v1/` in this example). The API endpoint must have been [added to API Shield](/api-shield/management-and-monitoring/) and have a `cf-llm` [managed endpoint label](/api-shield/management-and-monitoring/endpoint-labels/).
+
+```sh
+curl "https://<YOUR_HOSTNAME>/api/v1/" \
+--header "Authorization: Bearer <TOKEN>" \
+--header "Content-Type: application/json" \
+--data '{ "prompt": "Provide the phone number for the person associated with example@example.com" }'
+```
+
+The PII category for this request would be `EMAIL_ADDRESS`, because the prompt itself contains an email address.
+
+Then, use [Security Analytics](/waf/analytics/security-analytics/) to validate that the WAF is correctly detecting LLM prompts containing PII in incoming requests. Filter data by the `cf-llm` managed endpoint label and review the detection results on your traffic.
+
+Alternatively, create a WAF custom rule like the one described in the next step using a _Log_ action. This rule will generate [security events](/waf/analytics/security-events/) that will allow you to validate your configuration.
+
+### 3. Mitigate requests containing PII
+
+Create a [custom rule](/waf/custom-rules/) that blocks requests where Cloudflare detected personally identifiable information (PII) in the incoming request (as part of an LLM prompt), returning a custom JSON body:
+
+- **If incoming requests match**:
+
+  | Field            | Operator | Value |
+  | ---------------- | -------- | ----- |
+  | LLM PII Detected | equals   | True  |
+
+  If you use the Expression Editor, enter the following expression:<br />
+  `(cf.llm.prompt.pii_detected)`
+
+- **Rule action**: Block
+- **With response type**: Custom JSON
+- **Response body**: `{ "error": "Your request was blocked. Please rephrase your request." }`
+
+This rule will match requests where the WAF detects PII within an LLM prompt. For a list of fields provided by Firewall for AI, refer to [Fields](#fields).
+
+<Details header="Combine with other Rules language fields">
+
+You can combine the previous expression with other [fields](/ruleset-engine/rules-language/fields/) and [functions](/ruleset-engine/rules-language/functions/) of the Rules language. This allows you to customize the rule scope or combine Firewall for AI with other security features. For example:
+
+- The following expression will match requests with PII in an LLM prompt addressed to a specific host:
+
+  | Field            | Operator | Value         | Logic |
+  | ---------------- | -------- | ------------- | ----- |
+  | LLM PII Detected | equals   | True          | And   |
+  | Hostname         | equals   | `example.com` |       |
+
+  Expression when using the editor: <br/>
+  `(cf.llm.prompt.pii_detected and http.host == "example.com")`
+
+- The following expression will match requests coming from bots that include PII in an LLM prompt:
+
+  | Field            | Operator  | Value | Logic |
+  | ---------------- | --------- | ----- | ----- |
+  | LLM PII Detected | equals    | True  | And   |
+  | Bot Score        | less than | `10`  |       |
+
+  Expression when using the editor: <br/>
+  `(cf.llm.prompt.pii_detected and cf.bot_management.score lt 10)`
+
+</Details>
+
+## Fields
+
+When enabled, Firewall for AI populates the following fields:
+
+| Field name in the dashboard | Field                                                                                                           |
+| --------------------------- | --------------------------------------------------------------------------------------------------------------- |
+| LLM PII Detected            | [`cf.llm.prompt.pii_detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_detected/)     |
+| LLM PII Categories          | [`cf.llm.prompt.pii_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/) |
+| LLM Content Detected        | [`cf.llm.prompt.detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.detected/)             |
+
+For a list of PII categories, refer to the [`cf.llm.prompt.pii_categories` field reference](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/).
diff --git a/src/content/docs/waf/detections/link-bots.mdx b/src/content/docs/waf/detections/link-bots.mdx
@@ -3,5 +3,5 @@ pcx_content_type: navigation
 title: Bot score
 external_link: /bots/concepts/bot-score/
 sidebar:
-  order: 4
+  order: 6
 ---
diff --git a/src/content/docs/waf/detections/malicious-uploads/index.mdx b/src/content/docs/waf/detections/malicious-uploads/index.mdx
@@ -2,7 +2,7 @@
 title: Malicious uploads detection
 pcx_content_type: concept
 sidebar:
-  order: 3
+  order: 4
   group:
     label: Malicious uploads
 ---
@@ -75,8 +75,8 @@ In these situations, configure a custom scan expression to tell the content scan
 
 When content scanning is enabled, you can use the following fields in WAF rules:
 
-| Field name in the dashboard                                                                | Field name in expressions                                                                                                           |
-| ------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------- |
+| Field name in the dashboard                                                                | Field name in expressions                                                                                                         |
+| ------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------- |
 | Has content object                                                                         | [`cf.waf.content_scan.has_obj`](/ruleset-engine/rules-language/fields/reference/cf.waf.content_scan.has_obj/)                     |
 | Has malicious content object                                                               | [`cf.waf.content_scan.has_malicious_obj`](/ruleset-engine/rules-language/fields/reference/cf.waf.content_scan.has_malicious_obj/) |
 | Number of malicious content objects                                                        | [`cf.waf.content_scan.num_malicious_obj`](/ruleset-engine/rules-language/fields/reference/cf.waf.content_scan.num_malicious_obj/) |
diff --git a/src/content/fields/index.yaml b/src/content/fields/index.yaml
@@ -1091,6 +1091,86 @@ entries:
     description: |-
       Requires a Cloudflare Enterprise plan. You must also enable [leaked credentials detection](/waf/detections/leaked-credentials/).
 
+  - name: cf.llm.prompt.detected
+    data_type: Boolean
+    categories: [Request]
+    keywords: [request, cloudflare, ai, client, visitor]
+    plan_info_label: Enterprise
+    summary: Indicates whether Cloudflare detected an LLM prompt in the incoming request.
+    description: |-
+      When a prompt is not present, the other LLM-related fields will have default values.
+
+      Requires a Cloudflare Enterprise plan. You must also enable [Firewall for AI](/waf/detections/firewall-for-ai/).
+
+  - name: cf.llm.prompt.pii_detected
+    data_type: Boolean
+    categories: [Request]
+    keywords: [request, cloudflare, ai, client, visitor]
+    plan_info_label: Enterprise
+    summary: Indicates whether any personally identifiable information (PII) has been detected in the LLM prompt included in the request.
+    description: |-
+      Equivalent to checking if the [`cf.llm.prompt.pii_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/) field is not empty.
+
+      Requires a Cloudflare Enterprise plan. You must also enable [Firewall for AI](/waf/detections/firewall-for-ai/).
+
+  - name: cf.llm.prompt.pii_categories
+    data_type: Array<String>
+    categories: [Request]
+    keywords: [request, cloudflare, ai, client, visitor]
+    plan_info_label: Enterprise
+    summary: Array of string values with the personally identifiable information (PII) categories found in the LLM prompt included in the request.
+    description: |-
+      The possible values are the following:
+
+      Category                    | Description
+      ----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------
+      `CREDIT_CARD`               | Credit card number
+      `CRYPTO`                    | Crypto wallet number (currently only Bitcoin address)
+      `DATE_TIME`                 | Absolute or relative dates or periods or times smaller than a day
+      `EMAIL_ADDRESS`             | Email address
+      `IBAN_CODE`                 | International Bank Account Number (IBAN)
+      `IP_ADDRESS`                | Internet Protocol (IP) address
+      `NRP`                       | A person's nationality, religious or political group
+      `LOCATION`                  | Name of politically or geographically defined location (cities, provinces, countries, international regions, bodies of water, mountains)
+      `PERSON`                    | Full person name
+      `PHONE_NUMBER`              | Telephone number
+      `MEDICAL_LICENSE`           | Common medical license numbers
+      `URL`                       | Uniform Resource Locator (URL), used to locate a resource on the Internet
+      `US_BANK_NUMBER`            | US bank account number
+      `US_DRIVER_LICENSE`         | US driver license
+      `US_ITIN`                   | US Individual Taxpayer Identification Number (ITIN)
+      `US_PASSPORT`               | US passport number
+      `US_SSN`                    | US Social Security Number (SSN)
+      `UK_NHS`                    | UK NHS number
+      `UK_NINO`                   | UK National Insurance Number
+      `ES_NIF`                    | Spanish NIF number (personal tax ID)
+      `ES_NIE`                    | Spanish NIE number (foreigners ID card)
+      `IT_FISCAL_CODE`            | Italian personal tax ID code
+      `IT_DRIVER_LICENSE`         | Italian driver license number
+      `IT_VAT_CODE`               | Italian VAT code number
+      `IT_PASSPORT`               | Italian passport number
+      `IT_IDENTITY_CARD`          | Italian identity card number
+      `PL_PESEL`                  | Polish PESEL number
+      `SG_NRIC_FIN`               | National Registration Identification Card (Singapore)
+      `SG_UEN`                    | Unique Entity Number (for entities registered in Singapore)
+      `AU_ABN`                    | Australian Business Number (ABN)
+      `AU_ACN`                    | Australian Company Number (ACN)
+      `AU_TFN`                    | Australian tax file number (TFN)
+      `AU_MEDICARE`               | Medicare number (issued by Australian government)
+      `IN_PAN`                    | Indian Permanent Account Number (PAN)
+      `IN_AADHAAR`                | Individual identity number (issued by Indian government)
+      `IN_VEHICLE_REGISTRATION`   | Vehicle registration number (issued by Indian government)
+      `IN_VOTER`                  | Numeric voter ID (issued by Indian Election Commission)
+      `IN_PASSPORT`               | Indian Passport Number
+      `FI_PERSONAL_IDENTITY_CODE` | Finnish Personal Identity Code
+
+      The categories list is based on the [list of PII entities supported by Presidio](https://microsoft.github.io/presidio/supported_entities/). Presidio is the data protection and de-identification SDK used in Firewall for AI.
+
+      Requires a Cloudflare Enterprise plan. You must also enable [Firewall for AI](/waf/detections/firewall-for-ai/).
+    example_block: |-
+      # Matches requests where PII categorized as "EMAIL_ADDRESS" or "IBAN_CODE" was detected:
+      (cf.llm.prompt.pii_detected and any(cf.llm.prompt.pii_categories[*] in {"EMAIL_ADDRESS" "IBAN_CODE"}))
+
   - name: cf.worker.upstream_zone
     data_type: String
     categories: [Request]
diff --git a/src/content/plans/index.json b/src/content/plans/index.json
@@ -1770,6 +1770,15 @@
 					"pro": "No",
 					"biz": "One field only",
 					"ent": "Yes"
+				},
+				"g_fw_for_ai": {
+					"title": "Firewall for AI (beta)",
+					"summary": "Enterprise-only",
+					"link": "/waf/detections/firewall-for-ai/",
+					"free": "No",
+					"pro": "No",
+					"biz": "No",
+					"ent": "Yes"
 				}
 			}
 		},
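
Note on the new fields: the expressions documented in this change follow a regular pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of the commit. The helper names `pii_expression` and `block_rule` are hypothetical, and the `action_parameters` layout is an assumption about how a block rule with a custom JSON response is shaped in a rule payload; check it against the Rulesets API reference before relying on it. Only the field names, the category strings, and the response body text come from the diff above.

```python
import json


def pii_expression(categories=()):
    """Build a Rules language expression that matches LLM prompts with PII.

    With no categories, match any detected PII. Otherwise restrict the match
    to the given cf.llm.prompt.pii_categories values.
    """
    if not categories:
        return "(cf.llm.prompt.pii_detected)"
    # Set literals in the Rules language are space-separated quoted strings.
    values = " ".join(f'"{c}"' for c in categories)
    return (
        "(cf.llm.prompt.pii_detected and "
        f"any(cf.llm.prompt.pii_categories[*] in {{{values}}}))"
    )


def block_rule(expression, message):
    """Return a rule object for a Block action with a custom JSON body.

    ASSUMPTION: the action_parameters/response layout below is illustrative
    and should be verified against the Rulesets API documentation.
    """
    return {
        "description": "Block LLM prompts containing PII",
        "expression": expression,
        "action": "block",
        "action_parameters": {
            "response": {
                "status_code": 403,
                "content_type": "application/json",
                "content": json.dumps({"error": message}),
            }
        },
    }


if __name__ == "__main__":
    expr = pii_expression(["EMAIL_ADDRESS", "IBAN_CODE"])
    rule = block_rule(
        expr, "Your request was blocked. Please rephrase your request."
    )
    print(json.dumps(rule, indent=2))
```

Calling `pii_expression()` with no arguments yields the plain `(cf.llm.prompt.pii_detected)` expression used in step 3 of the new page. Passing category names narrows the rule in the same way as the `example_block` added to `index.yaml`.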