src/content/docs/waf/detections/firewall-for-ai.mdx (19 additions & 2 deletions)
@@ -16,6 +16,7 @@ Firewall for AI is a detection that can help protect your services powered by la
 - Prevent data leaks of personally identifiable information (PII) — for example, phone numbers, email addresses, social security numbers, and credit card numbers.
 - Detect and moderate unsafe or harmful prompts – for example, prompts potentially related to violent crimes.
+- Detect prompts intentionally designed to subvert the intended behavior of the LLM as specified by the developer – for example, prompt injection attacks and jailbreaking attempts.
 
 When enabled, the detection runs on incoming traffic, searching for any LLM prompts attempting to exploit the model.
@@ -124,14 +125,15 @@ When enabled, Firewall for AI populates the following fields:
 For a list of PII categories, refer to the [`cf.llm.prompt.pii_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/) field reference.
 
 For a list of unsafe topic categories, refer to the [`cf.llm.prompt.unsafe_topic_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_categories/) field reference.
 
 ## Example mitigation rules
 
-### Block requests with specific PII category in prompt
+### Block requests with specific PII category in LLM prompt
 
 The following example [custom rule](/waf/custom-rules/create-dashboard/) will block requests with an LLM prompt that tries to obtain PII of a specific [category](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/):
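The PII rule described here is essentially a set-membership check over the categories the detection found in the prompt. A minimal Python sketch of that matching logic, assuming hypothetical category labels and a function name of my own choosing (this is not the Cloudflare Rules language, just an illustration of what `any(cf.llm.prompt.pii_categories[*] in {...})` evaluates):

```python
def pii_rule_matches(detected_categories, blocked_categories):
    # Mirrors any(cf.llm.prompt.pii_categories[*] in {...}):
    # the rule fires when any detected PII category is in the blocked set.
    return any(category in blocked_categories for category in detected_categories)

# Hypothetical category labels, for illustration only.
blocked = {"EMAIL_ADDRESS", "CREDIT_CARD"}
print(pii_rule_matches(["EMAIL_ADDRESS"], blocked))  # True
print(pii_rule_matches(["PHONE_NUMBER"], blocked))   # False
```

A prompt with no detected PII categories yields an empty list, so the rule never matches it.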
@@ -146,7 +148,7 @@ The following example [custom rule](/waf/custom-rules/create-dashboard/) will bl
 - **Action**: _Block_
 
-### Block requests with specific unsafe content categories in prompt
+### Block requests with specific unsafe content categories in LLM prompt
 
 The following example [custom rule](/waf/custom-rules/create-dashboard/) will block requests with an LLM prompt containing unsafe content of specific [categories](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_categories/):
@@ -160,3 +162,18 @@ The following example [custom rule](/waf/custom-rules/create-dashboard/) will bl
 `(any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S10"}))`
 
 - **Action**: _Block_
+
+### Block requests with prompt injection attempt in LLM prompt
+
+The following example [custom rule](/waf/custom-rules/create-dashboard/) will block requests with an [injection score](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.injection_score/) below `20`. Using a low injection score value in the rule helps avoid false positives.
+
+- **If incoming requests match**:
+
+  | Field               | Operator  | Value |
+  | ------------------- | --------- | ----- |
+  | LLM Injection score | less than | `20`  |
+
+If you use the Expression Editor, enter the following expression:<br />
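The injection-score rule in the table above is a plain threshold comparison. A hedged sketch of that logic in Python (the function name is an assumption; the score semantics follow the field description: scores run 1–99, lower means a more likely injection attempt, and `100` means the request was not scored):

```python
def injection_rule_matches(injection_score):
    # Mirrors the rule "LLM Injection score less than 20".
    # Scores 1-99: lower = more likely a prompt injection attempt.
    # The special score 100 (unscored) never satisfies this threshold.
    return injection_score < 20

print(injection_rule_matches(7))    # True  (likely injection attempt)
print(injection_rule_matches(85))   # False (likely benign)
print(injection_rule_matches(100))  # False (request was not scored)
```

Because unscored requests carry the sentinel value `100`, a "less than" comparison conveniently skips them without any extra condition in the rule.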
+summary: A score from 1–99 that represents the likelihood that the LLM prompt in the request is trying to perform a prompt injection attack.
+description: |-
+  A low score (for example, below `20`) indicates that there is a high probability that the LLM prompt in the request is trying to perform a prompt injection attack.
+
+  The special score `100` indicates that Cloudflare did not score the request.
+
+  Requires a Cloudflare Enterprise plan. You must also enable [Firewall for AI](/waf/detections/firewall-for-ai/).