Commit a3fba1a

Add injection score field
1 parent e31376e commit a3fba1a

File tree

2 files changed: +35 -2 lines changed

src/content/docs/waf/detections/firewall-for-ai.mdx

Lines changed: 19 additions & 2 deletions
@@ -16,6 +16,7 @@ Firewall for AI is a detection that can help protect your services powered by la
 
 - Prevent data leaks of personally identifiable information (PII) — for example, phone numbers, email addresses, social security numbers, and credit card numbers.
 - Detect and moderate unsafe or harmful prompts – for example, prompts potentially related to violent crimes.
+- Detect prompts intentionally designed to subvert the intended behavior of the LLM as specified by the developer – for example, prompt injection attacks and jailbreaking attempts.
 
 When enabled, the detection runs on incoming traffic, searching for any LLM prompts attempting to exploit the model.

@@ -124,14 +124,15 @@ When enabled, Firewall for AI populates the following fields:
 | LLM Content Detected | `Boolean` | [`cf.llm.prompt.detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.detected/) |
 | LLM Unsafe topic detected | `Boolean` | [`cf.llm.prompt.unsafe_topic_detected`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_detected/) |
 | LLM Unsafe topic categories | `Array<String>` | [`cf.llm.prompt.unsafe_topic_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_categories/) |
+| LLM Injection score | `Number` | [`cf.llm.prompt.injection_score`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.injection_score/) |
 
 For a list of PII categories, refer to the [`cf.llm.prompt.pii_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/) field reference.
 
 For a list of unsafe topic categories, refer to the [`cf.llm.prompt.unsafe_topic_categories`](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_categories/) field reference.
 
 ## Example mitigation rules
 
-### Block requests with specific PII category in prompt
+### Block requests with specific PII category in LLM prompt
 
 The following example [custom rule](/waf/custom-rules/create-dashboard/) will block requests with an LLM prompt that tries to obtain PII of a specific [category](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.pii_categories/):

@@ -146,7 +148,7 @@ The following example [custom rule](/waf/custom-rules/create-dashboard/) will bl
 
 - **Action**: _Block_
 
-### Block requests with specific unsafe content categories in prompt
+### Block requests with specific unsafe content categories in LLM prompt
 
 The following example [custom rule](/waf/custom-rules/create-dashboard/) will block requests with an LLM prompt containing unsafe content of specific [categories](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.unsafe_topic_categories/):

@@ -160,3 +162,18 @@ The following example [custom rule](/waf/custom-rules/create-dashboard/) will bl
 `(any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S10"}))`
 
 - **Action**: _Block_
+
+### Block requests with prompt injection attempt in LLM prompt
+
+The following example [custom rule](/waf/custom-rules/create-dashboard/) will block requests with an [injection score](/ruleset-engine/rules-language/fields/reference/cf.llm.prompt.injection_score/) below `20`. Using a low injection score value in the rule helps avoid false positives.
+
+- **If incoming requests match**:
+
+| Field | Operator | Value |
+| ------------------- | --------- | ----- |
+| LLM Injection score | less than | `20` |
+
+If you use the Expression Editor, enter the following expression:<br />
+`(cf.llm.prompt.injection_score < 20)`
+
+- **Action**: _Block_
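The two example expressions above share the same shape: a numeric threshold check on the injection score and an array membership check on the unsafe topic categories. The following is a minimal Python sketch of that combined matching logic; the helper name and function are hypothetical (not Cloudflare code) and only mirror the field semantics documented in this diff, including the special score `100` for unscored requests.

```python
# Hypothetical sketch of the two example custom rules above.
# Assumptions: injection_score follows cf.llm.prompt.injection_score
# semantics (1-99, with the special value 100 meaning "not scored");
# unsafe_topic_categories mirrors cf.llm.prompt.unsafe_topic_categories.

def should_block(injection_score: int,
                 unsafe_topic_categories: list[str]) -> bool:
    """Return True when either example rule would match."""
    # (cf.llm.prompt.injection_score < 20) -- the special value 100
    # never matches this comparison, so unscored requests pass.
    if injection_score < 20:
        return True
    # (any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S10"}))
    if any(cat in {"S1", "S10"} for cat in unsafe_topic_categories):
        return True
    return False

print(should_block(15, []))      # low score, likely injection -> True
print(should_block(100, []))     # special value 100, not scored -> False
print(should_block(85, ["S1"]))  # unsafe category match -> True
```

Note that because the rule uses "less than", the sentinel score `100` can never trigger a block, which is consistent with treating unscored requests as unmatched.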

src/content/fields/index.yaml

Lines changed: 16 additions & 0 deletions
@@ -995,6 +995,8 @@ entries:
     plan_info_label: Enterprise
     summary: A global score from 1–99 that combines the score of each WAF attack vector into a single score.
     description: |-
+      The special score `100` indicates that Cloudflare did not score the request.
+
       This is the standard [WAF attack score](/waf/detections/attack-score/) to detect variants of attack patterns.
 
       Requires a Cloudflare Enterprise plan. You must also enable [attack score detection](/waf/detections/attack-score/).
@@ -1240,6 +1242,20 @@ entries:
 
       Requires a Cloudflare Enterprise plan. You must also enable [Firewall for AI](/waf/detections/firewall-for-ai/).
 
+  - name: cf.llm.prompt.injection_score
+    data_type: Number
+    categories: [Request]
+    keywords: [request, cloudflare, ai, client, visitor]
+    plan_info_label: Enterprise
+    summary: A score from 1–99 that represents the likelihood that the LLM prompt in the request is trying to perform a prompt injection attack.
+
+    description: |-
+      A low score (for example, below `20`) indicates that there is a high probability that the LLM prompt in the request is trying to perform a prompt injection attack.
+
+      The special score `100` indicates that Cloudflare did not score the request.
+
+      Requires a Cloudflare Enterprise plan. You must also enable [Firewall for AI](/waf/detections/firewall-for-ai/).
+
   - name: cf.worker.upstream_zone
     data_type: String
     categories: [Request]
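The new field's description pins down two facts any consumer of this score must handle: values run 1–99 with low values meaning a likely injection attempt, and `100` is a sentinel meaning Cloudflare did not score the request. A small sketch of how a log-processing script might interpret the field; only the 1–99 range, the below-`20` guidance, and the `100` sentinel come from this commit, while the helper name and labels are assumptions.

```python
# Hypothetical helper interpreting cf.llm.prompt.injection_score values.
# Grounded in the field description: 1-99 range, low = likely injection,
# 100 = request was not scored. Labels are illustrative only.

def interpret_injection_score(score: int) -> str:
    """Map an injection score to a rough human-readable label."""
    if score == 100:
        return "not scored"        # special value per the field description
    if not 1 <= score <= 99:
        raise ValueError("valid scores are 1-99, or the special value 100")
    if score < 20:
        return "likely injection"  # the docs use < 20 in the example rule
    return "likely clean"          # illustrative label, not from the docs

print(interpret_injection_score(100))  # -> not scored
print(interpret_injection_score(5))    # -> likely injection
```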
