Commit 7c0d166

cloudflareliamliamreesepedrosousa
authored
[WAF] AI Security for Apps docs update (#28756)
* Restructure Firewall for AI (beta) documentation into a new section for AI Security for Apps (GA)

---------

Co-authored-by: liamreese <liamreese@cloudflare.com>
Co-authored-by: Pedro Sousa <680496+pedrosousa@users.noreply.github.com>
1 parent c6d8822 commit 7c0d166

22 files changed

+997
-142
lines changed


public/__redirects

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2559,6 +2559,7 @@
25592559
# WAF
25602560
/waf/managed-rulesets/* /waf/managed-rules/:splat 301
25612561
/waf/custom-rulesets/* /waf/account/custom-rulesets/:splat 301
2562+
/waf/detections/firewall-for-ai/* /waf/detections/ai-security-for-apps/:splat 301
25622563
/waf/exposed-credentials-check/* /waf/managed-rules/check-for-exposed-credentials/:splat 301
25632564
/waf/security-events/* /waf/analytics/security-events/:splat 301
25642565
/waf/change-log/2019-* /waf/change-log/ 301
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
id: hLp4ni
2+
name: ai-security-for-apps
3+
4+
entry:
5+
title: AI Security for Apps
6+
url: /waf/detections/ai-security-for-apps/
7+
show: false
8+
9+
meta:
10+
title: Cloudflare AI Security for Apps docs
11+
description: AI Security for Apps is a detection that helps protect services powered by large language models against abuse, including PII data leaks, unsafe prompts, and prompt injection attacks.
12+
author: "@cloudflare"

src/content/directory/firewall-for-ai.yaml

Lines changed: 0 additions & 12 deletions
This file was deleted.

src/content/docs/data-localization/compatibility.mdx

Lines changed: 1 addition & 1 deletion
@@ -63,11 +63,11 @@ The table below provides a summary of the Data Localization Suite product's beha
6363
| Cloudflare Images | ⚫️ |[^36] | 🚧 [^35] |
6464
| AI Gateway ||| 🚧 [^39] |
6565
| AI Search |[^46] |[^47] | 🚧 [^48] |
66+
| AI Security for Apps ||||
6667
| Cloudflare Pages |[^11] |[^11] | 🚧 [^1] |
6768
| Cloudflare D1 | ⚫️ | ⚫️ | 🚧 [^40] |
6869
| Durable Objects | ⚫️ |[^7] | 🚧 [^1] |
6970
| Email Routing | ⚫️ | ⚫️ ||
70-
| Firewall for AI ||||
7171
| Remote MCP Server |[^44] |[^45] | 🚧 [^1] |
7272
| R2 |[^27] |[^8] |[^28] |
7373
| Smart Placement | ⚫️ |||

src/content/docs/security/settings.mdx

Lines changed: 3 additions & 3 deletions
@@ -29,7 +29,7 @@ In the **Web application exploits** security category you can manage the followi
2929
- [Sensitive data detection](/waf/managed-rules/reference/sensitive-data-detection/)
3030
- [Cloudflare managed ruleset](/waf/managed-rules/reference/cloudflare-managed-ruleset/)
3131
- [OWASP Core](/waf/managed-rules/reference/owasp-core-ruleset/) ruleset
32-
- [Firewall for AI](/waf/detections/firewall-for-ai/)
32+
- [AI Security for Apps](/waf/detections/ai-security-for-apps/)
3333
- [Under Attack mode](/fundamentals/reference/under-attack-mode/) in Security Level
3434
- Managed [security.txt](/security-center/infrastructure/security-file/)
3535

@@ -61,7 +61,7 @@ Additionally, you can manage the following settings:
6161
- [Browser Integrity Check](/waf/tools/browser-integrity-check/)
6262
- [Challenge Passage](/cloudflare-challenges/challenge-types/challenge-pages/challenge-passage/)
6363
- [Cloudflare managed ruleset](/waf/managed-rules/reference/cloudflare-managed-ruleset/)
64-
- [Firewall for AI](/waf/detections/firewall-for-ai/)
64+
- [AI Security for Apps](/waf/detections/ai-security-for-apps/)
6565
- [Schema learning](/api-shield/management-and-monitoring/endpoint-management/schema-learning/)
6666
- [Schema validation](/api-shield/security/schema-validation/) (requires you to upload a schema or apply a learned schema)
6767
- [Under Attack mode](/fundamentals/reference/under-attack-mode/) (under Security Level)
@@ -117,6 +117,7 @@ The following table links to additional information about each available setting
117117
| Setting | Location in previous dashboard navigation |
118118
| --------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
119119
| [AI Labyrinth](/bots/additional-configurations/ai-labyrinth/) | **Security** > **Bots** > **Configure Bot Fight Mode<br/>Security** > **Bots** > **Configure Super Bot Fight Mode<br/>Security** > **Bots** > **Configure Bot Management** |
120+
| [AI Security for Apps](/waf/detections/ai-security-for-apps/) | _N/A_ |
120121
| [Block AI Bots](/bots/concepts/bot/#ai-bots) | **Security** > **Bots** > **Configure Bot Fight Mode<br/>Security** > **Bots** > **Configure Super Bot Fight Mode<br/>Security** > **Bots** > **Configure Bot Management** |
121122
| [Bot Management](/bots/get-started/bot-management/): | **Security** > **Bots** |
122123
|[JS detections](/bots/additional-configurations/javascript-detections/) | **Security** > **Bots** > **Configure Super Bot Fight Mode<br/>Security** > **Bots** > **Configure Bot Management** |
@@ -135,7 +136,6 @@ The following table links to additional information about each available setting
135136
| [Endpoint discovery](/api-shield/security/api-discovery/): | **API Shield** > **Discovery** |
136137
|[Session identifiers](/api-shield/management-and-monitoring/session-identifiers/) | **Security** > **API Shield** > **Settings** |
137138
| [Endpoint labels](/api-shield/management-and-monitoring/endpoint-labels/) | **Security** > **Settings** > **Labels** |
138-
| [Firewall for AI](/waf/detections/firewall-for-ai/) | _N/A_ |
139139
| [Hotlink Protection](/waf/tools/scrape-shield/hotlink-protection/) | **Scrape Shield** |
140140
| [HTTP DDoS attack protection](/ddos-protection/managed-rulesets/http/): | **Security** > **DDoS** |
141141
|[Configure overrides](/ddos-protection/managed-rulesets/http/http-overrides/configure-dashboard/) | **Security** > **DDoS** |

src/content/docs/waf/analytics/security-analytics.mdx

Lines changed: 1 addition & 1 deletion
@@ -104,7 +104,7 @@ The suspicious activity section gives you information about suspicious requests
104104
- [Leaked credential check](/waf/detections/leaked-credentials/) (only for user and password leaked)
105105
- [Malicious uploads](/waf/detections/malicious-uploads/)
106106
- [WAF attack score](/waf/detections/attack-score/)
107-
- [Firewall for AI](/waf/detections/firewall-for-ai/)
107+
- [AI Security for Apps](/waf/detections/ai-security-for-apps/)
108108

109109
Each suspicious activity is classified with a severity score that can vary from critical to low. You can use the filter option to investigate further.
110110

src/content/docs/waf/concepts.mdx

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ The WAF currently provides the following detections for finding security threats
4747
- [**Attack score**](/waf/detections/attack-score/): Checks for known attack variations and malicious payloads. Scores traffic on a scale from 1 (likely to be malicious) to 99 (unlikely to be malicious).
4848
- [**Leaked credentials**](/waf/detections/leaked-credentials/): Scans incoming requests for credentials (usernames and passwords) previously leaked from data breaches.
4949
- [**Malicious uploads**](/waf/detections/malicious-uploads/): Scans content objects, such as uploaded files, for malicious signatures like malware.
50-
- [**Firewall for AI**](/waf/detections/firewall-for-ai/): Helps protect your services powered by large language models (LLMs) against abuse.
50+
- [**AI Security for Apps**](/waf/detections/ai-security-for-apps/): Helps protect your services powered by large language models (LLMs) against abuse.
5151
- [**Bot score**](/bots/concepts/bot-score/): Scores traffic on a scale from 1 (likely to be a bot) to 99 (likely to be human).
5252

5353
To enable traffic detections in the Cloudflare dashboard, go to the Security **Settings** page.
Lines changed: 141 additions & 0 deletions
@@ -0,0 +1,141 @@
1+
---
2+
pcx_content_type: configuration
3+
title: Example mitigation rules
4+
tags:
5+
- AI
6+
sidebar:
7+
order: 8
8+
---
9+
10+
## Return a custom error when a user asks about violent or hateful content
11+
12+
A customer support chatbot should not engage with prompts about violent crimes or hate speech. This [custom rule](/waf/custom-rules/create-dashboard/) blocks the request and returns a JSON response that your application can parse and display to the user.
13+
14+
- **When incoming requests match**:
15+
16+
| Field | Operator | Value |
17+
| --------------------------- | -------- | -------------------------------- |
18+
| LLM Unsafe topic categories | is in | `S1: Violent Crimes` `S10: Hate` |
19+
20+
Expression when using the editor:<br />
21+
`(any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S10"}))`
22+
23+
- **Action**: _Block_
24+
- **With response type**: Custom JSON
25+
- **Response body**:
26+
```json
27+
{ "error": "content_policy", "message": "Your message could not be processed because it touches on a topic outside this assistant's scope. Please rephrase your question." }
28+
```
29+
30+
Your application can check for a non-2xx response and display the `message` field to the user, keeping the experience conversational instead of showing a raw block page.
31+
32+
## Block prompt injection attempts from automated sources outside your country
33+
34+
This rule combines the AI Security for Apps [injection score](/waf/detections/ai-security-for-apps/prompt-injection/) with [Bot Management](/bots/get-started/) and the request's country to focus on high-confidence attacks from automated sources. Requiring all three signals to match produces fewer false positives than relying on any single signal.
35+
36+
- **When incoming requests match**:
37+
38+
Enter the following expression in the editor:<br />
39+
`(cf.llm.prompt.injection_score lt 25 and cf.bot_management.score lt 10 and ip.geoip.country ne "US")`
40+
41+
- **Action**: _Block_
42+
43+
The rule targets requests that are simultaneously:
44+
45+
1. Likely prompt injection attempts (score below 25).
46+
2. Coming from automated tooling, not a real browser (bot score below 10).
47+
3. Originating from outside the US — adjust the country code to match where your users are.
48+
49+
Any single signal might produce false positives on its own. Together, they identify a pattern strongly associated with automated prompt injection attacks.
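
The way the three conditions combine can be sketched as a plain predicate. This is an illustration only: the `signals` object, its property names, and the sample values are hypothetical stand-ins for the rule fields, and the WAF evaluates the real expression itself, not your application.

```js
// Hypothetical model of the rule expression above. Property names are
// illustrative stand-ins for the corresponding rule fields.
function matchesLayeredRule(signals) {
  const likelyInjection = signals.injectionScore < 25; // cf.llm.prompt.injection_score lt 25
  const likelyAutomated = signals.botScore < 10; // cf.bot_management.score lt 10
  const outsideHomeCountry = signals.country !== "US"; // ip.geoip.country ne "US"
  return likelyInjection && likelyAutomated && outsideHomeCountry;
}

// Made-up sample values: the first three each fail exactly one condition,
// and the last one satisfies all three.
const samples = [
  { injectionScore: 12, botScore: 45, country: "DE" }, // human-like traffic
  { injectionScore: 12, botScore: 3, country: "US" }, // home country
  { injectionScore: 80, botScore: 3, country: "DE" }, // benign prompt
  { injectionScore: 12, botScore: 3, country: "DE" }, // all three match
];

for (const s of samples) {
  console.log(JSON.stringify(s), matchesLayeredRule(s) ? "block" : "pass");
}
```

Only the last sample is blocked. Each of the other three trips two of the signals but not the third, so the rule lets it pass.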
50+
51+
## Allow financial PII only from your internal network
52+
53+
A financial services application legitimately handles credit card and bank account numbers from internal agents, but should block those PII types from external users. This rule uses the request's [autonomous system number (ASN)](/ruleset-engine/rules-language/fields/reference/ip.src.asnum/) to distinguish internal traffic from public traffic.
54+
55+
- **When incoming requests match**:
56+
57+
Enter the following expression in the editor:<br />
58+
`(any(cf.llm.prompt.pii_categories[*] in {"CREDIT_CARD" "US_BANK_NUMBER" "IBAN_CODE"}) and ip.src.asnum ne 13335)`
59+
60+
Replace `13335` with your organization's ASN.
61+
62+
- **Action**: _Block_
63+
- **With response type**: Custom JSON
64+
- **Response body**:
65+
```json
66+
{ "error": "pii_blocked", "message": "Financial account information cannot be submitted from external networks. If you are an internal agent, connect to the corporate network and try again." }
67+
```
68+
69+
Internal agents on your corporate network (identified by ASN) can submit financial PII to the AI assistant as part of their workflow, while external users are blocked. You could further refine this by combining with [Access](/cloudflare-one/access-controls/policies/) service tokens or [mTLS](/ssl/client-certificates/) for stronger identity verification.
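
As a rough illustration of the expression's logic, the sketch below models it in JavaScript. The `request` shape, `INTERNAL_ASN`, `shouldBlockFinancialPii`, and the `EMAIL_ADDRESS` sample category are hypothetical names used for illustration. The three blocked category strings and the placeholder ASN come from the rule above, and `64500` is an ASN reserved for documentation.

```js
// Hypothetical model of the rule. The WAF evaluates the real expression.
const BLOCKED_PII = new Set(["CREDIT_CARD", "US_BANK_NUMBER", "IBAN_CODE"]);
const INTERNAL_ASN = 13335; // placeholder from the rule; replace with your ASN

function shouldBlockFinancialPii(request) {
  // any(cf.llm.prompt.pii_categories[*] in {...})
  const hasFinancialPii = request.piiCategories.some((c) => BLOCKED_PII.has(c));
  // ip.src.asnum ne 13335
  const isExternal = request.asn !== INTERNAL_ASN;
  return hasFinancialPii && isExternal;
}

// External network, financial PII: blocked.
console.log(shouldBlockFinancialPii({ piiCategories: ["CREDIT_CARD"], asn: 64500 })); // true
// Internal network, financial PII: allowed.
console.log(shouldBlockFinancialPii({ piiCategories: ["CREDIT_CARD"], asn: 13335 })); // false
// External network, no financial PII category: this rule does not match.
console.log(shouldBlockFinancialPii({ piiCategories: ["EMAIL_ADDRESS"], asn: 64500 })); // false
```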
70+
71+
## Handle block responses in your application
72+
73+
When a WAF rule blocks a request, Cloudflare sends the block response back to your application — not to the end user. Your application needs to handle that response and decide what to show. Without error handling, your users may see a raw HTML error page or a broken UI.
74+
75+
Here are two things you can do to keep the experience smooth.
76+
77+
### Set a fallback message
78+
79+
Define a friendly default message that your application displays whenever it receives a non-successful response. This works regardless of how the block rule is configured — including the default Cloudflare block page, which returns HTML that would otherwise break a JSON-based chat UI.
80+
81+
```js
82+
// Define a user-friendly fallback message. This is what the user will see
83+
// any time the request is blocked or something unexpected happens.
84+
const FALLBACK = "Sorry, I can't process that request. Please try rephrasing.";
85+
86+
const resp = await fetch("/api/chat", {
87+
method: "POST",
88+
headers: { "Content-Type": "application/json" },
89+
body: JSON.stringify({ prompt: userMessage }),
90+
});
91+
92+
// If the response is not 2xx, show the fallback instead of trying to parse
93+
// the body. This safely handles the default Cloudflare block page (which is
94+
// HTML) without breaking your UI.
95+
if (!resp.ok) {
96+
await resp.text(); // consume the body so the connection is released
97+
showError(FALLBACK);
98+
return;
99+
}
100+
101+
const data = await resp.json();
102+
showMessage(data.message);
103+
```
104+
105+
### Display custom error messages from the WAF
106+
107+
For more control, configure your block rules with a [custom JSON response](/waf/custom-rules/create-dashboard/#configure-a-custom-response-for-blocked-requests) — for example, `{ "message": "That question is outside this assistant's scope." }`. Your application can then parse the response and show the custom message when available, falling back to the default when it is not.
108+
109+
```js
110+
const FALLBACK = "Sorry, I can't process that request. Please try rephrasing.";
111+
112+
const resp = await fetch("/api/chat", {
113+
method: "POST",
114+
headers: { "Content-Type": "application/json" },
115+
body: JSON.stringify({ prompt: userMessage }),
116+
});
117+
118+
if (!resp.ok) {
119+
// Check the content type to determine if the response contains a custom
120+
// JSON error from your WAF rule, or something else (like the default
121+
// Cloudflare HTML block page, or a DDoS / Bot Management challenge).
122+
const ct = (resp.headers.get("content-type") || "").toLowerCase();
123+
124+
if (ct.includes("application/json")) {
125+
// The WAF returned your custom JSON response. Parse it and show the
126+
// message you configured in the rule. Fall back to the default if the
127+
// field is missing or empty.
128+
const data = await resp.json();
129+
showError(data.message || FALLBACK);
130+
} else {
131+
// The response is not JSON — most likely the default Cloudflare HTML
132+
// block page. Discard the body and show the friendly fallback.
133+
await resp.text();
134+
showError(FALLBACK);
135+
}
136+
return;
137+
}
138+
139+
const data = await resp.json();
140+
showMessage(data.message);
141+
```
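
One case the snippets above do not cover is a response that is labeled as JSON but whose body is malformed or truncated, in which case `resp.json()` throws. The following is a minimal sketch of one way to guard against that by moving the same logic into a helper. `blockMessageFrom` and `FALLBACK_MESSAGE` are hypothetical names, and the helper assumes only the standard `fetch` response shape (`headers.get`, `json`, `text`).

```js
const FALLBACK_MESSAGE = "Sorry, I can't process that request. Please try rephrasing.";

// Hypothetical helper: returns the text to show for a non-2xx response.
async function blockMessageFrom(resp) {
  const ct = (resp.headers.get("content-type") || "").toLowerCase();

  if (!ct.includes("application/json")) {
    await resp.text(); // discard the HTML (or other) body
    return FALLBACK_MESSAGE;
  }

  try {
    const data = await resp.json();
    // Accept only a non-empty string; anything else falls back.
    return typeof data?.message === "string" && data.message
      ? data.message
      : FALLBACK_MESSAGE;
  } catch {
    // The body claimed to be JSON but did not parse.
    return FALLBACK_MESSAGE;
  }
}
```

With this helper, the error branch in the calling code reduces to `if (!resp.ok) { showError(await blockMessageFrom(resp)); return; }`.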
