AI Crawl Control: Reflect Dashboard Changes and Update Pay Per Crawl Site Owner Onboarding (#25081)
* AI Crawl Control: updated to reflect dash changes and pay per crawl onboarding
AI Crawl Control: removed duplicate FAQ
AI Crawl Control: fixed missing links
AI Crawl Control: fixed missing links
* Setting up redirects, PCX review.
* Nit fix
---------
Co-authored-by: Jun Lee <[email protected]>
| Total requests | The total number of requests to crawl your website, from all AI crawlers |
36
-
| Blocked requests | The number of crawler requests you have blocked, from any rule |
37
-
| Allowed requests | The number of crawler requests you have allowed |
38
-
| Hosts | The owner of the AI crawler |
39
-
| Overall popular paths | The most popular pages crawled by AI crawlers, from all AI crawlers |
40
-
| Most active AI crawlers by operators | The AI crawler owners with the highest number of requests to access your site |
41
-
| Request by AI crawlers | A graph which displays the number of crawl requests from each AI crawler |
42
-
| Most popular paths by AI crawlers | The most popular pages crawled by AI crawlers, for each AI crawler |
43
-
| Referrals | A graph which displays the number of referrals from each AI operator |
44
-
| Referers | The list of referers who directed traffic to your site |
45
-
46
-
:::note[Requests in AI Crawl Control metrics]
47
-
The number of requests in AI Crawl Control metrics are specifically requests which were met with HTTP code 200 (the request was successfully served, with actual content).
48
-
49
-
AI Crawl Control metrics filter all other HTTP codes.
| Total requests | The total number of requests to crawl your website, from all AI crawlers |
36
+
| Allowed requests | The number of crawler requests that received a successful response from your site |
37
+
| Unsuccessful requests | The number of crawler requests that failed (HTTP 4xx or 5xx) as a result of a blocked request, other security rules, or website errors such as a crawler attempting to access a non-existent page |
38
+
| Overall popular paths | The most popular pages crawled by AI crawlers, from all AI crawlers |
39
+
| Most active AI crawlers by operators | The AI crawler owners with the highest number of requests to access your site |
40
+
| Request by AI crawlers | A graph which displays the number of crawl requests from each AI crawler |
41
+
| Most popular paths by AI crawlers | The most popular pages crawled by AI crawlers, for each AI crawler |
42
+
| Referrals | A graph which displays the number of visits that were directed to your site from each AI operator |
43
+
| Referers | The list of referers who directed visits to your site |
src/content/docs/ai-crawl-control/features/manage-ai-crawlers.mdx
40 additions & 35 deletions
@@ -13,22 +13,22 @@ To manage AI crawlers:
13
13
14
14
1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account and domain.
15
15
2. Go to **AI Crawl Control**.
16
-
3. Go to the **AI Crawlers** tab.
16
+
3. Go to the **Crawlers** tab.
17
17
18
18
## Review AI crawler activity
19
19
20
20
The **Crawlers** tab displays a table of AI crawlers that are requesting access to your content, and how they interact with your pages. The table provides the following information.
| Crawler | The name of the AI crawler and the operator that owns it. |
25
-
| Category | The category of the AI crawler. Refer to [Verified bot categories](/bots/concepts/bot/verified-bots/#categories). |
26
-
| Requests |Total allowed and blocked requests with trend chart. Blocked requests may come from any configured rule, not just the actions shown here. |
27
-
| Robots.txt violations | The number of times the AI crawler has violated your <GlossaryTooltip term="robots.txt">`robots.txt`</GlossaryTooltip> file. |
28
-
| Action | The action you wish to take for the AI crawler. Refer to [Take action for each AI crawler](/ai-crawl-control/features/manage-ai-crawlers/#take-action-for-each-ai-crawler). |
| Crawler | The name of the AI crawler and the operator that owns it. |
25
+
| Category | The category of the AI crawler. Refer to [Verified bot categories](/bots/concepts/bot/verified-bots/#categories). |
26
+
| Requests |The total number of allowed and unsuccessful requests, with trend chart. Unsuccessful requests may come from any rule or response error, not just the block action in AI Crawl Control.|
27
+
| Robots.txt violations | The number of times the AI crawler has violated your <GlossaryTooltip term="robots.txt">`robots.txt`</GlossaryTooltip> file. |
28
+
| Action | The action you wish to take for the AI crawler. Refer to [Take action for each AI crawler](/ai-crawl-control/features/manage-ai-crawlers/#take-action-for-each-ai-crawler). |
29
29
30
30
:::note[Quality of AI crawler detection]
31
-
On the free plan, AI Crawl Control identifies AI crawlers based on their [user agent strings](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent). This enables AI Crawl Control to detect easy-to-detect (well-known) AI crawlers.
31
+
On the free plan, AI Crawl Control identifies AI crawlers based on their [user agent strings](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent). This enables AI Crawl Control to detect well-known, self-identifying AI crawlers.
32
32
33
33
Upgrade your plan to enable a more thorough detection using Cloudflare's [Bot Management detection ID](/bots/reference/bot-management-variables/#ruleset-engine-fields) field.
34
34
:::
@@ -46,31 +46,21 @@ The values of the table will update according to your filter.
46
46
## Take action for each AI crawler
47
47
48
48
<Tabs>
49
-
<TabItem label="With pay per crawl">
50
-
51
-
:::note[Pay per crawl closed beta]
52
-
Pay per crawl is currently in closed beta.
53
-
54
-
To find out how to join the beta program, reach out to us at [Pay per crawl signup](http://www.cloudflare.com/paypercrawl-signup/), or contact your account executive if you are an existing Enterprise customer.
55
-
56
-
To learn more about pay per crawl, refer to Cloudflare blog: [Introducing pay per crawl: enabling content owners to charge AI crawlers for access](https://blog.cloudflare.com/introducing-pay-per-crawl/).
57
-
:::
49
+
<TabItem label="Without pay per crawl">
58
50
59
-
For each AI crawler, you can take one of three actions: allow, charge, or block.
51
+
For each AI crawler, you can choose to allow or block access.
60
52
61
53
<Example title="Allow access">
62
54
63
55
- **Summary:** You can allow an AI crawler to scrape your content.
64
56
- **When to use:** Allow AI crawlers that offer services which provide value through citations, referrals, or existing agreements.
65
57
- **Implementation:** From the **Actions** column, select **Allow**.
66
-
Note that you can still choose to [Enforce `robots.txt`](/ai-crawl-control/features/manage-ai-crawlers/#take-action-for-each-ai-crawler).
67
58
68
-
For more details on how this rule interacts with other Cloudflare settings, refer to [How it works](/bots/concepts/bot/#how-it-works).
59
+
Note that you can still choose to [Enforce `robots.txt`](/ai-crawl-control/features/manage-ai-crawlers/#take-action-for-each-ai-crawler).
69
60
70
61
</Example>
71
62
72
63
<Example title="Block access">
73
-
74
64
- **Summary:** You can block an AI crawler to completely stop the AI crawler from scraping your webpage.
75
65
- **When to use:** Block AI crawlers when their behavior does not align with your content strategy, or violates your policies.
76
66
- **Implementation:** From the **Actions** column, select **Block**.
@@ -79,32 +69,32 @@ Note that you can configure the response that gets returned when blocking an AI
79
69
80
70
</Example>
81
71
82
-
<Example title="Charge for crawl (private beta)">
83
-
84
-
- **Summary:** You can charge the owner of the AI crawler for each crawl request.
85
-
- **When to use:** Charge AI crawlers when your content has training value, and you want to explore monetization options
86
-
- **Implementation:** From the **Actions** column, select **Charge**.
72
+
</TabItem>
73
+
<TabItem label="With pay per crawl">
87
74
88
-
For more information, refer to [What is Pay Per Crawl?](/ai-crawl-control/features/pay-per-crawl/what-is-pay-per-crawl/).
75
+
:::note[Pay per crawl closed beta]
76
+
Pay per crawl is currently in closed beta.
89
77
90
-
</Example>
78
+
To find out how to join the beta program, reach out to us at [Pay per crawl signup](http://www.cloudflare.com/paypercrawl-signup/), or contact your account executive if you are an existing Enterprise customer.
91
79
92
-
</TabItem>
93
-
<TabItem label="Without pay per crawl">
80
+
To learn more about pay per crawl, refer to Cloudflare blog: [Introducing pay per crawl: enabling content owners to charge AI crawlers for access](https://blog.cloudflare.com/introducing-pay-per-crawl/).
81
+
:::
94
82
95
-
For each AI crawler, you can choose to allow or block access.
83
+
For each AI crawler, you can take one of three actions: allow, charge, or block.
96
84
97
85
<Example title="Allow access">
98
86
99
87
- **Summary:** You can allow an AI crawler to scrape your content.
100
88
- **When to use:** Allow AI crawlers that offer services which provide value through citations, referrals, or existing agreements.
101
89
- **Implementation:** From the **Actions** column, select **Allow**.
90
+
Note that you can still choose to [Enforce `robots.txt`](/ai-crawl-control/features/manage-ai-crawlers/#take-action-for-each-ai-crawler).
102
91
103
-
Note that you can still choose to [Enforce `robots.txt`](/ai-crawl-control/features/manage-ai-crawlers/#take-action-for-each-ai-crawler).
92
+
For more details on how this rule interacts with other Cloudflare settings, refer to [How it works](/bots/concepts/bot/#how-it-works).
104
93
105
94
</Example>
106
95
107
96
<Example title="Block access">
97
+
108
98
- **Summary:** You can block an AI crawler to completely stop the AI crawler from scraping your webpage.
109
99
- **When to use:** Block AI crawlers when their behavior does not align with your content strategy, or violates your policies.
110
100
- **Implementation:** From the **Actions** column, select **Block**.
@@ -113,9 +103,24 @@ Note that you can configure the response that gets returned when blocking an AI
113
103
114
104
</Example>
115
105
106
+
<Example title="Charge for crawl (private beta)">
107
+
108
+
- **Summary:** You can charge the owner of the AI crawler for each successful crawl request.
109
+
- **When to use:** Charge AI crawlers when your content has training value, and you want to explore monetization options.
110
+
- **Implementation:** From the **Actions** column, select **Charge**.
111
+
112
+
For more information, refer to [What is Pay Per Crawl?](/ai-crawl-control/features/pay-per-crawl/what-is-pay-per-crawl/).
113
+
114
+
</Example>
115
+
116
116
</TabItem>
117
+
117
118
</Tabs>
118
119
120
+
:::tip[Need more advanced control?]
121
+
You can also create more complex rules when taking action on AI crawlers, using [Cloudflare WAF](/waf/). For more information on creating more specific rules, refer to [Create a custom rule in the dashboard](/waf/custom-rules/create-dashboard/).
122
+
:::
123
+
119
124
## Configure block response
120
125
121
126
<Plan type="paid" />
@@ -147,8 +152,8 @@ You can choose which HTTP response code to return when blocking an AI crawler.
147
152
148
153
Use the dropdown menu to select the desired response code. You can choose from:
149
154
150
-
- `403 Forbidden`: Use this option if you wish to indicate that you do not want the AI crawler to access your content.
151
-
- `402 Payment Required`: Use this option if you wish to indicate that the AI crawler must pay to access your content.
155
+
- `403 Forbidden`: Use this option if you wish to indicate that you do not want the AI crawler to access your content.
156
+
- `402 Payment Required`: Use this option if you wish to indicate that the AI crawler must pay to access your content.
152
157
153
158
:::note
154
159
Behind the scenes, AI Crawl Control uses [Cloudflare WAF](/waf/) to return custom block responses.
### Can I set different prices for different AI crawlers?
13
+
14
+
No. Pay per crawl allows you to configure different actions (Block, Charge, or Allow) for each crawler, but you can only set a single price that applies to all crawlers configured with the "Charge" option.
15
+
16
+
## Frequently asked questions for AI bot operators
17
+
18
+
### Will I be charged for re-crawling the same page?
19
+
20
+
Yes. Every time your AI crawler accesses content on a website protected with pay per crawl, it will incur the cost set by the site owner. You should implement mechanisms within your crawler to track expenditure and enforce any spending limits you want to set.
21
+
22
+
### Am I charged for error responses?
23
+
24
+
No. Charging events are only triggered for successful HTTP response codes. Error responses are not billed, even if you have sent the `crawler-exact-price` or `crawler-max-price` headers.
25
+
26
+
### What user agent should I use?
27
+
28
+
Use the standard user agents associated with your AI crawler that you have onboarded to Cloudflare and identified through Web Bot Auth.
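
The operator FAQ answers above suggest tracking expenditure inside the crawler. The following is a minimal sketch of one way to do that, not an official client or a Cloudflare API. The `CrawlBudget` class, its method names, and the per-crawl price are hypothetical, and treating "successful" as any 2xx status is an assumption.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class CrawlBudget:
    """Hypothetical client-side spend tracker; not part of any Cloudflare API."""

    limit: Decimal
    spent: Decimal = Decimal("0")
    charges: list = field(default_factory=list)

    def can_afford(self, price: Decimal) -> bool:
        """Check before sending a request whether one more charge fits the limit."""
        return self.spent + price <= self.limit

    def record(self, url: str, status: int, price: Decimal) -> bool:
        """Log one response; return True if it was counted as a charge."""
        # Assumption: "successful" means a 2xx status code.
        if not 200 <= status < 300:
            return False  # error responses are not counted as billed
        # No de-duplication by URL: a re-crawl of the same page counts again.
        self.spent += price
        self.charges.append((url, price))
        return True


budget = CrawlBudget(limit=Decimal("1.00"))
price = Decimal("0.40")  # hypothetical per-crawl price set by the site owner

for url, status in [
    ("https://example.com/a", 200),
    ("https://example.com/a", 200),  # same page crawled a second time
    ("https://example.com/missing", 404),
]:
    if budget.can_afford(price):
        budget.record(url, status, price)

print(budget.spent)  # prints 0.80
```

The sketch uses `Decimal` rather than floats so that repeated additions of small prices do not drift. It checks the limit before each request and records the charge only after the response status is known.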