Commit cbb9ef0

Merge branch 'production' into ai-bots-list
2 parents 24db84a + 942993f commit cbb9ef0

File tree

14 files changed

+323
-3
lines changed

.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -99,6 +99,7 @@
9999
/src/content/docs/d1/ @elithrar @rozenmd @vy-ton @joshthoward @oxyjun @harshil1712 @cloudflare/pcx-technical-writing
100100
/src/content/release-notes/d1.yaml @elithrar @rozenmd @vy-ton @joshthoward @oxyjun @cloudflare/pcx-technical-writing
101101
/src/content/partials/d1/ @elithrar @rozenmd @vy-ton @joshthoward @oxyjun @harshil1712 @cloudflare/pcx-technical-writing
102+
/src/content/docs/ai-audit/ @oxyjun @kodster28 @cloudflare/pcx-technical-writing
102103
/src/content/docs/durable-objects/ @elithrar @vy-ton @joshthoward @oxyjun @harshil1712 @cloudflare/pcx-technical-writing
103104
/src/content/release-notes/durable-objects.yaml @elithrar @rozenmd @vy-ton @joshthoward @oxyjun @cloudflare/pcx-technical-writing
104105
/src/content/docs/email-routing/ @cloudflare/pcx-technical-writing
Binary file (109 KB), preview not shown
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
---
2+
title: AI Audit
3+
description: AI Audit is available to all customers
4+
date: 2024-09-23T11:00:00Z
5+
---
6+
7+
Every site on Cloudflare now has access to [**AI Audit**](/ai-audit/), which summarizes the crawling behavior of popular and known AI services.
8+
9+
You can use this data to:
10+
11+
- Understand how and how often crawlers access your site (and which content is the most popular).
12+
- Block some or all of the AI bots accessing your site.
13+
- Use Cloudflare to enforce your `robots.txt` policy via an automatic WAF rule.
14+
15+
![View AI bot activity with AI Audit](~/assets/images/changelog/ai-audit/ai-audit-overview.png)
16+
17+
To get started, explore [AI Audit](/ai-audit/).
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
---
2+
pcx_content_type: changelog
3+
title: Changelog
4+
sidebar:
5+
order: 100
6+
---
7+
8+
import { ProductChangelog } from "~/components";
9+
10+
{/* <!-- Actual content lives in /src/content/changelog/ai-audit/. Update the file there for new entries to appear here. For more details, refer to https://developers.cloudflare.com/style-guide/documentation-content-strategy/content-types/changelog/#yaml-file --> */}
11+
12+
<ProductChangelog product="ai-audit" />
Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
---
2+
title: Detect AI crawlers
3+
pcx_content_type: concept
4+
sidebar:
5+
order: 2
6+
---
7+
8+
AI Audit metrics provide you with insight into how AI crawlers are interacting with your website.
9+
10+
## View AI Audit metrics
11+
12+
AI Audit provides you with the following metrics to help you understand how AI crawlers are interacting with your website.
13+
14+
| Metric | Description |
15+
| --------------------------------- | ------------------------------------------------------------------------ |
16+
| Request by AI crawlers | A graph which displays the number of crawl requests from each AI crawler |
17+
| Summary | A list of the AI crawlers with the highest number of crawl requests |
18+
| Most popular paths by AI crawlers | The most popular pages crawled by AI crawlers, for each AI crawler |
19+
20+
The **Summary** table also enables you to [Enforce your robots.txt](/ai-audit/features/enforce-robots-txt/).
21+
22+
## Filter AI crawler data
23+
24+
You can use filters to narrow the scope of your results.
25+
26+
- **Provider:** Filter by the AI crawler owners.
27+
- **Bot type:** Filter by the type of the AI bot (for example, AI crawler, AI assistant, or archiver).
28+
- **Date range:** Filter the date range of your results. You can choose from three predetermined date ranges:
29+
- Past 7 days
30+
- Past 14 days
31+
- Past month
32+
33+
The values of the AI Audit metrics will update according to your filter.
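
The effect of these filters on a metric such as "requests per crawler" can be pictured with a small sketch. This is not how AI Audit is implemented; the `CrawlRequest` record, its fields, and the `requests_per_crawler` helper are hypothetical names used only to illustrate how a date range and a provider filter narrow the counted requests.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class CrawlRequest:
    """Hypothetical record of one crawl request (illustrative only)."""
    crawler: str
    provider: str
    path: str
    day: date


def requests_per_crawler(
    records: Iterable[CrawlRequest],
    today: date,
    days: int = 7,
    provider: Optional[str] = None,
) -> Counter:
    """Count requests per crawler within the past `days` days,
    optionally restricted to a single provider."""
    cutoff = today - timedelta(days=days)
    return Counter(
        r.crawler
        for r in records
        if r.day > cutoff and (provider is None or r.provider == provider)
    )
```

Widening `days` from 7 to 14 or dropping the `provider` argument can only keep or increase each count, which matches the way the dashboard metrics change as filters are relaxed.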
34+
35+
## Filter subdomains
36+
37+
You can use the subdomain filter to narrow the scope of your results.
38+
39+
From the dropdown, select either **All subdomains**, or the specific subdomain you wish to view.
40+
41+
Selecting a specific subdomain allows you to access:
42+
43+
- **Violations only** toggle: Toggles the AI Audit page to only display bots which are violating your configured rules.
44+
- [**Enforce robots.txt policy**](/ai-audit/features/enforce-robots-txt/): Ensure bots cannot access webpages which are off-limits, as specified in your `robots.txt` file.
45+
46+
The values of the AI Audit metrics will update according to your filter.
Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
---
2+
title: Enforce robots.txt
3+
pcx_content_type: concept
4+
sidebar:
5+
order: 5
6+
---
7+
8+
import { Steps } from "~/components";
9+
10+
AI Audit allows you to enforce your [`robots.txt`](/radar/glossary/#robotstxt) file, which tells bots which webpages they can and cannot access.
11+
12+
To enforce `robots.txt`:
13+
14+
<Steps>
15+
1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account and domain.
16+
2. Go to **AI Audit**.
17+
3. From the dropdown at the top of the page, select a specific subdomain where you wish to enforce `robots.txt`.
18+
4. From **Summary**, select **Enforce robots.txt policy**.
19+
5. From the **Enforce your robots.txt policy** page, select **Go to WAF custom rules**.
20+
6. From the **New custom rule** page, name your custom rule.
21+
- The page will automatically populate the values for the custom rule.
22+
7. From **Then take action...**:
23+
- For **Choose action**, select **Block**.
24+
- For **With response type**, select **Default Cloudflare WAF block page**.
25+
8. From **Place at**:
26+
- For **Select order**, select **Last**.
27+
9. Select **Deploy**.
28+
</Steps>
29+
30+
This custom rule ensures that bots cannot access the pages specified in your `robots.txt` file.
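
To preview which paths a given `robots.txt` marks as off-limits before you enforce it, you can evaluate the file locally. The sketch below uses Python's standard `urllib.robotparser` module; the `robots.txt` content, the `ExampleAIBot` user agent, and the `is_allowed` helper are illustrative assumptions and are not part of AI Audit or the generated WAF rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /private/

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

With this sample file, `ExampleAIBot` is denied anything under `/private/` but allowed elsewhere, while other user agents fall through to the `*` group.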
31+
32+
## Related resources
33+
34+
For more information, refer to the following resources.
35+
36+
- [What is robots.txt? | How a robots.txt file works](https://www.cloudflare.com/en-gb/learning/bots/what-is-robots-txt/)
37+
- [Direct AI crawlers with managed robots.txt](/bots/additional-configurations/managed-robots-txt/)
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
---
2+
title: Features
3+
pcx_content_type: navigation
4+
sidebar:
5+
group:
6+
hideIndex: true
7+
order: 5
8+
---
9+
10+
import { DirectoryListing } from "~/components";
11+
12+
<DirectoryListing />
Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
---
2+
title: Get started
3+
pcx_content_type: get-started
4+
sidebar:
5+
order: 2
6+
group:
7+
badge: beta
8+
head:
9+
- tag: title
10+
content: Get started with Cloudflare AI Audit
11+
description: Learn how to set up AI Audit.
12+
---
13+
14+
import { Render, Steps } from "~/components";
15+
16+
This guide walks you through:
17+
18+
- Viewing AI crawlers that are interacting with your domain.
19+
- Creating a rule to block AI crawlers on your website.
20+
21+
## Prerequisites
22+
23+
1. Sign up for a [Cloudflare account](https://dash.cloudflare.com/sign-up/).
24+
2. [Connect your domain to Cloudflare](/fundamentals/manage-domains/add-site/).
25+
3. Make sure your domain is [proxying traffic through Cloudflare](/fundamentals/concepts/how-cloudflare-works/#cloudflare-as-a-reverse-proxy).
26+
27+
## 1. Block all AI crawlers
28+
29+
To block all AI crawlers:
30+
31+
{/* prettier-ignore */}
32+
<Steps>
33+
1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account and domain.
34+
2. Go to **AI Audit**.
35+
3. From **Most Popular Paths**, select **Block All**.
36+
4. From the **Bot traffic** page, under **Block AI Bots**, select **Enable**.
37+
</Steps>
38+
39+
## 2. Block specific bot categories (Enterprise plan only)
40+
41+
Customers on the Enterprise plan with a [Bot Management subscription](/bots/plans/bm-subscription/) can block specific AI crawlers while allowing others.
42+
43+
{/* prettier-ignore */}
44+
<Steps>
45+
1. Go to **AI Audit**.
46+
2. From **Most Popular Paths**, select **Block Some**.
47+
3. From the **Security rules** page, select **Create rule** > **Custom rules**.
48+
4. Provide a name for the custom rule. For example, "Block unwanted AI crawlers".
49+
5. From the **Field** dropdown, select **Verified Bot Category**.
50+
6. From the **Value** dropdown, select the specific bot category you wish to block.
51+
- You can use **And** / **Or** buttons to add additional conditions. For example, you can use multiple **Or** options to include multiple bot categories in the same rule.
52+
7. From the **Then take action...** section:
53+
- For **Choose action**, select **Block**.
54+
- For **With response type**, select **Default Cloudflare WAF block page**.
55+
8. From the **Place at** section:
56+
- For **Select order**, select **First**.
57+
9. Select **Save**.
58+
</Steps>
59+
60+
This custom rule will only block the AI bots which belong to the [verified bot categories](/bots/concepts/bot/verified-bots/categories/) you have included in your rule (in step 6).
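
The conditions chosen in steps 5 and 6 combine into a single rule expression. As a rough sketch, the helper below joins several categories with `or`, mirroring the **Or** button in the rule builder. The field name `cf.verified_bot_category` and the category labels in the usage note are assumptions here; confirm the exact names in the dashboard's expression preview before relying on them.

```python
from typing import Sequence

# Assumed field name for the "Verified Bot Category" dropdown entry.
FIELD = "cf.verified_bot_category"


def build_block_expression(categories: Sequence[str]) -> str:
    """Build an expression that matches any of the given bot categories."""
    if not categories:
        raise ValueError("at least one category is required")
    clauses = [f'({FIELD} eq "{category}")' for category in categories]
    return " or ".join(clauses)
```

For example, `build_block_expression(["AI Crawler", "AI Search"])` yields `(cf.verified_bot_category eq "AI Crawler") or (cf.verified_bot_category eq "AI Search")`, which you could compare against the expression the rule builder shows.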
61+
62+
For more information on creating a custom WAF rule, refer to [Create a custom rule in the dashboard](/waf/custom-rules/create-dashboard/).
63+
64+
## 3. Review detected AI crawlers
65+
66+
In the **Metrics** tab of the Cloudflare dashboard, review key metrics for the AI crawlers detected on your site.
67+
68+
Refer to [Detect AI crawlers](/ai-audit/features/detect-ai-crawlers/) for more information.
69+
70+
## Related resources
71+
72+
Refer to the following related resources:
73+
74+
- Cloudflare blog: [Start auditing and controlling the AI models accessing your content](https://blog.cloudflare.com/nl-nl/cloudflare-ai-audit-control-ai-content-crawlers/)
75+
- Block AI crawlers that do not adhere to recommended guidelines using [Cloudflare AI Labyrinth](/bots/additional-configurations/ai-labyrinth/).
76+
- [Direct AI crawlers with managed robots.txt](/bots/additional-configurations/managed-robots-txt/).
Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,69 @@
1+
---
2+
title: AI Audit
3+
pcx_content_type: overview
4+
sidebar:
5+
order: 1
6+
badge: Beta
7+
head:
8+
- tag: title
9+
content: Overview
10+
description: AI Audit is a tool which allows you to analyze and control how third-party AI crawlers interact with your website.
11+
---
12+
13+
import { Description, Feature, FeatureTable, Plan, LinkButton, RelatedProduct } from "~/components";
14+
15+
<Description>
16+
17+
Analyze and control third-party AI crawlers on your website.
18+
19+
</Description>
20+
21+
<Plan type="all" />
22+
23+
AI Audit helps you manage AI crawlers on your website by providing visibility into which crawlers are accessing your webpages, and by allowing you to block unwanted crawlers.
24+
25+
<LinkButton href="/ai-audit/get-started/">Get started </LinkButton>
26+
27+
:::note[Beta phase]
28+
AI Audit is currently only available as a beta product.
29+
:::
30+
31+
---
32+
33+
## Features
34+
35+
<Feature
36+
header="AI crawler detection"
37+
href="/ai-audit/features/ai-crawler-detection/"
38+
cta="View AI crawlers"
39+
>
40+
Displays information about AI crawlers accessing your domains' pages.
41+
</Feature>
42+
43+
<Feature
44+
header="Enforce robots.txt"
45+
href="/ai-audit/features/enforce-robots-txt/"
46+
cta="Enforce your robots.txt"
47+
>
48+
Enforce your `robots.txt` with a Cloudflare WAF rule.
49+
</Feature>
50+
51+
---
52+
53+
## Related Products
54+
55+
<RelatedProduct
56+
header="Bots"
57+
href="/bots/"
58+
product="bots"
59+
>
60+
Identify and mitigate automated traffic to protect your domain from bad bots.
61+
</RelatedProduct>
62+
63+
<RelatedProduct
64+
header="Web Application Firewall"
65+
href="/waf/"
66+
product="waf"
67+
>
68+
Get automatic protection from vulnerabilities and the flexibility to create custom rules.
69+
</RelatedProduct>

src/content/docs/cloudflare-one/connections/connect-networks/index.mdx

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -9,16 +9,22 @@ import { Render } from "~/components";
99

1010
<Render file="survey" />
1111

12-
Cloudflare Tunnel provides you with a secure way to connect your resources to Cloudflare without a publicly routable IP address. With Tunnel, you do not send traffic to an external IP — instead, a lightweight daemon in your infrastructure (`cloudflared`) creates outbound-only connections to Cloudflare's global network. Cloudflare Tunnel can connect HTTP web servers, [SSH servers](/cloudflare-one/connections/connect-networks/use-cases/ssh/), [remote desktops](/cloudflare-one/connections/connect-networks/use-cases/rdp/), and other protocols safely to Cloudflare. This way, your origins can serve traffic through Cloudflare without being vulnerable to attacks that bypass Cloudflare.
12+
Cloudflare Tunnel provides you with a secure way to connect your resources to Cloudflare without a publicly routable IP address. With Tunnel, you do not send traffic to an external IP — instead, a lightweight daemon in your infrastructure (`cloudflared`) creates [outbound-only connections](/cloudflare-one/connections/connect-networks/#outbound-only-connection) to Cloudflare's global network. Cloudflare Tunnel can connect HTTP web servers, [SSH servers](/cloudflare-one/connections/connect-networks/use-cases/ssh/), [remote desktops](/cloudflare-one/connections/connect-networks/use-cases/rdp/), and other protocols safely to Cloudflare. This way, your origins can serve traffic through Cloudflare without being vulnerable to attacks that bypass Cloudflare.
1313

1414
Refer to our [reference architecture](/reference-architecture/architectures/sase/) for details on how to implement Cloudflare Tunnel into your existing infrastructure.
1515

1616
## How it works
1717

18-
Cloudflared establishes outbound connections (tunnels) between your resources and Cloudflare's global network. Tunnels are persistent objects that route traffic to DNS records. Within the same tunnel, you can run as many 'cloudflared' processes (connectors) as needed. These processes will establish connections to Cloudflare and send traffic to the nearest Cloudflare data center.
18+
Cloudflared establishes [outbound connections](/cloudflare-one/connections/connect-networks/#outbound-only-connection) (tunnels) between your resources and Cloudflare's global network. Tunnels are persistent objects that route traffic to DNS records. Within the same tunnel, you can run as many 'cloudflared' processes (connectors) as needed. These processes will establish connections to Cloudflare and send traffic to the nearest Cloudflare data center.
1919

2020
![How an HTTP request reaches a private application connected with Cloudflare Tunnel](~/assets/images/cloudflare-one/connections/connect-apps/handshake.jpg)
2121

22+
## Outbound-only connection
23+
24+
Cloudflare Tunnel uses an outbound-only connection model to enable bidirectional communication. When you install and run `cloudflared`, `cloudflared` initiates an outbound connection through your firewall from the origin to the Cloudflare global network.
25+
26+
Once the connection is established, traffic flows in both directions over the tunnel between your origin and Cloudflare. Most firewalls allow outbound traffic by default. `cloudflared` takes advantage of this default by connecting out to the Cloudflare network from the server you installed `cloudflared` on. You can then configure your firewall to allow only these outbound connections and block all inbound traffic, effectively blocking access to your origin from anything other than Cloudflare. This setup ensures that all traffic to your origin is securely routed through the tunnel.
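
The outbound-only model can be illustrated with plain TCP sockets on localhost. This is a toy sketch, not how `cloudflared` is implemented: the "edge" listener stands in for the Cloudflare network, the `connector` function stands in for the origin-side daemon, and all names are hypothetical. The point it shows is that the origin side only dials out, yet a request can still travel back to it over the established connection.

```python
import socket
import threading


def run_demo() -> str:
    # "Edge" listener standing in for the Cloudflare network (localhost only).
    edge = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    edge.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    edge.listen(1)
    port = edge.getsockname()[1]

    def connector() -> None:
        # The origin-side connector dials OUT; it never listens for inbound connections.
        with socket.create_connection(("127.0.0.1", port)) as tunnel:
            request = tunnel.recv(1024)  # request arrives over the outbound connection
            tunnel.sendall(b"response to " + request)

    worker = threading.Thread(target=connector)
    worker.start()

    conn, _ = edge.accept()
    with conn:
        conn.sendall(b"GET /")  # the edge pushes a request down the tunnel
        chunks = []
        while True:  # read until the connector closes its end
            data = conn.recv(1024)
            if not data:
                break
            chunks.append(data)
    worker.join()
    edge.close()
    return b"".join(chunks).decode()
```

Calling `run_demo()` returns the connector's reply, even though the connector side opened no listening port of its own.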
27+
2228
## Next steps
2329

2430
- Create a tunnel using the [Cloudflare dashboard](/cloudflare-one/connections/connect-networks/get-started/create-remote-tunnel/) or [API](/cloudflare-one/connections/connect-networks/get-started/create-remote-tunnel-api/).
