diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 519942f17aece0..7daae50211c039 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -99,6 +99,7 @@ /src/content/docs/d1/ @elithrar @rozenmd @vy-ton @joshthoward @oxyjun @harshil1712 @cloudflare/pcx-technical-writing /src/content/release-notes/d1.yaml @elithrar @rozenmd @vy-ton @joshthoward @oxyjun @cloudflare/pcx-technical-writing /src/content/partials/d1/ @elithrar @rozenmd @vy-ton @joshthoward @oxyjun @harshil1712 @cloudflare/pcx-technical-writing +/src/content/docs/ai-audit/ @oxyjun @kodster28 @cloudflare/pcx-technical-writing /src/content/docs/durable-objects/ @elithrar @vy-ton @joshthoward @oxyjun @harshil1712 @cloudflare/pcx-technical-writing /src/content/release-notes/durable-objects.yaml @elithrar @rozenmd @vy-ton @joshthoward @oxyjun @cloudflare/pcx-technical-writing /src/content/docs/email-routing/ @cloudflare/pcx-technical-writing diff --git a/src/assets/images/changelog/ai-audit/ai-audit-overview.png b/src/assets/images/changelog/ai-audit/ai-audit-overview.png new file mode 100644 index 00000000000000..c6bd81fc355310 Binary files /dev/null and b/src/assets/images/changelog/ai-audit/ai-audit-overview.png differ diff --git a/src/content/changelog/ai-audit/2024-09-23-ai-audit-launch.mdx b/src/content/changelog/ai-audit/2024-09-23-ai-audit-launch.mdx new file mode 100644 index 00000000000000..16fa2ab301ff9d --- /dev/null +++ b/src/content/changelog/ai-audit/2024-09-23-ai-audit-launch.mdx @@ -0,0 +1,17 @@ +--- +title: AI Audit +description: AI Audit is available to all customers +date: 2024-09-23T11:00:00Z +--- + +Every site on Cloudflare now has access to [**AI Audit**](/ai-audit/), which summarizes the crawling behavior of popular and known AI services. + +You can use this data to: + +- Understand how and how often crawlers access your site (and which content is the most popular). +- Block some or all of the AI bots accessing your site. 
+- Use Cloudflare to enforce your `robots.txt` policy via an automatic WAF rule. + +![View AI bot activity with AI Audit](~/assets/images/changelog/ai-audit/ai-audit-overview.png) + +To get started, explore [AI Audit](/ai-audit/). diff --git a/src/content/docs/ai-audit/changelog.mdx b/src/content/docs/ai-audit/changelog.mdx new file mode 100644 index 00000000000000..5cda654b0cb2fe --- /dev/null +++ b/src/content/docs/ai-audit/changelog.mdx @@ -0,0 +1,12 @@ +--- +pcx_content_type: changelog +title: Changelog +sidebar: + order: 100 +--- + +import { ProductChangelog } from "~/components"; + +{/* */} + + diff --git a/src/content/docs/ai-audit/features/detect-ai-crawlers.mdx b/src/content/docs/ai-audit/features/detect-ai-crawlers.mdx new file mode 100644 index 00000000000000..21a9ffff763954 --- /dev/null +++ b/src/content/docs/ai-audit/features/detect-ai-crawlers.mdx @@ -0,0 +1,46 @@ +--- +title: Detect AI crawlers +pcx_content_type: concept +sidebar: + order: 2 +--- + +AI Audit metrics provide you with insight into how AI crawlers are interacting with your website. + +## View AI Audit metrics + +AI Audit provides you with the following metrics to help you understand how AI crawlers are interacting with your website. + +| Metric | Description | +| --------------------------------- | ------------------------------------------------------------------------ | +| Request by AI crawlers | A graph which displays the number of crawl requests from each AI crawler | +| Summary | A list of AI crawlers with the highest number of crawl requests | +| Most popular paths by AI crawlers | The most popular pages crawled by AI crawlers, for each AI crawler | + +The **Summary** table also enables you to [Enforce your robots.txt](/ai-audit/features/enforce-robots-txt/). + +## Filter AI crawler data + +You can use filters to narrow the scope of your results. + +- **Provider:** Filter by the AI crawler owners. 
+- **Bot type:** Filter by the type of the AI bot (for example, AI crawler, AI assistant, or archiver). +- **Date range:** Filter the date range of your results. You can choose from three predetermined date ranges: + - Past 7 days + - Past 14 days + - Past month + +The values of the AI Audit metrics will update according to your filter. + +## Filter subdomains + +You can use the subdomain filter to narrow the scope of your results. + +From the dropdown, select either **All subdomains**, or the specific subdomain you wish to view. + +Selecting a specific subdomain allows you to access: + +- **Violations only** toggle: Toggles the AI Audit page to only display bots which are violating your configured rules. +- [**Enforce robots.txt policy**](/ai-audit/features/enforce-robots-txt/): Ensure bots cannot access webpages which are off-limits, as specified in your `robots.txt` file. + +The values of the AI Audit metrics will update according to your filter. diff --git a/src/content/docs/ai-audit/features/enforce-robots-txt.mdx b/src/content/docs/ai-audit/features/enforce-robots-txt.mdx new file mode 100644 index 00000000000000..b5b0d2e52572b6 --- /dev/null +++ b/src/content/docs/ai-audit/features/enforce-robots-txt.mdx @@ -0,0 +1,37 @@ +--- +title: Enforce robots.txt +pcx_content_type: concept +sidebar: + order: 5 +--- + +import { Steps } from "~/components"; + +AI Audit allows you to enforce your [`robots.txt`](/radar/glossary/#robotstxt) file, which tells bots which webpages they can and cannot access. + +To enforce `robots.txt`: + + +1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account and domain. +2. Go to **AI Audit**. +3. From the dropdown at the top of the page, select a specific subdomain where you wish to enforce `robots.txt`. +4. From **Summary**, select **Enforce robots.txt policy**. +5. From the **Enforce your robots.txt policy** page, select **Go to WAF custom rules**. +6. 
From the **New custom rule** page, name your custom rule. + - The page will automatically populate the values for the custom rule. +7. From **Then take action...**: + - For **Choose action**, select **Block**. + - For **With response type**, select **Default Cloudflare WAF block page**. +8. From **Place at**: + - For **Select order**, select **Last**. +9. Select **Deploy**. + + +This custom rule ensures that bots cannot access the pages disallowed by your `robots.txt` file. + +## Related resources + +For more information, refer to the following resources. + +- [What is robots.txt? | How a robots.txt file works](https://www.cloudflare.com/en-gb/learning/bots/what-is-robots-txt/) +- [Direct AI crawlers with managed robots.txt](/bots/additional-configurations/managed-robots-txt/) diff --git a/src/content/docs/ai-audit/features/index.mdx b/src/content/docs/ai-audit/features/index.mdx new file mode 100644 index 00000000000000..2eb5d9b996cf03 --- /dev/null +++ b/src/content/docs/ai-audit/features/index.mdx @@ -0,0 +1,12 @@ +--- +title: Features +pcx_content_type: navigation +sidebar: + group: + hideIndex: true + order: 5 +--- + +import { DirectoryListing } from "~/components"; + + diff --git a/src/content/docs/ai-audit/get-started.mdx b/src/content/docs/ai-audit/get-started.mdx new file mode 100644 index 00000000000000..70d9b364d776b6 --- /dev/null +++ b/src/content/docs/ai-audit/get-started.mdx @@ -0,0 +1,76 @@ +--- +title: Get started +pcx_content_type: get-started +sidebar: + order: 2 + group: + badge: beta +head: + - tag: title + content: Get started with Cloudflare AI Audit +description: Learn how to set up AI Audit. +--- + +import { Render, Steps } from "~/components"; + +This guide walks you through: + +- Viewing AI crawlers that are interacting with your domain. +- Creating a rule to block AI crawlers on your website. + +## Prerequisites + +1. Sign up for a [Cloudflare account](https://dash.cloudflare.com/sign-up/). +2. 
[Connect your domain to Cloudflare](/fundamentals/manage-domains/add-site/). +3. Make sure your domain is [proxying traffic through Cloudflare](/fundamentals/concepts/how-cloudflare-works/#cloudflare-as-a-reverse-proxy). + +## 1. Block all AI crawlers + +To block all AI crawlers: + +{/* prettier-ignore */} + +1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account and domain. +2. Go to **AI Audit**. +3. From **Most Popular Paths**, select **Block All**. +4. From the **Bot traffic** page, under **Block AI Bots**, select **Enable**. + + +## 2. Block specific bot categories (Enterprise plan only) + +Customers on the Enterprise plan -- and with a [Bot Management subscription](/bots/plans/bm-subscription/) -- can choose to only block specific AI crawlers, while allowing others. + +{/* prettier-ignore */} + +1. Go to **AI Audit**. +2. From **Most Popular Paths**, select **Block Some**. +3. From the **Security rules** page, select **Create rule** > **Custom rules**. +4. Provide a name for the custom rule. For example, "Block unwanted AI crawlers". +5. From the **Field** dropdown, select **Verified Bot Category**. +6. From the **Value** dropdown, select the specific bot category you wish to block. + - You can use the **And** / **Or** buttons to add more conditions. For example, you can use multiple **Or** options to include multiple bot categories in the same rule. +7. From the **Then take action...** section: + - For **Choose action**, select **Block**. + - For **With response type**, select **Default Cloudflare WAF block page**. +8. From the **Place at** section: + - For **Select order**, select **First**. +9. Select **Save**. + + +This custom rule will only block the AI bots which belong to the [verified bot categories](/bots/concepts/bot/verified-bots/categories/) you have included in your rule (in step 6). 
+ +For more information on creating a custom WAF rule, refer to [Create a custom rule in the dashboard](/waf/custom-rules/create-dashboard/). + +## 3. Review detected AI crawlers + +Review the AI crawlers detected on your site in the **Metrics** tab of the Cloudflare dashboard for key metrics. + +Refer to [Detect AI crawlers](/ai-audit/features/detect-ai-crawlers/) for more information. + +## Related resources + +Refer to the following related resources: + +- Cloudflare blog: [Start auditing and controlling the AI models accessing your content](https://blog.cloudflare.com/nl-nl/cloudflare-ai-audit-control-ai-content-crawlers/) +- Block AI crawlers that do not adhere to recommended guidelines using [Cloudflare AI Labyrinth](/bots/additional-configurations/ai-labyrinth/). +- [Direct AI crawlers with managed robots.txt](/bots/additional-configurations/managed-robots-txt/). diff --git a/src/content/docs/ai-audit/index.mdx b/src/content/docs/ai-audit/index.mdx new file mode 100644 index 00000000000000..5bd703468867f8 --- /dev/null +++ b/src/content/docs/ai-audit/index.mdx @@ -0,0 +1,69 @@ +--- +title: AI Audit +pcx_content_type: overview +sidebar: + order: 1 + badge: Beta +head: + - tag: title + content: Overview +description: AI Audit is a tool which allows you to analyze and control how third-party AI crawlers interact with your website. +--- + +import { Description, Feature, FeatureTable, Plan, LinkButton, RelatedProduct } from "~/components"; + + + +Analyze and control third-party AI crawlers in your website. + + + + + +AI Audit helps manage AI crawlers on your website by providing visibility on which crawlers are accessing your webpage, and allowing you to block unwanted crawlers. + +Get started + +:::note[Beta phase] +AI Audit is currently only available as a beta product. +::: + +--- + +## Features + + + Displays information about AI crawlers in your domains' pages. + + + + Enforce your `robots.txt` with a Cloudflare WAF rule. 
+ + +--- + +## Related Products + + +Identify and mitigate automated traffic to protect your domain from bad bots. + + + +Get automatic protection from vulnerabilities and the flexibility to create custom rules. + \ No newline at end of file diff --git a/src/content/products/ai-audit.yaml b/src/content/products/ai-audit.yaml new file mode 100644 index 00000000000000..c20581ed3f8d07 --- /dev/null +++ b/src/content/products/ai-audit.yaml @@ -0,0 +1,10 @@ +name: AI Audit +product: + title: AI Audit + url: /ai-audit/ + group: Core platform + additional_groups: [AI] +meta: + title: AI Audit + description: Analyze and control third-party AI crawlers in your website + author: "@cloudflare" diff --git a/src/content/release-notes/ai-audit.yaml b/src/content/release-notes/ai-audit.yaml new file mode 100644 index 00000000000000..cda0dc38ad5bad --- /dev/null +++ b/src/content/release-notes/ai-audit.yaml @@ -0,0 +1,11 @@ +--- +link: "/ai-audit/changelog/" +productName: AI Audit +productLink: "/ai-audit/" +productArea: Core platform +productAreaLink: /fundamentals/reference/changelog/platform/ +entries: + - publish_date: "2025-06-09" + title: Documentation for AI Audit + description: There is now documentation for AI Audit + diff --git a/src/icons/ai-audit.svg b/src/icons/ai-audit.svg new file mode 100644 index 00000000000000..d77de46b2294af --- /dev/null +++ b/src/icons/ai-audit.svg @@ -0,0 +1 @@ + \ No newline at end of file