---
title: New Robots.txt tab for tracking crawler compliance
description: Monitor robots.txt file health, track crawler violations, and gain visibility into how AI crawlers interact with your directives.
date: 2025-10-21
---

[AI Crawl Control](/ai-crawl-control/) now includes a **Robots.txt** tab that provides insights into how AI crawlers interact with your `robots.txt` files.

## What's new

The Robots.txt tab allows you to:

- Monitor the health status of `robots.txt` files across all your hostnames, including HTTP status codes, and identify hostnames that need a `robots.txt` file.
- Track the total number of requests to each `robots.txt` file, with breakdowns of allowed versus unsuccessful requests.
- Check whether your `robots.txt` files contain [Content Signals](https://contentsignals.org/) directives for AI training, search, and AI input (see the example after this list).
- Identify crawlers that request paths explicitly disallowed by your `robots.txt` directives, including the crawler name, operator, violated path, specific directive, and violation count.
- Filter `robots.txt` request data by crawler, operator, category, and custom time ranges.
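For reference, here is a minimal sketch of a `robots.txt` file that combines standard directives with Content Signals. The crawler name, path, and signal values are illustrative placeholders rather than Cloudflare recommendations; see [contentsignals.org](https://contentsignals.org/) for the authoritative syntax.

```txt
# Content Signals express preferences for how fetched content may be used.
User-Agent: *
Content-Signal: search=yes, ai-input=no, ai-train=no
Allow: /

# Paths disallowed here are what the violations table checks against.
User-Agent: ExampleBot
Disallow: /members/
```

In this sketch, a crawler identifying as `ExampleBot` that requests `/members/` would appear in the violations table with the matching directive and a violation count.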

## Take action

When you identify non-compliant crawlers, you can:

- Block the crawler in the [Crawlers tab](/ai-crawl-control/features/manage-ai-crawlers/)
- Create custom [WAF rules](/waf/) for path-specific security (see the example expression after this list)
- Use [Redirect Rules](/rules/url-forwarding/) to guide crawlers to appropriate areas of your site
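As a rough sketch, a custom WAF rule that enforces a disallowed path against a specific crawler could use a filter expression along these lines, with **Block** as the action. The bot name and path are placeholders to replace with the values reported in the violations table:

```txt
(http.user_agent contains "ExampleBot") and starts_with(http.request.uri.path, "/members/")
```

A Redirect Rule can match on a similar expression to send the crawler to a public section of your site instead of blocking it outright.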

To get started, go to **AI Crawl Control** > **Robots.txt** in the Cloudflare dashboard. Learn more in the [Track robots.txt documentation](/ai-crawl-control/features/track-robots-txt/).