diff --git a/src/content/docs/bots/additional-configurations/managed-robots-txt.mdx b/src/content/docs/bots/additional-configurations/managed-robots-txt.mdx
index f54bb3c9808f28..d062ad4cfe5b54 100644
--- a/src/content/docs/bots/additional-configurations/managed-robots-txt.mdx
+++ b/src/content/docs/bots/additional-configurations/managed-robots-txt.mdx
@@ -26,7 +26,7 @@ If your website already has a `robots.txt` file — verified by a HTTP `200` res
 
 For example, without this feature enabled, the `robots.txt` content of `crawlstop.com` would be:
 
-```txt
+```txt title="Feature not enabled"
 User-agent: *
 Disallow: /lp
 Disallow: /feedback
@@ -37,16 +37,53 @@ Sitemap: https://www.crawlstop.com/sitemap.xml
 
 With the managed `robots.txt` enabled, Cloudflare will prepend our managed content before your original content, resulting in what you can view at https://www.crawlstop.com/robots.txt.
 
-**Robots.txt example**
-
- -
+```txt title="Feature enabled"
+# NOTICE: The collection of content and other data on this
+# site through automated means, including any device, tool,
+# or process designed to data mine or scrape content, is
+# prohibited except (1) for the purpose of search engine indexing or
+# artificial intelligence retrieval augmented generation or (2) with express
+# written permission from this site’s operator.
+
+# To request permission to license our intellectual
+# property and/or other materials, please contact this
+# site’s operator directly.
+
+# BEGIN Cloudflare Managed content
+
+User-agent: Amazonbot
+Disallow: /
+
+User-agent: Applebot-Extended
+Disallow: /
+
+User-agent: Bytespider
+Disallow: /
+
+User-agent: CCBot
+Disallow: /
+
+User-agent: ClaudeBot
+Disallow: /
+
+User-agent: Google-Extended
+Disallow: /
+
+User-agent: GPTBot
+Disallow: /
+
+User-agent: meta-externalagent
+Disallow: /
+
+# END Cloudflare Managed Content
+User-agent: *
+Disallow: /lp
+Disallow: /feedback
+Disallow: /langtest
+
+
+Sitemap: https://www.crawlstop.com/sitemap.xml
+```
 
 ### No robots.txt file