src/content/docs/ai-search/configuration/data-source/website.mdx (23 additions, 2 deletions)
sidebar:
  order: 2
---

import { DashButton, Steps } from "~/components"

The Website data source allows you to connect a domain you own so its pages can be crawled, stored, and indexed.

:::note
You can only crawl domains that you have onboarded onto the same Cloudflare account.
Refer to [Onboard a domain](/fundamentals/manage-domains/add-site/) for more information on adding a domain to your Cloudflare account.
:::

:::caution[Bot protection may block crawling]
If you use Cloudflare products that control or restrict bot traffic, such as [Bot Management](/bots/), [Web Application Firewall (WAF)](/waf/), or [Turnstile](/turnstile/), the same rules will apply to the AI Search (AutoRAG) crawler. Make sure to configure an exception or an allowlist for the AutoRAG crawler in your settings.
:::
## How website crawling works
When you connect a domain, the crawler looks for your website’s sitemap to determine which pages to visit:
1. The crawler first checks `robots.txt` for listed sitemaps. If the file exists, it reads every sitemap referenced inside it.

Pages are visited according to the `<priority>` attribute set in the sitemaps, if this field is defined.
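
To make the flow above concrete, here is a minimal sketch of sitemap discovery and priority ordering. It is not AutoRAG's implementation: the function names, the regex-based XML parsing, and the default priority of `0.5` are assumptions made for illustration.

```ts
// Sketch of sitemap discovery via robots.txt and <priority>-based ordering.
// Illustrative only; not AutoRAG's actual crawler.

interface SitemapUrl {
  loc: string;
  priority: number; // assumed default of 0.5 when <priority> is not defined
}

// Read robots.txt and collect any "Sitemap:" directives listed in it.
async function discoverSitemaps(origin: string): Promise<string[]> {
  const res = await fetch(new URL("/robots.txt", origin));
  if (!res.ok) return [];
  const text = await res.text();
  return text
    .split("\n")
    .filter((line) => line.toLowerCase().startsWith("sitemap:"))
    .map((line) => line.slice("sitemap:".length).trim());
}

// Pull <url> entries out of a sitemap, keeping <loc> and <priority>.
async function readSitemap(sitemapUrl: string): Promise<SitemapUrl[]> {
  const xml = await (await fetch(sitemapUrl)).text();
  return [...xml.matchAll(/<url>([\s\S]*?)<\/url>/g)].map(([, entry]) => ({
    loc: /<loc>(.*?)<\/loc>/.exec(entry)?.[1] ?? "",
    priority: Number(/<priority>(.*?)<\/priority>/.exec(entry)?.[1] ?? "0.5"),
  }));
}

// Return the crawl order: URLs from all discovered sitemaps, highest priority first.
async function crawlOrder(origin: string): Promise<string[]> {
  const sitemaps = await discoverSitemaps(origin);
  const urls = (await Promise.all(sitemaps.map(readSitemap))).flat();
  return urls.sort((a, b) => b.priority - a.priority).map((u) => u.loc);
}
```

Sitemap index files, robots directives, and crawl scheduling are deliberately omitted to keep the sketch short.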
## How to set WAF rules to allowlist the AutoRAG crawler
If you have Security rules configured to block bot activity, you can add a rule to allowlist AutoRAG's crawler bot using the steps below (an API-based sketch of the same rule follows the steps).
<Steps>
1. In the Cloudflare dashboard, go to the **Security rules** page of your account and domain.
2. To create a new empty rule, select **Create rule** > **Custom rules**.
3. Enter a descriptive name for the rule in **Rule name**, such as `Allow AutoRAG`.
4. Under **When incoming requests match**, use the **Field** drop-down list to choose _Bot Detection ID_. For **Operator**, select _equals_. For **Value**, enter `122933950`.
5. Under **Then take action**, in the **Choose action** dropdown, choose _Skip_.
6. Under **Place at**, set the rule order to _First_ in the **Select order** dropdown so that this rule is applied before any subsequent rules.
7. To save and deploy your rule, select **Deploy**.
</Steps>
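
If you manage your WAF configuration programmatically, the same allowlist rule can be sketched against the Cloudflare Rulesets API. Treat this as an assumption-laden sketch rather than a verified recipe: the entrypoint lookup and rule-creation endpoints are part of the public Rulesets API, but the expression used here for "Bot Detection ID equals `122933950`" is an assumption about how the dashboard rule translates (bot detection fields may also depend on your plan), and the environment variable names are placeholders.

```ts
// Sketch: create an "Allow AutoRAG" skip rule via the Cloudflare Rulesets API.
// Assumptions: the env var names and the expression for Bot Detection ID.

const ZONE_ID = process.env.CF_ZONE_ID!;     // placeholder: your zone ID
const API_TOKEN = process.env.CF_API_TOKEN!; // placeholder: token with zone WAF edit access

const API = "https://api.cloudflare.com/client/v4";
const headers = {
  Authorization: `Bearer ${API_TOKEN}`,
  "Content-Type": "application/json",
};

async function allowAutoRagCrawler(): Promise<void> {
  // 1. Look up the zone's entrypoint ruleset for the custom rules phase.
  const phase = "http_request_firewall_custom";
  const entrypoint = await fetch(
    `${API}/zones/${ZONE_ID}/rulesets/phases/${phase}/entrypoint`,
    { headers },
  ).then((r) => r.json());
  const rulesetId: string = entrypoint.result.id;

  // 2. Add a "skip" rule matching the AutoRAG crawler's bot detection ID.
  const rule = {
    description: "Allow AutoRAG",
    action: "skip",
    action_parameters: { ruleset: "current" }, // skip the remaining custom rules
    // Assumed translation of the dashboard's "Bot Detection ID equals 122933950".
    expression: "any(cf.bot_management.detection_ids[*] eq 122933950)",
    enabled: true,
  };
  const created = await fetch(
    `${API}/zones/${ZONE_ID}/rulesets/${rulesetId}/rules`,
    { method: "POST", headers, body: JSON.stringify(rule) },
  ).then((r) => r.json());

  console.log("Rule created:", created.success);
}

allowAutoRagCrawler().catch(console.error);
```

This sketch appends the rule to the end of the ruleset, whereas step 6 above places the rule first, so reorder it afterwards if other custom rules could match the crawler before it.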
## Parsing options
You can choose how pages are parsed during crawling: