src/content/docs/autorag/configuration/data-source/website.mdx
Lines changed: 13 additions & 0 deletions
@@ -26,6 +26,19 @@ When you connect a domain, the crawler looks for your website’s sitemap to det
Pages are visited according to the `<priority>` value set in the sitemap, if that field is defined.
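As a rough illustration of priority-ordered crawling, the sketch below parses a sitemap and lists its URLs from highest to lowest `<priority>`. It is not the AutoRAG crawler's implementation: the function name `urls_by_priority` is hypothetical, descending order is an assumption, and entries without a `<priority>` fall back to `0.5`, the default defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace used by sitemaps that follow the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Default priority defined by the sitemaps.org protocol.
DEFAULT_PRIORITY = 0.5


def urls_by_priority(sitemap_xml: str) -> list[str]:
    """Return sitemap URLs ordered from highest to lowest priority.

    Hypothetical helper: the real crawler's ordering and tie-breaking
    rules are not documented here.
    """
    root = ET.fromstring(sitemap_xml)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        if not loc:
            continue  # skip entries without a location
        raw = url.findtext("sm:priority", namespaces=SITEMAP_NS)
        priority = float(raw) if raw else DEFAULT_PRIORITY
        entries.append((priority, loc.strip()))
    # list.sort is stable, so equal priorities keep their sitemap order.
    entries.sort(key=lambda entry: -entry[0])
    return [loc for _, loc in entries]
```

Because the sort is stable, a sitemap that defines no priorities at all comes back in its original order.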
## How to set WAF rules to allowlist the AutoRAG crawler
If you have Security rules configured to block bot activity, you can add a rule to allowlist AutoRAG's crawler bot.
1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account and domain.
2. Go to **Security** > **Security rules**.
3. To create a new empty rule, select **Create rule** > **Custom rules**.
4. Enter a descriptive name for the rule in **Rule name**, for example, "Allow AutoRAG".
5. Under **When incoming requests match**, use the **Field** drop-down list to choose **Bot Detection ID**. For **Operator**, select **equals**, and for **Value**, enter `122933950`.
6. Under **Then take action**, select **Skip** from the **Choose action** drop-down list.
7. Under **Place at**, select **First** from the **Select order** drop-down list. Placing the rule first means it is evaluated before the rules that follow it.
8. To save and deploy your rule, select **Deploy**.
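For readers who manage rules as code, the dashboard steps above can be sketched as a rule payload in the shape used by Cloudflare's Rulesets API. Treat this as an assumption-laden sketch rather than the documented procedure: the helper name `build_allow_rule` is hypothetical, the expression assumes that the **Bot Detection ID** field maps to `cf.bot_management.detection_ids`, the `action_parameters` value assumes you want to skip all remaining custom rules, and `position.index` is assumed to correspond to placing the rule **First**. Only the detection ID and the **Skip** action come from the steps themselves.

```python
import json

# Detection ID for the AutoRAG crawler, taken from step 5.
AUTORAG_DETECTION_ID = 122933950


def build_allow_rule(detection_id: int = AUTORAG_DETECTION_ID) -> dict:
    """Build a custom-rule payload that skips rules for one bot detection ID.

    Hypothetical helper: field names follow the Rulesets API as understood
    here and should be checked against the API reference before use.
    """
    return {
        "description": "Allow AutoRAG",
        # Assumed expression form of "Bot Detection ID equals <id>".
        "expression": f"any(cf.bot_management.detection_ids[*] eq {detection_id})",
        "action": "skip",
        # Assumption: skip the remaining rules in the current ruleset.
        "action_parameters": {"ruleset": "current"},
        # Assumption: index 1 places the rule first in the ruleset.
        "position": {"index": 1},
    }


if __name__ == "__main__":
    # Print the payload for review; no request is sent.
    print(json.dumps(build_allow_rule(), indent=2))
```

The sketch only builds and prints the payload. Sending it would require a zone ID, the ruleset ID, and an API token, none of which are specified here.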
## Parsing options
You can choose how pages are parsed during crawling: