Commit 7a246b1

aninibread and Oxyjun authored
Apply suggestions from code review
Co-authored-by: Jun Lee <[email protected]>
1 parent fc27070 commit 7a246b1

2 files changed: 9 additions and 3 deletions

src/content/docs/autorag/configuration/data-source/index.mdx

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ sidebar:
   order: 2
 ---

-You can have AutoRAG ingest data directly from the following sources:
+AutoRAG can directly ingest data from the following sources:

 | Data Source | Description |
 |---------------|-------------|

src/content/docs/autorag/configuration/data-source/website.mdx

Lines changed: 8 additions & 2 deletions
@@ -5,7 +5,13 @@ sidebar:
   order: 2
 ---

-The Website data source allows you to connect a domain you own so its pages can be crawled, stored, and indexed. You can only crawl domains that are part of the **same Cloudflare account**.
+The Website data source allows you to connect a domain you own so its pages can be crawled, stored, and indexed.
+
+:::note
+You can only crawl domains that you have onboarded onto the same Cloudflare account.
+
+Refer to [Onboard a domain](/fundamentals/manage-domains/add-site/) for more information on adding a domain to your Cloudflare account.
+:::

 ## How website crawling works
 When you connect a domain, the crawler looks for your site’s sitemap to determine which pages to visit:
@@ -26,7 +32,7 @@ You can choose how pages are parsed during crawling:
 During setup, AutoRAG creates a dedicated R2 bucket in your account to store the pages that have been crawled and downloaded as HTML files. This bucket is automatically managed and is used only for content discovered by the crawler. Any files or objects that you add directly to this bucket will not be indexed.

 ## Sync and updates
-During scheduled or manual [sync jobs](/autorag/configuration/indexing/) the crawler will check for changes on your website. If a page changes, the updated version is stored in the R2 bucket and reindexed automatically so that your search results always reflect the latest content.
+During scheduled or manual [sync jobs](/autorag/configuration/indexing/), the crawler will check for changes on your website. If a page changes, the updated version is stored in the R2 bucket and automatically reindexed so that your search results always reflect the latest content.

 ## Limits
 The regular AutoRAG [limits](/autorag/platform/limits-pricing/) apply when using the Website data source.
