Commit bae3acf

committed
changes to autorag
1 parent fd71e55 commit bae3acf

6 files changed: +47 -18 lines changed

src/content/docs/autorag/configuration/chunking.mdx

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ sidebar:

 Chunking is the process of splitting large data into smaller segments before embedding them for search. AutoRAG uses **recursive chunking**, which breaks your content at natural boundaries (like paragraphs or sentences), and then further splits it if the chunks are too large.

-## What is recurisve chunking
+## What is recursive chunking

 Recursive chunking tries to keep chunks meaningful by:
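
Note on the chunking text in this hunk: the idea of breaking at natural boundaries and then splitting further when a piece is still too large can be sketched as below. This is a minimal illustration only, not AutoRAG's implementation. The function name `recursiveChunk`, the separator list, and measuring size in characters (the docs describe chunk size in tokens) are all assumptions made for the example.

```typescript
// Hypothetical sketch of recursive chunking. Sizes are in characters here,
// whereas AutoRAG's chunk size is described in tokens.
function recursiveChunk(
  text: string,
  maxLen: number,
  // Assumed boundaries, coarse to fine: paragraph, sentence, word.
  separators: string[] = ["\n\n", ". ", " "],
): string[] {
  const trimmed = text.trim();
  if (trimmed.length === 0) return [];
  if (trimmed.length <= maxLen) return [trimmed];

  const [sep, ...finer] = separators;

  // No boundaries left to try: fall back to a hard cut every maxLen characters.
  if (sep === undefined) {
    const pieces: string[] = [];
    for (let i = 0; i < trimmed.length; i += maxLen) {
      pieces.push(trimmed.slice(i, i + maxLen));
    }
    return pieces;
  }

  // Split at the current boundary, then recurse into each part using only
  // the finer boundaries. Note that split() drops the separator itself.
  return trimmed
    .split(sep)
    .flatMap((part) => recursiveChunk(part, maxLen, finer));
}
```

The sketch never merges small neighbouring pieces back together and has no overlap between chunks, both of which a production chunker would likely handle.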

src/content/docs/autorag/configuration/data-source.mdx

Lines changed: 36 additions & 3 deletions
@@ -15,13 +15,46 @@ AutoRAG will automatically scan and process supported files stored in that bucke

 AutoRAG has different file size limits depending on the file type:

-- Up to **4 MB** for files that are already in plain text or Markdown.
-- Up to **1 MB** for files that need to be converted into Markdown (like PDFs or other rich formats).
+- **Plain text files:** Up to **4 MB**
+- **Rich format files:** Up to **1 MB**

 Files that exceed these limits will not be indexed and will show up in the error logs.

 ## File types

-AutoRAG is powered by and accepts the same file types as [Markdown Conversion](/workers-ai/markdown-conversion/). The following table lists the supported formats:
+AutoRAG can ingest a variety of different file types to power your RAG. The following plain text files and rich format files are supported.
+
+### Plain text file types
+
+AutoRAG supports the following plain text file types:
+
+| Format | File extensions | Mime Type |
+| ---------- | ------------------------------------------------------------------------------ | --------------------------------------------------------------------- |
+| Text | `.txt` | `text/plain` |
+| Log | `.log` | `text/plain` |
+| Config | `.ini`, `.conf`, `.env`, `.properties`, `.gitignore`, `.editorconfig`, `.toml` | `text/plain`, `text/toml` |
+| Markdown | `.markdown`, `.md`, `.mdx` | `text/markdown` |
+| LaTeX | `.tex`, `.latex` | `application/x-tex`, `application/x-latex` |
+| Script | `.sh`, `.bat` , `.ps1` | `application/x-sh` , `application/x-msdos-batch`, `text/x-powershell` |
+| SGML | `.sgml` | `text/sgml` |
+| JSON | `.json` | `application/json` |
+| YAML | `.yaml`, `.yml` | `application/x-yaml` |
+| CSS | `.css` | `text/css` |
+| JavaScript | `.js` | `application/javascript` |
+| PHP | `.php` | `application/x-httpd-php` |
+| Python | `.py` | `text/x-python` |
+| Ruby | `.rb` | `text/x-ruby` |
+| Java | `.java` | `text/x-java-source` |
+| C | `.c` | `text/x-c` |
+| C++ | `.cpp`, `.cxx` | `text/x-c++` |
+| C Header | `.h`, `.hpp` | `text/x-c-header` |
+| Go | `.go` | `text/x-go` |
+| Rust | `.rs` | `text/rust` |
+| Swift | `.swift` | `text/swift` |
+| Dart | `.dart` | `text/dart` |
+
+### Rich format file types
+
+AutoRAG uses [Markdown Conversion](/workers-ai/markdown-conversion/) to convert rich format files to markdown. The following table lists the supported formats that will be converted to Markdown:

 <Render file="markdown-conversion-support" product="workers-ai" />
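
Note on the size limits in this file: a pre-upload check along the following lines could flag files that would be skipped by indexing. It is a rough sketch under stated assumptions, not part of AutoRAG. The helper names are hypothetical, the extension set is only a subset of the plain text table above, "MB" is assumed to mean 1024 × 1024 bytes, and any extension outside the set is treated as a rich format without checking that it is actually supported.

```typescript
// Assumption: 1 MB = 1024 * 1024 bytes; the docs only say "MB".
const MB = 1024 * 1024;
const PLAIN_TEXT_LIMIT = 4 * MB; // plain text files: up to 4 MB
const RICH_FORMAT_LIMIT = 1 * MB; // rich format files: up to 1 MB

// Subset of the plain text extensions listed in the table (illustrative only).
const PLAIN_TEXT_EXTENSIONS = new Set([
  ".txt", ".log", ".md", ".mdx", ".markdown",
  ".json", ".yaml", ".yml", ".toml",
]);

// Hypothetical helper: lower-cased extension of the last path segment.
function extensionOf(key: string): string {
  const name = key.slice(key.lastIndexOf("/") + 1);
  const dot = name.lastIndexOf(".");
  return dot === -1 ? "" : name.slice(dot).toLowerCase();
}

// Hypothetical helper: true if the object is within the limit for its type.
// Anything not in PLAIN_TEXT_EXTENSIONS is treated as rich format here.
function withinSizeLimit(key: string, sizeBytes: number): boolean {
  const limit = PLAIN_TEXT_EXTENSIONS.has(extensionOf(key))
    ? PLAIN_TEXT_LIMIT
    : RICH_FORMAT_LIMIT;
  return sizeBytes <= limit;
}
```

For example, a 3 MB `notes/readme.md` passes the check, while a 3 MB `report.pdf` does not, since it falls under the 1 MB rich format limit.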

src/content/docs/autorag/configuration/index.mdx

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ The table below lists all available configuration options:

 | Configuration | Editable after creation | Description |
 | ---------------------------------------------------------------------------- | ----------------------- | ------------------------------------------------------------------------------------------ |
-| [Data source](/autorag/configuration/data-source/) | no | The source where your knowledge base is stored (for example, R2 bucket) |
+| [Data source](/autorag/configuration/data-source/) | no | The source where your knowledge base is stored |
 | [Chunk size](/autorag/configuration/chunking/) | yes | Number of tokens per chunk |
 | [Chunk overlap](/autorag/configuration/chunking/) | yes | Number of overlapping tokens between chunks |
 | [Embedding model](/autorag/configuration/models/) | no | Model used to generate vector embeddings |

src/content/docs/autorag/index.mdx

Lines changed: 2 additions & 3 deletions
@@ -30,9 +30,8 @@ AutoRAG lets you create fully-managed, retrieval-augmented generation (RAG) pipe

 You can use AutoRAG to build:

-- **Support chatbots:** Answer customer questions using your own product content.
-- **Internal tools:** Help teams quickly find the information they need using internal documentation.
-- **Enterprise knowledge search:** Make documentation easy to search and use.
+- **Product Chatbot:** Answer customer questions using your own product content.
+- **Docs Search:** Make documentation easy to search and use.

 <div>
 <LinkButton href="/autorag/get-started">Get started</LinkButton>

src/content/docs/autorag/platform/limits-pricing.mdx

Lines changed: 7 additions & 9 deletions
@@ -7,11 +7,9 @@ sidebar:

 ## Pricing

-During the open beta, AutoRAG is **free to enable**. Compute operations for indexing, retrieval, and augmentation incur no additional cost during this phase.
+During the open beta, AutoRAG is **free to enable**. When you create an AutoRAG instance, it provisions and runs on top of Cloudflare services in your account. These resources are **billed as part of your Cloudflare usage**, and includes:

-When you create an AutoRAG instance, it provisions and runs on top of Cloudflare services provisioned within your own account. You retain full visibility and control over these resources, and they are billed as part of your existing Cloudflare usage. These services include:
-
-| Service | Description |
+| Service & Pricing | Description |
 | ------------------------------------------------ | ----------------------------------------------------------------------------------------- |
 | [**R2**](/r2/pricing/) | Stores your source data |
 | [**Vectorize**](/vectorize/platform/pricing/) | Stores vector embeddings and powers semantic search |
@@ -24,10 +22,10 @@ For more information about how each resource is used within AutoRAG, reference [

 The following limits currently apply to AutoRAG during the open beta:

-| Limit | Value |
-| --------------------------------- | ------------------------------------------------------- |
-| Max AutoRAG instances per account | 10 |
-| Max files per AutoRAG | 100,000 |
-| Max file size | 4 MB (plain text or Markdown) / 1 MB (other file types) |
+| Limit | Value |
+| --------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Max AutoRAG instances per account | 10 |
+| Max files per AutoRAG | 100,000 |
+| Max file size | 4 MB ([Plain text](/autorag/configuration/data-source/#plain-text-file-types)) / 1 MB ([Rich format](/autorag/configuration/data-source/#rich-format-file-types)) |

 These limits are subject to change as AutoRAG evolves beyond open beta.

src/content/docs/autorag/usage/workers-binding.mdx

Lines changed: 0 additions & 1 deletion
@@ -42,7 +42,6 @@ const answer = await env.AI.autorag("my-autorag").aiSearch({
 ranking_options: {
 score_threshold: 0.7,
 },
-stream: false,
 });
 ```
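
Note on this hunk: after the change, the documented `aiSearch` call no longer passes `stream`. The sketch below shows roughly what such a call might look like when wrapped in a helper. The interfaces `EnvLike`, `AutoRagLike`, and `AiSearchRequest` and the helper `ask` are hypothetical stand-ins written for this example (the real types come from the Workers runtime), and the `query` field is an assumption, since the visible part of the hunk only shows `ranking_options.score_threshold`.

```typescript
// Hypothetical minimal types for illustration; not the real binding types.
interface AiSearchRequest {
  query: string; // assumed field name, not visible in the hunk
  ranking_options?: { score_threshold?: number };
}

interface AutoRagLike {
  aiSearch(request: AiSearchRequest): Promise<unknown>;
}

interface EnvLike {
  AI: { autorag(name: string): AutoRagLike };
}

// Hypothetical helper mirroring the call shape in the diff, without `stream`.
async function ask(env: EnvLike, query: string): Promise<unknown> {
  return env.AI.autorag("my-autorag").aiSearch({
    query,
    ranking_options: {
      score_threshold: 0.7,
    },
  });
}
```

Typing the environment through a narrow interface like this also makes the helper easy to exercise with a stub in place of the real binding.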