src/content/docs/autorag/configuration/models.mdx (+1 -1)
@@ -32,4 +32,4 @@ If you choose Smart Default in your model selection then AutoRAG will select a C
 ### Per-request generation model override
 
-While the generation model can be set globally at the AutoRAG instance level, you can also override it on a per-request basis in the [AI Search API](/autorag/use-autorag/rest-api/#ai-search). This is useful if your application requires dynamic selection of generation models based on context or user preferences.
+While the generation model can be set globally at the AutoRAG instance level, you can also override it on a per-request basis in the [AI Search API](/autorag/usage/rest-api/#ai-search). This is useful if your application requires dynamic selection of generation models based on context or user preferences.
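For illustration, a minimal sketch of what that per-request override could look like against the AI Search endpoint. The URL shape and the `model` field are assumptions drawn from the linked REST API reference, not from this diff, and every identifier below is a placeholder:

```ts
// Hypothetical per-request generation model override via the AI Search
// REST endpoint. Account ID, instance name, token, and model name are
// placeholders; confirm the exact request shape in the API reference.
const ACCOUNT_ID = "<account_id>";
const AUTORAG_NAME = "<autorag_name>";
const API_TOKEN = "<api_token>";

async function aiSearch(query: string, model?: string): Promise<unknown> {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/autorag/rags/${AUTORAG_NAME}/ai-search`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${API_TOKEN}`,
        "Content-Type": "application/json",
      },
      // Passing `model` overrides the instance-level generation model
      // for this request only; omit it to use the configured default.
      body: JSON.stringify(model ? { query, model } : { query }),
    },
  );
  return res.json();
}

// Usage: override the generation model for a single request.
// aiSearch("Summarize the changelog", "<model_name>");
```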
src/content/docs/autorag/configuration/retrieval-configuration.mdx (+1 -1)
@@ -39,6 +39,6 @@ If no results meet the threshold, AutoRAG will not generate a response.
 ## Configuration
 
-These values can be configured at the AutoRAG instance level or overridden on a per-request basis using the [REST API](/autorag/use-autorag/rest-api/) or the [Workers binding](/autorag/use-autorag/workers-binding/).
+These values can be configured at the AutoRAG instance level or overridden on a per-request basis using the [REST API](/autorag/usage/rest-api/) or the [Workers binding](/autorag/usage/workers-binding/).
 
 Use the parameters `match_threshold` and `max_num_results` to customize retrieval behavior per request.
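A minimal sketch of those per-request overrides through the Workers binding, assuming an AI binding named `AI` and an AutoRAG instance called `my-autorag`; the two parameter names come from the sentence above and should be verified against the binding reference:

```ts
// Sketch of per-request retrieval overrides via the Workers binding.
export interface Env {
  AI: any; // Workers AI binding; typed as `Ai` with @cloudflare/workers-types
}

export default {
  async fetch(_request: Request, env: Env): Promise<Response> {
    const result = await env.AI.autorag("my-autorag").aiSearch({
      query: "What is our refund policy?",
      max_num_results: 5,   // return at most 5 retrieved chunks
      match_threshold: 0.6, // ignore matches scoring below 0.6
    });
    return Response.json(result);
  },
};
```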
src/content/docs/autorag/how-autorag-works.mdx (+11 -11)
@@ -2,7 +2,7 @@
 pcx_content_type: concept
 title: How AutoRAG works
 sidebar:
-  order: 3
+  order: 2
 ---
 
 AutoRAG simplifies the process of building and managing a Retrieval-Augmented Generation (RAG) pipeline using Cloudflare's serverless platform. Instead of manually stitching together components like Workers AI, Vectorize, and writing custom logic for indexing, retrieval, and generation, AutoRAG handles it all for you. It also continuously indexes your data to ensure responses stay accurate and up-to-date.
@@ -19,27 +19,27 @@ Indexing begins automatically when you create an AutoRAG instance and connect a
 Here is what happens during indexing:
 
 1. **Data ingestion:** AutoRAG reads from your connected data source. Files that are unsupported or exceed size limits are flagged and reported as indexing errors.
-2. **Markdown conversion:** AutoRAG uses a Worker powered by [Workers AI’s Markdown Conversion](/workers-ai/markdown-conversion/) to convert all data into structured Markdown. This ensures consistency across diverse file types. For images, Workers AI is used to perform object detection followed by vision-to-language transformation to convert images into Markdown text.
-3. **Chunking:** The extracted text is chunked to improve retrieval granularity.
+2. **Markdown conversion:** AutoRAG uses [Workers AI’s Markdown Conversion](/workers-ai/markdown-conversion/) to convert all data into structured Markdown. This ensures consistency across diverse file types. For images, Workers AI is used to perform object detection followed by vision-to-language transformation to convert images into Markdown text.
+3. **Chunking:** The extracted text is chunked into smaller pieces to improve retrieval granularity.
 4. **Embedding:** Each chunk is embedded using Workers AI’s embedding model to transform the content into vectors.
-5. **Vector storage:** The resulting vectors, along with metadata like source location and file name, are stored in a Vectorize index created on your account.
+5. **Vector storage:** The resulting vectors, along with metadata like source location and file name, are stored in a Cloudflare Vectorize database created on your account.
 
-[INSERT IMAGE]
+![Diagram of the AutoRAG indexing process](…)
 
 ## Querying
 
 Once indexing is complete, AutoRAG is ready to respond to end-user queries in real time.
 
 Here’s how the querying pipeline works:
 
-1. **Receive query from AutoRAG API:** The query workflow begins when you send a request to the AutoRAG API.
-2. **Query rewriting (optional):** The input query can be rewritten by one of Workers AI’s LLMs to improve semantic retrieval, if enabled.
-3. **Embedding the query:** The rewritten (or original) query is turned into a vector via the same embedding model in Workers AI.
-4. **Querying Vectorize index:** The query vector is [searched](/vectorize/best-practices/query-vectors/) against the Vectorize index associated with your AutoRAG instance.
+1. **Receive query from AutoRAG API:** The query workflow begins when you send a request to either AutoRAG’s AI Search or Search endpoint.
+2. **Query rewriting (optional):** AutoRAG provides the option to rewrite the input query using one of Workers AI’s LLMs, transforming the original query into a more effective search query to improve retrieval quality.
+3. **Embedding the query:** The rewritten (or original) query is transformed into a vector via the same embedding model used to embed your data, so that it can be compared against your vectorized data to find the most relevant matches.
+4. **Querying Vectorize index:** The query vector is searched against the stored vectors in the Vectorize database associated with your AutoRAG.
 5. **Content retrieval:** Vectorize returns the most relevant chunks and their metadata, and the original content is retrieved from the R2 bucket. These are passed to a text-generation model.
-6. **Response generation:** A text-generation model from Workers AI is used to a response using the retrieved content and the original user’s query using the generation model you select.
+6. **Response generation:** A text-generation model from Workers AI is used to generate a response using the retrieved content and the original user’s query.