2 changes: 1 addition & 1 deletion .github/CODEOWNERS
@@ -26,7 +26,7 @@
/src/content/release-notes/workers-ai.yaml @kathayl @mchenco @kodster28 @cloudflare/pcx-technical-writing
/src/content/release-notes/ai-gateway.yaml @kathayl @mchenco @kodster28 @cloudflare/pcx-technical-writing
/src/content/release-notes/vectorize.yaml @elithrar @mchenco @sejoker @cloudflare/pcx-technical-writing
/src/content/docs/autorag/ @rita3ko @irvinebroque @aninibread @cloudflare/pcx-technical-writing
/src/content/docs/ai-search/ @rita3ko @irvinebroque @aninibread @cloudflare/pcx-technical-writing

# Analytics & Logs

2 changes: 1 addition & 1 deletion .github/workflows/publish-production.yml
@@ -67,7 +67,7 @@ jobs:
--config bin/rclone.conf \
distmd \
zt:zt-dashboard-dev-docs
- name: Upload vendored Markdown files to AutoRAG DevDocs bucket
- name: Upload vendored Markdown files to AI Search DevDocs bucket
env:
AWS_ACCESS_KEY_ID: ${{ secrets.AUTORAG_DEVDOCS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AUTORAG_DEVDOCS_SECRET_ACCESS_KEY }}
7 changes: 5 additions & 2 deletions public/__redirects
@@ -245,8 +245,8 @@
/api-shield/security/sequence-mitigation/configure/ /api-shield/security/sequence-mitigation/api/ 301

#autorag
/autorag/usage/recipes/ /autorag/how-to/ 301
/autorag/configuration/metadata-filtering/ /autorag/configuration/metadata/ 301
/autorag/usage/recipes/ /ai-search/how-to/ 301
/autorag/configuration/metadata-filtering/ /ai-search/configuration/metadata/ 301

# bots
/bots/about/plans/ /bots/plans/ 301
@@ -2321,6 +2321,9 @@
# AI Crawl Control
/ai-audit/* /ai-crawl-control/:splat 301

# AutoRAG to AI Search
/autorag/* /ai-search/:splat 301

# Cloudflare One / Zero Trust
/cloudflare-one/connections/connect-networks/install-and-setup/tunnel-guide/local/as-a-service/* /cloudflare-one/connections/connect-networks/configure-tunnels/local-management/as-a-service/:splat 301
/cloudflare-one/connections/connect-apps/install-and-setup/deployment-guides/* /cloudflare-one/connections/connect-networks/deployment-guides/:splat 301
@@ -2,22 +2,22 @@
title: Create fully-managed RAG pipelines for your AI applications with AutoRAG
description: AutoRAG lets you create fully-managed retrieval-augmented generation (RAG) pipelines that continuously update and scale on Cloudflare.
products:
- autorag
- ai-search
- vectorize
date: 2025-04-07
---

[AutoRAG](/autorag) is now in open beta, making it easy for you to build fully-managed retrieval-augmented generation (RAG) pipelines without managing infrastructure. Just upload your docs to [R2](/r2/get-started/), and AutoRAG handles the rest: embeddings, indexing, retrieval, and response generation via API.
[AutoRAG](/ai-search/) is now in open beta, making it easy for you to build fully-managed retrieval-augmented generation (RAG) pipelines without managing infrastructure. Just upload your docs to [R2](/r2/get-started/), and AutoRAG handles the rest: embeddings, indexing, retrieval, and response generation via API.

![AutoRAG open beta demo](~/assets/images/changelog/autorag/autorag-open-beta.gif)
![AutoRAG open beta demo](~/assets/images/changelog/ai-search/autorag-open-beta.gif)

With AutoRAG, you can:

- **Customize your pipeline:** Choose from [Workers AI](/workers-ai) models, configure chunking strategies, edit system prompts, and more.
- **Instant setup:** AutoRAG provisions everything you need, from [Vectorize](/vectorize) and [AI Gateway](/ai-gateway) to pipeline logic, so you can go from zero to a working RAG pipeline in seconds.
- **Keep your index fresh:** AutoRAG continuously syncs your index with your data source to ensure responses stay accurate and up to date.
- **Ask questions:** Query your data and receive grounded responses via a [Workers binding](/autorag/usage/workers-binding/) or [API](/autorag/usage/rest-api/).
- **Ask questions:** Query your data and receive grounded responses via a [Workers binding](/ai-search/usage/workers-binding/) or [API](/ai-search/usage/rest-api/).
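
For instance, a minimal Worker sketch of the binding flow — the instance name `my-rag`, the binding name `AI`, and the query are placeholders rather than values from this post:

```ts
interface Env {
  AI: Ai; // Workers AI binding, typed via @cloudflare/workers-types
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Ask a question against your AutoRAG instance and return the
    // grounded, model-generated answer as JSON.
    const answer = await env.AI.autorag("my-rag").aiSearch({
      query: "How do I configure chunking?",
    });
    return Response.json(answer);
  },
};
```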

Whether you're building internal tools, AI-powered search, or a support assistant, AutoRAG gets you from idea to deployment in minutes.

Get started in the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag) or check out the [guide](/autorag/get-started/) for instructions on how to build your RAG pipeline today.
Get started in the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag) or check out the [guide](/ai-search/get-started/) for instructions on how to build your RAG pipeline today.
@@ -2,13 +2,13 @@
title: Metadata filtering and multitenancy support in AutoRAG
description: Add metadata filters to AutoRAG queries to enable multitenancy and control the scope of retrieved results.
products:
- autorag
- ai-search
date: 2025-04-23
---

You can now filter [AutoRAG](/autorag) search results by `folder` and `timestamp` using [metadata filtering](/autorag/configuration/metadata) to narrow down the scope of your query.
You can now filter [AutoRAG](/ai-search/) search results by `folder` and `timestamp` using [metadata filtering](/ai-search/configuration/metadata) to narrow down the scope of your query.

This makes it easy to build [multitenant experiences](/autorag/how-to/multitenancy/) where each user can only access their own data. By organizing your content into per-tenant folders and applying a `folder` filter at query time, you ensure that each tenant retrieves only their own documents.
This makes it easy to build [multitenant experiences](/ai-search/how-to/multitenancy/) where each user can only access their own data. By organizing your content into per-tenant folders and applying a `folder` filter at query time, you ensure that each tenant retrieves only their own documents.

**Example folder structure:**

@@ -33,4 +33,4 @@ const response = await env.AI.autorag("my-autorag").search({

You can use metadata filtering by creating a new AutoRAG or reindexing existing data. To reindex all content in an existing AutoRAG, update any chunking setting and select **Sync index**. Metadata filtering is available for all data indexed on or after **April 21, 2025**.
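
As a reference, here is a minimal sketch of a folder-scoped query through the Workers binding, assuming the `eq` filter shape described in the metadata documentation; the instance name and folder prefix are placeholders:

```ts
const response = await env.AI.autorag("my-autorag").search({
  query: "When is my next billing date?",
  filters: {
    type: "eq",
    key: "folder",
    value: "customer-123/", // per-tenant folder prefix (placeholder)
  },
});
```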

If you are new to AutoRAG, get started with the [Get started AutoRAG guide](/autorag/get-started/).
If you are new to AutoRAG, follow the [AutoRAG Get started guide](/ai-search/get-started/).
@@ -2,11 +2,11 @@
title: View custom metadata in responses and guide AI-search with context in AutoRAG
description: You can now view custom metadata in AutoRAG search responses and use a context field to provide additional guidance to AI-generated answers.
products:
- autorag
- ai-search
date: 2025-06-19
---

In [AutoRAG](/autorag/), you can now view your object's custom metadata in the response from [`/search`](/autorag/usage/workers-binding/) and [`/ai-search`](/autorag/usage/workers-binding/), and optionally add a `context` field in the custom metadata of an object to provide additional guidance for AI-generated answers.
In [AutoRAG](/ai-search/), you can now view your object's custom metadata in the response from [`/search`](/ai-search/usage/workers-binding/) and [`/ai-search`](/ai-search/usage/workers-binding/), and optionally add a `context` field in the custom metadata of an object to provide additional guidance for AI-generated answers.

You can add [custom metadata](/r2/api/workers/workers-api-reference/#r2putoptions) to an object when uploading it to your R2 bucket.

@@ -46,4 +46,4 @@ For example:

This gives you more control over how your content is interpreted, without requiring you to modify the original contents of the file.

Learn more in AutoRAG's [metadata filtering documentation](/autorag/configuration/metadata).
Learn more in AutoRAG's [metadata filtering documentation](/ai-search/configuration/metadata).
@@ -2,11 +2,11 @@
title: Filter your AutoRAG search by file name
description: You can now filter AutoRAG search queries by file name, allowing you to control which files can be retrieved for a given query.
products:
- autorag
- ai-search
date: 2025-06-19
---

In [AutoRAG](/autorag/), you can now [filter](/autorag/configuration/metadata/) by an object's file name using the `filename` attribute, giving you more control over which files are searched for a given query.
In [AutoRAG](/ai-search/), you can now [filter](/ai-search/configuration/metadata/) by an object's file name using the `filename` attribute, giving you more control over which files are searched for a given query.

This is useful when your application has already determined which files should be searched. For example, you might query a PostgreSQL database to get a list of files a user has access to based on their permissions, and then use that list to limit what AutoRAG retrieves.

@@ -25,4 +25,4 @@ const response = await env.AI.autorag("my-autorag").search({

This allows you to connect your application logic with AutoRAG's retrieval process, making it easy to control what gets searched without needing to reindex or modify your data.
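
A sketch of that pattern, assuming compound `or` filters over `eq` matches as described in the metadata documentation; the permission lookup `getAllowedFiles` is a hypothetical helper standing in for your own database query:

```ts
// Hypothetical: fetch the file names this user is allowed to see,
// for example from a PostgreSQL permissions table.
const allowedFiles: string[] = await getAllowedFiles(userId);

const response = await env.AI.autorag("my-autorag").search({
  query: "What is our leave policy?",
  filters: {
    type: "or",
    filters: allowedFiles.map((filename) => ({
      type: "eq",
      key: "filename",
      value: filename,
    })),
  },
});
```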

Learn more in AutoRAG's [metadata filtering documentation](/autorag/configuration/metadata/).
Learn more in AutoRAG's [metadata filtering documentation](/ai-search/configuration/metadata/).
@@ -2,21 +2,21 @@
title: Faster indexing and new Jobs view in AutoRAG
description: Track your indexing pipeline in real time with 3–5× faster indexing and a new Jobs dashboard.
products:
- autorag
- ai-search
date: 2025-07-08
---

You can now expect **3–5× faster indexing** in AutoRAG, and with it, a brand new **Jobs view** to help you monitor indexing progress.

With each AutoRAG, indexing jobs are automatically triggered to sync your data source (i.e. R2 bucket) with your Vectorize index, ensuring new or updated files are reflected in your query results. You can also trigger jobs manually via the [Sync API](/api/resources/autorag/subresources/rags/) or by clicking “Sync index” in the dashboard.
With each AutoRAG, indexing jobs are automatically triggered to sync your data source (i.e. R2 bucket) with your Vectorize index, ensuring new or updated files are reflected in your query results. You can also trigger jobs manually via the [Sync API](/api/resources/ai-search/subresources/rags/) or by clicking “Sync index” in the dashboard.
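
As a sketch, here is a manual sync call over the REST API from a Worker or script — the `PATCH` method and the path are assumptions based on the Sync API reference linked above, so verify them there before relying on this:

```ts
const accountId = "<ACCOUNT_ID>"; // placeholder
const ragName = "<AUTORAG_NAME>"; // placeholder
const apiToken = "<API_TOKEN>"; // placeholder; token needs AutoRAG permissions

// Trigger an indexing job for this AutoRAG instance (assumed endpoint).
const res = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${accountId}/autorag/rags/${ragName}/sync`,
  {
    method: "PATCH",
    headers: { Authorization: `Bearer ${apiToken}` },
  },
);
console.log(await res.json());
```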

With the new jobs observability, you can now:

- View the status, job ID, source, start time, duration, and last sync time for each indexing job
- Inspect real-time logs of job events (e.g. `Starting indexing data source...`)
- See a history of past indexing jobs under the Jobs tab of your AutoRAG

![AutoRAG jobs](~/assets/images/changelog/autorag/autorag-jobs-view.gif)
![AutoRAG jobs](~/assets/images/changelog/ai-search/autorag-jobs-view.gif)

This makes it easier to understand what’s happening behind the scenes.

@@ -2,18 +2,18 @@
title: New Metrics View in AutoRAG
description: Track file indexing, search activity, and top retrievals to understand how your AutoRAG instance is being used.
products:
- autorag
- ai-search
date: 2025-09-19
---

[AutoRAG](/autorag/) now includes a **Metrics** tab that shows how your data is indexed and searched. Get a clear view of the health of your indexing pipeline, compare usage between `ai-search` and `search`, and see which files are retrieved most often.
[AutoRAG](/ai-search/) now includes a **Metrics** tab that shows how your data is indexed and searched. Get a clear view of the health of your indexing pipeline, compare usage between `ai-search` and `search`, and see which files are retrieved most often.

![Metrics](~/assets/images/autorag/metrics.png)
![Metrics](~/assets/images/ai-search/metrics.png)

You can find these metrics within each AutoRAG instance:

- Indexing: Track how files are ingested and see status changes over time.
- Search breakdown: Compare usage between `ai-search` and `search` endpoints.
- Top file retrievals: Identify which files are most frequently retrieved in a given period.

Try it today in [AutoRAG](/autorag/get-started/).
Try it today in [AutoRAG](/ai-search/get-started/).
10 changes: 5 additions & 5 deletions src/content/changelog/workers-ai/2025-09-05-embeddinggemma.mdx
@@ -8,13 +8,13 @@ We're excited to be a launch partner alongside [Google](https://developers.googl

[`@cf/google/embeddinggemma-300m`](/workers-ai/models/embeddinggemma-300m/) is a 300M parameter embedding model from Google, built from Gemma 3 and the same research used to create Gemini models. This multilingual model supports 100+ languages, making it ideal for RAG systems, semantic search, content classification, and clustering tasks.

**Using EmbeddingGemma in AutoRAG:**
Now you can leverage EmbeddingGemma directly through AutoRAG for your RAG pipelines. EmbeddingGemma's multilingual capabilities make it perfect for global applications that need to understand and retrieve content across different languages with exceptional accuracy.
**Using EmbeddingGemma in AI Search:**
Now you can leverage EmbeddingGemma directly through AI Search for your RAG pipelines. EmbeddingGemma's multilingual capabilities make it perfect for global applications that need to understand and retrieve content across different languages with exceptional accuracy.

To use EmbeddingGemma for your AutoRAG projects:
1. Go to **Create** in the [AutoRAG dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag)
To use EmbeddingGemma for your AI Search projects:
1. Go to **Create** in the [AI Search dashboard](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
2. Follow the setup flow for your new RAG instance
3. In the **Generate Index** step, open up **More embedding models** and select `@cf/google/embeddinggemma-300m` as your embedding model
4. Complete the setup to create an AutoRAG
4. Complete the setup to create an AI Search instance
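
Outside of AI Search, you can also call the model directly from a Worker. A minimal sketch, assuming the standard Workers AI text-embeddings input shape (`{ text: string[] }`):

```ts
// Direct Workers AI call; env.AI is the Workers AI binding.
const embeddings = await env.AI.run("@cf/google/embeddinggemma-300m", {
  text: ["How do I rotate my API token?"],
});
// embeddings.data[0] holds the vector for the first input string.
```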

Try it out and let us know what you think!
2 changes: 1 addition & 1 deletion src/content/dash-routes/index.json
@@ -300,7 +300,7 @@
"parent": ["AI"]
},
{
"name": "AutoRAG",
"name": "AI Search",
"deeplink": "/?to=/:account/ai/autorag",
"parent": ["AI"]
},
@@ -25,7 +25,7 @@ These MCP servers allow your MCP Client to read configurations from your account
| [Browser rendering server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/browser-rendering) | Fetch web pages, convert them to markdown and take screenshots | `https://browser.mcp.cloudflare.com/sse` |
| [Logpush server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/logpush) | Get quick summaries for Logpush job health | `https://logs.mcp.cloudflare.com/sse` |
| [AI Gateway server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/ai-gateway) | Search your logs, get details about the prompts and responses | `https://ai-gateway.mcp.cloudflare.com/sse` |
| [AutoRAG server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/autorag) | List and search documents on your AutoRAGs | `https://autorag.mcp.cloudflare.com/sse` |
| [AI Search server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/autorag) | List and search documents on your AI Search instances | `https://autorag.mcp.cloudflare.com/sse` |
| [Audit Logs server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/auditlogs) | Query audit logs and generate reports for review | `https://auditlogs.mcp.cloudflare.com/sse` |
| [DNS Analytics server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/dns-analytics) | Optimize DNS performance and debug issues based on current set up | `https://dns-analytics.mcp.cloudflare.com/sse` |
| [Digital Experience Monitoring server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/dex-analysis) | Get quick insight on critical applications for your organization | `https://dex.mcp.cloudflare.com/sse` |
@@ -1,7 +1,7 @@
---
pcx_content_type: navigation
title: REST API
external_link: /api/resources/autorag/
external_link: /api/resources/ai-search/
sidebar:
order: 9
---
44 changes: 44 additions & 0 deletions src/content/docs/ai-search/concepts/how-ai-search-works.mdx
@@ -0,0 +1,44 @@
---
pcx_content_type: concept
title: How AI Search works
sidebar:
order: 2
---

AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data, such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language from your applications or AI agents.

AI Search consists of two core processes:

- **Indexing:** An asynchronous background process that monitors your data source for changes and converts your data into vectors for search.
- **Querying:** A synchronous process triggered by user queries. It retrieves the most relevant content and generates context-aware responses.

## How indexing works

Indexing begins automatically when you create an AI Search instance and connect a data source.

Here is what happens during indexing:

1. **Data ingestion:** AI Search reads from your connected data source.
2. **Markdown conversion:** AI Search uses [Workers AI’s Markdown Conversion](/workers-ai/features/markdown-conversion/) to convert [supported data types](/ai-search/configuration/data-source/) into structured Markdown. This ensures consistency across diverse file types. For images, Workers AI is used to perform object detection followed by vision-to-language transformation to convert images into Markdown text.
3. **Chunking:** The extracted text is [chunked](/ai-search/configuration/chunking/) into smaller pieces to improve retrieval granularity.
4. **Embedding:** Each chunk is embedded using Workers AI’s embedding model to transform the content into vectors.
5. **Vector storage:** The resulting vectors, along with metadata like the file name, are stored in the [Vectorize](/vectorize/) database created on your Cloudflare account.

After the initial data set is indexed, AI Search regularly checks your data source for updates (for example, additions, updates, or deletions) and indexes the changes to keep your vector database up to date.

![Indexing](~/assets/images/ai-search/indexing.png)

## How querying works

Once indexing is complete, AI Search is ready to respond to end-user queries in real time.

Here is how the querying pipeline works:

1. **Receive query from AI Search API:** The query workflow begins when you send a request to either the [AI Search](/ai-search/usage/rest-api/#ai-search) or [Search](/ai-search/usage/rest-api/#search) endpoint.
2. **Query rewriting (optional):** AI Search provides the option to [rewrite the input query](/ai-search/configuration/query-rewriting/) using one of Workers AI’s LLMs to improve retrieval quality by transforming the original query into a more effective search query.
3. **Embedding the query:** The rewritten (or original) query is transformed into a vector via the same embedding model used to embed your data so that it can be compared against your vectorized data to find the most relevant matches.
4. **Querying Vectorize index:** The query vector is [queried](/vectorize/best-practices/query-vectors/) against the stored vectors in the Vectorize database associated with your AI Search.
5. **Content retrieval:** Vectorize returns the metadata of the most relevant chunks, and the original content is retrieved from the R2 bucket. If you are using the Search endpoint, the content is returned at this point.
6. **Response generation:** If you are using the AI Search endpoint, then a text-generation model from Workers AI is used to generate a response using the retrieved content and the original user’s query, combined via a [system prompt](/ai-search/configuration/system-prompt/). The context-aware response from the model is returned.
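
Putting the two endpoints side by side, a minimal sketch via the Workers binding — the instance name and queries are placeholders:

```ts
// Search: retrieval only, returns the matching chunks and their metadata.
const results = await env.AI.autorag("my-ai-search").search({
  query: "Which storage classes does R2 offer?",
});

// AI Search: retrieval plus generation, returns a model-written answer
// grounded in the retrieved content.
const answer = await env.AI.autorag("my-ai-search").aiSearch({
  query: "Which storage classes does R2 offer?",
});
```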

![Querying](~/assets/images/ai-search/querying.png)