Commit d338596

committed
model support and deprecation strategy
1 parent 346bfbc commit d338596

File tree

11 files changed

+162
-79
lines changed

src/content/docs/ai-search/concepts/how-autorag-works.mdx renamed to src/content/docs/ai-search/concepts/how-ai-search-works.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ sidebar:
55
order: 2
66
---
77

8-
AI Search sets up and manages your RAG pipeline for you. It connects the tools needed for indexing, retrieval, and generation, and keeps everything up to date by syncing with your data with the index regularly. Once set up, AI Search indexes your content in the background and responds to queries in real time.
8+
AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents.
99

1010
AI Search consists of two core processes:
1111

src/content/docs/ai-search/configuration/index.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ sidebar:
77

88
import { MetaInfo, Type } from "~/components";
99

10-
When creating an AI Search instance, you can customize how your RAG pipeline ingests, processes, and responds to data using a set of configuration options. Some settings can be updated after the instance is created, while others are fixed at creation time.
10+
You can customize how your AI Search instance indexes your data, and retrieves and generates responses for queries. Some settings can be updated after the instance is created, while others are fixed at creation time.
1111

1212
The table below lists all available configuration options:
1313

src/content/docs/ai-search/configuration/models.mdx

Lines changed: 0 additions & 39 deletions
This file was deleted.
Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
---
2+
title: Models
3+
pcx_content_type: how-to
4+
sidebar:
5+
order: 2
6+
---
7+
8+
AI Search uses models at multiple stages. You can configure which models are used, or let AI Search automatically select a smart default for you.
9+
10+
## Models usage
11+
12+
AI Search leverages Workers AI models in the following stages:
13+
14+
- Image to markdown conversion (if images are in data source): Converts image content to Markdown using object detection and captioning models.
15+
- Embedding: Transforms your documents and queries into vector representations for semantic search.
16+
- Query rewriting (optional): Reformulates the user’s query to improve retrieval accuracy.
17+
- Generation: Produces the final response from retrieved context.
18+
19+
## Model providers
20+
21+
All AI Search instances support models from [Workers AI](/workers-ai). You can use other providers (such as OpenAI or Anthropic) in AI Search by adding their API keys to an [AI Gateway](/ai-gateway) and connecting that gateway to your AI Search.
22+
23+
To use AI Search with other model providers:
24+
25+
1. Add provider keys to AI Gateway
26+
- Go to **AI > AI Gateway** in the dashboard.
27+
- Select or create an AI gateway.
28+
- In **Provider Keys**, choose your provider, click **Add**, and enter the key.
29+
2. Connect the gateway to AI Search
30+
- When creating a new AI Search, select the AI Gateway with your provider keys.
31+
- For an existing AI Search, go to **Settings** and switch to a gateway that has your keys under **Resources**.
32+
3. Select models
33+
- Embedding model: Only available to be changed when creating a new AI Search.
34+
- Generation model: Can be selected when creating a new AI Search and can be changed at any time in **Settings**.
35+
36+
AI Search supports a subset of models that have been selected to provide the best experience. See the list of [supported models](/ai-search/configuration/models/supported-models/).
37+
38+
### Smart default
39+
40+
If you choose **Smart Default** in your model selection, AI Search selects a Cloudflare-recommended model and updates it automatically over time. You can switch to explicit model configuration at any time by visiting **Settings**.
41+
42+
### Per-request generation model override
43+
44+
While the generation model can be set globally at the AI Search instance level, you can also override it on a per-request basis in the [AI Search API](/ai-search/usage/rest-api/#ai-search). This is useful if your [RAG application](/ai-search/) requires dynamic selection of generation models based on context or user preferences.
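
As a minimal sketch of the per-request override described above, the helper below builds a request body that includes a generation model only when one is chosen for the caller. The field names `query` and `model`, the tier names, and the tier-to-model mapping are assumptions made for illustration; check the AI Search API reference for the exact request shape.

```ts
// Hypothetical request shape: `query` and `model` are assumed field names.
interface AiSearchRequestBody {
  query: string;
  // When omitted, the instance-level generation model applies.
  model?: string;
}

// Hypothetical mapping from an application-defined user tier to a model alias.
const MODEL_BY_TIER: Record<string, string> = {
  premium: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  fast: "@cf/meta/llama-3.1-8b-instruct-fast",
};

// Build the body for one request. Unknown or missing tiers fall back to the
// model configured on the AI Search instance by leaving `model` out.
function buildAiSearchBody(query: string, tier?: string): AiSearchRequestBody {
  const model = tier !== undefined ? MODEL_BY_TIER[tier] : undefined;
  return model !== undefined ? { query, model } : { query };
}
```

Leaving `model` out entirely, rather than sending an empty value, keeps the instance-level setting (including Smart Default) in effect for requests that do not need an override.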
45+
46+
## Model deprecation
47+
AI Search may deprecate support for a given model in order to provide support for better-performing models with improved capabilities. When a model is being deprecated, we announce the change and provide an end-of-life date after which the model will no longer be accessible. Applications that depend on AI Search may therefore require occasional updates to continue working reliably.
48+
49+
### Model lifecycle
50+
AI Search models follow a defined lifecycle to ensure stability and predictable deprecation:
51+
52+
1. **Production:** The model is actively supported and recommended for use. It is included in Smart Defaults and receives ongoing updates and maintenance.
53+
2. **Announcement & Transition:** The model remains available but has been marked for deprecation. An end-of-life date is communicated through documentation, release notes, and other official channels. During this phase, users are encouraged to migrate to the recommended replacement model.
54+
3. **Automatic Upgrade (if applicable):** For some models, AI Search may automatically upgrade requests to a recommended replacement.
55+
4. **End of life:** The model is no longer available. Any requests to the retired model return a clear error message, and the model is removed from documentation and Smart Defaults.
56+
57+
See models and their lifecycle status in [supported models](/ai-search/configuration/models/supported-models/).
58+
59+
### Best practices
60+
61+
- Regularly check the [release notes](/ai-search/platform/release-note/) for updates.
62+
- Plan migration efforts according to the communicated end-of-life date.
63+
- Migrate and test the recommended replacement models before the end-of-life date.
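
The lifecycle stages and best practices above could be turned into a small pre-deployment check. The sketch below is illustrative only: the `ModelEntry` shape, the status names, and the model aliases are hypothetical and do not come from an AI Search API; the entries would be maintained by hand from the supported models page and release notes.

```ts
// Hypothetical lifecycle record, maintained by the application owner.
type LifecycleStatus = "production" | "transition" | "end-of-life";

interface ModelEntry {
  alias: string;
  status: LifecycleStatus;
  endOfLife?: string; // ISO 8601 date, set once a deprecation is announced
  replacement?: string; // recommended replacement alias, if one is named
}

// Whole days remaining until the announced end-of-life date, or null if
// no date has been announced for this model.
function daysUntilEndOfLife(entry: ModelEntry, today: Date): number | null {
  if (entry.endOfLife === undefined) return null;
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.ceil((Date.parse(entry.endOfLife) - today.getTime()) / msPerDay);
}

// Summarize what action, if any, the configured model requires.
function migrationAdvice(entry: ModelEntry, today: Date): string {
  if (entry.status === "production") return "ok";
  const target = entry.replacement ?? "a supported model";
  const days = daysUntilEndOfLife(entry, today);
  if (entry.status === "end-of-life" || (days !== null && days <= 0)) {
    return `retired: switch to ${target}`;
  }
  if (days === null) return `migrate to ${target}`;
  return `migrate within ${days} days to ${target}`;
}
```

Running a check like this in CI makes an announced end-of-life date visible well before requests to the retired model start returning errors.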
Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
---
2+
title: Supported models
3+
pcx_content_type: how-to
4+
sidebar:
5+
order: 2
6+
---
7+
8+
This page lists all models supported by AI Search and their lifecycle status.
9+
10+
:::note[Request model support]
11+
If you would like to use a model that is not currently supported, reach out to us on [Discord](https://discord.gg/cloudflaredev) to request it.
12+
:::
13+
14+
15+
## Production models
16+
Production models are actively supported, recommended models that are stable and fully available.
17+
18+
### Text generation
19+
| Provider | Alias | Context window (tokens) |
20+
|---|---|---|
21+
| **Anthropic** | anthropic/claude-3-7-sonnet | 200,000 |
22+
| | anthropic/claude-sonnet-4 | 200,000 |
23+
| | anthropic/claude-opus-4 | 200,000 |
24+
| | anthropic/claude-3-5-haiku | 200,000 |
25+
| **Cerebras** | cerebras/qwen-3-235b-a22b-instruct | 64,000 |
26+
| | cerebras/qwen-3-235b-a22b-thinking | 65,000 |
27+
| | cerebras/llama-3.3-70b | 65,000 |
28+
| | cerebras/llama-4-maverick-17b-128e-instruct | 8,000 |
29+
| | cerebras/llama-4-scout-17b-16e-instruct | 8,000 |
30+
| | cerebras/gpt-oss-120b | 64,000 |
31+
| **Google AI Studio** | google-ai-studio/gemini-2.5-flash | 1,048,576 |
32+
| | google-ai-studio/gemini-2.5-pro | 1,048,576 |
33+
| **Grok (x.ai)** | grok/grok-4 | 256,000 |
34+
| **Groq** | groq/llama-3.3-70b-versatile | 131,072 |
35+
| | groq/llama-3.1-8b-instant | 131,072 |
36+
| **OpenAI** | openai/gpt-5 | 400,000 |
37+
| | openai/gpt-5-mini | 400,000 |
38+
| | openai/gpt-5-nano | 400,000 |
39+
| **Workers AI** | @cf/meta/llama-3.3-70b-instruct-fp8-fast | 24,000 |
40+
| | @cf/meta/llama-3.1-8b-instruct-fast | 60,000 |
41+
| | @cf/meta/llama-3.1-8b-instruct-fp8 | 32,000 |
42+
| | @cf/meta/llama-4-scout-17b-16e-instruct | 131,000 |
43+
| | @cf/qwen/qwen3-30b-a3b-fp8 | 32,000 |
44+
| | @cf/moonshotai/kimi-k2-instruct | 128,000 |
45+
46+
### Embedding
47+
| Provider | Alias | Vector dims | Input tokens | Metric |
48+
|---|---|---|---|---|
49+
| **Google AI Studio** | google-ai-studio/gemini-embedding-001 | 1,536 | 512 | cosine |
50+
| **OpenAI** | openai/text-embedding-3-small | 1,536 | 512 | cosine |
51+
| | openai/text-embedding-3-large | 1,536 | 512 | cosine |
52+
| **Workers AI** | @cf/baai/bge-m3 | 1,024 | 512 | cosine |
53+
| | @cf/baai/bge-large-en-v1.5 | 1,024 | 512 | cosine |
54+
| | @cf/google/embeddinggemma-300m | 768 | 512 | cosine |
55+
| | @cf/qwen/qwen3-embedding-0.6b | 1,024 | 512 | cosine |
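
Every embedding model in the table above uses the cosine metric. For readers unfamiliar with it, here is a minimal, generic sketch of cosine similarity between two vectors of equal dimension. It illustrates the metric itself and is not how AI Search or Vectorize computes it internally.

```ts
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns a value in [-1, 1]; 1 means the vectors point in the same direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0 in this sketch.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

The dimension check reflects why the embedding model is fixed at creation time: vectors produced by models with different dimensions (for example 768 and 1,024) cannot be compared with each other.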
56+
57+
## Transition models
58+
59+
There are currently no models marked for end-of-life.

src/content/docs/ai-search/get-started.mdx

Lines changed: 21 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -11,48 +11,40 @@ Description: Get started creating fully-managed, retrieval-augmented generation
1111

1212
import { DashButton } from "~/components";
1313

14-
AI Search allows developers to create fully managed retrieval-augmented generation (RAG) pipelines to power AI applications with accurate and up-to-date information without needing to manage infrastructure.
14+
AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents.
1515

16-
## 1. Upload data or use existing data in R2
16+
## Prerequisite
1717

18-
AI Search integrates with R2 for data import. Create an R2 bucket if you do not have one and upload your data.
19-
20-
:::note
21-
Before you create your first bucket, you must purchase R2 from the Cloudflare dashboard.
22-
:::
23-
24-
To create and upload objects to your bucket from the Cloudflare dashboard:
25-
26-
1. In the Cloudflare dashboard, go to the **R2** page.
18+
AI Search integrates with R2 for storing your data. You must have an **active R2 subscription** before creating your first AI Search. You can purchase the subscription on the Cloudflare R2 dashboard.
2719

2820
<DashButton url="/?to=/:account/r2/overview" />
2921

30-
2. Select Create bucket, name the bucket, and select **Create bucket**.
31-
3. Choose to either drag and drop your file into the upload area or **select from computer**. Review the [file limits](/ai-search/configuration/data-source/) when creating your knowledge base.
32-
33-
_If you need inspiration for what document to use to make your first AI Search, try downloading and uploading the [RSS](/changelog/rss/index.xml) of the [Cloudflare Changelog](/changelog/)._
34-
35-
## 2. Create an AI Search
22+
## 1. Create an AI Search
3623

3724
To create a new AI Search:
3825

39-
1. In the Cloudflare dashboard, go to the **AI Search* page.
26+
1. In the Cloudflare dashboard, go to the **AI Search** page.
4027

4128
<DashButton url="/?to=/:account/ai/autorag" />
4229

43-
2. Select **Create AI Search**, configure the AI Search, and complete the setup process.
44-
3. Select **Create**.
30+
2. Select **Create**.
31+
3. In Create a RAG, select **Get Started**.
32+
4. Choose how you want to connect your data:
33+
- **R2 bucket**: Index the content from one of your R2 buckets.
34+
- **Website**: Provide a domain from your Cloudflare account and AI Search will automatically crawl your site, store the content in R2, and index it.
35+
5. Configure the AI Search and complete the setup process.
36+
6. Select **Create**.
4537

46-
## 3. Monitor indexing
38+
## 2. Monitor indexing
4739

48-
Once created, AI Search will create a Vectorize index in your account and begin indexing the data.
40+
After setup, AI Search creates a Vectorize index in your account and begins indexing the data.
4941

50-
To monitor the indexing progress:
42+
To monitor progress:
5143

5244
1. From the **AI Search** page in the dashboard, locate and select your AI Search.
5345
2. Navigate to the **Overview** page to view the current indexing status.
5446

55-
## 4. Try it out
47+
## 3. Try it out
5648

5749
Once indexing is complete, you can run your first query:
5850

@@ -61,9 +53,11 @@ Once indexing is complete, you can run your first query:
6153
3. Select **Search with AI** or **Search**.
6254
4. Enter a **query** to test out its response.
6355

64-
## 5. Add to your application
56+
## 4. Add to your application
57+
58+
Once you are ready, go to **Connect** for instructions on how to connect AI Search to your application.
6559

66-
There are multiple ways you can create [RAG applications](/ai-search/) with Cloudflare AI Search:
60+
There are multiple ways you can connect:
6761

6862
- [Workers Binding](/ai-search/usage/workers-binding/)
69-
- [REST API](/ai-search/usage/rest-api/)
63+
- [REST API](/ai-search/usage/rest-api/)

src/content/docs/ai-search/how-to/bring-your-own-generation-model.mdx

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,10 +19,15 @@ import {
1919
TypeScriptExample,
2020
} from "~/components";
2121

22-
When using `AI Search`, AI Search leverages a Workers AI model to generate the response. If you want to use a model outside of Workers AI, you can use AI Search for search while leveraging a model outside of Workers AI to generate responses.
22+
When using `AI Search`, AI Search leverages a Workers AI model to generate the response. If you want to use a model outside of Workers AI, you can use AI Search for `search` while leveraging a model outside of Workers AI to generate responses.
2323

2424
Here is an example of how you can use an OpenAI model to generate your responses. This example uses [Workers Binding](/ai-search/usage/workers-binding/), but can be easily adapted to use the [REST API](/ai-search/usage/rest-api/) instead.
2525

26+
:::note
27+
AI Search now supports [bringing your own models natively](/ai-search/configuration/models/). You can attach provider keys through AI Gateway and select third-party models directly in your AI Search settings. The example below still works, but the recommended way is to configure your external model through AI Gateway.
28+
:::
29+
30+
2631
<TypeScriptExample>
2732

2833
```ts

src/content/docs/ai-search/index.mdx

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -23,17 +23,14 @@ import {
2323
} from "~/components";
2424

2525
<Description>
26-
Create fully-managed RAG applications that continuously update and scale on Cloudflare.
26+
Create AI-powered search for your data
2727
</Description>
2828

2929
<Plan type="all" />
3030

31-
AI Search lets you create retrieval-augmented generation (RAG) pipelines that power your AI applications with accurate and up-to-date information. Create RAG applications that integrate context-aware AI without managing infrastructure.
31+
AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents. It natively integrates with Cloudflare’s developer platform tools like Vectorize, AI Gateway, R2, and Workers AI, while also supporting third-party providers and open standards.
3232

33-
You can use AI Search to build:
34-
35-
- **Product Chatbot:** Answer customer questions using your own product content.
36-
- **Docs Search:** Make documentation easy to search and use.
33+
It supports retrieval-augmented generation (RAG) patterns, enabling you to build enterprise search, natural language search, and AI-powered chat without managing infrastructure.
3734

3835
<div>
3936
<LinkButton href="/ai-search/get-started">Get started</LinkButton>

src/content/docs/ai-search/tutorial/brower-rendering-autorag-tutorial.mdx

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -156,10 +156,6 @@ Now that you have created your R2 bucket and filled it with your content that yo
156156

157157
Once you’ve created your AI Search, it will automatically create a Vectorize database in your account and begin indexing the data.
158158

159-
You can view the progress of your indexing job in the Overview page of your AI Search.
160-
161-
![AI Search Overview page](~/assets/images/ai-search/tutorial-indexing-page.png)
162-
163159
## Step 3. Test and add to your application
164160

165161
Once AI Search finishes indexing your content, you’re ready to start asking it questions. You can open up your AI Search instance, navigate to the Playground tab, and ask a question based on your uploaded content, like “What is AI Search?”.

src/content/docs/ai-search/usage/rest-api.mdx

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,10 @@ import {
1919

2020
This guide walks you through using the AI Search REST API to query your AI Search.
2121

22+
:::note[AI Search is the new name for AutoRAG]
23+
API endpoints may still reference `autorag` for the time being. Functionality remains the same, and support for the new naming will be introduced gradually.
24+
:::
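
To illustrate the naming note above, the sketch below builds a request URL that still contains the `autorag` path segment. The base URL and the `/autorag/rags/{name}/{mode}` path layout are assumptions made for illustration; confirm the exact endpoint in the API reference before relying on it.

```ts
// The two query modes exposed in the dashboard: generation plus retrieval
// ("ai-search") or retrieval only ("search").
type AiSearchMode = "ai-search" | "search";

// Assumed path layout; the `autorag` segment reflects the previous product name.
function aiSearchUrl(
  accountId: string,
  instanceName: string,
  mode: AiSearchMode,
): string {
  const base = "https://api.cloudflare.com/client/v4";
  const account = encodeURIComponent(accountId);
  const name = encodeURIComponent(instanceName);
  return `${base}/accounts/${account}/autorag/rags/${name}/${mode}`;
}
```

Keeping the path construction in one helper means only this function needs to change if the endpoints are later renamed.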
25+
2226
## Prerequisite: Get AI Search API token
2327

2428
You need an API token with the `AI Search - Read` and `AI Search Edit` permissions to use the REST API. To create a new token:
