model support and deprecation strategy

aninibread · aninibread · commit d33859686570 · 2025-09-21T17:29:36.000-04:00
diff --git a/src/content/docs/ai-search/concepts/how-ai-search-works.mdx b/src/content/docs/ai-search/concepts/how-ai-search-works.mdx
@@ -5,7 +5,7 @@ sidebar:
   order: 2
 ---
 
-AI Search sets up and manages your RAG pipeline for you. It connects the tools needed for indexing, retrieval, and generation, and keeps everything up to date by syncing with your data with the index regularly. Once set up, AI Search indexes your content in the background and responds to queries in real time.
+AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data, such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents.
 
 AI Search consists of two core processes:
 
diff --git a/src/content/docs/ai-search/configuration/index.mdx b/src/content/docs/ai-search/configuration/index.mdx
@@ -7,7 +7,7 @@ sidebar:
 
 import { MetaInfo, Type } from "~/components";
 
-When creating an AI Search instance, you can customize how your RAG pipeline ingests, processes, and responds to data using a set of configuration options. Some settings can be updated after the instance is created, while others are fixed at creation time.
+You can customize how your AI Search instance indexes your data, and retrieves and generates responses for queries. Some settings can be updated after the instance is created, while others are fixed at creation time.
 
 The table below lists all available configuration options:
 
diff --git a/src/content/docs/ai-search/configuration/models.mdx b/src/content/docs/ai-search/configuration/models.mdx
diff --git a/src/content/docs/ai-search/configuration/models/index.mdx b/src/content/docs/ai-search/configuration/models/index.mdx
@@ -0,0 +1,63 @@
+---
+title: Models
+pcx_content_type: how-to
+sidebar:
+  order: 2
+---
+
+AI Search uses models at multiple stages. You can configure which models are used, or let AI Search automatically select a smart default for you.
+
+## Model usage
+
+AI Search leverages Workers AI models in the following stages:
+
+- Image to markdown conversion (if images are in data source): Converts image content to Markdown using object detection and captioning models.
+- Embedding: Transforms your documents and queries into vector representations for semantic search.
+- Query rewriting (optional): Reformulates the user’s query to improve retrieval accuracy.
+- Generation: Produces the final response from retrieved context.
+
+## Model providers
+
+All AI Search instances support models from [Workers AI](/workers-ai). You can use other providers (such as OpenAI or Anthropic) in AI Search by adding their API keys to an [AI Gateway](/ai-gateway) and connecting that gateway to your AI Search.
+
+To use AI Search with other model providers:
+
+1. Add provider keys to AI Gateway:
+   - Go to **AI > AI Gateway** in the dashboard.
+   - Select or create an AI gateway.
+   - In **Provider Keys**, choose your provider, click **Add**, and enter the key.
+2. Connect the gateway to AI Search:
+   - When creating a new AI Search, select the AI Gateway with your provider keys.
+   - For an existing AI Search, go to **Settings** and switch to a gateway that has your keys under **Resources**.
+3. Select models:
+   - Embedding model: Can only be selected when creating a new AI Search.
+   - Generation model: Can be selected when creating a new AI Search and can be changed at any time in **Settings**.
+
+AI Search supports a subset of models that have been selected to provide the best experience. See the list of [supported models](/ai-search/configuration/models/supported-models/).
+
+### Smart default
+
+If you choose **Smart Default** in your model selection, AI Search selects a Cloudflare-recommended model and updates it automatically over time. You can switch to explicit model configuration at any time by visiting **Settings**.
+
+### Per-request generation model override
+
+While the generation model can be set globally at the AI Search instance level, you can also override it on a per-request basis in the [AI Search API](/ai-search/usage/rest-api/#ai-search). This is useful if your [RAG application](/ai-search/) requires dynamic selection of generation models based on context or user preferences.
+
+## Model deprecation
+AI Search may deprecate support for a given model in order to provide support for better-performing models with improved capabilities. When a model is being deprecated, we announce the change and provide an end-of-life date after which the model will no longer be accessible. Applications that depend on AI Search may therefore require occasional updates to continue working reliably.
+
+### Model lifecycle
+AI Search models follow a defined lifecycle to ensure stability and predictable deprecation:
+
+1. **Production:** The model is actively supported and recommended for use. It is included in Smart Defaults and receives ongoing updates and maintenance.
+2. **Announcement & Transition:** The model remains available but has been marked for deprecation. An end-of-life date is communicated through documentation, release notes, and other official channels. During this phase, users are encouraged to migrate to the recommended replacement model.
+3. **Automatic Upgrade (if applicable):** For some models, AI Search may automatically upgrade requests to a recommended replacement.
+4. **End of life:** The model is no longer available. Any requests to the retired model return a clear error message, and the model is removed from documentation and Smart Defaults.
+
+See models and their lifecycle status in [supported models](/ai-search/configuration/models/supported-models/).
+
+### Best practices
+
+- Regularly check the [release notes](/ai-search/platform/release-note/) for updates.
+- Plan migration efforts according to the communicated end-of-life date.
+- Migrate and test the recommended replacement models before the end-of-life date.
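
A minimal sketch of the per-request generation model override described in `models/index.mdx` above. It builds the request without sending it. The endpoint path (which still uses `autorag`, in line with the naming note added to the REST API page), the `query` and `model` body fields, and the helper name `buildAiSearchRequest` are assumptions for illustration; check the REST API reference before relying on them.

```ts
// Hypothetical helper: builds (but does not send) an AI Search request that
// can override the instance-level generation model for a single query.
interface AiSearchRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildAiSearchRequest(
  accountId: string,
  instanceName: string,
  apiToken: string,
  query: string,
  modelOverride?: string,
): AiSearchRequest {
  // Assumed endpoint shape; the path segment is still `autorag` for now.
  const url =
    "https://api.cloudflare.com/client/v4/accounts/" +
    `${encodeURIComponent(accountId)}/autorag/rags/` +
    `${encodeURIComponent(instanceName)}/ai-search`;

  // Include `model` only when an override is requested, so that the
  // generation model configured in Settings applies otherwise.
  const body: { query: string; model?: string } = { query };
  if (modelOverride !== undefined) {
    body.model = modelOverride;
  }

  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiToken}`,
      },
      body: JSON.stringify(body),
    },
  };
}
```

Passing the result to `fetch(request.url, request.init)` would send the query. Omitting `modelOverride` leaves the model choice to the instance settings.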
diff --git a/src/content/docs/ai-search/configuration/models/supported-models.mdx b/src/content/docs/ai-search/configuration/models/supported-models.mdx
@@ -0,0 +1,59 @@
+---
+title: Supported models
+pcx_content_type: how-to
+sidebar:
+  order: 2
+---
+
+This page lists all models supported by AI Search and their lifecycle status.
+
+:::note[Request model support]
+If you would like to use a model that is not currently supported, reach out to us on [Discord](https://discord.gg/cloudflaredev) to request it.
+:::
+
+
+## Production models
+Production models are the actively supported and recommended models. They are stable and fully available.
+
+### Text generation
+| Provider | Alias | Context window (tokens) |
+|---|---|---|
+| **Anthropic** | anthropic/claude-3-7-sonnet | 200,000 |
+|  | anthropic/claude-sonnet-4 | 200,000 |
+|  | anthropic/claude-opus-4 | 200,000 |
+|  | anthropic/claude-3-5-haiku | 200,000 |
+| **Cerebras** | cerebras/qwen-3-235b-a22b-instruct | 64,000 |
+|  | cerebras/qwen-3-235b-a22b-thinking | 65,000 |
+|  | cerebras/llama-3.3-70b | 65,000 |
+|  | cerebras/llama-4-maverick-17b-128e-instruct | 8,000 |
+|  | cerebras/llama-4-scout-17b-16e-instruct | 8,000 |
+|  | cerebras/gpt-oss-120b | 64,000 |
+| **Google AI Studio** | google-ai-studio/gemini-2.5-flash | 1,048,576 |
+|  | google-ai-studio/gemini-2.5-pro | 1,048,576 |
+| **Grok (x.ai)** | grok/grok-4 | 256,000 |
+| **Groq** | groq/llama-3.3-70b-versatile | 131,072 |
+|  | groq/llama-3.1-8b-instant | 131,072 |
+| **OpenAI** | openai/gpt-5 | 400,000 |
+|  | openai/gpt-5-mini | 400,000 |
+|  | openai/gpt-5-nano | 400,000 |
+| **Workers AI** | @cf/meta/llama-3.3-70b-instruct-fp8-fast | 24,000 |
+|  | @cf/meta/llama-3.1-8b-instruct-fast | 60,000 |
+|  | @cf/meta/llama-3.1-8b-instruct-fp8 | 32,000 |
+|  | @cf/meta/llama-4-scout-17b-16e-instruct | 131,000 |
+|  | @cf/qwen/qwen3-30b-a3b-fp8 | 32,000 |
+|  | @cf/moonshotai/kimi-k2-instruct | 128,000 |
+
+### Embedding
+| Provider | Alias | Vector dims | Input tokens | Metric |
+|---|---|---|---|---|
+| **Google AI Studio** | google-ai-studio/gemini-embedding-001 | 1,536 | 512 | cosine |
+| **OpenAI** | openai/text-embedding-3-small | 1,536 | 512 | cosine |
+|  | openai/text-embedding-3-large | 1,536 | 512 | cosine |
+| **Workers AI** | @cf/baai/bge-m3 | 1,024 | 512 | cosine |
+|  | @cf/baai/bge-large-en-v1.5 | 1,024 | 512 | cosine |
+|  | @cf/google/embeddinggemma-300m | 768 | 512 | cosine |
+|  | @cf/qwen/qwen3-embedding-0.6b | 1,024 | 512 | cosine |
+
+## Transition models
+
+There are currently no models marked for end-of-life.
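
As an illustration of how the context window column in `supported-models.mdx` might be used, here is a minimal sketch that checks whether an estimated prompt size fits a model's window. The aliases and limits are copied from a few rows of the text generation table above. The helper name `fitsContextWindow` and the idea of reserving tokens for the response are assumptions, and the token estimate has to come from your own tokenizer.

```ts
// Context windows (in tokens) copied from a few rows of the text generation table.
const CONTEXT_WINDOW_TOKENS: Record<string, number> = {
  "anthropic/claude-sonnet-4": 200000,
  "openai/gpt-5": 400000,
  "@cf/meta/llama-3.3-70b-instruct-fp8-fast": 24000,
  "@cf/meta/llama-3.1-8b-instruct-fp8": 32000,
};

// Hypothetical helper: returns true when the prompt plus a reserved response
// budget fits inside the model's context window.
function fitsContextWindow(
  alias: string,
  promptTokens: number,
  reservedForResponse: number = 1000,
): boolean {
  const limit = CONTEXT_WINDOW_TOKENS[alias];
  if (limit === undefined) {
    throw new Error(`Unknown model alias: ${alias}`);
  }
  return promptTokens + reservedForResponse <= limit;
}
```

A check like this could guide a per-request model override, for example falling back to a larger-window model when the retrieved context is long.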
diff --git a/src/content/docs/ai-search/get-started.mdx b/src/content/docs/ai-search/get-started.mdx
@@ -11,48 +11,40 @@ Description: Get started creating fully-managed, retrieval-augmented generation
 
 import { DashButton } from "~/components";
 
-AI Search allows developers to create fully managed retrieval-augmented generation (RAG) pipelines to power AI applications with accurate and up-to-date information without needing to manage infrastructure.
+AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data, such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents.
 
-## 1. Upload data or use existing data in R2
+## Prerequisite
 
-AI Search integrates with R2 for data import. Create an R2 bucket if you do not have one and upload your data.
-
-:::note
-Before you create your first bucket, you must purchase R2 from the Cloudflare dashboard.
-:::
-
-To create and upload objects to your bucket from the Cloudflare dashboard:
-
-1. In the Cloudflare dashboard, go to the **R2** page.
+AI Search integrates with R2 for storing your data. You must have an **active R2 subscription** before creating your first AI Search. You can purchase the subscription on the Cloudflare R2 dashboard.
 
 <DashButton url="/?to=/:account/r2/overview" />
 
-2. Select Create bucket, name the bucket, and select **Create bucket**.
-3. Choose to either drag and drop your file into the upload area or **select from computer**. Review the [file limits](/ai-search/configuration/data-source/) when creating your knowledge base.
-
-_If you need inspiration for what document to use to make your first AI Search, try downloading and uploading the [RSS](/changelog/rss/index.xml) of the [Cloudflare Changelog](/changelog/)._
-
-## 2. Create an AI Search
+## 1. Create an AI Search
 
 To create a new AI Search:
 
-1. In the Cloudflare dashboard, go to the **AI Search* page.
+1. In the Cloudflare dashboard, go to the **AI Search** page.
 
 <DashButton url="/?to=/:account/ai/autorag" />
 
-2. Select **Create AI Search**, configure the AI Search, and complete the setup process.
-3. Select **Create**.
+2. Select **Create**.
+3. In Create a RAG, select **Get Started**.
+4. Choose how you want to connect your data:
+   - **R2 bucket**: Index the content from one of your R2 buckets.
+   - **Website**: Provide a domain from your Cloudflare account and AI Search will automatically crawl your site, store the content in R2, and index it.
+5. Configure the AI Search and complete the setup process.
+6. Select **Create**.
 
-## 3. Monitor indexing
+## 2. Monitor indexing
 
-Once created, AI Search will create a Vectorize index in your account and begin indexing the data.
+After setup, AI Search creates a Vectorize index in your account and begins indexing the data.
 
-To monitor the indexing progress:
+To monitor progress:
 
 1. From the **AI Search** page in the dashboard, locate and select your AI Search.
 2. Navigate to the **Overview** page to view the current indexing status.
 
-## 4. Try it out
+## 3. Try it out
 
 Once indexing is complete, you can run your first query:
 
@@ -61,9 +53,11 @@ Once indexing is complete, you can run your first query:
 3. Select **Search with AI** or **Search**.
 4. Enter a **query** to test out its response.
 
-## 5. Add to your application
+## 4. Add to your application
+
+Once you are ready, go to **Connect** for instructions on how to connect AI Search to your application. 
 
-There are multiple ways you can create [RAG applications](/ai-search/) with Cloudflare AI Search:
+There are multiple ways you can connect:
 
 - [Workers Binding](/ai-search/usage/workers-binding/)
-- [REST API](/ai-search/usage/rest-api/)
+- [REST API](/ai-search/usage/rest-api/)
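
For the Workers Binding option listed in `get-started.mdx`, a minimal sketch of a query helper could look like the following. The `AI` binding name and the `aiSearch()` method appear in the Workers binding page touched by this commit; the `autorag()` accessor, the `response` field, the instance name `my-autorag`, and the structural types are assumptions for illustration, not the official type definitions.

```ts
// Minimal structural types covering only what this sketch uses (assumed shapes).
interface AiSearchResult {
  response: string;
}

interface AiSearchInstance {
  aiSearch(params: { query: string }): Promise<AiSearchResult>;
}

interface Env {
  AI: { autorag(name: string): AiSearchInstance };
}

// Hypothetical helper: runs a query against the instance named `my-autorag`
// and returns only the generated answer text.
async function answerQuery(env: Env, query: string): Promise<string> {
  const result = await env.AI.autorag("my-autorag").aiSearch({ query });
  return result.response;
}
```

Inside a Worker, `answerQuery(env, query)` could be called from the `fetch` handler and its result returned in a `Response`.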
diff --git a/src/content/docs/ai-search/how-to/bring-your-own-generation-model.mdx b/src/content/docs/ai-search/how-to/bring-your-own-generation-model.mdx
@@ -19,10 +19,15 @@ import {
 	TypeScriptExample,
 } from "~/components";
 
-When using `AI Search`, AI Search leverages a Workers AI model to generate the response. If you want to use a model outside of Workers AI, you can use AI Search for search while leveraging a model outside of Workers AI to generate responses.
+When using `AI Search`, AI Search leverages a Workers AI model to generate the response. If you want to use a model outside of Workers AI, you can use AI Search for `search` while leveraging a model outside of Workers AI to generate responses.
 
 Here is an example of how you can use an OpenAI model to generate your responses. This example uses [Workers Binding](/ai-search/usage/workers-binding/), but can be easily adapted to use the [REST API](/ai-search/usage/rest-api/) instead.
 
+:::note
+AI Search now supports [bringing your own models natively](/ai-search/configuration/models/). You can attach provider keys through AI Gateway and select third-party models directly in your AI Search settings. The example below still works, but the recommended way is to configure your external model through AI Gateway.  
+:::
+
+
 <TypeScriptExample>
 
 ```ts
diff --git a/src/content/docs/ai-search/index.mdx b/src/content/docs/ai-search/index.mdx
@@ -23,17 +23,14 @@ import {
 } from "~/components";
 
 <Description>
-	Create fully-managed RAG applications that continuously update and scale on Cloudflare.
+	Create AI-powered search for your data
 </Description>
 
 <Plan type="all" />
 
-AI Search lets you create retrieval-augmented generation (RAG) pipelines that power your AI applications with accurate and up-to-date information. Create RAG applications that integrate context-aware AI without managing infrastructure.
+AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data, such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents. It natively integrates with Cloudflare’s developer platform tools like Vectorize, AI Gateway, R2, and Workers AI, while also supporting third-party providers and open standards.
 
-You can use AI Search to build:
-
-- **Product Chatbot:** Answer customer questions using your own product content.
-- **Docs Search:** Make documentation easy to search and use.
+It supports retrieval-augmented generation (RAG) patterns, enabling you to build enterprise search, natural language search, and AI-powered chat without managing infrastructure.
 
 <div>
 	<LinkButton href="/ai-search/get-started">Get started</LinkButton>
diff --git a/src/content/docs/ai-search/tutorial/brower-rendering-autorag-tutorial.mdx b/src/content/docs/ai-search/tutorial/brower-rendering-autorag-tutorial.mdx
@@ -156,10 +156,6 @@ Now that you have created your R2 bucket and filled it with your content that yo
 
 Once you’ve created your AI Search, it will automatically create a Vectorize database in your account and begin indexing the data.
 
-You can view the progress of your indexing job in the Overview page of your AI Search.
-
-![AI Search Overview page](~/assets/images/ai-search/tutorial-indexing-page.png)
-
 ## Step 3. Test and add to your application
 
 Once AI Search finishes indexing your content, you’re ready to start asking it questions. You can open up your AI Search instance, navigate to the Playground tab, and ask a question based on your uploaded content, like “What is AI Search?”.
diff --git a/src/content/docs/ai-search/usage/rest-api.mdx b/src/content/docs/ai-search/usage/rest-api.mdx
@@ -19,6 +19,10 @@ import {
 
 This guide will instruct you through how to use the AI Search REST API to make a query to your AI Search.
 
+:::note[AI Search is the new name for AutoRAG]
+API endpoints may still reference `autorag` for the time being. Functionality remains the same, and support for the new naming will be introduced gradually.
+:::
+
 ## Prerequisite: Get AI Search API token
 
 You need an API token with the `AI Search - Read` and `AI Search Edit` permissions to use the REST API. To create a new token:
diff --git a/src/content/docs/ai-search/usage/workers-binding.mdx b/src/content/docs/ai-search/usage/workers-binding.mdx
@@ -29,6 +29,10 @@ binding = "AI" # i.e. available in your Worker on env.AI
 
 </WranglerConfig>
 
+:::note[AI Search is the new name for AutoRAG]
+API endpoints may still reference `autorag` for the time being. Functionality remains the same, and support for the new naming will be introduced gradually.
+:::
+
 ## `aiSearch()`
 
 This method searches for relevant results from your data source and generates a response using your default model and the retrieved context, for an AI Search named `my-autorag`: