Commit cee1f32

Merge branch 'main' of https://github.com/MicrosoftDocs/azure-ai-docs-pr into deploy-online-endpoints
2 parents 7f682f7 + 53958ec commit cee1f32

58 files changed

+1859
-1390
lines changed

articles/ai-foundry/concepts/model-lifecycle-retirement.md

Lines changed: 6 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -80,6 +80,12 @@ The following tables list the timelines for models that are on track for retirem
8080
| [Cohere-rerank-v3-english](https://ai.azure.com/explore/models/Cohere-rerank-v3-english/version/1/registry/azureml-cohere) | February 28, 2025 | March 31, 2025 | June 30, 2025 | [Cohere-rerank-v3.5-english](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) |
8181
| [Cohere-rerank-v3-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3-multilingual/version/1/registry/azureml-cohere) | February 28, 2025 | March 31, 2025 | June 30, 2025 | [Cohere-rerank-v3.5-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) |
8282

83+
#### DeepSeek
84+
85+
| Model | Legacy date (UTC) | Deprecation date (UTC) | Retirement date (UTC) | Suggested replacement model |
86+
|-------|-------------------|------------------------|-----------------------|-----------------------------|
87+
| [DeepSeek-V3](https://aka.ms/azureai/landing/DeepSeek-V3) | April 10, 2025 | May 31, 2025 | August 31, 2025 | [DeepSeek-V3-0324](https://aka.ms/azureai/landing/DeepSeek-V3-0324) |
88+
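The three dates in the DeepSeek-V3 row above can be turned into a simple stage lookup. The following is a minimal sketch, not part of the commit: the function name, the stage labels, and the assumption that each stage takes effect on its listed date (rather than the day after) are illustrative choices.

```python
from datetime import date

# Dates copied from the DeepSeek-V3 row above (UTC).
DEEPSEEK_V3 = {
    "legacy": date(2025, 4, 10),
    "deprecation": date(2025, 5, 31),
    "retirement": date(2025, 8, 31),
}


def lifecycle_stage(today, timeline):
    """Return the stage in effect on `today`.

    Assumption: a stage starts on its listed date. The labels
    ("active", "legacy", "deprecated", "retired") are hypothetical names.
    """
    if today >= timeline["retirement"]:
        return "retired"
    if today >= timeline["deprecation"]:
        return "deprecated"
    if today >= timeline["legacy"]:
        return "legacy"
    return "active"


stage_in_july = lifecycle_stage(date(2025, 7, 1), DEEPSEEK_V3)
```

Under these assumptions, a date between the deprecation and retirement dates maps to the "deprecated" label, which is when moving to the suggested replacement (DeepSeek-V3-0324) becomes pressing.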
8389
#### Meta
8490

8591
| Model | Legacy date (UTC) | Deprecation date (UTC) | Retirement date (UTC) | Suggested replacement model |

articles/ai-foundry/concepts/models-featured.md

Lines changed: 6 additions & 2 deletions
@@ -141,11 +141,12 @@ For more examples of how to use Jais models, see the following examples:
141141

142142
## DeepSeek
143143

144-
DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks using a step-by-step training process, such as language, scientific reasoning, and coding tasks, and DeepSeek-V3, a Mixture-of-Experts (MoE) language model.
144+
The DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks such as language, scientific reasoning, and coding by using a step-by-step training process; DeepSeek-V3-0324, a Mixture-of-Experts (MoE) language model; and more.
145145

146146
| Model | Type | Capabilities |
147147
| ------ | ---- | --- |
148-
| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
148+
| [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
149+
| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
149150
| [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | [chat-completion with reasoning content](../model-inference/how-to/use-chat-reasoning.md?context=/azure/ai-foundry/context/context) | - **Input:** text (163,840 tokens) <br /> - **Output:** text (163,840 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
150151

151152
For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../model-inference/tutorials/get-started-deepseek-r1.md?context=/azure/ai-foundry/context/context).
@@ -171,9 +172,12 @@ Meta Llama models and tools are a collection of pretrained and fine-tuned genera
171172
- Small language models (SLMs) like 1B and 3B Base and Instruct models for on-device and edge inferencing
172173
- Mid-size large language models (LLMs) like 7B, 8B, and 70B Base and Instruct models
173174
- High-performance models like Meta Llama 3.1-405B Instruct for synthetic data generation and distillation use cases.
175+
- High-performance, natively multimodal models, Llama 4 Scout and Llama 4 Maverick, which use a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.
174176

175177
| Model | Type | Capabilities |
176178
| ------ | ---- | ------------ |
179+
| [Llama-4-Scout-17B-16E-Instruct](https://aka.ms/aifoundry/landing/llama-4-scout-17b-16e-instruct) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text and image (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text |
180+
| [Llama-4-Maverick-17B-128E-Instruct-FP8](https://aka.ms/aifoundry/landing/llama-4-maverick-17b-128e-instruct-fp8) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text and image (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text |
177181
| [Llama-3.3-70B-Instruct](https://ai.azure.com/explore/models/Llama-3.3-70B-Instruct/version/4/registry/azureml-meta) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
178182
| [Llama-3.2-90B-Vision-Instruct](https://ai.azure.com/explore/models/Llama-3.2-90B-Vision-Instruct/version/1/registry/azureml-meta) | [chat-completion (with images)](../model-inference/how-to/use-chat-multi-modal.md?context=/azure/ai-foundry/context/context) | - **Input:** text and image (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
179183
| [Llama-3.2-11B-Vision-Instruct](https://ai.azure.com/explore/models/Llama-3.2-11B-Vision-Instruct/version/1/registry/azureml-meta) | [chat-completion (with images)](../model-inference/how-to/use-chat-multi-modal.md?context=/azure/ai-foundry/context/context) | - **Input:** text and image (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |

articles/ai-foundry/includes/region-availability-maas.md

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ Cohere Embed v3 - Multilingual | [Microsoft Managed Countries/Regions](/par
4141

4242
| Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning |
4343
|---------|---------|---------|---------|
44+
DeepSeek-V3-0324 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
4445
DeepSeek-V3 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
4546
DeepSeek-R1 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
4647

articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 2 additions & 1 deletion
@@ -110,7 +110,8 @@ DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks
110110
| Model | Type | Tier | Capabilities |
111111
| ------ | ---- | --- | ------------ |
112112
| [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | chat-completion <br /> [(with reasoning content)](../how-to/use-chat-reasoning.md) | Global standard | - **Input:** text (163,840 tokens) <br /> - **Output:** text (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
113-
| [DeekSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
113+
| [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
114+
| [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
114115

115116
For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../tutorials/get-started-deepseek-r1.md).
116117

articles/ai-foundry/model-inference/includes/create-model-deployments/cli.md

Lines changed: 4 additions & 6 deletions
@@ -48,15 +48,16 @@ To add a model, you first need to identify the model that you want to deploy. Yo
4848
```azurecli
4949
accountName="<ai-services-resource-name>"
5050
resourceGroupName="<resource-group>"
51+
location="eastus2"
5152
```
5253
5354
3. If you don't have an Azure AI Services account yet, you can create one as follows:
5455
5556
```azurecli
56-
az cognitiveservices account create -n $accountName -g $resourceGroupName --custom-domain $accountName
57+
az cognitiveservices account create -n $accountName -g $resourceGroupName --custom-domain $accountName --location $location --kind AIServices --sku S0
5758
```
5859
59-
4. Let's see first which models are available to you and under which SKU. The following command list all the model definitions available:
60+
4. First, check which models are available to you and under which SKU. SKUs, also known as [deployment types](../../concepts/deployment-types.md), define how Azure infrastructure is used to process requests. Models may offer different deployment types. The following command lists all the available model definitions:
6061
6162
```azurecli
6263
az cognitiveservices account list-models \
@@ -77,10 +78,7 @@ To add a model, you first need to identify the model that you want to deploy. Yo
7778
}
7879
```
7980
80-
6. Identify the model you want to deploy. You need the properties `name`, `format`, `version`, and `sku`. Capacity might also be needed depending on the type of deployment.
81-
82-
> [!TIP]
83-
> Notice that not all the models are available in all the SKUs.
81+
6. Identify the model you want to deploy. You need the properties `name`, `format`, `version`, and `sku`. The property `format` indicates the provider offering the model. Capacity might also be needed depending on the type of deployment.
8482
8583
7. Add the model deployment to the resource. The following example adds `Phi-3.5-vision-instruct`:
8684

articles/ai-foundry/model-inference/quotas-limits.md

Lines changed: 3 additions & 3 deletions
@@ -32,9 +32,9 @@ Azure uses quotas and limits to prevent budget overruns due to fraud, and to hon
3232
| -------------------- | ------------------- | ----------- |
3333
| Tokens per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../../ai-services/openai/quotas-limits.md). |
3434
| Requests per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../../ai-services/openai/quotas-limits.md). |
35-
| Tokens per minute | DeepSeek-R1 | 5,000,000 |
36-
| Requests per minute | DeepSeek-R1 | 5,000 |
37-
| Concurrent requests | DeepSeek-R1 | 300 |
35+
| Tokens per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000,000 |
36+
| Requests per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000 |
37+
| Concurrent requests | DeepSeek-R1<br />DeepSeek-V3-0324 | 300 |
3838
| Tokens per minute | Rest of models | 400,000 |
3939
| Requests per minute | Rest of models | 1,000 |
4040
| Concurrent requests | Rest of models | 300 |
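To make the limits in the table above concrete, here is a minimal Python sketch that checks a planned workload against them. The function names, the dictionary keys, and the `example-model` placeholder are hypothetical. The sketch covers only the DeepSeek and "Rest of models" rows, because the Azure OpenAI limits vary per model and SKU.

```python
# Limits copied from the table above.
HIGH_LIMIT_MODELS = {"DeepSeek-R1", "DeepSeek-V3-0324"}
HIGH_LIMITS = {
    "tokens_per_minute": 5_000_000,
    "requests_per_minute": 5_000,
    "concurrent_requests": 300,
}
DEFAULT_LIMITS = {  # the "Rest of models" rows
    "tokens_per_minute": 400_000,
    "requests_per_minute": 1_000,
    "concurrent_requests": 300,
}


def limits_for(model):
    """Return the limits that apply to a non-Azure-OpenAI model."""
    return HIGH_LIMITS if model in HIGH_LIMIT_MODELS else DEFAULT_LIMITS


def exceeded_limits(model, planned):
    """List the names of the limits that the planned workload exceeds."""
    limits = limits_for(model)
    return [name for name, cap in limits.items() if planned.get(name, 0) > cap]


# Hypothetical workload used only for illustration.
planned = {
    "tokens_per_minute": 1_000_000,
    "requests_per_minute": 2_000,
    "concurrent_requests": 100,
}
over_for_deepseek = exceeded_limits("DeepSeek-V3-0324", planned)
over_for_other = exceeded_limits("example-model", planned)
```

With this workload, the DeepSeek-V3-0324 check returns an empty list, while the placeholder model exceeds both the tokens-per-minute and requests-per-minute limits.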

articles/ai-foundry/toc.yml

Lines changed: 0 additions & 2 deletions
@@ -557,8 +557,6 @@ items:
557557
href: ai-services/content-safety-overview.md
558558
- name: Content safety for models deployed with serverless APIs
559559
href: concepts/model-catalog-content-safety.md
560-
- name: Use Azure AI Content Safety in AI Foundry portal
561-
href: /azure/ai-services/content-safety/how-to/foundry?context=/azure/ai-foundry/context/context
562560
- name: Content filtering
563561
href: concepts/content-filtering.md
564562
- name: Use blocklists

articles/ai-services/.openpublishing.redirection.ai-services.json

Lines changed: 10 additions & 0 deletions
@@ -160,6 +160,16 @@
160160
"redirect_url": "/azure/ai-services/content-safety/quickstart-custom-categories",
161161
"redirect_document_id": true
162162
},
163+
{
164+
"source_path_from_root": "/articles/ai-services/content-safety/how-to/foundry.md",
165+
"redirect_url": "/azure/ai-foundry/ai-services/content-safety-overview",
166+
"redirect_document_id": false
167+
},
168+
{
169+
"source_path_from_root": "/articles/ai-services/content-safety/studio-quickstart.md",
170+
"redirect_url": "/azure/ai-foundry/ai-services/content-safety-overview?context=/azure/ai-services/content-safety/context/context",
171+
"redirect_document_id": false
172+
},
163173
{
164174
"source_path_from_root": "/articles/ai-services/speech-service/how-to-custom-voice-create-voice.md",
165175
"redirect_url": "/azure/ai-services/speech-service/professional-voice-train-voice",

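The two redirect entries added in the hunk above share a three-key shape. The following Python sketch checks that shape before such a file is committed. The validation rules (source paths under `/articles/`, site-relative redirect URLs, a boolean `redirect_document_id`) are assumptions inferred from the entries shown in this diff, not a documented schema, and `check_entry` is a hypothetical helper.

```python
import json

# The two entries added by this commit, copied from the diff above.
ENTRIES_JSON = """
[
  {
    "source_path_from_root": "/articles/ai-services/content-safety/how-to/foundry.md",
    "redirect_url": "/azure/ai-foundry/ai-services/content-safety-overview",
    "redirect_document_id": false
  },
  {
    "source_path_from_root": "/articles/ai-services/content-safety/studio-quickstart.md",
    "redirect_url": "/azure/ai-foundry/ai-services/content-safety-overview?context=/azure/ai-services/content-safety/context/context",
    "redirect_document_id": false
  }
]
"""

REQUIRED_KEYS = {"source_path_from_root", "redirect_url", "redirect_document_id"}


def check_entry(entry):
    """Return a list of problems found in one redirect entry (empty if none)."""
    missing = REQUIRED_KEYS - set(entry)
    if missing:
        return [f"missing keys: {sorted(missing)}"]
    problems = []
    if not entry["source_path_from_root"].startswith("/articles/"):
        problems.append("source path is not under /articles/")
    if not entry["redirect_url"].startswith("/"):
        problems.append("redirect_url is not site-relative")
    if not isinstance(entry["redirect_document_id"], bool):
        problems.append("redirect_document_id is not a boolean")
    return problems


entries = json.loads(ENTRIES_JSON)
problems_by_source = {e["source_path_from_root"]: check_entry(e) for e in entries}
```

Both entries from the diff pass these checks. A real redirection file may legitimately contain entries that this sketch would flag, so treat the rules as a starting point.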
articles/ai-services/agents/how-to/tools/fabric.md

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ You need to first build and publish a Fabric data agent and then connect your Fa
3434

3535
* Developers and end users have at least `READ` access to the Fabric data agent and the underlying data sources it connects with.
3636

37+
* Your Fabric data agent and Azure AI Agent must be in the same tenant.
38+
3739
## Setup
3840
> [!NOTE]
3941
> * The model you selected in Azure AI Agent setup is only used for agent orchestration and response generation. It doesn't impact which model Fabric data agent uses for NL2SQL operation.

articles/ai-services/content-safety/how-to/foundry.md

Lines changed: 0 additions & 115 deletions
This file was deleted.
