Commit 0bc728f

Merge pull request #1045 from MicrosoftDocs/main
10/25/2024 PM Publish
2 parents acd5cbe + b93a360 commit 0bc728f

36 files changed: +634 -289 lines changed

.openpublishing.redirection.json

Lines changed: 5 additions & 0 deletions
@@ -19,6 +19,11 @@
 "source_path_from_root": "/articles/search/search-howto-index-csv-blobs.md",
 "redirect_url": "/azure/search/search-how-to-index-csv-blobs",
 "redirect_document_id": false
+},
+{
+"source_path_from_root": "/articles/search/search-howto-large-index.md",
+"redirect_url": "/azure/search/search-how-to-large-index",
+"redirect_document_id": false
 }
 ]
 }
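
Review note: the hunk above adds one redirect entry with the same three keys as its neighbor. A minimal Python sketch of a sanity check for such entries follows. The helper name `check_redirections` and the idea of passing the entries as a plain list are assumptions made for illustration; this is not part of the repository's tooling.

```python
from collections import Counter

# Keys that every redirect entry in the hunk above carries.
REQUIRED_KEYS = {"source_path_from_root", "redirect_url", "redirect_document_id"}


def check_redirections(entries):
    """Return a list of human-readable problems found in redirect entries.

    Hypothetical helper: `entries` is a list of dicts shaped like the
    objects shown in the diff.
    """
    problems = []
    for index, entry in enumerate(entries):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {index}: missing keys {sorted(missing)}")
    # A source path should be redirected only once.
    counts = Counter(entry.get("source_path_from_root") for entry in entries)
    for path, count in counts.items():
        if path is not None and count > 1:
            problems.append(f"duplicate source path: {path}")
    return problems


# The two entries visible in the hunk (the second is the one being added).
entries = [
    {
        "source_path_from_root": "/articles/search/search-howto-index-csv-blobs.md",
        "redirect_url": "/azure/search/search-how-to-index-csv-blobs",
        "redirect_document_id": False,
    },
    {
        "source_path_from_root": "/articles/search/search-howto-large-index.md",
        "redirect_url": "/azure/search/search-how-to-large-index",
        "redirect_document_id": False,
    },
]

print(check_redirections(entries))
```

An empty list means the entries pass both checks.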

articles/ai-services/openai/concepts/model-retirements.md

Lines changed: 7 additions & 2 deletions
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about the model deprecations and retirements in Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 10/02/2024
+ms.date: 10/25/2024
 ms.custom:
 manager: nitinme
 author: mrbullwinkle
@@ -91,6 +91,8 @@ These models are currently available for use in Azure OpenAI Service.
 
 | Model | Version | Retirement date | Suggested replacements |
 | ---- | ---- | ---- | --- |
+| `babbage-002` | 1 | Deprecation Date: November 15, 2024 <br>Retirement Date: January 27, 2025 | |
+| `davinci-002` | 1 | Deprecation Date: November 15, 2024 <br>Retirement Date: January 27, 2025 | |
 | `dall-e-2`| 2 | January 27, 2025 | `dalle-3` |
 | `dall-e-3` | 3 | No earlier than April 30, 2025 | |
 | `gpt-35-turbo` | 0301 | January 27, 2025<br><br> Deployments set to [**Auto-update to default**](/azure/ai-services/openai/how-to/working-with-models?tabs=powershell#auto-update-to-default) will be automatically upgraded to version: `0125`, starting on November 13, 2024. | `gpt-35-turbo` (0125) <br><br> `gpt-4o-mini` |
@@ -158,9 +160,12 @@ If you're an existing customer looking for information about these models, see [
 | code-search-babbage-code-001 | July 6, 2023 | June 14, 2024 | text-embedding-3-small |
 | code-search-babbage-text-001 | July 6, 2023 | June 14, 2024 | text-embedding-3-small |
 
-
 ## Retirement and deprecation history
 
+## October 25, 2024
+
+* `babbage-002` & `davinci-002` deprecation date: November 15, 2024 and retirement date: January 27, 2025.
+
 ## September 12, 2024
 
 * `gpt-35-turbo` (0301), (0613), (1106) and `gpt-35-turbo-16k` (0613) auto-update to default upgrade date updated to November 13, 2024.
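
Review note: the new rows give `babbage-002` and `davinci-002` a deprecation date of November 15, 2024 and a retirement date of January 27, 2025. A small Python sketch of the gap between those two dates follows. The function name `days_between` is hypothetical, and the date format string assumes the English month names used in the table.

```python
from datetime import datetime

# Format of the dates as written in the retirement table, e.g. "November 15, 2024".
DATE_FORMAT = "%B %d, %Y"


def days_between(start_text, end_text):
    """Return the number of days from start_text to end_text (hypothetical helper)."""
    start = datetime.strptime(start_text, DATE_FORMAT).date()
    end = datetime.strptime(end_text, DATE_FORMAT).date()
    return (end - start).days


# Dates taken from the babbage-002 / davinci-002 rows added in this commit.
deprecation_to_retirement = days_between("November 15, 2024", "January 27, 2025")
print(deprecation_to_retirement)  # 73
```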

articles/ai-services/openai/concepts/models.md

Lines changed: 77 additions & 49 deletions
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about the different model capabilities that are available with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 10/09/2024
+ms.date: 10/25/2024
 ms.custom: references_regions, build-2023, build-2023-dataai, refefences_regions
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
@@ -357,16 +357,40 @@ You can also use the OpenAI text to speech voices via Azure AI Speech. To learn
 
 ## Model summary table and region availability
 
-> [!NOTE]
-> This article primarily covers model/region availability that applies to all Azure OpenAI customers with deployment types of **Standard**. Some select customers have access to model/region combinations that are not listed in the unified table below. For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
+### Models by deployment type
+
+Azure OpenAI provides customers with choices on the hosting structure that fits their business and usage patterns. The service offers two main types of deployment:
+
+- **Standard** is offered with a global deployment option, routing traffic globally to provide higher throughput.
+- **Provisioned** is also offered with a global deployment option, allowing customers to purchase and deploy provisioned throughput units across Azure global infrastructure.
+
+All deployments can perform the exact same inference operations, however the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types see our [deployment types guide](../how-to/deployment-types.md).
+
+# [Global Standard](#tab/global-standard)
+
+### Global standard model availability
+
+[!INCLUDE [Standard Global](../includes/model-matrix/standard-global.md)]
+
+# [Global Provisioned Managed](#tab/global-ptum)
+
+### Global provisioned managed model availability
+
+[!INCLUDE [Provisioned Managed Global](../includes/model-matrix/provisioned-global.md)]
+
+# [Global Batch](#tab/global-batch)
+
+### Global batch model availability
+
+[!INCLUDE [Global batch](../includes/model-matrix/global-batch.md)]
+
+# [Standard](#tab/standard)
 
 ### Standard deployment model availability
 
 [!INCLUDE [Standard Models](../includes/model-matrix/standard-models.md)]
 
-This table doesn't include fine-tuning regional availability information. Consult the [fine-tuning section](#fine-tuning-models) for this information.
-
-For information on default quota, refer to the [quota and limits article](../quotas-limits.md).
+# [Provisioned Managed](#tab/provisioned)
 
 ### Provisioned deployment model availability
 
@@ -377,24 +401,20 @@ For information on default quota, refer to the [quota and limits article](../quo
 
 For more information on Provisioned deployments, see our [Provisioned guidance](./provisioned-throughput.md).
 
-### Global standard model availability
+---
 
-[!INCLUDE [Standard Global](../includes/model-matrix/standard-global.md)]
+This table doesn't include fine-tuning regional availability information. Consult the [fine-tuning section](#fine-tuning-models) for this information.
 
-### Global provisioned managed model availability
+### Standard models by endpoint
 
-[!INCLUDE [Provisioned Managed Global](../includes/model-matrix/provisioned-global.md)]
+# [Chat Completions](#tab/standard-chat-completions)
 
-### Global batch model availability
+### Chat completions
 
-[!INCLUDE [Global batch](../includes/model-matrix/global-batch.md)]
+[!INCLUDE [Chat Completions](../includes/model-matrix/standard-chat-completions.md)]
 
 ### GPT-4 and GPT-4 Turbo model availability
 
-#### Public cloud regions
-
-[!INCLUDE [GPT-4](../includes/model-matrix/standard-gpt-4.md)]
-
 #### Select customer access
 
 In addition to the regions above which are available to all Azure OpenAI customers, some select pre-existing customers have been granted access to versions of GPT-4 in additional regions:
@@ -406,23 +426,14 @@ In addition to the regions above which are available to all Azure OpenAI custome
 
 ### GPT-3.5 models
 
-> [!IMPORTANT]
-> The NEW `gpt-35-turbo (0125)` model has various improvements, including higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.
-
-GPT-3.5 Turbo is used with the Chat Completion API. GPT-3.5 Turbo version 0301 can also be used with the Completions API, though this is not recommended. GPT-3.5 Turbo versions 0613 and 1106 only support the Chat Completions API.
-
-GPT-3.5 Turbo version 0301 is the first version of the model released. Version 0613 is the second version of the model and adds function calling support.
-
 See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-3.5 Turbo deployments.
 
-### GPT-3.5-Turbo model availability
-
-#### Public cloud regions
-
-[!INCLUDE [GPT-35-Turbo](../includes/model-matrix/standard-gpt-35-turbo.md)]
+# [Embeddings](#tab/standard-embeddings)
 
 ### Embeddings models
 
+[!INCLUDE [Embeddings](../includes/model-matrix/standard-embeddings.md)]
+
 These models can only be used with Embedding API requests.
 
 > [!NOTE]
@@ -438,21 +449,51 @@ These models can only be used with Embedding API requests.
 > [!NOTE]
 > When sending an array of inputs for embedding, the max number of input items in the array per call to the embedding endpoint is 2048.
 
-#### Public cloud regions
+# [Image Generation](#tab/standard-image-generations)
 
-[!INCLUDE [Embeddings](../includes/model-matrix/standard-embeddings.md)]
+### Image generation models
+
+[!INCLUDE [Image Generation](../includes/model-matrix/standard-image-generation.md)]
 
 ### DALL-E models
 
-| Model ID | Feature Availability | Max Request (characters) |
-| --- | --- | :---: |
-| dalle2 (preview) | East US | 1,000 |
-| dall-e-3 | East US, Australia East, Sweden Central | 4,000 |
+| Model ID | Max Request (characters) |
+| --- | :---: |
+| dalle2 (preview) | 1,000 |
+| dall-e-3 | 4,000 |
+
+# [Audio](#tab/standard-audio)
+
+### Audio models
+
+[!INCLUDE [Audio](../includes/model-matrix/standard-audio.md)]
+
+### Whisper models
 
-### Fine-tuning models
+| Model ID | Max Request (audio file size) |
+| --- | :---: |
+| `whisper` | 25 MB |
+
+### Text to speech models (Preview)
+
+| Model ID | Description |
+| --- | :--- |
+| `tts` | The latest Azure OpenAI text to speech model, optimized for speed. |
+| `tts-hd` | The latest Azure OpenAI text to speech model, optimized for quality.|
+|
+
+# [Completions (Legacy)](#tab/standard-completions)
+
+### Completions models
 
 `babbage-002` and `davinci-002` are not trained to follow instructions. Querying these base models should only be done as a point of reference to a fine-tuned version to evaluate the progress of your training.
 
+[!INCLUDE [Completions](../includes/model-matrix/standard-completions.md)]
+
+---
+
+## Fine-tuning models
+
 `gpt-35-turbo` - fine-tuning of this model is limited to a subset of regions, and is not available in every region the base model is available.
 
 | Model ID | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) |
@@ -468,20 +509,7 @@ These models can only be used with Embedding API requests.
 
 **<sup>1</sup>** GPT-4 is currently in public preview.
 
-### Whisper models
-
-| Model ID | Model Availability | Max Request (audio file size) |
-| --- | --- | :---: |
-| `whisper` | East US 2 <br> North Central US <br> Norway East <br> South India <br> Sweden Central <br> West Europe | 25 MB |
-
-### Text to speech models (Preview)
-
-| Model ID | Model Availability |
-| --- | --- | :---: |
-| `tts-1` | North Central US <br> Sweden Central |
-| `tts-1-hd` | North Central US <br> Sweden Central |
-
-### Assistants (Preview)
+## Assistants (Preview)
 
 For Assistants you need a combination of a supported model, and a supported region. Certain tools and capabilities require the latest models. The following models are available in the Assistants API, SDK, Azure AI Studio and Azure OpenAI Studio. The following table is for pay-as-you-go. For information on Provisioned Throughput Unit (PTU) availability, see [provisioned throughput](./provisioned-throughput.md). The listed models and regions can be used with both Assistants v1 and v2. You can use [global standard models](#global-standard-model-availability) if they are supported in the regions listed below.
 
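
Review note: the embeddings note kept as context in this file's diff caps the input array at 2048 items per call to the embedding endpoint. A minimal Python sketch of client-side batching under that cap follows. The generator name `batch_inputs` is hypothetical, and the sketch only splits the list; it makes no service call.

```python
# Limit stated in the embeddings note: max input items per embedding call.
MAX_INPUTS_PER_CALL = 2048


def batch_inputs(inputs, batch_size=MAX_INPUTS_PER_CALL):
    """Yield consecutive slices of `inputs`, each at most `batch_size` long.

    Hypothetical helper: each yielded slice could be sent as the input
    array of one embedding request.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(inputs), batch_size):
        yield inputs[start:start + batch_size]


# Example: 5,000 placeholder strings split into batches within the limit.
texts = [f"document {i}" for i in range(5000)]
batch_sizes = [len(batch) for batch in batch_inputs(texts)]
print(batch_sizes)  # [2048, 2048, 904]
```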

articles/ai-services/openai/concepts/provisioned-throughput.md

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ To help with simplifying the sizing effort, the following table outlines the TPM
 | Input TPM per PTU | 2,500 | 37,000 |
 | Output TPM per PTU | 833 | 12,333 |
 
-\** For a full list see the [AOAI Studio calcualator](https://oai.azure.com/portal/calculator)
+For a full list see the [AOAI Studio calculator](https://oai.azure.com/portal/calculator).
 
 
 ## Key concepts
@@ -114,7 +114,7 @@ In Azure OpenAI Studio, the deployment experience identifies when a region lacks
 
 Details on the new deployment experience can be found in the Azure OpenAI [Provisioned get started guide](../how-to/provisioned-get-started.md).
 
-The new [model capacities API](/rest/api/aiservices/accountmanagement/model-capacities/list?view=rest-aiservices-accountmanagement-2024-04-01-preview&tabs=HTTP&preserve-view=true) can be used to programmatically identify the maximum sized deployment of a specified model. The API consideres both the your quota and service capacity in the region.
+The new [model capacities API](/rest/api/aiservices/accountmanagement/model-capacities/list?view=rest-aiservices-accountmanagement-2024-04-01-preview&tabs=HTTP&preserve-view=true) can be used to programmatically identify the maximum sized deployment of a specified model. The API considers both your quota and service capacity in the region.
 
 If an acceptable region isn't available to support the desire model, version and/or PTUs, customers can also try the following steps:
 

articles/ai-services/openai/includes/model-matrix/global-batch.md

Lines changed: 6 additions & 6 deletions
@@ -5,11 +5,11 @@ description: Regional availability for Global Batch models
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: include
-ms.date: 10/03/2024
+ms.date: 10/24/2024
 ---
 
-| **Region** | **gpt-4**, **0613** | **gpt-4**, **turbo-2024-04-09** | **gpt-4o**, **2024-05-13** | **gpt-4o**, **2024-08-06** | **gpt-4o-mini**, **2024-07-18** | **gpt-35-turbo**, **0613** | **gpt-35-turbo**, **1106** | **gpt-35-turbo**, **0125** |
-|:-----------------|:-------------------:|:-------------------------------:|:--------------------------:|:--------------------------:|:-------------------------------:|:--------------------------:|:--------------------------:|:--------------------------:|
-| eastus || || |||||
-| swedencentral || || |||||
-| westus || || |||||
+| **Region** | **gpt-4o**, **2024-05-13** | **gpt-4o**, **2024-08-06** | **gpt-4o-mini**, **2024-07-18** | **gpt-4**, **0613** | **gpt-4**, **turbo-2024-04-09** | **gpt-35-turbo**, **0613** | **gpt-35-turbo**, **1106** | **gpt-35-turbo**, **0125** |
+|:-----------------|:--------------------------:|:--------------------------:|:-------------------------------:|:-------------------:|:-------------------------------:|:--------------------------:|:--------------------------:|:--------------------------:|
+| eastus | | | | |||||
+| swedencentral | | | | |||||
+| westus | | | | |||||

articles/ai-services/openai/includes/model-matrix/provisioned-global.md

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: include
 ms.custom: references_regions
-ms.date: 10/03/2024
+ms.date: 10/25/2024
 ---
 
 | **Region** | **gpt-4o**, **2024-08-06** | **gpt-4o-mini**, **2024-07-18** |
@@ -34,4 +34,4 @@ ms.date: 10/03/2024
 | uksouth |||
 | westeurope |||
 | westus |||
-| westus3 |||
+| westus3 |||
