Commit f29caaa

Merge pull request #256430 from mrbullwinkle/mrb_10_27_2023_models
[Azure OpenAI] Models page updates
2 parents 9249df2 + c11d655 commit f29caaa

3 files changed: 45 additions and 34 deletions

articles/ai-services/openai/concepts/models.md

Lines changed: 43 additions & 32 deletions
@@ -46,7 +46,7 @@ You can see the token context length supported by each model in the [model summa
 
 To learn more about how to interact with GPT-3.5 Turbo and the Chat Completions API check out our [in-depth how-to](../how-to/chatgpt.md).
 
-## Embeddings models
+## Embeddings
 
 > [!IMPORTANT]
 > We strongly recommend using `text-embedding-ada-002 (Version 2)`. This model/version provides parity with OpenAI's `text-embedding-ada-002`. To learn more about the improvements offered by this model, please refer to [OpenAI's blog post](https://openai.com/blog/new-and-improved-embedding-model). Even if you are currently using Version 1 you should migrate to Version 2 to take advantage of the latest weights/updated token limit. Version 1 and Version 2 are not interchangeable, so document embedding and document search must be done using the same version of the model.
@@ -74,37 +74,48 @@ You can also use the Whisper model via Azure AI Speech [batch transcription](../
 
 ### GPT-4 models
 
-GPT-4 and GPT-4-32k are now available to all Azure OpenAI Service customers. Availability varies by region. If you don't see GPT-4 in your region, please check back later.
+GPT-4 and GPT-4-32k models are now available to all Azure OpenAI Service customers. Availability varies by region. If you don't see GPT-4 in your region, please check back later.
 
 These models can only be used with the Chat Completion API.
 
 GPT-4 version 0314 is the first version of the model released. Version 0613 is the second version of the model and adds function calling support.
 
-| Model ID | Base model Regions | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) |
-| --- | --- | --- | --- | --- |
-| `gpt-4` <sup>2</sup> (0314) | East US<sup>1</sup>, France Central<sup>1</sup> | N/A<sup>3</sup> | 8,192 | September 2021 |
-| `gpt-4-32k` <sup>2</sup> (0314) | East US<sup>1</sup>, France Central<sup>1</sup> | N/A<sup>3</sup> | 32,768 | September 2021 |
-| `gpt-4` (0613) | Australia East<sup>1</sup>, Canada East, East US<sup>1</sup>, East US 2<sup>1</sup>, France Central<sup>1</sup>, Japan East<sup>1</sup>, Sweden Central, Switzerland North, UK South<sup>1</sup> | N/A<sup>3</sup> | 8,192 | September 2021 |
-| `gpt-4-32k` (0613) | Australia East<sup>1</sup>, Canada East, East US<sup>1</sup>, East US 2<sup>1</sup>, France Central<sup>1</sup>, Japan East<sup>1</sup>, Sweden Central, Switzerland North, UK South<sup>1</sup> | N/A<sup>3</sup> | 32,768 | September 2021 |
+| Model ID | Max Request (tokens) | Training Data (up to) |
+| --- | :---: | :---: |
+| `gpt-4` (0314) | 8,192 | Sep 2021 |
+| `gpt-4-32k`(0314) | 32,768 | Sep 2021 |
+| `gpt-4` (0613) | 8,192 | Sep 2021 |
+| `gpt-4-32k` (0613) | 32,768 | Sep 2021 |
 
-<sup>1</sup> Due to high demand, availability is limited in the region<br>
-<sup>2</sup> Version `0314` of gpt-4 and gpt-4-32k will be retired no earlier than July 5, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.<br>
-<sup>3</sup> Fine-tuning is not supported for GPT-4 models.
+> [!NOTE]
+> Any region where GPT-4 is listed as available will always have access to both the 8K and 32K versions of the model
+
+### GPT-4 model availability
+
+| Model Availability | gpt-4 (0314) | gpt-4 (0613) |
+|---|:---|:---|
+| Available to all subscriptions with Azure OpenAI access | | Canada East <br> Sweden Central <br> Switzerland North |
+| Available to subscriptions with current access to the model version in the region | East US <br> France Central <br> South Central US <br> UK South | Australia East <br> East US <br> East US 2 <br> France Central <br> Japan East <br> UK South |
 
 ### GPT-3.5 models
 
 GPT-3.5 Turbo is used with the Chat Completion API. GPT-3.5 Turbo (0301) can also be used with the Completions API. GPT3.5 Turbo (0613) only supports the Chat Completions API.
 
 GPT-3.5 Turbo version 0301 is the first version of the model released. Version 0613 is the second version of the model and adds function calling support.
 
-| Model ID | Base model Regions | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) |
-| --------- | --------------------- | ------------------- | -------------------- | ---------------------- |
-| `gpt-35-turbo`<sup>1</sup> (0301) | East US, France Central, South Central US, UK South, West Europe | N/A | 4,096 | Sep 2021 |
-| `gpt-35-turbo` (0613) | Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North, UK South | North Central US, Sweden Central | 4,096 | Sep 2021 |
-| `gpt-35-turbo-16k` (0613) | Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North, UK South | N/A | 16,384 | Sep 2021 |
-| `gpt-35-turbo-instruct` (0914) | East US, Sweden Central | N/A | 4,097 | Sep 2021 |
+> [!NOTE]
+> Version `0301` of `gpt-35-turbo` will be retired no earlier than July 5, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
+
+### GPT-3.5-Turbo model availability
+
+| Model ID | Model Availability | Max Request (tokens) | Training Data (up to) |
+| --------- | -------------------- |:------:|:----:|
+| `gpt-35-turbo`<sup>1</sup> (0301) | East US <br> France Central <br> South Central US <br> UK South <br> West Europe | 4096 | Sep 2021 |
+| `gpt-35-turbo` (0613) | Australia East <br> Canada East <br> East US <br> East US 2 <br> France Central <br> Japan East <br> North Central US <br> Sweden Central <br> Switzerland North <br> UK South | 4096 | Sep 2021 |
+| `gpt-35-turbo-16k` (0613) | Australia East <br> Canada East <br> East US <br> East US 2 <br> France Central <br> Japan East <br> North Central US <br> Sweden Central <br> Switzerland North<br> UK South | 16,384 | Sep 2021 |
+| `gpt-35-turbo-instruct` (0914) | East US <br> Sweden Central | 4097 |Sep 2021 |
 
-<sup>1</sup> Version `0301` of gpt-35-turbo will be retired no earlier than July 5, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
+<sup>1</sup> This model will accept requests > 4096 tokens. It is not recommended to exceed the 4096 input token limit as the newer version of the model are capped at 4096 tokens. If you encounter issues when exceeding 4096 input tokens with this model this configuration is not officially supported.
 
 ### Embeddings models
 
@@ -113,16 +124,16 @@ These models can only be used with Embedding API requests.
 > [!NOTE]
 > We strongly recommend using `text-embedding-ada-002 (Version 2)`. This model/version provides parity with OpenAI's `text-embedding-ada-002`. To learn more about the improvements offered by this model, please refer to [OpenAI's blog post](https://openai.com/blog/new-and-improved-embedding-model). Even if you are currently using Version 1 you should migrate to Version 2 to take advantage of the latest weights/updated token limit. Version 1 and Version 2 are not interchangeable, so document embedding and document search must be done using the same version of the model.
 
-| Model ID | Base model Regions | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) | Output dimensions |
-| --- | --- | --- | --- | --- |
-| text-embedding-ada-002 (version 2) | Australia East, Canada East, East US, East US2, France Central, Japan East, North Central US, South Central US, Switzerland North, UK South, West Europe | N/A |8,191 | Sep 2021 | 1536 |
-| text-embedding-ada-002 (version 1) | East US, South Central US, West Europe | N/A |2,046 | Sep 2021 | 1536 |
+| Model ID | Model Availability | Max Request (tokens) | Training Data (up to) | Output Dimensions |
+|---|---| :---:|:---:|:---:|
+| `text-embedding-ada-002` (version 2) | Australia East <br> Canada East <br> East US <br> East US2 <br> France Central <br> Japan East <br> North Central US <br> South Central US <br> Switzerland North <br> UK South <br> West Europe |8,191 | Sep 2021 | 1536 |
+| `text-embedding-ada-002` (version 1) | East US <br> South Central US <br> West Europe |2,046 | Sep 2021 | 1536 |
 
 ### DALL-E models (Preview)
 
-| Model ID | Base model Regions | Fine-Tuning Regions | Max Request (characters) | Training Data (up to) |
-| --- | --- | --- | --- | --- |
-| dalle2 | East US | N/A | 1000 | N/A |
+| Model ID | Feature Availability | Max Request (characters) |
+| --- | --- | :---: |
+| dalle2 | East US | 1000 |
 
 ### Fine-tuning models (Preview)
 
@@ -131,16 +142,16 @@ These models can only be used with Embedding API requests.
 `gpt-35-turbo-0613` - fine-tuning of this model is limited to a subset of regions, and is not available in every region the base model is available.
 
 | Model ID | Fine-Tuning Regions | Max Request (tokens) | Training Data (up to) |
-| --- | --- | --- | --- | --- |
-| `babbage-002` | North Central US, Sweden Central | 16,384 | Sep 2021 |
-| `davinci-002` | North Central US, Sweden Central | 16,384 | Sep 2021 |
-| `gpt-35-turbo` (0613) | North Central US, Sweden Central | 4096 | Sep 2021 |
+| --- | --- | :---: | :---: |
+| `babbage-002` | North Central US <br> Sweden Central | 16,384 | Sep 2021 |
+| `davinci-002` | North Central US <br> Sweden Central | 16,384 | Sep 2021 |
+| `gpt-35-turbo` (0613) | North Central US <br> Sweden Central | 4096 | Sep 2021 |
 
 ### Whisper models (Preview)
 
-| Model ID | Base model Regions | Fine-Tuning Regions | Max Request (audio file size) | Training Data (up to) |
-| --- | --- | --- | --- | --- |
-| whisper | North Central US, West Europe | N/A | 25 MB | N/A |
+| Model ID | Model Availability | Max Request (audio file size) |
+| --- | --- | :---: |
+| `whisper` | North Central US <br> West Europe | 25 MB |
 
 ## Next steps
 
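The per-model request limits in the updated tables can be read as a simple lookup. The following is a minimal sketch, not part of the commit: the helper name `fits_request_limit` and the dictionary layout are hypothetical, and it assumes `token_count` is the request size as the service counts it. The limits themselves are copied from the new table rows in this diff.

```python
# Max request sizes (tokens), copied from the updated tables in models.md.
# The dictionary layout and helper below are illustrative assumptions only.
MAX_REQUEST_TOKENS = {
    ("gpt-4", "0314"): 8192,
    ("gpt-4-32k", "0314"): 32768,
    ("gpt-4", "0613"): 8192,
    ("gpt-4-32k", "0613"): 32768,
    ("gpt-35-turbo", "0301"): 4096,
    ("gpt-35-turbo", "0613"): 4096,
    ("gpt-35-turbo-16k", "0613"): 16384,
    ("gpt-35-turbo-instruct", "0914"): 4097,
    ("text-embedding-ada-002", "2"): 8191,
    ("text-embedding-ada-002", "1"): 2046,
}


def fits_request_limit(model_id: str, version: str, token_count: int) -> bool:
    """Return True if token_count is within the documented limit (hypothetical helper)."""
    try:
        limit = MAX_REQUEST_TOKENS[(model_id, version)]
    except KeyError:
        raise ValueError(f"unknown model/version: {model_id} ({version})") from None
    return token_count <= limit
```

For example, `fits_request_limit("gpt-35-turbo", "0613", 4097)` returns `False` because that version is capped at 4096 tokens, which matches the new footnote advising against exceeding 4096 tokens even on version 0301.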
articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 1 addition & 1 deletion
@@ -315,7 +315,7 @@ Azure OpenAI on your data provides several search options you can use when you a
 * [Keyword search](/azure/search/search-lucene-query-architecture)
 
 * [Semantic search](/azure/search/semantic-search-overview)
-* [Vector search](/azure/search/vector-search-overview) using Ada [embedding](./understand-embeddings.md) models, available in [select regions](models.md#embeddings-models-1).
+* [Vector search](/azure/search/vector-search-overview) using Ada [embedding](./understand-embeddings.md) models, available in [select regions](models.md#embeddings-models).
 
 To enable vector search, you will need a `text-embedding-ada-002` deployment in your Azure OpenAI resource. Select your embedding deployment when connecting your data, then select one of the vector search types under **Data management**.
 
articles/ai-services/openai/reference.md

Lines changed: 1 addition & 1 deletion
@@ -398,7 +398,7 @@ The following parameters can be used inside of the `parameters` field inside of
 | `filter` | string | Optional | null | The filter pattern used for [restricting access to sensitive documents](./concepts/use-your-data.md#document-level-access-control)
 | `embeddingEndpoint` | string | Optional | null | The endpoint URL for an Ada embedding model deployment, generally of the format `https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings?api-version=2023-05-15`. Use with the `embeddingKey` parameter for [vector search](./concepts/use-your-data.md#search-options) outside of private networks and private endpoints. |
 | `embeddingKey` | string | Optional | null | The API key for an Ada embedding model deployment. Use with `embeddingEndpoint` for [vector search](./concepts/use-your-data.md#search-options) outside of private networks and private endpoints. |
-| `embeddingDeploymentName` | string | Optional | null | The Ada embedding model deployment name within the same Azure OpenAI resource. Used instead of `embeddingEndpoint` and `embeddingKey` for [vector search](./concepts/use-your-data.md#search-options). Should only be used when both the `embeddingEndpoint` and `embeddingKey` parameters are defined. When this parameter is provided, Azure OpenAI on your data will use an internal call to evaluate the Ada embedding model, rather than calling the Azure OpenAI endpoint. This enables you to use vector search in private networks and private endpoints. Billing remains the same whether this parameter is defined or not. Available in regions where embedding models are [available](./concepts/models.md#embeddings-models-1) starting in API versions `2023-06-01-preview` and later.|
+| `embeddingDeploymentName` | string | Optional | null | The Ada embedding model deployment name within the same Azure OpenAI resource. Used instead of `embeddingEndpoint` and `embeddingKey` for [vector search](./concepts/use-your-data.md#search-options). Should only be used when both the `embeddingEndpoint` and `embeddingKey` parameters are defined. When this parameter is provided, Azure OpenAI on your data will use an internal call to evaluate the Ada embedding model, rather than calling the Azure OpenAI endpoint. This enables you to use vector search in private networks and private endpoints. Billing remains the same whether this parameter is defined or not. Available in regions where embedding models are [available](./concepts/models.md#embeddings-models) starting in API versions `2023-06-01-preview` and later.|
 
 ### Start an ingestion job
 

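The `embeddingEndpoint` row in reference.md gives the URL format for an Ada embedding deployment. As a rough sketch of how that string could be assembled, the function below fills in the `YOUR_RESOURCE_NAME` and `YOUR_DEPLOYMENT_NAME` placeholders. The function name `build_embedding_endpoint` and its arguments are hypothetical; only the URL shape and the `2023-05-15` API version come from the table.

```python
from urllib.parse import quote, urlencode


def build_embedding_endpoint(
    resource_name: str,
    deployment_name: str,
    api_version: str = "2023-05-15",
) -> str:
    """Assemble an embeddingEndpoint URL in the format shown in reference.md.

    Hypothetical helper: it only formats the string and does not check
    that the resource or deployment exists.
    """
    base = f"https://{resource_name}.openai.azure.com"
    # Percent-encode the deployment name so it is safe as a single path segment.
    path = f"/openai/deployments/{quote(deployment_name, safe='')}/embeddings"
    query = urlencode({"api-version": api_version})
    return f"{base}{path}?{query}"
```

The result would go in the `embeddingEndpoint` parameter together with `embeddingKey`, as the table describes for vector search outside of private networks and private endpoints.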