articles/cognitive-services/openai/concepts/models.md
---
title: Azure OpenAI models
titleSuffix: Azure OpenAI
description: Learn about the different models that are available in Azure OpenAI.
ms.service: cognitive-services
ms.topic: conceptual
ms.date: 06/24/2022
recommendations: false
keywords:
---
# Azure OpenAI models
The service provides access to many different models, grouped by family and capability. A model family typically associates models by their intended task. The following table describes model families currently available in Azure OpenAI.
| Model family | Description|
|--|--|
|[GPT-3](#gpt-3-models)| A series of models that can understand and generate natural language. |
|[Codex](#codex-models)| A series of models that can understand and generate code, including translating natural language to code. |
|[Embeddings](#embeddings-models)| A set of models that can understand and use embeddings. An embedding is a special format of data representation that can be easily utilized by machine learning models and algorithms. The embedding is an information dense representation of the semantic meaning of a piece of text. Currently, we offer three families of Embeddings models for different functionalities: similarity, text search, and code search. |
## Model capabilities
Each model family has a series of models that are further distinguished by capability. These capabilities are typically identified by names, and the alphabetical order of these names generally signifies the relative capability and cost of that model within a given model family. For example, GPT-3 models use names such as Ada, Babbage, Curie, and Davinci to indicate relative capability and cost. Davinci is more capable (at a higher cost) than Curie, which in turn is more capable (at a higher cost) than Babbage, and so on.
> [!NOTE]
> Any task that can be performed by a less capable model like Ada can be performed by a more capable model like Curie or Davinci.
## Naming convention
Azure OpenAI's model names typically correspond to the following standard naming convention: `{family}-{capability}[-{input-type}]-{identifier}`

| Element | Description |
|--|--|
|`{family}`| The model family of the model. For example, [GPT-3 models](#gpt-3-models) use `text`, while [Codex models](#codex-models) use `code`.|
|`{capability}`| The relative capability of the model. For example, GPT-3 models include `ada`, `babbage`, `curie`, and `davinci`.|
|`{input-type}`| ([Embeddings models](#embeddings-models) only) The input type of the embedding supported by the model. For example, text search embedding models support `doc` and `query`.|
|`{identifier}`| The version identifier of the model. |
For example, our most powerful GPT-3 model is called `text-davinci-002`, while our most powerful Codex model is called `code-davinci-002`.
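
As an illustrative sketch (a hypothetical helper, not part of the service), a GPT-3 or Codex model name following this convention can be split back into its elements:

```python
def parse_model_name(name: str) -> dict:
    """Split a GPT-3 or Codex model name into its naming-convention elements.

    Hypothetical helper for illustration only; it assumes the simple
    three-element shape used by GPT-3 and Codex names. Embeddings model
    names add further elements, such as an input type.
    """
    family, capability, identifier = name.split("-")
    return {"family": family, "capability": capability, "identifier": identifier}

print(parse_model_name("text-davinci-002"))
# {'family': 'text', 'capability': 'davinci', 'identifier': '002'}
```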
> [!NOTE]
> Older versions of the GPT-3 models are available, named `ada`, `babbage`, `curie`, and `davinci`. These older models do not follow the standard naming conventions, and they are primarily intended for fine-tuning. For more information, see [Learn how to customize a model for your application](../how-to/fine-tuning.md).
## Finding what models are available
You can easily see the models you have available for both inference and fine-tuning in your resource by using the [Models API](../reference.md#models).
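
As a minimal sketch, the Models API can be called over REST with only the standard library; the endpoint, key, and `api-version` below are placeholders, and the exact route and supported versions should be confirmed against the API reference linked above.

```python
import urllib.request

def build_models_request(endpoint: str, api_key: str, api_version: str) -> urllib.request.Request:
    """Build a GET request for the Models API (route assumed; confirm in the API reference)."""
    url = f"{endpoint}/openai/models?api-version={api_version}"
    return urllib.request.Request(url, headers={"api-key": api_key})

req = build_models_request(
    "https://YOUR-RESOURCE.openai.azure.com",  # placeholder resource endpoint
    "YOUR-API-KEY",                            # placeholder key
    "2022-06-01-preview",                      # example api-version; check the reference
)

# Uncomment to call the service and print the raw JSON response:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```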
## Finding the right model
We recommend starting with the most capable model in a model family because it's the best way to understand what the service is capable of. After you have an idea of what you want to accomplish, you can either stay with that model or move to a model with lower capability and cost, optimizing around that model's capabilities.
## GPT-3 models
The GPT-3 models can understand and generate natural language. The service offers four model capabilities, each with different levels of power and speed suitable for different tasks. Davinci is the most capable model, while Ada is the fastest. The following list represents the latest versions of GPT-3 models, ordered by increasing capability.

- `text-ada-001`
- `text-babbage-001`
- `text-curie-001`
- `text-davinci-002`
While Davinci is the most capable, the other models provide significant speed advantages. Our recommendation is for users to start with Davinci while experimenting, because it will produce the best results and validate the value our service can provide. Once you have a prototype working, you can then optimize your model choice with the best latency/performance balance for your application.
### <a id="gpt-3-davinci"></a>Davinci
Davinci is the most capable model and can perform any task the other models can perform, often with less instruction. For applications requiring deep understanding of the content, like summarization for a specific audience and creative content generation, Davinci produces the best results. The increased capabilities provided by Davinci require more compute resources, so Davinci costs more and isn't as fast as other models.
Another area where Davinci excels is in understanding the intent of text. Davinci is excellent at solving many kinds of logic problems and explaining the motives of characters. Davinci has been able to solve some of the most challenging AI problems involving cause and effect.

### Curie

Curie is powerful, yet fast. While Davinci is stronger when it comes to analyzing complicated text, Curie is quite capable for many nuanced tasks like sentiment classification and summarization.

### Babbage
Babbage can perform straightforward tasks like simple classification. It’s also capable when it comes to semantic search, ranking how well documents match up with search queries.

### Ada

Ada is usually the fastest model and can perform tasks like parsing text, address correction and certain kinds of classification tasks that don’t require too much nuance. Ada’s performance can often be improved by providing more context.

## Codex models

The Codex models are descendants of our base GPT-3 models that can understand and generate code. Their training data contains both natural language and billions of lines of public code from GitHub.
They’re most capable in Python and proficient in over a dozen languages, including C#, JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, SQL, and even Shell. The following list represents the latest versions of Codex models, ordered by increasing capability.
- `code-cushman-001`
- `code-davinci-002`
### <a id="codex-davinci"></a>Davinci
Similar to GPT-3, Davinci is the most capable Codex model and can perform any task the other models can perform, often with less instruction. For applications requiring deep understanding of the content, Davinci produces the best results. These increased capabilities require more compute resources, so Davinci costs more and isn't as fast as other models.
### Cushman
Cushman is powerful, yet fast. While Davinci is stronger when it comes to analyzing complicated tasks, Cushman is a capable model for many code generation tasks. Cushman typically runs faster and cheaper than Davinci, as well.
## Embeddings models
Currently, we offer three families of Embeddings models for different functionalities: similarity, text search, and code search.

Each family includes models across a range of capability. The following list indicates the length of the numerical vector returned by the service, based on model capability:
- Ada: 1024 dimensions
- Babbage: 2048 dimensions
- Curie: 4096 dimensions
- Davinci: 12288 dimensions
Davinci is the most capable, but is slower and more expensive than the other models. Ada is the least capable, but is both faster and cheaper.
### Similarity embedding
These models are good at capturing semantic similarity between two or more pieces of text.
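
Similarity between two embedding vectors is commonly measured with cosine similarity. A minimal sketch; the tiny vectors below are made-up stand-ins for real model output (a real Ada embedding has 1024 dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must come from the same model capability")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```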

### Text search embedding

These models help measure whether long documents are relevant to a short search query. There are two input types supported by this family: `doc`, for embedding the documents to be retrieved, and `query`, for embedding the search query.
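
As a sketch of how the two input types fit together (the toy 3-dimensional vectors below are made up; in practice each document is embedded with the `doc` model and the query with the matching `query` model), retrieval reduces to scoring each document embedding against the query embedding:

```python
def rank_documents(query_emb, doc_embs):
    """Rank document ids by dot-product score against a query embedding."""
    scores = {
        doc_id: sum(q * d for q, d in zip(query_emb, emb))
        for doc_id, emb in doc_embs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy embeddings for illustration only; real vectors are much longer.
docs = {"report": [0.9, 0.1, 0.0], "memo": [0.1, 0.9, 0.0]}
print(rank_documents([1.0, 0.0, 0.0], docs))  # 'report' scores highest
```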
### Code search embedding
Similar to text search embedding models, there are two input types supported by this family: `code`, for embedding code snippets to be retrieved, and `text`, for embedding natural language search queries.
When using our Embeddings models, keep in mind their limitations and risks.