articles/ai-services/openai/how-to/manage-costs.md
Azure OpenAI Service runs on Azure infrastructure that accrues costs when you deploy new resources. Additional infrastructure costs might also accrue. The following sections describe how you're charged for Azure OpenAI Service.
### Model inference chat completions
Azure OpenAI chat completions model inference is [charged per 1,000 tokens with different rates](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) depending on the model and [deployment type](./deployment-types.md).
Azure OpenAI models understand and process text by breaking it down into tokens. For reference, each token is roughly four characters for typical English text.
Token costs apply to both input and output. For example, suppose you have a 1,000-token JavaScript code sample that you ask an Azure OpenAI model to convert to Python. You would be charged approximately 1,000 tokens for the initial input request sent and 1,000 more tokens for the output received in response, for a total of 2,000 tokens.
In practice, for this type of completion call, the token input/output wouldn't be perfectly 1:1. A conversion from one programming language to another could result in a longer or shorter output depending on many factors. One such factor is the value assigned to the `max_tokens` parameter.
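The billing arithmetic above can be sketched in a few lines. The per-1,000-token rates below are illustrative placeholders, not actual Azure OpenAI prices, and the character-based token estimate is only the rough "about four characters per token" heuristic mentioned earlier; for real workloads, use the service's reported token usage and the official pricing page.

```python
# Rough cost estimate for a single chat completions request.
# NOTE: these rates are hypothetical placeholders, not real Azure prices.
INPUT_RATE_PER_1K = 0.0005   # assumed $ per 1,000 input (prompt) tokens
OUTPUT_RATE_PER_1K = 0.0015  # assumed $ per 1,000 output (completion) tokens

def estimate_tokens(text: str) -> int:
    """Very rough heuristic: ~4 characters per token for typical English text."""
    return max(1, len(text) // 4)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """You pay for both the prompt you send and the completion you receive."""
    return (input_tokens / 1000) * INPUT_RATE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_RATE_PER_1K

# The JavaScript-to-Python example: ~1,000 tokens in, ~1,000 tokens out.
print(f"${estimate_cost(1000, 1000):.4f}")
```

In practice you would read the exact `input_tokens` and `output_tokens` from the usage data returned with each API response rather than estimating from character counts.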
### Fine-tuned models
Azure OpenAI fine-tuning is charged based on the [number of tokens in your training file](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/pricing-update-token-based-billing-for-fine-tuning-training-%F0%9F%8E%89/4164465). For the latest prices, see the [official pricing page](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/).
Once your fine-tuned model is deployed, you're also charged based on:
- Hosting hours
- Inference per 1,000 tokens (broken down by input usage and output usage)
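Putting the charge components together, a back-of-the-envelope monthly estimate for a deployed fine-tuned model might look like the following sketch. All of the rates are assumed placeholders, not actual Azure OpenAI prices; check the pricing page for real numbers.

```python
# Hypothetical monthly cost sketch for a deployed fine-tuned model.
# All rates below are illustrative placeholders, not actual Azure prices.
TRAINING_RATE_PER_1K = 0.008   # one-time charge per 1,000 training-file tokens
HOSTING_RATE_PER_HOUR = 1.70   # accrues whether or not the model is used
INPUT_RATE_PER_1K = 0.002      # inference, per 1,000 input tokens
OUTPUT_RATE_PER_1K = 0.004     # inference, per 1,000 output tokens

def monthly_fine_tune_cost(training_tokens: int,
                           input_tokens: int,
                           output_tokens: int,
                           hours: int = 730) -> float:
    """Training + hosting + inference for roughly one month of deployment."""
    training = (training_tokens / 1000) * TRAINING_RATE_PER_1K
    hosting = hours * HOSTING_RATE_PER_HOUR   # dominates if the model sits idle
    inference = (input_tokens / 1000) * INPUT_RATE_PER_1K \
              + (output_tokens / 1000) * OUTPUT_RATE_PER_1K
    return training + hosting + inference

# Even with zero inference traffic, hosting alone keeps accruing:
print(f"${monthly_fine_tune_cost(500_000, 0, 0):.2f}")
```

Note how the hosting-hours term is independent of usage, which is why an idle fine-tuned deployment still generates a bill.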
The hosting hours cost is important to be aware of: after a fine-tuned model is deployed, it continues to incur an hourly cost regardless of whether you're actively using it. Monitor deployed fine-tuned model costs closely.
> [!IMPORTANT]
> After you deploy a customized model, if at any time the deployment remains inactive for greater than fifteen (15) days, the deployment is deleted.