articles/ai-services/openai/how-to/manage-costs.md
Azure OpenAI Service runs on Azure infrastructure that accrues costs when you deploy new resources. Additional infrastructure costs might also accrue. The following sections describe how you're charged for Azure OpenAI Service.
### Model inference chat completions
Azure OpenAI chat completions model inference is [charged per 1,000 tokens with different rates](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) depending on the model and [deployment type](./deployment-types.md).
Azure OpenAI models understand and process text by breaking it down into tokens. For reference, each token is roughly four characters for typical English text.
Token costs apply to both input and output. For example, suppose you have a 1,000-token JavaScript code sample that you ask an Azure OpenAI model to convert to Python. You would be charged approximately 1,000 tokens for the initial input request sent and 1,000 more tokens for the output received in response, for a total of 2,000 tokens.
In practice, for this type of completion call, the token input/output wouldn't be perfectly 1:1. A conversion from one programming language to another could result in a longer or shorter output depending on many factors. One such factor is the value assigned to the `max_tokens` parameter.
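The billing arithmetic above can be sketched in a few lines. The per-1,000-token rates below are illustrative placeholders, not actual Azure OpenAI prices, and the character-based token estimate is only the rough "about four characters per token" heuristic mentioned earlier; for real workloads, use the service's reported token usage and the official pricing page.

```python
# Rough cost estimate for a single chat completions request.
# NOTE: these rates are hypothetical placeholders, not real Azure prices.
INPUT_RATE_PER_1K = 0.0005   # assumed $ per 1,000 input (prompt) tokens
OUTPUT_RATE_PER_1K = 0.0015  # assumed $ per 1,000 output (completion) tokens

def estimate_tokens(text: str) -> int:
    """Very rough heuristic: ~4 characters per token for typical English text."""
    return max(1, len(text) // 4)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """You pay for both the prompt you send and the completion you receive."""
    return (input_tokens / 1000) * INPUT_RATE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_RATE_PER_1K

# The JavaScript-to-Python example: ~1,000 tokens in, ~1,000 tokens out.
print(f"${estimate_cost(1000, 1000):.4f}")
```

In practice you would read the exact `input_tokens` and `output_tokens` from the usage data returned with each API response rather than estimating from character counts.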
### Fine-tuned models
Azure OpenAI fine-tuning is charged based on the [number of tokens in your training file](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/pricing-update-token-based-billing-for-fine-tuning-training-%F0%9F%8E%89/4164465). For the latest prices, see the [official pricing page](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/).
Once your fine-tuned model is deployed, you're also charged based on:
- Hosting hours
- Inference per 1,000 tokens (broken down by input usage and output usage)
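Putting the charge components together, a back-of-the-envelope monthly estimate for a deployed fine-tuned model might look like the following sketch. All of the rates are assumed placeholders, not actual Azure OpenAI prices; check the pricing page for real numbers.

```python
# Hypothetical monthly cost sketch for a deployed fine-tuned model.
# All rates below are illustrative placeholders, not actual Azure prices.
TRAINING_RATE_PER_1K = 0.008   # one-time charge per 1,000 training-file tokens
HOSTING_RATE_PER_HOUR = 1.70   # accrues whether or not the model is used
INPUT_RATE_PER_1K = 0.002      # inference, per 1,000 input tokens
OUTPUT_RATE_PER_1K = 0.004     # inference, per 1,000 output tokens

def monthly_fine_tune_cost(training_tokens: int,
                           input_tokens: int,
                           output_tokens: int,
                           hours: int = 730) -> float:
    """Training + hosting + inference for roughly one month of deployment."""
    training = (training_tokens / 1000) * TRAINING_RATE_PER_1K
    hosting = hours * HOSTING_RATE_PER_HOUR   # dominates if the model sits idle
    inference = (input_tokens / 1000) * INPUT_RATE_PER_1K \
              + (output_tokens / 1000) * OUTPUT_RATE_PER_1K
    return training + hosting + inference

# Even with zero inference traffic, hosting alone keeps accruing:
print(f"${monthly_fine_tune_cost(500_000, 0, 0):.2f}")
```

Note how the hosting-hours term is independent of usage, which is why an idle fine-tuned deployment still generates a bill.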
The hosting hours cost is important to be aware of: after a fine-tuned model is deployed, it continues to incur an hourly cost regardless of whether you're actively using it. Monitor deployed fine-tuned model costs closely.
> [!IMPORTANT]
> After you deploy a customized model, if at any time the deployment remains inactive for greater than fifteen (15) days, the deployment is deleted.