update

mrbullwinkle · mrbullwinkle · commit 00c0efe98c94 · 2024-04-24T15:40:28.000-04:00
diff --git a/articles/ai-services/openai/concepts/model-retirements.md b/articles/ai-services/openai/concepts/model-retirements.md
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about the model deprecations and retirements in Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 03/12/2024
+ms.date: 04/24/2024
 ms.custom: 
 manager: nitinme
 author: mrbullwinkle
@@ -60,8 +60,8 @@ These models are currently available for use in Azure OpenAI Service.
 
 | Model | Version | Retirement date |
 | ---- | ---- | ---- |
-| `gpt-35-turbo` | 0301 | No earlier than June 13, 2024 |
-| `gpt-35-turbo`<br>`gpt-35-turbo-16k` | 0613 | No earlier than July 13, 2024 |
+| `gpt-35-turbo` | 0301 | No earlier than August 1, 2024 |
+| `gpt-35-turbo`<br>`gpt-35-turbo-16k` | 0613 | No earlier than August 1, 2024 |
 | `gpt-35-turbo` | 1106 | No earlier than Nov 17, 2024 |
 | `gpt-35-turbo` | 0125 | No earlier than Feb 22, 2025 |
 | `gpt-4`<br>`gpt-4-32k` | 0314 | No earlier than July 13, 2024 |
@@ -114,6 +114,10 @@ If you're an existing customer looking for information about these models, see [
 
 ## Retirement and deprecation history
 
+### April 24, 2024
+
+Earliest retirement date for `gpt-35-turbo` 0301 and 0613 has been updated to August 1, 2024.
+
 ### March 13, 2024
 
 We published this document to provide information about the current models, deprecated models, and upcoming retirements.
diff --git a/articles/ai-services/openai/concepts/models.md b/articles/ai-services/openai/concepts/models.md
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about the different model capabilities that are available with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 04/17/2024
+ms.date: 04/24/2024
 ms.custom: references_regions, build-2023, build-2023-dataai, refefences_regions
 manager: nitinme
 author: mrbullwinkle #ChrisHMSFT
@@ -165,14 +165,14 @@ The following GPT-4 models are available with [Azure Government](/azure/azure-go
 > [!IMPORTANT]
 > The NEW `gpt-35-turbo (0125)`  model has various improvements, including higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.
 
-GPT-3.5 Turbo is used with the Chat Completion API. GPT-3.5 Turbo version 0301 can also be used with the Completions API.  GPT-3.5 Turbo versions 0613 and 1106 only support the Chat Completions API.
+GPT-3.5 Turbo is used with the Chat Completion API. GPT-3.5 Turbo version 0301 can also be used with the Completions API, though this is not recommended.  GPT-3.5 Turbo versions 0613 and 1106 only support the Chat Completions API.
 
 GPT-3.5 Turbo version 0301 is the first version of the model released.  Version 0613 is the second version of the model and adds function calling support.
 
 See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-3.5 Turbo deployments.
 
 > [!NOTE]
-> Version `0613` of `gpt-35-turbo` and `gpt-35-turbo-16k` will be retired no earlier than July 13, 2024. Version `0301` of `gpt-35-turbo` will be retired no earlier than June 13, 2024.  See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
+> Version `0613` of `gpt-35-turbo` and `gpt-35-turbo-16k` will be retired no earlier than August 1, 2024. Version `0301` of `gpt-35-turbo` will be retired no earlier than August 1, 2024.  See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
 
 |  Model ID   | Max Request (tokens) | Training Data (up to) |
 |  --------- |:------:|:----:|
diff --git a/articles/ai-services/openai/how-to/chat-markup-language.md b/articles/ai-services/openai/how-to/chat-markup-language.md
@@ -14,7 +14,7 @@ keywords: ChatGPT
 # Chat Markup Language ChatML (Preview)
 
 > [!IMPORTANT]
-> Using GPT-3.5-Turbo models with the completion endpoint as described in this article remains in preview and is only possible with `gpt-35-turbo` version (0301) which is [slated for retirement as early as June 13th, 2024](../concepts/model-retirements.md#current-models). We strongly recommend using the [GA Chat Completion API/endpoint](./chatgpt.md). The Chat Completion API is the recommended method of interacting with the GPT-3.5-Turbo models. The Chat Completion API is also the only way to access the GPT-4 models.
+> Using GPT-3.5-Turbo models with the completion endpoint as described in this article remains in preview and is only possible with `gpt-35-turbo` version (0301) which is [slated for retirement as early as August 1, 2024](../concepts/model-retirements.md#current-models). We strongly recommend using the [GA Chat Completion API/endpoint](./chatgpt.md). The Chat Completion API is the recommended method of interacting with the GPT-3.5-Turbo models. The Chat Completion API is also the only way to access the GPT-4 models.
 
 The following code snippet shows the most basic way to use the GPT-3.5-Turbo models with ChatML. If this is your first time using these models programmatically we recommend starting with our [GPT-35-Turbo & GPT-4 Quickstart](../chatgpt-quickstart.md).
 
diff --git a/articles/ai-services/openai/includes/chat-completion.md b/articles/ai-services/openai/includes/chat-completion.md
@@ -425,7 +425,7 @@ def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0613"):
         return num_tokens_from_messages(messages, model="gpt-4-0613")
     else:
         raise NotImplementedError(
-            f"""num_tokens_from_messages() is not implemented for model {model}. See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens."""
+            f"""num_tokens_from_messages() is not implemented for model {model}."""
         )
     num_tokens = 0
     for message in messages:
@@ -547,13 +547,13 @@ The token counting portion of the code demonstrated previously is a simplified v
 
 Here's a troubleshooting tip.
 
-### Don't use ChatML syntax with the chat completion endpoint
+### Don't use ChatML syntax or special tokens with the chat completion endpoint
 
-Some customers try to use the [legacy ChatML syntax](../how-to/chat-markup-language.md) with the chat completion endpoints and newer models. ChatML was a preview capability that only worked with the legacy completions endpoint with the `gpt-35-turbo` version 0301 model. This model is [slated for retirement](../concepts/model-retirements.md). If you attempt to use ChatML syntax with newer models and the chat completion endpoint, it can result in errors and unexpected model response behavior. We don't recommend this use.
+Some customers try to use the [legacy ChatML syntax](../how-to/chat-markup-language.md) with the chat completion endpoints and newer models. ChatML was a preview capability that only worked with the legacy completions endpoint with the `gpt-35-turbo` version 0301 model. This model is [slated for retirement](../concepts/model-retirements.md). If you attempt to use ChatML syntax with newer models and the chat completion endpoint, it can result in errors and unexpected model response behavior. We don't recommend this use. This same issue can occur when using common special tokens.
 
 | Error |Cause | Solution |
 |---|---|---|
-| 400 - "Failed to generate output due to special tokens in the input." | Your prompt contains legacy ChatML tokens not recognized or supported by the model/endpoint. | Ensure that your prompt/messages array doesn't contain any legacy ChatML tokens. If you're upgrading from a legacy model, exclude all special tokens before you submit an API request to the model.|
+| 400 - "Failed to generate output due to special tokens in the input." | Your prompt contains legacy ChatML tokens not recognized or supported by the model/endpoint. | Ensure that your prompt/messages array doesn't contain any legacy ChatML tokens/special. If you're upgrading from a legacy model, exclude all special tokens before you submit an API request to the model.|
 
 ## Next steps