articles/ai-services/openai/faq.yml
+10 −7 (10 additions & 7 deletions)
@@ -7,7 +7,7 @@ metadata:
   manager: nitinme
   ms.service: azure-ai-openai
   ms.topic: faq
-  ms.date: 01/01/2024
+  ms.date: 04/24/2024
   ms.author: mbullwin
   author: mrbullwinkle
   title: Azure OpenAI Service frequently asked questions
@@ -43,7 +43,7 @@ sections:
       - question: |
           Does Azure OpenAI support VNETs and Private Endpoints?
         answer: |
-          Yes, as part of Azure AI services, Azure OpenAI supports VNETs and Private Endpoints. To learn more, consult the [Azure AI services virtual networking guidance](../cognitive-services-virtual-networks.md?context=/azure/ai-services/openai/context/context)
+          Yes, as part of Azure AI services, Azure OpenAI supports VNETs and Private Endpoints. To learn more, consult the [Azure AI services virtual networking guidance](../cognitive-services-virtual-networks.md?context=/azure/ai-services/openai/context/context).
       - question: |
           Do the GPT-4 models currently support image input?
         answer: |
@@ -55,7 +55,7 @@ sections:
      - question: |
          I'm trying to use embeddings and received the error "InvalidRequestError: Too many inputs. The max number of inputs is 16." How do I fix this?
        answer: |
-          This error typically occurs when you try to send a batch of text to embed in a single API request as an array. Currently Azure OpenAI only supports arrays of embeddings with multiple inputs for the `text-embedding-ada-002` Version 2 model. This model version supports an array consisting of up to 16 inputs per API request. The array can be up to 8191 tokens in length when using the text-embedding-ada-002 (Version 2) model.
+          This error typically occurs when you try to send a batch of text to embed in a single API request as an array. Currently Azure OpenAI only supports arrays of embeddings with multiple inputs for the `text-embedding-ada-002` Version 2 model. This model version supports an array consisting of up to 16 inputs per API request. The array can be up to 8,191 tokens in length when using the text-embedding-ada-002 (Version 2) model.
      - question: |
          Where can I read about better ways to use Azure OpenAI to get the responses I want from the service?
        answer: |
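The 16-input limit described in this hunk can be handled client-side by splitting a large list of texts into batches. A minimal sketch, assuming an Azure OpenAI client from the `openai` Python SDK exposing `embeddings.create` (the `client` and `deployment` names are illustrative):

```python
from typing import Iterable, List

MAX_INPUTS_PER_REQUEST = 16  # text-embedding-ada-002 (Version 2) limit


def batched(items: List[str], size: int = MAX_INPUTS_PER_REQUEST) -> Iterable[List[str]]:
    """Yield successive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(client, deployment: str, texts: List[str]) -> List[list]:
    """Embed an arbitrary number of texts, at most 16 per request.

    `client` is assumed to expose `embeddings.create(model=..., input=[...])`,
    as in the `openai` Python SDK; this sketch only demonstrates the batching.
    """
    vectors: List[list] = []
    for chunk in batched(texts):
        response = client.embeddings.create(model=deployment, input=chunk)
        vectors.extend(item.embedding for item in response.data)
    return vectors
```

Note that batching inputs this way keeps each request under the input-count limit; the per-request 8,191-token cap still applies and may require trimming long texts separately.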
@@ -109,7 +109,10 @@ sections:
      - question: |
          Where do I access pricing information for legacy models, which are no longer available for new deployments?
        answer: |
          Legacy pricing information is available via a [downloadable PDF file](https://download.microsoft.com/download/a/b/5/ab542db1-f1a7-4f92-b615-2e2eaccb64ea/Azure-OpenAI-Legacy-Pricing.pdf). For all other models, consult the [official pricing page](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/).
+      - question: |
+          How do I fix InternalServerError - 500 - Failed to create completion as the model generated invalid Unicode output?
+        answer:
+          You can minimize the occurrence of these errors by reducing the temperature of your prompts to less than 1 and ensuring you're using a client with retry logic. Reattempting the request often results in a successful response.

  - name: Getting access to Azure OpenAI Service
    questions:
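The retry logic recommended in the new FAQ entry above can be as simple as an exponential-backoff wrapper around the request. A minimal sketch (the `call` argument stands in for any SDK request; names are illustrative, not from the source):

```python
import time


def with_retries(call, attempts: int = 3, base_delay: float = 1.0):
    """Invoke `call`, retrying with exponential backoff on any exception.

    In real use you would narrow the `except` clause to transient errors
    such as HTTP 500 / InternalServerError responses.
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the last error
            time.sleep(base_delay * (2 ** attempt))
```

Combining this with a request temperature below 1, as the answer suggests, reduces how often the retry path is needed at all.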
@@ -134,8 +137,8 @@ sections:
      - question: |
          Where can I post questions and see answers to other common questions?
        answer: |
-          We recommend posting questions on [Microsoft Q&A](/answers/tags/387/azure-openai)
-          Alternatively, you can post questions on [Stack Overflow](https://stackoverflow.com/search?q=azure+openai)
+          We recommend posting questions on [Microsoft Q&A](/answers/tags/387/azure-openai).
+          Alternatively, you can post questions on [Stack Overflow](https://stackoverflow.com/search?q=azure+openai).
      - question: |
          Where do I go for Azure OpenAI customer support?
        answer: |
@@ -195,7 +198,7 @@ sections:
      - question: |
          Is there a limit to the size of the image I can upload?
        answer: |
-          Yes, we restrict image uploads to 20MB per image.
+          Yes, we restrict image uploads to 20 MB per image.
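A caller can validate the 20 MB cap client-side before uploading and avoid a round trip that will be rejected. A minimal sketch (the helper name is illustrative, not part of any SDK):

```python
import os

MAX_IMAGE_BYTES = 20 * 1024 * 1024  # the documented 20 MB per-image cap


def image_within_limit(path: str) -> bool:
    """Return True if the file at `path` is at or under the 20 MB limit."""
    return os.path.getsize(path) <= MAX_IMAGE_BYTES
```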
-When you run the preceding code, you get a blank console window. Enter your first question in the window and then select the Enter key. After the response is returned, you can repeat the process and keep asking questions.
+When you run the preceding code, you get a blank console window. Enter your first question in the window and then select the `Enter` key. After the response is returned, you can repeat the process and keep asking questions.
 
 ## Manage conversations
 
-The previous example runs until you hit the model's token limit. With each question asked and answer received, the `messages` list grows in size. The token limit for `gpt-35-turbo` is 4,096 tokens. The token limits for `gpt-4` and `gpt-4-32k` are 8,192 and 32,768, respectively. These limits include the token count from both the message list sent and the model response. The number of tokens in the messages list combined with the value of the `max_tokens` parameter must stay under these limits or you receive an error.
+The previous example runs until you hit the model's token limit. With each question asked, and answer received, the `messages` list grows in size. The token limit for `gpt-35-turbo` is 4,096 tokens. The token limits for `gpt-4` and `gpt-4-32k` are 8,192 and 32,768, respectively. These limits include the token count from both the message list sent and the model response. The number of tokens in the messages list combined with the value of the `max_tokens` parameter must stay under these limits or you receive an error.
 
 It's your responsibility to ensure that the prompt and completion fall within the token limit. For longer conversations, you need to keep track of the token count and only send the model a prompt that falls within the limit.
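The bookkeeping described in this hunk can be sketched as a trimming helper that drops the oldest turns until the conversation fits. The token estimate below is a deliberately crude stand-in; in practice you would count with a real tokenizer such as tiktoken's `cl100k_base` encoding:

```python
from typing import Dict, List


def estimate_tokens(message: Dict[str, str]) -> int:
    # Crude stand-in: roughly one token per word, plus per-message overhead.
    # Replace with a real tokenizer (e.g. tiktoken) for accurate counts.
    return len(message["content"].split()) + 4


def trim_history(messages: List[Dict[str, str]], limit: int) -> List[Dict[str, str]]:
    """Drop the oldest non-system messages until the estimate fits `limit`.

    Assumes the system message, if any, sits at index 0 and should be kept.
    """
    trimmed = list(messages)
    while len(trimmed) > 1 and sum(map(estimate_tokens, trimmed)) > limit:
        del trimmed[1]  # remove the oldest user/assistant turn
    return trimmed
```

Remember that `max_tokens` for the response counts against the same limit, so the budget passed to `trim_history` should be the model limit minus `max_tokens`.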
@@ -551,9 +551,15 @@ Here's a troubleshooting tip.
 
 Some customers try to use the [legacy ChatML syntax](../how-to/chat-markup-language.md) with the chat completion endpoints and newer models. ChatML was a preview capability that only worked with the legacy completions endpoint with the `gpt-35-turbo` version 0301 model. This model is [slated for retirement](../concepts/model-retirements.md). If you attempt to use ChatML syntax with newer models and the chat completion endpoint, it can result in errors and unexpected model response behavior. We don't recommend this use. This same issue can occur when using common special tokens.
 
-| Error | Cause | Solution |
+| Error Code | Error Message | Solution |
 |---|---|---|
-| 400 - "Failed to generate output due to special tokens in the input." | Your prompt contains legacy ChatML tokens not recognized or supported by the model/endpoint. | Ensure that your prompt/messages array doesn't contain any legacy ChatML tokens/special tokens. If you're upgrading from a legacy model, exclude all special tokens before you submit an API request to the model.|
+| 400 | 400 - "Failed to generate output due to special tokens in the input." | Your prompt contains special tokens or legacy ChatML tokens not recognized or supported by the model/endpoint. Ensure that your prompt/messages array doesn't contain any legacy ChatML tokens/special tokens. If you're upgrading from a legacy model, exclude all special tokens before you submit an API request to the model.|
+
+### Failed to create completion as the model generated invalid Unicode output
+
+| Error Code | Error Message | Workaround |
+|---|---|---|
+| 500 | 500 - InternalServerError: Error code: 500 - {'error': {'message': 'Failed to create completion as the model generated invalid Unicode output'}}. | You can minimize the occurrence of these errors by reducing the temperature of your prompts to less than 1 and ensuring you're using a client with retry logic. Reattempting the request often results in a successful response. |
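The 400 row above says to exclude special tokens before submitting a request. A minimal sketch of sanitizing user-supplied text; the token list is illustrative, not exhaustive:

```python
import re

# Common ChatML / special tokens (illustrative; extend for your models).
SPECIAL_TOKENS = ["<|im_start|>", "<|im_end|>", "<|endoftext|>"]
_PATTERN = re.compile("|".join(re.escape(t) for t in SPECIAL_TOKENS))


def strip_special_tokens(text: str) -> str:
    """Remove known special tokens from text before sending it to the API."""
    return _PATTERN.sub("", text)
```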