
Commit 8c76d8b

Merge pull request #277989 from mrbullwinkle/mrb_06_12_2024_max_tokens

[Azure OpenAI] Clarify default max token limit for vision models

2 parents: 6596993 + 9729fb1

File tree: 2 files changed (+8 −2 lines)

articles/ai-services/openai/faq.yml

Lines changed: 6 additions & 1 deletion
```diff
@@ -7,7 +7,7 @@ metadata:
   manager: nitinme
   ms.service: azure-ai-openai
   ms.topic: faq
-  ms.date: 04/24/2024
+  ms.date: 06/12/2024
   ms.author: mbullwin
   author: mrbullwinkle
   title: Azure OpenAI Service frequently asked questions
@@ -228,6 +228,11 @@ sections:
         What are the known limitations of GPT-4 Turbo with Vision?
       answer: |
         See the [limitations](./concepts/gpt-with-vision.md#limitations) section of the GPT-4 Turbo with Vision concepts guide.
+    - question: |
+        I keep getting truncated responses when I use GPT-4 Turbo vision models. Why is this happening?
+      answer:
+        By default, GPT-4 `vision-preview` and GPT-4 `turbo-2024-04-09` have a `max_tokens` value of 16. Depending on your request, this value is often too low and can lead to truncated responses. To resolve this issue, pass a larger `max_tokens` value as part of your chat completions API requests. GPT-4o defaults to a `max_tokens` value of 4096.
+
   - name: Assistants
     questions:
     - question: |
```
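The fix the new FAQ entry describes, passing a larger `max_tokens` with every chat completions call, can be sketched as plain request kwargs for the chat completions API. This is a minimal sketch: the deployment name, prompt, and image URL are hypothetical placeholders, and the actual network call is shown commented out because it needs a live Azure OpenAI resource and client.

```python
def build_vision_request(prompt: str, image_url: str, max_tokens: int = 4096) -> dict:
    """Assemble chat-completions kwargs for a GPT-4 Turbo with Vision deployment.

    Explicitly sets max_tokens to override the 16-token default that causes
    truncated responses on `vision-preview` and `turbo-2024-04-09`.
    """
    return {
        "model": "my-gpt4-turbo-deployment",  # hypothetical deployment name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        # The key fix: without this, vision model responses stop after 16 tokens.
        "max_tokens": max_tokens,
    }

# With an AzureOpenAI client from the `openai` Python package, the call would be:
# response = client.chat.completions.create(
#     **build_vision_request("Describe this image.", "https://example.com/photo.png")
# )
```

The same `max_tokens` field applies when calling the REST endpoint directly; the dict above matches the request body shape.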

articles/ai-services/openai/quotas-limits.md

Lines changed: 2 additions & 1 deletion
```diff
@@ -10,7 +10,7 @@ ms.custom:
   - ignite-2023
   - references_regions
 ms.topic: conceptual
-ms.date: 06/05/2024
+ms.date: 06/12/2024
 ms.author: mbullwin
 ---
 
@@ -46,6 +46,7 @@ The following sections provide you with a quick guide to the default quotas and
 | Max file size for Assistants & fine-tuning | 512 MB |
 | Assistants token limit | 2,000,000 token limit |
 | GPT-4o max images per request (# of images in the messages array/conversation history) | 10 |
+| GPT-4 `vision-preview` & GPT-4 `turbo-2024-04-09` default max tokens | 16 <br><br> Increase the `max_tokens` parameter value to avoid truncated responses. GPT-4o max tokens defaults to 4096. |
 
 ## Regional quota limits
 
```
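To confirm whether the default limit in the quota row above is what is cutting off a response, check the `finish_reason` field on the chat completions result: the API reports `"length"` when generation stopped because `max_tokens` was reached. A minimal sketch (the helper name and the inline response fragments are illustrative, not from this commit):

```python
def was_truncated(response: dict) -> bool:
    """True when any choice stopped because it hit max_tokens ("length")."""
    return any(
        choice.get("finish_reason") == "length"
        for choice in response.get("choices", [])
    )

# A response shaped like one cut off by the old 16-token default:
clipped = {"choices": [{"finish_reason": "length",
                        "message": {"content": "The image shows a"}}]}
# A response that finished normally:
complete = {"choices": [{"finish_reason": "stop",
                         "message": {"content": "The image shows a red bicycle."}}]}

print(was_truncated(clipped))   # True
print(was_truncated(complete))  # False
```

When this check returns `True`, retrying with a larger `max_tokens` value, as the FAQ entry in this commit advises, is the fix.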