Skip to content

Commit 677453b

Browse files
committed
change section
1 parent 69f39be commit 677453b

File tree

1 file changed

+4
-4
lines changed

1 file changed

+4
-4
lines changed

articles/ai-services/openai/how-to/gpt-with-vision.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -347,11 +347,11 @@ Every response includes a `"finish_details"` field. The subfield `"type"` has th
347347

348348
If `finish_details.type` is `stop`, then there is another `"stop"` property that specifies the token that caused the output to end.
349349

350-
## Low or high fidelity image understanding
350+
## Detail parameter settings in image processing: Low, High, Auto
351351

352-
By controlling the _detail_ parameter, which has two options, `low` or `high`, you can control how the model processes the image and generates its textual understanding.
353-
- `low` disables the "high res" mode. The model receives a low-res 512x512 version of the image and represents the image with a budget of 65 tokens. This allows the API to return faster responses and consume fewer input tokens for use cases that don't require high detail.
354-
- `high` enables "high res" mode, which first allows the model to see the low res image and then creates detailed crops of input images as 512x512 squares based on the input image size. Each of the detailed crops uses twice the token budget (65 tokens) for a total of 129 tokens.
352+
The detail parameter in the model offers three choices: `low`, `high`, or `auto`, to adjust the way the model interprets and processes images. The default setting is auto, where the model decides between low or high based on the size of the image input.
353+
- `low` setting: the model does not activate the "high res" mode, instead processing a lower resolution 512x512 version of the image using 65 tokens, resulting in quicker responses and reduced token consumption for scenarios where fine detail isn't crucial.
354+
- `high` setting activates "high res" mode. Here, the model initially views the low-resolution image and then generates detailed 512x512 segments from the input image. Each segment uses double the token budget, amounting to 129 tokens per segment, allowing for a more detailed interpretation of the image.
355355

356356
## Limitations
357357

0 commit comments

Comments
 (0)