Commit b3d89f6

Jill Grant authored
Merge pull request #184 from PatrickFarley/openai-updates
tentative rm enhancement content
2 parents 656f7fc + 15fb5ac commit b3d89f6

10 files changed, +49 -753 lines changed

articles/ai-services/computer-vision/concept-object-detection-40.md

Lines changed: 0 additions & 2 deletions
@@ -23,8 +23,6 @@ Try out the capabilities of object detection quickly and easily in your browser
 > [!div class="nextstepaction"]
 > [Try Vision Studio](https://portal.vision.cognitive.azure.com/)

-> [!TIP]
-> You can use the Object detection feature through the [Azure OpenAI](/azure/ai-services/openai/overview) service. The **GPT-4 Turbo with Vision** model lets you chat with an AI assistant that can analyze the images you share, and the Vision Enhancement option uses Image Analysis to provide the AI assistance with more details (readable text and object locations) about the image. For more information, see the [GPT-4 Turbo with Vision quickstart](/azure/ai-services/openai/gpt-v-quickstart).

 ## Object detection example

articles/ai-services/computer-vision/concept-ocr.md

Lines changed: 0 additions & 2 deletions
@@ -23,8 +23,6 @@ OCR is a machine-learning-based technique for extracting text from in-the-wild a

 The new Azure AI Vision Image Analysis 4.0 REST API offers the ability to extract printed or handwritten text from images in a unified performance-enhanced synchronous API that makes it easy to get all image insights including OCR results in a single API operation. The Read OCR engine is built on top of multiple deep learning models supported by universal script-based models for [global language support](./language-support.md).

-> [!TIP]
-> You can also use the OCR feature in conjunction with the [Azure OpenAI](/azure/ai-services/openai/overview) service. The **GPT-4 Turbo with Vision** model lets you chat with an AI assistant that can analyze the images you share, and the Vision Enhancement option uses Image Analysis to give the AI assistant more details (readable text and object locations) about the image. For more information, see the [GPT-4 Turbo with Vision quickstart](/azure/ai-services/openai/gpt-v-quickstart).

 ## Text extraction example

articles/ai-services/computer-vision/overview-image-analysis.md

Lines changed: 0 additions & 2 deletions
@@ -61,8 +61,6 @@ You can analyze images to provide insights about their visual features and chara
 |**Detect the color scheme** (v3.2 only) |Analyze color usage within an image. Azure AI Vision can determine whether an image is black & white or color and, for color images, identify the dominant and accent colors.| [Detect the color scheme](concept-detecting-color-schemes.md)|
 |**Moderate content in images** (v3.2 only) |You can use Azure AI Vision to detect adult content in an image and return confidence scores for different classifications. The threshold for flagging content can be set on a sliding scale to accommodate your preferences.|[Detect adult content](concept-detecting-adult-content.md)|

-> [!TIP]
-> You can leverage the Read text and Object detection features of Image Analysis through the [Azure OpenAI](/azure/ai-services/openai/overview) service. The **GPT-4 Turbo with Vision** model lets you chat with an AI assistant that can analyze the images you share, and the Vision Enhancement option uses Image Analysis to give the AI assistant more details about the image (readable text and object locations). For more information, see the [GPT-4 Turbo with Vision quickstart](/azure/ai-services/openai/gpt-v-quickstart).

 ## Product Recognition (v4.0 preview only) (deprecated)

articles/ai-services/content-safety/includes/severity-levels-text-four.md

Lines changed: 16 additions & 16 deletions
Large diffs are not rendered by default.

articles/ai-services/content-safety/includes/severity-levels-text.md

Lines changed: 32 additions & 32 deletions
Large diffs are not rendered by default.

articles/ai-services/openai/concepts/gpt-with-vision.md

Lines changed: 0 additions & 36 deletions
@@ -20,31 +20,6 @@ To try out GPT-4 Turbo with Vision, see the [quickstart](/azure/ai-services/open

 The GPT-4 Turbo with Vision model answers general questions about what's present in the images or videos you upload.

-## Enhancements
-
-Enhancements let you incorporate other Azure AI services (such as Azure AI Vision) to add new functionality to the chat-with-vision experience.
-
-> [!IMPORTANT]
-> To use Vision enhancement, you need a Computer Vision resource. It must be in the paid (S1) tier and in the same Azure region as your GPT-4 Turbo with Vision resource.
-
-> [!IMPORTANT]
-> Vision enhancements are not supported by the GPT-4 Turbo GA model. They are only available with the preview models.
-
-**Object grounding**: Azure AI Vision complements GPT-4 Turbo with Vision’s text response by identifying and locating salient objects in the input images. This lets the chat model give more accurate and detailed responses about the contents of the image.
-
-:::image type="content" source="../media/concepts/gpt-v/object-grounding.png" alt-text="Screenshot of an image with object grounding applied. Objects have bounding boxes with labels.":::
-
-:::image type="content" source="../media/concepts/gpt-v/object-grounding-response.png" alt-text="Screenshot of a chat response to an image prompt about an outfit. The response is an itemized list of clothing items seen in the image.":::
-
-**Optical Character Recognition (OCR)**: Azure AI Vision complements GPT-4 Turbo with Vision by providing high-quality OCR results as supplementary information to the chat model. It allows the model to produce higher quality responses for images with dense text, transformed images, and numbers-heavy financial documents, and increases the variety of languages the model can recognize in text.
-
-:::image type="content" source="../media/concepts/gpt-v/receipts.png" alt-text="Photo of several receipts.":::
-
-:::image type="content" source="../media/concepts/gpt-v/ocr-response.png" alt-text="Screenshot of the JSON response of an OCR call.":::
-
-**Video prompt**: The **video prompt** enhancement lets you use video clips as input for AI chat, enabling the model to generate summaries and answers about video content. It uses Azure AI Vision Video Retrieval to sample a set of frames from a video and create a transcript of the speech in the video.
-
-> [!VIDEO https://www.microsoft.com/en-us/videoplayer/embed/RW1eHRf]

 ## Special pricing information

@@ -59,15 +34,6 @@ Base Pricing for GPT-4 Turbo with Vision is:

 See the [Tokens section of the overview](/azure/ai-services/openai/overview#tokens) for information on how text and images translate to tokens.

-If you turn on Enhancements, additional usage applies for using GPT-4 Turbo with Vision with Azure AI Vision functionality.
-
-| Model | Price |
-|-----------------|-----------------|
-| + Enhanced add-on features for OCR | $1.5 per 1000 transactions |
-| + Enhanced add-on features for Object Detection | $1.5 per 1000 transactions |
-| + Enhanced add-on feature for “Video Retrieval” integration **<sup>1</sup>** | Ingestion: $0.05 per minute of video <br>Transactions: $0.25 per 1000 queries of the Video Retrieval index |
-
-**<sup>1</sup>** Processing videos involves the use of extra tokens to identify key frames for analysis. The number of these additional tokens will be roughly equivalent to the sum of the tokens in the text input, plus 700 tokens.

 ### Example image price calculation
 > [!IMPORTANT]
@@ -108,9 +74,7 @@ This section describes the limitations of GPT-4 Turbo with Vision.

 ### Image support

-- **Limitation on image enhancements per chat session**: Enhancements cannot be applied to multiple images within a single chat call.
 - **Maximum input image size**: The maximum size for input images is restricted to 20 MB.
-- **Object grounding in enhancement API**: When the enhancement API is used for object grounding, and the model detects duplicates of an object, it will generate one bounding box and label for all the duplicates instead of separate ones for each.
 - **Low resolution accuracy**: When images are analyzed using the "low resolution" setting, it allows for faster responses and uses fewer input tokens for certain use cases. However, this could impact the accuracy of object and text recognition within the image.
 - **Image chat restriction**: When you upload images in Azure OpenAI Studio or the API, there is a limit of 10 images per chat call.
