articles/ai-services/computer-vision/overview-image-analysis.md
31 additions & 6 deletions
@@ -49,10 +49,10 @@ You can analyze images to provide insights about their visual features and chara
49
49
|**Model customization** (v4.0 preview only)|You can create and train custom models to do image classification or object detection. Bring your own images, label them with custom tags, and Image Analysis trains a model customized for your use case.|[Model customization](./concept-model-customization.md)|
50
50
|**Read text from images** (v4.0 only)| Version 4.0 preview of Image Analysis offers the ability to extract readable text from images. Compared with the async Computer Vision 3.2 Read API, the new version offers the familiar Read OCR engine in a unified performance-enhanced synchronous API that makes it easy to get OCR along with other insights in a single API call. |[OCR for images](concept-ocr.md)|
51
51
|**Detect people in images** (v4.0 only)|Version 4.0 of Image Analysis offers the ability to detect people appearing in images. The bounding box coordinates of each detected person are returned, along with a confidence score. |[People detection](concept-people-detection.md)|
52
-
|**Generate image captions** | Generate a caption of an image in human-readable language, using complete sentences. Computer Vision's algorithms generate captions based on the objects identified in the image. <br/><br/>The version 4.0 image captioning model is a more advanced implementation and works with a wider range of input images. It's only available in the following geographic regions: East US, France Central, Korea Central, North Europe, Southeast Asia, West Europe, West US. <br/><br/>Version 4.0 also lets you use dense captioning, which generates detailed captions for individual objects that are found in the image. The API returns the bounding box coordinates (in pixels) of each object found in the image, plus a caption. You can use this functionality to generate descriptions of separate parts of an image.<br/><br/>:::image type="content" source="Images/description.png" alt-text="Photo of cows with a simple description on the right.":::| [Generate image captions (v3.2)](concept-describing-images.md)<br/>[(v4.0)](concept-describe-images-40.md)|
52
+
|**Generate image captions**| Generate a caption of an image in human-readable language, using complete sentences. Computer Vision's algorithms generate captions based on the objects identified in the image. <br/><br/>The version 4.0 image captioning model is a more advanced implementation and works with a wider range of input images. It's only available in certain geographic regions. See [Region availability](#region-availability). <br/><br/>Version 4.0 also lets you use dense captioning, which generates detailed captions for individual objects that are found in the image. The API returns the bounding box coordinates (in pixels) of each object found in the image, plus a caption. You can use this functionality to generate descriptions of separate parts of an image.<br/><br/>:::image type="content" source="Images/description.png" alt-text="Photo of cows with a simple description on the right.":::|[Generate image captions (v3.2)](concept-describing-images.md)<br/>[(v4.0)](concept-describe-images-40.md)|
53
53
|**Detect objects**|Object detection is similar to tagging, but the API returns the bounding box coordinates for each tag applied. For example, if an image contains a dog, cat and person, the Detect operation lists those objects together with their coordinates in the image. You can use this functionality to process further relationships between the objects in an image. It also lets you know when there are multiple instances of the same tag in an image. <br/><br/>:::image type="content" source="Images/detect-objects.png" alt-text="Photo of an office with a rectangle drawn around a laptop.":::|[Detect objects (v3.2)](concept-object-detection.md)<br/>[(v4.0)](concept-object-detection-40.md)|
54
54
|**Tag visual features**| Identify and tag visual features in an image, from a set of thousands of recognizable objects, living things, scenery, and actions. When the tags are ambiguous or not common knowledge, the API response provides hints to clarify the context of the tag. Tagging isn't limited to the main subject, such as a person in the foreground, but also includes the setting (indoor or outdoor), furniture, tools, plants, animals, accessories, gadgets, and so on.<br/><br/>:::image type="content" source="Images/tagging.png" alt-text="Photo of a skateboarder with tags listed on the right.":::|[Tag visual features (v3.2)](concept-tagging-images.md)<br/>[(v4.0)](concept-tag-images-40.md)|
55
-
|**Get the area of interest / smart crop**|Analyze the contents of an image to return the coordinates of the *area of interest* that matches a specified aspect ratio. Computer Vision returns the bounding box coordinates of the region, so the calling application can modify the original image as desired. <br/><br/>The version 4.0 smart cropping model is a more advanced implementation and works with a wider range of input images. It's only available in the following geographic regions: East US, France Central, Korea Central, North Europe, Southeast Asia, West Europe, West US. |[Generate a thumbnail (v3.2)](concept-generating-thumbnails.md)<br/>[(v4.0 preview)](concept-generate-thumbnails-40.md)|
55
+
|**Get the area of interest / smart crop**|Analyze the contents of an image to return the coordinates of the *area of interest* that matches a specified aspect ratio. Computer Vision returns the bounding box coordinates of the region, so the calling application can modify the original image as desired. <br/><br/>The version 4.0 smart cropping model is a more advanced implementation and works with a wider range of input images. It's only available in certain geographic regions. See [Region availability](#region-availability). |[Generate a thumbnail (v3.2)](concept-generating-thumbnails.md)<br/>[(v4.0 preview)](concept-generate-thumbnails-40.md)|
56
56
|**Detect brands** (v3.2 only) | Identify commercial brands in images or videos from a database of thousands of global logos. You can use this feature, for example, to discover which brands are most popular on social media or most prevalent in media product placement. |[Detect brands](concept-brand-detection.md)|
57
57
|**Categorize an image** (v3.2 only)|Identify and categorize an entire image, using a [category taxonomy](Category-Taxonomy.md) with parent/child hereditary hierarchies. Categories can be used alone, or with our new tagging models.<br/><br/>Currently, English is the only supported language for tagging and categorizing images. |[Categorize an image](concept-categorizing-images.md)|
58
58
|**Detect faces** (v3.2 only) |Detect faces in an image and provide information about each detected face. Azure AI Vision returns the coordinates, rectangle, gender, and age for each detected face.<br/><br/>You can also use the dedicated [Face API](./overview-identity.md) for these purposes. It provides more detailed analysis, such as facial identification and pose detection.|[Detect faces](concept-detecting-faces.md)|
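
The **Detect objects** row notes that bounding boxes let you tell when the same tag appears more than once in an image. A minimal Python sketch of that idea follows. It assumes a hypothetical list of detections shaped as dictionaries with `tag` and `box` keys; this shape and the `count_instances` helper are illustrative only and are not the actual API response schema.

```python
from collections import Counter

# Hypothetical detections for illustration; the real API response may be shaped differently.
detections = [
    {"tag": "dog", "box": {"x": 10, "y": 20, "w": 100, "h": 80}},
    {"tag": "person", "box": {"x": 150, "y": 5, "w": 60, "h": 200}},
    {"tag": "dog", "box": {"x": 300, "y": 40, "w": 90, "h": 70}},
]


def count_instances(items):
    """Count how many bounding boxes were returned for each tag."""
    return Counter(item["tag"] for item in items)


counts = count_instances(detections)

# Tags that were detected more than once in the same image.
repeated = sorted(tag for tag, n in counts.items() if n > 1)
```

Plain tagging would report `dog` once; counting boxes per tag is what surfaces the second instance.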
@@ -76,7 +76,7 @@ The multimodal embeddings APIs enable the _vectorization_ of images and text que
76
76
77
77
The `2024-02-01` API includes a multi-lingual model that supports text search in 102 languages. The original English-only model is still available, but it cannot be combined with the new model in the same search index. If you vectorized text and images using the English-only model, these vectors won’t be compatible with multi-lingual text and image vectors.
78
78
79
-
These APIs are only available in the following geographic regions: East US, France Central, Korea Central, North Europe, Southeast Asia, West Europe, West US.
79
+
These APIs are only available in certain geographic regions. See [Region availability](#region-availability).
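
Because vectors from the English-only model and the multi-lingual model are not compatible, an application that stores embeddings may want to record which model produced each vector and refuse to compare across models. The following is a minimal sketch of that guard in Python. The `Embedding` class, the model labels, and the `cosine_similarity` helper are assumptions made for illustration; they are not part of the Image Analysis API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Embedding:
    model: str      # hypothetical label for the model that produced the vector
    vector: tuple   # the embedding values


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine similarity, allowed only for vectors from the same model."""
    if a.model != b.model:
        raise ValueError(
            f"Cannot compare vectors from different models: {a.model!r} vs {b.model!r}"
        )
    if len(a.vector) != len(b.vector):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a.vector, b.vector))
    norm_a = math.sqrt(sum(x * x for x in a.vector))
    norm_b = math.sqrt(sum(y * y for y in b.vector))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy two-dimensional vectors; real embeddings are much longer.
image_vec = Embedding(model="multi-lingual", vector=(1.0, 0.0))
text_vec = Embedding(model="multi-lingual", vector=(1.0, 0.0))
old_vec = Embedding(model="english-only", vector=(1.0, 0.0))

same_model_score = cosine_similarity(image_vec, text_vec)
```

Calling `cosine_similarity(image_vec, old_vec)` raises `ValueError`, which mirrors the rule that the two models cannot share a search index.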
@@ -90,7 +90,10 @@ Image Analysis 4.0 (preview) offers the ability to remove the background of an i
90
90
|:---------:|:---------:|:---------:|
91
91
| :::image type="content" source="media/background-removal/person-5.png" alt-text="Photo of a group of people using a tablet."::: | :::image type="content" source="media/background-removal/person-5-result.png" alt-text="Photo of a group of people using a tablet; background is transparent."::: | :::image type="content" source="media/background-removal/person-5-matte.png" alt-text="Alpha matte of a group of people."::: |
92
92
93
-
## Image requirements
93
+
94
+
## Service limits
95
+
96
+
### Input requirements
94
97
95
98
#### [Version 4.0](#tab/4-0)
96
99
@@ -113,14 +116,36 @@ Image Analysis works on images that meet the following requirements:
113
116
114
117
---
115
118
116
-
## Service limits
117
119
118
120
### Language support
119
121
120
122
Different Image Analysis features are available in different languages. See the [Language support](/azure/ai-services/content-safety/language-support) page.
121
123
122
-
### Region/location availability
124
+
### Region availability
125
+
126
+
To use the Image Analysis APIs, you must create your Azure AI Vision resource in a supported region. The Image Analysis features are available in the following regions: