
Commit ff2fd29

UUF model catalog updates
1 parent 93c4433 commit ff2fd29

6 files changed: +40 −24 lines

articles/search/cognitive-search-aml-skill.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ The **AML** skill can be called with the 2024-07-01 stable API version or the 20

Starting in 2024-05-01-preview REST API and in the Azure portal (which also targets the 2024-05-01-preview), Azure AI Search introduced the [Azure AI Foundry model catalog vectorizer](vector-search-vectorizer-azure-machine-learning-ai-studio-catalog.md) for query time connections to the model catalog in Azure AI Foundry portal. If you want to use that vectorizer for queries, the **AML** skill is the *indexing counterpart* for generating embeddings using a model in the Azure AI Foundry model catalog.

-During indexing, the **AML** skill can connect to the model catalog to generate vectors for the index. At query time, queries can use a vectorizer to connect to the same model to vectorize text strings for a vector query. In this workflow, the **AML** skill and the model catalog vectorizer should be used together so that you're using the same embedding model for both indexing and queries. See [How to implement integrated vectorization using models from Azure AI Foundry](vector-search-integrated-vectorization-ai-studio.md) for details on this workflow.
+During indexing, the **AML** skill can connect to the model catalog to generate vectors for the index. At query time, queries can use a vectorizer to connect to the same model to vectorize text strings for a vector query. In this workflow, the **AML** skill and the model catalog vectorizer should be used together so that you're using the same embedding model for both indexing and queries. See [Use embedding models from Azure AI Foundry model catalog](vector-search-integrated-vectorization-ai-studio.md) for details on this workflow.

> [!NOTE]
> The indexer will retry twice for certain standard HTTP status codes returned from the AML online endpoint. These HTTP status codes are:
3 binary files changed: 48.2 KB, 56.2 KB, 247 KB
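
As a rough sketch of the pairing described above, the index references the same deployed model through an `aml` vectorizer in its `vectorSearch` configuration, while the skillset runs the **AML** skill against that deployment during indexing. The endpoint URI, key, and the `my-*` names below are hypothetical placeholders:

```json
"vectorSearch": {
  "algorithms": [
    { "name": "my-hnsw", "kind": "hnsw" }
  ],
  "profiles": [
    {
      "name": "my-vector-profile",
      "algorithm": "my-hnsw",
      "vectorizer": "my-catalog-vectorizer"
    }
  ],
  "vectorizers": [
    {
      "name": "my-catalog-vectorizer",
      "kind": "aml",
      "amlParameters": {
        "uri": "https://my-embed-deployment.eastus.models.ai.azure.com",
        "key": "<your-endpoint-key>",
        "modelName": "Cohere-embed-v3-english"
      }
    }
  ]
}
```

A vector field that uses `my-vector-profile` is then populated by the **AML** skill at indexing time and vectorized by the same model at query time.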

articles/search/vector-search-integrated-vectorization-ai-studio.md

Lines changed: 31 additions & 17 deletions
@@ -11,7 +11,7 @@ ms.topic: how-to
ms.date: 12/03/2024
---

-# How to implement integrated vectorization using models from Azure AI Foundry
+# Use embedding models from Azure AI Foundry model catalog for integrated vectorization

> [!IMPORTANT]
> This feature is in public preview under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [2024-05-01-Preview REST API](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true) supports this feature.
@@ -22,15 +22,29 @@ The workflow includes model deployment steps. The model catalog includes embeddi

After the model is deployed, you can use it for [integrated vectorization](vector-search-integrated-vectorization.md) during indexing, or with the [AI Foundry vectorizer](vector-search-vectorizer-azure-machine-learning-ai-studio-catalog.md) for queries.

+## Prerequisites
+
++ Azure AI Search, any region and tier.
+
++ Azure AI Foundry and an [Azure AI Foundry project](/azure/ai-studio/how-to/create-projects).
+
+## Supported embedding models
+
+Integrated vectorization and the [Import and vectorize data wizard](search-import-data-portal.md) support the following embedding models:
+
++ For text embeddings: Cohere-embed-v3-english, Cohere-embed-v3-multilingual
+
++ For image embeddings: Facebook-DinoV2-Image-Embeddings-ViT-Base, Facebook-DinoV2-Image-Embeddings-ViT-Giant
+
## Deploy an embedding model from the Azure AI Foundry model catalog

-1. Open the [Azure AI Foundry model catalog](https://ai.azure.com/explore/models).
+1. Open the [Azure AI Foundry model catalog](https://ai.azure.com/explore/models). Create a project if you don't have one already.

1. Apply a filter to show just the embedding models. Under **Inference tasks**, select **Embeddings**:

:::image type="content" source="media\vector-search-integrated-vectorization-ai-studio\ai-studio-catalog-embeddings-filter.png" lightbox="media\vector-search-integrated-vectorization-ai-studio\ai-studio-catalog-embeddings-filter.png" alt-text="Screenshot of the Azure AI Foundry model catalog page highlighting how to filter by embeddings models.":::

-1. Select the model you would like to vectorize your content with. Then select **Deploy** and pick a deployment option.
+1. Select a supported model, then select **Deploy** and pick a deployment option.

:::image type="content" source="media\vector-search-integrated-vectorization-ai-studio\ai-studio-deploy-endpoint.png" lightbox="media\vector-search-integrated-vectorization-ai-studio\ai-studio-deploy-endpoint.png" alt-text="Screenshot of deploying an endpoint via the Azure AI Foundry model catalog.":::

@@ -56,12 +70,12 @@ When you deploy embedding models from the [Azure AI Foundry model catalog](https

This section describes the AML skill definition and index mappings. It includes sample payloads that are already configured to work with their corresponding deployed endpoints. For more technical details on how these payloads work, read about the [Skill context and input annotation language](cognitive-search-skill-annotation-language.md).

-### [**Text Input for "Inference" API**](#tab/inference-text)
+<!-- ### [**Text Input for "Inference" API**](#tab/inference-text)

This AML skill payload works with the following models from AI Foundry:

-+ OpenAI-CLIP-Image-Text-Embeddings-vit-base-patch32
-+ OpenAI-CLIP-Image-Text-Embeddings-ViT-Large-Patch14-336
++ Cohere-embed-v3-english
++ Cohere-embed-v3-multilingual

It assumes that you're chunking your content using the [Text Split skill](cognitive-search-skill-textsplit.md) and that the text to be vectorized is in the `/document/pages/*` path. If your text comes from a different path, update all references to the `/document/pages/*` path accordingly.

@@ -99,11 +113,11 @@ The URI and key are generated when you deploy the model from the catalog. For mo
}
]
}
-```
+``` -->

-### [**Image Input for "Inference" API**](#tab/inference-image)
+### [**Facebook embedding models**](#tab/inference-image)

-This AML skill payload works with the following models from AI Foundry:
+This AML skill payload works with the following image embedding models from AI Foundry:

+ Facebook-DinoV2-Image-Embeddings-ViT-Base
+ Facebook-DinoV2-Image-Embeddings-ViT-Giant
@@ -116,8 +130,8 @@ The URI and key are generated when you deploy the model from the catalog. For mo
{
"@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
"context": "/document/normalized_images/*",
-"uri": "<YOUR_MODEL_URL_HERE>",
-"key": "<YOUR_MODEL_HERE>",
+"uri": "https://myproject-1a1a-abcd.eastus.inference.ml.azure.com/score",
+"key": "bbbbbbbb-1c1c-2d2d-3e3e-444444444444",
"inputs": [
{
"name": "input_data",
@@ -146,27 +160,27 @@ The URI and key are generated when you deploy the model from the catalog. For mo
}
```

-### [**Cohere**](#tab/cohere)
+### [**Cohere embedding models**](#tab/cohere)

-This AML skill payload works with the following models from AI Foundry:
+This AML skill payload works with the following text embedding models from AI Foundry:

+ Cohere-embed-v3-english
+ Cohere-embed-v3-multilingual

-It assumes that you're chunking your content using the SplitSkill and therefore your text to be vectorized is in the `/document/pages/*` path. If your text comes from a different path, update all references to the `/document/pages/*` path according.
+It assumes that you're chunking your content using the Text Split skill and therefore your text to be vectorized is in the `/document/pages/*` path. If your text comes from a different path, update all references to the `/document/pages/*` path accordingly.
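
For reference, a chunking step that writes to `/document/pages/*` is typically a Text Split skill along the lines of the following sketch; the `maximumPageLength` value and the `/document/content` source are illustrative assumptions that depend on your data source:

```json
{
  "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
  "context": "/document",
  "textSplitMode": "pages",
  "maximumPageLength": 2000,
  "inputs": [
    { "name": "text", "source": "/document/content" }
  ],
  "outputs": [
    { "name": "textItems", "targetName": "pages" }
  ]
}
```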

You must add the `/v1/embed` path onto the end of the URL that you copied from your AI Foundry deployment. You might also change the values for the `input_type`, `truncate` and `embedding_types` inputs to better fit your use case. For more information on the available options, review the [Cohere Embed API reference](/azure/ai-studio/how-to/deploy-models-cohere-embed).

The URI and key are generated when you deploy the model from the catalog. For more information about these values, see [How to deploy Cohere Embed models with Azure AI Foundry](/azure/ai-studio/how-to/deploy-models-cohere-embed).

-Note that image URIs are not supported by this integration at this time.
+Note that image URIs aren't supported by this integration at this time.

```json
{
"@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
"context": "/document/pages/*",
-"uri": "<YOUR_MODEL_URL_HERE>/v1/embed",
-"key": "<YOUR_MODEL_KEY_HERE>",
+"uri": "https://Cohere-embed-v3-multilingual-hin.eastus.models.ai.azure.com/v1/embed",
+"key": "aaaaaaaa-0b0b-1c1c-2d2d-333333333333",
"inputs": [
{
"name": "texts",

articles/search/vector-search-vectorizer-azure-machine-learning-ai-studio-catalog.md

Lines changed: 8 additions & 6 deletions
@@ -27,7 +27,7 @@ Parameters are case-sensitive. Which parameters you choose to use depends on wha
| Parameter name | Description |
|--------------------|-------------|
| `uri` | (Required) The [URI of the AML online endpoint](../machine-learning/how-to-authenticate-online-endpoint.md) to which the _JSON_ payload is sent. Only the **https** URI scheme is allowed. |
-| `modelName` | (Required) The model ID from the AI Foundry model catalog that is deployed at the provided endpoint. Currently supported models are <ul><li>Facebook-DinoV2-Image-Embeddings-ViT-Base </li><li>Facebook-DinoV2-Image-Embeddings-ViT-Giant </li><li>Cohere-embed-v3-english </li><li>Cohere-embed-v3-multilingual</ul> |
+| `modelName` | (Required) The model ID from the AI Foundry model catalog that is deployed at the provided endpoint. Supported models are: <ul><li>Facebook-DinoV2-Image-Embeddings-ViT-Base </li><li>Facebook-DinoV2-Image-Embeddings-ViT-Giant </li><li>Cohere-embed-v3-english </li><li>Cohere-embed-v3-multilingual</ul> |
| `key` | (Required for [key authentication](#WhatParametersToUse)) The [key for the AML online endpoint](../machine-learning/how-to-authenticate-online-endpoint.md). |
| `resourceId` | (Required for [token authentication](#WhatParametersToUse)). The Azure Resource Manager resource ID of the AML online endpoint. It should be in the format subscriptions/{guid}/resourceGroups/{resource-group-name}/Microsoft.MachineLearningServices/workspaces/{workspace-name}/onlineendpoints/{endpoint_name}. |
| `region` | (Optional for [token authentication](#WhatParametersToUse)). The [region](https://azure.microsoft.com/global-infrastructure/regions/) the AML online endpoint is deployed in. Needed if the region is different from the region of the search service. |
@@ -49,7 +49,7 @@ Which authentication parameters are required depends on what authentication your

Which vector query types are supported by the AI Foundry model catalog vectorizer depends on the `modelName` that is configured.

-| `modelName` | Supports `text` query | Supports `imageUrl` query | Supports `imageBinary` query |
+| Embedding model | Supports `text` query | Supports `imageUrl` query | Supports `imageBinary` query |
|--------------------|-------------|-------------|-------------|
| Facebook-DinoV2-Image-Embeddings-ViT-Base | | X | X |
| Facebook-DinoV2-Image-Embeddings-ViT-Giant | | X | X |
@@ -69,16 +69,18 @@ The expected field dimensions for a field configured with an AI Foundry model ca

## Sample definition

+Suggested model names in the Azure AI Foundry model catalog consist of the base model plus a random three-letter suffix. The name of your model will be different from the one shown in this example.
+
```json
"vectorizers": [
{
-"name": "my-ai-studio-catalog-vectorizer",
+"name": "my-model-catalog-vectorizer",
"kind": "aml",
"amlParameters": {
-"uri": "https://my-aml-endpoint.eastus.inference.ml.azure.com/score",
-"key": "0000000000000000000000000000000000000",
+"uri": "https://Cohere-embed-v3-multilingual-hin.eastus.models.ai.azure.com",
+"key": "aaaaaaaa-0b0b-1c1c-2d2d-333333333333",
"timeout": "PT60S",
-"modelName": "OpenAI-CLIP-Image-Text-Embeddings-vit-base-patch3",
+"modelName": "Cohere-embed-v3-multilingual-hin",
"resourceId": null,
"region": null,
},
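
At query time, a vector field whose vector search profile references this vectorizer can be queried with plain text, and the search service calls the deployed model to vectorize the string. A minimal request body sketch, where the field name `contentVector` and the query text are hypothetical:

```json
{
  "vectorQueries": [
    {
      "kind": "text",
      "text": "historic hotels with ocean views",
      "fields": "contentVector",
      "k": 5
    }
  ]
}
```

The `fields` value must name a vector field whose profile is assigned the vectorizer defined in the sample above.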
