articles/search/cognitive-search-aml-skill.md
Lines changed: 26 additions & 25 deletions
Original file line number
Diff line number
Diff line change
@@ -10,56 +10,57 @@ ms.custom:
10
10
- ignite-2023
11
11
- build-2024
12
12
ms.topic: reference
13
-
ms.date: 05/08/2025
13
+
ms.date: 08/04/2025
14
14
---
15
15
16
16
# AML skill in an Azure AI Search enrichment pipeline
17
17
18
18
> [!IMPORTANT]
19
-
> Support for indexer connections to the Azure AI Foundry model catalog is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). Preview REST APIs support this skill.
19
+
> Support for indexer connections to the Azure AI Foundry model catalog is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). Preview REST APIs support this capability.
20
20
21
-
The **AML** skill allows you to extend AI enrichment with a custom [Azure Machine Learning (AML)](../machine-learning/overview-what-is-azure-machine-learning.md) model or deployed base embedding model in the Azure AI Foundry model catalog. Once an AML model is [trained and deployed](../machine-learning/concept-azure-machine-learning-architecture.md#workspace), an **AML** skill integrates it into a skillset.
21
+
Use the **AML** skill to extend AI enrichment with a deployed base embedding model from the [Azure AI Foundry model catalog](vector-search-integrated-vectorization-ai-studio.md) or a custom [Azure Machine Learning](../machine-learning/overview-what-is-azure-machine-learning.md) (AML) model. After an AML model is [trained and deployed](../machine-learning/concept-azure-machine-learning-architecture.md#workspace), the **AML** skill integrates the model into a skillset.
22
22
23
23
## AML skill usage
24
24
25
-
Like other built-in skills, a custom **AML** skill has inputs and outputs. The inputs are sent to a deployed AML online endpoint as a JSON object. The output of the endpoint must be a JSON payload in the response, along with a success status code. Your data is processed in the [Geo](https://azure.microsoft.com/explore/global-infrastructure/data-residency/) where your model is deployed. The response is expected to provide the outputs specified by your **AML** skill definition. Any other response is considered an error and no enrichments are performed.
25
+
Like other built-in skills, a custom **AML** skill has inputs and outputs. The inputs are sent to a deployed AML online endpoint as a JSON object. The output of the endpoint must be a JSON payload in the response, along with a success status code. Your data is processed in the [Geo](https://azure.microsoft.com/explore/global-infrastructure/data-residency/) where your model is deployed. The response should provide the outputs specified by your **AML** skill definition. Any other response is considered an error, and no enrichments are performed.
26
26
27
27
> [!NOTE]
28
-
> The indexer will retry twice for certain standard HTTP status codes returned from the AML online endpoint. These HTTP status codes are:
28
+
> The indexer retries two times for certain standard HTTP status codes returned from the AML online endpoint. These HTTP status codes are:
29
+
>
29
30
> * `503 Service Unavailable`
30
31
> * `429 Too Many Requests`
31
32
32
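The retry behavior described in the note can be pictured with a short sketch. This is not the indexer's implementation: `call_with_retries` and the `send` callable are hypothetical names, and "retries two times" is read here as one initial attempt plus at most two retries.

```python
from typing import Callable, Tuple

# Status codes the note lists as retryable.
RETRYABLE_STATUS_CODES = {429, 503}
# Assumption: two retries after the first attempt (three attempts in total).
MAX_RETRIES = 2


def call_with_retries(send: Callable[[], Tuple[int, str]]) -> Tuple[int, str, int]:
    """Call `send` and repeat it while a retryable status code comes back.

    `send` is a hypothetical callable that performs one request and
    returns (status_code, body). The result is (status_code, body, attempts).
    """
    attempts = 0
    while True:
        status, body = send()
        attempts += 1
        # Stop on any non-retryable status, or once the retry budget is spent.
        if status not in RETRYABLE_STATUS_CODES or attempts > MAX_RETRIES:
            return status, body, attempts
```

A real client would normally also wait between attempts; the delay is left out because the note doesn't describe one.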
-
The **AML** skill can be called with the 2024-07-01 stable API version or equivalent Azure SDK, or the 2024-05-01-preview API version for connections to the model catalog in Azure AI Foundry portal.
33
+
You can call the **AML** skill with the 2024-07-01 stable API version or an equivalent Azure SDK. For connections to the model catalog in the Azure AI Foundry portal, use the 2024-05-01-preview API version or later.
33
34
34
35
## AML skill for models in Azure AI Foundry
35
36
36
-
Starting in 2024-05-01-preview REST API and in the Azure portal (which also targets the 2024-05-01-preview), Azure AI Search provides the [Azure AI Foundry model catalog vectorizer](vector-search-vectorizer-azure-machine-learning-ai-studio-catalog.md) for query-time connections to the model catalog in Azure AI Foundry portal. If you want to use that vectorizer for queries, an **AML** skill is the *indexing counterpart* for generating embeddings using a model in the Azure AI Foundry model catalog.
37
+
Starting in the 2024-05-01-preview REST API and the Azure portal, which also targets the 2024-05-01-preview, Azure AI Search provides the [Azure AI Foundry model catalog vectorizer](vector-search-vectorizer-azure-machine-learning-ai-studio-catalog.md) for query-time connections to the model catalog in the Azure AI Foundry portal. If you want to use that vectorizer for queries, the **AML** skill is the *indexing counterpart* for generating embeddings using a model from the model catalog.
37
38
38
-
During indexing, the **AML** skill can connect to the model catalog to generate vectors for the index. At query time, queries can use a vectorizer to connect to the same model to vectorize text strings for a vector query. In this workflow, the **AML** skill and the model catalog vectorizer should be used together so that you're using the same embedding model for both indexing and queries. See [Use embedding models from Azure AI Foundry model catalog](vector-search-integrated-vectorization-ai-studio.md) for details and for a list of the [supported embedding models](vector-search-integrated-vectorization-ai-studio.md#supported-embedding-models).
39
+
During indexing, the **AML** skill can connect to the model catalog to generate vectors for the index. At query time, queries can use a vectorizer to connect to the same model to vectorize text strings for a vector query. In this workflow, you should use the **AML** skill and the model catalog vectorizer together so that the same embedding model is used for indexing and queries. For more information, including a list of supported embedding models, see [Use embedding models from Azure AI Foundry model catalog](vector-search-integrated-vectorization-ai-studio.md).
39
40
40
-
We recommend using the [**Import and vectorize data wizard**](search-get-started-portal-import-vectors.md) to generate a skillset that includes an AML skill for deployed embedding models on Azure AI Foundry. AML skill definition for inputs, outputs, and mappings are generated by the wizard, which gives you an easy way to test a model before writing any code.
41
+
We recommend using the [**Import and vectorize data wizard**](search-get-started-portal-import-vectors.md) to generate a skillset that includes an AML skill for deployed embedding models in Azure AI Foundry. The wizard generates the AML skill definition for inputs, outputs, and mappings, providing an easy way to test a model before writing any code.
41
42
42
43
## Prerequisites
43
44
44
-
* An [AML workspace](../machine-learning/concept-workspace.md) for a custom model that you create, or a project in Azure AI Foundry if an embedding model is deployed from the catalog.
45
-
* An [Online endpoints (real-time)](../machine-learning/concept-endpoints-online.md) in this workspace for a custom model, or the model endpoint for embedding models deployed from the catalog.
45
+
* An [Azure AI Foundry project](/azure/ai-foundry/how-to/create-projects?tabs=ai-foundry&pivots=fdp-project) for an embedding model deployed from the catalog, or an [AML workspace](../machine-learning/concept-workspace.md) for a custom model that you create.
46
+
* The model endpoint for an embedding model deployed from the catalog, or an [online endpoint (real-time)](../machine-learning/concept-endpoints-online.md) of your AML workspace for a custom model.
46
47
47
48
## @odata.type
48
49
49
50
Microsoft.Skills.Custom.AmlSkill
50
51
51
52
## Skill parameters
52
53
53
-
Parameters are case-sensitive. Which parameters you choose to use depends on what [authentication your AML online endpoint requires, if any](#WhatSkillParametersToUse)
54
+
Parameters are case-sensitive. The parameters you use depend on what [authentication your AML online endpoint requires](#WhatSkillParametersToUse), if any.
54
55
55
56
| Parameter name | Description |
56
57
|--------------------|-------------|
57
58
|`uri`| (Required for [key authentication](#WhatSkillParametersToUse)) The [scoring URI of the AML online endpoint](../machine-learning/how-to-authenticate-online-endpoint.md) to which the _JSON_ payload is sent. Only the **https** URI scheme is allowed. For embedding models in the Azure AI Foundry model catalog, this is the target URI.|
58
59
|`key`| (Required for [key authentication](#WhatSkillParametersToUse)) The [key for the AML online endpoint](../machine-learning/how-to-authenticate-online-endpoint.md) or the |
59
-
|`resourceId`| (Required for [token authentication](#WhatSkillParametersToUse)). The Azure Resource Manager resource ID of the AML online endpoint. It should be in the format `subscriptions/{guid}/resourceGroups/{resource-group-name}/Microsoft.MachineLearningServices/workspaces/{workspace-name}/onlineendpoints/{endpoint_name}`. |
60
+
|`resourceId`| (Required for [token authentication](#WhatSkillParametersToUse)). The Azure Resource Manager resource ID of the AML online endpoint. Use the format `subscriptions/{guid}/resourceGroups/{resource-group-name}/Microsoft.MachineLearningServices/workspaces/{workspace-name}/onlineendpoints/{endpoint_name}`. |
60
61
|`region`| (Optional for [token authentication](#WhatSkillParametersToUse)). The [region](https://azure.microsoft.com/global-infrastructure/regions/) the AML online endpoint is deployed in. |
61
-
|`timeout`| (Optional) When specified, indicates the timeout for the http client making the API call. It must be formatted as an XSD "dayTimeDuration" value (a restricted subset of an [ISO 8601 duration](https://www.w3.org/TR/xmlschema11-2/#dayTimeDuration) value). For example, `PT60S` for 60 seconds. If not set, a default value of 30 seconds is chosen. The timeout can be set to a maximum of 230 seconds and a minimum of 1 second. |
62
-
| `degreeOfParallelism` | (Optional) When specified, indicates the number of calls the indexer makes in parallel to the endpoint you have provided. You can decrease this value if your endpoint is failing under too high of a request load. You can raise it if your endpoint is able to accept more requests and you would like an increase in the performance of the indexer. If not set, a default value of 5 is used. The degreeOfParallelism can be set to a maximum of 10 and a minimum of 1.
62
+
|`timeout`| (Optional) When specified, indicates the timeout for the http client making the API call. It must be formatted as an XSD "dayTimeDuration" value, which is a restricted subset of an [ISO 8601 duration](https://www.w3.org/TR/xmlschema11-2/#dayTimeDuration) value. For example, `PT60S` for 60 seconds. If not set, a default value of 30 seconds is chosen. You can set the timeout to a minimum of 1 second and a maximum of 230 seconds. |
63
+
|`degreeOfParallelism`| (Optional) When specified, indicates the number of calls the indexer makes in parallel to the endpoint you provide. You can decrease this value if your endpoint is failing under too high of a request load. You can raise it if your endpoint is able to accept more requests and you would like an increase in the performance of the indexer. If not set, a default value of 5 is used. You can set the degreeOfParallelism to a minimum of 1 and a maximum of 10. |
63
64
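The `timeout` rules in the table (an XSD "dayTimeDuration" value between 1 and 230 seconds) can be checked before a skillset is submitted. The following is a minimal sketch under stated assumptions: `timeout_seconds` is a hypothetical helper, not part of any Azure SDK, and the pattern covers only non-negative durations built from days, hours, minutes, and seconds.

```python
import re

# Simplified dayTimeDuration pattern: PnDTnHnMnS, every component optional.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)

MIN_TIMEOUT_SECONDS = 1
MAX_TIMEOUT_SECONDS = 230


def timeout_seconds(value: str) -> float:
    """Convert a duration such as 'PT60S' to seconds and check the limits."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"{value!r} is not a dayTimeDuration value")
    parts = match.groupdict(default="0")
    total = (
        int(parts["days"]) * 86400
        + int(parts["hours"]) * 3600
        + int(parts["minutes"]) * 60
        + float(parts["seconds"])
    )
    # Empty forms such as 'P' or 'PT' add up to 0 and fail this check.
    if not MIN_TIMEOUT_SECONDS <= total <= MAX_TIMEOUT_SECONDS:
        raise ValueError(f"{value!r} is outside the 1 to 230 second range")
    return total
```

For example, `timeout_seconds("PT60S")` returns `60.0`, while `timeout_seconds("PT4M")` raises `ValueError` because 240 seconds exceeds the maximum.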
64
65
<a name="WhatSkillParametersToUse"></a>
65
66
@@ -69,11 +70,11 @@ AML online endpoints provide two authentication options:
69
70
70
71
* [Key-based authentication](../machine-learning/how-to-authenticate-online-endpoint.md). A static key is provided to authenticate scoring requests from AML skills. Set the `uri` and `key` parameters for this connection.
71
72
72
-
* [Token-Based Authentication](../machine-learning/how-to-authenticate-online-endpoint.md), where the AML online endpoint is [deployed using token-based authentication](../machine-learning/how-to-authenticate-online-endpoint.md). The Azure AI Search service's [managed identity](/azure/active-directory/managed-identities-azure-resources/overview) must be enabled and have a role assignment on workspace. The AML skill then uses the service's managed identity to authenticate against the AML online endpoint, with no static keys required. The search service identity must be an **Owner** or **Contributor**. Set the `resourceId` parameter, and if the search service is in a different region from the AML workspace, set the `region` parameter.
73
+
* [Token-based authentication](../machine-learning/how-to-authenticate-online-endpoint.md), where the AML online endpoint is deployed using token-based authentication. The Azure AI Search service must have a [managed identity](/azure/active-directory/managed-identities-azure-resources/overview) and a role assignment on the AML workspace. The AML skill then uses the service's managed identity to authenticate against the AML online endpoint, with no static keys required. The search service identity must have the **Owner** or **Contributor** role. Set the `resourceId` parameter, and if the search service is in a different region from the AML workspace, set the `region` parameter.
73
74
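The two authentication options map to different parameter sets, which the sketch below assembles into a skill definition. It is an illustration only: `build_aml_skill`, the input and output names, and the example values are hypothetical, and treating the two options as mutually exclusive is a simplification made for this sketch. The `@odata.type` value is written with the leading `#` used in skillset JSON.

```python
from typing import Any, Dict, Optional


def build_aml_skill(
    *,
    uri: Optional[str] = None,
    key: Optional[str] = None,
    resource_id: Optional[str] = None,
    region: Optional[str] = None,
) -> Dict[str, Any]:
    """Return an AML skill definition for key-based or token-based authentication."""
    skill: Dict[str, Any] = {
        "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
        "context": "/document",
        # Hypothetical input and output names; use the ones your model expects.
        "inputs": [{"name": "text", "source": "/document/content"}],
        "outputs": [{"name": "result", "targetName": "amlResult"}],
    }
    if uri and key and not resource_id:
        # Key-based authentication: scoring URI plus static key.
        if not uri.startswith("https://"):
            raise ValueError("Only the https URI scheme is allowed for `uri`.")
        skill["uri"] = uri
        skill["key"] = key
    elif resource_id and not (uri or key):
        # Token-based authentication: the search service's managed identity is used.
        skill["resourceId"] = resource_id
        if region:
            skill["region"] = region
    else:
        raise ValueError("Set `uri` and `key`, or set `resourceId` (optionally with `region`).")
    return skill
```

Calling `build_aml_skill(uri="https://example.invalid/score", key="<endpoint-key>")` yields a key-based definition, and passing `resource_id` (with `region` when the search service and workspace are in different regions) yields a token-based one.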
74
75
## Skill inputs
75
76
76
-
Skill inputs are a node of the [enriched document](cognitive-search-working-with-skillsets.md#enrichment-tree) that's created during *document cracking*. For example, it might be the root document, a normalized image, or the content of a blob. There are no predefined inputs for this skill. For inputs, you should specify one or more nodes that are populated at the time of the AML skill's execution.
77
+
Skill inputs are a node of the [enriched document](cognitive-search-working-with-skillsets.md#enrichment-tree) created during *document cracking*. For example, it might be the root document, a normalized image, or the content of a blob. There are no predefined inputs for this skill. For inputs, you should specify one or more nodes that are populated at the time of the AML skill's execution.
77
78
78
79
## Skill outputs
79
80
@@ -103,7 +104,7 @@ Skill outputs are new nodes of an enriched document created by the skill. There
103
104
104
105
## Sample input JSON structure
105
106
106
-
This _JSON_ structure represents the payload that is sent to your AML online endpoint. The top-level fields of the structure correspond to the "names" specified in the `inputs` section of the skill definition. The values of those fields are from the `source` of those fields (which could be from a field in the document, or potentially from another skill)
107
+
This _JSON_ structure represents the payload sent to your AML online endpoint. The top-level fields of the structure correspond to the "names" specified in the `inputs` section of the skill definition. The values of those fields are from the "sources" of those fields, which could be from a field in the document or another skill.
107
108
108
109
```json
109
110
{
@@ -113,7 +114,7 @@ This _JSON_ structure represents the payload that is sent to your AML online end
113
114
114
115
## Sample output JSON structure
115
116
116
-
The output corresponds to the response returned from your AML online endpoint. The AML online endpoint should only return a _JSON_ payload (verified by looking at the `Content-Type` response header) and should be an object where the fields are enrichments matching the "names" in the `output` and whose value is considered the enrichment.
117
+
The output corresponds to the response from your AML online endpoint. The AML online endpoint should only return a _JSON_ payload (verified by looking at the `Content-Type` response header) and should be an object where the fields are enrichments matching the "names" in the `output` and whose value is considered the enrichment.
117
118
118
119
```json
119
120
{
@@ -167,16 +168,16 @@ The output corresponds to the response returned from your AML online endpoint. T
167
168
168
169
## Error cases
169
170
170
-
In addition to your AML being unavailable or sending out nonsuccessful status codes, the following are considered erroneous cases:
171
+
In addition to your AML being unavailable or sending nonsuccessful status codes, the following cases are considered errors:
171
172
172
-
* The AML online endpoint returns a success status code, but the response indicates that it isn't `application/json`, then the response is considered invalid and no enrichments are performed.
173
+
* The AML online endpoint returns a success status code, but the response indicates that it isn't `application/json`. The response is thus invalid, and no enrichments are performed.
173
174
174
175
* The AML online endpoint returns invalid JSON.
175
176
176
-
For cases when the AML online endpoint is unavailable or returns an HTTP error, a friendly error with any available details about the HTTP error is added to the indexer execution history.
177
+
If the AML online endpoint is unavailable or returns an HTTP error, a friendly error with any available details about the HTTP error is added to the indexer execution history.
177
178
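The error cases above can be mirrored in a test harness for your own endpoint. The sketch below is an assumption-laden illustration, not the service's validation logic: `parse_endpoint_response` is a hypothetical helper, and returning only the fields named in the skill's outputs is a choice made here for clarity.

```python
import json
from typing import Any, Dict, Iterable


def parse_endpoint_response(
    status_code: int,
    content_type: str,
    body: str,
    expected_outputs: Iterable[str],
) -> Dict[str, Any]:
    """Return the enrichments in a response, or raise ValueError for an error case."""
    if not 200 <= status_code < 300:
        raise ValueError(f"Endpoint returned non-success status {status_code}")
    # Ignore parameters such as '; charset=utf-8' when comparing media types.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/json":
        raise ValueError(f"Response is {media_type!r}, not application/json")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError("Endpoint returned invalid JSON") from exc
    if not isinstance(payload, dict):
        raise ValueError("Response must be a JSON object")
    # Keep only the fields that match the output names in the skill definition.
    return {name: payload[name] for name in expected_outputs if name in payload}
```

Running such a check against sample responses before indexing can surface a wrong `Content-Type` header or malformed JSON earlier than the indexer execution history would.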
178
179
## See also
179
180
180
-
+ [How to define a skillset](cognitive-search-defining-skillset.md)