articles/search/cognitive-search-concept-intro.md
Lines changed: 13 additions & 23 deletions
Original file line number
Diff line number
Diff line change
@@ -8,22 +8,22 @@ author: HeidiSteen
8
8
ms.author: heidist
9
9
ms.service: cognitive-search
10
10
ms.topic: conceptual
11
-
ms.date: 07/01/2022
12
-
ms.custom: references_regions
11
+
ms.date: 07/19/2023
12
+
13
13
---
14
14
# AI enrichment in Azure Cognitive Search
15
15
16
-
*AI enrichment* is the application of machine learning models over content that isn't full text searchable in its raw form. Through enrichment, analysis and inference are used to create searchable content and structure where none previously existed.
16
+
In Cognitive Search, *AI enrichment* is the application of machine learning models over content that isn't full text searchable in its raw form. Through enrichment, analysis and inference are used to create searchable content and structure where none previously existed.
17
17
18
18
Because Azure Cognitive Search is a full text search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios:
19
19
20
-
+ Translation and language detection for multi-lingual search
21
-
+ Entity recognition extracts people, places, and other entities from large chunks of text
22
-
+ Key phrase extraction identifies and then outputs important terms
23
-
+ Optical Character Recognition (OCR) recognizes printed and handwritten text in binary files
24
-
+ Image analysis describes image content and outputs the descriptions as searchable text fields
20
+
+ Apply translation and language detection for multi-lingual search
21
+
+ Apply entity recognition to extract people's names, places, and other entities from large chunks of text
22
+
+ Apply key phrase extraction to identify and output important terms
23
+
+ Apply Optical Character Recognition (OCR) to recognize printed and handwritten text in binary files
24
+
+ Apply image analysis to describe image content, and output the descriptions as searchable text fields
25
25
26
-
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md). An enrichment pipeline has all of the components of an indexer pipeline (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
26
+
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md) that connects to Azure data sources. An enrichment pipeline has all of the components of an indexer pipeline (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
27
27
28
28
The following diagram shows the progression of AI enrichment:
29
29
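The relationship between a skillset and its atomic enrichment steps can be sketched as follows. This is a minimal illustration, not the definition used by any sample: the skillset name and the `unresolved_inputs` helper are hypothetical, and the check assumes skills run in list order, which is a simplification (the service orders skills by their input and output dependencies).

```python
# Hypothetical skillset: language detection feeds entity recognition
# and key phrase extraction. Names and target fields are assumptions.
skillset = {
    "name": "example-skillset",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
            "context": "/document",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "languageCode", "targetName": "languageCode"}],
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
            "context": "/document",
            "categories": ["Person", "Location", "Organization"],
            "inputs": [
                {"name": "text", "source": "/document/content"},
                {"name": "languageCode", "source": "/document/languageCode"},
            ],
            "outputs": [
                {"name": "persons", "targetName": "people"},
                {"name": "locations", "targetName": "locations"},
                {"name": "organizations", "targetName": "organizations"},
            ],
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
            "context": "/document",
            "inputs": [
                {"name": "text", "source": "/document/content"},
                {"name": "languageCode", "source": "/document/languageCode"},
            ],
            "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
        },
    ],
}


def unresolved_inputs(skillset, known=("/document/content",)):
    """List input sources that no earlier skill (or the raw document) provides."""
    available = set(known)
    missing = []
    for skill in skillset["skills"]:
        for skill_input in skill["inputs"]:
            if skill_input["source"] not in available:
                missing.append(skill_input["source"])
        # Each output becomes a new node under the skill's context.
        for output in skill["outputs"]:
            available.add(f'{skill["context"]}/{output["targetName"]}')
    return missing


print(unresolved_inputs(skillset))  # → []
```

Each skill reads nodes from the enriched document and writes new ones, so a skill placed later can consume what an earlier skill produced.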
@@ -43,8 +43,6 @@ The following diagram shows the progression of AI enrichment:
43
43
44
44
**Exploration** is the last step. Output is always a [search index](search-what-is-an-index.md) that you can query from a client app. Output can optionally be a [knowledge store](knowledge-store-concept-intro.md) consisting of blobs and tables in Azure Storage that are accessed through data exploration tools or downstream processes. If you're creating a knowledge store, [projections](knowledge-store-projection-overview.md) determine the data path for enriched content. The same enriched content can appear in both indexes and knowledge stores.
Enrichment is useful if raw content is unstructured text, image content, or content that needs language detection and translation. Applying AI through the [**built-in skills**](cognitive-search-predefined-skills.md) can unlock this content for full text search and data science applications.
@@ -54,7 +52,7 @@ Open-source, third-party, or first-party code can be integrated into the pipelin
54
52
55
53
### Use-cases for built-in skills
56
54
57
-
Built-in skills are based on the Azure AI services APIs: [Azure AI Vision](../ai-services/computer-vision/index.yml) and [Language Service](../ai-services/language-service/overview.md). Unless your content input is small, expect to [attach a billable Azure AI services resource](cognitive-search-attach-cognitive-services.md) to run larger workloads.
55
+
Built-in skills are based on the Azure AI services APIs: [Azure AI Computer Vision](../ai-services/computer-vision/index.yml) and [Language Service](../ai-services/language-service/overview.md). Unless your content input is small, expect to [attach a billable Azure AI services resource](cognitive-search-attach-cognitive-services.md) to run larger workloads.
58
56
59
57
A [skillset](cognitive-search-defining-skillset.md) that's assembled using built-in skills is well suited for the following application scenarios:
60
58
@@ -66,11 +64,7 @@ A [skillset](cognitive-search-defining-skillset.md) that's assembled using built
66
64
67
65
### Use-cases for custom skills
68
66
69
-
[**Custom skills**](cognitive-search-create-custom-skill-example.md) execute external code that you provide. Custom skills can support more complex scenarios, such as recognizing forms, or custom entity detection using a model that you provide and wrap in the [custom skill web interface](cognitive-search-custom-skill-interface.md). Several examples of custom skills include:
[**Custom skills**](cognitive-search-create-custom-skill-example.md) execute external code that you provide and wrap in the [custom skill web interface](cognitive-search-custom-skill-interface.md). Several examples of custom skills can be found in the [azure-search-power-skills](https://github.com/Azure-Samples/azure-search-power-skills/blob/main/README.md) GitHub repository.
74
68
75
69
Custom skills aren’t always complex. For example, if you have an existing package that provides pattern matching or a document classification model, you can wrap it in a custom skill.
76
70
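As a rough sketch of wrapping existing pattern-matching code in a custom skill, the function below accepts a request body in the custom skill web interface shape (a `values` array of records with `recordId` and `data`) and returns a response in the same shape. The `TRIAL_ID` pattern, the `text` input name, and the `trialIds` output name are hypothetical, and hosting the function behind a web endpoint is left out.

```python
import json
import re

# Hypothetical pattern: identifiers such as "NCT01234567".
TRIAL_ID = re.compile(r"\bNCT\d{8}\b")


def run_custom_skill(request_body: str) -> str:
    """Apply the pattern to each record and echo back its recordId."""
    request = json.loads(request_body)
    results = []
    for record in request["values"]:
        text = record.get("data", {}).get("text")
        if isinstance(text, str):
            results.append({
                "recordId": record["recordId"],
                "data": {"trialIds": TRIAL_ID.findall(text)},
                "errors": [],
                "warnings": [],
            })
        else:
            # Report the problem per record instead of failing the whole batch.
            results.append({
                "recordId": record["recordId"],
                "data": {},
                "errors": [{"message": "Input 'text' is missing or not a string."}],
                "warnings": [],
            })
    return json.dumps({"values": results})


sample = json.dumps({"values": [
    {"recordId": "1", "data": {"text": "Results for NCT01234567 are pending."}},
    {"recordId": "2", "data": {}},
]})
print(run_custom_skill(sample))
```

Returning one result per `recordId` matters because the indexer matches outputs back to inputs by that identifier.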
@@ -104,11 +98,7 @@ In Azure Storage, a [knowledge store](knowledge-store-concept-intro.md) can assu
104
98
105
99
## Availability and pricing
106
100
107
-
Enrichment is available in regions that have Azure AI services. You can check the availability of enrichment on the [Azure products available by region](https://azure.microsoft.com/global-infrastructure/services/?products=search) page. Enrichment is available in all regions except:
108
-
109
-
+ Australia Southeast
110
-
+ China North 2
111
-
+ Germany West Central
101
+
Enrichment is available in regions that have Azure AI services. You can check the availability of enrichment on the [Azure products available by region](https://azure.microsoft.com/global-infrastructure/services/?products=search) page.
112
102
113
103
Billing follows a pay-as-you-go pricing model. The costs of using built-in skills are passed on when a multi-region Azure AI services key is specified in the skillset. There are also costs associated with image extraction, as metered by Cognitive Search. Text extraction and utility skills, however, aren't billable. For more information, see [How you're charged for Azure Cognitive Search](search-sku-manage-costs.md#how-youre-charged-for-azure-cognitive-search).
114
104
@@ -141,4 +131,4 @@ To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) b
articles/search/cognitive-search-tutorial-debug-sessions.md
Lines changed: 38 additions & 25 deletions
Original file line number
Diff line number
Diff line change
@@ -1,23 +1,23 @@
1
1
---
2
2
title: 'Tutorial: Debug skillsets'
3
3
titleSuffix: Azure Cognitive Search
4
-
description: Debug sessions is an Azure portal tool used to find, diagnose, and repair problems in a skillset.
4
+
description: Debug Sessions is an Azure portal tool used to find, diagnose, and repair problems in a skillset.
5
5
author: HeidiSteen
6
6
ms.author: heidist
7
7
manager: nitinme
8
8
9
9
ms.service: cognitive-search
10
10
ms.topic: tutorial
11
-
ms.date: 06/15/2022
11
+
ms.date: 07/20/2023
12
12
---
13
13
14
14
# Tutorial: Debug a skillset using Debug Sessions
15
15
16
16
Skillsets coordinate a series of actions that analyze or transform content, where the output of one skill becomes the input of another. When inputs depend on outputs, mistakes in skillset definitions and field associations can result in missed operations and data.
17
17
18
-
**Debug sessions** in the Azure portal provides a holistic visualization of a skillset. Using this tool, you can drill down to specific steps to easily see where an action might be falling down.
18
+
**Debug sessions** is a tool in the Azure portal that provides a holistic visualization of a skillset. Using this tool, you can drill down to specific steps to easily see where an action might be falling down.
19
19
20
-
In this article, you'll use **Debug sessions** to find and fix missing inputs and outputs. The tutorial is all-inclusive. It provides sample data, a Postman collection that creates objects, and instructions for debugging problems in the skillset.
20
+
In this article, use **Debug sessions** to find and fix missing inputs and outputs. The tutorial is all-inclusive. It provides sample data, a Postman collection that creates objects, and instructions for debugging problems in the skillset.
21
21
22
22
## Prerequisites
23
23
@@ -68,24 +68,26 @@ All requests require an api-key on every request sent to your service. Having a
68
68
69
69
## Create data source, skillset, index, and indexer
70
70
71
-
In this section, Postman and a provided collection are used to create the Cognitive Search data source, skillset, index, and indexer. If you're unfamiliar with Postman, see [this quickstart](search-get-started-rest.md).
71
+
In this section, import a Postman collection containing a "buggy" workflow that you fix in this tutorial.
72
72
73
-
You will need the [Postman collection](https://github.com/Azure-Samples/azure-search-postman-samples/tree/master/Debug-sessions) created for this tutorial to complete this task.
73
+
1. Start Postman and import the [DebugSessions.postman_collection.json](https://github.com/Azure-Samples/azure-search-postman-samples/tree/master/Debug-sessions) collection. If you're unfamiliar with Postman, see [this quickstart](search-get-started-rest.md).
74
74
75
-
1. Start Postman and import the "DebugSessions.postman_collection.json" collection. Under **Files** > **New**, select the collection.
75
+
1. Under **Files** > **New**, select the collection.
76
76
77
77
1. After the collection is imported, expand the actions list (...).
78
78
79
-
1. Select **Edit** to set variables used in each request, and then **Save**.
79
+
1. Select **Edit** to set variables used in each request.
80
80
81
81
| Current value | Description |
82
82
|---------------|-------------|
83
-
| searchService | The name of your search service (for example, if the endpoint is `https://mydemo.search.windows.net`, then the service name is "mydemo". |
83
+
| searchService | The name of your search service (for example, if the endpoint is `https://mydemo.search.windows.net`, then the service name is `mydemo`). |
84
84
| apiKey | The primary or secondary key obtained from the **Keys** page of your search service. |
85
85
| storageConnectionString | The connection string obtained from the **Access Keys** page of your Azure Storage account. |
86
86
| containerName | The name of the container you created for the sample data. |
87
87
88
-
1. Verify that the collection you imported contains four REST calls, used to create objects in this tutorial.
88
+
1. **Save** your changes. The requests fail unless you save the variables.
89
+
90
+
1. You should see four REST calls in the collection.
89
91
90
92
+ CreateDataSource adds `clinical-trials-ds`
91
93
+ CreateSkillset adds `clinical-trials-ss`
@@ -122,13 +124,15 @@ Another way to investigate errors and warnings is through the Azure portal.
122
124
123
125
## Start your debug session
124
126
125
-
1. From the search service **Overview** page, click the **Debug sessions** tab.
127
+
1. From the search service left-navigation pane, under **Search management**, select **Debug sessions**.
126
128
127
-
1. Select **+ New Debug Session**.
129
+
1. Select **+ Add Debug Session**.
128
130
129
131
1. Give the session a name.
130
132
131
-
1. Connect the session to your storage account.
133
+
1. Connect the session to your storage account. Create a container named "debug-sessions". You can use this container repeatedly to store all of your debug session data.
134
+
135
+
1. If you configured a trusted connection between search and storage, select the user-managed identity or system identity for the connection. Otherwise, use the default (None).
132
136
133
137
1. In Indexer template, provide the indexer name. The indexer has references to the data source, the skillset, and index.
134
138
@@ -152,24 +156,24 @@ Notice that the **Errors/Warnings** tab will provide a much smaller list than th
152
156
153
157
Select **Errors/Warnings** to review the notifications. You should see four:
154
158
155
-
+ "Could not execute skill because one or more skill input was invalid. Required skill input is missing. Name: 'text', Source: '/document/content'."
159
+
+ "Could not execute skill because one or more skill inputs were invalid. Required skill input is missing. Name: 'text', Source: '/document/content'."
156
160
157
161
+ "Could not map output field 'locations' to search index. Check the 'outputFieldMappings' property of your indexer.
158
162
Missing value '/document/merged_content/locations'."
159
163
160
164
+ "Could not map output field 'organizations' to search index. Check the 'outputFieldMappings' property of your indexer.
161
165
Missing value '/document/merged_content/organizations'."
162
166
163
-
+ "Skill executed but may have unexpected results because one or more skill input was invalid.
167
+
+ "Skill executed but may have unexpected results because one or more skill inputs were invalid.
164
168
Optional skill input is missing. Name: 'languageCode', Source: '/document/languageCode'. Expression language parsing issues: Missing value '/document/languageCode'."
165
169
166
-
Many skills have a "languageCode" parameter. By inspecting the operation, you can see that this language code input is missing from the `EntityRecognitionSkillV3.#1`, which is the same Entity Recognition skill that is having trouble with 'locations' and 'organizations' output.
170
+
Many skills have a "languageCode" parameter. By inspecting the operation, you can see that this language code input is missing from the `EntityRecognitionSkill.#1`, which is the same Entity Recognition skill that is having trouble with 'locations' and 'organizations' output.
167
171
168
172
Because all four notifications are about this skill, your next step is to debug this skill. If possible, start by solving input issues first before moving on to output issues.
169
173
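To see why a skill reports a missing input, it can help to think of each `source` as a path into the enriched document tree. The sketch below is a simplified model, not the portal's implementation: the `resolve` and `missing_inputs` helpers and the sample document are hypothetical, and the sample assumes the document text lives under `merged_content` rather than `content`.

```python
def resolve(document: dict, path: str):
    """Return the node at an enrichment path like '/document/merged_content', or None."""
    parts = [part for part in path.split("/") if part]
    if not parts or parts[0] != "document":
        return None
    node = document
    for part in parts[1:]:
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_inputs(document: dict, inputs: list) -> list:
    """Names of skill inputs whose source path has no value in the document."""
    return [i["name"] for i in inputs if resolve(document, i["source"]) is None]


# Hypothetical enriched document with no 'content' or 'languageCode' node.
enriched = {"merged_content": "Trial conducted in Seattle by Contoso."}

skill_inputs = [
    {"name": "text", "source": "/document/content"},
    {"name": "languageCode", "source": "/document/languageCode"},
]
print(missing_inputs(enriched, skill_inputs))  # → ['text', 'languageCode']

# Pointing 'text' at a node that exists clears the first problem only.
skill_inputs[0]["source"] = "/document/merged_content"
print(missing_inputs(enriched, skill_inputs))  # → ['languageCode']
```

In this model, a required input with no value blocks the skill, which in turn leaves its outputs (such as 'locations' and 'organizations') unavailable for output field mappings.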
170
174
## Fix missing skill input values
171
175
172
-
In the **Errors/Warnings** tab, there are two missing inputs for an operation labeled `EntityRecognitionSkillV3.#1`. The detail of the first error explains that a required input for 'text' is missing. The second indicates a problem with an input value "/document/languageCode".
176
+
In the **Errors/Warnings** tab, there are two missing inputs for an operation labeled `EntityRecognitionSkill.#1`. The detail of the first error explains that a required input for 'text' is missing. The second indicates a problem with an input value "/document/languageCode".
173
177
174
178
1. In **AI Enrichments** > **Skill Graph**, select the skill labeled **#1** to display its details in the right pane.
175
179
@@ -191,7 +195,14 @@ In the **Errors/Warnings** tab, there are two missing inputs for an operation la
191
195
192
196
1. Switch to **Skill JSON Editor**.
193
197
194
-
1. Change `/document/content` to `/document/merged_content`.
198
+
1. At line 16, under "inputs", change `/document/content` to `/document/merged_content`.
199
+
200
+
```json
201
+
{
202
+
"name": "text",
203
+
"source": "/document/merged_content"
204
+
},
205
+
```
195
206
196
207
1. Select **Save** in the Skill Details pane.
197
208
@@ -205,7 +216,7 @@ In the **Errors/Warnings** tab, there are two missing inputs for an operation la
205
216
206
217
1. Select the **Executions** tab and locate the input for "languageCode".
207
218
208
-
1. Select the **</>** symbol to pop open the Expression Evaluator. Notice the confirmation that the "languageCode" property is not a valid input.
219
+
1. Select the **</>** symbol to pop open the Expression Evaluator. Notice the confirmation that the "languageCode" property isn't a valid input.
209
220
210
221
:::image type="content" source="media/cognitive-search-debug/expression-evaluator-language.png" alt-text="Screenshot of Expression Evaluator for the language input." border="true":::
211
222
@@ -265,19 +276,21 @@ Alternatively, if you aren't ready to commit changes, you can save the debug ses
265
276
266
277
1. Select **OK** to confirm that you wish to update your skillset.
267
278
268
-
1. Close Debug session and select the **Indexers** tab.
279
+
1. Close Debug session and open **Indexers** from the left navigation pane.
269
280
270
-
1. Open your 'clinical-trials-idxr'.
281
+
1. Select 'clinical-trials-idxr'.
271
282
272
283
1. Select **Reset**.
273
284
274
-
1. Select **Run**. Select **OK** to confirm.
285
+
1. Select **Run**.
286
+
287
+
1. Select **Refresh** to show the status of the reset and run commands.
275
288
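The reset, run, and refresh steps above have REST counterparts. The following sketch only builds the requests and doesn't send them. The `indexer_request` helper is hypothetical, and the API version is an assumption that you should replace with the version your service supports.

```python
import urllib.request

API_VERSION = "2020-06-30"  # assumption: substitute the version you target

# HTTP method for each indexer operation.
METHODS = {"reset": "POST", "run": "POST", "status": "GET"}


def indexer_request(service: str, api_key: str, indexer: str, action: str):
    """Build (but don't send) a reset, run, or status request for an indexer."""
    if action not in METHODS:
        raise ValueError(f"Unsupported action: {action}")
    url = (
        f"https://{service}.search.windows.net/indexers/{indexer}/{action}"
        f"?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url, method=METHODS[action], headers={"api-key": api_key}
    )


for action in ("reset", "run", "status"):
    request = indexer_request("mydemo", "<your-api-key>", "clinical-trials-idxr", action)
    print(request.get_method(), request.full_url)
```

To execute a request, pass it to `urllib.request.urlopen`. Checking `status` after `run` plays the same role as selecting **Refresh** in the portal.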
276
289
When the indexer has finished running, there should be a green checkmark and the word Success next to the time stamp for the latest run in the **Execution history** tab. To ensure that the changes have been applied:
277
290
278
-
1. In the search Overview page, select the **Index** tab.
291
+
1. In the left navigation pane, open **Indexes**.
279
292
280
-
1. Open the 'clinical-trials' index and in the Search explorer tab, enter this query string: `$select=metadata_storage_path, organizations, locations&$count=true` to return fields for specific documents (identified by the unique `metadata_storage_path` field).
293
+
1. Select the 'clinical-trials' index and, in the Search explorer tab, enter this query string: `$select=metadata_storage_path, organizations, locations&$count=true` to return fields for specific documents (identified by the unique `metadata_storage_path` field).
281
294
282
295
1. Select **Search**.
283
296
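The same query can be expressed as a REST URL outside Search explorer. This is a sketch under assumptions: the `search_url` helper is hypothetical, the API version is a placeholder for whichever version you use, and `search=*` is added to match all documents.

```python
from urllib.parse import urlencode


def search_url(service: str, index: str, select: list, count: bool = True,
               api_version: str = "2020-06-30") -> str:
    """Build a GET URL that returns selected fields and a total count."""
    params = {
        "api-version": api_version,
        "search": "*",
        "$select": ",".join(select),
        "$count": str(count).lower(),
    }
    # Keep '$', ',' and '*' readable instead of percent-encoding them.
    query = urlencode(params, safe="$,*")
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{query}"


print(search_url(
    "mydemo",
    "clinical-trials",
    ["metadata_storage_path", "organizations", "locations"],
))
```

Sending the request also requires an `api-key` header. If the fix worked, the `organizations` and `locations` fields in the results should be populated.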
@@ -289,7 +302,7 @@ When you're working in your own subscription, it's a good idea at the end of a p
289
302
290
303
You can find and manage resources in the portal, using the **All resources** or **Resource groups** link in the left-navigation pane.
291
304
292
-
If you are using a free service, remember that you are limited to three indexes, indexers, and data sources. You can delete individual items in the portal to stay under the limit.
305
+
If you're using a free service, remember that you're limited to three indexes, indexers, and data sources. You can delete individual items in the portal to stay under the limit.