articles/search/cognitive-search-create-custom-skill-example.md (+12 -14)
@@ -6,7 +6,7 @@ author: gmndrg
 ms.author: gimondra
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 12/01/2022
+ms.date: 03/18/2024
 ms.custom:
   - devx-track-csharp
   - ignite-2023
@@ -20,31 +20,33 @@ In this example, learn how to create a web API custom skill. This skill will acc
 
 + Read about [custom skill interface](cognitive-search-custom-skill-interface.md) article if you aren't familiar with the input/output interface that a custom skill should implement.
 
-+ Create a [Bing Search v7 resource](https://portal.azure.com/#create/Microsoft.BingSearch) through the Azure Portal. A free tier is available and sufficient for this example.
++ Create a [Bing Search resource](https://portal.azure.com/#create/Microsoft.BingSearch) through the Azure portal. A free tier is available and sufficient for this example.
 
-+ Install [Visual Studio 2019](https://www.visualstudio.com/vs/) or later, including the Azure development workload.
++ Install [Visual Studio](https://www.visualstudio.com/vs/) or later.
 
 ## Create an Azure Function
 
 Although this example uses an Azure Function to host a web API, it isn't required. As long as you meet the [interface requirements for a cognitive skill](cognitive-search-custom-skill-interface.md), the approach you take is immaterial. Azure Functions, however, make it easy to create a custom skill.
 
-### Create a function app
+### Create a project
 
 1. In Visual Studio, select **New** > **Project** from the File menu.
 
-1. In the New Project dialog, select **Azure Functions** as the template and select **Next**. Type a name for your project, and select **Create**. The function app name must be valid as a C# namespace, so don't use underscores, hyphens, or any other non-alphanumeric characters.
+1. Choose **Azure Functions** as the template and select **Next**. Type a name for your project, and select **Create**. The function app name must be valid as a C# namespace, so don't use underscores, hyphens, or any other non-alphanumeric characters.
 
-1. Select the type to be **HTTP Trigger**
+1. Select a framework that has long term support.
 
-1. For Storage Account, you may select **None**, as you won't need any storage for this function.
+1. Choose **HTTP Trigger** for the type of function to add to the project.
+
+1. Choose **Function** for the authorization level.
 
 1. Select **Create** to create the function project and HTTP triggered function.
 
-### Modify the code to call the Bing Entity Search Service
+### Add code to call the Bing Entity API
 
-Visual Studio creates a project and in it a class that contains boilerplate code for the chosen function type. The *FunctionName* attribute on the method sets the name of the function. The *HttpTrigger* attribute specifies that the function is triggered by an HTTP request.
+Visual Studio creates a project with boilerplate code for the chosen function type. The *FunctionName* attribute on the method sets the name of the function. The *HttpTrigger* attribute specifies that the function is triggered by an HTTP request.
 
-Now, replace all of the content of the file *Function1.cs* with the following code:
+Replace the contents of *Function1.cs* with the following code:
 
 ```csharp
 using System;
@@ -308,10 +310,6 @@ namespace SampleSkills
 
 Make sure to enter your own *key* value in the `key` constant based on the key you got when signing up for the Bing entity search API.
 
-This sample includes all necessary code in a single file for convenience. You can find a slightly more structured version of that same skill in [the power skills repository](https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Text/BingEntitySearch).
-
-Of course, you may rename the file from `Function1.cs` to `BingEntitySearch.cs`.
-
 ## Test the function from Visual Studio
 
 Press **F5** to run the program and test function behaviors. In this case, we'll use the function below to look up two entities. Use a REST client to issue a call like the one shown below:
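The request body itself is truncated in this excerpt. As a rough sketch of the standard custom skill input shape (a `values` array of records, each with a `recordId` and a `data` bag), the following Python snippet builds such a payload; the `name` input field and the sample entity strings are assumptions for illustration, not the article's exact sample:

```python
import json

def make_skill_request(names):
    # Each record carries a unique recordId plus a "data" bag of inputs;
    # this "values" envelope is the standard custom skill input format.
    return {
        "values": [
            {"recordId": str(i), "data": {"name": n}}
            for i, n in enumerate(names)
        ]
    }

payload = make_skill_request(["Pablo Picasso", "Microsoft"])
print(json.dumps(payload, indent=2))
```

The skill's response mirrors this envelope, returning one output record per input `recordId`.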
articles/search/cognitive-search-custom-skill-scale.md (+20 -32)
@@ -9,51 +9,48 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: conceptual
-ms.date: 12/01/2022
+ms.date: 03/18/2024
 ---
 
 # Efficiently scale out a custom skill
 
 Custom skills are web APIs that implement a specific interface. A custom skill can be implemented on any publicly addressable resource. The most common implementations for custom skills are:
-* Azure Functions for custom logic skills
-* Azure Webapps for simple containerized AI skills
-* Azure Kubernetes service for more complex or larger skills.
+
++ Azure Functions for custom logic skills
++ Azure Web apps for simple containerized AI skills
++ Azure Kubernetes service for more complex or larger skills.
 
 ## Prerequisites
 
-+ Review the [custom skill interface](cognitive-search-custom-skill-interface.md) for an introduction into the input/output interface that a custom skill should implement.
++ Review the [custom skill interface](cognitive-search-custom-skill-interface.md) for an introduction into the inputs and outputs that a custom skill should implement.
 
-+ Set up your environment. You could start with [this tutorial end-to-end](../azure-functions/create-first-function-vs-code-python.md) to set up serverless Azure Function using Visual Studio Code and Python extensions.
++ Set up your environment. You can start with [this tutorial end-to-end](../azure-functions/create-first-function-vs-code-python.md) to set up a serverless Azure Function using Visual Studio Code with the Python extension.
 
 ## Skillset configuration
 
 Configuring a custom skill for maximizing throughput of the indexing process requires an understanding of the skill, indexer configurations, and how the skill relates to each document. For example, the number of times a skill is invoked per document and the expected duration per invocation.
 
-### Skill settings
-
-On the [custom skill](cognitive-search-custom-skill-web-api.md) set the following parameters.
+The following properties on a [custom skill](cognitive-search-custom-skill-web-api.md) are used for scale.
 
 1. Set `batchSize` of the custom skill to configure the number of records sent to the skill in a single invocation of the skill.
 
-2. Set the `degreeOfParallelism` to calibrate the number of concurrent requests the indexer will make to your skill.
+1. Set the `degreeOfParallelism` to calibrate the number of concurrent requests the indexer makes to your skill.
 
-3. Set `timeout` to a value sufficient for the skill to respond with a valid response.
+1. Set `timeout` to a value sufficient for the skill to respond with a valid response.
 
-4. In the `indexer` definition, set [`batchSize`](/rest/api/searchservice/create-indexer#indexer-parameters) to the number of documents that should be read from the data source and enriched concurrently.
+1. In the `indexer` definition, set [`batchSize`](/rest/api/searchservice/create-indexer#indexer-parameters) to the number of documents that should be read from the data source and enriched concurrently.
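Taken together, these settings live in two places: the skill definition inside the skillset and the indexer definition. A minimal sketch of both, expressed as Python dictionaries; the endpoint URI, input/output names, and tuning values are illustrative assumptions, not recommendations:

```python
# Scale-related properties of a Web API custom skill (skillset side)
# and the indexer batch size (indexer side). The URI is a placeholder.
custom_skill = {
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "uri": "https://example.azurewebsites.net/api/enrich",  # hypothetical endpoint
    "batchSize": 50,             # records sent per skill invocation
    "degreeOfParallelism": 5,    # concurrent requests the indexer makes
    "timeout": "PT90S",          # how long the indexer waits for a response
    "context": "/document",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [{"name": "enriched", "targetName": "enriched"}],
}

indexer = {
    "name": "example-indexer",
    "parameters": {"batchSize": 5},  # documents read per indexer batch
}
```

With these example numbers, one indexer batch of 5 documents would need to generate 50 records before a single skill request is full, which is exactly the interaction the considerations below ask you to reason about.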
 ### Considerations
 
-Setting these variables to optimize the indexer's performance requires determining if your skill performs better with many concurrent small requests or fewer large requests. A few questions to consider are:
+There's no "one size fits all" set of recommendations. You should plan on testing different configurations to reach an optimum result. Strategies are either fewer large requests or many small requests.
 
-* What is the skill invocation cardinality? Does the skill execute once for each document, for instance a document classification skill, or could the skill execute multiple times per document, a paragraph classification skill?
++ Skill invocation cardinality: Does the skill execute once for each document (`/document/content`) or multiple times per document (`/document/reviews_text/pages/*`)?
 
-* On average how many documents are read from the data source to fill out a skill request based on the skill batch size? Ideally, this should be less than the indexer batch size. With batch sizes greater than 1 your skill can receive records from multiple source documents. For example if the indexer batch count is 5 and the skill batch count is 50 and each document generates only five records, the indexer will need to fill a custom skill request across multiple indexer batches.
++ On average, how many documents are read from the data source to fill out a skill request based on the skill batch size? Ideally, this should be less than the indexer batch size. With batch sizes greater than one, your skill can receive records from multiple source documents. For example, if the indexer batch count is 5, and the skill batch count is 50 and each document generates only five records, the indexer will need to fill a custom skill request across multiple indexer batches.
 
-* The average number of requests an indexer batch can generate should give you an optimal setting for the degrees of parallelism. If your infrastructure hosting the skill cannot support that level of concurrency, consider dialing down the degrees of parallelism. As a best practice, test your configuration with a few documents to validate your choices on the parameters.
++ The average number of requests an indexer batch can generate should give you an optimal setting for the degrees of parallelism. If your infrastructure hosting the skill can't support that level of concurrency, consider dialing down the degrees of parallelism. As a best practice, test your configuration with a few documents to validate your choices on the parameters.
 
-* Testing with a smaller sample of documents, evaluate the execution time of your skill to the overall time taken to process the subset of documents. Does your indexer spend more time building a batch or waiting for a response from your skill?
++ Testing with a smaller sample of documents, evaluate the execution time of your skill against the overall time taken to process the subset of documents. Does your indexer spend more time building a batch or waiting for a response from your skill?
 
-* Consider the upstream implications of parallelism. If the input to a custom skill is an output from a prior skill, are all the skills in the skillset scaled out effectively to minimize latency?
++ Consider the upstream implications of parallelism. If the input to a custom skill is an output from a prior skill, are all the skills in the skillset scaled out effectively to minimize latency?
 
 ## Error handling in the custom skill
@@ -83,25 +80,16 @@ Start by testing your custom skill with a REST API client to validate:
 
 * Returns a valid HTTP status code
 
-Create a [debug session](cognitive-search-debug-session.md) to add your skill to the skillset and make sure it produces a valid enrichment. While a debug session does not allow you to tune the performance of the skill, it enables you to ensure that the skill is configured with valid values and returns the expected enriched objects.
+Create a [debug session](cognitive-search-debug-session.md) to add your skill to the skillset and make sure it produces a valid enrichment. While a debug session doesn't allow you to tune the performance of the skill, it enables you to ensure that the skill is configured with valid values and returns the expected enriched objects.
 
 ## Best practices
 
 * While skills can accept and return larger payloads, consider limiting the response to 150 MB or less when returning JSON.
 
 * Consider setting the batch size on the indexer and skill to ensure that each data source batch generates a full payload for your skill.
 
-* For long running tasks, set the timeout to a high enough value to ensure the indexer does not error out when processing documents concurrently.
+* For long running tasks, set the timeout to a high enough value to ensure the indexer doesn't error out when processing documents concurrently.
 
 * Optimize the indexer batch size, skill batch size, and skill degrees of parallelism to generate the load pattern your skill expects, fewer large requests or many small requests.
 
-* Monitor custom skills with detailed logs of failures as you can have scenarios where specific requests consistently fail as a result of the data variability.
-
-## Next steps
-
-Congratulations! Your custom skill is now scaled right to maximize throughput on the indexer.
-
-+ [Power Skills: a repository of custom skills](https://github.com/Azure-Samples/azure-search-power-skills)
-+ [Add a custom skill to an AI enrichment pipeline](cognitive-search-custom-skill-interface.md)
-+ [Add an Azure Machine Learning skill](./cognitive-search-aml-skill.md)
-+ [Use debug sessions to test changes](./cognitive-search-debug-session.md)
+* Monitor custom skills with detailed logs of failures as you can have scenarios where specific requests consistently fail as a result of the data variability.
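The testing and error-handling guidance above relies on the skill reporting per-record problems in its response body rather than failing the whole HTTP request. A sketch of that response shape; the envelope field names (`values`, `recordId`, `errors`, `warnings`) follow the custom skill interface, while the data payload and message text are invented for illustration:

```python
def make_skill_response(results):
    """Build a custom skill response: one output record per input recordId,
    with optional per-record errors instead of a whole-request failure."""
    values = []
    for record_id, data, error in results:
        record = {"recordId": record_id, "data": data,
                  "errors": [], "warnings": []}
        if error:
            # A record-level error lets the indexer log the failure
            # while the other records in the batch still succeed.
            record["errors"].append({"message": error})
        values.append(record)
    return {"values": values}

response = make_skill_response([
    ("0", {"entities": ["Microsoft"]}, None),
    ("1", {}, "Could not reach the downstream service."),
])
```

Logging the `errors` array per record is what makes the "detailed logs of failures" best practice actionable when only some inputs consistently fail.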
articles/search/index-sql-relational-data.md (+8 -7)
@@ -1,21 +1,22 @@
 ---
 title: Model SQL relational data for import and indexing
 titleSuffix: Azure AI Search
-description: Learn how to model relational data, de-normalized into a flat result set, for indexing and full text search in Azure AI Search.
+description: Learn how to model relational data, denormalized into a flat result set, for indexing and full text search in Azure AI Search.
 author: HeidiSteen
 manager: nitinme
 ms.author: heidist
 ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 02/22/2023
+ms.date: 03/18/2024
 ---
+
 
 # How to model relational SQL data for import and indexing in Azure AI Search
 
 Azure AI Search accepts a flat rowset as input to the [indexing pipeline](search-what-is-an-index.md). If your source data originates from joined tables in a SQL Server relational database, this article explains how to construct the result set, and how to model a parent-child relationship in an Azure AI Search index.
 
-As an illustration, we refer to a hypothetical hotels database, based on [demo data](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/hotels). Assume the database consists of a Hotels$ table with 50 hotels, and a Rooms$ table with rooms of varying types, rates, and amenities, for a total of 750 rooms. There's a one-to-many relationship between the tables. In our approach, a view provides the query that returns 50 rows, one row per hotel, with associated room detail embedded into each row.
+As an illustration, we refer to a hypothetical hotels database, based on [demo data](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/hotels). Assume the database consists of a `Hotels$` table with 50 hotels, and a `Rooms$` table with rooms of varying types, rates, and amenities, for a total of 750 rooms. There's a one-to-many relationship between the tables. In our approach, a view provides the query that returns 50 rows, one row per hotel, with associated room detail embedded into each row.
 
 
@@ -43,7 +44,7 @@ To deliver the expected search experience, your data set should consist of one r
 
 The solution is to capture the room detail as nested JSON, and then insert the JSON structure into a field in a view, as shown in the second step.
 
-1. Assume you've two joined tables, Hotels$ and Rooms$, that contain details for 50 hotels and 750 rooms and are joined on the HotelID field. Individually, these tables contain 50 hotels and 750 related rooms.
+1. Assume you have two joined tables, `Hotels$` and `Rooms$`, that contain details for 50 hotels and 750 rooms and are joined on the HotelID field. Individually, these tables contain 50 hotels and 750 related rooms.
 
 ```sql
 CREATE TABLE [dbo].[Hotels$](
@@ -106,7 +107,7 @@ This rowset is now ready for import into Azure AI Search.
 
 ## Use a complex collection for the "many" side of a one-to-many relationship
 
-On the Azure AI Search side, create an index schema that models the one-to-many relationship using nested JSON. The result set you created in the previous section generally corresponds to the index schema provided below (we cut some fields for brevity).
+On the Azure AI Search side, create an index schema that models the one-to-many relationship using nested JSON. The result set you created in the previous section generally corresponds to the index schema provided next (we cut some fields for brevity).
 
 The following example is similar to the example in [How to model complex data types](search-howto-complex-data-types.md#create-complex-fields). The *Rooms* structure, which has been the focus of this article, is in the fields collection of an index named *hotels*. This example also shows a complex type for *Address*, which differs from *Rooms* in that it's composed of a fixed set of items, as opposed to the multiple, arbitrary number of items allowed in a collection.
@@ -144,11 +145,11 @@ The following example is similar to the example in [How to model complex data ty
 }
 ```
 
-Given the previous result set and the above index schema, you've all the required components for a successful indexing operation. The flattened data set meets indexing requirements yet preserves detail information. In the Azure AI Search index, search results will fall easily into hotel-based entities, while preserving the context of individual rooms and their attributes.
+Given the previous result set and the above index schema, you have all the required components for a successful indexing operation. The flattened data set meets indexing requirements yet preserves detail information. In the Azure AI Search index, search results fall easily into hotel-based entities, while preserving the context of individual rooms and their attributes.
 
 ## Facet behavior on complex type subfields
 
-Fields that have a parent, such as the fields under Address and Rooms, are called *subfields*. Although you can assign a "facetable" attribute to a subfield, the count of the facet will always be for the main document.
+Fields that have a parent, such as the fields under Address and Rooms, are called *subfields*. Although you can assign a "facetable" attribute to a subfield, the count of the facet is always for the main document.
 
 For complex types like Address, where there's just one "Address/City" or "Address/stateProvince" in the document, the facet behavior works as expected. However, in the case of Rooms, where there are multiple subdocuments for each main document, the facet counts can be misleading.
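The shaping the article performs in SQL (one row per hotel, with rooms embedded as a nested JSON collection) can be sketched outside the database as well. A minimal Python illustration with invented sample rows standing in for the `Hotels$` and `Rooms$` tables; the column names are a small subset chosen for the example:

```python
import json
from collections import defaultdict

# Invented sample rows standing in for the joined Hotels$ and Rooms$ tables.
hotels = [{"HotelId": "1", "HotelName": "Example Motel"}]
rooms = [
    {"HotelId": "1", "Type": "Budget Room", "BaseRate": 96.99},
    {"HotelId": "1", "Type": "Suite", "BaseRate": 231.99},
]

# Group the "many" side by the join key, then embed it as a collection
# on the "one" side -- producing one search document per hotel.
rooms_by_hotel = defaultdict(list)
for room in rooms:
    rooms_by_hotel[room["HotelId"]].append(
        {k: v for k, v in room.items() if k != "HotelId"}
    )

documents = [dict(h, Rooms=rooms_by_hotel[h["HotelId"]]) for h in hotels]
print(json.dumps(documents, indent=2))
```

The result mirrors what the SQL view produces: a flat rowset whose `Rooms` field holds the nested detail, ready for a complex collection field in the index schema.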