articles/search/cognitive-search-attach-cognitive-services.md
+5 -5 (5 additions & 5 deletions)
@@ -7,7 +7,7 @@ author: HeidiSteen
7
7
ms.author: heidist
8
8
ms.service: cognitive-search
9
9
ms.topic: how-to
10
-
ms.date: 12/09/2021
10
+
ms.date: 09/16/2022
11
11
12
12
---
13
13
@@ -28,7 +28,7 @@ A multi-service resource references "Cognitive Services" as the offering, rather
28
28
29
29
You can use the Azure portal, REST API, or an Azure SDK to attach a billable resource to a skillset.
30
30
31
-
If you leave the property unspecified, execution of billable skills will stop at 20 transactions per indexer invocation and a "Time Out" message will appear in indexer execution history.
31
+
If you leave the property unspecified, your search service will attempt to use the free enrichments available to your indexer on a daily basis. Execution of billable skills will stop at 20 transactions per indexer invocation and a "Time Out" message will appear in indexer execution history.
32
32
33
33
### [**Azure portal**](#tab/portal)
34
34
@@ -120,7 +120,7 @@ Key-based billing applies when API calls to Cognitive Services resources exceed
120
120
121
121
The key is used for billing, but not connections. For connections, a search service [connects over the internal network](search-security-overview.md#internal-traffic) to a Cognitive Services resource that's co-located in the [same physical region](https://azure.microsoft.com/global-infrastructure/services/?products=search). Most regions that offer Cognitive Search also offer Cognitive Services.
122
122
123
-
If you attempt AI enrichment in a region that doesn't have both services, you'll see this message: "Provided key is not a valid CognitiveServices type key for the region of your search service."
123
+
If you attempt AI enrichment in a region that doesn't have both services, you'll see this message: "Provided key isn't a valid CognitiveServices type key for the region of your search service."
124
124
125
125
> [!NOTE]
126
126
> Some built-in skills are based on non-regional Cognitive Services (for example, the [Text Translation Skill](cognitive-search-skill-text-translation.md)). Using a non-regional skill means that your request might be serviced in a region other than the Azure Cognitive Search region. For more information on non-regional services, see the [Cognitive Services product by region](https://aka.ms/allinoneregioninfo) page.
@@ -135,9 +135,9 @@ AI enrichment offers a small quantity of free processing of billable enrichment
135
135
136
136
Some enrichments are always free:
137
137
138
-
+ Utility skills that do not call Cognitive Services (namely, [Conditional](cognitive-search-skill-conditional.md), [Document Extraction](cognitive-search-skill-document-extraction.md), [Shaper](cognitive-search-skill-shaper.md), [Text Merge](cognitive-search-skill-textmerger.md), and [Text Split skills](cognitive-search-skill-textsplit.md)) are not billable.
+ Text extraction from PDF documents and other application files is non-billable. Text extraction occurs during the [document cracking](search-indexer-overview.md#document-cracking) phase and is not an enrichment per se, but it occurs during AI enrichment and is thus noted here.
140
+
+ Text extraction from PDF documents and other application files is non-billable. Text extraction occurs during the [document cracking](search-indexer-overview.md#document-cracking) phase and isn't an enrichment in itself, but it occurs during AI enrichment and is thus noted here.
articles/search/cognitive-search-concept-annotations-syntax.md
+37 -8 (37 additions & 8 deletions)
@@ -1,19 +1,27 @@
1
1
---
2
2
title: Reference inputs and outputs in skillsets
3
3
titleSuffix: Azure Cognitive Search
4
-
description: Explains the annotation syntax and how to reference an annotation in the inputs and outputs of a skillset in an AI enrichment pipeline in Azure Cognitive Search.
4
+
description: Explains the annotation syntax and how to reference inputs and outputs of a skillset in an AI enrichment pipeline in Azure Cognitive Search.
5
5
6
6
author: HeidiSteen
7
7
ms.author: heidist
8
8
ms.service: cognitive-search
9
9
ms.topic: conceptual
10
-
ms.date: 09/24/2021
10
+
ms.date: 09/16/2022
11
11
---
12
-
# Reference annotations in an Azure Cognitive Search skillset
12
+
# Reference an annotation in an Azure Cognitive Search skillset
13
13
14
-
In this article, you learn how to reference annotations in skill definitions, using examples to illustrate various scenarios. As the content of a document flows through a set of skills, it gets enriched with annotations. Annotations can be used as inputs for further downstream enrichment, or mapped to an output field in an index.
15
-
16
-
Examples in this article are based on the *content* field generated automatically by [Azure Blob indexers](search-howto-indexing-azure-blob-storage.md) as part of the [document cracking](search-indexer-overview.md#document-cracking) phase. When referring to documents from a Blob container, use a format such as `"/document/content"`, where the *content* field is part of the *document*.
14
+
In this article, you'll learn how to reference *annotations* (enrichment nodes) in skill definitions, using examples to illustrate various scenarios. Skills read inputs from and write outputs to nodes in an [enriched document](cognitive-search-working-with-skillsets.md#enrichment-tree) tree, building the tree as the enrichments progress. Any node can be used as an input for further downstream enrichment, or mapped to an output field in an index. This article introduces the syntax and provides examples for specifying a path. For the full syntax, see [Skill context and input annotation language](cognitive-search-skill-annotation-language.md).
15
+
16
+
Paths to an annotation are specified in the "context" and "source" properties:
17
+
18
+
:::image type="content" source="media/cognitive-search-annotations-syntax/content-source-annotation-path.png" alt-text="Screenshot of a skillset definition with context and source elements highlighted.":::
19
+
20
+
The example in the screenshot is for an item in a Cosmos DB collection.
21
+
22
+
+ "context" is `/document/HotelId` because the collection is partitioned into documents by the `/HotelId` field. For a document in a Cosmos DB collection, it's also the root node of the enrichment document.
23
+
24
+
+ "source" is `/document/Description` because the skill is a translation skill, and the field that you'll want the skill to translate is the `Description` field in each document.
17
25
18
26
## Background concepts
19
27
@@ -25,7 +33,21 @@ Before reviewing the syntax, let's revisit a few important concepts to better un
25
33
| "annotation" | Within an enriched document, a node that is created and populated by a skill, such as "text" and "layoutText" in the OCR skill, is called an annotation. An enriched document is populated with both annotations and unchanged field values or metadata copied from the source. |
26
34
| "context" | The context in which the enrichment takes place, in terms of which element or component of the document is enriched. By default, the enrichment context is at the `"/document"` level, scoped to individual documents contained in the data source. When a skill runs, the outputs of that skill become [properties of the defined context](#example-2). |
27
35
36
+
## Root nodes and context
37
+
38
+
An enriched document is created in the "document cracking" stage of indexer execution, when the indexer opens a document or reads in a row from the data source. Initially, the only node in an enriched document is the [root node (`/document`)](cognitive-search-skill-annotation-language.md#document-root), and every node that skills add later is a descendant of it.
39
+
40
+
The following list shows several well-known paths:
41
+
42
+
+ `/document` is the root node and indicates an entire blob in Azure Storage, or a row in a SQL table.
43
+
+ `/document/content` is the "content" property of a JSON blob.
44
+
+ `/document/pages/*` or `/document/sentences/*` become the context if you're breaking a large document into smaller chunks for processing.
45
+
+ `/document/normalized_images/*` is created during document cracking if the document contains images. All paths to images start with normalized_images.
46
+
47
+
Examples in this article are based on the *content* field generated automatically by [Azure Blob indexers](search-howto-indexing-azure-blob-storage.md) as part of the [document cracking](search-indexer-overview.md#document-cracking) phase. When referring to documents from a Blob container, use a format such as `"/document/content"`, where the *content* field is part of the *document*.
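
As a rough mental model of how a path such as `"/document/content"` addresses a node, the following sketch treats an enriched document as a nested Python dictionary and walks the path one segment at a time. The dictionary layout, the sample values, and the `resolve` helper are assumptions made for illustration only; they aren't part of Azure Cognitive Search or any SDK.

```python
# Toy model only: the real enriched document lives inside the indexer.
# This dict layout and its sample values are assumptions for illustration.
enriched = {
    "document": {
        "content": "Full text extracted during document cracking.",
        "normalized_images": [{"width": 640}, {"width": 1024}],
    }
}


def resolve(tree, path):
    """Follow a slash-delimited path (no wildcards) one segment at a time."""
    node = tree
    for segment in path.strip("/").split("/"):
        node = node[segment]
    return node


root = resolve(enriched, "/document")             # the root node
content = resolve(enriched, "/document/content")  # the "content" child of the root
```

In this model, `/document` returns the whole root dictionary, and each further segment selects one child of the node before it.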
48
+
28
49
<a name="example-1"></a>
50
+
29
51
## Example 1: Simple annotation reference
30
52
31
53
In Azure Blob Storage, suppose you have a variety of files containing references to people's names that you want to extract using entity recognition. In the skill definition below, `"/document/content"` is the textual representation of the entire document, and "people" is an extraction of full names for entities identified as persons.
@@ -58,7 +80,7 @@ Because the default context is `"/document"`, the list of people can now be refe
58
80
59
81
This example builds on the previous one, showing you how to invoke an enrichment step multiple times over the same document. Assume the previous example generated an array of strings containing the names of 10 people from a single document. A reasonable next step might be a second enrichment that extracts the last name from a full name. Because there are 10 names, you want this step to be called 10 times in this document, once for each person.
60
82
61
-
To invoke the right number of iterations, set the context as `"/document/people/*"`, where the asterisk (`"*"`) represents all the nodes in the enriched document as descendants of `"/document/people"`. Although this skill is only defined once in the skills array, it is called for each member within the document until all members are processed.
83
+
To invoke the right number of iterations, set the context as `"/document/people/*"`, where the asterisk (`"*"`) represents all the nodes in the enriched document as descendants of `"/document/people"`. Although this skill is only defined once in the skills array, it's called for each member within the document until all members are processed.
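
The per-member invocation described above can be pictured with a small sketch. It expands a context containing `*` into one concrete path per matching node, which corresponds to one skill invocation each. The `expand_context` helper, the sample names, and the index-style paths it returns (such as `/document/people/0`) are notation assumed for this sketch, not the service's implementation.

```python
def expand_context(tree, context):
    """Return one concrete path per node matched by a context that may contain '*'."""
    matches = [("", tree)]
    for segment in context.strip("/").split("/"):
        next_matches = []
        for path, node in matches:
            if segment == "*":
                # A wildcard fans out to every element of the array at this node.
                for i, child in enumerate(node):
                    next_matches.append((f"{path}/{i}", child))
            else:
                next_matches.append((f"{path}/{segment}", node[segment]))
        matches = next_matches
    return [path for path, _ in matches]


# Hypothetical enriched document: three names produced by an earlier skill.
enriched = {"document": {"people": ["Ada Lovelace", "Alan Turing", "Grace Hopper"]}}

per_person = expand_context(enriched, "/document/people/*")  # one entry per person
per_document = expand_context(enriched, "/document")         # a single entry
```

With three names in the array, the wildcard context yields three paths, while the default `/document` context yields one.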
62
84
63
85
```json
64
86
{
@@ -90,7 +112,7 @@ When annotations are arrays or collections of strings, you might want to target
90
112
91
113
Sometimes you need to group all annotations of a particular type to pass them to a particular skill. Consider a hypothetical custom skill that identifies the most common last name from all the last names extracted in Example 2. To provide just the last names to the custom skill, specify the context as `"/document"` and the input as `"/document/people/*/lastname"`.
92
114
93
-
Notice that the cardinality of `"/document/people/*/lastname"` is larger than that of document. There may be 10 lastname nodes while there is only one document node for this document. In that case, the system will automatically create an array of `"/document/people/*/lastname"` containing all of the elements in the document.
115
+
Notice that the cardinality of `"/document/people/*/lastname"` is larger than that of document. There may be 10 lastname nodes while there's only one document node for this document. In that case, the system will automatically create an array of `"/document/people/*/lastname"` containing all of the elements in the document.
94
116
95
117
```json
96
118
{
@@ -113,9 +135,16 @@ Notice that the cardinality of `"/document/people/*/lastname"` is larger than th
113
135
}
114
136
```
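
To make the cardinality point concrete, here's a minimal sketch of how a wildcard in an input path can gather many nodes into one array when the context is the single `/document` node. The `collect` helper and the dictionary shape (each person modeled as an object with a `lastname` property) are assumptions for illustration, not the service's internal representation.

```python
def collect(tree, path):
    """Gather every value matched by a path, fanning out at each '*' segment."""
    nodes = [tree]
    for segment in path.strip("/").split("/"):
        if segment == "*":
            # Replace each array node with its elements.
            nodes = [child for node in nodes for child in node]
        else:
            nodes = [node[segment] for node in nodes]
    return nodes


# Hypothetical enriched document after the last-name skill has run.
enriched = {
    "document": {
        "people": [
            {"fullname": "Ada Lovelace", "lastname": "Lovelace"},
            {"fullname": "Alan Turing", "lastname": "Turing"},
            {"fullname": "Grace Hopper", "lastname": "Hopper"},
        ]
    }
}

documents = collect(enriched, "/document")                   # one document node
lastnames = collect(enriched, "/document/people/*/lastname")  # one value per person
```

Here a single document node corresponds to three `lastname` values, so a skill running once at the `/document` context would receive all three as one array.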
115
137
138
+
## Tips for annotation path troubleshooting
116
139
140
+
If you're having trouble with specifying skill inputs, these tips might help you move forward:
141
+
142
+
+ [Run the Import data wizard](search-import-data-portal.md) over your data to review the skillset definitions and field mappings that the wizard generates.
143
+
144
+
+ [Start a debug session](cognitive-search-how-to-debug-skillset.md) on a skillset to view the structure of an enriched document. You can edit the paths and other parts of the skill definition, and then run the skill to validate your changes.
117
145
118
146
## See also
147
+
119
148
+ [Skill context and input annotation language](cognitive-search-skill-annotation-language.md)
120
149
+ [How to integrate a custom skill into an enrichment pipeline](cognitive-search-custom-skill-interface.md)
121
150
+ [How to define a skillset](cognitive-search-defining-skillset.md)
articles/search/cognitive-search-working-with-skillsets.md
+1 -1 (1 addition & 1 deletion)
@@ -129,7 +129,7 @@ An enriched document exists for the duration of skillset execution, but can be [
129
129
130
130
Initially, an enriched document is simply the content extracted from a data source during [*document cracking*](search-indexer-overview.md#document-cracking), where text and images are extracted from the source and made available for language or image analysis.
131
131
132
-
The initial content is metadata and the *root node* (`document\content`). The root node is usually a whole document or a normalized image that is extracted from a data source during document cracking. How it's articulated in an enrichment tree varies for each data source type. The following table shows the state of a document entering into the enrichment pipeline for several supported data sources:
132
+
The initial content is metadata and the *root node* (`document/content`). The root node is usually a whole document or a normalized image that is extracted from a data source during document cracking. How it's articulated in an enrichment tree varies for each data source type. The following table shows the state of a document entering into the enrichment pipeline for several supported data sources:
articles/search/search-modeling-multitenant-saas-applications.md
+1 -1 (1 addition & 1 deletion)
@@ -84,7 +84,7 @@ In the case of a multitenant scenario, the application developer consumes one or
84
84
85
85
## Model 1: One index per tenant
86
86
87
-
:::image type="content" source="media/search-modeling-multitenant-saas-applications/azure-search-index-per-tenant.png" alt-text="A portrayal of the index-per-tenant model" border="false":::
87
+
:::image type="content" source="media/search-modeling-multitenant-saas-applications/azure-search-index-per-tenant.png" alt-text="A portrayal of the index-per-tenant model" border="false":::
88
88
89
89
In an index-per-tenant model, multiple tenants occupy a single Azure Cognitive Search service where each tenant has their own index.