In this article, you learn how to map enriched input fields to output fields in a searchable index. Once you've [defined a skillset](cognitive-search-defining-skillset.md), you must map the output fields of any skill that directly contributes values to a given field in your search index.
17
+
This article explains how to set up *output field mappings* that determine a data path between in-memory data structures created during skill processing, and target fields in a search index. An output field mapping is defined in an [indexer](search-indexer-overview.md) and has the following elements:
18
18
19
-
Output Field Mappings are required for moving content from enriched documents into the index. The enriched document is really a tree of information, and even though there is support for complex types in the index, sometimes you may want to transform the information from the enriched tree into a more simple type (for instance, an array of strings). Output field mappings allow you to perform data shape transformations by flattening information. Output field mappings always occur after skillset execution, although it is possible for this stage to run even if no skillset is defined.
In contrast with a [`fieldMappings`](search-indexer-field-mappings.md) definition that maps a path between two physical data structures, an `outputFieldMappings` definition maps in-memory data to fields in a search index.
22
30
23
-
* As part of your skillset, you extracted the names of organizations mentioned in each of the pages of your document. Now you want to map each of those organization names into a field in your index of type Edm.Collection(Edm.String).
31
+
Output field mappings are required if your indexer has an attached [skillset](cognitive-search-working-with-skillsets.md) that creates new information, such as text translation or key phrase extraction. During indexer execution, AI-generated information exists in memory only. To persist this information in a search index, you'll need to tell the indexer where to send the data.
24
32
25
-
* As part of your skillset, you produced a new node called “document/translated_text”. You would like to map the information on this node to a specific field in your index.
33
+
Output field mappings can also be used to retrieve specific nodes in a source document's complex type. If you don't need the full complex structure, you can [flatten individual nodes in nested data structures](#flattening-information-from-complex-types), and then use an output field mapping to send the output to a string collection in your search index.
26
34
27
-
* You don’t have a skillset but are indexing a complex type from a Cosmos DB database. You would like to get to a node on that complex type and map it into a field in your index.
35
+
Output field mappings apply to:
28
36
29
-
> [!NOTE]
30
-
> Output field mappings apply to search indexes only. For indexers that create [knowledge stores](knowledge-store-concept-intro.md), output field mappings are ignored.
37
+
+ Content that's created by skills or extracted by an indexer. The source field is a node in an enriched document residing in memory.
31
38
32
-
## Use outputFieldMappings
39
+
+ Search indexes. If you're populating a [knowledge store](knowledge-store-concept-intro.md), use [projections](knowledge-store-projections-examples.md) for data path configuration.
33
40
34
-
To map fields, add `outputFieldMappings` to your indexer definition as shown below:
41
+
Output field mappings are applied after [skillset execution](cognitive-search-working-with-skillsets.md) or after document cracking if there's no associated skillset.
35
42
36
-
```http
37
-
PUT https://[servicename].search.windows.net/indexers/[indexer name]?api-version=2020-06-30
38
-
api-key: [admin key]
39
-
Content-Type: application/json
40
-
```
43
+
## Define an output field mapping
41
44
42
-
The body of the request is structured as follows:
45
+
Output field mappings are added to the `outputFieldMappings` array in an indexer definition, typically placed after the `fieldMappings` array. An output field mapping consists of three parts.
| sourceFieldName | Required. Specifies a path to enriched content. An example might be `/document/content`. See [Reference annotations in an Azure Cognitive Search skillset](cognitive-search-concept-annotations-syntax.md) for path syntax and examples. |
61
+
| targetFieldName | Optional. Specifies the search field that receives the enriched content. Target fields must be top-level simple fields or collections. A target field can't be a path to a subfield in a complex type. If you want to retrieve specific nodes in a complex structure, you can [flatten individual nodes](#flattening-information-from-complex-types) in memory, and then send the output to a string collection in your index. |
62
+
| mappingFunction | Optional. Adds extra processing provided by [mapping functions](search-indexer-field-mappings.md#mappingFunctions) supported by indexers. In the case of enrichment nodes, encoding and decoding are the most commonly used functions. |
63
+
64
+
You can use the REST API or an Azure SDK to define output field mappings.
65
+
66
+
> [!TIP]
67
+
> Indexers created by the [Import data wizard](search-import-data-portal.md) include output field mappings generated by the wizard. If you need examples, run the wizard over your data source to see the rendered definition.
68
+
69
+
### [**REST APIs**](#tab/rest)
70
+
71
+
Use [Create Indexer (REST)](/rest/api/searchservice/create-Indexer) or [Update Indexer (REST)](/rest/api/searchservice/update-indexer), any API version.
72
+
73
+
This example adds entities and sentiment labels extracted from a blob's content property to fields in a search index.
74
+
75
+
```JSON
76
+
PUT https://[service name].search.windows.net/indexers/myindexer?api-version=[api-version]
@@ -76,72 +102,227 @@ The body of the request is structured as follows:
76
102
}
77
103
```
78
104
79
-
For each output field mapping, set the location of the data in the enriched document tree (sourceFieldName), and the name of the field as referenced in the index (targetFieldName). Assign any [mapping functions](search-indexer-field-mappings.md#field-mapping-functions-and-examples) that you require to transform the content of a field before it's stored in the index.
105
+
For each output field mapping, set the location of the data in the enriched document tree (sourceFieldName), and the name of the field as referenced in the index (targetFieldName). Assign any [mapping functions](search-indexer-field-mappings.md#mappingFunctions) that you require to transform the content of a field before it's stored in the index.
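As a rough illustration of the paragraph above, the following Python sketch assembles an indexer definition that contains an `outputFieldMappings` array and prepares (but doesn't send) a PUT request for it. The helper name `build_indexer_request`, the service name, the data source, index, and skillset names, the target field names, and the API version value are all illustrative assumptions and not values prescribed by this article.

```python
import json
import urllib.request


def build_indexer_request(service, indexer_name, api_version, api_key, definition):
    """Prepare a PUT request for an indexer definition without sending it.

    Hypothetical helper for illustration only.
    """
    for mapping in definition.get("outputFieldMappings", []):
        # sourceFieldName is the only required part of an output field mapping.
        if not mapping.get("sourceFieldName"):
            raise ValueError(f"missing sourceFieldName in {mapping!r}")
    url = (
        f"https://{service}.search.windows.net/indexers/{indexer_name}"
        f"?api-version={api_version}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="PUT",
    )


# Assumed names throughout; replace them with your own resources.
definition = {
    "name": "myindexer",
    "dataSourceName": "my-test-ds",
    "targetIndexName": "my-test-index",
    "skillsetName": "my-skillset",
    "outputFieldMappings": [
        {
            # A single value in the enriched document tree.
            "sourceFieldName": "/document/content/sentiment",
            "targetFieldName": "sentiment",
        },
        {
            # The "*" selects every element, so the target is a collection.
            "sourceFieldName": "/document/content/organizations/*/description",
            "targetFieldName": "descriptions",
        },
    ],
}

request = build_indexer_request(
    "my-service", "myindexer", "2020-06-30", "<admin key>", definition
)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it easy to inspect the payload first. To submit it, you could pass `request` to `urllib.request.urlopen`.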
80
106
81
-
##Flattening Information from Complex Types
107
+
### [**.NET SDK (C#)**](#tab/csharp)
82
108
83
-
The path in a sourceFieldName can represent one element or multiple elements. In the example above, ```/document/content/sentiment``` represents a single numeric value, while ```/document/content/organizations/*/description``` represents several organization descriptions.
109
+
In the Azure SDK for .NET, use the [OutputFieldMappingEntry](/dotnet/api/azure.search.documents.indexes.models.outputfieldmappingentry) class that provides "Name" and "TargetFieldName" properties and an optional "MappingFunction" reference.
84
110
85
-
In cases where there are several elements, they are "flattened" into an array that contains each of the elements.
111
+
Specify output field mappings when constructing the indexer, or later by directly setting [SearchIndexer.OutputFieldMappings](/dotnet/api/azure.search.documents.indexes.models.searchindexer.outputfieldmappings). The following C# example sets the output field mappings when constructing an indexer.
86
112
87
-
More concretely, for the ```/document/content/organizations/*/description``` example, the data in the *descriptions* field would look like a flat array of descriptions before it gets indexed:
113
+
```csharp
114
+
string indexerName = "cog-search-demo";
115
+
SearchIndexer indexer = new SearchIndexer(
116
+
indexerName,
117
+
dataSourceConnectionName,
118
+
indexName)
119
+
{
120
+
// Field mappings omitted for this example (assume default mappings)
## Flatten complex structures into a string collection
138
+
139
+
If your source data is composed of nested or hierarchical JSON, you can't use field mappings to set up the data paths. Instead, your search index must mirror the source data structure at each level for a full import.
140
+
141
+
This section walks you through an import process that produces a one-to-one reflection of a complex document on both the source and target sides. Next, it uses the same source document to illustrate the retrieval and flattening of individual nodes into string collections.
142
+
143
+
Here's an example of a document in Cosmos DB with nested JSON:
144
+
145
+
```json
{
  "palette": "primary colors",
  "colors": [
    {
      "name": "blue",
      "medium": [
        "acrylic",
        "oil",
        "pastel"
      ]
    },
    {
      "name": "red",
      "medium": [
        "acrylic",
        "pastel",
        "watercolor"
      ]
    },
    {
      "name": "yellow",
      "medium": [
        "acrylic",
        "watercolor"
      ]
    }
  ]
}
```
92
175
93
-
This is an important principle, so we will provide another example. Imagine that you have an array of complex types as part of the enrichment tree. Let's say there is a member called customEntities that has an array of complex types like the one described below.
176
+
If you wanted to fully index the above source document, you'd create an index definition where the field names, levels, and types are reflected as a complex type. Because field mappings aren't supported for complex types in the search index, your index definition must mirror the source document.
Let's assume that your index has a field called 'diseases' of type Collection(Edm.String), where you would like to store each of the names of the entities.
205
+
Here's a sample indexer definition that executes the import (notice there are no field mappings and no skillset).
125
206
126
-
This can be done easily by using the "\*" symbol, as follows:
207
+
```json
{
  "name": "my-test-indexer",
  "dataSourceName": "my-test-ds",
  "skillsetName": null,
  "targetIndexName": "my-test-index",

  "fieldMappings": [],
  "outputFieldMappings": []
}
```
218
+
219
+
The result is the following sample search document, similar to the original in Cosmos DB.
This operation will simply “flatten” each of the names of the customEntities elements into a single array of strings like this:
258
+
An alternative rendering in a search index is to flatten individual nodes in the source's nested structure into a string collection in a search index.
259
+
260
+
To accomplish this task, you'll need an entry in `outputFieldMappings` that maps an in-memory node to a string collection in the index. Although output field mappings primarily apply to skill outputs, you can also use them to address nodes after ["document cracking"](search-indexer-overview.md#stage-1-document-cracking), where the indexer opens a source document and reads it into memory.
261
+
262
+
Below is a sample index definition in Cognitive Search, using string collections to receive flattened output:
Here's the sample indexer definition, using `outputFieldMappings` to associate the nested JSON with the string collection fields. Notice that the source field uses the path syntax for enrichment nodes, even though there's no skillset. Enriched documents are created in the system during document cracking, which means you can access nodes in each document tree as long as those nodes exist when the document is cracked.
278
+
279
+
```json
{
  "name": "my-test-indexer",
  "dataSourceName": "my-test-ds",
  "skillsetName": null,
  "targetIndexName": "my-new-flattened-index",
  "parameters": { },
  "fieldMappings": [ ],
  "outputFieldMappings": [
    {
      "sourceFieldName": "/document/colors/*/name",
      "targetFieldName": "color_names"
    },
    {
      "sourceFieldName": "/document/colors/*/medium",
      "targetFieldName": "color_mediums"
    }
  ]
}
```
299
+
300
+
Results from the above definition are as follows. Simplifying the structure loses context in this case. There are no longer any associations between a given color and the mediums it's available in. However, depending on your scenario, a result similar to the one shown below might be exactly what you need.
301
+
302
+
```json
{
  "value": [
    {
      "@search.score": 1,
      "id": "240a98f5-90c9-406b-a8c8-f50ff86f116c",
      "palette": "primary colors",
      "color_names": [
        "blue",
        "red",
        "yellow"
      ],
      "color_mediums": [
        "[\"acrylic\",\"oil\",\"pastel\"]",
        "[\"acrylic\",\"pastel\",\"watercolor\"]",
        "[\"acrylic\",\"watercolor\"]"
      ]
    }
  ]
}
```
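To make the flattening behavior easier to reason about, here is a small Python sketch that approximates how a path containing `*` could be resolved against the sample document and written to a string collection. It is only a local simulation that reproduces the shape of the results shown in this article, not the indexer's actual implementation. The function names `resolve` and `flatten_to_strings` are hypothetical.

```python
import json

# The nested source document from this section.
document = {
    "palette": "primary colors",
    "colors": [
        {"name": "blue", "medium": ["acrylic", "oil", "pastel"]},
        {"name": "red", "medium": ["acrylic", "pastel", "watercolor"]},
        {"name": "yellow", "medium": ["acrylic", "watercolor"]},
    ],
}


def resolve(node, segments):
    """Collect the values reached by a path; '*' fans out over a list."""
    if not segments:
        return [node]
    head, rest = segments[0], segments[1:]
    if head == "*":
        return [value for item in node for value in resolve(item, rest)]
    return resolve(node[head], rest)


def flatten_to_strings(doc, path):
    """Approximate mapping a source path to a collection of strings."""
    segments = path.strip("/").split("/")[1:]  # drop the leading "document"
    values = resolve(doc, segments)
    # Strings pass through unchanged. Anything else, such as a nested
    # array, is serialized to compact JSON text so that it fits in a
    # string collection.
    return [
        value if isinstance(value, str) else json.dumps(value, separators=(",", ":"))
        for value in values
    ]


color_names = flatten_to_strings(document, "/document/colors/*/name")
color_mediums = flatten_to_strings(document, "/document/colors/*/medium")
print(color_names)
print(color_mediums)
```

In this simulation, each `medium` array becomes one JSON-formatted string per color, which mirrors the `color_mediums` values in the results above.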
144
323
145
-
*[Search indexes in Azure Cognitive Search](search-what-is-an-index.md).
324
+
## See also
146
325
147
-
*[Define field mappings in a search indexer](search-indexer-field-mappings.md).
326
+
+ [Define field mappings in a search indexer](search-indexer-field-mappings.md)