articles/search/search-indexer-field-mappings.md
98 additions & 56 deletions
Original file line number
Diff line number
Diff line change
@@ -13,27 +13,38 @@ ms.topic: conceptual
13
13
ms.custom: seodec2018
14
14
---
15
15
16
-
# Field mappings in Azure Search indexers
17
-
When using Azure Search indexers, you can occasionally find yourself in situations where your input data doesn't quite match the schema of your target index. In those cases, you can use **field mappings** to transform your data into the desired shape.
16
+
# Field mappings and transformations using Azure Search indexers
17
+
18
+
When using Azure Search indexers, you sometimes find that the input data doesn't quite match the schema of your target index. In those cases, you can use **field mappings** to reshape your data during the indexing process.
18
19
19
20
Some situations where field mappings are useful:
20
21
21
-
* Your data source has a field `_id`, but Azure Search doesn't allow field names starting with an underscore. A field mapping allows you to "rename" a field.
22
-
* You want to populate several fields in the index with the same data source data, for example because you want to apply different analyzers to those fields. Field mappings let you "fork" a data source field.
23
-
* You need to Base64 encode or decode your data. Field mappings support several **mapping functions**, including functions for Base64 encoding and decoding.
22
+
* Your data source has a field named `_id`, but Azure Search doesn't allow field names that start with an underscore. A field mapping lets you effectively rename a field.
23
+
* You want to populate several fields in the index from the same data source data. For example, you might want to apply different analyzers to those fields.
24
+
* You want to populate an index field with data from more than one data source, and the data sources each use different field names.
25
+
* You need to Base64 encode or decode your data. Field mappings support several **mapping functions**, including functions for Base64 encoding and decoding.
26
+
27
+
> [!NOTE]
28
+
> The field mapping feature of Azure Search indexers provides a simple way to map data fields to index fields, with a few options for data conversion. More complex data might require pre-processing to reshape it into a form that's easy to index.
29
+
>
30
+
> Microsoft Azure Data Factory is a powerful cloud-based solution for importing and transforming data. You can also write code to transform source data before indexing. For code examples, see [Model relational data](search-example-adventureworks-modeling.md) and [Model multilevel facets](search-example-adventureworks-multilevel-faceting.md).
31
+
>
24
32
25
-
## Setting up field mappings
26
-
You can add field mappings when creating a new indexer using the [Create Indexer](https://msdn.microsoft.com/library/azure/dn946899.aspx) API. You can manage field mappings on an indexing indexer using the [Update Indexer](https://msdn.microsoft.com/library/azure/dn946892.aspx) API.
33
+
## Set up field mappings
27
34
28
35
A field mapping consists of three parts:
29
36
30
37
1. A `sourceFieldName`, which represents a field in your data source. This property is required.
31
38
2. An optional `targetFieldName`, which represents a field in your search index. If omitted, the same name as in the data source is used.
32
39
3. An optional `mappingFunction`, which can transform your data using one of several predefined functions. The full list of functions is [below](#mappingFunctions).
33
40
34
-
Fields mappings are added to the `fieldMappings` array on the indexer definition.
41
+
Field mappings are added to the `fieldMappings` array of the indexer definition.
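
The three parts described above can be sketched as a small Python helper that builds entries for the `fieldMappings` array. This is a minimal illustration, not an official client: the helper name `field_mapping` and the indexer, data source, index, and field names are hypothetical. The `parameters` key is included on the assumption that a mapping function takes its parameters under that name.

```python
import json


def field_mapping(source, target=None, function=None, parameters=None):
    """Build one entry for the indexer's fieldMappings array (illustrative helper)."""
    mapping = {"sourceFieldName": source}  # required
    if target is not None:
        # Optional: when omitted, the source field name is used for the index field.
        mapping["targetFieldName"] = target
    if function is not None:
        # Optional: one of the predefined mapping functions.
        mapping["mappingFunction"] = {"name": function}
        if parameters is not None:
            mapping["mappingFunction"]["parameters"] = parameters
    return mapping


# Hypothetical indexer definition; all names are placeholders.
indexer = {
    "name": "example-indexer",
    "dataSourceName": "example-datasource",
    "targetIndexName": "example-index",
    "fieldMappings": [
        field_mapping("_id", "id"),             # rename a field
        field_mapping("text", "textStandard"),  # fork one source field ...
        field_mapping("text", "textEnglish"),   # ... into two index fields
    ],
}

print(json.dumps(indexer, indent=2))
```

The printed JSON has the shape of a request body for the Create Indexer or Update Indexer API.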
42
+
43
+
## Map fields using the REST API
35
44
36
-
For example, here's how you can accommodate differences in field names:
45
+
You can add field mappings when creating a new indexer using the [Create Indexer](https://docs.microsoft.com/rest/api/searchservice/create-Indexer) API request. You can manage the field mappings of an existing indexer using the [Update Indexer](https://docs.microsoft.com/rest/api/searchservice/update-indexer) API request.
46
+
47
+
For example, here's how to map a source field to a target field with a different name:
37
48
38
49
```JSON
39
50
@@ -47,7 +58,7 @@ api-key: [admin key]
47
58
}
48
59
```
49
60
50
-
An indexer can have multiple field mappings. For example, here's how you can "fork" a field:
61
+
A source field can be referenced in multiple field mappings. The following example shows how to "fork" a field, copying the same source field to two different index fields:
51
62
52
63
```JSON
53
64
@@ -62,10 +73,37 @@ An indexer can have multiple field mappings. For example, here's how you can "fo
62
73
>
63
74
>
64
75
76
+
## Map fields using the .NET SDK
77
+
78
+
You define field mappings in the .NET SDK using the [FieldMapping](https://docs.microsoft.com/dotnet/api/microsoft.azure.search.models.fieldmapping) class, which has the properties `SourceFieldName` and `TargetFieldName`, and an optional `MappingFunction` reference.
79
+
80
+
You can specify field mappings when constructing the indexer, or later by directly setting the `Indexer.FieldMappings` property.
81
+
82
+
The following C# example sets the field mappings when constructing an indexer.
A field mapping function transforms the contents of a field before it's stored in the index. The following mapping functions are currently supported:
69
107
70
108
* [base64Encode](#base64EncodeFunction)
71
109
* [base64Decode](#base64DecodeFunction)
@@ -74,68 +112,69 @@ These functions are currently supported:
74
112
75
113
<a name="base64EncodeFunction"></a>
76
114
77
-
## base64Encode
115
+
### base64Encode function
116
+
78
117
Performs *URL-safe* Base64 encoding of the input string. Assumes that the input is UTF-8 encoded.
79
118
80
-
### Sample use case - document key lookup
81
-
Only URL-safe characters can appear in an Azure Search document key (because customers must be able to address the document using the [Lookup API](https://docs.microsoft.com/rest/api/searchservice/lookup-document), for example). If your data contains URL-unsafe characters and you want to use it to populate a key field in your search index, use this function. Once the key is encoded, you can use base64 decode to retrieve the original value. For details, see the [base64 encoding and decoding](#base64details) section.
119
+
#### Example - document key lookup
82
120
83
-
#### Example
84
-
```JSON
121
+
Only URL-safe characters can appear in an Azure Search document key (because customers must be able to address the document using the [Lookup API](https://docs.microsoft.com/rest/api/searchservice/lookup-document)). If the source field for your key contains URL-unsafe characters, you can use the `base64Encode` function to convert it at indexing time.
85
122
86
-
"fieldMappings" : [
87
-
{
88
-
"sourceFieldName" : "SourceKey",
89
-
"targetFieldName" : "IndexKey",
90
-
"mappingFunction" : { "name" : "base64Encode" }
91
-
}]
92
-
```
93
-
94
-
### Sample use case - retrieve original key
95
-
You have a blob indexer that indexes blobs with the blob path metadata as the document key. After retrieving the encoded document key, you want to decode the path and download the blob.
123
+
When you retrieve the encoded key at search time, you can then use the `base64Decode` function to get the original key value, and use that to retrieve the source document.
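
If you decode a retrieved key in client code, the decoding step might look like the following Python sketch. It assumes the key was encoded with the default `useHttpServerUtilityUrlTokenEncode` setting of `true`, and that this format matches .NET's `HttpServerUtility.UrlTokenEncode`: URL-safe Base64 with the `=` padding replaced by a single digit giving the padding count. The function name `url_token_decode` is hypothetical.

```python
import base64


def url_token_decode(token: str) -> str:
    """Decode a UrlTokenEncode-style string back to the original UTF-8 text (sketch)."""
    if not token:
        return ""
    # Assumption: the final character is a digit giving the number of '=' characters removed.
    pad_count = int(token[-1])
    body = token[:-1] + "=" * pad_count
    return base64.urlsafe_b64decode(body).decode("utf-8")


print(url_token_decode("MDA-MDA_MDA1"))
```

If the indexer was configured with `useHttpServerUtilityUrlTokenEncode` set to `false`, the trailing digit is absent and this sketch doesn't apply as written.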
If you don't need to look up documents by keys and also don't need to decode the encoded content, you can just leave out `parameters` for the mapping function, which defaults `useHttpServerUtilityUrlTokenEncode` to `true`. Otherwise, see [base64 details](#base64details) section to decide which settings to use.
138
+
If you don't include a parameters property for your mapping function, it defaults to the value `{"useHttpServerUtilityUrlTokenEncode" : true}`.
139
+
140
+
Azure Search supports two different Base64 encodings. You should use the same parameters when encoding and decoding the same field. To decide which parameters to use, see [base64 encoding options](#base64details).
109
141
110
142
<a name="base64DecodeFunction"></a>
111
143
112
-
## base64Decode
113
-
Performs Base64 decoding of the input string. The input is assumed to a *URL-safe* Base64-encoded string.
144
+
### base64Decode function
145
+
146
+
Performs Base64 decoding of the input string. The input is assumed to be a *URL-safe* Base64-encoded string.
147
+
148
+
#### Example - decode blob metadata or URLs
114
149
115
-
### Sample use case
116
-
Blob custom metadata values must be ASCII-encoded. You can use Base64 encoding to represent arbitrary UTF-8 strings in blob custom metadata. However, to make search meaningful, you can use this function to turn the encoded data back into "regular" strings when populating your search index.
150
+
Your source data might contain Base64-encoded strings, such as blob metadata strings or web URLs, that you want to make searchable as plain text. You can use the `base64Decode` function to turn the encoded data back into regular strings when populating your search index.
If you don't specify any `parameters`, then the default value of `useHttpServerUtilityUrlTokenDecode` is `true`. See [base64 details](#base64details) section to decide which settings to use.
165
+
If you don't include a parameters property, it defaults to the value `{"useHttpServerUtilityUrlTokenDecode" : true}`.
166
+
167
+
Azure Search supports two different Base64 encodings. You should use the same parameters when encoding and decoding the same field. To decide which parameters to use, see [base64 encoding options](#base64details).
130
168
131
169
<a name="base64details"></a>
132
170
133
-
### Details of base64 encoding and decoding
134
-
Azure Search supports two base64 encodings: HttpServerUtility URL token and URL-safe base64 encoding without padding. You need to use the same encoding as the mapping functions if you want to encode a document key for look up, encode a value to be decoded by the indexer, or decode a field encoded by the indexer.
171
+
#### base64 encoding options
172
+
173
+
Azure Search supports two different Base64 encodings: **HttpServerUtility URL token**, and **URL-safe Base64 encoding without padding**. A string that is base64-encoded during indexing should later be decoded with the same encoding options, or else the result won't match the original.
135
174
136
-
If `useHttpServerUtilityUrlTokenEncode` or `useHttpServerUtilityUrlTokenDecode` parameters for encoding and decoding respectively are set to `true`, then `base64Encode` behaves like [HttpServerUtility.UrlTokenEncode](https://msdn.microsoft.com/library/system.web.httpserverutility.urltokenencode.aspx) and `base64Decode` behaves like [HttpServerUtility.UrlTokenDecode](https://msdn.microsoft.com/library/system.web.httpserverutility.urltokendecode.aspx).
175
+
If the `useHttpServerUtilityUrlTokenEncode` or `useHttpServerUtilityUrlTokenDecode` parameters for encoding and decoding respectively are set to `true`, then `base64Encode` behaves like [HttpServerUtility.UrlTokenEncode](https://msdn.microsoft.com/library/system.web.httpserverutility.urltokenencode.aspx) and `base64Decode` behaves like [HttpServerUtility.UrlTokenDecode](https://msdn.microsoft.com/library/system.web.httpserverutility.urltokendecode.aspx).
137
176
138
-
If you are not using the full .NET Framework (i.e., you are using .NET Core or other programming environment) to produce the key values to emulate Azure Search behavior, then you should set `useHttpServerUtilityUrlTokenEncode` and `useHttpServerUtilityUrlTokenDecode` to `false`. Depending on the library you use, the base64 encode and decode utility functions may be different from Azure Search.
177
+
If you produce key values outside the full .NET Framework (for example, with .NET Core or another framework) and need them to match Azure Search behavior, set `useHttpServerUtilityUrlTokenEncode` and `useHttpServerUtilityUrlTokenDecode` to `false`. Depending on the library you use, the base64 encoding and decoding functions might differ from the ones used by Azure Search.
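
As a rough illustration of how the two encodings differ, the following Python sketch encodes the same sample string both ways. The function names are hypothetical, and `url_token_encode` is an approximation of `HttpServerUtility.UrlTokenEncode` based on its documented behavior of replacing the `=` padding with a padding-count digit. It is not the code Azure Search runs.

```python
import base64


def url_safe_no_padding(text: str) -> str:
    """URL-safe Base64 of the UTF-8 bytes, with '=' padding removed."""
    raw = base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")
    return raw.rstrip("=")


def url_token_encode(text: str) -> str:
    """Approximation of UrlTokenEncode: padding replaced by a count digit."""
    raw = base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")
    stripped = raw.rstrip("=")
    if not stripped:
        return ""
    return stripped + str(len(raw) - len(stripped))


sample = "00>00?00"
print(url_safe_no_padding(sample))  # MDA-MDA_MDA
print(url_token_encode(sample))     # MDA-MDA_MDA1
```

Running your own library's encoder on the same sample string shows which of the two forms it produces, or whether it needs extra steps such as stripping padding.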
139
178
140
179
The following table compares different base64 encodings of the string `00>00?00`. To determine the required additional processing (if any) for your base64 functions, apply your library's encode function to the string `00>00?00` and compare the output with the expected output `MDA-MDA_MDA`.
141
180
@@ -148,19 +187,21 @@ The following table compares different base64 encodings of the string `00>00?00`
148
187
149
188
<a name="extractTokenAtPositionFunction"></a>
150
189
151
-
## extractTokenAtPosition
190
+
### extractTokenAtPosition function
191
+
152
192
Splits a string field using the specified delimiter, and picks the token at the specified position in the resulting split.
153
193
194
+
This function uses the following parameters:
195
+
196
+
* `delimiter`: a string to use as the separator when splitting the input string.
197
+
* `position`: an integer zero-based position of the token to pick after the input string is split.
198
+
154
199
For example, if the input is `Jane Doe`, the `delimiter` is `" "` (space), and the `position` is 0, the result is `Jane`; if the `position` is 1, the result is `Doe`. If the position refers to a token that doesn't exist, an error is returned.
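
The behavior described above can be approximated locally with a few lines of Python, which can help when checking delimiters and positions before configuring an indexer. This is only an emulation under stated assumptions: the function name is hypothetical, and details such as how the service treats consecutive delimiters aren't covered here.

```python
def extract_token_at_position(value: str, delimiter: str, position: int) -> str:
    """Split value on delimiter and return the token at the zero-based position."""
    tokens = value.split(delimiter)
    if position < 0 or position >= len(tokens):
        # Mirrors the documented behavior: a missing token is an error.
        raise IndexError(f"no token at position {position} in {value!r}")
    return tokens[position]


print(extract_token_at_position("Jane Doe", " ", 0))  # Jane
print(extract_token_at_position("Jane Doe", " ", 1))  # Doe
```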
155
200
156
-
### Sample use case
157
-
Your data source contains a `PersonName` field, and you want to index it as two separate `FirstName` and `LastName` fields. You can use this function to split the input using the space character as the delimiter.
201
+
#### Example - extract a name
158
202
159
-
### Parameters
160
-
* `delimiter`: a string to use as the separator when splitting the input string.
161
-
* `position`: an integer zero-based position of the token to pick after the input string is split.
203
+
Your data source contains a `PersonName` field, and you want to index it as two separate `FirstName` and `LastName` fields. You can use this function to split the input using the space character as the delimiter.
162
204
163
-
### Example
164
205
```JSON
165
206
166
207
"fieldMappings" : [
@@ -178,22 +219,23 @@ Your data source contains a `PersonName` field, and you want to index it as two
178
219
179
220
<a name="jsonArrayToStringCollectionFunction"></a>
180
221
181
-
## jsonArrayToStringCollection
222
+
### jsonArrayToStringCollection function
223
+
182
224
Transforms a string formatted as a JSON array of strings into a string array that can be used to populate a `Collection(Edm.String)` field in the index.
183
225
184
226
For example, if the input string is `["red", "white", "blue"]`, then the target field of type `Collection(Edm.String)` will be populated with the three values `red`, `white`, and `blue`. For input values that cannot be parsed as JSON string arrays, an error is returned.
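
To check ahead of time whether a source value would parse as a JSON array of strings, you could run something like the following Python sketch over sample rows. It emulates the transformation described above. The function name is hypothetical, and the service's exact validation rules may differ.

```python
import json
from typing import List


def json_array_to_string_collection(value: str) -> List[str]:
    """Parse a JSON array of strings; raise ValueError for any other input."""
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {value!r}") from exc
    if not isinstance(parsed, list) or not all(isinstance(item, str) for item in parsed):
        raise ValueError(f"not a JSON array of strings: {value!r}")
    return parsed


print(json_array_to_string_collection('["red", "white", "blue"]'))
```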
185
227
186
-
### Sample use case
187
-
Azure SQL database doesn't have a built-in data type that naturally maps to `Collection(Edm.String)` fields in Azure Search. To populate string collection fields, format your source data as a JSON string array and use this function.
228
+
#### Example - populate collection from relational data
229
+
230
+
Azure SQL Database doesn't have a built-in data type that naturally maps to `Collection(Edm.String)` fields in Azure Search. To populate string collection fields, you can pre-process your source data as a JSON string array and then use the `jsonArrayToStringCollection` mapping function.
If you have feature requests or ideas for improvements, please reach out to us on our [UserVoice site](https://feedback.azure.com/forums/263029-azure-search/).
241
+
For a detailed example that transforms relational data into index collection fields, see [Model relational data](search-example-adventureworks-modeling.md).