Commit 91c4d82

Merge pull request #79623 from RobDixon22/master
Updated the indexer Field mappings article
2 parents ef88387 + 2be4507 commit 91c4d82

File tree

1 file changed

+98
-56
lines changed

articles/search/search-indexer-field-mappings.md

Lines changed: 98 additions & 56 deletions
@@ -13,27 +13,38 @@ ms.topic: conceptual
1313
ms.custom: seodec2018
1414
---
1515

16-
# Field mappings in Azure Search indexers
17-
When using Azure Search indexers, you can occasionally find yourself in situations where your input data doesn't quite match the schema of your target index. In those cases, you can use **field mappings** to transform your data into the desired shape.
16+
# Field mappings and transformations using Azure Search indexers
17+
18+
When using Azure Search indexers, you sometimes find that the input data doesn't quite match the schema of your target index. In those cases, you can use **field mappings** to reshape your data during the indexing process.
1819

1920
Some situations where field mappings are useful:
2021

21-
* Your data source has a field `_id`, but Azure Search doesn't allow field names starting with an underscore. A field mapping allows you to "rename" a field.
22-
* You want to populate several fields in the index with the same data source data, for example because you want to apply different analyzers to those fields. Field mappings let you "fork" a data source field.
23-
* You need to Base64 encode or decode your data. Field mappings support several **mapping functions**, including functions for Base64 encoding and decoding.
22+
* Your data source has a field named `_id`, but Azure Search doesn't allow field names that start with an underscore. A field mapping lets you effectively rename a field.
23+
* You want to populate several fields in the index from the same data source data. For example, you might want to apply different analyzers to those fields.
24+
* You want to populate an index field with data from more than one data source, and the data sources each use different field names.
25+
* You need to Base64 encode or decode your data. Field mappings support several **mapping functions**, including functions for Base64 encoding and decoding.
26+
27+
> [!NOTE]
28+
> The field mapping feature of Azure Search indexers provides a simple way to map data fields to index fields, with a few options for data conversion. More complex data might require pre-processing to reshape it into a form that's easy to index.
29+
>
30+
> Microsoft Azure Data Factory is a powerful cloud-based solution for importing and transforming data. You can also write code to transform source data before indexing. For code examples, see [Model relational data](search-example-adventureworks-modeling.md) and [Model multilevel facets](search-example-adventureworks-multilevel-faceting.md).
31+
>
2432
25-
## Setting up field mappings
26-
You can add field mappings when creating a new indexer using the [Create Indexer](https://msdn.microsoft.com/library/azure/dn946899.aspx) API. You can manage field mappings on an indexing indexer using the [Update Indexer](https://msdn.microsoft.com/library/azure/dn946892.aspx) API.
33+
## Set up field mappings
2734

2835
A field mapping consists of three parts:
2936

3037
1. A `sourceFieldName`, which represents a field in your data source. This property is required.
3138
2. An optional `targetFieldName`, which represents a field in your search index. If omitted, the same name as in the data source is used.
3239
3. An optional `mappingFunction`, which can transform your data using one of several predefined functions. The full list of functions is [below](#mappingFunctions).
3340

34-
Fields mappings are added to the `fieldMappings` array on the indexer definition.
41+
Field mappings are added to the `fieldMappings` array of the indexer definition.
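The three parts described above can be sketched in Python as a small helper that builds entries for the `fieldMappings` array. This is a minimal illustration, not part of any Azure SDK: the helper name `field_mapping` and the field names `_id`, `id`, and `tags` are assumptions chosen for the example.

```python
import json


def field_mapping(source_field_name, target_field_name=None, mapping_function=None):
    """Build one entry for the indexer's fieldMappings array (hypothetical helper)."""
    mapping = {"sourceFieldName": source_field_name}  # the only required property
    if target_field_name is not None:
        # When omitted, the source field name is used for the index field.
        mapping["targetFieldName"] = target_field_name
    if mapping_function is not None:
        mapping["mappingFunction"] = mapping_function
    return mapping


# Hypothetical field names, for illustration only.
field_mappings = [
    field_mapping("_id", "id"),
    field_mapping("tags", mapping_function={"name": "jsonArrayToStringCollection"}),
]

# Fragment of an indexer definition that could be sent as part of a request body.
indexer_fragment = {"fieldMappings": field_mappings}
print(json.dumps(indexer_fragment, indent=2))
```

Leaving the optional keys out entirely, rather than setting them to `null`, mirrors the way the JSON examples in this article omit `targetFieldName` and `mappingFunction` when they aren't needed.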
42+
43+
## Map fields using the REST API
3544

36-
For example, here's how you can accommodate differences in field names:
45+
You can add field mappings when creating a new indexer using the [Create Indexer](https://docs.microsoft.com/rest/api/searchservice/create-Indexer) API request. You can manage the field mappings of an existing indexer using the [Update Indexer](https://docs.microsoft.com/rest/api/searchservice/update-indexer) API request.
46+
47+
For example, here's how to map a source field to a target field with a different name:
3748

3849
```JSON
3950

@@ -47,7 +58,7 @@ api-key: [admin key]
4758
}
4859
```
4960

50-
An indexer can have multiple field mappings. For example, here's how you can "fork" a field:
61+
A source field can be referenced in multiple field mappings. The following example shows how to "fork" a field, copying the same source field to two different index fields:
5162

5263
```JSON
5364

@@ -62,10 +73,37 @@ An indexer can have multiple field mappings. For example, here's how you can "fo
6273
>
6374
>
6475
76+
## Map fields using the .NET SDK
77+
78+
You define field mappings in the .NET SDK using the [FieldMapping](https://docs.microsoft.com/dotnet/api/microsoft.azure.search.models.fieldmapping) class, which has the properties `SourceFieldName` and `TargetFieldName`, and an optional `MappingFunction` reference.
79+
80+
You can specify field mappings when constructing the indexer, or later by directly setting the `Indexer.FieldMappings` property.
81+
82+
The following C# example sets the field mappings when constructing an indexer.
83+
84+
```csharp
85+
List<FieldMapping> map = new List<FieldMapping> {
86+
// removes a leading underscore from a field name
87+
new FieldMapping("_custId", "custId"),
88+
// Base64-encodes a field (URL-safe) for use as the index key
89+
new FieldMapping("docPath", "docId", FieldMappingFunction.Base64Encode() )
90+
};
91+
92+
Indexer sqlIndexer = new Indexer(
93+
name: "azure-sql-indexer",
94+
dataSourceName: sqlDataSource.Name,
95+
targetIndexName: index.Name,
96+
fieldMappings: map,
97+
schedule: new IndexingSchedule(TimeSpan.FromDays(1)));
98+
99+
await searchService.Indexers.CreateOrUpdateAsync(sqlIndexer);
100+
```
101+
65102
<a name="mappingFunctions"></a>
66103

67104
## Field mapping functions
68-
These functions are currently supported:
105+
106+
A field mapping function transforms the contents of a field before it's stored in the index. The following mapping functions are currently supported:
69107

70108
* [base64Encode](#base64EncodeFunction)
71109
* [base64Decode](#base64DecodeFunction)
@@ -74,68 +112,69 @@ These functions are currently supported:
74112

75113
<a name="base64EncodeFunction"></a>
76114

77-
## base64Encode
115+
### base64Encode function
116+
78117
Performs *URL-safe* Base64 encoding of the input string. Assumes that the input is UTF-8 encoded.
79118

80-
### Sample use case - document key lookup
81-
Only URL-safe characters can appear in an Azure Search document key (because customers must be able to address the document using the [Lookup API](https://docs.microsoft.com/rest/api/searchservice/lookup-document), for example). If your data contains URL-unsafe characters and you want to use it to populate a key field in your search index, use this function. Once the key is encoded, you can use base64 decode to retrieve the original value. For details, see the [base64 encoding and decoding](#base64details) section.
119+
#### Example - document key lookup
82120

83-
#### Example
84-
```JSON
121+
Only URL-safe characters can appear in an Azure Search document key (because customers must be able to address the document using the [Lookup API](https://docs.microsoft.com/rest/api/searchservice/lookup-document)). If the source field for your key contains URL-unsafe characters, you can use the `base64Encode` function to convert it at indexing time.
85122

86-
"fieldMappings" : [
87-
{
88-
"sourceFieldName" : "SourceKey",
89-
"targetFieldName" : "IndexKey",
90-
"mappingFunction" : { "name" : "base64Encode" }
91-
}]
92-
```
93-
94-
### Sample use case - retrieve original key
95-
You have a blob indexer that indexes blobs with the blob path metadata as the document key. After retrieving the encoded document key, you want to decode the path and download the blob.
123+
When you retrieve the encoded key at search time, you can Base64-decode it in your application code, using the same encoding options, to get the original key value, and use that to retrieve the source document.
96124

97-
#### Example
98125
```JSON
99126

100127
"fieldMappings" : [
101128
{
102129
"sourceFieldName" : "SourceKey",
103130
"targetFieldName" : "IndexKey",
104-
"mappingFunction" : { "name" : "base64Encode", "parameters" : { "useHttpServerUtilityUrlTokenEncode" : false } }
131+
"mappingFunction" : {
132+
"name" : "base64Encode",
133+
"parameters" : { "useHttpServerUtilityUrlTokenEncode" : false }
134+
}
105135
}]
106136
```
107137

108-
If you don't need to look up documents by keys and also don't need to decode the encoded content, you can just leave out `parameters` for the mapping function, which defaults `useHttpServerUtilityUrlTokenEncode` to `true`. Otherwise, see [base64 details](#base64details) section to decide which settings to use.
138+
If you don't include a parameters property for your mapping function, it defaults to the value `{"useHttpServerUtilityUrlTokenEncode" : true}`.
139+
140+
Azure Search supports two different Base64 encodings. You should use the same parameters when encoding and decoding the same field. To decide which parameters to use, see [base64 encoding options](#base64details).
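As a rough client-side illustration of the key round trip, the following Python sketch encodes and decodes a key using URL-safe Base64 without padding, which is the variant this article associates with `useHttpServerUtilityUrlTokenEncode` set to `false`. The function names `encode_key` and `decode_key` and the sample blob path are hypothetical, and the sketch assumes the indexer produces exactly this unpadded form.

```python
import base64


def encode_key(value: str) -> str:
    """URL-safe Base64 of the UTF-8 bytes, with trailing '=' padding removed."""
    encoded = base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii")
    return encoded.rstrip("=")


def decode_key(encoded: str) -> str:
    """Restore the padding that was stripped, then decode back to the original string."""
    padding = "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(encoded + padding).decode("utf-8")


# Hypothetical blob path used as a document key source.
original = "https://example.blob.core.windows.net/docs/annual report.pdf"
key = encode_key(original)
print(key)
print(decode_key(key) == original)
```

Decoding needs the padding restored because Python's `base64.urlsafe_b64decode` rejects input whose length isn't a multiple of four.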
109141

110142
<a name="base64DecodeFunction"></a>
111143

112-
## base64Decode
113-
Performs Base64 decoding of the input string. The input is assumed to a *URL-safe* Base64-encoded string.
144+
### base64Decode function
145+
146+
Performs Base64 decoding of the input string. The input is assumed to be a *URL-safe* Base64-encoded string.
147+
148+
#### Example - decode blob metadata or URLs
114149

115-
### Sample use case
116-
Blob custom metadata values must be ASCII-encoded. You can use Base64 encoding to represent arbitrary UTF-8 strings in blob custom metadata. However, to make search meaningful, you can use this function to turn the encoded data back into "regular" strings when populating your search index.
150+
Your source data might contain Base64-encoded strings, such as blob metadata strings or web URLs, that you want to make searchable as plain text. You can use the `base64Decode` function to turn the encoded data back into regular strings when populating your search index.
117151

118-
#### Example
119152
```JSON
120153

121154
"fieldMappings" : [
122155
{
123156
"sourceFieldName" : "Base64EncodedMetadata",
124157
"targetFieldName" : "SearchableMetadata",
125-
"mappingFunction" : { "name" : "base64Decode", "parameters" : { "useHttpServerUtilityUrlTokenDecode" : false } }
158+
"mappingFunction" : {
159+
"name" : "base64Decode",
160+
"parameters" : { "useHttpServerUtilityUrlTokenDecode" : false }
161+
}
126162
}]
127163
```
128164

129-
If you don't specify any `parameters`, then the default value of `useHttpServerUtilityUrlTokenDecode` is `true`. See [base64 details](#base64details) section to decide which settings to use.
165+
If you don't include a `parameters` property, it defaults to the value `{"useHttpServerUtilityUrlTokenDecode" : true}`.
166+
167+
Azure Search supports two different Base64 encodings. You should use the same parameters when encoding and decoding the same field. To decide which parameters to use, see [base64 encoding options](#base64details).
130168

131169
<a name="base64details"></a>
132170

133-
### Details of base64 encoding and decoding
134-
Azure Search supports two base64 encodings: HttpServerUtility URL token and URL-safe base64 encoding without padding. You need to use the same encoding as the mapping functions if you want to encode a document key for look up, encode a value to be decoded by the indexer, or decode a field encoded by the indexer.
171+
#### base64 encoding options
172+
173+
Azure Search supports two different Base64 encodings: **HttpServerUtility URL token**, and **URL-safe Base64 encoding without padding**. A string that is base64-encoded during indexing should later be decoded with the same encoding options, or else the result won't match the original.
135174

136-
If `useHttpServerUtilityUrlTokenEncode` or `useHttpServerUtilityUrlTokenDecode` parameters for encoding and decoding respectively are set to `true`, then `base64Encode` behaves like [HttpServerUtility.UrlTokenEncode](https://msdn.microsoft.com/library/system.web.httpserverutility.urltokenencode.aspx) and `base64Decode` behaves like [HttpServerUtility.UrlTokenDecode](https://msdn.microsoft.com/library/system.web.httpserverutility.urltokendecode.aspx).
175+
If the `useHttpServerUtilityUrlTokenEncode` or `useHttpServerUtilityUrlTokenDecode` parameters for encoding and decoding respectively are set to `true`, then `base64Encode` behaves like [HttpServerUtility.UrlTokenEncode](https://msdn.microsoft.com/library/system.web.httpserverutility.urltokenencode.aspx) and `base64Decode` behaves like [HttpServerUtility.UrlTokenDecode](https://msdn.microsoft.com/library/system.web.httpserverutility.urltokendecode.aspx).
137176

138-
If you are not using the full .NET Framework (i.e., you are using .NET Core or other programming environment) to produce the key values to emulate Azure Search behavior, then you should set `useHttpServerUtilityUrlTokenEncode` and `useHttpServerUtilityUrlTokenDecode` to `false`. Depending on the library you use, the base64 encode and decode utility functions may be different from Azure Search.
177+
If you are not using the full .NET Framework (that is, you are using .NET Core or another framework) to produce the key values to emulate Azure Search behavior, then you should set `useHttpServerUtilityUrlTokenEncode` and `useHttpServerUtilityUrlTokenDecode` to `false`. Depending on the library you use, the base64 encoding and decoding functions might differ from the ones used by Azure Search.
139178

140179
The following table compares different base64 encodings of the string `00>00?00`. To determine the required additional processing (if any) for your base64 functions, apply your library encode function on the string `00>00?00` and compare the output with the expected output `MDA-MDA_MDA`.
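The check described above can be run in any language. Here is a minimal Python sketch that produces several Base64 variants of `00>00?00` so you can compare them with the expected output. The function name `base64_variants` is hypothetical, and the `url_token` entry is an emulation of the .NET `HttpServerUtility.UrlTokenEncode` format (padding characters replaced by a digit giving their count), written from its documented behavior rather than taken from Azure Search itself.

```python
import base64


def base64_variants(text: str) -> dict:
    """Return several Base64 encodings of the UTF-8 bytes of text."""
    raw = text.encode("utf-8")
    standard = base64.b64encode(raw).decode("ascii")          # uses '+' and '/'
    url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # uses '-' and '_'
    unpadded = url_safe.rstrip("=")
    pad_count = len(url_safe) - len(unpadded)
    # Emulated UrlTokenEncode: unpadded URL-safe text plus the padding count.
    url_token = unpadded + str(pad_count) if unpadded else ""
    return {
        "standard_with_padding": standard,
        "url_safe_with_padding": url_safe,
        "url_safe_no_padding": unpadded,
        "url_token": url_token,
    }


for name, value in base64_variants("00>00?00").items():
    print(f"{name}: {value}")
```

If your library's output matches the `url_safe_no_padding` value, no extra processing should be needed when the `useHttpServerUtilityUrlToken*` parameters are set to `false`.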
141180

@@ -148,19 +187,21 @@ The following table compares different base64 encodings of the string `00>00?00`
148187

149188
<a name="extractTokenAtPositionFunction"></a>
150189

151-
## extractTokenAtPosition
190+
### extractTokenAtPosition function
191+
152192
Splits a string field using the specified delimiter, and picks the token at the specified position in the resulting split.
153193

194+
This function uses the following parameters:
195+
196+
* `delimiter`: a string to use as the separator when splitting the input string.
197+
* `position`: an integer zero-based position of the token to pick after the input string is split.
198+
154199
For example, if the input is `Jane Doe`, the `delimiter` is `" "` (space), and the `position` is 0, the result is `Jane`; if the `position` is 1, the result is `Doe`. If the position refers to a token that doesn't exist, an error is returned.
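The behavior described above can be approximated locally with a short Python sketch, which is handy for checking a delimiter and position against sample data before configuring an indexer. The function name `extract_token_at_position` is hypothetical, and edge cases such as consecutive delimiters are an assumption here; the service might handle them differently.

```python
def extract_token_at_position(value: str, delimiter: str, position: int) -> str:
    """Split value on delimiter and return the token at the zero-based position."""
    tokens = value.split(delimiter)
    if position < 0 or position >= len(tokens):
        # Mirrors the documented behavior: a missing token is an error.
        raise ValueError(f"No token at position {position} in {value!r}")
    return tokens[position]


print(extract_token_at_position("Jane Doe", " ", 0))
print(extract_token_at_position("Jane Doe", " ", 1))
```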
155200

156-
### Sample use case
157-
Your data source contains a `PersonName` field, and you want to index it as two separate `FirstName` and `LastName` fields. You can use this function to split the input using the space character as the delimiter.
201+
#### Example - extract a name
158202

159-
### Parameters
160-
* `delimiter`: a string to use as the separator when splitting the input string.
161-
* `position`: an integer zero-based position of the token to pick after the input string is split.
203+
Your data source contains a `PersonName` field, and you want to index it as two separate `FirstName` and `LastName` fields. You can use this function to split the input using the space character as the delimiter.
162204

163-
### Example
164205
```JSON
165206

166207
"fieldMappings" : [
@@ -178,22 +219,23 @@ Your data source contains a `PersonName` field, and you want to index it as two
178219

179220
<a name="jsonArrayToStringCollectionFunction"></a>
180221

181-
## jsonArrayToStringCollection
222+
### jsonArrayToStringCollection function
223+
182224
Transforms a string formatted as a JSON array of strings into a string array that can be used to populate a `Collection(Edm.String)` field in the index.
183225

184226
For example, if the input string is `["red", "white", "blue"]`, then the target field of type `Collection(Edm.String)` will be populated with the three values `red`, `white`, and `blue`. For input values that cannot be parsed as JSON string arrays, an error is returned.
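To validate source values before indexing, the transformation can be emulated with a few lines of Python. This is a sketch under the assumption that only a JSON array whose elements are all strings is acceptable; the function name `json_array_to_string_collection` is hypothetical and not part of any SDK.

```python
import json


def json_array_to_string_collection(value: str) -> list:
    """Parse a JSON array of strings; raise ValueError for any other input."""
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Not valid JSON: {value!r}") from exc
    if not isinstance(parsed, list) or not all(isinstance(item, str) for item in parsed):
        raise ValueError(f"Not a JSON array of strings: {value!r}")
    return parsed


print(json_array_to_string_collection('["red", "white", "blue"]'))
```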
185227

186-
### Sample use case
187-
Azure SQL database doesn't have a built-in data type that naturally maps to `Collection(Edm.String)` fields in Azure Search. To populate string collection fields, format your source data as a JSON string array and use this function.
228+
#### Example - populate collection from relational data
229+
230+
Azure SQL Database doesn't have a built-in data type that naturally maps to `Collection(Edm.String)` fields in Azure Search. To populate string collection fields, you can pre-process your source data as a JSON string array and then use the `jsonArrayToStringCollection` mapping function.
188231

189-
### Example
190232
```JSON
191233

192234
"fieldMappings" : [
193-
{ "sourceFieldName" : "tags", "mappingFunction" : { "name" : "jsonArrayToStringCollection" } }
194-
]
235+
{
236+
"sourceFieldName" : "tags",
237+
"mappingFunction" : { "name" : "jsonArrayToStringCollection" }
238+
}]
195239
```
196240

197-
198-
## Help us make Azure Search better
199-
If you have feature requests or ideas for improvements, please reach out to us on our [UserVoice site](https://feedback.azure.com/forums/263029-azure-search/).
241+
For a detailed example that transforms relational data into index collection fields, see [Model relational data](search-example-adventureworks-modeling.md).
