Commit 9da48d8 ("checkpoint")

1 parent bed7c5f

6 files changed (+64, -93 lines)

articles/search/search-file-storage-integration.md

Lines changed: 2 additions & 2 deletions

@@ -7,7 +7,7 @@ author: mattmsft
 ms.author: magottei
 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 01/17/2022
+ms.date: 01/19/2022
 ---

 # Index data from Azure Files
@@ -54,7 +54,7 @@ The data source definition specifies the data source type, content path, and how
 1. Set "container" to the root file share, and use "query" to specify any subfolders.

-   A data source definition can also include additional properties for [soft deletion policies](#soft-delete-using-custom-metadata) and [field mappings](search-indexer-field-mappings.md) if field names and types are not the same.
+   A data source definition can also include [soft deletion policies](search-howto-index-changed-deleted-blobs.md), if you want the indexer to delete a search document when the source document is flagged for deletion.

 <a name="Credentials"></a>

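The "container" and "query" settings this hunk describes can be sketched as a minimal data source definition. This is a hedged illustration only: the share, subfolder, and connection-string values are hypothetical placeholders.

```json
{
  "name" : "my-file-datasource",
  "type" : "azurefile",
  "credentials" : { "connectionString" : "<your storage connection string>" },
  "container" : { "name" : "my-file-share", "query" : "my-subfolder" }
}
```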
articles/search/search-howto-index-azure-data-lake-storage.md

Lines changed: 8 additions & 6 deletions

@@ -9,7 +9,7 @@ manager: nitinme
 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 01/17/2022
+ms.date: 01/19/2022
 ---

 # Index data from Azure Data Lake Storage Gen2
@@ -67,7 +67,7 @@ The data source definition specifies the data source type, content path, and how
 1. Set `"container"` to the blob container, and use "query" to specify any subfolders.

-   A data source definition can also include properties for [soft deletion policies](search-howto-index-changed-deleted-blobs.md) and [field mappings](search-indexer-field-mappings.md) if field names and types are not the same or need to be forked.
+   A data source definition can also include [soft deletion policies](search-howto-index-changed-deleted-blobs.md), if you want the indexer to delete a search document when the source document is flagged for deletion.

 <a name="Credentials"></a>

@@ -166,6 +166,8 @@ Indexer configuration specifies the inputs, parameters, and properties controlli
 1. See [Create an indexer](search-howto-create-indexers.md) for more information about other properties.

+   For the full list of parameter descriptions, see [Blob configuration parameters](/rest/api/searchservice/create-indexer#blob-configuration-parameters) in the REST API.
+
 ### How to make an encoded field "searchable"

 There are times when you need to use an encoded version of a field like `metadata_storage_path` as the key, but also need that field to be searchable (without encoding) in the search index. To support both use cases, you can map `metadata_storage_path` to two fields; one for the key (encoded), and a second for a path field that we can assume is attributed as "searchable" in the index schema. The example below shows two field mappings for `metadata_storage_path`.
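The blob configuration parameters referenced earlier in this hunk sit under the indexer's "parameters" property. As a hedged sketch, with illustrative values, such a block might look like:

```json
"parameters" : {
  "batchSize" : 10,
  "maxFailedItems" : 5,
  "configuration" : {
    "parsingMode" : "default",
    "dataToExtract" : "contentAndMetadata",
    "indexedFileNameExtensions" : ".pdf,.docx"
  }
}
```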
@@ -235,7 +237,7 @@ User-specified metadata properties are extracted verbatim. To receive the values
 Standard blob metadata properties can be extracted into similarly named and typed fields, as listed below. The blob indexer automatically creates internal field mappings for these blob metadata properties, converting the original hyphenated name ("metadata-storage-name") to an underscored equivalent name ("metadata_storage_name").

-You still have to add the underscored fields to the index definition, but you can omit creating field mappings in the indexer because the indexer will recognize the counterpart automatically.
+You still have to add the underscored fields to the index definition, but you can omit field mappings because the indexer will make the association automatically.

 + **metadata_storage_name** (`Edm.String`) - the file name of the blob. For example, if you have a blob /my-container/my-folder/subfolder/resume.pdf, the value of this field is `resume.pdf`.
@@ -282,21 +284,21 @@ The indexer configuration parameters apply to all blobs in the container or fold
 | `AzureSearch_Skip` |`"true"` |Instructs the blob indexer to completely skip the blob. Neither metadata nor content extraction is attempted. This is useful when a particular blob fails repeatedly and interrupts the indexing process. |
 | `AzureSearch_SkipContent` |`"true"` |This is equivalent of `"dataToExtract" : "allMetadata"` setting described [above](#PartsOfBlobToIndex) scoped to a particular blob. |

-## Index large datasets
+## How to index large datasets

 Indexing blobs can be a time-consuming process. In cases where you have millions of blobs to index, you can speed up indexing by partitioning your data and using multiple indexers to [process the data in parallel](search-howto-large-index.md#parallel-indexing).

 1. Partition your data into multiple blob containers or virtual folders.

-1. Set up several data sources, one per container or folder. Use the `query` parameter to specify the partition: `"container" : { "name" : "my-container", "query" : "my-folder" }`.
+1. Set up several data sources, one per container or folder. Use the "query" parameter to specify the partition: `"container" : { "name" : "my-container", "query" : "my-folder" }`.

 1. Create one indexer for each data source. Point them to the same target index.

 Make sure you have sufficient capacity. One search unit in your service can run one indexer at any given time. Partitioning data and creating multiple indexers is only useful if they can run in parallel.

 <a name="DealingWithErrors"></a>

-## Configure the response to errors
+## Handle errors

 Errors that commonly occur during indexing include unsupported content types, missing content, or oversized blobs.

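The partitioning steps in the hunk above can be sketched as two data sources over one container, each scoped to a different virtual folder. Names are hypothetical, and each data source would also need credentials:

```json
[
  { "name" : "ds-1", "type" : "azureblob", "container" : { "name" : "my-container", "query" : "folder-a" } },
  { "name" : "ds-2", "type" : "azureblob", "container" : { "name" : "my-container", "query" : "folder-b" } }
]
```

One indexer per data source, all targeting the same index, lets the partitions be processed in parallel, capacity permitting.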
articles/search/search-howto-index-changed-deleted-blobs.md

Lines changed: 3 additions & 3 deletions

@@ -1,15 +1,15 @@
 ---
 title: Changed and deleted blobs
 titleSuffix: Azure Cognitive Search
-description: Indexers that index from Azure Storage can pick up new and changed content automaticaly. To automate deletion detection, follow the strategies described in this article.
+description: Indexers that index from Azure Storage can pick up new and changed content automatically. To automate deletion detection, follow the strategies described in this article.

 author: gmndrg
 ms.author: gimondra
 manager: nitinme

 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 01/18/2022
+ms.date: 01/19/2022
 ---

 # Change and delete detection using indexers for Azure Storage in Azure Cognitive Search
@@ -108,7 +108,7 @@ You can reverse a soft-delete if the original source file still physically exist
 1. Change the `"softDeleteMarkerValue" : "false"` on the blob or file in Azure Storage.

-1. Check the blob or file's `LastModified` timestamp to make it is newer than the last indexer run. You can force an update to the current date and time by resaving the existing metadata.
+1. Check the blob or file's `LastModified` timestamp to make sure it is newer than the last indexer run. You can force an update to the current date and time by re-saving the existing metadata.

 1. Run the indexer.

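The soft-delete reversal steps above assume the data source carries a soft-delete detection policy. As a hedged sketch (the column name is hypothetical), that policy on the data source definition looks like:

```json
"dataDeletionDetectionPolicy" : {
  "@odata.type" : "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
  "softDeleteColumnName" : "IsDeleted",
  "softDeleteMarkerValue" : "true"
}
```

Flipping the marker value back and refreshing `LastModified`, as the steps describe, makes the next indexer run treat the document as live again.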
articles/search/search-howto-indexing-azure-blob-storage.md

Lines changed: 7 additions & 7 deletions

@@ -9,7 +9,7 @@ manager: nitinme
 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 01/17/2022
+ms.date: 01/19/2022
 ---

 # Index data from Azure Blob Storage
@@ -61,7 +61,7 @@ The data source definition specifies the data source type, content path, and how
 1. Set "container" to the blob container, and use "query" to specify any subfolders.

-   A data source definition can also include properties for [soft deletion policies](search-howto-index-changed-deleted-blobs.md) and [field mappings](search-indexer-field-mappings.md) if field names and types are not the same or need to be forked.
+   A data source definition can also include [soft deletion policies](search-howto-index-changed-deleted-blobs.md), if you want the indexer to delete a search document when the source document is flagged for deletion.

 <a name="credentials"></a>


@@ -197,15 +197,15 @@ Textual content of a document is extracted into a string field named "content".
197197

198198
<a name="indexing-blob-metadata"></a>
199199

200-
## Indexing blob metadata
200+
### Indexing blob metadata
201201

202202
Blob metadata can also be indexed, and that's helpful if you think any of the standard or custom metadata properties will be useful in filters and queries.
203203

204204
User-specified metadata properties are extracted verbatim. To receive the values, you must define field in the search index of type `Edm.String`, with same name as the metadata key of the blob. For example, if a blob has a metadata key of `Sensitivity` with value `High`, you should define a field named `Sensitivity` in your search index and it will be populated with the value `High`.
205205

206206
Standard blob metadata properties can be extracted into similarly named and typed fields, as listed below. The blob indexer automatically creates internal field mappings for these blob metadata properties, converting the original hyphenated name ("metadata-storage-name") to an underscored equivalent name ("metadata_storage_name").
207207

208-
You still have to add the underscored fields to the index definition, but you can omit creating field mappings in the indexer because the indexer will recognize the counterpart automatically.
208+
You still have to add the underscored fields to the index definition, but you can omit field mappings because the indexer will make the association automatically.
209209

210210
+ **metadata_storage_name** (`Edm.String`) - the file name of the blob. For example, if you have a blob /my-container/my-folder/subfolder/resume.pdf, the value of this field is `resume.pdf`.
211211

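For the `Sensitivity` metadata example in the hunk above, the matching index field might be defined as follows. The attribute choices are illustrative, not prescribed by the source:

```json
{ "name" : "Sensitivity", "type" : "Edm.String", "searchable" : true, "filterable" : true }
```

A blob whose metadata includes `Sensitivity: High` would then populate this field with `High`, making it usable in filters such as `$filter=Sensitivity eq 'High'`.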
@@ -252,21 +252,21 @@ The indexer configuration parameters apply to all blobs in the container or fold
 | "AzureSearch_Skip" |`"true"` |Instructs the blob indexer to completely skip the blob. Neither metadata nor content extraction is attempted. This is useful when a particular blob fails repeatedly and interrupts the indexing process. |
 | "AzureSearch_SkipContent" |`"true"` |This is equivalent of "dataToExtract" : "allMetadata" setting described [above](#PartsOfBlobToIndex) scoped to a particular blob. |

-## Index large datasets
+## How to index large datasets

 Indexing blobs can be a time-consuming process. In cases where you have millions of blobs to index, you can speed up indexing by partitioning your data and using multiple indexers to [process the data in parallel](search-howto-large-index.md#parallel-indexing).

 1. Partition your data into multiple blob containers or virtual folders.

-1. Set up several data sources, one per container or folder. Use the `query` parameter to specify the partition: `"container" : { "name" : "my-container", "query" : "my-folder" }`.
+1. Set up several data sources, one per container or folder. Use the "query" parameter to specify the partition: `"container" : { "name" : "my-container", "query" : "my-folder" }`.

 1. Create one indexer for each data source. Point them to the same target index.

 Make sure you have sufficient capacity. One search unit in your service can run one indexer at any given time. Partitioning data and creating multiple indexers is only useful if they can run in parallel.

 <a name="DealingWithErrors"></a>

-## Configure the response to errors
+## Handle errors

 Errors that commonly occur during indexing include unsupported content types, missing content, or oversized blobs.

articles/search/search-howto-indexing-azure-tables.md

Lines changed: 2 additions & 2 deletions

@@ -9,7 +9,7 @@ ms.author: magottei
 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 01/17/2022
+ms.date: 01/19/2022
 ---

 # Index data from Azure Table Storage
@@ -47,7 +47,7 @@ The data source definition specifies the data source type, content path, and how
 1. Optionally, set "query" to a filter on PartitionKey. This is a best practice that improves performance. If "query" is specified any other way, the indexer will execute a full table scan, resulting in poor performance if the tables are large.

-   A data source definition can also include additional properties for [soft deletion policies](#soft-delete-using-custom-metadata) and [field mappings](search-indexer-field-mappings.md) if field names and types are not the same.
+   A data source definition can also include [soft deletion policies](search-howto-index-changed-deleted-blobs.md), if you want the indexer to delete a search document when the source document is flagged for deletion.

 <a name="Credentials"></a>

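The PartitionKey guidance in the hunk above can be sketched in a table data source definition. The table name, partition value, and connection string are hypothetical:

```json
{
  "name" : "my-table-datasource",
  "type" : "azuretable",
  "credentials" : { "connectionString" : "<your storage connection string>" },
  "container" : { "name" : "my-table", "query" : "PartitionKey eq '2022-01'" }
}
```

Filtering on PartitionKey lets the indexer read a single partition instead of scanning the whole table.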
articles/search/search-indexer-field-mappings.md

Lines changed: 42 additions & 73 deletions

@@ -9,16 +9,16 @@ ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 10/19/2021
+ms.date: 01/19/2022
 ---

 # Field mappings and transformations using Azure Cognitive Search indexers

 ![Indexer Stages](./media/search-indexer-field-mappings/indexer-stages-field-mappings.png "indexer stages")

-When using Azure Cognitive Search indexers, the indexer will automatically map fields in a data source to fields in a target index, assuming field names and types are compatible. In some cases, input data doesn't quite match the schema of your target index. One solution is to use *field mappings* to specifically set the data path during the indexing process.
+When using Azure Cognitive Search indexers, the indexer will automatically map fields in a data source to fields in a target index, assuming field names and types are compatible. When input data doesn't quite match the schema of your target index, you can define *field mappings* to specifically set the data path.

-Field mappings can be used to address the following scenarios:
+Field mappings address the following scenarios:

 + Mismatched field names. Suppose your data source has a field named `_id`. Given that Azure Cognitive Search doesn't allow field names that start with an underscore, a field mapping lets you effectively rename a field.

@@ -122,23 +122,48 @@ A field mapping function transforms the contents of a field before it's stored i
 Performs *URL-safe* Base64 encoding of the input string. Assumes that the input is UTF-8 encoded.

-#### Example - document key lookup
+#### Example: Base-encoding a document key

-Only URL-safe characters can appear in an Azure Cognitive Search document key (so that you can address the document using the [Lookup API](/rest/api/searchservice/lookup-document)). If the source field for your key contains URL-unsafe characters, you can use the `base64Encode` function to convert it at indexing time. However, a document key (both before and after conversion) can't be longer than 1,024 characters.
+Only URL-safe characters can appear in an Azure Cognitive Search document key (so that you can address the document using the [Lookup API](/rest/api/searchservice/lookup-document)). If the source field for your key contains URL-unsafe characters, such as `-` and `\`, use the `base64Encode` function to convert it at indexing time.

-When you retrieve the encoded key at search time, use the `base64Decode` function to get the original key value, and use that to retrieve the source document.
+The following example specifies the base64Encode function on "metadata_storage_name" to handle unsupported characters.

-```JSON
-"fieldMappings" : [
-  {
-    "sourceFieldName" : "SourceKey",
-    "targetFieldName" : "IndexKey",
-    "mappingFunction" : {
-      "name" : "base64Encode",
-      "parameters" : { "useHttpServerUtilityUrlTokenEncode" : false }
-    }
-  }]
-```
+```http
+PUT /indexers?api-version=2020-06-30
+{
+  "dataSourceName" : "my-blob-datasource ",
+  "targetIndexName" : "my-search-index",
+  "fieldMappings" : [
+    {
+      "sourceFieldName" : "metadata_storage_name",
+      "targetFieldName" : "key",
+      "mappingFunction" : {
+        "name" : "base64Encode",
+        "parameters" : { "useHttpServerUtilityUrlTokenEncode" : false }
+      }
+    }
+  ]
+}
+```
+
+A document key (both before and after conversion) can't be longer than 1,024 characters. When you retrieve the encoded key at search time, use the `base64Decode` function to get the original key value, and use that to retrieve the source document.
+
+#### Example: Make a base-encoded field "searchable"
+
+There are times when you need to use an encoded version of a field like "metadata_storage_path" as the key, but also need an un-encoded version for full text search. To support both scenarios, you can map "metadata_storage_path" to two fields: one for the key (encoded), and a second for a path field that we can assume is attributed as "searchable" in the index schema.
+
+```http
+PUT /indexers/blob-indexer?api-version=2020-06-30
+{
+  "dataSourceName" : " blob-datasource ",
+  "targetIndexName" : "my-target-index",
+  "schedule" : { "interval" : "PT2H" },
+  "fieldMappings" : [
+    { "sourceFieldName" : "metadata_storage_path", "targetFieldName" : "key", "mappingFunction" : { "name" : "base64Encode" } },
+    { "sourceFieldName" : "metadata_storage_path", "targetFieldName" : "path" }
+  ]
+}
+```

 #### Example - preserve original values

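The encode/decode round trip that the `base64Encode` and `base64Decode` mapping functions perform can be illustrated outside the service. This is a hedged Python sketch of URL-safe Base64 with padding stripped; it approximates, rather than reproduces, the exact Azure mapping-function behavior:

```python
import base64

def encode_key(value: str) -> str:
    # URL-safe Base64 of the UTF-8 bytes, with "=" padding stripped so the
    # result contains only URL-safe characters (approximation of base64Encode).
    return base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii").rstrip("=")

def decode_key(token: str) -> str:
    # Restore the padding that encode_key stripped, then decode
    # (approximation of base64Decode).
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")

# A path with "/" is not a valid document key; its encoded form is.
key = encode_key("my-folder/resume.pdf")
assert decode_key(key) == "my-folder/resume.pdf"
```

Retrieving the document by key and decoding back to the original path mirrors the search-time flow described above.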
@@ -329,60 +354,4 @@ When facing errors complaining about document key being longer than 1024 characters
     "name" : "fixedLengthEncode"
   }
 }]
-```
-
-<!--
-
-### Example: Base-encoding metadata_storage_name
-
-The following example demonstrates "metadata_storage_name" as the document key. Assume the index has a key field named "key" and another field named "fileSize" for storing the document size. [Field mappings](search-indexer-field-mappings.md) in the indexer definition establish field associations, and "metadata_storage_name" has the [base64Encode field mapping function](search-indexer-field-mappings.md#base64EncodeFunction) to handle unsupported characters.
-
-```http
-POST https://[service name].search.windows.net/indexers?api-version=2020-06-30
-{
-  "name" : "my-blob-indexer",
-  "dataSourceName" : "my-blob-datasource ",
-  "targetIndexName" : "my-search-index",
-  "fieldMappings" : [
-    { "sourceFieldName" : "metadata_storage_name", "targetFieldName" : "key", "mappingFunction" : { "name" : "base64Encode" } },
-    { "sourceFieldName" : "metadata_storage_size", "targetFieldName" : "fileSize" }
-  ]
-}
-```
-
-### Example: How to make an encoded field "searchable"
-
-There are times when you need to use an encoded version of a field like "metadata_storage_path" as the key, but also need that field to be searchable (without encoding) in the search index. To support both use cases, you can map "metadata_storage_path" to two fields; one for the key (encoded), and a second for a path field that we can assume is attributed as "searchable" in the index schema. The example below shows two field mappings for "metadata_storage_path".
-
-```http
-PUT /indexers/blob-indexer?api-version=2020-06-30
-{
-  "dataSourceName" : " blob-datasource ",
-  "targetIndexName" : "my-target-index",
-  "schedule" : { "interval" : "PT2H" },
-  "fieldMappings" : [
-    { "sourceFieldName" : "metadata_storage_path", "targetFieldName" : "key", "mappingFunction" : { "name" : "base64Encode" } },
-    { "sourceFieldName" : "metadata_storage_path", "targetFieldName" : "path" }
-  ]
-}
-``` -->
-
-<!-- ### Example
-
-The following example demonstrates `metadata_storage_name` as the document key. Assume the index has a key field named `key` and another field named `fileSize` for storing the document size. [Field mappings](search-indexer-field-mappings.md) in the indexer definition establish field associations, and `metadata_storage_name` has the [`base64Encode` field mapping function](search-indexer-field-mappings.md#base64EncodeFunction) to handle unsupported characters.
-
-```http
-PUT https://[service name].search.windows.net/indexers/adlsgen2-indexer?api-version=2020-06-30
-Content-Type: application/json
-api-key: [admin key]
-
-{
-  "dataSourceName" : "adlsgen2-datasource",
-  "targetIndexName" : "my-target-index",
-  "schedule" : { "interval" : "PT2H" },
-  "fieldMappings" : [
-    { "sourceFieldName" : "metadata_storage_name", "targetFieldName" : "key", "mappingFunction" : { "name" : "base64Encode" } },
-    { "sourceFieldName" : "metadata_storage_size", "targetFieldName" : "fileSize" }
-  ]
-}
-``` -->
+```
