Commit 8210cf5

Merge pull request #220682 from HeidiSteen/heidist-fix
edits for readability
2 parents 74949a1 + 1c5aefe · commit 8210cf5

File tree

3 files changed (+38 lines, -28 lines)


articles/search/search-howto-create-indexers.md

Lines changed: 18 additions & 14 deletions
@@ -15,19 +15,21 @@ ms.date: 05/11/2022

  # Creating indexers in Azure Cognitive Search

- A search indexer connects to an external data source, retrieves and processes data, and then passes it to the search engine for indexing. Indexers support two workflows:
+ An indexer is a named object on a search service that automates an indexing workload by connecting to an external data source, retrieving and processing data, and then passing the data on to the search engine for indexing. Using indexers significantly reduces the quantity and complexity of the code you need to write.
+
+ Indexers support two workflows:

  + Text-based indexing, extracting strings and metadata for full text search scenarios.

- + [AI-enriched indexing](cognitive-search-concept-intro.md), applying integrated machine learning and AI models to analyze content that isn't otherwise searchable, such as images and large undifferentiated text.
+ + Skills-based indexing, using built-in or custom skills to apply integrated machine learning and AI models that analyze content for text and structure. Skills-based indexing enables search over content that isn't otherwise easily searchable, such as images and large undifferentiated text. To learn about skills-based indexing, see [AI enrichment in Cognitive Search](cognitive-search-concept-intro.md).

- Using indexers significantly reduces the quantity and complexity of the code you need to write. This article focuses on the basics of creating an indexer. Depending on the data source and your workflow, more configuration might be necessary.
+ This article focuses on the basic steps of creating an indexer. Depending on the data source and your workflow, more configuration might be necessary.

  ## Indexer definitions

- When you create an indexer, the definition will adhere to one of two patterns: text-based indexing or AI enrichment with skills. The only difference is that an indexer that invokes AI enrichment has more definitions.
+ When you create an indexer, the definition will adhere to one of two patterns: text-based indexing or AI enrichment with skills. The patterns are the same except that skills-based indexing has more definitions.

- ### Indexer definition for full text search
+ ### Indexer definition for text-based indexing

  Full text search is the primary use case for indexers, and for this workflow, an indexer will look like this example.

@@ -57,19 +59,21 @@ Indexers have the following requirements:
  + A "dataSourceName" property that points to a data source object. It specifies a connection to external data.
  + A "targetIndexName" property that points to the destination search index.

- Parameters are optional and modify run time behaviors, such as how many errors to accept before failing the entire job. The parameters above are available for all indexers and are documented in the [REST API reference](/rest/api/searchservice/create-indexer#request-body).
+ Other parameters are optional and modify run time behaviors, such as how many errors to accept before failing the entire job. The parameters above are available for all indexers and are documented in the [REST API reference](/rest/api/searchservice/create-indexer#request-body).

- Source-specific indexers for blobs, SQL, and Azure Cosmos DB provide extra "configuration" parameters for source-specific behaviors. For example, if the source is Blob Storage, you can set a parameter that filters on file extensions: `"parameters" : { "configuration" : { "indexedFileNameExtensions" : ".pdf,.docx" } }`.
+ Data source-specific indexers for blobs, SQL, and Azure Cosmos DB provide extra "configuration" parameters for source-specific behaviors. For example, if the source is Blob Storage, you can set a parameter that filters on file extensions: `"parameters" : { "configuration" : { "indexedFileNameExtensions" : ".pdf,.docx" } }`. If the source is Azure SQL, you can set a query timeout parameter.

- [Field mappings](search-indexer-field-mappings.md) are used to explicitly map source-to-destination fields if those fields differ by name or type.
+ [Field mappings](search-indexer-field-mappings.md) are used to explicitly map source-to-destination fields if there are discrepancies by name or type between a field in the data source and a field in the search index.

- An indexer will run immediately when you create it on the search service. If you don't want indexer execution, set "disabled" to true.
+ By default, an indexer runs immediately when you create it on the search service. If you don't want indexer execution, set "disabled" to true when creating the indexer.

  You can also [specify a schedule](search-howto-schedule-indexers.md) or set an [encryption key](search-security-manage-encryption-keys.md) for supplemental encryption of the indexer definition.

- ### Indexer definition for AI enrichment
+ ### Indexer definition for skills-based indexing and AI enrichment
+
+ Indexers also drive [AI enrichment](cognitive-search-concept-intro.md). All of the above properties and parameters apply, but the following extra properties are specific to AI enrichment: **`skillSetName`**, **`outputFieldMappings`**, **`cache`**.

- Indexers also drive [AI enrichment](cognitive-search-concept-intro.md). All of the above properties and parameters apply, but the following properties are specific to AI enrichment: **`skillSetName`**, **`outputFieldMappings`**, **`cache`**. A [skillset](cognitive-search-defining-skillset.md) also has **`cognitiveServices`**, and **`knowledgeStore`**. A few other required and similarly named properties are added for context.
+ A [skillset](cognitive-search-defining-skillset.md) also has **`cognitiveServices`** and **`knowledgeStore`**. A few other required and similarly named properties are added for context.

  ```json
  {
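
For reference, here's a minimal sketch of how the properties discussed in this file (a data source reference, a target index, blob "configuration" parameters, field mappings, and the "disabled" flag) might come together in a Create Indexer request. The service, object, and field names are placeholders rather than values from the article, whose own example is truncated above at the opening brace.

```http
POST https://[service name].search.windows.net/indexers?api-version=2020-06-30
Content-Type: application/json
api-key: [admin key]

{
  "name": "my-blob-indexer",
  "dataSourceName": "my-blob-datasource",
  "targetIndexName": "my-search-index",
  "parameters": {
    "maxFailedItems": 0,
    "configuration": {
      "indexedFileNameExtensions": ".pdf,.docx"
    }
  },
  "fieldMappings": [
    { "sourceFieldName": "metadata_storage_name", "targetFieldName": "title" }
  ],
  "disabled": false
}
```

The "disabled" property defaults to false and is shown only because the prose calls it out; setting it to true creates the indexer without running it.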
@@ -226,11 +230,11 @@ If your data source supports change detection, an indexer can detect underlying

  Change detection logic is built into the data platforms. How an indexer supports change detection varies by data source:

- + Azure Storage has built-in change detection, which means an indexer can recognize new and updated documents automatically. Blob Storage, Azure Table Storage, and Azure Data Lake Storage Gen2 stamp each blob or row update with a date and time. An indexer can use this information to determine which documents to update in the index.
+ + Azure Storage has built-in change detection, which means an indexer can recognize new and updated documents automatically. Blob Storage, Azure Table Storage, and Azure Data Lake Storage Gen2 stamp each blob or row update with a date and time. An indexer automatically uses this information to determine which documents to update in the index.

- + Azure SQL and Azure Cosmos DB provide optional change detection features in their platforms. You can specify the change detection policy in your data source definition.
+ + Azure SQL and Azure Cosmos DB provide optional change detection features in their platforms. For these data sources, change detection isn't automatic. You'll need to specify in the data source definition which change detection policy is used.

- For large indexing loads, an indexer also keeps track of the last document it processed through an internal "high water mark". The marker is never exposed in the API, but internally the indexer keeps track of where it stopped. When indexing resumes, either through a scheduled run or an on-demand invocation, the indexer references the high water mark so that it can pick up where it left off.
+ An indexer keeps track of the last document it processed from the data source through an internal "high water mark". The marker is never exposed in the API, but internally the indexer keeps track of where it stopped. When indexing resumes, either through a scheduled run or an on-demand invocation, the indexer references the high water mark so that it can pick up where it left off.

  If you need to clear the high water mark to reindex in full, you can use [Reset Indexer](/rest/api/searchservice/reset-indexer). For more selective reindexing, use [Reset Skills](/rest/api/searchservice/preview-api/reset-skills) or [Reset Documents](/rest/api/searchservice/preview-api/reset-documents). Through the reset APIs, you can clear internal state, and also flush the cache if you enabled [incremental enrichment](search-howto-incremental-index.md). For more background and comparison of each reset option, see [Run or reset indexers, skills, and documents](search-howto-run-reset-indexers.md).
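
For the Azure SQL case mentioned above, the change detection policy is declared on the data source definition rather than on the indexer. A rough sketch, assuming a high-water-mark style policy and placeholder connection, table, and column names (Azure SQL also supports an integrated change tracking policy):

```http
POST https://[service name].search.windows.net/datasources?api-version=2020-06-30
Content-Type: application/json
api-key: [admin key]

{
  "name": "my-sql-datasource",
  "type": "azuresql",
  "credentials": { "connectionString": "Server=tcp:[server].database.windows.net;Database=[db];..." },
  "container": { "name": "dbo.MyTable" },
  "dataChangeDetectionPolicy": {
    "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
    "highWaterMarkColumnName": "LastUpdated"
  }
}
```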

articles/search/search-howto-run-reset-indexers.md

Lines changed: 17 additions & 11 deletions
@@ -13,19 +13,19 @@ ms.date: 01/07/2022

  # Run or reset indexers, skills, or documents

- Indexers can be invoked in three ways: on demand, on a schedule, or when the [indexer is created](/rest/api/searchservice/create-indexer), assuming it's not created in "disabled" mode. This article explains how to run indexers on demand, with and without a reset.
+ In Azure Cognitive Search, there are several ways to run an indexer:

- ## Resetting indexers
+ + [Run when creating or updating an indexer](search-howto-create-indexers.md), assuming it's not created in "disabled" mode.
+ + [Run on a schedule](search-howto-schedule-indexers.md) to invoke execution at regular intervals.
+ + Run on demand, with or without a "reset".

- After the initial run, an indexer keeps track of which search documents have been indexed through an internal *high-water mark*. The marker is never exposed, but internally the indexer knows where it last stopped, so that it can pick up where it left off on the next run.
+ This article explains how to run indexers on demand, with and without a reset.

- If you need to rebuild all or part of an index, you can clear the indexer's high-water mark through a reset. Reset APIs are available at decreasing levels in the object hierarchy:
+ ## Run without reset

- + [Reset Indexers](#reset-indexers) clears the high-water mark and performs a full reindex of all documents
- + [Reset Documents (preview)](#reset-docs) reindexes a specific document or list of documents
- + [Reset Skills (preview)](#reset-skills) invokes skill processing for a specific skill
+ [Run Indexer](/rest/api/searchservice/run-indexer) will detect and process only what is necessary to synchronize the search index with changes in the underlying data source. Incremental indexing starts by locating an internal high-water mark to find the last updated search document, which becomes the starting point for indexer execution over new and updated documents in the data source.

- After reset, follow with a Run command to reprocess new and existing documents. Orphaned search documents having no counterpart in the data source cannot be removed through reset/run. If you need to delete documents, see [Add, Update or Delete Documents](/rest/api/searchservice/addupdate-or-delete-documents) instead.
+ Change detection is essential for determining what's new or updated in the data source. If the content is unchanged, Run has no effect. Blob storage has built-in change detection through its LastModified property. Other data sources, such as Azure SQL or Azure Cosmos DB, have to be configured for change detection before the indexer can read new and updated rows.

  ## Indexer execution
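
To make the on-demand path concrete, a minimal sketch of the [Run Indexer](/rest/api/searchservice/run-indexer) call follows; the service name, indexer name, and API version are placeholders.

```http
POST https://[service name].search.windows.net/indexers/[indexer name]/run?api-version=2020-06-30
api-key: [admin key]
```

The call is asynchronous: it returns 202 Accepted and the indexer runs in the background, so use Get Indexer Status to check the outcome.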

@@ -49,11 +49,17 @@ Indexer limits vary by the workload. For each workload, the following job limits
  > [!TIP]
  > If you are [indexing a large data set](search-howto-large-index.md), you can stretch processing out by putting the indexer [on a schedule](search-howto-schedule-indexers.md). For the full list of all indexer-related limits, see [indexer limits](search-limits-quotas-capacity.md#indexer-limits)

- ## Run without reset
+ ## Resetting indexers

- [Run Indexer](/rest/api/searchservice/run-indexer) will detect and process only what it necessary to synchronize the search index with changes in the underlying data source. Incremental indexing starts by locating an internal high-water mark to find the last updated search document, which becomes the starting point for indexer execution over new and updated documents in the data source.
+ After the initial run, an indexer keeps track of which search documents have been indexed through an internal *high-water mark*. The marker is never exposed, but internally the indexer knows where it last stopped, so that it can pick up where it left off on the next run.

- Change detection is essential for determining what's new or updated in the data source. If the content is unchanged, Run has no effect. Blob storage has built-in change detection through its LastModified property. Other data sources, such as Azure SQL or Azure Cosmos DB, have to be configured for change detection before the indexer can read new and updated rows.
+ If you need to rebuild all or part of an index, you can clear the indexer's high-water mark through a reset. Reset APIs are available at decreasing levels in the object hierarchy:
+
+ + [Reset Indexers](#reset-indexers) clears the high-water mark and performs a full reindex of all documents
+ + [Reset Documents (preview)](#reset-docs) reindexes a specific document or list of documents
+ + [Reset Skills (preview)](#reset-skills) invokes skill processing for a specific skill
+
+ After reset, follow with a Run command to reprocess new and existing documents. Orphaned search documents having no counterpart in the data source cannot be removed through reset/run. If you need to delete documents, see [Add, Update or Delete Documents](/rest/api/searchservice/addupdate-or-delete-documents) instead.

  <a name="reset-indexers"></a>
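
The reset-then-run sequence described above maps to two REST calls: Reset Indexer to clear the high-water mark, followed by the same Run Indexer request sketched earlier. A placeholder version of the reset call:

```http
POST https://[service name].search.windows.net/indexers/[indexer name]/reset?api-version=2020-06-30
api-key: [admin key]
```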

articles/search/search-howto-schedule-indexers.md

Lines changed: 3 additions & 3 deletions
@@ -109,11 +109,11 @@ Skills-based indexers run in a different [execution environment](search-howto-ru

  Although multiple indexers can run simultaneously, a given indexer is single instance. You can't run two copies of the same indexer concurrently. If an indexer happens to still be running when its next scheduled execution is set to start, the pending execution is postponed until the next scheduled occurrence, allowing the current job to finish.

- Let’s consider an example to make this more concrete. Suppose we configure an indexer schedule with an interval of hourly and a start time of June 1, 2021 at 8:00:00 AM UTC. Here's what could happen when an indexer run takes longer than an hour:
+ Let’s consider an example to make this more concrete. Suppose we configure an indexer schedule with an interval of hourly and a start time of June 1, 2022 at 8:00:00 AM UTC. Here's what could happen when an indexer run takes longer than an hour:

- + The first indexer execution starts at or around June 1, 2021 at 8:00 AM UTC. Assume this execution takes 20 minutes (or any time less than 1 hour).
+ + The first indexer execution starts at or around June 1, 2022 at 8:00 AM UTC. Assume this execution takes 20 minutes (or any amount of time that's less than 1 hour).

- + The second execution starts at or around June 1, 2021 9:00 AM UTC. Suppose that this execution takes 70 minutes - more than an hour – and it will not complete until 10:10 AM UTC.
+ + The second execution starts at or around June 1, 2022 9:00 AM UTC. Suppose that this execution takes 70 minutes - more than an hour – and it will not complete until 10:10 AM UTC.

  + The third execution is scheduled to start at 10:00 AM UTC, but at that time the previous execution is still running. This scheduled execution is then skipped. The next execution of the indexer won't start until 11:00 AM UTC.
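
For context, the hourly schedule in this example is part of the indexer definition. A rough sketch using placeholder names and the June 1, 2022 8:00 AM UTC start time from the revised text (the interval is an ISO 8601 duration):

```http
PUT https://[service name].search.windows.net/indexers/[indexer name]?api-version=2020-06-30
Content-Type: application/json
api-key: [admin key]

{
  "name": "[indexer name]",
  "dataSourceName": "[data source name]",
  "targetIndexName": "[index name]",
  "schedule": {
    "interval": "PT1H",
    "startTime": "2022-06-01T08:00:00Z"
  }
}
```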
