Merge pull request #278522 from gmndrg/main

AnnaMHuff · web-flow · commit 96c8d8d31cc2 · 2024-06-18T20:35:57.000-06:00
Supportability requested updates in documentation
diff --git a/articles/search/cognitive-search-incremental-indexing-conceptual.md b/articles/search/cognitive-search-incremental-indexing-conceptual.md
@@ -8,7 +8,7 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: conceptual
-ms.date: 02/16/2024
+ms.date: 06/18/2024
 ---
 
 # Incremental enrichment and caching in Azure AI Search
@@ -18,6 +18,8 @@ ms.date: 02/16/2024
 
 *Incremental enrichment* refers to the use of cached enrichments during [skillset execution](cognitive-search-working-with-skillsets.md) so that only new and changed skills and documents incur pay-as-you-go processing charges for API calls to Azure AI services. The cache contains the output from [document cracking](search-indexer-overview.md#document-cracking), plus the outputs of each skill for every document. Although caching is billable (it uses Azure Storage), the overall cost of enrichment is reduced because the costs of storage are less than image extraction and AI processing.
 
+To keep your index synchronized with your data source, it's important to understand the change and deletion tracking prerequisites of your specific [data source](search-data-sources-gallery.md). This article focuses on how skills processing handles incremental changes and how the cache is used for that purpose.
+
 When you enable caching, the indexer evaluates your updates to determine whether existing enrichments can be pulled from the cache. Image and text content from the document cracking phase, plus skill outputs that are upstream or orthogonal to your edits, are likely to be reusable.
 
 After skillset processing is finished, the refreshed results are written back to the cache, and also to the search index or knowledge store.
diff --git a/articles/search/search-data-sources-gallery.md b/articles/search/search-data-sources-gallery.md
@@ -10,7 +10,7 @@ ms.custom:
   - ignite-2023
 ms.topic: conceptual
 layout: LandingPage
-ms.date: 05/22/2024
+ms.date: 06/18/2024
 ---
 
 # Data sources gallery
@@ -21,6 +21,11 @@ Find a data connector from Microsoft or a partner that works with [an indexer](s
 + [Preview data sources by Azure AI Search](#preview)
 + [Data sources from our Partners](#partners)
 
+
+> [!NOTE]
+> The connectors mentioned in this article aren't the only way to index data from a data source into AI Search; they're the low-code and no-code options for this task. You can also develop your own connector by using the [Push REST API/SDK](search-what-is-data-import.md#pushing-data-to-an-index). If you can programmatically extract data from a source, you can use the corresponding programmatic Push method to index that data.
+
+
 <a name="ga"></a>
 
 ## Generally available data sources by Azure AI Search
diff --git a/articles/search/search-howto-index-cosmosdb.md b/articles/search/search-howto-index-cosmosdb.md
@@ -10,7 +10,7 @@ ms.custom:
   - devx-track-dotnet
   - ignite-2023
 ms.topic: how-to
-ms.date: 01/18/2024
+ms.date: 06/18/2024
 ---
 
 # Index data from Azure Cosmos DB for NoSQL for queries in Azure AI Search
@@ -303,6 +303,9 @@ The following example shows a [data source definition](#define-the-data-source)
 },
 ```
 
+> [!NOTE]
+> When you assign a `null` value to a field in Azure Cosmos DB, the AI Search indexer can't distinguish between `null` and a missing field value. Therefore, if a field in the index is empty, it isn't replaced with a `null` value, even if you explicitly made that change in your database.
+
 <a name="IncrementalProgress"></a>
 
 ### Incremental indexing and custom queries
diff --git a/articles/search/search-howto-indexing-azure-blob-storage.md b/articles/search/search-howto-indexing-azure-blob-storage.md
@@ -10,7 +10,7 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: how-to
-ms.date: 05/04/2024
+ms.date: 06/17/2024
 ---
 
 # Index data from Azure Blob Storage
@@ -243,6 +243,61 @@ Once the index and data source have been created, you're ready to create the ind
 
 An indexer runs automatically when it's created. You can prevent this by setting "disabled" to true. To control indexer execution, [run an indexer on demand](search-howto-run-reset-indexers.md) or [put it on a schedule](search-howto-schedule-indexers.md).
 
+## Indexing data from multiple Azure Blob containers to a single index
+
+An indexer can index data from only one container. To index data from multiple containers and consolidate it into a single AI Search index, configure multiple indexers that all target the same index. Be aware of the [maximum number of indexers available per SKU](search-limits-quotas-capacity.md#indexer-limits).
+
+For example, consider two indexers that pull data from two distinct data sources, named `my-blob-datasource1` and `my-blob-datasource2`. Each data source points to a separate Azure Blob container, but both indexers target the same index, named `my-search-index`.
+
+First indexer definition example:
+
+```http
+POST https://[service name].search.windows.net/indexers?api-version=2023-11-01
+{
+  "name" : "my-blob-indexer1",
+  "dataSourceName" : "my-blob-datasource1",
+  "targetIndexName" : "my-search-index",
+  "parameters": {
+      "batchSize": null,
+      "maxFailedItems": null,
+      "maxFailedItemsPerBatch": null,
+      "base64EncodeKeys": null,
+      "configuration": {
+          "indexedFileNameExtensions" : ".pdf,.docx",
+          "excludedFileNameExtensions" : ".png,.jpeg",
+          "dataToExtract": "contentAndMetadata",
+          "parsingMode": "default"
+      }
+  },
+  "schedule" : { },
+  "fieldMappings" : [ ]
+}
+```
+Second indexer definition example, which runs in parallel with the first:
+
+```http
+POST https://[service name].search.windows.net/indexers?api-version=2023-11-01
+{
+  "name" : "my-blob-indexer2",
+  "dataSourceName" : "my-blob-datasource2",
+  "targetIndexName" : "my-search-index",
+  "parameters": {
+      "batchSize": null,
+      "maxFailedItems": null,
+      "maxFailedItemsPerBatch": null,
+      "base64EncodeKeys": null,
+      "configuration": {
+          "indexedFileNameExtensions" : ".pdf,.docx",
+          "excludedFileNameExtensions" : ".png,.jpeg",
+          "dataToExtract": "contentAndMetadata",
+          "parsingMode": "default"
+      }
+  },
+  "schedule" : { },
+  "fieldMappings" : [ ]
+}
+```
+
 ## Check indexer status
 
 To monitor the indexer status and execution history, send a [Get Indexer Status](/rest/api/searchservice/get-indexer-status) request:
diff --git a/articles/search/search-indexer-troubleshooting.md b/articles/search/search-indexer-troubleshooting.md
@@ -9,7 +9,7 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: conceptual
-ms.date: 01/11/2024
+ms.date: 06/17/2024
 ---
 
 # Indexer troubleshooting guidance for Azure AI Search
@@ -266,6 +266,11 @@ Conditions under which a document is processed twice is explained in the followi
 
 In practice, this scenario only happens when on-demand indexers are manually invoked within minutes of each other, for certain data sources. It can result in mismatched numbers (like the indexer processed 345 documents total according to the indexer execution stats, but there are 340 documents in the data source and index) or potentially increased billing if you're running the same skills for the same document multiple times. Running an indexer using a schedule is the preferred recommendation.
 
+## Parallel indexing
+
+When multiple indexers run simultaneously, it's typical for some of them to enter a queue and wait for available resources before they begin execution. The number of indexers that can run concurrently depends on several factors. If the indexers aren't linked to [skillsets](cognitive-search-working-with-skillsets.md), the capacity to run in parallel depends on the number of [replicas and partitions](search-capacity-planning.md#concepts-search-units-replicas-partitions) set up in the AI Search service.
+
+If an indexer is associated with a skillset, it runs within AI Search's internal clusters. In this case, the ability to run concurrently is determined by the complexity of the skillset and by whether other skillsets are running at the same time. Built-in indexers are designed to reliably extract data from the source, so no data is missed when they run on a schedule. However, expect parallelization and scale-out of indexer processes to take some time.
 
 ## Indexing documents with sensitivity labels