articles/search/search-howto-large-index.md
5 additions & 5 deletions
@@ -13,9 +13,9 @@ ms.date: 01/17/2023

 # Index large data sets in Azure Cognitive Search

-If your search solution includes indexing big data or complex data, this article describes the strategies for accommodating long running processes on Azure Cognitive Search.
+If your search solution requirements include indexing big data or complex data, this article describes the strategies for accommodating long running processes on Azure Cognitive Search.

-This article assumes familiarity with the [two basic approaches for importing data](search-what-is-data-import.md): pushing data into an index, or pulling in data using a [search indexer](search-indexer-overview.md) on a supported data source. The strategy you choose will be determined by the indexing approach you're already using. If your scenario involves computationally intensive [AI enrichment](cognitive-search-concept-intro.md), then your strategy must include indexers, given the skillset dependency on indexers.
+This article assumes familiarity with the [two basic approaches for importing data](search-what-is-data-import.md): pushing data into an index, or pulling in data from a supported data source using a [search indexer](search-indexer-overview.md). The strategy you choose will be determined by the indexing approach you're already using. If your scenario involves computationally intensive [AI enrichment](cognitive-search-concept-intro.md), then your strategy must include indexers, given the skillset dependency on indexers.

 This article complements [Tips for better performance](search-performance-tips.md), which offers best practices on index and query design. A well-designed index that includes only the fields and attributes you need is an important prerequisite for large-scale indexing.

@@ -26,8 +26,8 @@ This article complements [Tips for better performance](search-performance-tips.m

 "Push" APIs, such as [Add Documents REST API](/rest/api/searchservice/addupdate-or-delete-documents) or the [IndexDocuments method (Azure SDK for .NET)](/dotnet/api/azure.search.documents.searchclient.indexdocuments), are the most prevalent form of indexing in Cognitive Search. For solutions that use a push API, the strategy for long-running indexing will have one or both of the following components:

-+ Batch documents
-+ Manage threads
++ Batching documents
++ Managing threads

 ### Batch multiple documents per request

@@ -64,7 +64,7 @@ The Azure .NET SDK automatically retries 503s and other failed requests, but you
 + Parallel indexing over partitioned data
 + Scheduling and integration with change detection logic to index just new and change documents over time

-Indexer schedules allow you to parcel out indexing at regular intervals. Scheduled indexing can resume at the last known stopping point. If a data source isn't fully scanned within the processing window, the indexer picks up wherever it left off at the last job.
+Indexer schedules can resume processing at the last known stopping point. If data isn't fully indexed within the processing window, the indexer picks up wherever it left off on the next run.

 Partitioning data into smaller individual data sources enables parallel processing. You can break up source data into smaller components, such as into multiple containers in Azure Blob Storage, create a [data source](/rest/api/searchservice/create-data-source) for each partition, and then [run the indexers in parallel](search-howto-run-reset-indexers.md), subject to the number of search units of your search service.
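
The "Batching documents" component named in the push API hunk can be illustrated with a minimal Python sketch. This is not part of the changed file, and it calls no Azure SDK: `batch_documents` is a hypothetical helper, and the default `batch_size` of 1000 is an assumption that should be checked against the service limits for your tier. Each yielded list would be sent as one push request.

```python
from typing import Dict, Iterable, Iterator, List


def batch_documents(
    documents: Iterable[Dict], batch_size: int = 1000
) -> Iterator[List[Dict]]:
    """Yield consecutive lists of at most `batch_size` documents.

    `batch_size` defaults to 1000 as an assumed upper bound; adjust it to
    whatever your search service and document sizes can tolerate.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    batch: List[Dict] = []
    for doc in documents:
        batch.append(doc)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the final, possibly smaller, batch.
    if batch:
        yield batch


# Hypothetical sample data: 2,500 small documents keyed by "id".
docs = [{"id": str(i)} for i in range(2500)]
batch_sizes = [len(b) for b in batch_documents(docs)]
print(batch_sizes)
```

Testing different batch sizes against your own documents is likely the more reliable way to find a good value, since the best size depends on document size and index schema.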