articles/search/search-howto-large-index.md
Lines changed: 8 additions & 8 deletions
@@ -49,8 +49,8 @@ In general, we recommend only adding additional properties to fields if you inte
 One of the simplest mechanisms for indexing a larger data set is to submit multiple documents or records in a single request. As long as the entire payload is under 16 MB, a request can handle up to 1000 documents in a bulk upload operation. These limits apply whether you're using the [Add Documents REST API](https://docs.microsoft.com/rest/api/searchservice/addupdate-or-delete-documents) or the [Index method](https://docs.microsoft.com/dotnet/api/microsoft.azure.search.documentsoperationsextensions.index?view=azure-dotnet) in the .NET SDK. For either API, you would package 1000 documents in the body of each request.

 Using batches to index documents will significantly improve indexing performance. Determining the optimal batch size for your data is a key component of optimizing indexing speeds. The two primary factors influencing the optimal batch size are:
-1. The schema of your index
-1. The size of your data
++ The schema of your index
++ The size of your data

 Because the optimal batch size depends on your index and your data, the best approach is to test different batch sizes to determine what results in the fastest indexing speeds for your scenario. This [tutorial](tutorial-optimize-indexing-pushapi.md) provides sample code for testing batch sizes using the .NET SDK.

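Reviewer sketch (not part of the change): the two request limits quoted in this hunk, at most 1000 documents and a payload under 16 MB, can be enforced together when slicing a data set. The Python below is a minimal illustration, assuming documents are plain dictionaries and approximating payload size by the length of each document's JSON encoding. The name `make_batches` is hypothetical and is not taken from the article or any SDK.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

MAX_DOCS_PER_BATCH = 1000              # document-count limit quoted in the text
MAX_PAYLOAD_BYTES = 16 * 1024 * 1024   # 16 MB payload limit quoted in the text


def make_batches(
    docs: Iterable[Dict[str, Any]],
    max_docs: int = MAX_DOCS_PER_BATCH,
    max_bytes: int = MAX_PAYLOAD_BYTES,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of documents that stay within both limits.

    Size is approximated as the UTF-8 length of each document's JSON form;
    a real request envelope adds some overhead on top of that.
    """
    batch: List[Dict[str, Any]] = []
    batch_bytes = 0
    for doc in docs:
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        if doc_bytes > max_bytes:
            raise ValueError("a single document exceeds the payload limit")
        # Close the current batch if adding this document would break a limit.
        if batch and (len(batch) >= max_docs or batch_bytes + doc_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch
```

Because the size check is only an estimate, leaving some headroom below the 16 MB limit is probably safer than targeting it exactly.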
@@ -60,10 +60,10 @@ To take full advantage of Azure Cognitive Search's indexing speeds, you'll likel

 The optimal number of threads is determined by:

-1. The tier of your search service
-1. The number of partitions
-1. The size of your batches
-1. The schema of your index
++ The tier of your search service
++ The number of partitions
++ The size of your batches
++ The schema of your index

 You can modify this sample and test with different thread counts to determine the optimal thread count for your scenario. However, as long as you have several threads running concurrently, you should be able to take advantage of most of the efficiency gains.

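Reviewer sketch (not part of the change): the advice in this hunk, to run several upload threads concurrently and experiment with the count, might look like the following in Python. It is an illustration only, not the .NET sample the article refers to. `upload_batch` is a hypothetical caller-supplied function that sends one batch to the service, and the default of 8 threads is an arbitrary starting point, not a recommendation from the source.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def index_concurrently(
    batches: Sequence[list],
    upload_batch: Callable[[list], int],
    num_threads: int = 8,  # arbitrary default; tune per tier, partitions, batch size, schema
) -> List[int]:
    """Run upload_batch over every batch using a pool of worker threads.

    Returns upload_batch's results in the same order as `batches`.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() distributes batches across the workers and preserves input order.
        return list(pool.map(upload_batch, batches))
```

Rerunning the same workload with different `num_threads` values and comparing wall-clock time is one way to carry out the experiment the text suggests.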
@@ -72,8 +72,8 @@ You can modify this sample and test with different thread counts to determine th

 As you ramp up the requests hitting the search service, you may encounter [HTTP status codes](https://docs.microsoft.com/rest/api/searchservice/http-status-codes) indicating the request didn't fully succeed. During indexing, two common HTTP status codes are:

-* **503 Service Unavailable** - This error means that the system is under heavy load and your request can't be processed at this time.
-* **207 Multi-Status** - This error means that some documents succeeded, but at least one failed.
++ **503 Service Unavailable** - This error means that the system is under heavy load and your request can't be processed at this time.
++ **207 Multi-Status** - This error means that some documents succeeded, but at least one failed.
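Reviewer sketch (not part of the change): a 207 response means the batch was only partly indexed, so a client needs to separate the documents that succeeded from those that failed before retrying. The Python below assumes the response body is a JSON object with a `value` array whose items carry a `key` and a boolean `status`. That shape is an assumption here and should be checked against the REST API reference. `split_results` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Tuple


def split_results(body: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Split per-document results into (succeeded_keys, failed_keys).

    Assumes each item in body["value"] has a "key" and a boolean "status".
    """
    succeeded: List[str] = []
    failed: List[str] = []
    for item in body.get("value", []):
        if item.get("status"):
            succeeded.append(item["key"])
        else:
            failed.append(item["key"])
    return succeeded, failed
```

With a 503, by contrast, the whole request was rejected, so the entire batch would be resubmitted after a delay.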
articles/search/tutorial-optimize-indexing-push-api.md
Lines changed: 15 additions & 15 deletions
@@ -48,12 +48,12 @@ When pushing data into an index, there's several key considerations that impact

 Six key factors to consider are:

-1. **Service tier and number of partitions/replicas** - Adding partitions and increasing your tier will both increase indexing speeds.
-1. **Index Schema** - Adding fields and adding additional properties to fields (such as *searchable*, *facetable*, or *filterable*) both reduce indexing speeds.
-1. **Batch size** - The optimal batch size varies based on your index schema and dataset.
-1. **Number of threads/workers** - a single thread won't take full advantage of indexing speeds
-1. **Retry strategy** - An exponential backoff retry strategy should be used to optimize indexing.
-1. **Network data transfer speeds** - Data transfer speeds can be a limiting factor. Index data from within your Azure environment to increase data transfer speeds.
++ **Service tier and number of partitions/replicas** - Adding partitions and increasing your tier will both increase indexing speeds.
++ **Index Schema** - Adding fields and adding additional properties to fields (such as *searchable*, *facetable*, or *filterable*) both reduce indexing speeds.
++ **Batch size** - The optimal batch size varies based on your index schema and dataset.
++ **Number of threads/workers** - a single thread won't take full advantage of indexing speeds
++ **Retry strategy** - An exponential backoff retry strategy should be used to optimize indexing.
++ **Network data transfer speeds** - Data transfer speeds can be a limiting factor. Index data from within your Azure environment to increase data transfer speeds.


 ## 1 - Create Azure Cognitive Search service
@@ -92,11 +92,11 @@ This code is derived from the [C# Quickstart](search-get-started-dotnet.md). You

 This simple C#/.NET console app performs the following tasks:

-* Creates a new index based on the data structure of the C# Hotel class (which also references the Address class).
-* Tests various batch sizes to determine the most efficient size
-* Indexes data asynchronously
-* Using multiple threads to increase indexing speeds
-* Using an exponential backoff retry strategy to retry failed items
++ Creates a new index based on the data structure of the C# Hotel class (which also references the Address class).
++ Tests various batch sizes to determine the most efficient size
++ Indexes data asynchronously
++ Using multiple threads to increase indexing speeds
++ Using an exponential backoff retry strategy to retry failed items

 Before running the program, take a minute to study the code and the index definitions for this sample. The relevant code is in several files:

@@ -165,8 +165,8 @@ Indexing documents in batches will significantly improve indexing performance. T

 Determining the optimal batch size for your data is a key component of optimizing indexing speeds. The two primary factors influencing the optimal batch size are:

-1. The schema of your index
-1. The size of your data
++ The schema of your index
++ The size of your data

 Because the optimal batch size is dependent on your index and your data, the best approach is to test different batch sizes to determine what results in the fastest indexing speeds for your scenario.

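Reviewer sketch (not part of the change): testing different batch sizes, as this hunk recommends, comes down to timing one upload per candidate size and comparing throughput. The Python below is a minimal illustration rather than the tutorial's C# code. `upload_batch` is a hypothetical function that sends one batch, and throughput is reported in documents per second purely as an assumption for the example.

```python
import time
from typing import Callable, Dict, Sequence


def measure_batch_sizes(
    docs: Sequence[dict],
    upload_batch: Callable[[Sequence[dict]], None],
    batch_sizes: Sequence[int],
    timer: Callable[[], float] = time.perf_counter,
) -> Dict[int, float]:
    """Return documents-per-second throughput for each candidate batch size."""
    results: Dict[int, float] = {}
    for size in batch_sizes:
        sample = docs[:size]          # one batch of the requested size
        start = timer()
        upload_batch(sample)
        elapsed = timer() - start
        # Guard against a zero reading from a very fast or stubbed upload.
        results[size] = len(sample) / elapsed if elapsed > 0 else float("inf")
    return results
```

A single timing per size is noisy, so repeating each measurement a few times and averaging would give a steadier comparison.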
@@ -256,8 +256,8 @@ Several of the key considerations mentioned above impact the optimal number of t

 As you ramp up the requests hitting the search service, you may encounter [HTTP status codes](https://docs.microsoft.com/rest/api/searchservice/http-status-codes) indicating the request didn't fully succeed. During indexing, two common HTTP status codes are:

-* **503 Service Unavailable** - This error means that the system is under heavy load and your request can't be processed at this time.
-* **207 Multi-Status** - This error means that some documents succeeded, but at least one failed.
++ **503 Service Unavailable** - This error means that the system is under heavy load and your request can't be processed at this time.
++ **207 Multi-Status** - This error means that some documents succeeded, but at least one failed.

 ### Implement an exponential backoff retry strategy
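Reviewer sketch (not part of the change): an exponential backoff retry strategy waits longer after each failed attempt and resubmits only the items that failed. The Python below is a minimal illustration and not the tutorial's implementation. `upload_batch` is a hypothetical function that returns the items still needing a retry, and the defaults (5 retries, 2-second base delay, doubling factor) are assumptions chosen for the example.

```python
import time
from typing import Callable, List


def backoff_delays(max_retries: int = 5, base_delay: float = 2.0, factor: float = 2.0) -> List[float]:
    """Delays before each retry: base, base*factor, base*factor**2, ..."""
    return [base_delay * factor ** attempt for attempt in range(max_retries)]


def upload_with_retry(
    batch: list,
    upload_batch: Callable[[list], list],
    max_retries: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Upload a batch, retrying only failed items with growing delays.

    Returns whatever items still failed after the final retry.
    """
    pending = upload_batch(list(batch))   # first attempt, no delay
    for delay in backoff_delays(max_retries, base_delay):
        if not pending:
            break
        sleep(delay)                      # wait before the next attempt
        pending = upload_batch(pending)   # resubmit only what failed
    return pending
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Adding random jitter to each delay is a common refinement when many workers retry at once.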