Commit 2bd30b2

updating toc, fixing links, removing extra image
1 parent 0ac3e10 commit 2bd30b2

5 files changed

+17
-15
lines changed

articles/search/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -60,6 +60,8 @@
6060
href: search-semi-structured-data.md
6161
- name: Index multiple Azure data sources
6262
href: tutorial-multiple-data-sources.md
63+
- name: Index any data
64+
href: tutorial-optimize-indexing-pushapi.md
6365
- name: Use AI to create content
6466
items:
6567
- name: C#
1.58 KB
Binary file not shown.

articles/search/search-howto-large-index.md

Lines changed: 3 additions & 3 deletions
@@ -13,7 +13,7 @@ ms.date: 5/8/2020
1313

1414
# How to index large data sets in Azure Cognitive Search
1515

16-
Azure Cognitive Search supports [two basic approaches](https://docs.microsoft.com/en-us/azure/search/search-what-is-data-import) for importing data into a search index: *pushing* your data into the index programmatically, or pointing an [Azure Cognitive Search indexer](https://docs.microsoft.com/en-us/azure/search/search-indexer-overview) at a supported data source to *pull* in the data.
16+
Azure Cognitive Search supports [two basic approaches](search-what-is-data-import.md) for importing data into a search index: *pushing* your data into the index programmatically, or pointing an [Azure Cognitive Search indexer](search-indexer-overview.md) at a supported data source to *pull* in the data.
1717

1818
As data volumes grow or processing needs change, you might find that simple or default indexing strategies are no longer practical. For Azure Cognitive Search, there are several approaches for accommodating larger data sets, ranging from how you structure a data upload request, to using a source-specific indexer for scheduled and distributed workloads.
1919

@@ -64,14 +64,14 @@ The optimal number of threads is determined by the tier of your search service,
6464
> [!NOTE]
6565
> As you increase the tier of your search service or increase the partitions, you should also increase the number of concurrent threads.
6666
67-
As you ramp up the requests hitting the search service, you may encounter [HTTP status codes](https://docs.microsoft.com/en-us/rest/api/searchservice/http-status-codes) indicating the request did not fully succeed. During indexing, two common HTTP status codes are:
67+
As you ramp up the requests hitting the search service, you may encounter [HTTP status codes](http-status-codes.md) indicating the request did not fully succeed. During indexing, two common HTTP status codes are:
6868

6969
* **503 Service Unavailable** - This error means that the system is under heavy load and your request can't be processed at this time.
7070
* **207 Multi-Status** - This error means that some documents succeeded, but at least one failed.
7171

7272
### Retry strategy
7373

74-
If a failure happens, requests should be retried using an [exponential backoff retry strategy](https://docs.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/implement-retries-exponential-backoff).
74+
If a failure happens, requests should be retried using an [exponential backoff retry strategy](https://docs.microsoft.com/dotnet/architecture/microservices/implement-resilient-applications/implement-retries-exponential-backoff).
7575

7676
Azure Cognitive Search's .NET SDK automatically retries 503s and other failed requests but you'll need to implement your own logic to retry 207s. Open-source tools such as [Polly](https://github.com/App-vNext/Polly) can also be used to implement a retry strategy. In this sample, we implement our own exponential backoff retry strategy.
7777

articles/search/tutorial-optimize-indexing-pushapi.md

Lines changed: 12 additions & 12 deletions
@@ -1,5 +1,5 @@
11
---
2-
title: 'C# Tutorial: Optimize indexing with the push API'
2+
title: 'C# tutorial optimize indexing with the push API'
33
titleSuffix: Azure Cognitive Search
44
description: Learn how to efficiently index data using Azure Cognitive Search's push API. This tutorial and sample code are in C#.
55

@@ -13,9 +13,9 @@ ms.date: 05/08/2020
1313

1414
# Tutorial: Optimize indexing with the push API
1515

16-
Azure Cognitive Search supports [two basic approaches](https://docs.microsoft.com/en-us/azure/search/search-what-is-data-import) for importing data into a search index: *pushing* your data into the index programmatically, or pointing an [Azure Cognitive Search indexer](https://docs.microsoft.com/en-us/azure/search/search-indexer-overview) at a supported data source to *pull* in the data.
16+
Azure Cognitive Search supports [two basic approaches](search-what-is-data-import.md) for importing data into a search index: *pushing* your data into the index programmatically, or pointing an [Azure Cognitive Search indexer](search-indexer-overview.md) at a supported data source to *pull* in the data.
1717

18-
This tutorial describes how to efficiently index data using the [push model](https://docs.microsoft.com/en-us/azure/search/search-what-is-data-import#pushing-data-to-an-index). A .NET Core C# console application has been created so you can [download and run the application](https://github.com/Azure-Samples/azure-search-dotnet-samples/tree/master/optimize-data-indexing). This article explains the key aspects of the application as well as factors to consider when indexing data.
18+
This tutorial describes how to efficiently index data using the [push model](search-what-is-data-import.md#pushing-data-to-an-index) by batching requests and leveraging an exponential backoff retry strategy. You can [download and run the application](https://github.com/Azure-Samples/azure-search-dotnet-samples/tree/master/optimize-data-indexing). This article explains the key aspects of the application as well as factors to consider when indexing data.
1919

2020
This tutorial uses C# and the [.NET SDK](https://aka.ms/search-sdk) to perform the following tasks:
2121

@@ -30,7 +30,7 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
3030

3131
## Prerequisites
3232

33-
The following services and tools are required for this quickstart.
33+
The following services and tools are required for this tutorial.
3434

3535
+ [Visual Studio](https://visualstudio.microsoft.com/downloads/), any edition. Sample code and instructions were tested on the free Community edition.
3636

@@ -88,7 +88,7 @@ API calls require the service URL and an access key. A search service is created
8888

8989
Once you update *appsettings.json*, the sample program in **OptimizeDataIndexing.sln** should be ready to build and run.
9090

91-
This code is derived from the [C# Quickstart](https://docs.microsoft.com/en-us/azure/search/search-get-started-dotnet) and you can find more detailed information on creating indexes and the basics of working with the .NET SDK in that article.
91+
This code is derived from the [C# Quickstart](search-get-started-dotnet.md) and you can find more detailed information on creating indexes and the basics of working with the .NET SDK in that article.
9292

9393
This simple C#/.NET console app performs the following tasks:
9494

@@ -167,7 +167,7 @@ Determining the optimal batch size for your data is a key component of optimizin
167167
1. The schema of your index
168168
1. The size of your data
169169

170-
Because, the optimal batch size is dependent on your index and your data, the best approach is to test different batch sizes to determine what results in the fastest indexing speeds in terms of MB/s for your scenario.
170+
Because the optimal batch size is dependent on your index and your data, the best approach is to test different batch sizes to determine what results in the fastest indexing speeds in terms of MB/s for your scenario.
171171

172172
The following function demonstrates a simple approach to testing batch sizes.
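Reviewer's sketch (not part of this commit, and not the tutorial's own function, which falls outside this hunk): one way to compare batch sizes by MB/s, as the paragraph above suggests. The class `BatchSizeSketch`, the method `MeasureThroughputAsync`, and the `uploadBatch` delegate are hypothetical names used for illustration; the delegate stands in for whatever call actually sends a batch to the index, and documents are assumed to be pre-serialized JSON strings so their size can be estimated.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

public static class BatchSizeSketch
{
    // Hypothetical helper: uploads one batch per candidate size and
    // returns the observed throughput in MB/s, keyed by batch size.
    public static async Task<Dictionary<int, double>> MeasureThroughputAsync(
        IReadOnlyList<string> serializedDocs,
        IEnumerable<int> batchSizes,
        Func<IReadOnlyList<string>, Task> uploadBatch)
    {
        var results = new Dictionary<int, double>();

        foreach (int size in batchSizes)
        {
            // Take the first `size` documents as the trial batch.
            List<string> batch = serializedDocs.Take(size).ToList();

            // Approximate the payload size from the UTF-8 byte count.
            long bytes = batch.Sum(d => (long)Encoding.UTF8.GetByteCount(d));
            double megabytes = bytes / (1024.0 * 1024.0);

            // Time only the upload call itself.
            Stopwatch stopwatch = Stopwatch.StartNew();
            await uploadBatch(batch);
            stopwatch.Stop();

            // Guard against a zero elapsed time on very fast calls.
            double seconds = Math.Max(stopwatch.Elapsed.TotalSeconds, 1e-9);
            results[size] = megabytes / seconds;
        }

        return results;
    }
}
```

A single trial per size is noisy, so in practice each size would likely be run several times and averaged before picking a winner.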
173173

@@ -256,14 +256,14 @@ The optimal number of threads is determined by the tier of your search service,
256256
> [!NOTE]
257257
> As you increase the tier of your search service or increase the partitions, you should also increase the number of concurrent threads.
258258
259-
As you ramp up the requests hitting the search service, you may encounter [HTTP status codes](https://docs.microsoft.com/en-us/rest/api/searchservice/http-status-codes) indicating the request did not fully succeed. During indexing, two common HTTP status codes are:
259+
As you ramp up the requests hitting the search service, you may encounter [HTTP status codes](http-status-codes.md) indicating the request did not fully succeed. During indexing, two common HTTP status codes are:
260260

261261
* **503 Service Unavailable** - This error means that the system is under heavy load and your request can't be processed at this time.
262262
* **207 Multi-Status** - This error means that some documents succeeded, but at least one failed.
263263

264264
### Implement an exponential backoff retry strategy
265265

266-
If a failure happens, requests should be retried using an [exponential backoff retry strategy](https://docs.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/implement-retries-exponential-backoff).
266+
If a failure happens, requests should be retried using an [exponential backoff retry strategy](https://docs.microsoft.com/dotnet/architecture/microservices/implement-resilient-applications/implement-retries-exponential-backoff).
267267

268268
Azure Cognitive Search's .NET SDK automatically retries 503s and other failed requests but you'll need to implement your own logic to retry 207s. Open-source tools such as [Polly](https://github.com/App-vNext/Polly) can also be used to implement a retry strategy. In this sample, we implement our own exponential backoff retry strategy.
269269

@@ -279,7 +279,7 @@ TimeSpan delay = delay = TimeSpan.FromSeconds(2);
279279
int maxRetryAttempts = 5;
280280
```
281281

282-
It's important to catch [IndexBatchException](https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.search.indexbatchexception?view=azure-dotnet) as this indicates that the indexing operation only partially succeeded (207s). Failed items should be retried using the `FindFailedActionsToRetry` method which making it easy to create a new batch containing only the failed items.
282+
It's important to catch [IndexBatchException](https://docs.microsoft.com/dotnet/api/microsoft.azure.search.indexbatchexception?view=azure-dotnet) as this indicates that the indexing operation only partially succeeded (207s). Failed items should be retried using the `FindFailedActionsToRetry` method which making it easy to create a new batch containing only the failed items.
283283

284284
Exceptions other than `IndexBatchException` should also be caught and indicate the request failed completely. These exceptions are less common, particularly with the .NET SDK as it retries 503s automatically.
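Reviewer's sketch (not part of this commit): a minimal, SDK-independent illustration of the exponential backoff pattern the text above describes. `BackoffSketch`, `RetryWithBackoffAsync`, and the `attempt` delegate are hypothetical names; in the tutorial's setting the delegate would presumably wrap the indexing call, return `false` when an `IndexBatchException` or another failure occurs, and resend only the failed items on the next attempt. The initial delay and attempt limit mirror the `delay` and `maxRetryAttempts` values visible in the hunk above.

```csharp
using System;
using System.Threading.Tasks;

public static class BackoffSketch
{
    // Hypothetical helper: calls `attempt` until it reports success or the
    // attempt limit is reached, doubling the wait after every failure.
    // Returns the 1-based attempt number that succeeded, or -1 if none did.
    public static async Task<int> RetryWithBackoffAsync(
        Func<int, Task<bool>> attempt,
        TimeSpan initialDelay,
        int maxRetryAttempts)
    {
        TimeSpan delay = initialDelay;

        for (int attemptNumber = 1; attemptNumber <= maxRetryAttempts; attemptNumber++)
        {
            if (await attempt(attemptNumber))
            {
                return attemptNumber;
            }

            // No point in waiting after the final attempt.
            if (attemptNumber < maxRetryAttempts)
            {
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2); // 2s, 4s, 8s, ...
            }
        }

        return -1;
    }
}
```

Called as `await BackoffSketch.RetryWithBackoffAsync(tryIndexBatch, TimeSpan.FromSeconds(2), 5)`, where `tryIndexBatch` is an assumed caller-supplied delegate, this waits 2, 4, 8, and 16 seconds between the five attempts. Adding random jitter to each delay is a common refinement when many threads retry at once.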
285285

@@ -343,7 +343,7 @@ You can explore the populated search index after the program has run programatic
343343

344344
### Programatically
345345

346-
There are two main options for checking the number of documents in an index: the [Count Documents API](https://docs.microsoft.com/en-us/rest/api/searchservice/count-documents) and the [Get Index Statistics API](https://docs.microsoft.com/en-us/rest/api/searchservice/get-index-statistics). Both paths
346+
There are two main options for checking the number of documents in an index: the [Count Documents API](https://docs.microsoft.com/rest/api/searchservice/count-documents) and the [Get Index Statistics API](https://docs.microsoft.com/rest/api/searchservice/get-index-statistics). Both paths
347347

348348
#### Count Documents
349349

@@ -365,9 +365,9 @@ IndexGetStatisticsResult indexStats = serviceClient.Indexes.GetStatistics(config
365365

366366
In Azure portal, open the search service **Overview** page, and find the **optimize-indexing** index in the **Indexes** list.
367367

368-
![List of Azure Cognitive Search indexes](media/tutorial-optimize-data-indexing/portal-output2.png "List of Azure Cognitive Search indexes")
368+
![List of Azure Cognitive Search indexes](media/tutorial-optimize-data-indexing/portal-output.png "List of Azure Cognitive Search indexes")
369369

370-
The *Document Count* and *Storage Size* are based on [Get Index Statistics API](https://docs.microsoft.com/en-us/rest/api/searchservice/get-index-statistics) and may take several minutes to update.
370+
The *Document Count* and *Storage Size* are based on [Get Index Statistics API](https://docs.microsoft.com/rest/api/searchservice/get-index-statistics) and may take several minutes to update.
371371

372372
## Reset and rerun
373373
