Commit 96ac088

Merge pull request #213505 from HeidiSteen/heidist-REST
fixed tab layout error
2 parents 9c09059 + 2746b68 commit 96ac088

1 file changed

+22
-18
lines changed

articles/search/search-howto-index-changed-deleted-blobs.md

Lines changed: 22 additions & 18 deletions
@@ -1,7 +1,7 @@
11
---
22
title: Changed and deleted blobs
33
titleSuffix: Azure Cognitive Search
4-
description: Indexers that index from Azure Storage can pick up new and changed content automatically. To automate deletion detection, follow the strategies described in this article.
4+
description: Indexers that index from Azure Storage can pick up new and changed content automatically. This article describes the strategies.
55

66
author: gmndrg
77
ms.author: gimondra
@@ -16,7 +16,7 @@ ms.date: 09/09/2022
1616

1717
After an initial search index is created, you might want subsequent indexer jobs to only pick up new and changed documents. For indexed content that originates from Azure Storage, change detection occurs automatically because indexers keep track of the last update using the built-in timestamps on objects and files in Azure Storage.
1818

19-
Although change detection is a given, deletion detection is not. An indexer doesn't track object deletion in data sources. To avoid having orphan search documents, you can implement a "soft delete" strategy that results in deleting search documents first, with physical deletion in Azure Storage following as a second step.
19+
Although change detection is a given, deletion detection isn't. An indexer doesn't track object deletion in data sources. To avoid having orphan search documents, you can implement a "soft delete" strategy that results in deleting search documents first, with physical deletion in Azure Storage following as a second step.
2020

2121
There are two ways to implement a soft delete strategy:
2222

@@ -37,20 +37,23 @@ There are two ways to implement a soft delete strategy:
3737
For this deletion detection approach, Cognitive Search depends on the [native blob soft delete](../storage/blobs/soft-delete-blob-overview.md) feature in Azure Blob Storage to determine whether blobs have transitioned to a soft deleted state. When blobs are detected in this state, a search indexer uses this information to remove the corresponding document from the index.
3838

3939
> [!IMPORTANT]
40-
> Support for native blob soft delete is in preview under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [REST API version 2020-06-30-Preview](./search-api-preview.md) provides this feature. There is currently no .NET SDK support.
40+
> Support for native blob soft delete is in preview under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [REST API version 2020-06-30-Preview](./search-api-preview.md) provides this feature. There's currently no .NET SDK support.
4141
4242
### Requirements for native soft delete
4343

44+
+ Blobs must be in an Azure Blob Storage container. The Cognitive Search native blob soft delete policy isn't supported for blobs in ADLS Gen2.
45+
4446
+ [Enable soft delete for blobs](../storage/blobs/soft-delete-blob-enable.md).
45-
+ Blobs must be in an Azure Blob Storage container. The Cognitive Search native blob soft delete policy is not supported for blobs in ADLS Gen2.
46-
+ Document keys for the documents in your index must be mapped to either be a blob property or blob metadata.
47-
+ You must use the preview REST API (`api-version=2020-06-30-Preview`) or the indexer Data Source configuration in your Cognitive Search Service from the Azure portal, to configure support for soft delete.
4847

49-
### How to configure deletion detection using native soft delete
48+
+ Document keys for the documents in your index must be mapped to either be a blob property or blob metadata, such as "metadata_storage_path".
49+
50+
+ You must use the preview REST API (`api-version=2020-06-30-Preview`), or the indexer Data Source configuration in the Azure portal, to configure support for soft delete.
51+
52+
### Configure native soft delete
5053

51-
1. In Blob storage, when enabling soft delete, set the retention policy to a value that's much higher than your indexer interval schedule. This way if there's an issue running the indexer or if you have a large number of documents to index, there's plenty of time for the indexer to eventually process the soft deleted blobs. Azure Cognitive Search indexers will only delete a document from the index if it processes the blob while it's in a soft deleted state.
54+
In Blob storage, when enabling soft delete per the requirements, set the retention policy to a value that's much higher than your indexer interval schedule. If there's an issue running the indexer, or if you have a large number of documents to index, there's plenty of time for the indexer to eventually process the soft deleted blobs. Azure Cognitive Search indexers will only delete a document from the index if it processes the blob while it's in a soft deleted state.
5255

53-
1. In Cognitive Search, set a native blob soft deletion detection policy on the data source. You can do this either from the Azure portal or by using preview REST API (`api-version=2020-06-30-Preview`).
56+
In Cognitive Search, set a native blob soft deletion detection policy on the data source. You can do this either from the Azure portal or by using the preview REST API (`api-version=2020-06-30-Preview`). The following instructions explain how to set the deletion detection policy in the Azure portal or through the REST API.
5457

5558
### [**Azure portal**](#tab/portal)
5659

@@ -60,16 +63,15 @@ For this deletion detection approach, Cognitive Search depends on the [native bl
6063

6164
The following screenshot shows where you can find this feature in the portal.
6265

63-
:::image type="content" source="media/search-indexing-changed-deleted-blobs/new-data-source.png" alt-text="Screenshot of portal data source." border="true":::
66+
:::image type="content" source="media/search-indexing-changed-deleted-blobs/new-data-source.png" alt-text="Screenshot of data source configuration in Import Data wizard." border="true":::
6467

6568
1. On the **New Data Source** form, fill out the required fields, select the **Track deletions** checkbox and choose **Native blob soft delete**. Then hit **Save** to enable the feature on Data Source creation.
6669

6770
:::image type="content" source="media/search-indexing-changed-deleted-blobs/native-soft-delete.png" alt-text="Screenshot of portal data source native soft delete." border="true":::
6871

69-
7072
### [**REST**](#tab/rest-api)
7173

72-
An example of using REST API to set soft deletion detection policy on the data source is shown below.
74+
Set the soft deletion detection policy in the data source definition. Specify the preview API version when creating or updating the data source.
7375

7476
```http
7577
PUT https://[service name].search.windows.net/datasources/blob-datasource?api-version=2020-06-30-Preview
@@ -86,13 +88,15 @@ api-key: [admin key]
8688
}
8789
```
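
The same data source definition can be assembled programmatically. The following Python sketch is illustrative only: the helper names `build_blob_data_source` and `build_put_url` are hypothetical, the connection string and container name are placeholders, and the body is an assumption based on the `NativeBlobSoftDeleteDeletionDetectionPolicy` type rather than a copy of the request shown above.

```python
import json


def build_blob_data_source(name, connection_string, container_name):
    """Build a blob data source definition with native soft delete detection.

    Hypothetical helper: only the policy type and the 'azureblob' data source
    type are service-defined; all argument values are caller-supplied.
    """
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container_name},
        # This policy tells the indexer to remove search documents whose
        # source blobs are found in a soft deleted state.
        "dataDeletionDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.NativeBlobSoftDeleteDeletionDetectionPolicy"
        },
    }


def build_put_url(service_name, data_source_name, api_version="2020-06-30-Preview"):
    """Return the create-or-update URL for the data source (preview API version)."""
    return (
        f"https://{service_name}.search.windows.net"
        f"/datasources/{data_source_name}?api-version={api_version}"
    )


if __name__ == "__main__":
    definition = build_blob_data_source(
        "blob-datasource", "<connection string>", "<container name>"
    )
    print(build_put_url("<service name>", "blob-datasource"))
    print(json.dumps(definition, indent=2))
```

Send the JSON as the body of the `PUT` request, with the admin key in the `api-key` header.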
8890

89-
1. [Run the indexer](/rest/api/searchservice/run-indexer) or set the indexer to run [on a schedule](search-howto-schedule-indexers.md). When the indexer runs and processes a blob having a soft delete state, the corresponding search document will be removed from the index.
91+
[Run the indexer](/rest/api/searchservice/run-indexer) or set the indexer to run [on a schedule](search-howto-schedule-indexers.md). When the indexer runs and processes a blob having a soft delete state, the corresponding search document will be removed from the index.
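
As a minimal sketch of an on-demand run, the following Python code builds the Run Indexer request using only the standard library. The function name `build_run_indexer_request` is hypothetical, and the service name, indexer name, and admin key are placeholders that you supply.

```python
import urllib.request


def build_run_indexer_request(service_name, indexer_name, admin_key,
                              api_version="2020-06-30-Preview"):
    """Build (but don't send) a POST request that runs an indexer on demand."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexers/{indexer_name}/run?api-version={api_version}"
    )
    # Run Indexer takes no request body; the admin key authorizes the call.
    return urllib.request.Request(url, method="POST", headers={"api-key": admin_key})


if __name__ == "__main__":
    request = build_run_indexer_request("<service name>", "<indexer name>", "<admin key>")
    print(request.get_method(), request.full_url)
    # To send the request for real, replace the placeholders and call:
    # urllib.request.urlopen(request)
```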
92+
93+
---
9094

91-
### Re-index un-deleted blobs using native soft delete policies
95+
### Reindex undeleted blobs using native soft delete policies
9296

93-
If you restore a soft deleted blob in Blob storage, the indexer will not always re-index it. This is because the indexer uses the blob's `LastModified` timestamp to determine whether indexing is needed. When a soft deleted blob is undeleted, its `LastModified` timestamp does not get updated, so if the indexer has already processed blobs with more recent `LastModified` timestamps, it won't re-index the undeleted blob.
97+
If you restore a soft deleted blob in Blob storage, the indexer won't always reindex it. This is because the indexer uses the blob's `LastModified` timestamp to determine whether indexing is needed. When a soft deleted blob is undeleted, its `LastModified` timestamp doesn't get updated, so if the indexer has already processed blobs with more recent `LastModified` timestamps, it won't reindex the undeleted blob.
9498

95-
To make sure that an undeleted blob is reindexed, you will need to update the blob's `LastModified` timestamp. One way to do this is by resaving the metadata of that blob. You don't need to change the metadata, but resaving the metadata will update the blob's `LastModified` timestamp so that the indexer knows to pick it up.
99+
To make sure that an undeleted blob is reindexed, you'll need to update the blob's `LastModified` timestamp. One way to do this is by resaving the metadata of that blob. You don't need to change the metadata, but resaving the metadata will update the blob's `LastModified` timestamp so that the indexer knows to pick it up.
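
One way to script the restore-and-resave step is sketched below. It assumes `blob_client` behaves like a `BlobClient` from the `azure-storage-blob` Python package (which provides `undelete_blob`, `get_blob_properties`, and `set_blob_metadata`); the function name `restore_and_touch` is hypothetical.

```python
def restore_and_touch(blob_client):
    """Undelete a soft deleted blob, then resave its metadata unchanged.

    Assumes blob_client exposes undelete_blob(), get_blob_properties(), and
    set_blob_metadata(), as azure.storage.blob.BlobClient does.
    """
    # Restore the soft deleted blob. This alone doesn't update LastModified.
    blob_client.undelete_blob()

    # Read the current metadata and write the same values back. The write
    # updates LastModified, so the next indexer run picks up the blob.
    properties = blob_client.get_blob_properties()
    metadata = dict(properties.metadata or {})
    blob_client.set_blob_metadata(metadata=metadata)
    return metadata
```

The metadata values are written back as-is, so only the timestamp changes.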
96100

97101
<a name="soft-delete-using-custom-metadata"></a>
98102

@@ -123,13 +127,13 @@ There are steps to follow in both Azure Storage and Cognitive Search, but there
123127
124128
1. Run the indexer. Once the indexer has processed the file and deleted the document from the search index, you can then delete the physical file in Azure Storage.
125129
126-
## Re-index un-deleted blobs and files
130+
## Reindex undeleted blobs and files
127131
128132
You can reverse a soft-delete if the original source file still physically exists in Azure Storage.
129133
130134
1. Change the `"softDeleteMarkerValue" : "false"` on the blob or file in Azure Storage.
131135
132-
1. Check the blob or file's `LastModified` timestamp to make it is newer than the last indexer run. You can force an update to the current date and time by re-saving the existing metadata.
136+
1. Check the blob or file's `LastModified` timestamp to make sure it's newer than the last indexer run. You can force an update to the current date and time by resaving the existing metadata.
133137
134138
1. Run the indexer.
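
The first two steps can be combined in a short script, because writing metadata also updates `LastModified`. This is a sketch under assumptions: `blob_client` is expected to behave like a `BlobClient` from the `azure-storage-blob` Python package, the function name `clear_soft_delete_marker` is hypothetical, and the metadata property name `IsDeleted` is a placeholder for whatever `softDeleteColumnName` your data source defines.

```python
def clear_soft_delete_marker(blob_client, marker_name="IsDeleted", live_value="false"):
    """Reset the custom soft delete marker on a blob.

    marker_name is a hypothetical placeholder; use the property named in the
    data source's soft delete policy. blob_client is assumed to expose
    get_blob_properties() and set_blob_metadata().
    """
    properties = blob_client.get_blob_properties()
    metadata = dict(properties.metadata or {})

    # Use any value that differs from the configured softDeleteMarkerValue.
    metadata[marker_name] = live_value

    # Saving the metadata also refreshes LastModified, so the blob is newer
    # than the last indexer run.
    blob_client.set_blob_metadata(metadata=metadata)
    return metadata
```

After the marker is reset, run the indexer as described in the last step.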
135139
