articles/search/search-howto-indexing-azure-blob-storage.md
10 additions & 10 deletions
@@ -294,7 +294,7 @@ When you set up a blob indexer to run on a schedule, it reindexes only the chang
 > [!NOTE]
 > You don't have to specify a change detection policy – incremental indexing is enabled for you automatically.

-To support deleting documents, use a soft delete approach. If you delete the blobs outright, corresponding documents will not be removed from the search index. Blobs must be in a soft delete state for Azure Cognitive Search to process them.
+To support deleting documents, use a "soft delete" approach. If you delete the blobs outright, corresponding documents will not be removed from the search index.

 There are two ways to implement the soft delete approach. Both are described below.
@@ -303,12 +303,12 @@ There are two ways to implement the soft delete approach. Both are described bel
 > [!IMPORTANT]
 > Support for native blob soft delete is in preview. Preview functionality is provided without a service level agreement, and is not recommended for production workloads. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [REST API version 2019-05-06-Preview](https://docs.microsoft.com/azure/search/search-api-preview) provides this feature. There is currently no portal or .NET SDK support.

-In this method you will use the native blob soft delete feature offered by Azure Blob storage. If the data source has a native soft delete policy set and the indexer finds a blob that has been soft deleted, the indexer will remove that document from the index.
+In this method you will use the [native blob soft delete](https://docs.microsoft.com/azure/storage/blobs/storage-blob-soft-delete) feature offered by Azure Blob storage. If the data source has a native soft delete policy set and the indexer finds a blob that has been transitioned to a soft deleted state, the indexer will remove that document from the index.

 Use the following steps:
-1. Enable [native soft delete for Azure Blob storage](https://docs.microsoft.com/azure/storage/blobs/storage-blob-soft-delete). We recommend setting the retention policy to a value that's much higher than your indexer interval schedule. This way if there's an issue running the indexer or if you have a large number of documents to index, there's plenty of time for the indexer to eventually process the soft deleted documents. Azure Cognitive Search indexers will only delete a document from the index if it processes the blob while it's in a soft delete state.
-2. Configure a native blob soft deletion detection policy on the data source. An example is shown below. Since this feature is in preview, you must use the preview REST API.
-3.When the indexer processes the blob it will be removed from the index.
+1. Enable [native soft delete for Azure Blob storage](https://docs.microsoft.com/azure/storage/blobs/storage-blob-soft-delete). We recommend setting the retention policy to a value that's much higher than your indexer interval schedule. This way if there's an issue running the indexer or if you have a large number of documents to index, there's plenty of time for the indexer to eventually process the soft deleted blobs. Azure Cognitive Search indexers will only delete a document from the index if it processes the blob while it's in a soft deleted state.
+1. Configure a native blob soft deletion detection policy on the data source. An example is shown below. Since this feature is in preview, you must use the preview REST API.
+1. Run the indexer or set the indexer to run on a schedule. When the indexer runs and processes the blob the document will be removed from the index.
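
Editor's sketch (not part of the commit): the data source configuration in the second step could be assembled from Python roughly as follows. This is a minimal sketch under stated assumptions: the service name, API key, connection string, and container name are hypothetical placeholders, and the `@odata.type` value for the native soft delete policy is an assumption that should be confirmed against the preview REST reference. The request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute your own service name and admin key.
SERVICE_NAME = "my-search-service"
API_KEY = "<admin-api-key>"
API_VERSION = "2019-05-06-Preview"  # preview API version named in the article


def build_datasource_request(
    name: str, connection_string: str, container: str
) -> urllib.request.Request:
    """Build (but do not send) a PUT request defining a blob data source
    that carries a native blob soft delete detection policy."""
    body = {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
        "dataDeletionDetectionPolicy": {
            # Assumed policy type name; verify against the preview REST docs.
            "@odata.type": "#Microsoft.Azure.Search.NativeBlobSoftDeleteDeletionDetectionPolicy"
        },
    }
    url = (
        f"https://{SERVICE_NAME}.search.windows.net"
        f"/datasources/{name}?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": API_KEY},
        method="PUT",
    )


request = build_datasource_request(
    "blob-datasource", "<connection-string>", "my-container"
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would issue the call; it is left out here so the sketch runs without credentials.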
```
PUT https://[service name].search.windows.net/datasources/blob-datasource?api-version=2019-05-06-Preview
-If you natively soft delete a blob from Azure Blob storage you have the option to undelete that blob within the retention period. When an Azure Cognitive Search data source has a native blob soft delete policy and the indexer processes a soft deleted blob it will remove that document from the index. If that blob is later undeleted the indexer will **not** always reindex that blob. This is because the indexer determines which blobs to index based on the blob's `LastModified` timestamp. When a soft deleted blob is undeleted its `LastModified` timestamp does not get updated so if the indexer has already processed blobs with `LastModified` timestamps more recent than the undeleted blob it won't reindex the undeleted blob. To make sure that an undeleted blob is reindexed, you should resave the metadata of that blob. This will update its `LastModified` timestamp so that the indexer knows that it needs to index this blob.
+If you delete a blob from Azure Blob storage with native soft delete enabled on your storage account the blob will transition to a soft deleted state giving you the option to undelete that blob within the retention period. When an Azure Cognitive Search data source has a native blob soft delete policy and the indexer processes a soft deleted blob it will remove that document from the index. If that blob is later undeleted the indexer will **not** always reindex that blob. This is because the indexer determines which blobs to index based on the blob's `LastModified` timestamp. When a soft deleted blob is undeleted its `LastModified` timestamp does not get updated, so if the indexer has already processed blobs with `LastModified` timestamps more recent than the undeleted blob it won't reindex the undeleted blob. To make sure that an undeleted blob is reindexed, you should resave the metadata of that blob. You don't need to change the metadata but resaving the metadata will update the blob's `LastModified` timestamp so that the indexer knows that it needs to reindex this blob.
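
Editor's sketch (not part of the commit): the `LastModified` behavior described above can be illustrated with a toy model. This is a simplified simulation, assuming the indexer tracks a single high-water mark timestamp; the `Blob` and `HighWaterMarkIndexer` names are hypothetical and do not reflect the service's actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Blob:
    name: str
    last_modified: datetime


class HighWaterMarkIndexer:
    """Toy model: each run picks up only blobs whose LastModified is newer
    than the newest timestamp processed by earlier runs."""

    def __init__(self) -> None:
        self.high_water_mark = datetime.min.replace(tzinfo=timezone.utc)

    def run(self, blobs: list[Blob]) -> list[str]:
        changed = [b for b in blobs if b.last_modified > self.high_water_mark]
        if changed:
            self.high_water_mark = max(b.last_modified for b in changed)
        return sorted(b.name for b in changed)


t0 = datetime(2020, 1, 1, tzinfo=timezone.utc)
a = Blob("a.pdf", t0)
b = Blob("b.pdf", t0 + timedelta(hours=1))

indexer = HighWaterMarkIndexer()
first_run = indexer.run([a, b])  # both blobs are new to the indexer

# Suppose a.pdf is soft deleted, dropped from the index, and later undeleted.
# Undeleting leaves its LastModified unchanged, so it is older than the mark.
after_undelete = indexer.run([a, b])

# Resaving the metadata bumps LastModified, so the next run picks the blob up.
a.last_modified = t0 + timedelta(hours=2)
after_resave = indexer.run([a, b])

print(first_run, after_undelete, after_resave)
```

In this model the undeleted blob is skipped until its timestamp moves past the high-water mark, which is the effect that resaving the blob's metadata is meant to achieve.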

-### Custom soft delete
+### Soft delete using custom metadata

 In this method you will use a custom metadata property to indicate when a document should be removed from the search index.
@@ -349,7 +349,7 @@ For example, the following policy considers a blob to be deleted if it has a met
@@ -359,7 +359,7 @@ For example, the following policy considers a blob to be deleted if it has a met
 #### Reindexing undeleted blobs

-If you set a soft delete column detection policy on your data source, set the soft delete column name with the marker value, then ran the indexer the indexer will remove that document from the index. If you would like reindex that document, simply change the marker value for that blob and rerun the indexer.
+If you set a soft delete column detection policy on your data source, then add the custom metadata property to a blob with the marker value, then run the indexer, the indexer will remove that document from the index. If you would like reindex that document, simply change the soft delete metadata value for that blob and rerun the indexer.
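
Editor's sketch (not part of the commit): the marker-value check behind the custom metadata approach might look like the following. The policy shape, the `IsDeleted` property name, and the `true` marker value are illustrative assumptions, as is the exact string comparison; the service's real matching rules may differ.

```python
# Assumed shape of a soft delete column policy; names and values are illustrative.
SOFT_DELETE_POLICY = {
    "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
    "softDeleteColumnName": "IsDeleted",
    "softDeleteMarkerValue": "true",
}


def is_marked_deleted(
    metadata: dict[str, str], policy: dict[str, str] = SOFT_DELETE_POLICY
) -> bool:
    """Return True when the blob's metadata carries the policy's marker value.
    Exact string equality is an assumption made for this sketch."""
    column = policy["softDeleteColumnName"]
    return metadata.get(column) == policy["softDeleteMarkerValue"]


blob_metadata = {"IsDeleted": "true"}
deleted_before = is_marked_deleted(blob_metadata)  # document would be removed

# Changing the marker value makes the blob eligible for indexing again.
blob_metadata["IsDeleted"] = "false"
deleted_after = is_marked_deleted(blob_metadata)

print(deleted_before, deleted_after)
```

A blob with no such metadata property is treated as not deleted in this sketch, since `dict.get` returns `None` for a missing key.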