You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/search/cognitive-search-incremental-indexing-conceptual.md
+13-13Lines changed: 13 additions & 13 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -8,15 +8,15 @@ ms.service: azure-ai-search
8
8
ms.custom:
9
9
- ignite-2023
10
10
ms.topic: conceptual
11
-
ms.date: 01/17/2025
11
+
ms.date: 07/11/2025
12
12
---
13
13
14
14
# Incremental enrichment and caching in Azure AI Search
15
15
16
16
> [!IMPORTANT]
17
17
> This feature is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [preview REST API](/rest/api/searchservice/search-service-api-versions#preview-versions) supports this feature.
18
18
19
-
*Incremental enrichment* refers to the use of cached enrichments during [skillset execution](cognitive-search-working-with-skillsets.md) so that only new and changed skills and documents incur Standard processing charges for API calls to Azure AI services. The cache contains the output from [document cracking](search-indexer-overview.md#document-cracking), plus the outputs of each skill for every document. Although caching is billable (it uses Azure Storage), the overall cost of enrichment is reduced because the costs of storage are less than image extraction and AI processing.
19
+
*Incremental enrichment* refers to the use of cached enrichments during [skillset execution](cognitive-search-working-with-skillsets.md) so that only new and changed skills and documents incur standard processing charges for API calls to Azure AI services. The cache contains the output from [document cracking](search-indexer-overview.md#document-cracking), plus the outputs of each skill for every document. Although caching is billable (it uses Azure Storage), the overall cost of enrichment is reduced because the costs of storage are less than image extraction and AI processing.
20
20
21
21
To ensure synchronization between your data source data and your index, it's important to understand your unique [data source](search-data-sources-gallery.md) change and deletion tracking prerequisites. This guide specifically addresses how to manage incremental modifications in terms of your skills processing and how to utilize cache for this purpose.
22
22
@@ -37,10 +37,10 @@ The cache is created when you specify the "cache" property and run the indexer.
37
37
38
38
The following example illustrates an indexer with caching enabled. See [Enable enrichment caching](search-howto-incremental-index.md) for full instructions.
39
39
40
-
To use the cache property, you can use 2020-06-30-preview or later when you [create or update an indexer](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true). We recommend the latest preview API.
40
+
To use the cache property, you can use 2020-06-30-preview or later when you [create or update an indexer](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true). We recommend the latest preview API.
41
41
42
42
```json
43
-
POST https://[YOUR-SEARCH-SERVICE-NAME].search.windows.net/indexers?api-version=2024-05-01-preview
43
+
POST https://[YOUR-SEARCH-SERVICE-NAME].search.windows.net/indexers?api-version=2025-05-01-preview
44
44
{
45
45
"name": "myIndexerName",
46
46
"targetIndexName": "myIndex",
@@ -90,7 +90,7 @@ If you know that a change to the skill is indeed superficial, you should overrid
90
90
Setting this parameter ensures that only updates to the skillset definition are committed and the change isn't evaluated for effects on the existing cache. Use a preview API version, 2020-06-30-Preview or later. We recommend the latest preview API.
91
91
92
92
```http
93
-
PUT https://[servicename].search.windows.net/skillsets/[skillset name]?api-version=2024-05-01-preview&disableCacheReprocessingChangeDetection
93
+
PUT https://[servicename].search.windows.net/skillsets/[skillset name]?api-version=2025-05-01-preview&disableCacheReprocessingChangeDetection
94
94
95
95
```
96
96
@@ -101,7 +101,7 @@ PUT https://[servicename].search.windows.net/skillsets/[skillset name]?api-versi
101
101
Most changes to a data source definition will invalidate the cache. However, for scenarios where you know that a change shouldn't invalidate the cache - such as changing a connection string or rotating the key on the storage account - append the `ignoreResetRequirement` parameter on the [data source update](/rest/api/searchservice/data-sources/create-or-update). Setting this parameter to true allows the commit to go through, without triggering a reset condition that would result in all objects being rebuilt and populated from scratch.
102
102
103
103
```http
104
-
PUT https://[search service].search.windows.net/datasources/[data source name]?api-version=2024-05-01-preview&ignoreResetRequirement
104
+
PUT https://[search service].search.windows.net/datasources/[data source name]?api-version=2025-05-01-preview&ignoreResetRequirement
105
105
106
106
```
107
107
@@ -111,13 +111,13 @@ PUT https://[search service].search.windows.net/datasources/[data source name]?a
111
111
112
112
The purpose of the cache is to avoid unnecessary processing, but suppose you make a change to a skill that the indexer doesn't detect (for example, changing something in external code, such as a custom skill).
113
113
114
-
In this case, you can use the [Reset Skills](/rest/api/searchservice/skillsets/reset-skills?view=rest-searchservice-2024-05-01-preview&preserve-view=true) to force reprocessing of a particular skill, including any downstream skills that have a dependency on that skill's output. This API accepts a POST request with a list of skills that should be invalidated and marked for reprocessing. After Reset Skills, follow with a [Run Indexer](/rest/api/searchservice/indexers/run) request to invoke the pipeline processing.
114
+
In this case, you can use the [Reset Skills](/rest/api/searchservice/skillsets/reset-skills?view=rest-searchservice-2025-05-01-preview&preserve-view=true) to force reprocessing of a particular skill, including any downstream skills that have a dependency on that skill's output. This API accepts a POST request with a list of skills that should be invalidated and marked for reprocessing. After Reset Skills, follow with a [Run Indexer](/rest/api/searchservice/indexers/run) request to invoke the pipeline processing.
115
115
116
116
## Re-cache specific documents
117
117
118
118
[Resetting an indexer](/rest/api/searchservice/indexers/reset) will result in all documents in the search corpus being reprocessed.
119
119
120
-
In scenarios where only a few documents need to be reprocessed, use [Reset Documents (preview)](/rest/api/searchservice/indexers/reset-docs?view=rest-searchservice-2024-05-01-preview&preserve-view=true) to force reprocessing of specific documents. When a document is reset, the indexer invalidates the cache for that document, which is then reprocessed by reading it from the data source. For more information, see [Run or reset indexers, skills, and documents](search-howto-run-reset-indexers.md).
120
+
In scenarios where only a few documents need to be reprocessed, use [Reset Documents (preview)](/rest/api/searchservice/indexers/reset-docs?view=rest-searchservice-2025-05-01-preview&preserve-view=true) to force reprocessing of specific documents. When a document is reset, the indexer invalidates the cache for that document, which is then reprocessed by reading it from the data source. For more information, see [Run or reset indexers, skills, and documents](search-howto-run-reset-indexers.md).
121
121
122
122
To reset specific documents, the request provides a list of document keys as read from the search index. If the key is mapped to a field in the external data source, the value that you provide should be the one used in the search index.
123
123
@@ -132,7 +132,7 @@ Depending on how you call the API, the request will either append, overwrite, or
132
132
The following example illustrates a reset document request:
133
133
134
134
```http
135
-
POST https://[search service name].search.windows.net/indexers/[indexer name]/resetdocs?api-version=2024-05-01-preview
135
+
POST https://[search service name].search.windows.net/indexers/[indexer name]/resetdocs?api-version=2025-05-01-preview
136
136
{
137
137
"documentKeys" : [
138
138
"key1",
@@ -183,13 +183,13 @@ REST API version `2020-06-30-Preview` or later provides incremental enrichment t
183
183
184
184
Skillsets and data sources can use the generally available version. In addition to the reference documentation, see [Configure caching for incremental enrichment](search-howto-incremental-index.md) for details about order of operations.
185
185
186
-
+[Create or Update Indexer (api-version=2024-05-01-preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true)
186
+
+[Create or Update Indexer (api-version=2025-05-01-preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true)
+[Create or Update Skillset (api-version=2024-07-01)](/rest/api/searchservice/skillsets/create-or-update) (New URI parameter on the request)
190
+
+[Create or Update Skillset (api-version=2025-05-01-preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) (New URI parameter on the request)
191
191
192
-
+[Create or Update Data Source (api-version=2024-07-01)](/rest/api/searchservice/data-sources/create-or-update), when called with a preview API version, provides a new parameter named "ignoreResetRequirement", which should be set to true when your update action shouldn't invalidate the cache. Use "ignoreResetRequirement" sparingly as it could lead to unintended inconsistency in your data that won't be detected easily.
192
+
+[Create or Update Data Source (api-version=2025-05-01-preview)](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true), when called with a preview API version, provides a new parameter named "ignoreResetRequirement", which should be set to true when your update action shouldn't invalidate the cache. Use "ignoreResetRequirement" sparingly as it could lead to unintended inconsistency in your data that won't be detected easily.
Copy file name to clipboardExpand all lines: articles/search/cognitive-search-skill-image-analysis.md
+2-2Lines changed: 2 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -10,7 +10,7 @@ ms.service: azure-ai-search
10
10
ms.custom:
11
11
- ignite-2023
12
12
ms.topic: reference
13
-
ms.date: 03/07/2024
13
+
ms.date: 07/11/2024
14
14
---
15
15
16
16
# Image Analysis cognitive skill
@@ -25,7 +25,7 @@ This skill uses the machine learning models provided by [Azure AI Vision](/azure
25
25
26
26
Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
27
27
28
-
This skill is implemented using the [AI Image Analysis API](/azure/ai-services/computer-vision/overview-image-analysis) version 3.2. If your solution requires calling a newer version of that service API (such as version 4.0), consider implementing through [Web API custom skill](cognitive-search-custom-skill-web-api.md).
28
+
This skill is implemented using the [AI Image Analysis API](/azure/ai-services/computer-vision/overview-image-analysis) version 3.2. If your solution requires calling a newer version of that service API (such as version 4.0), consider implementing through [Web API custom skill](cognitive-search-custom-skill-web-api.md) or use the [ImageAnalysisV4 power skill](https://github.com/Azure-Samples/azure-search-power-skills/blob/main/Vision/ImageAnalysisV4/README.md).
29
29
30
30
> [!NOTE]
31
31
> This skill is bound to Azure AI services and requires [a billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing [Azure AI services Standard price](https://azure.microsoft.com/pricing/details/cognitive-services/).
@@ -55,15 +55,15 @@ Once content is extracted, the [skillset](cognitive-search-working-with-skillset
55
55
56
56
### Download files
57
57
58
-
Download a zip file of the sample data repository and extract the contents. [Learn how](https://docs.github.com/get-started/start-your-journey/downloading-files-from-github).
58
+
+Download a zip file of the sample data repository and extract the contents. [Learn how](https://docs.github.com/get-started/start-your-journey/downloading-files-from-github).
59
59
60
-
+[Sample data files (mixed media)](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/ai-enrichment-mixed-media)
60
+
+Download the [sample data files (mixed media)](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/ai-enrichment-mixed-media)
61
61
62
62
### Upload sample data to Azure Storage
63
63
64
-
1. In Azure Storage, create a new container and name it *cog-search-demo*.
64
+
1. In Azure Storage, [create a new container](/azure/storage/blobs/storage-quickstart-blobs-portal) and name it *mixed-content-types*.
65
65
66
-
1.[Upload the sample data files](/azure/storage/blobs/storage-quickstart-blobs-portal).
66
+
1. Upload the sample data files.
67
67
68
68
1. Get a storage connection string so that you can formulate a connection in Azure AI Search.
69
69
@@ -87,11 +87,9 @@ For this tutorial, connections to Azure AI Search require an endpoint and an API
87
87
88
88
1. Under **Settings** > **Keys**, copy an admin key. Admin keys are used to add, modify, and delete objects. There are two interchangeable admin keys. Copy either one.
89
89
90
-
:::image type="content" source="media/search-get-started-rest/get-url-key.png" alt-text="Screenshot of the URL and API keys in the Azure portal.":::
91
-
92
90
## Set up your environment
93
91
94
-
Begin by opening Visual Studio and creating a new Console App project that can run on .NET Core.
92
+
Begin by opening Visual Studio and creating a new Console App project.
95
93
96
94
### Install Azure.Search.Documents
97
95
@@ -173,7 +171,7 @@ public static void Main(string[] args)
173
171
> The clients connect to your search service. In order to avoid opening too many connections, you should try to share a single instance in your application if possible. The methods are thread-safe to enable such sharing.
174
172
>
175
173
176
-
### Add function to exit the program during failure
174
+
### Add a function to exit the program during failure
177
175
178
176
This tutorial is meant to help you understand each step of the indexing pipeline. If there's a critical issue that prevents the program from creating the data source, skillset, index, or indexer the program will output the error message and exit so that the issue can be understood and addressed.
0 commit comments