
Commit af4da57

Freshness pass #1 for July
1 parent edfb365 commit af4da57

8 files changed: 57 additions & 56 deletions

articles/search/cognitive-search-incremental-indexing-conceptual.md

Lines changed: 13 additions & 13 deletions
@@ -8,15 +8,15 @@ ms.service: azure-ai-search
 ms.custom:
   - ignite-2023
 ms.topic: conceptual
-ms.date: 01/17/2025
+ms.date: 07/11/2025
 ---

 # Incremental enrichment and caching in Azure AI Search

 > [!IMPORTANT]
 > This feature is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [preview REST API](/rest/api/searchservice/search-service-api-versions#preview-versions) supports this feature.

-*Incremental enrichment* refers to the use of cached enrichments during [skillset execution](cognitive-search-working-with-skillsets.md) so that only new and changed skills and documents incur Standard processing charges for API calls to Azure AI services. The cache contains the output from [document cracking](search-indexer-overview.md#document-cracking), plus the outputs of each skill for every document. Although caching is billable (it uses Azure Storage), the overall cost of enrichment is reduced because the costs of storage are less than image extraction and AI processing.
+*Incremental enrichment* refers to the use of cached enrichments during [skillset execution](cognitive-search-working-with-skillsets.md) so that only new and changed skills and documents incur standard processing charges for API calls to Azure AI services. The cache contains the output from [document cracking](search-indexer-overview.md#document-cracking), plus the outputs of each skill for every document. Although caching is billable (it uses Azure Storage), the overall cost of enrichment is reduced because the costs of storage are less than image extraction and AI processing.

 To ensure synchronization between your data source data and your index, it's important to understand your unique [data source](search-data-sources-gallery.md) change and deletion tracking prerequisites. This guide specifically addresses how to manage incremental modifications in terms of your skills processing and how to utilize cache for this purpose.

@@ -37,10 +37,10 @@ The cache is created when you specify the "cache" property and run the indexer.

 The following example illustrates an indexer with caching enabled. See [Enable enrichment caching](search-howto-incremental-index.md) for full instructions.

-To use the cache property, you can use 2020-06-30-preview or later when you [create or update an indexer](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true). We recommend the latest preview API.
+To use the cache property, you can use 2020-06-30-preview or later when you [create or update an indexer](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true). We recommend the latest preview API.

 ```json
-POST https://[YOUR-SEARCH-SERVICE-NAME].search.windows.net/indexers?api-version=2024-05-01-preview
+POST https://[YOUR-SEARCH-SERVICE-NAME].search.windows.net/indexers?api-version=2025-05-01-preview
 {
     "name": "myIndexerName",
     "targetIndexName": "myIndex",
@@ -90,7 +90,7 @@ If you know that a change to the skill is indeed superficial, you should overrid
 Setting this parameter ensures that only updates to the skillset definition are committed and the change isn't evaluated for effects on the existing cache. Use a preview API version, 2020-06-30-Preview or later. We recommend the latest preview API.

 ```http
-PUT https://[servicename].search.windows.net/skillsets/[skillset name]?api-version=2024-05-01-preview&disableCacheReprocessingChangeDetection
+PUT https://[servicename].search.windows.net/skillsets/[skillset name]?api-version=2025-05-01-preview&disableCacheReprocessingChangeDetection

 ```

@@ -101,7 +101,7 @@ PUT https://[servicename].search.windows.net/skillsets/[skillset name]?api-versi
 Most changes to a data source definition will invalidate the cache. However, for scenarios where you know that a change shouldn't invalidate the cache - such as changing a connection string or rotating the key on the storage account - append the `ignoreResetRequirement` parameter on the [data source update](/rest/api/searchservice/data-sources/create-or-update). Setting this parameter to true allows the commit to go through, without triggering a reset condition that would result in all objects being rebuilt and populated from scratch.

 ```http
-PUT https://[search service].search.windows.net/datasources/[data source name]?api-version=2024-05-01-preview&ignoreResetRequirement
+PUT https://[search service].search.windows.net/datasources/[data source name]?api-version=2025-05-01-preview&ignoreResetRequirement

 ```

@@ -111,13 +111,13 @@ PUT https://[search service].search.windows.net/datasources/[data source name]?a

 The purpose of the cache is to avoid unnecessary processing, but suppose you make a change to a skill that the indexer doesn't detect (for example, changing something in external code, such as a custom skill).

-In this case, you can use the [Reset Skills](/rest/api/searchservice/skillsets/reset-skills?view=rest-searchservice-2024-05-01-preview&preserve-view=true) to force reprocessing of a particular skill, including any downstream skills that have a dependency on that skill's output. This API accepts a POST request with a list of skills that should be invalidated and marked for reprocessing. After Reset Skills, follow with a [Run Indexer](/rest/api/searchservice/indexers/run) request to invoke the pipeline processing.
+In this case, you can use the [Reset Skills](/rest/api/searchservice/skillsets/reset-skills?view=rest-searchservice-2025-05-01-preview&preserve-view=true) to force reprocessing of a particular skill, including any downstream skills that have a dependency on that skill's output. This API accepts a POST request with a list of skills that should be invalidated and marked for reprocessing. After Reset Skills, follow with a [Run Indexer](/rest/api/searchservice/indexers/run) request to invoke the pipeline processing.
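For reference, a Reset Skills request names the skills to invalidate in its body. This is a minimal sketch against the preview API; the skill name is illustrative:

```http
POST https://[service name].search.windows.net/skillsets/[skillset name]/resetskills?api-version=2025-05-01-preview
{
    "skillNames" : [
        "my-entity-recognition-skill"
    ]
}
```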

 ## Re-cache specific documents

 [Resetting an indexer](/rest/api/searchservice/indexers/reset) will result in all documents in the search corpus being reprocessed.

-In scenarios where only a few documents need to be reprocessed, use [Reset Documents (preview)](/rest/api/searchservice/indexers/reset-docs?view=rest-searchservice-2024-05-01-preview&preserve-view=true) to force reprocessing of specific documents. When a document is reset, the indexer invalidates the cache for that document, which is then reprocessed by reading it from the data source. For more information, see [Run or reset indexers, skills, and documents](search-howto-run-reset-indexers.md).
+In scenarios where only a few documents need to be reprocessed, use [Reset Documents (preview)](/rest/api/searchservice/indexers/reset-docs?view=rest-searchservice-2025-05-01-preview&preserve-view=true) to force reprocessing of specific documents. When a document is reset, the indexer invalidates the cache for that document, which is then reprocessed by reading it from the data source. For more information, see [Run or reset indexers, skills, and documents](search-howto-run-reset-indexers.md).

 To reset specific documents, the request provides a list of document keys as read from the search index. If the key is mapped to a field in the external data source, the value that you provide should be the one used in the search index.

@@ -132,7 +132,7 @@ Depending on how you call the API, the request will either append, overwrite, or
 The following example illustrates a reset document request:

 ```http
-POST https://[search service name].search.windows.net/indexers/[indexer name]/resetdocs?api-version=2024-05-01-preview
+POST https://[search service name].search.windows.net/indexers/[indexer name]/resetdocs?api-version=2025-05-01-preview
 {
     "documentKeys" : [
         "key1",
@@ -183,13 +183,13 @@ REST API version `2020-06-30-Preview` or later provides incremental enrichment t

 Skillsets and data sources can use the generally available version. In addition to the reference documentation, see [Configure caching for incremental enrichment](search-howto-incremental-index.md) for details about order of operations.

-+ [Create or Update Indexer (api-version=2024-05-01-preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true)
++ [Create or Update Indexer (api-version=2025-05-01-preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true)

-+ [Reset Skills (api-version=2024-05-01-preview)](/rest/api/searchservice/skillsets/reset-skills?view=rest-searchservice-2024-05-01-preview&preserve-view=true)
++ [Reset Skills (api-version=2025-05-01-preview)](/rest/api/searchservice/skillsets/reset-skills?view=rest-searchservice-2025-05-01-preview&preserve-view=true)

-+ [Create or Update Skillset (api-version=2024-07-01)](/rest/api/searchservice/skillsets/create-or-update) (New URI parameter on the request)
++ [Create or Update Skillset (api-version=2025-05-01-preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) (New URI parameter on the request)

-+ [Create or Update Data Source (api-version=2024-07-01)](/rest/api/searchservice/data-sources/create-or-update), when called with a preview API version, provides a new parameter named "ignoreResetRequirement", which should be set to true when your update action shouldn't invalidate the cache. Use "ignoreResetRequirement" sparingly as it could lead to unintended inconsistency in your data that won't be detected easily.
++ [Create or Update Data Source (api-version=2025-05-01-preview)](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true), when called with a preview API version, provides a new parameter named "ignoreResetRequirement", which should be set to true when your update action shouldn't invalidate the cache. Use "ignoreResetRequirement" sparingly as it could lead to unintended inconsistency in your data that won't be detected easily.

 ## Next steps

articles/search/cognitive-search-tutorial-blob-dotnet.md

Lines changed: 13 additions & 13 deletions
@@ -9,7 +9,7 @@ manager: nitinme

 ms.service: azure-ai-search
 ms.topic: tutorial
-ms.date: 03/31/2025
+ms.date: 07/11/2025
 ms.custom:
   - devx-track-csharp
   - devx-track-dotnet
@@ -46,7 +46,7 @@ Once content is extracted, the [skillset](cognitive-search-working-with-skillset

 + [Azure AI Search](search-create-app-portal.md).

-+ [Azure.Search.Documents 11.x NuGet package](https://www.nuget.org/packages/Azure.Search.Documents).
++ [Azure.Search.Documents package](https://www.nuget.org/packages/Azure.Search.Documents).

 + [Visual Studio](https://visualstudio.microsoft.com/downloads/).

@@ -55,15 +55,15 @@ Once content is extracted, the [skillset](cognitive-search-working-with-skillset

 ### Download files

-Download a zip file of the sample data repository and extract the contents. [Learn how](https://docs.github.com/get-started/start-your-journey/downloading-files-from-github).
++ Download a zip file of the sample data repository and extract the contents. [Learn how](https://docs.github.com/get-started/start-your-journey/downloading-files-from-github).

-+ [Sample data files (mixed media)](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/ai-enrichment-mixed-media)
++ Download the [sample data files (mixed media)](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/ai-enrichment-mixed-media)

 ### Upload sample data to Azure Storage

-1. In Azure Storage, create a new container and name it *cog-search-demo*.
+1. In Azure Storage, [create a new container](/azure/storage/blobs/storage-quickstart-blobs-portal) and name it *mixed-content-types*.

-1. [Upload the sample data files](/azure/storage/blobs/storage-quickstart-blobs-portal).
+1. Upload the sample data files.

 1. Get a storage connection string so that you can formulate a connection in Azure AI Search.

@@ -87,11 +87,9 @@ For this tutorial, connections to Azure AI Search require an endpoint and an API

 1. Under **Settings** > **Keys**, copy an admin key. Admin keys are used to add, modify, and delete objects. There are two interchangeable admin keys. Copy either one.

-   :::image type="content" source="media/search-get-started-rest/get-url-key.png" alt-text="Screenshot of the URL and API keys in the Azure portal.":::
-
 ## Set up your environment

-Begin by opening Visual Studio and creating a new Console App project that can run on .NET Core.
+Begin by opening Visual Studio and creating a new Console App project.

 ### Install Azure.Search.Documents
@@ -173,7 +171,7 @@ public static void Main(string[] args)
 > The clients connect to your search service. In order to avoid opening too many connections, you should try to share a single instance in your application if possible. The methods are thread-safe to enable such sharing.
 >

-### Add function to exit the program during failure
+### Add a function to exit the program during failure

 This tutorial is meant to help you understand each step of the indexing pipeline. If there's a critical issue that prevents the program from creating the data source, skillset, index, or indexer the program will output the error message and exit so that the issue can be understood and addressed.

@@ -206,7 +204,7 @@ private static SearchIndexerDataSourceConnection CreateOrUpdateDataSource(Search
     name: "demodata",
     type: SearchIndexerDataSourceType.AzureBlob,
     connectionString: configuration["AzureBlobConnectionString"],
-    container: new SearchIndexerDataContainer("cog-search-demo"))
+    container: new SearchIndexerDataContainer("mixed-content-types"))
 {
     Description = "Demo files to demonstrate Azure AI Search capabilities."
 };
@@ -419,14 +417,16 @@ private static EntityRecognitionSkill CreateEntityRecognitionSkill()
         TargetName = "organizations"
     });

-    EntityRecognitionSkill entityRecognitionSkill = new EntityRecognitionSkill(inputMappings, outputMappings)
+    // Specify the V3 version of the EntityRecognitionSkill
+    var skillVersion = EntityRecognitionSkill.SkillVersion.V3;
+
+    var entityRecognitionSkill = new EntityRecognitionSkill(inputMappings, outputMappings, skillVersion)
     {
         Description = "Recognize organizations",
         Context = "/document/pages/*",
         DefaultLanguageCode = EntityRecognitionSkillLanguage.En
     };
     entityRecognitionSkill.Categories.Add(EntityCategory.Organization);
-
     return entityRecognitionSkill;
 }
 ```

articles/search/cognitive-search-working-with-skillsets.md

Lines changed: 6 additions & 6 deletions
@@ -6,7 +6,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: azure-ai-search
 ms.topic: conceptual
-ms.date: 01/15/2025
+ms.date: 07/11/2025
 ---

 # Skillset concepts in Azure AI Search
@@ -19,9 +19,9 @@ The following diagram illustrates the basic data flow of skillset execution.

 :::image type="content" source="media/cognitive-search-working-with-skillsets/skillset-process-diagram-1.png" alt-text="Diagram showing skillset data flows, with focus on inputs, outputs, and mappings." border="true":::

-From the onset of skillset processing to its conclusion, skills read from and write to an [*enriched document*](#enrichment-tree) that exists in memory. Initially, an enriched document is just the raw content extracted from a data source (articulated as the `"/document"` root node). With each skill execution, the enriched document gains structure and substance as each skill writes its output as nodes in the graph.
+From the onset of skillset processing to its conclusion, skills read from and write to an [*enriched document tree*](#enrichment-tree) that exists in memory. Initially, an enriched document is just the raw content extracted from a data source (articulated as the `"/document"` root node). With each skill execution, the enriched document gains structure and substance as each skill writes its output as nodes in the graph.

-After skillset execution is done, the output of an enriched document finds its way into an index through user-defined *output field mappings*. Any raw content that you want transferred intact, from source to an index, is defined through *field mappings*. In contrast, *output field mappings* transfer in-memory content (nodes) to the index.
+After skillset execution is done, the output of an enriched document is routed to an index through user-defined *output field mappings*. Any raw content that you want transferred intact, from source to an index, is defined through *field mappings*. In contrast, *output field mappings* transfer in-memory content (nodes) to the index.

 To configure applied AI, specify settings in a skillset and indexer.
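As a sketch of that distinction on an indexer definition (field names and paths here are illustrative):

```json
"fieldMappings": [
    { "sourceFieldName": "reviews_text", "targetFieldName": "reviews_text" }
],
"outputFieldMappings": [
    { "sourceFieldName": "/document/reviews_text/pages/*/Sentiment", "targetFieldName": "sentiment" }
]
```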

@@ -58,11 +58,11 @@ A context determines:

 ### Skill dependencies

-Skills can execute independently and in parallel, or sequentially if you feed the output of one skill into another skill. The following example demonstrates two [built-in skills](cognitive-search-predefined-skills.md) that execute in sequence:
+Skills can execute independently and in parallel, or sequentially in a dependent relationship if you feed the output of one skill into another skill. The following example demonstrates two [built-in skills](cognitive-search-predefined-skills.md) that execute in sequence:

 + Skill #1 is a [Text Split skill](cognitive-search-skill-textsplit.md) that accepts the contents of the "reviews_text" source field as input, and splits that content into "pages" of 5,000 characters as output. Splitting large text into smaller chunks can produce better outcomes for skills like sentiment detection.

-+ Skill #2 is a [Sentiment Detection skill](cognitive-search-skill-sentiment.md) accepts "pages" as input, and produces a new field called "Sentiment" as output that contains the results of sentiment analysis.
++ Skill #2 is a [Sentiment Detection skill](cognitive-search-skill-sentiment.md) that depends on the split skill output. It accepts "pages" as input, and produces a new field called "Sentiment" as output that contains the results of sentiment analysis.

 Notice how the output of the first skill ("pages") is used in sentiment analysis, where "/document/reviews_text/pages/*" is both the context and input. For more information about path formulation, see [How to reference enrichments](cognitive-search-concept-annotations-syntax.md).
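A skillset expressing that dependency might look like the following sketch. The contexts, output names, and 5,000-character page size mirror the description above; everything else is assumed:

```json
"skills": [
    {
        "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
        "context": "/document/reviews_text",
        "textSplitMode": "pages",
        "maximumPageLength": 5000,
        "inputs": [
            { "name": "text", "source": "/document/reviews_text" }
        ],
        "outputs": [
            { "name": "textItems", "targetName": "pages" }
        ]
    },
    {
        "@odata.type": "#Microsoft.Skills.Text.V3.SentimentSkill",
        "context": "/document/reviews_text/pages/*",
        "inputs": [
            { "name": "text", "source": "/document/reviews_text/pages/*" }
        ],
        "outputs": [
            { "name": "sentiment", "targetName": "Sentiment" }
        ]
    }
]
```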

@@ -124,7 +124,7 @@ Notice how the output of the first skill ("pages") is used in sentiment analysis

 ## Enrichment tree

-An enriched document is a temporary, tree-like data structure created during skillset execution that collects all of the changes introduced through skills. Collectively, enrichments are represented as a hierarchy of addressable nodes. Nodes also include any unenriched fields that are passed in verbatim from the external data source.
+An enriched document is a temporary, tree-like data structure created during skillset execution that collects all of the changes introduced through skills. Collectively, enrichments are represented as a hierarchy of addressable nodes. Nodes also include any unenriched fields that are passed in verbatim from the external data source. The best approach for examining the structure and content of an enrichment tree is through a [debug session](cognitive-search-debug-session.md) in the Azure portal.

 An enriched document exists for the duration of skillset execution, but can be [cached](cognitive-search-incremental-indexing-conceptual.md) or sent to a [knowledge store](knowledge-store-concept-intro.md).
