articles/search/cognitive-search-incremental-indexing-conceptual.md (9 additions, 9 deletions)
@@ -20,7 +20,7 @@ ms.date: 02/16/2024
When you enable caching, the indexer evaluates your updates to determine whether existing enrichments can be pulled from the cache. Image and text content from the document cracking phase, plus skill outputs that are upstream or orthogonal to your edits, are likely to be reusable.
- After performing the incremental enrichments as indicated by the skillset update, refreshed results are written back to the cache, and also to the search index or knowledge store.
+ After skillset processing is finished, the refreshed results are written back to the cache, and also to the search index or knowledge store.
## Limitations
@@ -29,9 +29,9 @@ After performing the incremental enrichments as indicated by the skillset update
## Cache configuration
- Physically, the cache is stored in a blob container in your Azure Storage account, one per indexer. Each indexer is assigned a unique and immutable cache identifier that corresponds to the container it is using.
+ Physically, the cache is stored in a blob container in your Azure Storage account, one per indexer. Each indexer is assigned a unique and immutable cache identifier that corresponds to the container it's using.
- The cache is created when you specify the "cache" property and run the indexer. Only enriched content can be cached. If your indexer does not have an attached skillset, then caching does not apply.
+ The cache is created when you specify the "cache" property and run the indexer. Only enriched content can be cached. If your indexer doesn't have an attached skillset, then caching doesn't apply.
The following example illustrates an indexer with caching enabled. See [Enable enrichment caching](search-howto-incremental-index.md) for full instructions. Notice that when adding the cache property, use a [preview API version](/rest/api/searchservice/search-service-api-versions#preview-versions), 2020-06-30-Preview or later, on the request.
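The indexer definition that this context line refers to isn't included in the hunk. As a rough sketch only (the object names, connection string, and omitted indexer properties are placeholders; confirm the exact shape against the linked how-to article), an indexer with the cache property might look like this:

```http
POST https://[service name].search.windows.net/indexers?api-version=2020-06-30-Preview
{
  "name": "my-indexer",
  "dataSourceName": "my-datasource",
  "skillsetName": "my-skillset",
  "targetIndexName": "my-index",
  "cache": {
    "storageConnectionString": "<Azure Storage connection string>",
    "enableReprocessing": true
  }
}
```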
@@ -69,13 +69,13 @@ While incremental enrichment is designed to detect and respond to changes with n
The cache property includes an `enableReprocessing` parameter. It's used to control processing over incoming documents already represented in the cache. When true (default), documents already in the cache are reprocessed when you rerun the indexer, assuming your skill update affects that doc.
- When false, existing documents are not reprocessed, effectively prioritizing new, incoming content over existing content. You should only set enableReprocessing to false on a temporary basis. Having enableReprocessing set to true most of the time ensures that all documents, both new and existing, are valid per the current skillset definition.
+ When false, existing documents aren't reprocessed, effectively prioritizing new, incoming content over existing content. You should only set enableReprocessing to false on a temporary basis. Having enableReprocessing set to true most of the time ensures that all documents, both new and existing, are valid per the current skillset definition.
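As an illustrative fragment that isn't part of this diff (other indexer properties are omitted and names are placeholders), temporarily turning off reprocessing would be a matter of updating the indexer's cache object:

```http
PUT https://[service name].search.windows.net/indexers/[indexer name]?api-version=2020-06-30-Preview
{
  "name": "[indexer name]",
  "cache": {
    "storageConnectionString": "<Azure Storage connection string>",
    "enableReprocessing": false
  }
}
```

Flipping the value back to true on a later update restores the default behavior.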
<a name="Bypass-skillset-checks"></a>
### Bypass skillset evaluation
- Modifying a skill and reprocessing of that skill typically go hand in hand. However, some changes to a skill should not result in reprocessing (for example, deploying a custom skill to a new location or with a new access key). Most likely, these are peripheral modifications that have no genuine impact on the substance of the skill output itself.
+ Modifying a skill and reprocessing of that skill typically go hand in hand. However, some changes to a skill shouldn't result in reprocessing (for example, deploying a custom skill to a new location or with a new access key). Most likely, these are peripheral modifications that have no genuine impact on the substance of the skill output itself.
If you know that a change to the skill is indeed superficial, you should override skill evaluation by setting the `disableCacheReprocessingChangeDetection` parameter to true:
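The full request appears in the article immediately after this line, outside the hunk; in sketch form, the parameter is passed as a query-string option on the skillset update, with the revised skillset definition sent in the request body as usual:

```http
PUT https://[service name].search.windows.net/skillsets/[skillset name]?api-version=2020-06-30-Preview&disableCacheReprocessingChangeDetection=true
```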
@@ -94,7 +94,7 @@ PUT https://[servicename].search.windows.net/skillsets/[skillset name]?api-versi
### Bypass data source validation checks
- Most changes to a data source definition will invalidate the cache. However, for scenarios where you know that a change should not invalidate the cache - such as changing a connection string or rotating the key on the storage account - append the `ignoreResetRequirement` parameter on the [data source update](/rest/api/searchservice/update-data-source). Setting this parameter to true allows the commit to go through, without triggering a reset condition that would result in all objects being rebuilt and populated from scratch.
+ Most changes to a data source definition will invalidate the cache. However, for scenarios where you know that a change shouldn't invalidate the cache - such as changing a connection string or rotating the key on the storage account - append the `ignoreResetRequirement` parameter on the [data source update](/rest/api/searchservice/update-data-source). Setting this parameter to true allows the commit to go through, without triggering a reset condition that would result in all objects being rebuilt and populated from scratch.
```http
PUT https://[search service].search.windows.net/datasources/[data source name]?api-version=2020-06-30-Preview&ignoreResetRequirement
@@ -140,7 +140,7 @@ POST https://[search service name].search.windows.net/indexers/[indexer name]/re
Once you enable a cache, the indexer evaluates changes in your pipeline composition to determine which content can be reused and which needs reprocessing. This section enumerates changes that invalidate the cache outright, followed by changes that trigger incremental processing.
- An invalidating change is one where the entire cache is no longer valid. An example of an invalidating change is one where your data source is updated. Here is the complete list of changes to any part of the indexer pipeline that would invalidate your cache:
+ An invalidating change is one where the entire cache is no longer valid. An example of an invalidating change is one where your data source is updated. Here's the complete list of changes to any part of the indexer pipeline that would invalidate your cache:
+ Changing the data source type
+ Changing data source container
@@ -160,7 +160,7 @@ An invalidating change is one where the entire cache is no longer valid. An exam
## Changes that trigger incremental processing
- Incremental processing evaluates your skillset definition and determines which skills to rerun, selectively updating the affected portions of the document tree. Here is the complete list of changes resulting in incremental enrichment:
+ Incremental processing evaluates your skillset definition and determines which skills to rerun, selectively updating the affected portions of the document tree. Here's the complete list of changes resulting in incremental enrichment:
+ Changing the skill type (the OData type of the skill is updated)
+ Skill-specific parameters are updated, for example a URL, defaults, or other parameters
@@ -181,7 +181,7 @@ REST API version `2020-06-30-Preview` or later provides incremental enrichment t
- + [Update Data Source](/rest/api/searchservice/update-data-source), when called with a preview API version, provides a new parameter named "ignoreResetRequirement", which should be set to true when your update action should not invalidate the cache. Use "ignoreResetRequirement" sparingly as it could lead to unintended inconsistency in your data that will not be detected easily.
+ + [Update Data Source](/rest/api/searchservice/update-data-source), when called with a preview API version, provides a new parameter named "ignoreResetRequirement", which should be set to true when your update action shouldn't invalidate the cache. Use "ignoreResetRequirement" sparingly as it could lead to unintended inconsistency in your data that won't be detected easily.
articles/search/search-performance-tips.md (1 addition, 1 deletion)
@@ -95,7 +95,7 @@ When query performance is slowing down in general, adding more replicas frequent
One positive side-effect of adding partitions is that slower queries sometimes perform faster due to parallel computing. We've noted parallelization on low selectivity queries, such as queries that match many documents, or facets providing counts over a large number of documents. Since significant computation is required to score the relevancy of the documents, or to count the numbers of documents, adding extra partitions helps queries complete faster.
- To add partitions, use [Azure portal](search-capacity-planning.md#add-or-reduce-replicas-and-partitions.md), [PowerShell](search-manage-powershell.md), [Azure CLI](search-manage-azure-cli.md), or a management SDK.
+ To add partitions, use [Azure portal](search-capacity-planning.md#add-or-reduce-replicas-and-partitions), [PowerShell](search-manage-powershell.md), [Azure CLI](search-manage-azure-cli.md), or a management SDK.
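The article lists the portal, PowerShell, the Azure CLI, and the management SDKs; purely as an added illustration (not from the article, and the api-version shown is an assumption), the same setting is also exposed as `partitionCount` on the service resource in the Management REST API:

```http
PATCH https://management.azure.com/subscriptions/[subscription ID]/resourceGroups/[resource group]/providers/Microsoft.Search/searchServices/[service name]?api-version=2023-11-01
{
  "properties": {
    "partitionCount": 2
  }
}
```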
articles/search/search-query-fuzzy.md (6 additions, 6 deletions)
@@ -1,7 +1,7 @@
---
title: Fuzzy search
titleSuffix: Azure AI Search
- description: Implement a fuzzy search query for a "did you mean" search experience. Fuzzy search auto-corrects a misspelled term or typo on the query.
+ description: Implement a fuzzy search query for a "did you mean" search experience. Fuzzy search autocorrects a misspelled term or typo on the query.
manager: nitinme
author: HeidiSteen
@@ -18,11 +18,11 @@ Azure AI Search supports fuzzy search, a type of query that compensates for typo
## What is fuzzy search?
- It's a query expansion exercise that produces a match on terms having a similar composition. When a fuzzy search is specified, the search engine builds a graph (based on [deterministic finite automaton theory](https://en.wikipedia.org/wiki/Deterministic_finite_automaton)) of similarly composed terms, for all whole terms in the query. For example, if your query includes three terms "university of washington", a graph is created for every term in the query `search=university~ of~ washington~` (there's no stop-word removal in fuzzy search, so "of" gets a graph).
+ It's a query expansion exercise that produces a match on terms having a similar composition. When a fuzzy search is specified, the search engine builds a graph (based on [deterministic finite automaton theory](https://en.wikipedia.org/wiki/Deterministic_finite_automaton)) of similarly composed terms, for all whole terms in the query. For example, if your query includes three terms `"university of washington"`, a graph is created for every term in the query `search=university~ of~ washington~` (there's no stop-word removal in fuzzy search, so `"of"` gets a graph).
The graph consists of up to 50 expansions, or permutations, of each term, capturing both correct and incorrect variants in the process. The engine then returns the topmost relevant matches in the response.
- For a term like "university", the graph might have "unversty, universty, university, universe, inverse". Any documents that match on those in the graph are included in results. In contrast with other queries that analyze the text to handle different forms of the same word ("mice" and "mouse"), the comparisons in a fuzzy query are taken at face value without any linguistic analysis on the text. "Universe" and "inverse", which are semantically different, will match because the syntactic discrepancies are small.
+ For a term like "university", the graph might have `"unversty, universty, university, universe, inverse"`. Any documents that match on those in the graph are included in results. In contrast with other queries that analyze the text to handle different forms of the same word ("mice" and "mouse"), the comparisons in a fuzzy query are taken at face value without any linguistic analysis on the text. "Universe" and "inverse", which are semantically different, will match because the syntactic discrepancies are small.
A match succeeds if the discrepancies are limited to two or fewer edits, where an edit is an inserted, deleted, substituted, or transposed character. The string correction algorithm that specifies the differential is the [Damerau-Levenshtein distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance) metric. It's described as the "minimum number of operations (insertions, deletions, substitutions, or transpositions of two adjacent characters) required to change one word into the other".
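A small illustration that isn't part of this diff: the tilde accepts an optional edit distance of 0, 1, or 2 (2 is the default), so a stricter single-edit match could be written as follows (the fuzzy operator requires the full Lucene parser, shown here explicitly with `queryType=full`):

```console
search=univrsity~1&queryType=full
```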
@@ -108,13 +108,13 @@ In the response, because you added hit highlighting, formatting is applied to "s
}
```
- Try the request again, misspelling "special" by taking out several letters ("pe"):
+ Try the request again, misspelling "special" by taking out several letters (`"pe"`):
```console
search=scial~&highlight=Description
```
- So far, no change to the response. Using the default of 2 degrees distance, removing two characters "pe" from "special" still allows for a successful match on that term.
+ So far, no change to the response. Given the default of 2 degrees distance, removing two characters `"pe"` from "special" still allows for a successful match on that term.
```output
"@search.highlights": {
@@ -124,7 +124,7 @@ So far, no change to the response. Using the default of 2 degrees distance, remo
}
```
- Trying one more request, further modify the search term by taking out one last character for a total of three deletions (from "special" to "scal"):
+ Trying one more request, further modify the search term by taking out one last character for a total of three deletions (from "special" to `"scal"`):