Skip to content

Commit be191ad

Browse files
authored
Merge pull request #107024 from HeidiSteen/heidist-master
[Azure Cognitive Search] Freshness pass on Autocomplete example
2 parents e2d74a8 + 8328e1b commit be191ad

9 files changed

+140
-103
lines changed

articles/search/index-add-suggesters.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -19,13 +19,13 @@ The following screenshot from [Create your first app in C#](tutorial-csharp-type
1919

2020
![Visual comparison of autocomplete and suggested queries](./media/index-add-suggesters/hotel-app-suggestions-autocomplete.png "Visual comparison of autocomplete and suggested queries")
2121

22-
To implement these behaviors in Azure Cognitive Search, there is an index and query component.
22+
You can use these features separately or together. To implement these behaviors in Azure Cognitive Search, there is an index and query component.
2323

2424
+ In the index, add a suggester to an index. You can use the portal, [REST API](https://docs.microsoft.com/rest/api/searchservice/create-index), or [.NET SDK](https://docs.microsoft.com/dotnet/api/microsoft.azure.search.models.suggester?view=azure-dotnet). The remainder of this article is focused on creating a suggester.
2525

2626
+ In the query request, call one of the [APIs listed below](#how-to-use-a-suggester).
2727

28-
Search-as-you-type support is enabled on a per-field basis. You can implement both typeahead behaviors within the same search solution if you want an experience similar to the one indicated in the screenshot. Both requests target the *documents* collection of specific index and responses are returned after a user has provided at least a three character input string.
28+
Search-as-you-type support is enabled on a per-field basis for string fields. You can implement both typeahead behaviors within the same search solution if you want an experience similar to the one indicated in the screenshot. Both requests target the *documents* collection of specific index and responses are returned after a user has provided at least a three character input string.
2929

3030
## What is a suggester?
3131

67 KB
Loading

articles/search/query-lucene-syntax.md

Lines changed: 41 additions & 27 deletions
Large diffs are not rendered by default.

articles/search/query-simple-syntax.md

Lines changed: 52 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -8,23 +8,14 @@ author: brjohnstmsft
88
ms.author: brjohnst
99
ms.service: cognitive-search
1010
ms.topic: conceptual
11-
ms.date: 04/03/2020
12-
translation.priority.mt:
13-
- "de-de"
14-
- "es-es"
15-
- "fr-fr"
16-
- "it-it"
17-
- "ja-jp"
18-
- "ko-kr"
19-
- "pt-br"
20-
- "ru-ru"
21-
- "zh-cn"
22-
- "zh-tw"
11+
ms.date: 04/12/2020
2312
---
2413

2514
# Simple query syntax in Azure Cognitive Search
2615

27-
Azure Cognitive Search implements two Lucene-based query languages: [Simple Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/simple/SimpleQueryParser.html) and the [Lucene Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html). In Azure Cognitive Search, the simple query syntax excludes the fuzzy/slop options.
16+
Azure Cognitive Search implements two Lucene-based query languages: [Simple Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/simple/SimpleQueryParser.html) and the [Lucene Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html).
17+
18+
In Azure Cognitive Search, the simple query syntax excludes fuzzy search operations. Instead, use the full Lucene syntax for [fuzzy search](search-query-fuzzy.md).
2819

2920
> [!NOTE]
3021
> The simple query syntax is used for query expressions passed in the **search** parameter of the [Search Documents](https://docs.microsoft.com/rest/api/searchservice/search-documents) API, not to be confused with the [OData syntax](query-odata-filter-orderby-syntax.md) used for the [$filter](search-filters.md) parameter of that API. These different syntaxes have their own rules for constructing queries, escaping strings, and so on.
@@ -33,15 +24,44 @@ Azure Cognitive Search implements two Lucene-based query languages: [Simple Quer
3324
3425
## How to invoke simple parsing
3526

36-
Simple syntax is the default. Invocation is only necessary if you are resetting the syntax from full to simple. To explicitly set the syntax, use the `queryType` search parameter. Valid values include `simple|full`, with `simple` as the default, and `full` for Lucene.
27+
Simple syntax is the default. Invocation is only necessary if you are resetting the syntax from full to simple. To explicitly set the syntax, use the `queryType` search parameter. Valid values include `queryType=simple` or `queryType=full`, where `simple` is the default, and `full` invokes the [full Lucene query parser](query-lucene-syntax.md) for more advanced queries.
28+
29+
## Syntax fundamentals
30+
31+
Any text with one or more terms is considered a valid starting point for query execution. Azure Cognitive Search will match documents containing any or all of the terms, including any variations found during analysis of the text.
32+
33+
As straightforward as this sounds, there is one aspect of query execution in Azure Cognitive Search that *might* produce unexpected results, increasing rather than decreasing search results as more terms and operators are added to the input string. Whether this expansion actually occurs depends on the inclusion of a NOT operator, combined with a **searchMode** parameter setting that determines how NOT is interpreted in terms of AND or OR behaviors. For more information, see [NOT operator](#not-operator).
34+
35+
### Precedence operators (grouping)
36+
37+
You can use parentheses to create subqueries, including operators within the parenthetical statement. For example, `motel+(wifi||luxury)` will search for documents containing the "motel" term and either "wifi" or "luxury" (or both).
38+
39+
Field grouping is similar but scopes the grouping to a single field. For example, `hotelAmenities:(gym+(wifi||pool))` searches the field "hotelAmenities" for "gym" and "wifi", or "gym" and "pool".
3740

38-
## Query behavior anomalies
41+
### Escaping search operators
42+
43+
In order to use any of the search operators as part of the search text, escape the character by prefixing it with a single backslash (`\`). For example, for a wildcard search on `https://`, where `://` is part of the query string, you would specify `search=https\:\/\/*`. Similarly, an escaped phone number pattern might look like this `\+1 \(800\) 642\-7676`.
44+
45+
Special characters that require escaping include the following: `- * ? \ /`
46+
47+
In order to make things simple for the more typical cases, there are two exceptions to this rule where escaping is not needed:
48+
49+
+ The NOT operator `-` only needs to be escaped if it's the first character after whitespace, not if it's in the middle of a term. For example, the following GUID is valid without the escape character: `3352CDD0-EF30-4A2E-A512-3B30AF40F3FD`.
50+
51+
+ The suffix operator `*` needs to be escaped only if it's the last character before whitespace, not if it's in the middle of a term. For example, `4*4=16` does not require a backslash.
52+
53+
> [!NOTE]
54+
> Although escaping keeps tokens together, [lexical analysis](search-lucene-query-architecture.md#stage-2-lexical-analysis) during indexing may strip them out. For example, the standard Lucene analyzer will delete and break words on hyphens, whitespace, and other characters. If you require special characters in the query string, you might need an analyzer that preserves them in the index. Some choices include Microsoft natural [language analyzers](index-add-language-analyzers.md), which preserves hyphenated words, or a custom analyzer for more complex patterns. For more information, see [Partial terms, patterns, and special characters](search-query-partial-matching.md).
3955
40-
Any text with one or more terms is considered a valid starting point for query execution. Azure Cognitive Search will match documents containing any or all of the terms, including any variations found during analysis of the text.
56+
### Encoding unsafe and reserved characters in URLs
4157

42-
As straightforward as this sounds, there is one aspect of query execution in Azure Cognitive Search that *might* produce unexpected results, increasing rather than decreasing search results as more terms and operators are added to the input string. Whether this expansion actually occurs depends on the inclusion of a NOT operator, combined with a `searchMode` parameter setting that determines how NOT is interpreted in terms of AND or OR behaviors. Given the default, `searchMode=Any`, and a NOT operator, the operation is computed as an OR action, such that `"New York" NOT Seattle` returns all cities that are not Seattle.
58+
Please ensure all unsafe and reserved characters are encoded in a URL. For example, '#' is an unsafe character because it is a fragment/anchor identifier in a URL. The character must be encoded to `%23` if used in a URL. '&' and '=' are examples of reserved characters as they delimit parameters and specify values in Azure Cognitive Search. Please see [RFC1738: Uniform Resource Locators (URL)](https://www.ietf.org/rfc/rfc1738.txt) for more details.
4359

44-
Typically, you're more likely to see these behaviors in user interaction patterns for applications that search over content, where users are more likely to include an operator in a query, as opposed to e-commerce sites that have more built-in navigation structures. For more information, see [NOT operator](#not-operator).
60+
Unsafe characters are ``" ` < > # % { } | \ ^ ~ [ ]``. Reserved characters are `; / ? : @ = + &`.
61+
62+
### <a name="bkmk_querysizelimits"></a> Query size limits
63+
64+
There is a limit to the size of queries that you can send to Azure Cognitive Search. Specifically, you can have at most 1024 clauses (expressions separated by AND, OR, and so on). There is also a limit of approximately 32 KB on the size of any individual term in a query. If your application generates search queries programmatically, we recommend designing it in such a way that it does not generate queries of unbounded size.
4565

4666
## Boolean operators (AND, OR, NOT)
4767

@@ -59,41 +79,34 @@ The OR operator is a vertical bar or pipe character. For example, `wifi | luxury
5979

6080
### NOT operator `-`
6181

62-
The NOT operator is a minus sign. For example, `wifi –luxury` will search for documents that have the `wifi` term and/or do not have `luxury` (and/or is controlled by `searchMode`).
82+
The NOT operator is a minus sign. For example, `wifi –luxury` will search for documents that have the `wifi` term and/or do not have `luxury`.
6383

64-
> [!NOTE]
65-
> The `searchMode` option controls whether a term with the NOT operator is ANDed or ORed with the other terms in the query in the absence of a `+` or `|` operator. Recall that `searchMode` can be set to either `any` (default) or `all`. If you use `any`, it will increase the recall of queries by including more results, and by default `-` will be interpreted as "OR NOT". For example, `wifi -luxury` will match documents that either contain the term `wifi` or those that do not contain the term `luxury`. If you use `all`, it will increase the precision of queries by including fewer results, and by default - will be interpreted as "AND NOT". For example, `wifi -luxury` will match documents that contain the term `wifi` and do not contain the term "luxury". This is arguably a more intuitive behavior for the `-` operator. Therefore, you should consider using `searchMode=all` instead of `searchMode=any` if You want to optimize searches for precision instead of recall, *and* Your users frequently use the `-` operator in searches.
84+
The **searchMode** parameter on a query request controls whether a term with the NOT operator is ANDed or ORed with other terms in the query (assuming there is no `+` or `|` operator on the other terms). Valid values include `any` or `all`.
85+
86+
`searchMode=any` increases the recall of queries by including more results, and by default `-` will be interpreted as "OR NOT". For example, `wifi -luxury` will match documents that either contain the term `wifi` or those that do not contain the term `luxury`.
87+
88+
`searchMode=all` increases the precision of queries by including fewer results, and by default - will be interpreted as "AND NOT". For example, `wifi -luxury` will match documents that contain the term `wifi` and do not contain the term "luxury". This is arguably a more intuitive behavior for the `-` operator. Therefore, you should consider using `searchMode=all` instead of `searchMode=any` if you want to optimize searches for precision instead of recall, *and* Your users frequently use the `-` operator in searches.
89+
90+
When deciding on a **searchMode** setting, consider the user interaction patterns for queries in various applications. Users who are searching for information are more likely to include an operator in a query, as opposed to e-commerce sites that have more built-in navigation structures.
6691

6792
<a name="prefix-search"></a>
6893

69-
## Suffix `*` operator for prefix search
94+
## Suffix `*` operator (prefix search)
7095

71-
The suffix operator is an asterisk `*`. For example, `cap*` will search for documents that have a term that starts with `cap`, ignoring case.
96+
The suffix operator is an asterisk `*`. For example, `lingui*` will find "linguistic" or "linguini", ignoring case.
7297

7398
Similar to filters, a prefix query looks for an exact match. As such, there is no relevance scoring (all results receive a search score of 1.0). Prefix queries can be slow, especially if the index is large and the prefix consists of a small number of characters.
7499

75100
If you want to execute a suffix query, matching on the last part of string, use a [wildcard search](query-lucene-syntax.md#bkmk_wildcard) and the full Lucene syntax.
76101

77-
## Phrase search operator
102+
## Phrase operator `"`
78103

79104
The phrase operator encloses a phrase in quotation marks `" "`. For example, while `Roach Motel` (without quotes) would search for documents containing `Roach` and/or `Motel` anywhere in any order, `"Roach Motel"` (with quotes) will only match documents that contain that whole phrase together and in that order (text analysis still applies).
80105

81-
## Precedence operator
82-
83-
The precedence operator encloses the string in parentheses `( )`. For example, `motel+(wifi | luxury)` will search for documents containing the motel term and either `wifi` or `luxury` (or both).
84-
85-
## Escaping search operators
86-
87-
In order to use the above symbols as actual part of the search text, they should be escaped by prefixing them with a backslash. For example, `luxury\+hotel` will result in the term `luxury+hotel`. In order to make things simple for the more typical cases, there are two exceptions to this rule where escaping is not needed:
88-
89-
- The NOT operator `-` only needs to be escaped if it's the first character after whitespace, not if it's in the middle of a term. For example, `wi-fi` is a single term; whereas GUIDs (such as `3352CDD0-EF30-4A2E-A512-3B30AF40F3FD`) are treated as a single token.
90-
- The suffix operator `*` needs to be escaped only if it's the last character before whitespace, not if it's in the middle of a term. For example, `wi*fi` is treated as a single token.
91-
92-
> [!NOTE]
93-
> Although escaping keeps tokens together, text analysis may split them up, depending on the analysis mode. See [Language support &#40;Azure Cognitive Search REST API&#41;](index-add-language-analyzers.md) for details.
94-
95106
## See also
96107

97-
+ [Search Documents &#40;Azure Cognitive Search REST API&#41;](https://docs.microsoft.com/rest/api/searchservice/Search-Documents)
108+
+ [Query examples for simple search](search-query-simple-examples.md)
109+
+ [Query examples for full Lucene search](search-query-lucene-examples.md)
110+
+ [Search Documents &#40;Azure Cognitive Search REST API&#41;](https://docs.microsoft.com/rest/api/searchservice/Search-Documents)
98111
+ [Lucene query syntax](query-lucene-syntax.md)
99112
+ [OData expression syntax](query-odata-filter-orderby-syntax.md)

0 commit comments

Comments
 (0)