Commit ae2a97c

Merge pull request #191930 from HeidiSteen/heidist-fresh
[azure search] March freshness
2 parents 65c93f2 + 4bf7bb9 commit ae2a97c

9 files changed, +106 -94 lines changed

articles/search/query-simple-syntax.md

Lines changed: 11 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -8,20 +8,20 @@ author: bevloh
88
ms.author: beloh
99
ms.service: cognitive-search
1010
ms.topic: conceptual
11-
ms.date: 12/14/2020
11+
ms.date: 03/16/2022
1212
---
1313

1414
# Simple query syntax in Azure Cognitive Search
1515

16-
Azure Cognitive Search implements two Lucene-based query languages: [Simple Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/simple/SimpleQueryParser.html) and the [Lucene Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html). The simple parser is more flexible and will attempt to interpret a request even if it's not perfectly composed. Because of this flexibility, it is the default for queries in Azure Cognitive Search.
16+
Azure Cognitive Search implements two Lucene-based query languages: [Simple Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/simple/SimpleQueryParser.html) and the [Lucene Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html). The simple parser is more flexible and will attempt to interpret a request even if it's not perfectly composed. Because it's flexible, it's the default for queries in Azure Cognitive Search.
1717

18-
The simple syntax is used for query expressions passed in the **`search`** parameter of a [Search Documents (REST API)](/rest/api/searchservice/search-documents) request, not to be confused with the [OData syntax](query-odata-filter-orderby-syntax.md) used for the [**`$filter`**](search-filters.md) and [**`$orderby`**](search-query-odata-orderby.md) expressions in the same request. OData parameters have different syntax and rules for constructing queries, escaping strings, and so on.
18+
The simple syntax is used for query expressions passed in the "search" parameter of a [Search Documents (REST API)](/rest/api/searchservice/search-documents) request, not to be confused with the [OData syntax](query-odata-filter-orderby-syntax.md) used for the ["$filter"](search-filters.md) and ["$orderby"](search-query-odata-orderby.md) expressions in the same request. OData parameters have different syntax and rules for constructing queries, escaping strings, and so on.
1919

20-
Although the simple parser is based on the [Apache Lucene Simple Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/simple/SimpleQueryParser.html) class, the implementation in Cognitive Search excludes fuzzy search. If you need [fuzzy search](search-query-fuzzy.md), consider the alternative [full Lucene query syntax](query-lucene-syntax.md) instead.
20+
Although the simple parser is based on the [Apache Lucene Simple Query Parser](https://lucene.apache.org/core/6_6_1/queryparser/org/apache/lucene/queryparser/simple/SimpleQueryParser.html) class, its implementation in Cognitive Search excludes fuzzy search. If you need [fuzzy search](search-query-fuzzy.md), consider the alternative [full Lucene query syntax](query-lucene-syntax.md) instead.
2121

2222
## Example (simple syntax)
2323

24-
Although **`queryType`** is set below, it's the default and can be omitted unless you are reverting from an alternative type. The following example is a search over independent terms, with a requirement that all matching documents include "pool".
24+
This example shows a simple query, distinguished by `"queryType": "simple"` and valid syntax. Although query type is set below, it's the default and can be omitted unless you are reverting from an alternative type. The following example is a search over independent terms, with a requirement that all matching documents include "pool".
2525

2626
```http
2727
POST https://{{service-name}}.search.windows.net/indexes/hotel-rooms-sample/docs/search?api-version=2020-06-30
@@ -32,25 +32,25 @@ POST https://{{service-name}}.search.windows.net/indexes/hotel-rooms-sample/docs
3232
}
3333
```
3434

35-
The **`searchMode`** parameter is relevant in this example. Whenever boolean operators are on the query, you should generally set `searchMode=all` to ensure that *all* of the criteria is matched. Otherwise, you can use the default `searchMode=any` that favors recall over precision.
35+
The "searchMode" parameter is relevant in this example. Whenever boolean operators are on the query, you should generally set `"searchMode=all"` to ensure that *all* of the criteria is matched. Otherwise, you can use the default `"searchMode=any"` that favors recall over precision.
3636

3737
For additional examples, see [Simple query syntax examples](search-query-simple-examples.md). For details about the query request and parameters, see [Search Documents (REST API)](/rest/api/searchservice/Search-Documents).
3838

3939
## Keyword search on terms and phrases
4040

41-
Strings passed to the **`search`** parameter can include terms or phrases in any supported language, boolean operators, precedence operators, wildcard or prefix characters for "starts with" queries, escape characters, and URL encoding characters. The **`search`** parameter is optional. Unspecified, search (`search=*` or `search=" "`) returns the top 50 documents in arbitrary (unranked) order.
41+
Strings passed to the "search" parameter can include terms or phrases in any supported language, boolean operators, precedence operators, wildcard or prefix characters for "starts with" queries, escape characters, and URL encoding characters. The "search" parameter is optional. Unspecified, search (`search=*` or `search=" "`) returns the top 50 documents in arbitrary (unranked) order.
4242

4343
+ A *term search* is a query of one or more terms, where any of the terms are considered a match.
4444

4545
+ A *phrase search* is an exact phrase enclosed in quotation marks `" "`. For example, while ```Roach Motel``` (without quotes) would search for documents containing ```Roach``` and/or ```Motel``` anywhere in any order, ```"Roach Motel"``` (with quotes) will only match documents that contain that whole phrase together and in that order (lexical analysis still applies).
4646

4747
Depending on your search client, you might need to escape the quotation marks in a phrase search. For example, in Postman in a POST request, a phrase search on `"Roach Motel"` in the request body would be specified as `"\"Roach Motel\""`.
4848

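The quote escaping described above for phrase searches can be sketched in Python. This is a minimal illustration, not part of the commit: the helper name `build_search_body` is hypothetical, and the body fields mirror the `queryType`, `search`, and `searchMode` parameters named in this article. Serializing with `json.dumps` escapes the embedded quotation marks automatically.

```python
import json


def build_search_body(search_text: str, search_mode: str = "any") -> str:
    """Serialize a query request body (hypothetical helper).

    json.dumps escapes any quotation marks embedded in the search string,
    so a phrase search does not need to be escaped by hand.
    """
    body = {
        "queryType": "simple",
        "search": search_text,
        "searchMode": search_mode,
    }
    return json.dumps(body)


# A phrase search: the quotation marks are part of the search string itself.
phrase_body = build_search_body('"Roach Motel"')
print(phrase_body)
```

The serialized body carries the phrase as `"\"Roach Motel\""`, which matches the escaped form shown for Postman.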
49-
By default, all terms or phrases passed in the **`search`** parameter undergo lexical analysis. Make sure you understand the tokenization behavior of the analyzer you are using. Often, when query results are unexpected, the reason can be traced to how terms are tokenized at query time.
49+
By default, all strings passed in the "search" parameter undergo lexical analysis. Make sure you understand the tokenization behavior of the analyzer you are using. Often, when query results are unexpected, the reason can be traced to how terms are tokenized at query time. You can [test tokenization on specific strings](/rest/api/searchservice/test-analyzer) to confirm the output.
5050

51-
Any text with one or more terms is considered a valid starting point for query execution. Azure Cognitive Search will match documents containing any or all of the terms, including any variations found during analysis of the text.
51+
Any text input with one or more terms is considered a valid starting point for query execution. Azure Cognitive Search will match documents containing any or all of the terms, including any variations found during analysis of the text.
5252

53-
As straightforward as this sounds, there is one aspect of query execution in Azure Cognitive Search that *might* produce unexpected results, increasing rather than decreasing search results as more terms and operators are added to the input string. Whether this expansion actually occurs depends on the inclusion of a NOT operator, combined with a **`searchMode`** parameter setting that determines how NOT is interpreted in terms of AND or OR behaviors. For more information, see the NOT operator under [Boolean operators](#boolean-operators).
53+
As straightforward as this sounds, there is one aspect of query execution in Azure Cognitive Search that *might* produce unexpected results, increasing rather than decreasing search results as more terms and operators are added to the input string. Whether this expansion actually occurs depends on the inclusion of a NOT operator, combined with a "searchMode" parameter setting that determines how NOT is interpreted in terms of AND or OR behaviors. For more information, see the NOT operator under [Boolean operators](#boolean-operators).
5454

5555
## Boolean operators
5656

@@ -60,7 +60,7 @@ You can embed Boolean operators in a query string to improve the precision of a
6060
|----------- |--------|-------|
6161
| `+` | `pool + ocean` | An AND operation. For example, `pool + ocean` stipulates that a document must contain both terms.|
6262
| `|` | `pool | ocean` | An OR operation finds a match when either term is found. In the example, the query engine will return match on documents containing either `pool` or `ocean` or both. Because OR is the default conjunction operator, you could also leave it out, such that `pool ocean` is the equivalent of `pool | ocean`.|
63-
| `-` | `pool – ocean` | A NOT operation returns matches on documents that exclude the term. <br/><br/>To get the expected behavior on a NOT expression, consider setting **`searchMode=all`** on the request. Otherwise, under the default of **`searchMode=any`**, you will get matches on `pool`, plus matches on all documents that do not contain `ocean`, which could be a lot of documents. The **`searchMode`** parameter on a query request controls whether a term with the NOT operator is ANDed or ORed with other terms in the query (assuming there is no `+` or `|` operator on the other terms). Using **`searchMode=all`** increases the precision of queries by including fewer results, and by default - will be interpreted as "AND NOT". <br/><br/>When deciding on a **`searchMode`** setting, consider the user interaction patterns for queries in various applications. Users who are searching for information are more likely to include an operator in a query, as opposed to e-commerce sites that have more built-in navigation structures. |
63+
| `-` | `pool – ocean` | A NOT operation returns matches on documents that exclude the term. </p>To get the expected behavior on a NOT expression, consider setting `"searchMode=all"` on the request. Otherwise, under the default of `"searchMode=any"`, you will get matches on `pool`, plus matches on all documents that do not contain `ocean`, which could be a lot of documents. The "searchMode" parameter on a query request controls whether a term with the NOT operator is ANDed or ORed with other terms in the query (assuming there is no `+` or `|` operator on the other terms). Using `"searchMode=all"` increases the precision of queries by including fewer results, and by default - will be interpreted as "AND NOT". </p>When deciding on a "searchMode" setting, consider the user interaction patterns for queries in various applications. Users who are searching for information are more likely to include an operator in a query, as opposed to e-commerce sites that have more built-in navigation structures. |
6464

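The effect of the search mode on a NOT expression can be modeled with a toy example. The sketch below is not the service's query engine: the four-document corpus and the function names are made up for illustration, and it only reproduces the AND/OR interpretation described in the table for a `pool` term combined with a NOT `ocean` term.

```python
# Toy corpus: document id -> set of terms it contains (illustrative data only).
docs = {
    1: {"pool", "ocean"},
    2: {"pool"},
    3: {"ocean"},
    4: {"spa"},
}


def matches_pool_not_ocean(terms: set, search_mode: str) -> bool:
    """Model the query 'pool' combined with NOT 'ocean'.

    search_mode "all": the NOT clause is ANDed (pool AND NOT ocean).
    search_mode "any": the NOT clause is ORed  (pool OR NOT ocean).
    """
    has_pool = "pool" in terms
    lacks_ocean = "ocean" not in terms
    if search_mode == "all":
        return has_pool and lacks_ocean
    return has_pool or lacks_ocean


def run(search_mode: str) -> list:
    """Return the sorted ids of the documents that match under the given mode."""
    return sorted(
        doc_id
        for doc_id, terms in docs.items()
        if matches_pool_not_ocean(terms, search_mode)
    )


any_hits = run("any")  # [1, 2, 4]: every doc with pool, plus every doc without ocean
all_hits = run("all")  # [2]: only docs with pool and without ocean
print(any_hits, all_hits)
```

Under the `any` interpretation, adding the NOT clause widens the result set to include document 4, which mentions neither term. That is the expansion the article warns about.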
6565
<a name="prefix-search"></a>
6666

articles/search/search-create-service-portal.md

Lines changed: 17 additions & 8 deletions
@@ -172,18 +172,27 @@ A second service is not required for high availability. High availability for qu
172172

173173
Cognitive Search restricts the [number of resources](search-limits-quotas-capacity.md#subscription-limits) you can initially create in a subscription. If you exhaust your maximum limit, file a new support request to add more search services.
174174

175-
1. Sign in to the Azure portal, and find your search service.
175+
1. Sign in to the Azure portal and find your search service.
176+
176177
1. On the left-navigation pane, scroll down and select **New Support Request.**
177-
1. For **issue type**, choose **Service and subscription limits (quotas).**
178+
179+
1. In **Issue type**, choose **Service and subscription limits (quotas).**
180+
178181
1. Select the subscription that needs more quota.
179-
1. Under **Quota type**, select **Search**. Then select **Next**.
182+
183+
1. Under **Quota type**, select **Search** and then select **Next**.
184+
180185
1. In the **Problem details** section, select **Enter details**.
181-
1. Follow the prompts to select location and tier.
182-
1. Add the new limit you would like on the subscription. The value must not be empty and must between 0 to 100.
183-
For example: The maximum number of S2 services is 8 and you would like to have 12 services, then request to add 4 of S2 services."
186+
187+
1. Follow the prompts to select the location and tier for which you want to increase the limit.
188+
189+
1. Add the number of new services you would like to add to your quota. The value must not be empty and must be between 0 and 100. For example, the maximum number of S2 services is 8. If you want 12 services, you would request 4 more S2 services.
190+
184191
1. When you're finished, select **Save and continue** to continue creating your support request.
185-
1. Complete the rest of the additional information requested, and then select **Next**.
186-
1. On the **review + create** screen, review the details that you'll send to support, and then select **Create**.
192+
193+
1. Provide the additional information required to file the request, and then select **Next**.
194+
195+
1. On **Review + create**, review the details that you'll send to support, and then select **Create**.
187196

188197
## Next steps
189198

articles/search/search-howto-index-json-blobs.md

Lines changed: 11 additions & 11 deletions
@@ -8,10 +8,10 @@ author: HeidiSteen
88
ms.author: heidist
99

1010
ms.service: cognitive-search
11-
ms.topic: conceptual
12-
ms.date: 02/01/2021
11+
ms.topic: how-to
12+
ms.date: 03/16/2022
1313
---
14-
# How to index JSON blobs and files in Azure Cognitive Search
14+
# Index JSON blobs and files in Azure Cognitive Search
1515

1616
**Applies to**: [Blob indexers](search-howto-indexing-azure-blob-storage.md), [File indexers](search-file-storage-integration.md)
1717

@@ -21,7 +21,7 @@ This article shows you how to set JSON-specific properties for blobs or files th
2121
+ A JSON document containing an array of well-formed JSON elements
2222
+ A JSON document containing multiple entities, separated by a newline
2323

24-
The blob indexer provides a **`parsingMode`** parameter to optimize the output of the search document based on the structure Parsing modes consist of the following options:
24+
The blob indexer provides a "parsingMode" parameter to optimize the output of the search document based on the structure. Parsing modes consist of the following options:
2525

2626
| parsingMode | JSON document | Description |
2727
|--------------|-------------|--------------|
@@ -31,9 +31,9 @@ The blob indexer provides a **`parsingMode`** parameter to optimize the output o
3131

3232
For both **`jsonArray`** and **`jsonLines`**, you should review [Indexing one blob to produce many search documents](search-howto-index-one-to-many-blobs.md) to understand how the blob indexer handles disambiguation of the document key for multiple search documents produced from the same blob.
3333

34-
Within the indexer definition, you can optionally set [field mappings](search-indexer-field-mappings.md) to choose which properties of the source JSON document are used to populate your target search index. For example, when using the **`jsonArray`** parsing mode, if the array exists as a lower-level property, you can set a **`document root`** property indicating where the array is placed within the blob.
34+
Within the indexer definition, you can optionally set [field mappings](search-indexer-field-mappings.md) to choose which properties of the source JSON document are used to populate your target search index. For example, when using the **`jsonArray`** parsing mode, if the array exists as a lower-level property, you can set a "documentRoot" property indicating where the array is placed within the blob.
3535

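To make the parsing modes concrete, here is a rough local model of how one blob could be split into search documents. It is a sketch under stated assumptions, not the indexer's implementation: the function `split_blob` is hypothetical, and the slash-separated form of the document root path (`/level1/level2`) is assumed for illustration.

```python
import json


def split_blob(blob_text: str, parsing_mode: str, document_root: str = None) -> list:
    """Return the list of documents one blob would yield (illustrative model)."""
    if parsing_mode == "json":
        # One blob, one document.
        return [json.loads(blob_text)]
    if parsing_mode == "jsonLines":
        # One document per non-empty line.
        return [json.loads(line) for line in blob_text.splitlines() if line.strip()]
    if parsing_mode == "jsonArray":
        node = json.loads(blob_text)
        if document_root:
            # Walk down to the nested array, one path segment at a time.
            for key in document_root.strip("/").split("/"):
                node = node[key]
        if not isinstance(node, list):
            raise ValueError("expected a JSON array at the document root")
        return node
    raise ValueError(f"unknown parsing mode: {parsing_mode}")


nested = '{"level1": {"level2": [{"id": "1"}, {"id": "2"}]}}'
array_docs = split_blob(nested, "jsonArray", document_root="/level1/level2")
line_docs = split_blob('{"id": "1"}\n{"id": "2"}\n', "jsonLines")
print(len(array_docs), len(line_docs))
```

Running a sample blob through a model like this before configuring the indexer is a quick way to check how many search documents to expect from each blob.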
36-
The following sections describe each mode in more detail. If you are unfamiliar with indexer clients and concepts, see [Create a search indexer](search-howto-create-indexers.md). You should also be familiar with the details of [basic blob indexer configuration](search-howto-indexing-azure-blob-storage.md), which isn't repeated here.
36+
The following sections describe each mode in more detail. If you're unfamiliar with indexer clients and concepts, see [Create a search indexer](search-howto-create-indexers.md). You should also be familiar with the details of [basic blob indexer configuration](search-howto-indexing-azure-blob-storage.md), which isn't repeated here.
3737

3838
<a name="parsing-single-blobs"></a>
3939

@@ -73,7 +73,7 @@ api-key: [admin key]
7373
7474
### json example (single hotel JSON files)
7575

76-
The [hotel JSON document data set](https://github.com/Azure-Samples/azure-search-sample-data/tree/master/hotels/hotel-json-documents) on GitHub is helpful for testing JSON parsing, where each blob represents a structured JSON file. You can upload the data files to Blob storage and use the **Import data** wizard to quickly evaluate how this content is parsed into individual search documents.
76+
The [hotel JSON document data set](https://github.com/Azure-Samples/azure-search-sample-data/tree/master/hotels/hotel-json-documents) on GitHub is helpful for testing JSON parsing, where each blob represents a structured JSON file. You can upload the data files to Blob Storage and use the [**Import data** wizard](search-get-started-portal.md) to quickly evaluate how this content is parsed into individual search documents.
7777

7878
The data set consists of five blobs, each containing a hotel document with an address collection and a rooms collection. The blob indexer detects both collections and reflects the structure of the input documents in the index schema.
7979

@@ -91,7 +91,7 @@ Alternatively, you can use the JSON array option. This option is useful when blo
9191
]
9292
```
9393

94-
The **`parameters`** property on the indexer contains parsing mode values. For a JSON array, the indexer definition should look similar to the following example.
94+
The "parameters" property on the indexer contains parsing mode values. For a JSON array, the indexer definition should look similar to the following example.
9595

9696
```http
9797
POST https://[service name].search.windows.net/indexers?api-version=2020-06-30
@@ -108,15 +108,15 @@ api-key: [admin key]
108108

109109
### jsonArrays example (clinical trials sample data)
110110

111-
The [clinical trials JSON data set](https://github.com/Azure-Samples/azure-search-sample-data/tree/master/clinical-trials/clinical-trials-json) on GitHub is helpful for testing JSON array parsing. You can upload the data files to Blob storage and use the **Import data** wizard to quickly evaluate how this content is parsed into individual search documents.
111+
The [clinical trials JSON data set](https://github.com/Azure-Samples/azure-search-sample-data/tree/master/clinical-trials/clinical-trials-json) on GitHub is helpful for testing JSON array parsing. You can upload the data files to Blob storage and use the [**Import data** wizard](search-get-started-portal.md) to quickly evaluate how this content is parsed into individual search documents.
112112

113113
The data set consists of eight blobs, each containing a JSON array of entities, for a total of 100 entities. The entities vary as to which fields are populated, but the end result is one search document per entity, from all arrays, in all blobs.
114114

115115
<a name="nested-json-arrays"></a>
116116

117117
### Parsing nested JSON arrays
118118

119-
For JSON arrays having nested elements, you can specify a **`documentRoot`** to indicate a multi-level structure. For example, if your blobs look like this:
119+
For JSON arrays having nested elements, you can specify a "documentRoot" to indicate a multi-level structure. For example, if your blobs look like this:
120120

121121
```http
122122
{
@@ -200,7 +200,7 @@ You can also refer to individual array elements by using a zero-based index. For
200200
```
201201

202202
> [!NOTE]
203-
> If **`sourceFieldName`** refers to a property that doesn't exist in the JSON blob, that mapping is skipped without an error. This behavior allows indexing to continue for JSON blobs that have a different schema (which is a common use case). Because there is no validation check, check the mappings carefully for typos so that you aren't losing documents for the wrong reason.
203+
> If "sourceFieldName" refers to a property that doesn't exist in the JSON blob, that mapping is skipped without an error. This behavior allows indexing to continue for JSON blobs that have a different schema (which is a common use case). Because there is no validation check, check the mappings carefully for typos so that you aren't losing documents for the wrong reason.
204204
>
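Because a mapping to a missing property is skipped silently, it can help to check mappings against a sample document before running the indexer. The sketch below is a hypothetical local check, not an API of the service: it assumes flat, top-level property names, and the function `apply_field_mappings` and the sample data are made up for illustration.

```python
def apply_field_mappings(source: dict, mappings: list) -> tuple:
    """Apply mappings to one source document and report the skipped ones.

    Mirrors the behavior in the note: a mapping whose sourceFieldName is
    missing from the document is skipped rather than raising an error.
    """
    target = {}
    skipped = []
    for mapping in mappings:
        name = mapping["sourceFieldName"]
        if name not in source:
            skipped.append(name)  # no error, the mapping is just ignored
            continue
        # Fall back to the source name when no target name is given.
        target[mapping.get("targetFieldName", name)] = source[name]
    return target, skipped


sample_doc = {"title": "Clinical trial 42", "status": "open"}
mappings = [
    {"sourceFieldName": "title", "targetFieldName": "Title"},
    {"sourceFieldName": "staus", "targetFieldName": "Status"},  # typo on purpose
]

mapped, skipped = apply_field_mappings(sample_doc, mappings)
print(mapped, skipped)
```

Here the misspelled `staus` mapping ends up in the skipped list, and the `Status` field is never populated. An indexer run would not flag this kind of silent loss.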
205205
206206
## Next steps
