Commit 3f40664

Merge pull request #252720 from HeidiSteen/heidist-freshness
[azure search] query doc refresh
2 parents 900a616 + 04da256 commit 3f40664

2 files changed: 20 additions and 22 deletions

articles/search/search-how-to-create-search-index.md

Lines changed: 19 additions & 21 deletions
@@ -9,56 +9,54 @@ ms.author: heidist
 
 ms.service: cognitive-search
 ms.topic: how-to
-ms.date: 05/05/2023
+ms.date: 09/25/2023
 ---
 
 # Create an index in Azure Cognitive Search
 
 In Azure Cognitive Search, query requests target the searchable text in a [**search index**](search-what-is-an-index.md).
 
-In this article, learn the steps for defining and publishing a search index. Creating an index establishes the physical data structure (folders and files) on your search service. Once the index definition exists, [**loading the index**](search-what-is-data-import.md) follows as a separate task.
+In this article, learn the steps for defining and publishing a search index. Creating an index establishes the physical data structures on your search service. Once the index definition exists, [**loading the index**](search-what-is-data-import.md) follows as a separate task.
 
 ## Prerequisites
 
-+ Write permissions on the search service. Permission can be granted through an [admin API key](search-security-api-keys.md) on the request. Alternatively, if you're using [role-based access control](search-security-rbac.md), you can issue your request as a member of the Search Contributor role.
++ Write permissions. Permission can be granted through an [admin API key](search-security-api-keys.md) on the request. Alternatively, if you're using [role-based access control](search-security-rbac.md), send a request as a member of the Search Contributor role.
 
-+ An external data source that provides the content to be indexed. You should refer to the data source to understand the schema requirements of your search index. Index creation is largely a schema definition exercise. Before creating one, you should have:
++ An understanding of the data you want to index. Creating an index is a schema definition exercise, so you should have a clear idea of which source fields you want to make searchable, retrievable, filterable, facetable, and sortable (see the [schema checklist](#schema-checklist) for guidance).
 
-+ A clear idea of which source fields you want to make searchable, retrievable, filterable, facetable, and sortable in the search index (see the [schema checklist](#schema-checklist) for guidance).
+You must also have a unique field in source data that can be used as the [document key (or ID)](#document-keys) in the index.
 
-+ A unique field in source data that can be used as the [document key (or ID)](#document-keys) in the index.
++ A stable index location. Moving an existing index to a different search service isn't supported out-of-the-box. Revisit application requirements and make sure that your existing search service, its capacity and location, are sufficient for your needs.
 
-+ A stable index location. Moving an existing index to a different search service is not supported out-of-the-box. Revisit application requirements and make sure that your existing search service, its capacity and location, are sufficient for your needs.
-
-+ Finally, all service tiers have [index limits](search-limits-quotas-capacity.md#index-limits) on the number of objects that you can create. For example, if you are experimenting on the Free tier, you can only have 3 indexes at any given time. Within the index itself, there are limits on the number of complex fields and collections.
++ Finally, all service tiers have [index limits](search-limits-quotas-capacity.md#index-limits) on the number of objects that you can create. For example, if you're experimenting on the Free tier, you can only have three indexes at any given time. Within the index itself, there are limits on the number of complex fields and collections.
 
 ## Document keys
 
-A search index has one required field: a document key. A document key is the unique identifier of a search document. In Azure Cognitive Search, it must be a string, and it must originate from unique values in the data source that's providing the content to be indexed. A search service does not generate key values, but in some scenarios (such as the [Azure Table indexer](search-howto-indexing-azure-tables.md)) it will synthesize existing values to create a unique key for the documents being indexed.
+A search index has one required field: a document key. A document key is the unique identifier of a search document. In Azure Cognitive Search, it must be a string, and it must originate from unique values in the data source that's providing the content to be indexed. A search service doesn't generate key values, but in some scenarios (such as the [Azure Table indexer](search-howto-indexing-azure-tables.md)) it synthesizes existing values to create a unique key for the documents being indexed.
 
-During incremental indexing, where just new and updated content is indexed, incoming documents with new keys are added, while incoming documents with existing keys are either merged or overwritten, depending on whether index fields are null or populated.
+During incremental indexing, where new and updated content is indexed, incoming documents with new keys are added, while incoming documents with existing keys are either merged or overwritten, depending on whether index fields are null or populated.
 
 ## Schema checklist
 
 Use this checklist to assist the design decisions for your search index.
 
 1. Review [naming conventions](/rest/api/searchservice/naming-rules) so that index and field names conform to the naming rules.
 
-1. Review [supported data types](/rest/api/searchservice/supported-data-types). The data type will impact how the field is used. For example, numeric content is filterable but not full text searchable. The most common data type is `Edm.String` for searchable text, which is tokenized and queried using the full text search engine.
+1. Review [supported data types](/rest/api/searchservice/supported-data-types). The data type affects how the field is used. For example, numeric content is filterable but not full text searchable. The most common data type is `Edm.String` for searchable text, which is tokenized and queried using the full text search engine.
 
-1. Identify a [document key](#document-keys). A document key is an index requirement. It's a single string field and it will be populated from a source data field that contains unique values. For example, if you're indexing from Blob Storage, the metadata storage path is often used as the document key because it uniquely identifies each blob in the container.
+1. Identify a [document key](#document-keys). A document key is an index requirement. It's a single string field and it's populated from a source data field that contains unique values. For example, if you're indexing from Blob Storage, the metadata storage path is often used as the document key because it uniquely identifies each blob in the container.
 
-1. Identify the fields in your data source that will contribute searchable content in the index. Searchable content includes short or long strings that are queried using the full text search engine. If the content is verbose (small phrases or bigger chunks), experiment with different analyzers to see how the text is tokenized.
+1. Identify the fields in your data source that contribute searchable content in the index. Searchable content includes short or long strings that are queried using the full text search engine. If the content is verbose (small phrases or bigger chunks), experiment with different analyzers to see how the text is tokenized.
 
-[Field attribute assignments](search-what-is-an-index.md#index-attributes) will determine both search behaviors and the physical representation of your index on the search service. Determining how fields should be specified is an iterative process for many customers. To speed up iterations, start with sample data so that you can drop and rebuild easily.
+[Field attribute assignments](search-what-is-an-index.md#index-attributes) determine both search behaviors and the physical representation of your index on the search service. Determining how fields should be specified is an iterative process for many customers. To speed up iterations, start with sample data so that you can drop and rebuild easily.
 
 1. Identify which source fields can be used as filters. Numeric content and short text fields, particularly those with repeating values, are good choices. When working with filters, remember:
 
 + Filterable fields can optionally be used in faceted navigation.
 
 + Filterable fields are returned in arbitrary order, so consider making them sortable as well.
 
-1. Determine whether you'll use the default analyzer (`"analyzer": null`) or a different analyzer. [Analyzers](search-analyzers.md) are used to tokenize text fields during indexing and query execution. If strings are descriptive and semantically rich, or if you have translated strings, consider overriding the default with a [language analyzer](index-add-language-analyzers.md).
+1. Determine whether to use the default analyzer (`"analyzer": null`) or a different analyzer. [Analyzers](search-analyzers.md) are used to tokenize text fields during indexing and query execution. If strings are descriptive and semantically rich, or if you have translated strings, consider overriding the default with a [language analyzer](index-add-language-analyzers.md).
 
 > [!NOTE]
 > Full text search is conducted over terms that are tokenized during indexing. If your queries fail to return the results you expect, [test for tokenization](/rest/api/searchservice/test-analyzer) to verify the string actually exists. You can try different analyzers on strings to see how tokens are produced for various analyzers.
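
Review note (not part of the commit): the document key and data type items in the checklist above can be expressed as a small pre-flight check. This is a minimal sketch, assuming an index definition written as a plain dict with a `fields` list whose entries carry `name`, `type`, `key`, and `searchable`. The `check_schema` helper and the `hotels-sample` definition are hypothetical, and treating `Collection(Edm.String)` as searchable alongside `Edm.String` is an assumption of this sketch. The search service remains the authority on what it accepts.

```python
# Hypothetical pre-flight check for an index definition held as a plain dict.
# It mirrors two checklist items: exactly one string document key, and
# full text search only on string content.

# Assumption of this sketch: these are the types that can be full text searchable.
SEARCHABLE_TYPES = {"Edm.String", "Collection(Edm.String)"}


def check_schema(index):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    fields = index.get("fields", [])

    # A search index has one required field: a document key of type string.
    key_fields = [f for f in fields if f.get("key")]
    if len(key_fields) != 1:
        problems.append(f"expected exactly one key field, found {len(key_fields)}")
    elif key_fields[0].get("type") != "Edm.String":
        problems.append(f"key field '{key_fields[0]['name']}' must be Edm.String")

    # Numeric content is filterable but not full text searchable.
    for field in fields:
        if field.get("searchable") and field.get("type") not in SEARCHABLE_TYPES:
            problems.append(
                f"field '{field['name']}' is {field.get('type')} and can't be full text searchable"
            )
    return problems


# Hypothetical sample definition used only for illustration.
sample_index = {
    "name": "hotels-sample",
    "fields": [
        {"name": "HotelId", "type": "Edm.String", "key": True, "filterable": True},
        {"name": "Description", "type": "Edm.String", "searchable": True},
        {"name": "Rating", "type": "Edm.Double", "filterable": True, "sortable": True},
    ],
}

print(check_schema(sample_index))
```

Running the check on sample data before creating the index fits the drop-and-rebuild iteration the checklist recommends, because a schema mistake is caught before any content is loaded.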
@@ -170,7 +168,7 @@ For Cognitive Search, the Azure SDKs implement generally available features. As
 
 ## Set `corsOptions` for cross-origin queries
 
-Index schemas include a section for setting `corsOptions`. Client-side JavaScript cannot call any APIs by default since the browser will prevent all cross-origin requests. To allow cross-origin queries to your index, enable CORS (Cross-Origin Resource Sharing) by setting the **corsOptions** attribute. For security reasons, only [query APIs](search-query-create.md#choose-query-methods) support CORS.
+Index schemas include a section for setting `corsOptions`. By default, client-side JavaScript can't call any APIs because browsers prevent all cross-origin requests. To allow cross-origin queries through to your index, enable CORS (Cross-Origin Resource Sharing) by setting the **corsOptions** attribute. For security reasons, only [query APIs](search-query-create.md#choose-query-methods) support CORS.
 
 ```json
 "corsOptions": {
@@ -182,11 +180,11 @@ Index schemas include a section for setting `corsOptions`. Client-side JavaScrip
 
 The following properties can be set for CORS:
 
-+ **allowedOrigins** (required): This is a list of origins that will be granted access to your index. This means that any JavaScript code served from those origins will be allowed to query your index (assuming it provides the correct api-key). Each origin is typically of the form `protocol://<fully-qualified-domain-name>:<port>` although `<port>` is often omitted. See [Cross-origin resource sharing (Wikipedia)](https://en.wikipedia.org/wiki/Cross-origin_resource_sharing) for more details.
++ **allowedOrigins** (required): This is a list of origins that are allowed access to your index. JavaScript code served from these origins is allowed to query your index (assuming the caller provides a valid key or has permissions). Each origin is typically of the form `protocol://<fully-qualified-domain-name>:<port>` although `<port>` is often omitted. For more information, see [Cross-origin resource sharing (Wikipedia)](https://en.wikipedia.org/wiki/Cross-origin_resource_sharing).
 
-If you want to allow access to all origins, include `*` as a single item in the **allowedOrigins** array. *This is not a recommended practice for production search services* but it is often useful for development and debugging.
+If you want to allow access to all origins, include `*` as a single item in the **allowedOrigins** array. *This isn't a recommended practice for production search services* but it's often useful for development and debugging.
 
-+ **maxAgeInSeconds** (optional): Browsers use this value to determine the duration (in seconds) to cache CORS preflight responses. This must be a non-negative integer. The larger this value is, the better performance will be, but the longer it will take for CORS policy changes to take effect. If it is not set, a default duration of 5 minutes will be used.
++ **maxAgeInSeconds** (optional): Browsers use this value to determine the duration (in seconds) to cache CORS preflight responses. This must be a non-negative integer. A longer cache period delivers better performance, but it extends the amount of time a CORS policy needs to take effect. If this value isn't set, a default duration of five minutes is used.
 
 ## Allowed updates on existing indexes
 
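
Review note (not part of the commit): the rules for `allowedOrigins` and `maxAgeInSeconds` in the hunk above can be sketched as a small builder that validates its inputs before producing the `corsOptions` fragment. This is an illustrative sketch only. The `build_cors_options` helper and the `https://www.example.com` origin are hypothetical, and the sketch leaves `maxAgeInSeconds` out when no value is given so that the five-minute default described in the article applies.

```python
# Hypothetical helper that assembles a corsOptions fragment for an index definition.
import json


def build_cors_options(allowed_origins, max_age_in_seconds=None):
    """Validate the inputs and return a dict for the corsOptions section."""
    # allowedOrigins is required.
    if not allowed_origins:
        raise ValueError("allowedOrigins is required and can't be empty")
    # To allow all origins, '*' must be the single item in the array.
    if "*" in allowed_origins and len(allowed_origins) > 1:
        raise ValueError("use '*' as the single item when allowing all origins")

    options = {"allowedOrigins": list(allowed_origins)}

    # maxAgeInSeconds is optional; when given, it must be a non-negative integer.
    if max_age_in_seconds is not None:
        is_int = isinstance(max_age_in_seconds, int) and not isinstance(max_age_in_seconds, bool)
        if not is_int or max_age_in_seconds < 0:
            raise ValueError("maxAgeInSeconds must be a non-negative integer")
        options["maxAgeInSeconds"] = max_age_in_seconds
    return options


# Example: one explicit origin, preflight responses cached for 60 seconds.
cors = build_cors_options(["https://www.example.com"], max_age_in_seconds=60)
print(json.dumps({"corsOptions": cors}, indent=2))
```

A short cache period such as the 60 seconds used here suits a design phase in which the CORS policy is still changing. A longer period trades slower policy rollout for fewer preflight requests.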
@@ -203,7 +201,7 @@ To minimize churn in the design process, the following table describes which ele
 | Field names and types | No |
 | Field attributes (searchable, filterable, facetable, sortable) | No |
 | Field attribute (retrievable) | Yes |
-| [Analyzer](search-analyzers.md) | You can add and modify custom analyzers in the index. Regarding analyzer assignments on string fields, you can only modify "searchAnalyzer". All other assignments and modifications require a rebuild. |
+| [Analyzer](search-analyzers.md) | You can add and modify custom analyzers in the index. Regarding analyzer assignments on string fields, you can only modify `searchAnalyzer`. All other assignments and modifications require a rebuild. |
 | [Scoring profiles](index-add-scoring-profiles.md) | Yes |
 | [Suggesters](index-add-suggesters.md) | No |
 | [cross-origin remote scripting (CORS)](#corsoptions) | Yes |

articles/search/semantic-how-to-query-request.md

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ A *semantic configuration* is a section in your index that establishes field inp
 
 You can only specify one title field, but you can specify as many content and keyword fields as you like. For content and keyword fields, list the fields in priority order because lower priority fields may get truncated.
 
-Across all configuration properties, fields must be:
+Across all semantic configuration properties, the fields you assign must be:
 
 + Attributed as `searchable` and `retrievable`.
 + Strings of type `Edm.String`, `Edm.ComplexType`, or `Collection(Edm.String)`.
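
Review note (not part of the commit): the two field requirements visible in this hunk can be sketched as a check to run before a field is assigned to a semantic configuration. This is a minimal sketch under stated assumptions. The `semantic_field_problems` helper and the sample field are hypothetical, and the sketch expects `searchable` and `retrievable` to be stated explicitly on each field rather than relying on any service defaults. The hunk may not show every requirement in the article.

```python
# Hypothetical check of a field definition against the two requirements shown above.

SEMANTIC_FIELD_TYPES = {"Edm.String", "Edm.ComplexType", "Collection(Edm.String)"}


def semantic_field_problems(field):
    """Return a list of reasons the field can't be used; empty if none were found."""
    problems = []

    # Assumption of this sketch: a missing attribute counts as not set.
    for attribute in ("searchable", "retrievable"):
        if not field.get(attribute, False):
            problems.append(f"'{field['name']}' must be attributed as {attribute}")

    if field.get("type") not in SEMANTIC_FIELD_TYPES:
        problems.append(f"'{field['name']}' has unsupported type {field.get('type')}")
    return problems


# Hypothetical field definition used only for illustration.
description = {
    "name": "Description",
    "type": "Edm.String",
    "searchable": True,
    "retrievable": True,
}
print(semantic_field_problems(description))
```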
