Commit e5aceb4

Merge pull request #110979 from HeidiSteen/heidist-search
[Azure Cog Search] autocomplete/suggester analyzer update
2 parents d953ba2 + 02ef728 commit e5aceb4

1 file changed: +36 -13 lines changed

articles/search/index-add-suggesters.md

Lines changed: 36 additions & 13 deletions
@@ -13,9 +13,9 @@ ms.date: 04/10/2020
 
 # Create a suggester to enable autocomplete and suggestions in Azure Cognitive Search
 
-In Azure Cognitive Search, "search-as-you-type" or typeahead functionality is based on a **suggester** construct that you add to a [search index](search-what-is-an-index.md). A suggester supports two search-as-you-type variants: *autocomplete*, which completes the term or phrase you are typing, and *suggestions* that return a short list of matching documents.
+In Azure Cognitive Search, "search-as-you-type" is enabled through a **suggester** construct added to a [search index](search-what-is-an-index.md). A suggester supports two experiences: *autocomplete*, which completes the term or phrase, and *suggestions* that return a short list of matching documents.
 
-The following screenshot, from the [Create your first app in C#](tutorial-csharp-type-ahead-and-suggestions.md) sample, illustrates both experiences. Autocomplete anticipates what the user might type, finishing "tw" with "in" as the prospective search term. Suggestions are actual search results, each one representing a matching document. For suggestions, you can surface any part of a document that best describes the result. In this example, the suggestions are represented by the hotel name field.
+The following screenshot from [Create your first app in C#](tutorial-csharp-type-ahead-and-suggestions.md) illustrates both. Autocomplete anticipates a potential term, finishing "tw" with "in". Suggestions are mini search results, where a field like hotel name represents a matching hotel search document from the index. For suggestions, you can surface any field that provides descriptive information.
 
 ![Visual comparison of autocomplete and suggested queries](./media/index-add-suggesters/hotel-app-suggestions-autocomplete.png "Visual comparison of autocomplete and suggested queries")
 
@@ -31,22 +31,31 @@ Search-as-you-type support is enabled on a per-field basis. You can implement bo
 
 A suggester is a data structure that supports search-as-you-type behaviors by storing prefixes for matching on partial queries. Similar to tokenized terms, prefixes are stored in inverted indexes, one for each field specified in a suggester fields collection.
 
-When creating prefixes, a suggester has its own analysis chain, similar to the one used for full text search. However, unlike analysis in full text search, a suggester can only use predefined analyzers (standard Lucene, [language analyzers](index-add-language-analyzers.md), or other analyzers in the [predefined analyzer list](index-add-custom-analyzers.md#predefined-analyzers-reference). [Custom analyzers](index-add-custom-analyzers.md) and configurations are specifically disallowed to avoid random configurations that yield poor results.
+When creating prefixes, a suggester has its own analysis chain, similar to the one used for full text search. However, unlike analysis in full text search, a suggester can only operate over fields that use the standard Lucene analyzer (default) or a [language analyzer](index-add-language-analyzers.md). Fields that use [custom analyzers](index-add-custom-analyzers.md) or [predefined analyzers](index-add-custom-analyzers.md#predefined-analyzers-reference) (with the exception of standard Lucene) are explicitly disallowed to prevent poor outcomes.
 
 > [!NOTE]
 > If you need to work around the analyzer constraint, use two separate fields for the same content. This will allow one of the fields to have a suggester, while the other can be set up with a custom analyzer configuration.
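As a rough illustration of the workaround in the note, the following Python sketch builds an index definition as a plain dictionary in which the same hotel name content is stored in two fields. The field name `HotelNameCustom`, the analyzer name `my_custom_analyzer`, and the helper `suggester_field_names` are hypothetical; the custom analyzer's own definition is omitted, and the application would still have to write the same value into both fields when loading documents.

```python
# Sketch only: "HotelNameCustom" and "my_custom_analyzer" are hypothetical names.
index_definition = {
    "name": "hotels-sample-index",
    "fields": [
        # Field backing the suggester: standard Lucene or a language analyzer only.
        {"name": "HotelName", "type": "Edm.String",
         "searchable": True, "analyzer": "en.microsoft"},
        # Same content stored a second time, analyzed with a custom analyzer.
        {"name": "HotelNameCustom", "type": "Edm.String",
         "searchable": True, "analyzer": "my_custom_analyzer"},
    ],
    "suggesters": [
        {"name": "sg", "searchMode": "analyzingInfixMatching",
         "sourceFields": ["HotelName"]},
    ],
}


def suggester_field_names(definition):
    """Return every field name referenced by any suggester in the definition."""
    return [name
            for suggester in definition.get("suggesters", [])
            for name in suggester["sourceFields"]]
```

Only `HotelName` appears in `sourceFields`, so the custom-analyzed copy stays outside the suggester.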
 
-## Create a suggester
+## Define a suggester
 
-Although a suggester has several properties, it is primarily a collection of fields for which you are enabling a typeahead experience. For example, a travel app might want to enable typeahead search on destinations, cities, and attractions. As such, all three fields would go in the fields collection.
+Although a suggester has several properties, it is primarily a collection of fields for which you are enabling a search-as-you-type experience. For example, a travel app might want to enable autocomplete on destinations, cities, and attractions. As such, all three fields would go in the fields collection.
 
-To create a suggester, add one to an index schema. You can have one suggester in an index (specifically, one suggester in the suggesters collection).
+To create a suggester, add one to an index schema. You can have one suggester in an index (specifically, one suggester in the suggesters collection). A suggester takes a list of fields.
+
++ For suggestions, choose fields that best represent a single result. Names, titles, or other unique fields that distinguish among documents work best. If fields consist of similar or identical values, the suggestions will be composed of identical results and a user won't know which one to click.
+
++ Make sure each field in the suggester `sourceFields` list uses either the default standard Lucene analyzer (`"analyzer": null`) or a [language analyzer](index-add-language-analyzers.md) (for example, `"analyzer": "en.microsoft"`).
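The two checks in the list above can be sketched as a small pre-flight validation run against an index definition before it is sent to the service. This is a minimal sketch, not part of any SDK: `check_suggester_fields` is a hypothetical helper, `LANGUAGE_ANALYZERS` is a deliberately tiny illustrative subset of the language analyzer names, and field names are compared exactly as written.

```python
# Illustrative subset only; the service supports many more language analyzers.
LANGUAGE_ANALYZERS = {"en.microsoft", "en.lucene", "fr.microsoft", "fr.lucene"}
# A null analyzer means the default standard Lucene analyzer.
STANDARD_ANALYZERS = {None, "standard.lucene"}
ALLOWED_TYPES = {"Edm.String", "Collection(Edm.String)"}


def check_suggester_fields(definition, language_analyzers=LANGUAGE_ANALYZERS):
    """Return a list of problems found; an empty list means none were found."""
    fields = {field["name"]: field for field in definition.get("fields", [])}
    problems = []
    for suggester in definition.get("suggesters", []):
        for name in suggester.get("sourceFields", []):
            field = fields.get(name)
            if field is None:
                problems.append(f"{name}: not defined in fields")
                continue
            if field.get("type") not in ALLOWED_TYPES:
                problems.append(f"{name}: unsupported type {field.get('type')}")
            analyzer = field.get("analyzer")
            if analyzer not in STANDARD_ANALYZERS and analyzer not in language_analyzers:
                problems.append(f"{name}: analyzer {analyzer} is not standard or language")
    return problems
```

Passing a fuller set of names through the `language_analyzers` parameter makes the check as strict or as permissive as the index requires.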
+
+Your choice of an analyzer determines how fields are tokenized and subsequently prefixed. For example, for a hyphenated string like "context-sensitive", using a language analyzer will result in these token combinations: "context", "sensitive", "context-sensitive". Had you used the standard Lucene analyzer, the hyphenated string would not exist.
+
+> [!TIP]
+> Consider using the [Analyze Text API](https://docs.microsoft.com/rest/api/searchservice/test-analyzer) for insight into how terms are tokenized and subsequently prefixed. Once you build an index, you can try various analyzers on a string to view the tokens it emits.
 
 ### When to create a suggester
 
 The best time to create a suggester is when you are also creating the field definition itself.
 
-If you try to create a suggester using pre-existing fields, the API will disallow it. Typeahead text is created during indexing, when partial terms in two or more character combinations are tokenized alongside whole terms. Given that existing fields are already tokenized, you will have to rebuild the index if you want to add them to a suggester. For more information about reindexing, see [How to rebuild an Azure Cognitive Search index](search-howto-reindex.md).
+If you try to create a suggester using pre-existing fields, the API will disallow it. Prefixes are generated during indexing, when partial terms in two or more character combinations are tokenized alongside whole terms. Given that existing fields are already tokenized, you will have to rebuild the index if you want to add them to a suggester. For more information, see [How to rebuild an Azure Cognitive Search index](search-howto-reindex.md).
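To make the idea of prefixes generated at indexing time concrete, here is a simplified Python sketch. It is not the service's implementation: the functions `prefixes` and `prefix_index` are hypothetical, the sample tokens are made up, and the two-character minimum is taken from the paragraph above.

```python
def prefixes(token, min_length=2):
    """Return every leading substring of token with at least min_length characters."""
    return [token[:end] for end in range(min_length, len(token) + 1)]


def prefix_index(tokens, min_length=2):
    """Map each prefix to the sorted list of whole tokens that start with it."""
    index = {}
    for token in tokens:
        for prefix in prefixes(token, min_length):
            index.setdefault(prefix, set()).add(token)
    return {prefix: sorted(matches) for prefix, matches in index.items()}
```

Because the mapping is built once from the tokens that exist at indexing time, a field that was indexed without it has nothing to look up, which is consistent with the rebuild requirement described above.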
 
 ### Create using the REST API
 
@@ -55,15 +64,30 @@ In the REST API, add suggesters through [Create Index](https://docs.microsoft.co
 
 ```json
 {
-  "name": "hotels",
+  "name": "hotels-sample-index",
   "fields": [
     . . .
+    {
+      "name": "HotelName",
+      "type": "Edm.String",
+      "facetable": false,
+      "filterable": false,
+      "key": false,
+      "retrievable": true,
+      "searchable": true,
+      "sortable": false,
+      "analyzer": "en.microsoft",
+      "indexAnalyzer": null,
+      "searchAnalyzer": null,
+      "synonymMaps": [],
+      "fields": []
+    },
   ],
   "suggesters": [
     {
       "name": "sg",
       "searchMode": "analyzingInfixMatching",
-      "sourceFields": ["hotelName", "category"]
+      "sourceFields": ["HotelName"]
     }
   ],
   "scoringProfiles": [
@@ -81,12 +105,12 @@ private static void CreateHotelsIndex(SearchServiceClient serviceClient)
 {
     var definition = new Index()
     {
-        Name = "hotels",
+        Name = "hotels-sample-index",
         Fields = FieldBuilder.BuildForType<Hotel>(),
         Suggesters = new List<Suggester>() {new Suggester()
             {
                 Name = "sg",
-                SourceFields = new string[] { "HotelId", "Category" }
+                SourceFields = new string[] { "HotelName", "Category" }
             }}
     };
 
@@ -101,7 +125,7 @@ private static void CreateHotelsIndex(SearchServiceClient serviceClient)
 |--------------|-----------------|
 |`name` |The name of the suggester.|
 |`searchMode` |The strategy used to search for candidate phrases. The only mode currently supported is `analyzingInfixMatching`, which performs flexible matching of phrases at the beginning or in the middle of sentences.|
-|`sourceFields`|A list of one or more fields that are the source of the content for suggestions. Fields must be of type `Edm.String` or `Collection(Edm.String)`. If an analyzer is specified on the field, it must be a named analyzer from [this list](https://docs.microsoft.com/dotnet/api/microsoft.azure.search.models.analyzername?view=azure-dotnet) (not a custom analyzer).<p/>As a best practice, specify only those fields that lend themselves to an expected and appropriate response, whether it's a completed string in a search bar or a dropdown list.<p/>A hotel name is a good candidate because it has precision. Verbose fields like descriptions and comments are too dense. Similarly, repetitive fields, such as categories and tags, are less effective. In the examples, we include "category" anyway to demonstrate that you can include multiple fields. |
+|`sourceFields`|A list of one or more fields that are the source of the content for suggestions. Fields must be of type `Edm.String` or `Collection(Edm.String)`. If an analyzer is specified on the field, it must be a named analyzer from [this list](https://docs.microsoft.com/dotnet/api/microsoft.azure.search.models.analyzername?view=azure-dotnet) (not a custom analyzer).<p/> As a best practice, specify only those fields that lend themselves to an expected and appropriate response, whether it's a completed string in a search bar or a dropdown list.<p/>A hotel name is a good candidate because it has precision. Verbose fields like descriptions and comments are too dense. Similarly, repetitive fields, such as categories and tags, are less effective. In the examples, we include "category" anyway to demonstrate that you can include multiple fields. |
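The `searchMode` row describes matching at the beginning or in the middle of a phrase. A simplified way to picture that behavior, assuming whitespace-separated words and case-insensitive comparison (the service applies the field's analyzer instead), is the Python sketch below. The function `infix_matches` and the hotel names are hypothetical.

```python
def infix_matches(query, phrases):
    """Return the phrases in which some word starts with the query, ignoring case."""
    needle = query.lower()
    return [phrase for phrase in phrases
            if any(word.startswith(needle) for word in phrase.lower().split())]


# Hypothetical hotel names used only for illustration.
hotel_names = ["Twin Dome Motel", "Secret Point Motel", "Old Twin Lodge"]
matches = infix_matches("tw", hotel_names)
```

Here `matches` contains both "Twin Dome Motel", where the match is at the start, and "Old Twin Lodge", where it falls in the middle of the phrase.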
 
 <a name="how-to-use-a-suggester"></a>
 
@@ -129,7 +153,6 @@ If a suggester is not defined in the index, a call to autocomplete or suggestion
 
 + [DotNetHowToAutocomplete](https://github.com/Azure-Samples/search-dotnet-getting-started/tree/master/DotNetHowToAutocomplete) is an older sample containing both C# and Java code. It also demonstrates a suggester construction, suggested queries, autocomplete, and faceted navigation. This code sample uses the hosted [NYCJobs](https://github.com/Azure-Samples/search-dotnet-asp-net-mvc-jobs) sample data.
 
-
 ## Next steps
 
 We recommend the following example to see how the requests are formulated.
