articles/search/search-faceted-navigation.md
Lines changed: 36 additions & 35 deletions
@@ -126,7 +126,7 @@ Here's a JSON example of the hotels sample index, showing "facetable" and "filte
126
126
127
127
### Prerequisites
128
128
129
-
* A new or existing search index, with plain text fields containing text or numeric content.
129
+
Add faceting to fields in a new or existing search index. The fields must contain plain text or numeric content. Supported data types include strings, dates, booleans, and numeric types (but not vectors).
130
130
131
131
You can't set facets on vector fields or fields of type `Edm.GeographyPoint` or `Collection(Edm.GeographyPoint)`.
132
132
@@ -138,24 +138,12 @@ Facets can be calculated over single-value fields and collections. Fields that w
138
138
139
139
* Human readable (nonvector) content.
140
140
141
-
* Low cardinality (a small number of distinct values that repeat throughout documents in your search corpus).
141
+
* Low cardinality (a few distinct values that repeat throughout documents in your search corpus).
142
142
143
143
* Short descriptive values (one or two words) that render nicely in a navigation tree.
144
144
145
-
Fields of type `Edm.String` that are filterable, sortable, or facetable can be at most 32 kilobytes in length. This is because values of such fields are treated as a single search term, and the maximum length of a term in Azure AI Search is 32 kilobytes. If you need to store more text than this in a single string field, you'll need to explicitly set filterable, sortable, and facetable to false in your index definition.
146
-
147
145
The values within a field, and not the field name itself, produce the facets in a faceted navigation structure. If the facet is a string field named *Color*, facets are blue, green, and any other value for that field.
148
146
149
-
For performance and storage optimization, set `"facetable": false` for fields that should never be used as a facet. These include:
150
-
151
-
* String fields for unique values, such as an ID or product name, to prevent their accidental (and ineffective) use in faceted navigation. This is especially true for the REST API that enables filters and facets on string fields by default.
152
-
153
-
* Geo-coordinates. You can't use `Edm.GeographyPoint` or `Collection(Edm.GeographyPoint)` fields in faceted navigation. Recall that facets work best on fields with low cardinality. Due to how geo-coordinates resolve, it's rare that any two sets of coordinates are equal in a given dataset. As such, facets aren't supported for geo-coordinates. You should use a city or region field to facet by location.
154
-
155
-
Setting a field as searchable, filterable, sortable, or facetable has an impact on index size and query performance. Don't set those attributes on fields that aren't meant to be referenced in query expressions.
156
-
157
-
In your code, check fields for null values, misspellings or case discrepancies, and single and plural versions of the same word. By default, filters and facets don't undergo lexical analysis or [spell check](speller-how-to-add.md), which means that all values of a "facetable" field are potential facets, even if the words differ by one character. Optionally, you can [assign a normalizer](search-normalizers.md) to a "filterable" and "facetable" field to smooth out variations in casing and characters.
158
-
159
147
### Defaults in REST and Azure SDKs
160
148
161
149
If you're using one of the Azure SDKs, your code must explicitly set the "facetable" attribute on a field.
@@ -165,51 +153,64 @@ The REST API has defaults for field attributes based on the [data type](/rest/ap
165
153
* `Edm.String` and `Collection(Edm.String)`
166
154
* `Edm.DateTimeOffset` and `Collection(Edm.DateTimeOffset)`
167
155
* `Edm.Boolean` and `Collection(Edm.Boolean)`
168
-
* `Edm.Int32`, `Edm.Int64`, `Edm.Double` and their collection equivalents
169
-
170
-
## Return a facet navigation structure in a query
156
+
* `Edm.Int32`, `Edm.Int64`, `Edm.Double`, and their collection equivalents
171
157
172
-
<!-- facet or facets string Optional. A field to facet by, where the field is attributed as "facetable". When called with GET, facet is a field (facet: field1). When called with POST, this parameter is named facets instead of facet and it's specified as an array (facets: [field1, field2, field3]). The string may contain parameters to customize the faceting, expressed as comma-separated name-value pairs.
158
+
## Return facets in a query
173
159
174
-
Valid parameters are "count", "sort", "values", "interval", and "timeoffset".
160
+
Recall that facets are calculated from results in a query response. You only get facets for documents found by the current query.
175
161
176
-
"count" is the maximum number of facet terms; default is 10. There's no upper limit on the number of terms, but higher values degrade performance, especially if the faceted field contains a large number of unique terms. For example, "facet=category,count:5" gets the top five categories in facet results. If the count parameter is less than the number of unique terms, the results may not be accurate. This is due to the way faceting queries are distributed across shards. You can set count to zero or to a value that's greater than or equal to the number of unique values in the facetable field to get an accurate count across all shards. The tradeoff is increased latency.
162
+
1. Facets are configured at query time. Use a [Search POST](/rest/api/searchservice/documents/search-post) or [Search GET](/rest/api/searchservice/documents/search-get) request, or an equivalent Azure SDK API, to specify facets.
177
163
178
-
"sort" can be set to "count", "-count", "value", "-value". Use count to sort descending by count. Use -count to sort ascending by count. Use value to sort ascending by value. Use -value to sort descending by value (for example, "facet=category,count:3,sort:count" gets the top three categories in facet results in descending order by the number of documents with each city name). If the top three categories are Budget, Motel, and Luxury, and Budget has 5 hits, Motel has 6, and Luxury has 4, then the buckets are in the order Motel, Budget, Luxury. For -value, "facet=rating,sort:-value" produces buckets for all possible ratings, in descending order by value (for example, if the ratings are from 1 to 5, the buckets are ordered 5, 4, 3, 2, 1, irrespective of how many documents match each rating).
164
+
1. Set facet query parameters in the request. In Search POST, `facets` is an array of facet expressions to apply to the search query. Each facet expression contains a field name, optionally followed by a comma-separated list of name:value pairs. Valid facet parameters are `count`, `sort`, `values`, `interval`, and `timeoffset`.
179
165
180
-
"values" can set to pipe-delimited numeric or Edm.DateTimeOffset values specifying a dynamic set of facet entry values (for example, "facet=baseRate,values:10 | 20" produces three buckets: one for base rate 0 up to but not including 10, one for 10 up to but not including 20, and one for 20 and higher). A string "facet=lastRenovationDate,values:2010-02-01T00:00:00Z" produces two buckets: one for hotels renovated before February 2010, and one for hotels renovated February 1, 2010 or later. The values must be listed in sequential, ascending order to get the expected results.
166
+
| Facet parameter | Description and usage |
167
+
|-----------------|-----------------------|
168
+
| `count` | Maximum number of facet terms; default is 10. An example is `Tags,count:5`. There's no upper limit on the number of terms, but higher values degrade performance, especially if the faceted field contains a large number of unique terms. If `count` is less than the number of unique terms, the results might not be accurate. This is due to the way faceting queries are distributed across shards. You can set `count` to zero or to a value that's greater than or equal to the number of unique values in the facetable field to get an accurate count across all shards. The tradeoff is increased latency. |
169
+
| `sort` | Set to `count`, `-count`, `value`, or `-value`. Use `count` to sort descending by count, `-count` to sort ascending by count, `value` to sort ascending by value, and `-value` to sort descending by value. For example, "facet=category,count:3,sort:count" gets the top three categories in facet results, in descending order by the number of documents in each category. If the top three categories are Budget, Motel, and Luxury, and Budget has 5 hits, Motel has 6, and Luxury has 4, then the buckets are in the order Motel, Budget, Luxury. For `-value`, "facet=rating,sort:-value" produces buckets for all possible ratings, in descending order by value. For example, if the ratings are from 1 to 5, the buckets are ordered 5, 4, 3, 2, 1, irrespective of how many documents match each rating. |
170
+
| `values` | Set to pipe-delimited numeric or `Edm.DateTimeOffset` values specifying a dynamic set of facet entry values. For example, "facet=baseRate,values:10 \| 20" produces three buckets: one for base rate 0 up to but not including 10, one for 10 up to but not including 20, and one for 20 and higher. The string "facet=lastRenovationDate,values:2010-02-01T00:00:00Z" produces two buckets: one for hotels renovated before February 2010, and one for hotels renovated February 1, 2010 or later. The values must be listed in sequential, ascending order to get the expected results. |
171
+
| `interval` | An integer interval greater than 0 for numbers, or minute, hour, day, week, month, quarter, year for date time values. For example, "facet=baseRate,interval:100" produces buckets based on base rate ranges of size 100. If base rates are all between $60 and $600, there are buckets for 0-100, 100-200, 200-300, 300-400, 400-500, and 500-600. The string "facet=lastRenovationDate,interval:year" produces one bucket for each year when hotels were renovated. |
172
+
| `timeoffset` | Can be set to `[+-]hh:mm`, `[+-]hhmm`, or `[+-]hh`. If used, `timeoffset` must be combined with the `interval` option, and it applies only to fields of type `Edm.DateTimeOffset`. The value specifies the UTC time offset to account for in setting time boundaries. For example, "facet=lastRenovationDate,interval:day,timeoffset:-01:00" uses the day boundary that starts at 01:00:00 UTC (midnight in the target time zone). |
181
173
182
-
"interval" is an integer interval greater than 0 for numbers, or minute, hour, day, week, month, quarter, year for date time values. For example, "facet=baseRate,interval:100" produces buckets based on base rate ranges of size 100. If base rates are all between $60 and $600, there will be buckets for 0-100, 100-200, 200-300, 300-400, 400-500, and 500-600. The string "facet=lastRenovationDate,interval:year" produces one bucket for each year when hotels were renovated.
174
+
`count` and `sort` can be combined in the same facet specification, but they can't be combined with `interval` or `values`, and `interval` and `values` can't be combined with each other.
183
175
184
-
"timeoffset" can be set to ([+-]hh:mm, [+-]hhmm, or [+-]hh). If used, the timeoffset parameter must be combined with the interval option, and only when applied to a field of type Edm.DateTimeOffset. The value specifies the UTC time offset to account for in setting time boundaries. For example: "facet=lastRenovationDate,interval:day,timeoffset:-01:00" uses the day boundary that starts at 01:00:00 UTC (midnight in the target time zone).
176
+
Interval facets on date time are computed based on the UTC time if `timeoffset` isn't specified. For example: for "facet=lastRenovationDate,interval:day", the day boundary starts at 00:00:00 UTC.
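The parameter rules above can be checked before a request is sent. The following Python sketch is illustrative only: `build_facet` is a hypothetical helper, not part of any Azure SDK, and it assumes the facet expression format described in the table (a field name followed by comma-separated `name:value` pairs).

```python
from typing import Optional, Sequence, Union

ALLOWED_SORTS = {"count", "-count", "value", "-value"}


def build_facet(
    field: str,
    *,
    count: Optional[int] = None,
    sort: Optional[str] = None,
    values: Optional[Sequence] = None,
    interval: Optional[Union[int, str]] = None,
    timeoffset: Optional[str] = None,
) -> str:
    """Build a facet expression string (hypothetical helper, not an SDK API)."""
    # count and sort can't be combined with interval or values.
    if (count is not None or sort is not None) and (
        values is not None or interval is not None
    ):
        raise ValueError("count/sort can't be combined with interval or values")
    # interval and values can't be combined with each other.
    if values is not None and interval is not None:
        raise ValueError("interval and values can't be combined")
    # timeoffset is only valid together with interval.
    if timeoffset is not None and interval is None:
        raise ValueError("timeoffset must be combined with interval")
    if sort is not None and sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")

    parts = [field]
    if count is not None:
        parts.append(f"count:{count}")
    if sort is not None:
        parts.append(f"sort:{sort}")
    if values is not None:
        parts.append("values:" + "|".join(str(v) for v in values))
    if interval is not None:
        parts.append(f"interval:{interval}")
    if timeoffset is not None:
        parts.append(f"timeoffset:{timeoffset}")
    return ",".join(parts)


# Facet expressions for a Search POST "facets" array.
facets = [
    build_facet("Category"),
    build_facet("Tags", count=5),
    build_facet("Rating", values=[1, 2, 3, 4, 5]),
]
```

Validating on the client side is a design choice rather than a requirement. It surfaces an invalid combination as a local error instead of a failed request.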
185
177
186
-
count and sort can be combined in the same facet specification, but they can't be combined with interval or values, and interval and values can't be combined together.
178
+
### Facet example
187
179
188
-
Interval facets on date time are computed based on the UTC time if timeoffset isn't specified. For example: for "facet=lastRenovationDate,interval:day", the day boundary starts at 00:00:00 UTC.
189
-
-->
190
-
191
-
## Facets syntax
192
-
193
-
A facet query parameter is set to a comma-delimited list of "facetable" fields and depending on the data type, can be further parameterized to set counts, sort orders, and ranges: `count:<integer>`, `sort:<>`, `interval:<integer>`, and `values:<list>`. For more detail about facet parameters, see [query parameters in the REST API](/rest/api/searchservice/documents/search-post#searchrequest).
180
+
The following example works against the hotels sample index. It shows three facets for "Category", "Tags", and "Rating", with a count override on "Tags" and explicit whole number values set on "Rating", which is otherwise a float value in the index.
194
181
195
182
```http
196
183
POST https://{{service_name}}.search.windows.net/indexes/hotels/docs/search?api-version={{api_version}}
```
For each faceted navigation tree, there's a default limit of the top ten facets. This default makes sense for navigation structures because it keeps the values list to a manageable size. You can override the default by assigning a value to "count". For example, `"Tags,count:5"` reduces the number of tags under the Tags section to the top five.
195
+
For each faceted navigation tree, there's a default limit of the top 10 facet instances found in search results. This default makes sense for navigation structures because it keeps the values list to a manageable size. You can override the default by assigning a value to "count". For example, `"Tags,count:5"` reduces the number of tags under the Tags section to the top five.
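As a conceptual sketch of how `count` limits a facet list, the following Python code counts facet values over a set of matching documents and keeps the most frequent ones. It's a simplification: the `facet_counts` function and the sample documents are hypothetical, and the actual service computes facets across shards rather than in a single pass.

```python
from collections import Counter


def facet_counts(matching_docs, field, count=10):
    """Return the top `count` (value, hits) pairs for `field`.

    Only documents matched by the current query are counted.
    Collection fields contribute one hit per element.
    """
    counter = Counter()
    for doc in matching_docs:
        value = doc.get(field)
        if value is None:
            continue
        counter.update(value if isinstance(value, list) else [value])
    # Sort descending by hits; break ties by value so output is stable.
    ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:count]


# Hypothetical documents returned by a query.
docs = [
    {"Category": "Budget", "Tags": ["pool", "view"]},
    {"Category": "Budget", "Tags": ["pool"]},
    {"Category": "Luxury", "Tags": ["pool", "bar", "view"]},
]

top_tags = facet_counts(docs, "Tags", count=2)
categories = facet_counts(docs, "Category")
```

With these sample documents, `top_tags` holds only the two most common tags, while `categories` uses the default limit of 10 and so returns every category.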
205
196
206
197
For Numeric and DateTime values only, you can explicitly set values on the facet field (for example, `facet=Rating,values:1|2|3|4|5`) to separate results into contiguous ranges (either ranges based on numeric values or time periods). Alternatively, you can add "interval", as in `facet=Rating,interval:1`.
207
198
208
199
Each range is built using 0 as a starting point, a value from the list as an endpoint, and then trimmed of the previous range to create discrete intervals.
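The range-building rule can be sketched in Python as follows. This is an illustration of the description above, not the service's implementation: `value_ranges` and `count_by_range` are hypothetical names, `None` stands for an open upper bound, and the sample data assumes non-negative values.

```python
def value_ranges(boundaries):
    """Turn a `values` list into contiguous (lower, upper) ranges.

    Starts at 0, uses each listed value as an endpoint, and ends with an
    open-ended range. Upper bounds are exclusive.
    """
    if list(boundaries) != sorted(boundaries):
        raise ValueError("values must be listed in ascending order")
    ranges = []
    lower = 0
    for upper in boundaries:
        ranges.append((lower, upper))
        lower = upper
    ranges.append((lower, None))  # the last bucket has no upper bound
    return ranges


def count_by_range(data, boundaries):
    """Count how many items in `data` fall into each range."""
    ranges = value_ranges(boundaries)
    counts = [0] * len(ranges)
    for item in data:
        for i, (lower, upper) in enumerate(ranges):
            if item >= lower and (upper is None or item < upper):
                counts[i] += 1
                break
    return counts


# "baseRate,values:10|20" yields three buckets.
buckets = value_ranges([10, 20])
hits = count_by_range([5, 12, 18, 20, 75], [10, 20])
```

Here `buckets` corresponds to the three ranges described for `values:10 | 20`: 0 up to 10, 10 up to 20, and 20 and higher.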
209
200
210
-
## Tips for working with facets
201
+
## Best practices for working with facets
202
+
203
+
This section is a collection of tips and workarounds that are helpful for application development.
204
+
205
+
### Disable faceting to save on storage and improve performance
211
206
212
-
This section is a collection of tips and workarounds that might be helpful.
207
+
For performance and storage optimization, set `"facetable": false` for fields that should never be used as a facet. These include string fields for unique values, such as an ID or product name, to prevent their accidental (and ineffective) use in faceted navigation. This is especially important with the REST API, which enables filters and facets on string fields by default.
208
+
209
+
Remember that you can't use `Edm.GeographyPoint` or `Collection(Edm.GeographyPoint)` fields in faceted navigation. Facets work best on fields with low cardinality, and because of how geo-coordinates resolve, it's rare that any two sets of coordinates are equal in a given dataset. As such, facets aren't supported for geo-coordinates. Use a city or region field to facet by location.
210
+
211
+
### Add logic for checking facet quality
212
+
213
+
In your code, check fields for null values, misspellings or case discrepancies, and single and plural versions of the same word. By default, filters and facets don't undergo lexical analysis or [spell check](speller-how-to-add.md), which means that all values of a "facetable" field are potential facets, even if the words differ by one character. Optionally, you can [assign a normalizer](search-normalizers.md) to a "filterable" and "facetable" field to smooth out variations in casing and characters.
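One way to approach this check is to group raw field values under a normalized key and flag any key that maps to more than one spelling. The Python sketch below is a minimal example under stated assumptions: `facet_key` and `find_facet_variants` are hypothetical names, the plural handling is a naive trailing-"s" rule, and misspellings aren't detected.

```python
from collections import defaultdict


def facet_key(value):
    """Normalize a facet value: trim, ignore case, strip a simple plural 's'."""
    key = value.strip().casefold()
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def find_facet_variants(values):
    """Return (variants, null_count) for a list of raw facet values.

    `variants` maps each normalized key to the distinct raw values that
    share it, for keys with more than one raw spelling.
    """
    groups = defaultdict(set)
    null_count = 0
    for value in values:
        if value is None or not value.strip():
            null_count += 1
            continue
        groups[facet_key(value)].add(value)
    variants = {k: sorted(raw) for k, raw in groups.items() if len(raw) > 1}
    return variants, null_count


# Hypothetical values read from a facetable field before indexing.
raw_values = ["Suite", "suite", "Suites", "Motel", None, "Budget "]
variants, null_count = find_facet_variants(raw_values)
```

In this sample, the three spellings of "suite" are grouped together and one null value is counted. Each flagged group would otherwise appear as separate facets in a navigation structure.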