docs/reference/query-languages/query-dsl/full-text-filter-tutorial.md
Lines changed: 43 additions & 40 deletions
@@ -13,23 +13,25 @@ products:
 This is a hands-on introduction to the basics of full-text search with {{es}}, also known as *lexical search*, using the `_search` API and Query DSL.

-You'll implement a search function for a cooking blog that contains recipes with textual content, categorical data, and numerical ratings.
-You'll apply filters to narrow down search results and combine multiple search criteria.
-For example, in this scenario you might want to:
+In this tutorial, you'll implement a search function for a cooking blog and learn how to filter data to narrow down search results based on exact criteria.
+The blog contains recipes with various attributes including textual content, categorical data, and numerical ratings.
+The goal is to create search queries to:

 * Find recipes based on preferred or avoided ingredients
 * Explore dishes that meet specific dietary needs
 * Find top-rated recipes in specific categories
 * Find the latest recipes from favorite authors

+To achieve these goals, you'll use different {{es}} queries to perform full-text search, apply filters, and combine multiple search criteria.
+
 ::::{tip}
 The code examples are in [Console](docs-content://explore-analyze/query-filter/tools/console.md) syntax by default.
 You can [convert into other programming languages](docs-content://explore-analyze/query-filter/tools/console.md#import-export-console-requests) in the Console UI.
-You can follow these steps in any {{es}} deployment.
+You can follow these steps in any type of {{es}} deployment.
 To see all deployment options, refer to [Choosing your deployment type](docs-content://deploy-manage/deploy.md#choosing-your-deployment-type).
 To get started quickly, set up a [single-node local cluster in Docker](docs-content://solutions/search/run-elasticsearch-locally.md).
@@ -101,8 +103,8 @@ PUT /cooking_blog/_mapping
 ```

 1. `analyzer`: Used for text analysis. If you don't specify it, the `standard` analyzer is used by default for `text` fields. It's included here for demonstration purposes. To learn more about analyzers, refer to [Anatomy of an analyzer](https://docs-v3-preview.elastic.dev/elastic/docs-content/tree/main/manage-data/data-store/text-analysis/anatomy-of-an-analyzer).
-2. `ignore_above`: Prevents indexing values longer than 256 characters in the `keyword` field. This is the default value and it's included here for demonstration purposes. It helps to save disk space and avoid potential issues with Lucene's term byte-length limit. For more information, refer [ignore_above parameter](/reference/elasticsearch/mapping-reference/ignore-above.md).
-3. `description`: A field declared with both `text` and `keyword` [data types](/reference/elasticsearch/mapping-reference/field-data-types.md). Such fields are called [Multi-fields](/reference/elasticsearch/mapping-reference/multi-fields.md). This enables both full-text search and exact matching/filtering on the same field. If you use [dynamic mapping](docs-content://manage-data/data-store/mapping/dynamic-field-mapping.md), these multi-fields will be created automatically. A few other fields in the mapping like `author`, `category`, `tags` are also declared as multi-fields.
+2. `ignore_above`: Prevents indexing values longer than 256 characters in the `keyword` field. This is the default value and it's included here for demonstration purposes. It helps to save disk space and avoid potential issues with Lucene's term byte-length limit. For more information, refer to [ignore_above parameter](/reference/elasticsearch/mapping-reference/ignore-above.md).
+3. `description`: A field declared with both `text` and `keyword` [data types](/reference/elasticsearch/mapping-reference/field-data-types.md). Such fields are called [multi-fields](/reference/elasticsearch/mapping-reference/multi-fields.md). This enables both full-text search and exact matching/filtering on the same field. If you use [dynamic mapping](docs-content://manage-data/data-store/mapping/dynamic-field-mapping.md), these multi-fields will be created automatically. A few other fields in the mapping like `author`, `category`, `tags` are also declared as multi-fields.
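As a rough illustration of the multi-field pattern described in these notes, the following sketch maps a `description` field as `text` with a `keyword` sub-field. The index name `my-example-index` is hypothetical, and the options in the tutorial's own mapping may differ.

```console
# Sketch only: "my-example-index" is a hypothetical index name.
PUT /my-example-index
{
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "standard",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
```

With a mapping like this, `description` supports full-text queries, while `description.keyword` supports exact matching and filtering.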
@@ -113,7 +115,7 @@ Full-text search is powered by [text analysis](docs-content://solutions/search/f
 ## Add sample blog posts to your index [full-text-filter-tutorial-index-data]

-Next, you'll need to index some example blog posts using the [Bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk). Note that `text` fields are analyzed and multi-fields are generated at index time.
+Next, index some example blog posts using the [bulk API]({{es-apis}}operation/operation-bulk). Note that `text` fields are analyzed and multi-fields are generated at index time.

 ```console
 POST /cooking_blog/_bulk?refresh=wait_for
@@ -135,7 +137,7 @@ Full-text search involves executing text-based queries across one or more docume
 ### Use `match` query [_match_query]

-The [`match`](/reference/query-languages/query-dsl/query-dsl-match-query.md) query is the standard query for full-text, or "lexical", search. The query text will be analyzed according to the analyzer configuration specified on each field (or at query time).
+The [`match`](/reference/query-languages/query-dsl/query-dsl-match-query.md) query is the standard query for full-text search. The query text will be analyzed according to the analyzer configuration specified on each field (or at query time).

 First, search the `description` field for "fluffy pancakes":
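A minimal sketch of such a request, assuming the `cooking_blog` index and the `description` field used throughout this tutorial:

```console
# Sketch only: assumes the cooking_blog index and its description field.
GET /cooking_blog/_search
{
  "query": {
    "match": {
      "description": {
        "query": "fluffy pancakes"
      }
    }
  }
}
```

Adding `"operator": "and"` next to `"query"` would require every analyzed term to appear in the field, rather than any one of them.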
@@ -154,7 +156,7 @@ GET /cooking_blog/_search
 1. By default, the `match` query uses `OR` logic between the resulting tokens. This means it will match documents that contain either "fluffy" or "pancakes", or both, in the description field.

-At search time, {{es}} defaults to the analyzer defined in the field mapping. In this example, we're using the `standard` analyzer. Using a different analyzer at search time is an [advanced use case](docs-content://manage-data/data-store/text-analysis/index-search-analysis.md#different-analyzers).
+At search time, {{es}} defaults to the analyzer defined in the field mapping. This example uses the `standard` analyzer. Using a different analyzer at search time is an [advanced use case](docs-content://manage-data/data-store/text-analysis/index-search-analysis.md#different-analyzers).

 ::::{dropdown} Example response
 ```console-result
@@ -200,14 +202,15 @@ At search time, {{es}} defaults to the analyzer defined in the field mapping. In
 1. `hits`: Contains the total number of matching documents and their relation to the total.
 2. `max_score`: The highest relevance score among all matching documents. In this example, there is only one matching document.
 3. `_score`: The relevance score for a specific document, indicating how well it matches the query. Higher scores indicate better matches. In this example the `max_score` is the same as the `_score`, as there is only one matching document.
-4. The title contains both "Fluffy" and "Pancakes", matching our search terms exactly.
+4. The title contains both "Fluffy" and "Pancakes", matching the search terms exactly.
 5. The description includes "fluffiest" and "pancakes", further contributing to the document's relevance due to the analysis process.

 ::::

 ### Require all terms in a match query [_require_all_terms_in_a_match_query]

-Specify the `and` operator to include both terms in the `description` field. This stricter search returns *zero hits* on our sample data, as no document contains both "fluffy" and "pancakes" in the description.
+Specify the `and` operator to include both terms in the `description` field.
+This stricter search returns *zero hits* on the sample data because no documents contain both "fluffy" and "pancakes" in the description.

 ```console
 GET /cooking_blog/_search
@@ -269,9 +272,10 @@ GET /cooking_blog/_search
 }
 ```

-## Search across multiple fields at once [full-text-filter-tutorial-multi-match]
+## Search across multiple fields [full-text-filter-tutorial-multi-match]

-When users enter a search query, they often don't know (or care) whether their search terms appear in a specific field. A [`multi_match`](/reference/query-languages/query-dsl/query-dsl-multi-match-query.md) query allows searching across multiple fields simultaneously.
+When you enter a search query, you might not know whether the search terms appear in a specific field.
+A [`multi_match`](/reference/query-languages/query-dsl/query-dsl-multi-match-query.md) query enables you to search across multiple fields simultaneously.

 Start with a basic `multi_match` query:
@@ -304,11 +308,13 @@ GET /cooking_blog/_search
 }
 ```

-1. The `^` syntax applies a boost to specific fields: * `title^3`: The title field is 3 times more important than an unboosted field
-* `description^2`: The description is 2 times more important
-* `tags`: No boost applied (equivalent to `^1`)
+1. The `^` syntax applies a boost to specific fields:
+
+* `title^3`: The title field is 3 times more important than an unboosted field.
+* `description^2`: The description is 2 times more important.
+* `tags`: No boost applied (equivalent to `^1`).

-These boosts help tune relevance, prioritizing matches in the title over the description, and matches in the description over tags.
+These boosts help tune relevance, prioritizing matches in the title over the description and matches in the description over tags.

 Learn more about fields and per-field boosting in the [`multi_match` query](/reference/query-languages/query-dsl/query-dsl-multi-match-query.md) reference.
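The boosts described above can be sketched as follows. The query text `vegetarian curry` is taken from the result notes in this tutorial; the exact request in the tutorial may differ.

```console
# Sketch only: field boosts as described in the notes above.
GET /cooking_blog/_search
{
  "query": {
    "multi_match": {
      "query": "vegetarian curry",
      "fields": ["title^3", "description^2", "tags"]
    }
  }
}
```

Listing `tags` without a suffix leaves it at the default boost of 1.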
@@ -354,24 +360,22 @@ Learn more about fields and per-field boosting in the [`multi_match` query](/ref
 }
 ```

-1. The title contains "Vegetarian" and "Curry", which matches our search terms. The title field has the highest boost (^3), contributing significantly to this document's relevance score.
+1. The title contains "Vegetarian" and "Curry", which matches the search terms. The title field has the highest boost (^3), contributing significantly to this document's relevance score.
 2. The description contains "curry" and related terms like "vegetables", further increasing the document's relevance.
-3. The tags include both "vegetarian" and "curry", providing an exact match for our search terms, albeit with no boost.
+3. The tags include both "vegetarian" and "curry", providing an exact match for the search terms, albeit with no boost.

-
-This result demonstrates how the `multi_match` query with field boosts helps users find relevant recipes across multiple fields. Even though the exact phrase "vegetarian curry" doesn't appear in any single field, the combination of matches across fields produces a highly relevant result.
+This result demonstrates how the `multi_match` query with field boosts helps you find relevant recipes across multiple fields.
+Even though the exact phrase "vegetarian curry" doesn't appear in any single field, the combination of matches across fields produces a highly relevant result.

 ::::
-
 ::::{tip}
-The `multi_match` query is often recommended over a single `match` query for most text search use cases, as it provides more flexibility and better matches user expectations.
-
+The `multi_match` query is often recommended over a single `match` query for most text search use cases because it provides more flexibility and better matches user expectations.
 ::::

 ## Filter and find exact matches [full-text-filter-tutorial-filtering]

-[Filtering](docs-content://explore-analyze/query-filter/languages/querydsl.md#filter-context) allows you to narrow down your search results based on exact criteria. Unlike full-text searches, filters are binary (yes/no) and do not affect the relevance score. Filters execute faster than queries because excluded results don't need to be scored.
+[Filtering](docs-content://explore-analyze/query-filter/languages/querydsl.md#filter-context) enables you to narrow down your search results based on exact criteria. Unlike full-text searches, filters are binary (yes or no) and do not affect the relevance score. Filters run faster than queries because excluded results don't need to be scored.

 The following [`bool`](/reference/query-languages/query-dsl/query-dsl-bool-query.md) query will return blog posts only in the "Breakfast" category.
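A minimal sketch of such a filter, assuming the `category` field has a `keyword` sub-field as described in the mapping notes:

```console
# Sketch only: assumes category.keyword exists in the cooking_blog mapping.
GET /cooking_blog/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "category.keyword": "Breakfast" } }
      ]
    }
  }
}
```

Because the clause sits in `filter` context, it only includes or excludes documents and does not contribute to `_score`.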
@@ -394,14 +398,14 @@ GET /cooking_blog/_search
 ::::{tip}
 The `.keyword` suffix accesses the unanalyzed version of a field, enabling exact, case-sensitive matching. This works in two scenarios:

-1. **When using dynamic mapping for text fields**. {{es}} automatically creates a `.keyword` sub-field.
-2. **When text fields are explicitly mapped with a `.keyword` sub-field**. For example, you explicitly mapped the `category` field [in an earlier step](#full-text-filter-tutorial-create-index) of this tutorial.
-
+1. When using dynamic mapping for text fields. {{es}} automatically creates a `.keyword` sub-field.
+2. When text fields are explicitly mapped with a `.keyword` sub-field. For example, you explicitly mapped the `category` field when you defined the mappings for the `cooking_blog` index.
 ::::

-### Search for posts within a date range [full-text-filter-tutorial-range-query]
+### Search within a date range [full-text-filter-tutorial-range-query]

-Users often want to find content published within a specific time frame. A [`range`](/reference/query-languages/query-dsl/query-dsl-range-query.md) query finds documents that fall within numeric or date ranges.
+To find content published within a specific time frame, use a [`range`](/reference/query-languages/query-dsl/query-dsl-range-query.md) query.
+It finds documents that fall within numeric or date ranges.
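As a rough sketch, a date-range search might look like the following. The field name `date` and both bounds are assumptions chosen for illustration; substitute the date field and time frame from your own mapping.

```console
# Sketch only: the field name "date" and both bounds are assumptions.
GET /cooking_blog/_search
{
  "query": {
    "range": {
      "date": {
        "gte": "2023-05-01",
        "lte": "2023-05-31"
      }
    }
  }
}
```

`gte` and `lte` make both bounds inclusive; `gt` and `lt` are the exclusive variants.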
-Sometimes users want to search for exact terms to eliminate ambiguity in their search results. A [`term`](/reference/query-languages/query-dsl/query-dsl-term-query.md) query searches for an exact term in a field without analyzing it. Exact, case-sensitive matches on specific terms are often referred to as "keyword" searches.
+Sometimes you might want to search for exact terms to eliminate ambiguity in the search results. A [`term`](/reference/query-languages/query-dsl/query-dsl-term-query.md) query searches for an exact term in a field without analyzing it. Exact, case-sensitive matches on specific terms are often referred to as "keyword" searches.

 In the following example, you'll search for the author "Maria Rodriguez" in the `author.keyword` field.
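A minimal sketch of that search, assuming `author` is mapped with a `keyword` sub-field as noted in the mapping description:

```console
# Sketch only: assumes author.keyword exists in the cooking_blog mapping.
GET /cooking_blog/_search
{
  "query": {
    "term": {
      "author.keyword": "Maria Rodriguez"
    }
  }
}
```

The value must match the stored string exactly, including capitalization and spacing.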
@@ -437,17 +441,16 @@ GET /cooking_blog/_search
 }
 ```

-1. The `term` query has zero flexibility. For example, if the `author.keyword` contains words `maria` or `maria rodriguez`, the query will have zero hits, due to case sensitivity.
+1. The `term` query has zero flexibility. For example, if the `author.keyword` contains words `maria` or `maria rodriguez`, the query will have zero hits due to case sensitivity.

 ::::{tip}
-Avoid using the `term` query for [`text` fields](/reference/elasticsearch/mapping-reference/text.md) because they are transformed by the analysis process.
+Avoid using the `term` query for `text` fields because they are transformed by the analysis process.
-A [`bool`](/reference/query-languages/query-dsl/query-dsl-bool-query.md) query allows you to combine multiple query clauses to create sophisticated searches. In this tutorial, it's useful when users have complex requirements for finding recipes.
-
-Create a query that addresses the following user needs:
+You can use a [`bool`](/reference/query-languages/query-dsl/query-dsl-bool-query.md) query to combine multiple query clauses and create sophisticated searches.
+For example, create a query that addresses the following requirements:

 * Must be a vegetarian recipe
 * Should contain "curry" or "spicy" in the title or description
@@ -550,12 +553,12 @@ GET /cooking_blog/_search
 }
 ```

-1. The title contains "Spicy" and "Curry", matching our should condition. With the default [best_fields](/reference/query-languages/query-dsl/query-dsl-multi-match-query.md#type-best-fields) behavior, this field contributes most to the relevance score.
+1. The title contains "Spicy" and "Curry", matching the should condition. With the default [best_fields](/reference/query-languages/query-dsl/query-dsl-multi-match-query.md#type-best-fields) behavior, this field contributes most to the relevance score.
 2. While the description also contains matching terms, only the best matching field's score is used by default.
-3. The recipe was published within the last month, satisfying our recency preference.
+3. If the recipe was published within the last month, it would satisfy the recency preference.
 4. The "Main Course" category satisfies another `should` condition.
-5. The "vegetarian" tag satisfies a `must` condition, while "curry" and "spicy" tags align with our `should` preferences.
-6. The rating of 4.6 meets our minimum rating requirement of 4.5.
+5. The "vegetarian" tag satisfies a `must` condition, while "curry" and "spicy" tags align with the `should` preferences.
+6. The rating of 4.6 meets the minimum rating requirement of 4.5.
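The conditions mentioned in these notes could be combined roughly as follows. This is a sketch, not the tutorial's own request: the field names `tags.keyword`, `rating`, and `date` are assumptions, and any requirements that are not visible in these notes are left out.

```console
# Sketch only: "tags.keyword", "rating", and "date" are assumed field names.
GET /cooking_blog/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "tags.keyword": "vegetarian" } },
        { "range": { "rating": { "gte": 4.5 } } }
      ],
      "should": [
        { "term": { "category.keyword": "Main Course" } },
        {
          "multi_match": {
            "query": "curry spicy",
            "fields": ["title", "description"]
          }
        },
        { "range": { "date": { "gte": "now-1M/d" } } }
      ]
    }
  }
}
```

Because the query contains `must` clauses, the `should` clauses are optional: they raise the score of documents that match them without excluding documents that don't.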