|
| 1 | +--- |
| 2 | +navigation_title: "Search and filter with ES|QL" |
| 3 | +--- |
| 4 | + |
| 5 | +# Tutorial: Full-text search and filtering with {{esql}} |
| 6 | + |
| 7 | +:::{tip} |
| 8 | +This tutorial presents examples in {{esql}} syntax. Refer to [the Query DSL version](query-dsl-full-text-filter-tutorial.md) for the equivalent examples in Query DSL syntax. |
| 9 | +::: |
| 10 | + |
| 11 | +This is a hands-on introduction to the basics of [full-text search](full-text.md) with Elasticsearch, also known as *lexical search*, and how to filter search results based on exact criteria. In this scenario, we're implementing a search function for a cooking blog. The blog contains recipes with various attributes including textual content, categorical data, and numerical ratings. |
| 12 | + |
| 13 | +## Requirements |
| 14 | + |
| 15 | +You'll need a running {{es}} cluster, together with {{kib}} to use the Dev Tools API Console. Refer to [choose your deployment type](/deploy-manage/deploy#choosing-your-deployment-type) for deployment options. |
| 16 | + |
| 17 | +Want to get started quickly? Run the following command in your terminal to set up a [single-node local cluster in Docker](get-started.md): |
| 18 | + |
| 19 | +```sh |
| 20 | +curl -fsSL https://elastic.co/start-local | sh |
| 21 | +``` |
| 22 | + |
| 23 | +## Step 1: Create an index |
| 24 | + |
| 25 | +Create the `cooking_blog` index to get started: |
| 26 | + |
| 27 | +```console |
| 28 | +PUT /cooking_blog |
| 29 | +``` |
| 30 | + |
| 31 | +Now define the mappings for the index: |
| 32 | + |
| 33 | +```console |
| 34 | +PUT /cooking_blog/_mapping |
| 35 | +{ |
| 36 | + "properties": { |
| 37 | + "title": { |
| 38 | + "type": "text", |
| 39 | + "analyzer": "standard", <1> |
| 40 | + "fields": { <2> |
| 41 | + "keyword": { |
| 42 | + "type": "keyword", |
| 43 | + "ignore_above": 256 <3> |
| 44 | + } |
| 45 | + } |
| 46 | + }, |
| 47 | + "description": { |
| 48 | + "type": "text", |
| 49 | + "fields": { |
| 50 | + "keyword": { |
| 51 | + "type": "keyword" |
| 52 | + } |
| 53 | + } |
| 54 | + }, |
| 55 | + "author": { |
| 56 | + "type": "text", |
| 57 | + "fields": { |
| 58 | + "keyword": { |
| 59 | + "type": "keyword" |
| 60 | + } |
| 61 | + } |
| 62 | + }, |
| 63 | + "date": { |
| 64 | + "type": "date", |
| 65 | + "format": "yyyy-MM-dd" |
| 66 | + }, |
| 67 | + "category": { |
| 68 | + "type": "text", |
| 69 | + "fields": { |
| 70 | + "keyword": { |
| 71 | + "type": "keyword" |
| 72 | + } |
| 73 | + } |
| 74 | + }, |
| 75 | + "tags": { |
| 76 | + "type": "text", |
| 77 | + "fields": { |
| 78 | + "keyword": { |
| 79 | + "type": "keyword" |
| 80 | + } |
| 81 | + } |
| 82 | + }, |
| 83 | + "rating": { |
| 84 | + "type": "float" |
| 85 | + } |
| 86 | + } |
| 87 | +} |
| 88 | +``` |
| 89 | + |
| 90 | +1. The `standard` analyzer is used by default for `text` fields if an `analyzer` isn't specified. It's included here for demonstration purposes. |
| 91 | +2. [Multi-fields](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md) are used here to index `text` fields as both `text` and `keyword` [data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). This enables both full-text search and exact matching/filtering on the same field. Note that if you used [dynamic mapping](../../manage-data/data-store/mapping/dynamic-field-mapping.md), these multi-fields would be created automatically. |
| 92 | +3. The [`ignore_above` parameter](elasticsearch://reference/elasticsearch/mapping-reference/ignore-above.md) prevents indexing values longer than 256 characters in the `keyword` field. This is the value that dynamic mapping applies to `keyword` sub-fields (an explicitly mapped `keyword` field otherwise has no practical limit), and it's set here for demonstration purposes. It helps to save disk space and avoid potential issues with Lucene's term byte-length limit. |
| 93 | + |
| 94 | +::::{tip} |
| 95 | +Full-text search is powered by [text analysis](full-text/text-analysis-during-search.md). Text analysis normalizes and standardizes text data so it can be efficiently stored in an inverted index and searched in near real-time. Analysis happens at both [index and search time](../../manage-data/data-store/text-analysis/index-search-analysis.md). This tutorial won't cover analysis in detail, but it's important to understand how text is processed to create effective search queries. |
| 96 | +:::: |
| 97 | + |
| 98 | +## Step 2: Perform basic full-text searches |
| 99 | + |
| 100 | +Full-text search involves executing text-based queries across one or more document fields. These queries calculate a relevance score for each matching document, based on how closely the document's content aligns with the search terms. Elasticsearch offers various query types, each with its own method for matching text and relevance scoring. |
| 101 | + |
| 102 | +:::{tip} |
| 103 | +ES|QL provides two ways to perform full-text searches: |
| 104 | + |
| 105 | +1. Full match function syntax: `match(field, "search terms")` |
| 106 | +1. Compact syntax using the colon operator: `field:"search terms"` |
| 107 | + |
| 108 | +The two forms are equivalent for basic searches and can be used interchangeably. The compact syntax is more concise, while the function syntax accepts additional options such as `operator` and `boost`. We'll use the compact syntax in most examples for brevity. |
| 109 | +::: |
| 110 | + |
| 111 | +### Basic full-text query |
| 112 | + |
| 113 | +Here's how to search the `description` field for "fluffy pancakes": |
| 114 | + |
| 115 | +```esql |
| 116 | +POST /_query?format=txt |
| 117 | +{ |
| 118 | + "query": """ |
| 119 | + FROM cooking_blog |
| 120 | + | WHERE description:"fluffy pancakes" |
| 121 | + | LIMIT 1000 |
| 122 | + """ |
| 123 | +} |
| 124 | +``` |
| 125 | + |
| 126 | +By default, like the Query DSL `match` query, ES|QL uses `OR` logic between terms. This means it matches documents that contain "fluffy", "pancakes", or both in the `description` field. |
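| | + |
| | +As a sketch, the same search can also be written with the function syntax. This example assumes nothing beyond the `cooking_blog` index and the `description` field used above: |
| | + |
| | +```esql |
| | +POST /_query?format=txt |
| | +{ |
| | + "query": """ |
| | + FROM cooking_blog |
| | + | WHERE match(description, "fluffy pancakes") |
| | + | LIMIT 1000 |
| | + """ |
| | +} |
| | +``` |
| | + |
| | +With no extra options, this form should behave the same as the colon version. |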
| 127 | + |
| 128 | +:::{tip} |
| 129 | +You can control which fields to include in the response using the `KEEP` command: |
| 130 | + |
| 131 | +```esql |
| 132 | +POST /_query?format=txt |
| 133 | +{ |
| 134 | + "query": """ |
| 135 | + FROM cooking_blog |
| 136 | + | WHERE description:"fluffy pancakes" |
| 137 | + | KEEP title, description, rating |
| 138 | + | LIMIT 1000 |
| 139 | + """ |
| 140 | +} |
| 141 | +``` |
| 142 | +::: |
| 143 | + |
| 144 | +### Require all terms in a match query |
| 145 | + |
| 146 | +Sometimes you need to require that all search terms appear in the matching documents. Here's how to do that using the function syntax with the `operator` parameter: |
| 147 | + |
| 148 | +```esql |
| 149 | +POST /_query?format=txt |
| 150 | +{ |
| 151 | + "query": """ |
| 152 | + FROM cooking_blog |
| 153 | + | WHERE match(description, "fluffy pancakes", {"operator": "AND"}) |
| 154 | + | LIMIT 1000 |
| 155 | + """ |
| 156 | +} |
| 157 | +``` |
| 158 | + |
| 159 | +This stricter search returns *zero hits* on our sample data, as no document contains both "fluffy" and "pancakes" in the description. |
| 160 | + |
| 161 | +### Specify a minimum number of terms to match |
| 162 | + |
| 163 | +Sometimes requiring all terms is too strict, but the default OR behavior is too lenient. You can specify a minimum number of terms that must match: |
| 164 | + |
| 165 | +```esql |
| 166 | +POST /_query?format=txt |
| 167 | +{ |
| 168 | + "query": """ |
| 169 | + FROM cooking_blog |
| 170 | + | WHERE match(title, "fluffy pancakes breakfast", {"minimum_should_match": 2}) |
| 171 | + | LIMIT 1000 |
| 172 | + """ |
| 173 | +} |
| 174 | +``` |
| 175 | + |
| 176 | +This query searches the title field to match at least 2 of the 3 terms: "fluffy", "pancakes", or "breakfast". |
| 177 | + |
| 178 | +## Step 3: Search across multiple fields at once |
| 179 | + |
| 180 | +When users enter a search query, they often don't know (or care) whether their search terms appear in a specific field. In ES|QL, you can combine several match expressions with `OR` to search multiple fields at once: |
| 181 | + |
| 182 | +```esql |
| 183 | +POST /_query?format=txt |
| 184 | +{ |
| 185 | + "query": """ |
| 186 | + FROM cooking_blog |
| 187 | + | WHERE title:"vegetarian curry" OR description:"vegetarian curry" OR tags:"vegetarian curry" |
| 188 | + | LIMIT 1000 |
| 189 | + """ |
| 190 | +} |
| 191 | +``` |
| 192 | + |
| 193 | +This query searches for "vegetarian curry" across the title, description, and tags fields. Each field is treated with equal importance. |
| 194 | + |
| 195 | +However, in many cases, matches in certain fields (like the title) might be more relevant than others. We can adjust the importance of each field using scoring: |
| 196 | + |
| 197 | +```esql |
| 198 | +POST /_query?format=txt |
| 199 | +{ |
| 200 | + "query": """ |
| 201 | + FROM cooking_blog METADATA _score |
| 202 | + | WHERE match(title, "vegetarian curry", {"boost": 2.0}) |
| 203 | + OR match(description, "vegetarian curry") |
| 204 | + OR match(tags, "vegetarian curry") |
| 205 | + | KEEP title, description, tags, _score |
| 206 | + | SORT _score DESC |
| 207 | + | LIMIT 1000 |
| 208 | + """ |
| 209 | +} |
| 210 | +``` |
| 211 | + |
| 212 | +In this example, we're using the `boost` parameter to make matches in the title field twice as important as matches in other fields. We also request the `_score` metadata field to sort results by relevance. |
| 213 | + |
| 214 | +## Step 4: Filter and find exact matches |
| 215 | + |
| 216 | +Filtering allows you to narrow down your search results based on exact criteria. Unlike full-text searches, filters are binary (yes/no) and do not affect the relevance score. Filters execute faster than queries because excluded results don't need to be scored. |
| 217 | + |
| 218 | +```esql |
| 219 | +POST /_query?format=txt |
| 220 | +{ |
| 221 | + "query": """ |
| 222 | + FROM cooking_blog |
| 223 | + | WHERE category.keyword == "Breakfast" |
| 224 | + | KEEP title, author, rating, tags |
| 225 | + | SORT rating DESC |
| 226 | + | LIMIT 1000 |
| 227 | + """ |
| 228 | +} |
| 229 | +``` |
| 230 | + |
| 231 | +Note the use of `category.keyword` here. This refers to the [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) multi-field of the `category` field, ensuring an exact, case-sensitive match. |
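| | + |
| | +Because the comparison is case-sensitive, a value such as `"breakfast"` would not match `"Breakfast"`. If case-insensitive filtering is needed, one possible approach is to normalize the value with the `TO_LOWER` function before comparing. This is a sketch rather than part of the original examples, and it assumes the category values differ only in casing: |
| | + |
| | +```esql |
| | +POST /_query?format=txt |
| | +{ |
| | + "query": """ |
| | + FROM cooking_blog |
| | + | WHERE TO_LOWER(category.keyword) == "breakfast" |
| | + | KEEP title, author, rating, tags |
| | + | LIMIT 1000 |
| | + """ |
| | +} |
| | +``` |
| | + |
| | +Wrapping the field in a function may be slower than the plain comparison, so prefer `category.keyword == "Breakfast"` when the stored casing is known. |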
| 232 | + |
| 233 | +### Search for posts within a date range |
| 234 | + |
| 235 | +Often users want to find content published within a specific time frame: |
| 236 | + |
| 237 | +```esql |
| 238 | +POST /_query?format=txt |
| 239 | +{ |
| 240 | + "query": """ |
| 241 | + FROM cooking_blog |
| 242 | + | WHERE date >= "2023-05-01" AND date <= "2023-05-31" |
| 243 | + | KEEP title, author, date, rating |
| 244 | + | LIMIT 1000 |
| 245 | + """ |
| 246 | +} |
| 247 | +``` |
| 248 | + |
| 249 | +### Find exact matches |
| 250 | + |
| 251 | +Sometimes users want to search for exact terms to eliminate ambiguity in their search results: |
| 252 | + |
| 253 | +```esql |
| 254 | +POST /_query?format=txt |
| 255 | +{ |
| 256 | + "query": """ |
| 257 | + FROM cooking_blog |
| 258 | + | WHERE tags.keyword == "vegetarian" |
| 259 | + | KEEP title, author, rating, tags |
| 260 | + | LIMIT 1000 |
| 261 | + """ |
| 262 | +} |
| 263 | +``` |
| 264 | + |
| 265 | +Like the `term` query in Query DSL, this comparison requires an exact match on the whole value and is case-sensitive. |
| 266 | + |
| 267 | +## Step 5: Combine multiple search criteria |
| 268 | + |
| 269 | +Complex searches often require combining multiple search criteria: |
| 270 | + |
| 271 | +```esql |
| 272 | +POST /_query?format=txt |
| 273 | +{ |
| 274 | + "query": """ |
| 275 | + FROM cooking_blog METADATA _score |
| 276 | + | WHERE rating >= 4.5 |
| 277 | + AND NOT category.keyword == "Dessert" |
| 278 | + AND (title:"curry spicy" OR description:"curry spicy") |
| 279 | + | SORT _score DESC |
| 280 | + | KEEP title, author, rating, tags, description |
| 281 | + | LIMIT 1000 |
| 282 | + """ |
| 283 | +} |
| 284 | +``` |
| 285 | + |
| 286 | +For more complex relevance scoring with combined criteria, you can use the `EVAL` command to calculate custom scores: |
| 287 | + |
| 288 | +```esql |
| 289 | +POST /_query?format=txt |
| 290 | +{ |
| 291 | + "query": """ |
| 292 | + FROM cooking_blog METADATA _score |
| 293 | + | WHERE tags.keyword == "vegetarian" AND rating >= 4.5 |
| 294 | + | EVAL title_score = SCORE(match(title, "curry spicy")) * 2 |
| 295 | + | EVAL desc_score = SCORE(match(description, "curry spicy")) |
| 296 | + | EVAL combined_score = title_score + desc_score |
| 297 | + | EVAL category_boost = CASE(category.keyword == "Main Course", 1.0, 0.0) |
| 298 | + | EVAL date_boost = CASE(date >= NOW() - 1 month, 0.5, 0.0) |
| 299 | + | EVAL final_score = combined_score + category_boost + date_boost |
| 300 | + | WHERE NOT category.keyword == "Dessert" |
| 301 | + | WHERE final_score > 0 |
| 302 | + | SORT final_score DESC |
| 303 | + | LIMIT 1000 |
| 304 | + """ |
| 305 | +} |
| 306 | +``` |
| 307 | + |
| 308 | +This ES|QL query uses an explicit scoring mechanism: |
| 309 | +1. Requires the "vegetarian" tag and a rating of at least 4.5 |
| 310 | +2. Computes separate scores for `title` and `description` matches, doubling the `title` score |
| 311 | +3. Adds boosts for the Main Course category and for posts dated within the last month |
| 312 | +4. Excludes Desserts and drops rows whose final score is 0 |
| 313 | +5. Sorts by the final combined score |
| 314 | + |
| 315 | + |
| 316 | +:::{warning} |
| 317 | +TODO |
| 318 | + |
| 319 | +This section probably shouldn't live in a tutorial. Leaving it here for comments and suggestions in case it's useful. |
| 320 | +::: |
| 321 | + |
| 322 | +## Optimizing your ES|QL queries |
| 323 | + |
| 324 | +ES|QL queries can be optimized for better performance and more relevant results. Here are some key optimization strategies: |
| 325 | + |
| 326 | +### Field filtering with KEEP |
| 327 | + |
| 328 | +Using `KEEP` early in your query pipeline can improve performance by reducing the number of fields that need to be fetched: |
| 329 | + |
| 330 | +```esql |
| 331 | +POST /_query?format=txt |
| 332 | +{ |
| 333 | + "query": """ |
| 334 | + FROM cooking_blog |
| 335 | + | KEEP title, description, rating |
| 336 | + | WHERE title:"curry" |
| 337 | + | LIMIT 1000 |
| 338 | + """ |
| 339 | +} |
| 340 | +``` |
| 341 | + |
| 342 | +However, there's an important caveat: `KEEP` drops every column it doesn't list, so later commands can no longer reference those columns. If you need to filter on fields not included in `KEEP`, place your `WHERE` clauses before `KEEP`: |
| 343 | + |
| 344 | +```esql |
| 345 | +POST /_query?format=txt |
| 346 | +{ |
| 347 | + "query": """ |
| 348 | + FROM cooking_blog |
| 349 | + | WHERE category.keyword == "Main Course" AND rating >= 4.0 |
| 350 | + | KEEP title, description, rating |
| 351 | + | LIMIT 1000 |
| 352 | + """ |
| 353 | +} |
| 354 | +``` |
| 355 | + |
| 356 | +Because `WHERE` runs before `KEEP` here, the query can filter on `category.keyword` even though that field isn't part of the output. ES|QL then only needs to request the fields used for filtering and display. |
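| | + |
| | +For contrast, here is a sketch of the reversed order. It is expected to fail with an unknown column error, because `category.keyword` is no longer available once `KEEP` has run: |
| | + |
| | +```esql |
| | +POST /_query?format=txt |
| | +{ |
| | + "query": """ |
| | + FROM cooking_blog |
| | + | KEEP title, description, rating |
| | + | WHERE category.keyword == "Main Course" |
| | + | LIMIT 1000 |
| | + """ |
| | +} |
| | +``` |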
| 357 | + |
| 358 | +### Optimal query order |
| 359 | + |
| 360 | +For best performance, structure your ES|QL queries in this general order: |
| 361 | + |
| 362 | +1. `FROM` to select your index |
| 363 | +2. `WHERE` clauses for filtering |
| 364 | +3. `KEEP` to select only needed fields |
| 365 | +4. Processing operations (`EVAL`, aggregations, etc.) |
| 366 | +5. `SORT` to order results |
| 367 | +6. `LIMIT` to restrict result count |
| 368 | + |
| 369 | +This order allows Elasticsearch to apply filters early, reducing the dataset before performing more expensive operations. |
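| | + |
| | +The order above can be sketched as a single query. The `rating_rounded` column is a hypothetical name introduced only for illustration, and the filter values are assumptions rather than part of the sample data: |
| | + |
| | +```esql |
| | +POST /_query?format=txt |
| | +{ |
| | + "query": """ |
| | + FROM cooking_blog |
| | + | WHERE category.keyword == "Main Course" AND rating >= 4.0 |
| | + | KEEP title, author, rating |
| | + | EVAL rating_rounded = ROUND(rating) |
| | + | SORT rating DESC |
| | + | LIMIT 10 |
| | + """ |
| | +} |
| | +``` |
| | + |
| | +Each command only sees the rows and columns that the previous commands passed on, which is why filtering and field selection come first. |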
| 370 | + |
| 371 | +### Use keyword fields for exact matching |
| 372 | + |
| 373 | +Use the `.keyword` multi-field when you need exact matching on `text` fields. It compares against the whole, unanalyzed value, which ensures exact, case-sensitive matches: |
| 374 | + |
| 375 | +```esql |
| 376 | +POST /_query?format=txt |
| 377 | +{ |
| 378 | + "query": """ |
| 379 | + FROM cooking_blog |
| 380 | + | WHERE tags.keyword == "vegetarian" |
| 381 | + | LIMIT 1000 |
| 382 | + """ |
| 383 | +} |
| 384 | +``` |
| 385 | + |
| 386 | +## Learn more |
| 387 | + |
| 388 | +This tutorial introduced the basics of full-text search and filtering in ES|QL. Building a real-world search experience requires understanding many more advanced concepts and techniques. Here are some resources once you're ready to dive deeper: |
| 389 | + |
| 390 | +- [Full-text search](full-text.md): Learn about the core components of full-text search in Elasticsearch. |
| 391 | + - [Text analysis](full-text/text-analysis-during-search.md): Understand how text is processed for full-text search. |
| 392 | +- [Query and filter data](/explore-analyze/query-filter.md): Understand all your options for searching and analyzing data in {{es}} in the Explore & Analyze section. |
| 393 | +- [Search your data](../search.md): Learn about more advanced search techniques including semantic search. |