From 9725f7a61f9914aa5cb560886baf17393cde2025 Mon Sep 17 00:00:00 2001
From: Alexander Spies
Date: Tue, 25 Mar 2025 09:38:37 +0100
Subject: [PATCH] [8.x] ESQL: Add more details on ENRICH vs. LOOKUP JOIN #125487 (#125528)

Manual backport of docs-PR #125487
---
 docs/reference/esql/esql-enrich-data.asciidoc |  4 +--
 docs/reference/esql/esql-lookup-join.asciidoc | 22 +++++++---------
 .../esql/processing-commands/lookup.asciidoc  | 25 +++++++++++--------
 3 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/docs/reference/esql/esql-enrich-data.asciidoc b/docs/reference/esql/esql-enrich-data.asciidoc
index 9325dcef12f40..34125fb74d32c 100644
--- a/docs/reference/esql/esql-enrich-data.asciidoc
+++ b/docs/reference/esql/esql-enrich-data.asciidoc
@@ -19,10 +19,10 @@ For example, you can use `ENRICH` to:

 * Enrichment data doesn't change frequently
 * You can accept index-time overhead
-* You are working with structured enrichment patterns
 * You can accept having multiple matches combined into multi-values
 * You can accept being limited to predefined match fields
-* `ENRICH` has a simplified security model. There are no restrictions to specific enrich policies or document and field level security.
+* You do not need fine-grained security: There are no restrictions to specific enrich policies or document and field level security.
+* You want to match using ranges or spatial relations

 [discrete]
 [[esql-how-enrich-works]]
diff --git a/docs/reference/esql/esql-lookup-join.asciidoc b/docs/reference/esql/esql-lookup-join.asciidoc
index 58d2adfc6ee17..a2ef4f0f77883 100644
--- a/docs/reference/esql/esql-lookup-join.asciidoc
+++ b/docs/reference/esql/esql-lookup-join.asciidoc
@@ -4,7 +4,7 @@
 Correlate data with LOOKUP JOIN
 ++++

-The {esql} <>
+The {esql} <>
 processing command combines data from your {esql} query results table with
 matching records from a specified lookup index.
 It adds fields from the lookup index as new columns to your results table based
@@ -29,12 +29,11 @@ in the fact that they both help you join data together. You should use

 * Your enrichment data changes frequently
 * You want to avoid index-time processing
-* You're working with regular indices
-* You need to preserve distinct matches
+* You want SQL-like behavior, so that multiple matches result in multiple rows
 * You need to match on any field in a lookup index
 * You use document or field level security
-* You want to restrict users to a specific lookup indices that they can
-you
+* You want to restrict users to use only specific lookup indices
+* You do not need to match using ranges or spatial relations

 [discrete]
 [[esql-how-lookup-join-works]]
@@ -146,7 +145,7 @@ To use `LOOKUP JOIN`, the following requirements must be met:
 * *Compatible data types*: The join key and join field in the lookup
 index must have compatible data types. This means:
 ** The data types must either be identical or be internally represented
-as the same type in Elasticsearch's type system
+as the same type in {esql}
 ** Numeric types follow these compatibility rules:
 *** `short` and `byte` are compatible with `integer` (all represented as
 `int`)
@@ -164,18 +163,15 @@ representations, see the <> mode are always single-sharded.
 * Cross cluster search is unsupported. Both source and lookup indices
 must be local.
+* Currently, only matching on equality is supported.
 * `LOOKUP JOIN` can only use a single match field and a single index.
 Wildcards, aliases, datemath, and datastreams are not supported.
-* The name of the match field in
-`LOOKUP JOIN lu++_++idx ON match++_++field` must match an existing field
-in the query. This may require renames or evals to achieve.
+* The name of the match field in `LOOKUP JOIN lu_idx ON match_field` must match
+an existing field in the query. This may require `RENAME`s or `EVAL`s to
+achieve.
 * The query will circuit break if there are too many matching documents
 in the lookup index, or if the documents are too large. More precisely,
 `LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large
diff --git a/docs/reference/esql/processing-commands/lookup.asciidoc b/docs/reference/esql/processing-commands/lookup.asciidoc
index 268aad3778676..cde5130a68815 100644
--- a/docs/reference/esql/processing-commands/lookup.asciidoc
+++ b/docs/reference/esql/processing-commands/lookup.asciidoc
@@ -3,7 +3,7 @@
 === `LOOKUP JOIN`

 [WARNING]
-====
+====
 This functionality is in technical preview and may be
 changed or removed in a future release. Elastic will work to fix any
 issues, but features in technical preview are not subject to the support
@@ -15,16 +15,10 @@ and analysis workflows.

 *Syntax*

-....
-FROM
-| LOOKUP JOIN ON
-....
-
 [source,esql]
 ----
-FROM firewall_logs
-| LOOKUP JOIN threat_list ON source.IP
-| WHERE threat_level IS NOT NULL
+FROM
+| LOOKUP JOIN ON
 ----

 *Parameters*
@@ -33,7 +27,7 @@ FROM firewall_logs
 The name of the lookup index. This must be a specific index name - wildcards,
 aliases, and remote cluster references are not supported.

-`field_name`::
+`field_name`::
 The field to join on. This field must exist in both your current query results
 and in the lookup index. If the field contains multi-valued entries, those
 entries will not match anything
@@ -68,6 +62,15 @@ FROM firewall_logs
 | LOOKUP JOIN threat_list ON source.IP
 ----

+To filter only for those rows that have a matching `threat_list` entry, use `WHERE ... IS NOT NULL` with a field from the lookup index:
+
+[source,esql]
+----
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+| WHERE threat_level IS NOT NULL
+----
+
 *Host metadata correlation*: This query pulls in environment or ownership
 details for each host to correlate with your metrics data.

@@ -107,5 +110,5 @@ FROM Left
 ----
 FROM Left
 | LOOKUP JOIN Right ON Key
-| WHERE Language IS NOT NULL
+| WHERE Language IS NOT NULL
 ----