From 9725f7a61f9914aa5cb560886baf17393cde2025 Mon Sep 17 00:00:00 2001
From: Alexander Spies <alexander.spies@elastic.co>
Date: Tue, 25 Mar 2025 09:38:37 +0100
Subject: [PATCH] [8.x] ESQL: Add more details on ENRICH vs. LOOKUP JOIN
 #125487 (#125528)

Manual backport of docs-PR #125487
---
 docs/reference/esql/esql-enrich-data.asciidoc |  4 +--
 docs/reference/esql/esql-lookup-join.asciidoc | 22 +++++++---------
 .../esql/processing-commands/lookup.asciidoc  | 25 +++++++++++--------
 3 files changed, 25 insertions(+), 26 deletions(-)
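
Note: a minimal sketch of the multi-match difference this patch documents, not part of the committed docs. `threat_policy` is a hypothetical enrich policy name, and `threat_level` is assumed to be one of its enrich fields; `firewall_logs`, `threat_list`, and `source.IP` are taken from the examples in the diff below. With `ENRICH`, multiple matches for one input row are combined into a multi-value on that row:

[source,esql]
----
FROM firewall_logs
| ENRICH threat_policy ON source.IP WITH threat_level
----

With `LOOKUP JOIN`, each match produces its own row, as in SQL:

[source,esql]
----
FROM firewall_logs
| LOOKUP JOIN threat_list ON source.IP
----
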
diff --git a/docs/reference/esql/esql-enrich-data.asciidoc b/docs/reference/esql/esql-enrich-data.asciidoc
index 9325dcef12f40..34125fb74d32c 100644
--- a/docs/reference/esql/esql-enrich-data.asciidoc
+++ b/docs/reference/esql/esql-enrich-data.asciidoc
@@ -19,10 +19,10 @@ For example, you can use `ENRICH` to:
 
 * Enrichment data doesn't change frequently
 * You can accept index-time overhead
-* You are working with structured enrichment patterns
 * You can accept having multiple matches combined into multi-values
 * You can accept being limited to predefined match fields
-* `ENRICH` has a simplified security model. There are no restrictions to specific enrich policies or document and field level security.
+* You do not need fine-grained security: access cannot be restricted to specific enrich policies, and there is no document or field level security.
+* You want to match using ranges or spatial relations
 
 [discrete]
 [[esql-how-enrich-works]]
diff --git a/docs/reference/esql/esql-lookup-join.asciidoc b/docs/reference/esql/esql-lookup-join.asciidoc
index 58d2adfc6ee17..a2ef4f0f77883 100644
--- a/docs/reference/esql/esql-lookup-join.asciidoc
+++ b/docs/reference/esql/esql-lookup-join.asciidoc
@@ -4,7 +4,7 @@
 <titleabbrev>Correlate data with LOOKUP JOIN</titleabbrev>
 ++++
 
-The {esql} <<esql-lookup-join,LOOKUP join>> 
+The {esql} <<esql-lookup-join,LOOKUP JOIN>>
 processing command combines data from your {esql} query results
 table with matching records from a specified lookup index. It adds
 fields from the lookup index as new columns to your results table based
@@ -29,12 +29,11 @@ in the fact that they both help you join data together. You should use
 
 * Your enrichment data changes frequently
 * You want to avoid index-time processing
-* You're working with regular indices
-* You need to preserve distinct matches
+* You want SQL-like behavior, so that multiple matches result in multiple rows
 * You need to match on any field in a lookup index
 * You use document or field level security
-* You want to restrict users to a specific lookup indices that they can
-you
+* You want to restrict users to specific lookup indices
+* You do not need to match using ranges or spatial relations
 
 [discrete]
 [[esql-how-lookup-join-works]]
@@ -146,7 +145,7 @@ To use `LOOKUP JOIN`, the following requirements must be met:
 * *Compatible data types*: The join key and join field in the lookup
 index must have compatible data types. This means:
 ** The data types must either be identical or be internally represented
-as the same type in Elasticsearch's type system
+as the same type in {esql}
 ** Numeric types follow these compatibility rules:
 *** `short` and `byte` are compatible with `integer` (all represented as
 `int`)
@@ -164,18 +163,15 @@ representations, see the <<esql-supported-types,Supported Field Types documentat
 
 The following are the current limitations with `LOOKUP JOIN`
 
-* `LOOKUP JOIN` will be successful if the join field in the lookup index
-is a `KEYWORD` type. If the main index's join field is `TEXT` type, it
-must have an exact `.keyword` subfield that can be matched with the
-lookup index's `KEYWORD` field.
 * Indices in <<index-mode-setting,lookup>> mode are always single-sharded.
 * Cross cluster search is unsupported. Both source and lookup indices
 must be local.
+* Only matching on equality is supported.
 * `LOOKUP JOIN` can only use a single match field and a single index.
 Wildcards, aliases, datemath, and datastreams are not supported.
-* The name of the match field in
-`LOOKUP JOIN lu++_++idx ON match++_++field` must match an existing field
-in the query. This may require renames or evals to achieve.
+* The name of the match field in `LOOKUP JOIN lu_idx ON match_field` must match
+an existing field in the query. You may need a `RENAME` or `EVAL` before the
+join to achieve this.
 * The query will circuit break if there are too many matching documents
 in the lookup index, or if the documents are too large. More precisely,
 `LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large
diff --git a/docs/reference/esql/processing-commands/lookup.asciidoc b/docs/reference/esql/processing-commands/lookup.asciidoc
index 268aad3778676..cde5130a68815 100644
--- a/docs/reference/esql/processing-commands/lookup.asciidoc
+++ b/docs/reference/esql/processing-commands/lookup.asciidoc
@@ -3,7 +3,7 @@
 === `LOOKUP JOIN`
 
 [WARNING]
-==== 
+====
 This functionality is in technical preview and may be
 changed or removed in a future release. Elastic will work to fix any
 issues, but features in technical preview are not subject to the support
@@ -15,16 +15,10 @@ and analysis workflows.
 
 *Syntax*
 
-....
-FROM <source_index>
-| LOOKUP JOIN <lookup_index> ON <field_name>
-....
-
 [source,esql]
 ----
-FROM firewall_logs
-| LOOKUP JOIN threat_list ON source.IP
-| WHERE threat_level IS NOT NULL
+FROM <source_index>
+| LOOKUP JOIN <lookup_index> ON <field_name>
 ----
 
 *Parameters*
@@ -33,7 +27,7 @@ FROM firewall_logs
 The name of the lookup index. This must be a specific index name - wildcards, aliases, and remote cluster
 references are not supported.
 
-`field_name`:: 
+`field_name`::
 The field to join on. This field must exist
 in both your current query results and in the lookup index. If the field
 contains multi-valued entries, those entries will not match anything
@@ -68,6 +62,15 @@ FROM firewall_logs
 | LOOKUP JOIN threat_list ON source.IP
 ----
 
+To keep only the rows that have a matching `threat_list` entry, filter with `WHERE ... IS NOT NULL` on a field from the lookup index:
+
+[source,esql]
+----
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+| WHERE threat_level IS NOT NULL
+----
+
 *Host metadata correlation*: This query pulls in environment or
 ownership details for each host to correlate with your metrics data.
 
@@ -107,5 +110,5 @@ FROM Left
 ----
 FROM Left
 | LOOKUP JOIN Right ON Key
-| WHERE Language IS NOT NULL 
+| WHERE Language IS NOT NULL
 ----