Commit de41857

[DOCS][ESQL][8.x] Cleanup and cross-reference LOOKUP JOIN reference and landing pages
1 parent 5a6e936 commit de41857

2 files changed: +133 -65 lines changed
Lines changed: 129 additions & 63 deletions
Original file line number | Diff line number | Diff line change
@@ -1,9 +1,16 @@
11
=== LOOKUP JOIN
2-
32
++++
43
<titleabbrev>Correlate data with LOOKUP JOIN</titleabbrev>
54
++++
65

6+
[WARNING]
7+
====
8+
This functionality is in technical preview and may be
9+
changed or removed in a future release. Elastic will work to fix any
10+
issues, but features in technical preview are not subject to the support
11+
SLA of official GA features.
12+
====
13+
714
The {esql} <<esql-lookup-join,LOOKUP join>>
815
processing command combines data from your {esql} query results
916
table with matching records from a specified lookup index. It adds
@@ -23,6 +30,10 @@ your metrics data.
2330
* Tag logs with the owning team or escalation info for faster triage and
2431
incident response.
2532
33+
[discrete]
34+
[[esql-compare-with-enrich]]
35+
==== Compare with ENRICH
36+
2637
<<esql-lookup-join,LOOKUP join>> is similar to <<esql-enrich-data,ENRICH>>
2738
in the fact that they both help you join data together. You should use
2839
`LOOKUP JOIN` when:
@@ -37,105 +48,160 @@ in the fact that they both help you join data together. You should use
3748

3849
[discrete]
3950
[[esql-how-lookup-join-works]]
40-
==== How the `LOOKUP JOIN` command works
51+
==== How the command works
4152

42-
The `LOOKUP JOIN` command adds new columns to a table, with data from
43-
{es} indices.
53+
The `LOOKUP JOIN` command adds fields from the lookup index as new columns
54+
to your results table based on matching values in the join field.
4455

45-
image::images/esql/esql-lookup-join.png[align="center"]
56+
[source,esql]
57+
----
58+
LOOKUP JOIN <lookup_index> ON <field_name>
59+
----
60+
61+
The command requires two parameters:
4662

4763
[[esql-lookup-join-lookup-index]]
4864
lookup_index::
4965
The name of the lookup index. This must
5066
be a specific index name - wildcards, aliases, and remote cluster
5167
references are not supported. Indices used for lookups must be configured with the <<index-mode-setting,`lookup` mode>>.
5268

53-
5469
[[esql-lookup-join-field-name]]
5570
field_name::
5671
The field to join on. This field must exist
5772
in both your current query results and in the lookup index. If the field
5873
contains multi-valued entries, those entries will not match anything
5974
(the added fields will contain `null` for those rows).
6075

76+
image::images/esql/esql-lookup-join.png[align="center"]
77+
78+
If you're familiar with SQL, `LOOKUP JOIN` has left-join behavior. This means that
79+
if no rows match in the lookup index, the incoming row is retained and `null`s are added. If many rows in the lookup index match, `LOOKUP JOIN` adds one row per match.
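The left-join behavior described above can be sketched in a few lines of Python. This is only an illustration of the semantics, not how {es} implements the command: the `lookup_join` helper, the `Row` alias, and the toy `ip`/`level` fields are all hypothetical names chosen for this sketch.

```python
from typing import Any

Row = dict[str, Any]


def lookup_join(rows: list[Row], lookup: list[Row], field: str) -> list[Row]:
    """Illustrative left join: keep every incoming row, add lookup fields."""
    # Every field the lookup side can contribute, other than the join field.
    added = sorted({k for doc in lookup for k in doc if k != field})
    out: list[Row] = []
    for row in rows:
        key = row.get(field)
        # A multi-valued (list) or missing join key never matches anything.
        matches = [] if isinstance(key, list) or key is None else [
            doc for doc in lookup if doc.get(field) == key
        ]
        if not matches:
            # No match: retain the incoming row and fill the new columns with null.
            out.append({**row, **{k: None for k in added}})
        for doc in matches:
            # One output row per matching lookup document.
            out.append({**row, **{k: doc.get(k) for k in added}})
    return out


# Hypothetical toy data: "a" matches twice, "b" never, and one row is multi-valued.
logs = [{"ip": "a"}, {"ip": "b"}, {"ip": ["a", "b"]}]
threats = [{"ip": "a", "level": "high"}, {"ip": "a", "level": "low"}]
joined = lookup_join(logs, threats, "ip")
```

Here `joined` holds four rows: two for `a` (one per matching lookup document), and one each for `b` and the multi-valued row, both with `level` set to `None`.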
80+
6181
[discrete]
6282
[[esql-lookup-join-example]]
6383
==== Example
6484

65-
`LOOKUP JOIN` has left-join behavior. If no rows match in the lookup index, `LOOKUP JOIN` retains the incoming row and adds nulls. If many rows in the lookup index match, `LOOKUP JOIN` adds one row per match.
85+
You can run this example for yourself to see how it works by setting up the indices and adding sample data. Otherwise, you can just inspect the query and response.
6686

67-
In this example, we have two sample tables:
87+
[discrete]
88+
[[esql-lookup-join-example-setup-sample-data]]
89+
===== Sample data
6890

69-
*employees*
91+
.*Expand for setup instructions*
92+
[%collapsible]
93+
==============
7094
71-
[cols=",,,,,",options="header",]
72-
|===
73-
|birth++_++date |emp++_++no |first++_++name |gender |hire++_++date
74-
|language
75-
|1955-10-04T00:00:00Z |10091 |Amabile |M |1992-11-18T00:00:00Z |3
95+
*Set up indices*
7696
77-
|1964-10-18T00:00:00Z |10092 |Valdiodio |F |1989-09-22T00:00:00Z |1
97+
First, let's create two indices with mappings: `threat_list` and `firewall_logs`.
7898
79-
|1964-06-11T00:00:00Z |10093 |Sailaja |M |1996-11-05T00:00:00Z |3
99+
[source,console]
100+
----
101+
PUT threat_list
102+
{
103+
"settings": {
104+
"index.mode": "lookup" <1>
105+
},
106+
"mappings": {
107+
"properties": {
108+
"source.ip": { "type": "ip" },
109+
"threat_level": { "type": "keyword" },
110+
"threat_type": { "type": "keyword" },
111+
"last_updated": { "type": "date" }
112+
}
113+
}
114+
}
115+
----
116+
<1> The lookup index must be set up using this mode
80117
81-
|1957-05-25T00:00:00Z |10094 |Arumugam |F |1987-04-18T00:00:00Z |5
118+
[source,console]
119+
----
120+
PUT firewall_logs
121+
{
122+
"mappings": {
123+
"properties": {
124+
"timestamp": { "type": "date" },
125+
"source.ip": { "type": "ip" },
126+
"destination.ip": { "type": "ip" },
127+
"action": { "type": "keyword" },
128+
"bytes_transferred": { "type": "long" }
129+
}
130+
}
131+
}
132+
----
82133
83-
|1965-01-03T00:00:00Z |10095 |Hilari |M |1986-07-15T00:00:00Z |4
84-
|===
134+
*Add sample data*
85135
86-
*languages++_++non++_++unique++_++key*
136+
Next, let's add some sample data to both indices. The `threat_list` index contains known malicious IPs, while the `firewall_logs` index contains logs of network traffic.
87137
88-
[cols=",,",options="header",]
89-
|===
90-
|language++_++code |language++_++name |country
91-
|1 |English |Canada
92-
|1 |English |
93-
|1 | |United Kingdom
94-
|1 |English |United States of America
95-
|2 |German |++[++Germany{vbar}Austria++]++
96-
|2 |German |Switzerland
97-
|2 |German |
98-
|4 |Quenya |
99-
|5 | |Atlantis
100-
|++[++6{vbar}7++]++ |Mv-Lang |Mv-Land
101-
|++[++7{vbar}8++]++ |Mv-Lang2 |Mv-Land2
102-
|Null-Lang |Null-Land |
103-
|Null-Lang2 |Null-Land2 |
104-
|===
138+
[source,console]
139+
----
140+
POST threat_list/_bulk
141+
{"index":{}}
142+
{"source.ip":"203.0.113.5","threat_level":"high","threat_type":"C2_SERVER","last_updated":"2025-04-22"}
143+
{"index":{}}
144+
{"source.ip":"198.51.100.2","threat_level":"medium","threat_type":"SCANNER","last_updated":"2025-04-23"}
145+
----
105146
106-
Running the following query would provide the results shown below.
147+
[source,console]
148+
----
149+
POST firewall_logs/_bulk
150+
{"index":{}}
151+
{"timestamp":"2025-04-23T10:00:01Z","source.ip":"192.0.2.1","destination.ip":"10.0.0.100","action":"allow","bytes_transferred":1024}
152+
{"index":{}}
153+
{"timestamp":"2025-04-23T10:00:05Z","source.ip":"203.0.113.5","destination.ip":"10.0.0.55","action":"allow","bytes_transferred":2048}
154+
{"index":{}}
155+
{"timestamp":"2025-04-23T10:00:08Z","source.ip":"198.51.100.2","destination.ip":"10.0.0.200","action":"block","bytes_transferred":0}
156+
{"index":{}}
157+
{"timestamp":"2025-04-23T10:00:15Z","source.ip":"203.0.113.5","destination.ip":"10.0.0.44","action":"allow","bytes_transferred":4096}
158+
{"index":{}}
159+
{"timestamp":"2025-04-23T10:00:30Z","source.ip":"192.0.2.1","destination.ip":"10.0.0.100","action":"allow","bytes_transferred":512}
160+
----
161+
==============
162+
163+
[discrete]
164+
[[esql-lookup-join-example-query]]
165+
===== Query the data
107166

108167
[source,esql]
109168
----
110-
FROM employees
111-
| EVAL language_code = emp_no % 10
112-
| LOOKUP JOIN languages_lookup_non_unique_key ON language_code
113-
| WHERE emp_no > 10090 AND emp_no < 10096
114-
| SORT emp_no, country
115-
| KEEP emp_no, language_code, language_name, country;
169+
FROM firewall_logs <1>
170+
| LOOKUP JOIN threat_list ON source.ip <2>
171+
| WHERE threat_level IS NOT NULL <3>
172+
| SORT timestamp <4>
173+
| KEEP source.ip, action, threat_level, threat_type <5>
174+
| LIMIT 10 <6>
116175
----
117176

118-
[cols=",,,",options="header",]
177+
<1> The source index
178+
<2> The lookup index and join field
179+
<3> Filter for rows with non-null threat levels
180+
<4> LOOKUP JOIN does not guarantee output order, so you must explicitly sort
181+
<5> Keep only relevant fields
182+
<6> Limit the output to 10 rows
183+
184+
[discrete]
185+
[[esql-lookup-join-example-response]]
186+
===== Response
187+
188+
A successful query will output a table like this:
189+
190+
[cols="4*",options="header"]
119191
|===
120-
|emp++_++no |language++_++code |language++_++name |country
121-
|10091 |1 |English |Canada
122-
|10091 |1 |null |United Kingdom
123-
|10091 |1 |English |United States of America
124-
|10091 |1 |English |null
125-
|10092 |2 |German |++[++Germany, Austria++]++
126-
|10092 |2 |German |Switzerland
127-
|10092 |2 |German |null
128-
|10093 |3 |null |null
129-
|10094 |4 |Spanish |null
130-
|10095 |5 |null |France
192+
|source.ip |action |threat_level |threat_type
193+
|203.0.113.5 |allow |high |C2_SERVER
194+
|198.51.100.2 |block |medium |SCANNER
195+
|203.0.113.5 |allow |high |C2_SERVER
131196
|===
132197

133-
[IMPORTANT]
134-
====
135-
`LOOKUP JOIN` does not guarantee the output to be in
136-
any particular order. If a certain order is required, users should use a
137-
<<esql-sort,`SORT`>> somewhere after the `LOOKUP JOIN`.
138-
====
198+
In this example, you can see that the `source.ip` field from the `firewall_logs` index is matched with the `source.ip` field in the `threat_list` index, and the corresponding `threat_level` and `threat_type` fields are added to the output.
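To make the pipeline stages concrete, here is a rough Python walk-through of the same query over the sample documents. It is a sketch under stated assumptions, not the engine's behavior: `destination.ip` and `bytes_transferred` are omitted for brevity, the lookup side is modeled as a plain dict (which assumes each `source.ip` appears at most once in `threat_list`, as in the sample data), and all variable names are hypothetical.

```python
# Sample firewall_logs documents, trimmed to the fields the query uses.
firewall_logs = [
    {"timestamp": "2025-04-23T10:00:01Z", "source.ip": "192.0.2.1", "action": "allow"},
    {"timestamp": "2025-04-23T10:00:05Z", "source.ip": "203.0.113.5", "action": "allow"},
    {"timestamp": "2025-04-23T10:00:08Z", "source.ip": "198.51.100.2", "action": "block"},
    {"timestamp": "2025-04-23T10:00:15Z", "source.ip": "203.0.113.5", "action": "allow"},
    {"timestamp": "2025-04-23T10:00:30Z", "source.ip": "192.0.2.1", "action": "allow"},
]

# Sample threat_list documents, keyed by the join field (assumes unique keys).
threat_list = {
    "203.0.113.5": {"threat_level": "high", "threat_type": "C2_SERVER"},
    "198.51.100.2": {"threat_level": "medium", "threat_type": "SCANNER"},
}

KEEP = ["source.ip", "action", "threat_level", "threat_type"]
NO_MATCH = {"threat_level": None, "threat_type": None}

# LOOKUP JOIN threat_list ON source.ip: unmatched rows get null columns.
joined = [{**log, **threat_list.get(log["source.ip"], NO_MATCH)} for log in firewall_logs]
# WHERE threat_level IS NOT NULL
flagged = [row for row in joined if row["threat_level"] is not None]
# SORT timestamp (same-format ISO 8601 strings sort chronologically)
flagged.sort(key=lambda row: row["timestamp"])
# KEEP ... | LIMIT 10: project the columns in KEEP order, then cap the row count.
result = [{name: row[name] for name in KEEP} for row in flagged][:10]

for row in result:
    print(row)
```

Three rows survive the filter: the two `203.0.113.5` entries and the single `198.51.100.2` entry, with columns in the order given to `KEEP`.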
199+
200+
[discrete]
201+
[[esql-lookup-join-additional-examples]]
202+
===== Additional examples
203+
204+
Refer to the examples section of the <<esql-lookup-join,LOOKUP JOIN>> command reference for more examples.
139205

140206
[discrete]
141207
[[esql-lookup-join-prereqs]]
@@ -182,4 +248,4 @@ in the lookup index, or if the documents are too large. More precisely,
182248
`LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large
183249
amount of heap space is needed if the matching documents from the lookup
184250
index for a batch are multiple megabytes or larger. This is roughly the
185-
same as for `ENRICH`.
251+
same as for `ENRICH`.

docs/reference/esql/processing-commands/lookup.asciidoc

Lines changed: 4 additions & 2 deletions
@@ -9,10 +9,13 @@ changed or removed in a future release. Elastic will work to fix any
99
issues, but features in technical preview are not subject to the support
1010
SLA of official GA features.
1111
====
12+
1213
`LOOKUP JOIN` enables you to add data from another index, AKA a 'lookup'
1314
index, to your {esql} query results, simplifying data enrichment
1415
and analysis workflows.
1516

17+
See {ref}/esql-lookup-join.html[the high-level landing page] for an overview of the `LOOKUP JOIN` command, including use cases, prerequisites, and current limitations.
18+
1619
*Syntax*
1720

1821
[source,esql]
@@ -24,8 +27,7 @@ FROM <source_index>
2427
*Parameters*
2528

2629
`lookup_index`::
27-
The name of the lookup index. This must be a specific index name - wildcards, aliases, and remote cluster
28-
references are not supported.
30+
The name of the lookup index. This must be a specific index name - wildcards, aliases, and remote cluster references are not supported. Indices used for lookups must be configured with the `lookup` <<index-mode-setting,index mode setting>>.
2931

3032
`field_name`::
3133
The field to join on. This field must exist
