Commit 50d536f

[DOCS][ESQL][8.x] Cleanup and cross-reference LOOKUP JOIN reference and landing pages (#127316) (#127325)
* [DOCS][ESQL][8.x] Cleanup and cross-reference LOOKUP JOIN reference and landing pages
* Add missing id to fix linking problem
1 parent ffae8d7 commit 50d536f

2 files changed

+136
-65
lines changed

Lines changed: 132 additions & 63 deletions
@@ -1,9 +1,19 @@
11
=== LOOKUP JOIN
2-
32
++++
43
<titleabbrev>Correlate data with LOOKUP JOIN</titleabbrev>
54
++++
65

6+
// Hack: the page originally had no explicit ID that we could link to using internal link syntax
7+
[[esql-lookup-join-landing-page]]
8+
9+
[WARNING]
10+
====
11+
This functionality is in technical preview and may be
12+
changed or removed in a future release. Elastic will work to fix any
13+
issues, but features in technical preview are not subject to the support
14+
SLA of official GA features.
15+
====
16+
717
The {esql} <<esql-lookup-join,LOOKUP join>>
818
processing command combines data from your {esql} query results
919
table with matching records from a specified lookup index. It adds
@@ -23,6 +33,10 @@ your metrics data.
2333
* Tag logs with the owning team or escalation info for faster triage and
2434
incident response.
2535
36+
[discrete]
37+
[[esql-compare-with-enrich]]
38+
==== Compare with ENRICH
39+
2640
<<esql-lookup-join,LOOKUP join>> is similar to <<esql-enrich-data,ENRICH>>
2741
in the fact that they both help you join data together. You should use
2842
`LOOKUP JOIN` when:
@@ -37,105 +51,160 @@ in the fact that they both help you join data together. You should use
3751

3852
[discrete]
3953
[[esql-how-lookup-join-works]]
40-
==== How the `LOOKUP JOIN` command works
54+
==== How the command works
4155

42-
The `LOOKUP JOIN` command adds new columns to a table, with data from
43-
{es} indices.
56+
The `LOOKUP JOIN` command adds fields from the lookup index as new columns
57+
to your results table based on matching values in the join field.
4458

45-
image::images/esql/esql-lookup-join.png[align="center"]
59+
[source,esql]
60+
----
61+
LOOKUP JOIN <lookup_index> ON <field_name>
62+
----
63+
64+
The command requires two parameters:
4665

4766
[[esql-lookup-join-lookup-index]]
4867
lookup_index::
4968
The name of the lookup index. This must
5069
be a specific index name - wildcards, aliases, and remote cluster
5170
references are not supported. Indices used for lookups must be configured with the <<index-mode-setting,`lookup` mode>>.
5271

53-
5472
[[esql-lookup-join-field-name]]
5573
field_name::
5674
The field to join on. This field must exist
5775
in both your current query results and in the lookup index. If the field
5876
contains multi-valued entries, those entries will not match anything
5977
(the added fields will contain `null` for those rows).
6078

79+
image::images/esql/esql-lookup-join.png[align="center"]
80+
81+
If you're familiar with SQL, `LOOKUP JOIN` has left-join behavior. This means that
82+
if no rows match in the lookup index, the incoming row is retained and `null`s are added. If many rows in the lookup index match, `LOOKUP JOIN` adds one row per match.
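
The left-join behavior described above can be modeled in a few lines of Python. This is only an illustrative sketch of the semantics, not how {es} implements the command: `lookup_join` and its parameters are hypothetical names, rows are modeled as plain dicts, and skipping multi-valued keys on the lookup side is an assumption made for the sketch.

```python
def lookup_join(rows, lookup_docs, field):
    """Hypothetical model of LOOKUP JOIN's left-join semantics on lists of dicts."""
    # Index the lookup documents by join key.
    # Assumption: lookup documents with a missing or multi-valued key never match.
    index = {}
    for doc in lookup_docs:
        key = doc.get(field)
        if key is None or isinstance(key, list):
            continue
        index.setdefault(key, []).append(doc)

    # Columns that the join adds to every output row.
    added = sorted({name for doc in lookup_docs for name in doc if name != field})

    joined = []
    for row in rows:
        key = row.get(field)
        # Multi-valued (list) join values in the incoming row match nothing.
        single_valued = key is not None and not isinstance(key, list)
        matches = index.get(key, []) if single_valued else []
        if not matches:
            # No match: keep the incoming row and fill the added columns with None.
            joined.append({**row, **dict.fromkeys(added)})
        for doc in matches:
            # One output row per matching lookup document.
            joined.append({**row, **{name: doc.get(name) for name in added}})
    return joined


logs = [
    {"source.ip": "192.0.2.1", "action": "allow"},
    {"source.ip": "203.0.113.5", "action": "allow"},
    {"source.ip": ["198.51.100.2", "203.0.113.5"], "action": "block"},
]
threats = [
    {"source.ip": "203.0.113.5", "threat_level": "high"},
    {"source.ip": "203.0.113.5", "threat_level": "critical"},
]
result = lookup_join(logs, threats, "source.ip")
```

In this sketch the first and third log rows are kept with `threat_level` set to `None`, while the second log row appears twice, once per matching lookup document.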
83+
6184
[discrete]
6285
[[esql-lookup-join-example]]
6386
==== Example
6487

65-
`LOOKUP JOIN` has left-join behavior. If no rows match in the lookup index, `LOOKUP JOIN` retains the incoming row and adds nulls. If many rows in the lookup index match, `LOOKUP JOIN` adds one row per match.
88+
To see how it works, you can run this example yourself by setting up the indices and adding sample data. Otherwise, you can just inspect the query and response.
6689

67-
In this example, we have two sample tables:
90+
[discrete]
91+
[[esql-lookup-join-example-setup-sample-data]]
92+
===== Sample data
6893

69-
*employees*
94+
.*Expand for setup instructions*
95+
[%collapsible]
96+
==============
7097
71-
[cols=",,,,,",options="header",]
72-
|===
73-
|birth++_++date |emp++_++no |first++_++name |gender |hire++_++date
74-
|language
75-
|1955-10-04T00:00:00Z |10091 |Amabile |M |1992-11-18T00:00:00Z |3
98+
*Set up indices*
7699
77-
|1964-10-18T00:00:00Z |10092 |Valdiodio |F |1989-09-22T00:00:00Z |1
100+
First, let's create two indices with mappings: `threat_list` and `firewall_logs`.
101+
102+
[source,console]
103+
----
104+
PUT threat_list
105+
{
106+
"settings": {
107+
"index.mode": "lookup" <1>
108+
},
109+
"mappings": {
110+
"properties": {
111+
"source.ip": { "type": "ip" },
112+
"threat_level": { "type": "keyword" },
113+
"threat_type": { "type": "keyword" },
114+
"last_updated": { "type": "date" }
115+
}
116+
}
117+
}
118+
----
119+
<1> The lookup index must be set up using this mode
78120
79-
|1964-06-11T00:00:00Z |10093 |Sailaja |M |1996-11-05T00:00:00Z |3
121+
[source,console]
122+
----
123+
PUT firewall_logs
124+
{
125+
"mappings": {
126+
"properties": {
127+
"timestamp": { "type": "date" },
128+
"source.ip": { "type": "ip" },
129+
"destination.ip": { "type": "ip" },
130+
"action": { "type": "keyword" },
131+
"bytes_transferred": { "type": "long" }
132+
}
133+
}
134+
}
135+
----
80136
81-
|1957-05-25T00:00:00Z |10094 |Arumugam |F |1987-04-18T00:00:00Z |5
137+
*Add sample data*
82138
83-
|1965-01-03T00:00:00Z |10095 |Hilari |M |1986-07-15T00:00:00Z |4
84-
|===
139+
Next, let's add some sample data to both indices. The `threat_list` index contains known malicious IPs, while the `firewall_logs` index contains logs of network traffic.
85140
86-
*languages++_++non++_++unique++_++key*
141+
[source,console]
142+
----
143+
POST threat_list/_bulk
144+
{"index":{}}
145+
{"source.ip":"203.0.113.5","threat_level":"high","threat_type":"C2_SERVER","last_updated":"2025-04-22"}
146+
{"index":{}}
147+
{"source.ip":"198.51.100.2","threat_level":"medium","threat_type":"SCANNER","last_updated":"2025-04-23"}
148+
----
87149
88-
[cols=",,",options="header",]
89-
|===
90-
|language++_++code |language++_++name |country
91-
|1 |English |Canada
92-
|1 |English |
93-
|1 | |United Kingdom
94-
|1 |English |United States of America
95-
|2 |German |++[++Germany{vbar}Austria++]++
96-
|2 |German |Switzerland
97-
|2 |German |
98-
|4 |Quenya |
99-
|5 | |Atlantis
100-
|++[++6{vbar}7++]++ |Mv-Lang |Mv-Land
101-
|++[++7{vbar}8++]++ |Mv-Lang2 |Mv-Land2
102-
|Null-Lang |Null-Land |
103-
|Null-Lang2 |Null-Land2 |
104-
|===
150+
[source,console]
151+
----
152+
POST firewall_logs/_bulk
153+
{"index":{}}
154+
{"timestamp":"2025-04-23T10:00:01Z","source.ip":"192.0.2.1","destination.ip":"10.0.0.100","action":"allow","bytes_transferred":1024}
155+
{"index":{}}
156+
{"timestamp":"2025-04-23T10:00:05Z","source.ip":"203.0.113.5","destination.ip":"10.0.0.55","action":"allow","bytes_transferred":2048}
157+
{"index":{}}
158+
{"timestamp":"2025-04-23T10:00:08Z","source.ip":"198.51.100.2","destination.ip":"10.0.0.200","action":"block","bytes_transferred":0}
159+
{"index":{}}
160+
{"timestamp":"2025-04-23T10:00:15Z","source.ip":"203.0.113.5","destination.ip":"10.0.0.44","action":"allow","bytes_transferred":4096}
161+
{"index":{}}
162+
{"timestamp":"2025-04-23T10:00:30Z","source.ip":"192.0.2.1","destination.ip":"10.0.0.100","action":"allow","bytes_transferred":512}
163+
----
164+
==============
105165

106-
Running the following query would provide the results shown below.
166+
[discrete]
167+
[[esql-lookup-join-example-query]]
168+
===== Query the data
107169

108170
[source,esql]
109171
----
110-
FROM employees
111-
| EVAL language_code = emp_no % 10
112-
| LOOKUP JOIN languages_lookup_non_unique_key ON language_code
113-
| WHERE emp_no > 10090 AND emp_no < 10096
114-
| SORT emp_no, country
115-
| KEEP emp_no, language_code, language_name, country;
172+
FROM firewall_logs <1>
173+
| LOOKUP JOIN threat_list ON source.ip <2>
174+
| WHERE threat_level IS NOT NULL <3>
175+
| SORT timestamp <4>
176+
| KEEP source.ip, action, threat_level, threat_type <5>
177+
| LIMIT 10 <6>
116178
----
117179

118-
[cols=",,,",options="header",]
180+
<1> The source index
181+
<2> The lookup index and join field
182+
<3> Filter for rows with non-null threat levels
183+
<4> LOOKUP JOIN does not guarantee output order, so you must explicitly sort
184+
<5> Keep only relevant fields
185+
<6> Limit the output to 10 rows
186+
187+
[discrete]
188+
[[esql-lookup-join-example-response]]
189+
===== Response
190+
191+
A successful query will output a table like this:
192+
193+
[cols="4*",options="header"]
119194
|===
120-
|emp++_++no |language++_++code |language++_++name |country
121-
|10091 |1 |English |Canada
122-
|10091 |1 |null |United Kingdom
123-
|10091 |1 |English |United States of America
124-
|10091 |1 |English |null
125-
|10092 |2 |German |++[++Germany, Austria++]++
126-
|10092 |2 |German |Switzerland
127-
|10092 |2 |German |null
128-
|10093 |3 |null |null
129-
|10094 |4 |Spanish |null
130-
|10095 |5 |null |France
195+
|source.ip |action |threat_level |threat_type
196+
|203.0.113.5 |allow |high |C2_SERVER
197+
|198.51.100.2 |block |medium |SCANNER
198+
|203.0.113.5 |allow |high |C2_SERVER
131199
|===
132200

133-
[IMPORTANT]
134-
====
135-
`LOOKUP JOIN` does not guarantee the output to be in
136-
any particular order. If a certain order is required, users should use a
137-
<<esql-sort,`SORT`>> somewhere after the `LOOKUP JOIN`.
138-
====
201+
In this example, you can see that the `source.ip` field from the `firewall_logs` index is matched with the `source.ip` field in the `threat_list` index, and the corresponding `threat_level` and `threat_type` fields are added to the output.
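
As a cross-check of the example, the same pipeline can be traced by hand in plain Python. This is a rough sketch, not a substitute for running the query: `run_query` and `NO_MATCH` are hypothetical names, the lookup index is modeled as a dict keyed by `source.ip` (which works here only because each IP appears once in `threat_list`), and fields the query does not use are left out for brevity.

```python
# Sample data from the setup section, reduced to the fields the query touches.
threat_list = {
    "203.0.113.5": {"threat_level": "high", "threat_type": "C2_SERVER"},
    "198.51.100.2": {"threat_level": "medium", "threat_type": "SCANNER"},
}
firewall_logs = [
    {"timestamp": "2025-04-23T10:00:01Z", "source.ip": "192.0.2.1", "action": "allow"},
    {"timestamp": "2025-04-23T10:00:05Z", "source.ip": "203.0.113.5", "action": "allow"},
    {"timestamp": "2025-04-23T10:00:08Z", "source.ip": "198.51.100.2", "action": "block"},
    {"timestamp": "2025-04-23T10:00:15Z", "source.ip": "203.0.113.5", "action": "allow"},
    {"timestamp": "2025-04-23T10:00:30Z", "source.ip": "192.0.2.1", "action": "allow"},
]

# Columns added with nulls when a log row has no match in the lookup data.
NO_MATCH = {"threat_level": None, "threat_type": None}


def run_query(logs, threats, limit=10):
    # LOOKUP JOIN threat_list ON source.ip (left join: unmatched rows get None)
    joined = [{**log, **threats.get(log["source.ip"], NO_MATCH)} for log in logs]
    # WHERE threat_level IS NOT NULL
    flagged = [row for row in joined if row["threat_level"] is not None]
    # SORT timestamp (ISO 8601 strings in one format sort chronologically)
    flagged.sort(key=lambda row: row["timestamp"])
    # KEEP source.ip, action, threat_level, threat_type, then LIMIT
    keep = ("source.ip", "action", "threat_level", "threat_type")
    return [{name: row[name] for name in keep} for row in flagged[:limit]]


rows = run_query(firewall_logs, threat_list)
for row in rows:
    print(row)
```

The two log entries from `192.0.2.1` have no entry in the threat data, so the filter drops them and three rows remain, ordered by timestamp.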
202+
203+
[discrete]
204+
[[esql-lookup-join-additional-examples]]
205+
===== Additional examples
206+
207+
Refer to the examples section of the <<esql-lookup-join,LOOKUP JOIN>> command reference for more examples.
139208

140209
[discrete]
141210
[[esql-lookup-join-prereqs]]
@@ -182,4 +251,4 @@ in the lookup index, or if the documents are too large. More precisely,
182251
`LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large
183252
amount of heap space is needed if the matching documents from the lookup
184253
index for a batch are multiple megabytes or larger. This is roughly the
185-
same as for `ENRICH`.
254+
same as for `ENRICH`.

docs/reference/esql/processing-commands/lookup.asciidoc

Lines changed: 4 additions & 2 deletions
@@ -9,10 +9,13 @@ changed or removed in a future release. Elastic will work to fix any
99
issues, but features in technical preview are not subject to the support
1010
SLA of official GA features.
1111
====
12+
1213
`LOOKUP JOIN` enables you to add data from another index, AKA a 'lookup'
1314
index, to your {esql} query results, simplifying data enrichment
1415
and analysis workflows.
1516

17+
See <<esql-lookup-join-landing-page,the high-level landing page>> for an overview of the `LOOKUP JOIN` command, including use cases, prerequisites, and current limitations.
18+
1619
*Syntax*
1720

1821
[source,esql]
@@ -24,8 +27,7 @@ FROM <source_index>
2427
*Parameters*
2528

2629
`lookup_index`::
27-
The name of the lookup index. This must be a specific index name - wildcards, aliases, and remote cluster
28-
references are not supported.
30+
The name of the lookup index. This must be a specific index name - wildcards, aliases, and remote cluster references are not supported. Indices used for lookups must be configured with the `lookup` <<index-mode-setting,index mode setting>>.
2931

3032
`field_name`::
3133
The field to join on. This field must exist
