# Join data from multiple indices with `LOOKUP JOIN` [esql-lookup-join-reference]

The {{esql}} [`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) processing command combines data from your {{esql}} query results table with matching records from a specified lookup index. It adds fields from the lookup index as new columns to your results table based on matching values in the join field.

Teams often have data scattered across multiple indices, such as logs, IPs, user IDs, hosts, and employees. Without a direct way to enrich or correlate each event with reference data, root-cause analysis, security checks, and operational insights become time-consuming.

For example, you can use `LOOKUP JOIN` to:

* Retrieve environment or ownership details for each host to correlate your metrics data.
* Quickly see if any source IPs match known malicious addresses.
* Tag logs with the owning team or escalation info for faster triage and incident response.

[`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) is similar to [`ENRICH`](/reference/query-languages/esql/commands/processing-commands.md#esql-enrich) in that both help you join data together. You should use `LOOKUP JOIN` when:

* Your enrichment data changes frequently
* You want to avoid index-time processing
@@ -47,36 +30,27 @@ in the fact that they both help you join data together. You should use
## How the command works [esql-how-lookup-join-works]

The `LOOKUP JOIN` command adds fields from the lookup index as new columns to your results table based on matching values in the join field.

The command requires two parameters:

- The name of the lookup index (which must have the `lookup` [`index.mode setting`](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting))
- The name of the field to join on

```esql
LOOKUP JOIN <lookup_index> ON <field_name>
```

:::{image} ../images/esql-lookup-join.png
:alt: Illustration of the `LOOKUP JOIN` command, where the input table is joined with a lookup index to create an enriched output table.
:::
If you're familiar with SQL, `LOOKUP JOIN` has left-join behavior. This means that if no rows match in the lookup index, the incoming row is retained and `null` values are added for the lookup fields. If multiple rows in the lookup index match, `LOOKUP JOIN` adds one row per match.
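Because unmatched rows are retained with `null` values, one way to approximate inner-join behavior is to filter on a lookup field after the join. The following is a minimal sketch that reuses the sample `firewall_logs` and `threat_list` indices from this guide. It assumes `threat_level` is populated for every document in `threat_list`, which may not hold for your own data.

```esql
FROM firewall_logs
| LOOKUP JOIN threat_list ON source.ip
// Rows with no match carry null in the lookup columns,
// so this filter keeps only rows that found a match.
| WHERE threat_level IS NOT NULL
```

If a lookup field can legitimately be `null` in matching documents, this filter would also drop those matches, so pick a field that is always set.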
## Example

You can run this example yourself to see how it works by setting up the indices and adding sample data.

### Sample data

:::{dropdown} Expand for setup instructions

**Set up indices**
@@ -99,7 +73,6 @@ PUT threat_list
}
}
```
```console
PUT firewall_logs
{
@@ -117,9 +90,7 @@ PUT firewall_logs
**Add sample data**

Next, let's add some sample data to both indices. The `threat_list` index contains known malicious IPs, while the `firewall_logs` index contains logs of network traffic.
@@ -158,38 +128,31 @@ FROM firewall_logs # The source index
### Response

A successful query outputs a table. In this example, the `source.ip` field from the `firewall_logs` index is matched with the `source.ip` field in the `threat_list` index, and the corresponding `threat_level` and `threat_type` fields are added to the output.

Refer to the examples section of the [`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) command reference for more examples.
## Prerequisites [esql-lookup-join-prereqs]

### Index configuration

Indices used for lookups must be configured with the [`lookup` index mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting).

### Data type compatibility

Join keys must have compatible data types between the source and lookup indices. Types within the same compatibility group can be joined together:

| **Keyword family** | `keyword`, `text.keyword` | Text fields can only be used as the join key on the left-hand side and must have a `.keyword` subfield |
| **Date (Exact)** | `date` | Must match exactly |
| **Date Nanos (Exact)** | `date_nanos` | Must match exactly |
@@ -201,9 +164,7 @@ To obtain a join key with a compatible type, use a [conversion function](/refere
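As a rough illustration of using a conversion function to get a compatible join key, the sketch below converts a numeric key to a string before joining. The `app_logs` and `hosts_lookup` indices and the `host_id` field are hypothetical names that do not come from this guide. The sketch assumes `host_id` is numeric in the source index and `keyword` in the lookup index.

```esql
FROM app_logs
// Hypothetical: host_id is numeric here but keyword in hosts_lookup,
// so convert it before the join to land in the same compatibility group.
| EVAL host_id = TO_STRING(host_id)
| LOOKUP JOIN hosts_lookup ON host_id
```

The same pattern should apply with other conversion functions when the lookup side uses a different type; check the compatibility table for the groups involved.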
### Unsupported Types

In addition to the [{{esql}} unsupported field types](/reference/query-languages/esql/limitations.md#_unsupported_types), `LOOKUP JOIN` does not support:

* `VERSION`
* `UNSIGNED_LONG`
@@ -216,14 +177,11 @@ For a complete list of all types supported in `LOOKUP JOIN`, refer to the [`LOOK
## Usage notes

This section covers important details about `LOOKUP JOIN` that impact query behavior and results. Review these details to ensure your queries work as expected and to troubleshoot unexpected results.

### Handling name collisions

When fields from the lookup index match existing column names, the new columns override the existing ones.
Before the `LOOKUP JOIN` command, preserve columns by either:

* Using `RENAME` to assign non-conflicting names
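
A minimal sketch of the `RENAME` approach, assuming a hypothetical `region` field that exists in both `firewall_logs` and `threat_list`. The sample mappings in this guide may not include such a field, so treat the name as illustrative only.

```esql
FROM firewall_logs
// Hypothetical: move the source-side value aside so the lookup's
// region column does not replace it during the join.
| RENAME region AS log_region
| LOOKUP JOIN threat_list ON source.ip
| KEEP source.ip, log_region, region
```

After the join, `log_region` would hold the value from the source index and `region` the value from the lookup index.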
@@ -239,24 +197,11 @@ any `LOOKUP JOIN`s.
The following are the current limitations with `LOOKUP JOIN`:

* Indices in [`lookup` mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting) are always single-sharded.
* Cross-cluster search is unsupported initially. Both source and lookup indices must be local.
* Currently, only matching on equality is supported.
* `LOOKUP JOIN` can only use a single match field and a single index. Wildcards are not supported.
* Aliases, datemath, and data streams are supported, as long as the index pattern matches a single concrete index {applies_to}`stack: ga 9.1.0`.
* The limitation on matching on a single field is removed. You can use a comma-separated list of fields in the `ON` clause {applies_to}`stack: ga 9.2.0`.
* The name of the match field in `LOOKUP JOIN lu_idx ON match_field` must match an existing field in the query. This may require `RENAME`s or `EVAL`s to achieve.
* The query will circuit break if there are too many matching documents in the lookup index, or if the documents are too large. More precisely, `LOOKUP JOIN` normally works in batches of about 10,000 rows; a large amount of heap space is needed if the matching documents from the lookup index for a batch are multiple megabytes or larger. This is roughly the same as for `ENRICH`.
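
The requirement that the match field already exists in the query can be sketched as follows. The `proxy_logs` index and its `client_ip` field are hypothetical names used only for illustration. The sketch assumes `client_ip` has a type compatible with `source.ip` in the `threat_list` lookup index.

```esql
FROM proxy_logs
// Hypothetical: the address is stored as client_ip in this index,
// but threat_list is joined on source.ip, so align the name first.
| RENAME client_ip AS source.ip
| LOOKUP JOIN threat_list ON source.ip
```

On versions where the `ON` clause accepts a comma-separated list, the same renaming step would apply to each field in the list.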