# Join data from multiple indices with `LOOKUP JOIN` [esql-lookup-join-reference]

The {{esql}} [`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) processing command combines data from your {{esql}} query results table with matching records from a specified lookup index. It adds fields from the lookup index as new columns to your results table based on matching values in the join field.

Teams often have data scattered across multiple indices, such as logs, IPs, user IDs, hosts, and employees. Without a direct way to enrich or correlate each event with reference data, root-cause analysis, security checks, and operational insights become time-consuming.

For example, you can use `LOOKUP JOIN` to:

* Retrieve environment or ownership details for each host to correlate your metrics data.
* Quickly see if any source IPs match known malicious addresses.
* Tag logs with the owning team or escalation info for faster triage and incident response.

## Compare with `ENRICH`

[`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) is similar to [`ENRICH`](/reference/query-languages/esql/commands/processing-commands.md#esql-enrich) in that both help you join data together. Use `LOOKUP JOIN` when:

* Your enrichment data changes frequently
* You want to avoid index-time processing
@@ -30,27 +47,36 @@ For example, you can use `LOOKUP JOIN` to:
## How the command works [esql-how-lookup-join-works]

The `LOOKUP JOIN` command adds fields from the lookup index as new columns to your results table based on matching values in the join field.

The command requires two parameters:

- The name of the lookup index (which must have the `lookup` [`index.mode setting`](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting))

:alt: Illustration of the `LOOKUP JOIN` command, where the input table is joined with a lookup index to create an enriched output table.
:::

If you're familiar with SQL, `LOOKUP JOIN` has left-join behavior. This means that if no rows match in the lookup index, the incoming row is retained and `null`s are added. If many rows in the lookup index match, `LOOKUP JOIN` adds one row per match.
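
As a minimal sketch of this left-join behavior, the query below uses the `firewall_logs` and `threat_list` indices and the `source.ip`, `threat_level`, and `threat_type` fields from the example in this document. Rows whose `source.ip` has no match in `threat_list` should still appear, with the lookup columns set to `null`. The commented-out `WHERE` line is an optional filter, not part of the command itself.

```esql
FROM firewall_logs
| LOOKUP JOIN threat_list ON source.ip
// Unmatched rows are retained, with null threat_level and threat_type.
// Uncomment the next line to keep only rows that found a match:
// | WHERE threat_level IS NOT NULL
| KEEP source.ip, threat_level, threat_type
```

Adding the filter after the join gives output closer to an SQL inner join, because it discards the rows that were padded with `null`s.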

## Example

To see how it works, you can run this example yourself by setting up the indices and adding sample data.

### Sample data

:::{dropdown} Expand for setup instructions

**Set up indices**
@@ -73,6 +99,7 @@ PUT threat_list
  }
}
```

```console
PUT firewall_logs
{
@@ -90,7 +117,9 @@ PUT firewall_logs

**Add sample data**

Next, let's add some sample data to both indices. The `threat_list` index contains known malicious IPs, while the `firewall_logs` index contains logs of network traffic.
@@ -128,31 +158,38 @@ FROM firewall_logs # The source index

### Response

A successful query will output a table. In this example, you can see that the `source.ip` field from the `firewall_logs` index is matched with the `source.ip` field in the `threat_list` index, and the corresponding `threat_level` and `threat_type` fields are added to the output.

Refer to the examples section of the [`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) command reference for more examples.

Indices used for lookups must be configured with the [`lookup` index mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting).

### Data type compatibility

Join keys must have compatible data types between the source and lookup indices. Types within the same compatibility group can be joined together:

| **Keyword family** | `keyword`, `text.keyword` | Text fields are supported only as the join key on the left-hand side, and must have a `.keyword` subfield |
| **Date (Exact)** | `date` | Must match exactly |
| **Date Nanos (Exact)** | `date_nanos` | Must match exactly |
@@ -164,7 +201,9 @@ To obtain a join key with a compatible type, use a [conversion function](/refere
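
When the join key has different types on each side, a conversion function can bring the left-hand key into a compatible type before the join. The sketch below is hypothetical: the `app_logs` and `hosts_lookup` indices and the `host_id` field are illustrative names that do not appear elsewhere in this document, and it assumes `host_id` is an integer in `app_logs` but a `keyword` in `hosts_lookup`.

```esql
FROM app_logs
// Hypothetical: host_id is an integer here but a keyword in hosts_lookup,
// so convert it to a string before joining.
| EVAL host_id = TO_STRING(host_id)
| LOOKUP JOIN hosts_lookup ON host_id
```

Converting on the query side leaves the lookup index unchanged, which is convenient when several queries share the same lookup data.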

### Unsupported Types

In addition to the [{{esql}} unsupported field types](/reference/query-languages/esql/limitations.md#_unsupported_types), `LOOKUP JOIN` does not support:

* `VERSION`
* `UNSIGNED_LONG`
@@ -177,11 +216,14 @@ For a complete list of all types supported in `LOOKUP JOIN`, refer to the [`LOOK

## Usage notes

This section covers important details about `LOOKUP JOIN` that impact query behavior and results. Review these details to ensure your queries work as expected and to troubleshoot unexpected results.

### Handling name collisions

When fields from the lookup index match existing column names, the new columns override the existing ones.
Before the `LOOKUP JOIN` command, preserve columns by either:

* Using `RENAME` to assign non-conflicting names
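
As a sketch of the `RENAME` approach, assume that both `firewall_logs` and `threat_list` contain a field named `region`. That field is hypothetical and is used here only to illustrate the collision. Renaming the source column before the join keeps both values in the output.

```esql
FROM firewall_logs
// Hypothetical: both indices contain a `region` field.
// Rename the source column first so the lookup column does not override it.
| RENAME region AS log_region
| LOOKUP JOIN threat_list ON source.ip
| KEEP source.ip, log_region, region
```

After the join, `log_region` holds the value from the source index and `region` holds the value from the lookup index.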
@@ -197,10 +239,24 @@ any `LOOKUP JOIN`s.

The following are the current limitations with `LOOKUP JOIN`:

* Indices in [`lookup` mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting) are always single-sharded.
* Cross cluster search is unsupported initially. Both source and lookup indices must be local.
* Currently, only matching on equality is supported.
* `LOOKUP JOIN` can only use a single match field and a single index. Wildcards are not supported.
* Aliases, date math, and data streams are supported, as long as the index pattern matches a single concrete index {applies_to}`stack: ga 9.1.0`.
* The single-match-field limitation is removed: you can use a comma-separated list of fields in the `ON` clause {applies_to}`stack: ga 9.2.0`.
* The name of the match field in `LOOKUP JOIN lu_idx ON match_field` must match an existing field in the query. This may require `RENAME`s or `EVAL`s to achieve.
* The query will circuit break if there are too many matching documents in the lookup index, or if the documents are too large. More precisely, `LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large amount of heap space is needed if the matching documents from the lookup index for a batch are multiple megabytes or larger. This is roughly the same as for `ENRICH`.
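
The match-field naming requirement can be sketched as follows. The `lu_idx` and `match_field` names come from the limitation above; the `source_idx` index and the `src_field`, `field_a`, and `field_b` fields are hypothetical placeholders. In the first query, the source column is renamed so that it matches the lookup index's field name before the join.

```esql
FROM source_idx
// Hypothetical: the query has `src_field`, but lu_idx names the key `match_field`.
| RENAME src_field AS match_field
| LOOKUP JOIN lu_idx ON match_field
```

On versions where the `ON` clause accepts a comma-separated list of fields, a multi-field join might look like this, assuming both `field_a` and `field_b` exist in the query and in `lu_idx`:

```esql
FROM source_idx
| LOOKUP JOIN lu_idx ON field_a, field_b
```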