Commit 43aa7e1

Fix docs
1 parent 71adaa8 commit 43aa7e1

1 file changed: 33 additions, 88 deletions

@@ -1,41 +1,24 @@
 ---
 navigation_title: "Join data with LOOKUP JOIN"
 mapped_pages:
-- https://www.elastic.co/guide/en/elasticsearch/reference/8.18/_lookup_join.html
+- https://www.elastic.co/guide/en/elasticsearch/reference/8.18/_lookup_join.html
 ---
 
-# Join data from multiple indices with
+# Join data from multiple indices with `LOOKUP JOIN` [esql-lookup-join-reference]
 
-`LOOKUP JOIN` [esql-lookup-join-reference]
+The {{esql}} [`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) processing command combines data from your {{esql}} query results table with matching records from a specified lookup index. It adds fields from the lookup index as new columns to your results table based on matching values in the join field.
 
-The {{esql}} [
-`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join)
-processing command combines data from your {{esql}} query results table with
-matching records from a specified lookup index. It adds fields from the lookup
-index as new columns to your results table based on matching values in the join
-field.
-
-Teams often have data scattered across multiple indices – like logs, IPs, user
-IDs, hosts, employees etc. Without a direct way to enrich or correlate each
-event with reference data, root-cause analysis, security checks, and operational
-insights become time-consuming.
+Teams often have data scattered across multiple indices – like logs, IPs, user IDs, hosts, employees etc. Without a direct way to enrich or correlate each event with reference data, root-cause analysis, security checks, and operational insights become time-consuming.
 
 For example, you can use `LOOKUP JOIN` to:
 
-* Retrieve environment or ownership details for each host to correlate your
-metrics data.
+* Retrieve environment or ownership details for each host to correlate your metrics data.
 * Quickly see if any source IPs match known malicious addresses.
-* Tag logs with the owning team or escalation info for faster triage and
-incident response.
+* Tag logs with the owning team or escalation info for faster triage and incident response.
 
 ## Compare with `ENRICH`
 
-[
-`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join)
-is similar to [
-`ENRICH`](/reference/query-languages/esql/commands/processing-commands.md#esql-enrich)
-in the fact that they both help you join data together. You should use
-`LOOKUP JOIN` when:
+[`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) is similar to [`ENRICH`](/reference/query-languages/esql/commands/processing-commands.md#esql-enrich) in the fact that they both help you join data together. You should use `LOOKUP JOIN` when:
 
 * Your enrichment data changes frequently
 * You want to avoid index-time processing
@@ -47,36 +30,27 @@ in the fact that they both help you join data together. You should use
 
 ## How the command works [esql-how-lookup-join-works]
 
-The `LOOKUP JOIN` command adds fields from the lookup index as new columns to
-your results table based on matching values in the join field.
+The `LOOKUP JOIN` command adds fields from the lookup index as new columns to your results table based on matching values in the join field.
 
 The command requires two parameters:
-
-- The name of the lookup index (which must have the `lookup` [
-`index.mode setting`](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting))
+- The name of the lookup index (which must have the `lookup` [`index.mode setting`](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting))
 - The name of the field to join on
 
 ```esql
 LOOKUP JOIN <lookup_index> ON <field_name>
 ```
 
 :::{image} ../images/esql-lookup-join.png
-:alt: Illustration of the `LOOKUP JOIN` command, where the input table is joined
-with a lookup index to create an enriched output table.
+:alt: Illustration of the `LOOKUP JOIN` command, where the input table is joined with a lookup index to create an enriched output table.
 :::
 
-If you're familiar with SQL, `LOOKUP JOIN` has left-join behavior. This means
-that if no rows match in the lookup index, the incoming row is retained and
-`null`s are added. If many rows in the lookup index match, `LOOKUP JOIN` adds
-one row per match.
+If you're familiar with SQL, `LOOKUP JOIN` has left-join behavior. This means that if no rows match in the lookup index, the incoming row is retained and `null`s are added. If many rows in the lookup index match, `LOOKUP JOIN` adds one row per match.
 
 ## Example
 
-You can run this example for yourself if you'd like to see how it works, by
-setting up the indices and adding sample data.
+You can run this example for yourself if you'd like to see how it works, by setting up the indices and adding sample data.
 
 ### Sample data
-
 :::{dropdown} Expand for setup instructions
 
 **Set up indices**
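
To make the left-join behavior described in the hunk above concrete, here is a minimal sketch against the `firewall_logs` and `threat_list` sample indices that this example sets up. It assumes `threat_list` supplies the `threat_level` column shown in the response table. Rows from `firewall_logs` whose `source.ip` has no match are retained with `null` in the lookup columns, so filtering on `IS NULL` isolates them.

```esql
FROM firewall_logs
| LOOKUP JOIN threat_list ON source.ip
// Left-join behavior: unmatched rows are kept, with null in the lookup columns
| WHERE threat_level IS NULL
| KEEP source.ip, action, threat_level
```

Switching the filter to `IS NOT NULL` would instead keep only the rows that matched an entry in `threat_list`.
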
@@ -99,7 +73,6 @@ PUT threat_list
 }
 }
 ```
-
 ```console
 PUT firewall_logs
 {
@@ -117,9 +90,7 @@ PUT firewall_logs
 
 **Add sample data**
 
-Next, let's add some sample data to both indices. The `threat_list` index
-contains known malicious IPs, while the `firewall_logs` index contains logs of
-network traffic.
+Next, let's add some sample data to both indices. The `threat_list` index contains known malicious IPs, while the `firewall_logs` index contains logs of network traffic.
 
 ```console
 POST threat_list/_bulk
@@ -142,7 +113,6 @@ POST firewall_logs/_bulk
 {"index":{}}
 {"timestamp":"2025-04-23T10:00:30Z","source.ip":"192.0.2.1","destination.ip":"10.0.0.100","action":"allow","bytes_transferred":512}
 ```
-
 :::
 
 ### Query the data
@@ -158,38 +128,31 @@ FROM firewall_logs # The source index
 
 ### Response
 
-A successful query will output a table. In this example, you can see that the
-`source.ip` field from the `firewall_logs` index is matched with the `source.ip`
-field in the `threat_list` index, and the corresponding `threat_level` and
-`threat_type` fields are added to the output.
+A successful query will output a table. In this example, you can see that the `source.ip` field from the `firewall_logs` index is matched with the `source.ip` field in the `threat_list` index, and the corresponding `threat_level` and `threat_type` fields are added to the output.
 
-| source.ip | action | threat_type | threat_level |
-|--------------|--------|-------------|--------------|
-| 203.0.113.5 | allow | C2_SERVER | high |
-| 198.51.100.2 | block | SCANNER | medium |
-| 203.0.113.5 | allow | C2_SERVER | high |
+|source.ip|action|threat_type|threat_level|
+|---|---|---|---|
+|203.0.113.5|allow|C2_SERVER|high|
+|198.51.100.2|block|SCANNER|medium|
+|203.0.113.5|allow|C2_SERVER|high|
 
 ### Additional examples
 
-Refer to the examples section of the [
-`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join)
-command reference for more examples.
+Refer to the examples section of the [`LOOKUP JOIN`](/reference/query-languages/esql/commands/processing-commands.md#esql-lookup-join) command reference for more examples.
 
 ## Prerequisites [esql-lookup-join-prereqs]
 
 ### Index configuration
 
-Indices used for lookups must be configured with the [
-`lookup` index mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting).
+Indices used for lookups must be configured with the [`lookup` index mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting).
 
 ### Data type compatibility
 
-Join keys must have compatible data types between the source and lookup indices.
-Types within the same compatibility group can be joined together:
+Join keys must have compatible data types between the source and lookup indices. Types within the same compatibility group can be joined together:
 
 | Compatibility group | Types | Notes |
 |------------------------|-------------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
-| **Numeric family** | `byte`, `short`, `integer`, `long`, `half_float`, `float`, `scaled_float`, `double` | All compatible |
+| **Numeric family** | `byte`, `short`, `integer`, `long`, `half_float`, `float`, `scaled_float`, `double` | All compatible |
 | **Keyword family** | `keyword`, `text.keyword` | Text fields only as join key on left-hand side and must have `.keyword` subfield |
 | **Date (Exact)** | `date` | Must match exactly |
 | **Date Nanos (Exact)** | `date_nanos` | Must match exactly |
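
When the join key types in the two indices fall into different compatibility groups, the key has to be converted before the join. The following is a hedged sketch, not taken from the original page: it assumes a hypothetical source index `app_logs` whose `user_id` field is a `keyword`, and a hypothetical lookup index `users_lookup` whose `user_id` field is an `integer`.

```esql
FROM app_logs
// Hypothetical: user_id is a keyword here but an integer in users_lookup,
// so convert it into the numeric family before joining
| EVAL user_id = TO_INTEGER(user_id)
| LOOKUP JOIN users_lookup ON user_id
```

Because all types in the numeric family are compatible with each other, converting to any numeric type would be enough in this scenario.
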
@@ -201,9 +164,7 @@ To obtain a join key with a compatible type, use a [conversion function](/refere
 
 ### Unsupported Types
 
-In addition to
-the [{{esql}} unsupported field types](/reference/query-languages/esql/limitations.md#_unsupported_types),
-`LOOKUP JOIN` does not support:
+In addition to the [{{esql}} unsupported field types](/reference/query-languages/esql/limitations.md#_unsupported_types), `LOOKUP JOIN` does not support:
 
 * `VERSION`
 * `UNSIGNED_LONG`
@@ -216,14 +177,11 @@ For a complete list of all types supported in `LOOKUP JOIN`, refer to the [`LOOK
 
 ## Usage notes
 
-This section covers important details about `LOOKUP JOIN` that impact query
-behavior and results. Review these details to ensure your queries work as
-expected and to troubleshoot unexpected results.
+This section covers important details about `LOOKUP JOIN` that impact query behavior and results. Review these details to ensure your queries work as expected and to troubleshoot unexpected results.
 
 ### Handling name collisions
 
-When fields from the lookup index match existing column names, the new columns
-override the existing ones.
+When fields from the lookup index match existing column names, the new columns override the existing ones.
 Before the `LOOKUP JOIN` command, preserve columns by either:
 
 * Using `RENAME` to assign non-conflicting names
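
A minimal sketch of the `RENAME` approach from the hunk above. The index and field names are hypothetical: it assumes a source index `host_metrics` and a lookup index `host_inventory` that both contain a field named `environment`, joined on `host.name`.

```esql
FROM host_metrics
// Preserve the source value before the lookup column overrides it
| RENAME environment AS reported_environment
| LOOKUP JOIN host_inventory ON host.name
| KEEP host.name, reported_environment, environment
```

After the join, `environment` holds the value from the lookup index, while `reported_environment` still carries the value from the source index.
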
@@ -239,24 +197,11 @@ any `LOOKUP JOIN`s.
 
 The following are the current limitations with `LOOKUP JOIN`:
 
-* Indices in [
-`lookup` mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting)
-are always single-sharded.
-* Cross cluster search is unsupported initially. Both source and lookup indices
-must be local.
+* Indices in [`lookup` mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting) are always single-sharded.
+* Cross cluster search is unsupported initially. Both source and lookup indices must be local.
 * Currently, only matching on equality is supported.
-* `LOOKUP JOIN` can only use a single match field and a single index. Wildcards
-are not supported.
-* Aliases, datemath, and datastreams are supported, as long as the index
-pattern matches a single concrete index {applies_to}`stack: ga 9.1.0`.
-* Limitation on matching on a single field is removed. You can use a
-comma separated list of fields in the `ON` clause
-{applies_to}`stack: ga 9.2.0`.
-* The name of the match field in `LOOKUP JOIN lu_idx ON match_field` must match
-an existing field in the query. This may require `RENAME`s or `EVAL`s to
-achieve.
-* The query will circuit break if there are too many matching documents in the
-lookup index, or if the documents are too large. More precisely, `LOOKUP JOIN`
-works in batches of, normally, about 10,000 rows; a large amount of heap space
-is needed if the matching documents from the lookup index for a batch are
-multiple megabytes or larger. This is roughly the same as for `ENRICH`.
+* `LOOKUP JOIN` can only use a single match field and a single index. Wildcards are not supported.
+* Aliases, datemath, and datastreams are supported, as long as the index pattern matches a single concrete index {applies_to}`stack: ga 9.1.0`.
+* Limitation on matching on a single field is removed. You can use a comma separated list of fields in the `ON` clause {applies_to}`stack: ga 9.2.0`.
+* The name of the match field in `LOOKUP JOIN lu_idx ON match_field` must match an existing field in the query. This may require `RENAME`s or `EVAL`s to achieve.
+* The query will circuit break if there are too many matching documents in the lookup index, or if the documents are too large. More precisely, `LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large amount of heap space is needed if the matching documents from the lookup index for a batch are multiple megabytes or larger. This is roughly the same as for `ENRICH`.
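
The match-field limitation listed above can be sketched as follows. The source index `app_logs` and its `client_ip` field are hypothetical. The sketch assumes that `client_ip` holds the same kind of address that the `threat_list` sample index stores in `source.ip`, so the column is renamed to match before the join.

```esql
FROM app_logs
// Hypothetical: the source calls the address client_ip,
// while the lookup index calls it source.ip
| RENAME client_ip AS source.ip
| LOOKUP JOIN threat_list ON source.ip
```

On versions where a comma-separated list of match fields is supported in the `ON` clause, the same requirement would apply to each field in the list.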
