|
| 1 | +--- |
| 2 | +navigation_title: "Correlate data with LOOKUP JOIN" |
| 3 | +mapped_pages: |
| 4 | + - https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-enrich-data.html |
| 5 | +--- |
| 6 | + |
| 7 | +# LOOKUP JOIN [esql-lookup-join-reference] |
| 8 | + |
| 9 | +The {{esql}} [`LOOKUP JOIN`](/reference/query-languages/esql/esql-commands.md#esql-lookup-join) processing command combines data from your {{esql}} query results table with matching records from a specified lookup index. It adds fields from the lookup index as new columns to your results table based on matching values in the join field. |
| 10 | + |
| 11 | +Teams often have data scattered across multiple indices, such as logs, IPs, user IDs, hosts, and employees. Without a direct way to enrich or correlate each event with reference data, root-cause analysis, security checks, and operational insights become time-consuming. |
| 12 | + |
| 13 | +For example, you can use `LOOKUP JOIN` to: |
| 14 | + |
| 15 | +* Retrieve environment or ownership details for each host to correlate your metrics data. |
| 16 | +* Quickly see if any source IPs match known malicious addresses. |
| 17 | +* Tag logs with the owning team or escalation info for faster triage and incident response. |
| 18 | + |
| 19 | +[`LOOKUP JOIN`](/reference/query-languages/esql/esql-commands.md#esql-lookup-join) is similar to [`ENRICH`](/reference/query-languages/esql/esql-commands.md#esql-enrich) in that both help you join data together. You should use `LOOKUP JOIN` when: |
| 20 | + |
| 21 | +* Your enrichment data changes frequently |
| 22 | +* You want to avoid index-time processing |
| 23 | +* You're working with regular indices |
| 24 | +* You need to preserve distinct matches |
| 25 | +* You need to match on any field in a lookup index |
| 26 | +* You use document or field level security |
| 27 | +* You want to restrict users to specific lookup indices that they can use |
| 28 | + |
| 29 | +## How the `LOOKUP JOIN` command works [esql-how-lookup-join-works] |
| 30 | + |
| 31 | +The `LOOKUP JOIN` command adds new columns to a table, with data from {{es}} indices. |
| 32 | + |
| 33 | +:::{image} ../../../images/esql-lookup-join.png |
| 34 | +:alt: esql lookup join |
| 35 | +::: |
| 36 | + |
| 37 | +`<lookup_index>` |
| 38 | +: The name of the lookup index. This must be a specific index name - wildcards, aliases, and remote cluster references are not supported. |
| 39 | + |
| 40 | +`<field_name>` |
| 41 | +: The field to join on. This field must exist in both your current query results and in the lookup index. If the field contains multi-valued entries, those entries will not match anything (the added fields will contain `null` for those rows). |
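| | + |
| | +As a minimal sketch, the general shape of the command is shown below. `<source_index>` is a placeholder introduced here for whatever produces your current query results; it is not part of the `LOOKUP JOIN` syntax itself. |
| | + |
| | +```esql |
| | +FROM <source_index> |
| | +| LOOKUP JOIN <lookup_index> ON <field_name> |
| | +``` |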
| 42 | + |
| 43 | +## Example |
| 44 | + |
| 45 | +`LOOKUP JOIN` has left-join behavior. If no rows match in the lookup index, `LOOKUP JOIN` retains the incoming row and adds `null`s. If many rows in the lookup index match, `LOOKUP JOIN` adds one row per match. |
| 46 | + |
| 47 | +In this example, we have two sample tables: |
| 48 | + |
| 49 | +**employees** |
| 50 | + |
| 51 | +| birth_date|emp_no|first_name|gender|hire_date|language| |
| 52 | +|---|---|---|---|---|---| |
| 53 | +|1955-10-04T00:00:00Z|10091|Amabile |M|1992-11-18T00:00:00Z|3| |
| 54 | +|1964-10-18T00:00:00Z|10092|Valdiodio |F|1989-09-22T00:00:00Z|1| |
| 55 | +|1964-06-11T00:00:00Z|10093|Sailaja |M|1996-11-05T00:00:00Z|3| |
| 56 | +|1957-05-25T00:00:00Z|10094|Arumugam |F|1987-04-18T00:00:00Z|5| |
| 57 | +|1965-01-03T00:00:00Z|10095|Hilari |M|1986-07-15T00:00:00Z|4| |
| 58 | + |
| 59 | +**languages_lookup_non_unique_key** |
| 60 | + |
| 61 | +|language_code|language_name|country| |
| 62 | +|---|---|---| |
| 63 | +|1|English|Canada| |
| 64 | +|1|English| |
| 65 | +|1||United Kingdom| |
| 66 | +|1|English|United States of America| |
| 67 | +|2|German|[Germany\|Austria]| |
| 68 | +|2|German|Switzerland| |
| 69 | +|2|German| |
| 70 | +|4|Spanish| |
| 71 | +|5||France| |
| 72 | +|[6\|7]|Mv-Lang|Mv-Land| |
| 73 | +|[7\|8]|Mv-Lang2|Mv-Land2| |
| 74 | +||Null-Lang|Null-Land| |
| 75 | +||Null-Lang2|Null-Land2| |
| 76 | + |
| 77 | +Running the following query would provide the results shown below. |
| 78 | + |
| 79 | +```esql |
| 80 | +FROM employees |
| 81 | +| EVAL language_code = emp_no % 10 |
| 82 | +| LOOKUP JOIN languages_lookup_non_unique_key ON language_code |
| 83 | +| WHERE emp_no > 10090 AND emp_no < 10096 |
| 84 | +| SORT emp_no, country |
| 85 | +| KEEP emp_no, language_code, language_name, country |
| 86 | +``` |
| 87 | + |
| 88 | +|emp_no|language_code|language_name|country| |
| 89 | +|---|---|---|---| |
| 90 | +| 10091 | 1 | English | Canada| |
| 91 | +| 10091 | 1 | null | United Kingdom| |
| 92 | +| 10091 | 1 | English | United States of America| |
| 93 | +| 10091 | 1 | English | null| |
| 94 | +| 10092 | 2 | German | [Germany, Austria]| |
| 95 | +| 10092 | 2 | German | Switzerland| |
| 96 | +| 10092 | 2 | German | null| |
| 97 | +| 10093 | 3 | null | null| |
| 98 | +| 10094 | 4 | Spanish | null| |
| 99 | +| 10095 | 5 | null | France| |
| 100 | + |
| 101 | +::::{important} |
| 102 | +`LOOKUP JOIN` does not guarantee the output to be in any particular order. If you need a specific order, add a [`SORT`](/reference/query-languages/esql/esql-commands.md#esql-sort) after the `LOOKUP JOIN`. |
| 103 | + |
| 104 | +:::: |
| 105 | + |
| 106 | +## Prerequisites [esql-lookup-join-prereqs] |
| 107 | + |
| 108 | +To use `LOOKUP JOIN`, the following requirements must be met: |
| 109 | + |
| 110 | +* **Compatible data types**: The join key and join field in the lookup index must have compatible data types. This means: |
| 111 | + * The data types must either be identical or be internally represented as the same type in Elasticsearch's type system |
| 112 | + * Numeric types follow these compatibility rules: |
| 113 | + * `short` and `byte` are compatible with `integer` (all represented as `int`) |
| 114 | + * `float`, `half_float`, and `scaled_float` are compatible with `double` (all represented as `double`) |
| 115 | + * For text fields: You can use text fields on the left-hand side of the join only if they have a `.keyword` subfield |
| 116 | + |
| 117 | +For a complete list of supported data types and their internal representations, see the [Supported Field Types documentation](/reference/query-languages/esql/limitations.md#_supported_types). |
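| | + |
| | +As an illustrative sketch, suppose the `language` field in `employees` were mapped as `keyword` while `language_code` in the lookup index is an `integer` (both mappings are assumptions for this example). Converting the key with `TO_INTEGER` before the join gives both sides the same type: |
| | + |
| | +```esql |
| | +FROM employees |
| | +// Assumption: `language` is a keyword such as "3"; convert it to an integer join key |
| | +| EVAL language_code = TO_INTEGER(language) |
| | +| LOOKUP JOIN languages_lookup_non_unique_key ON language_code |
| | +| KEEP emp_no, language_code, language_name |
| | +``` |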
| 118 | + |
| 119 | +## Limitations |
| 120 | + |
| 121 | +The following are the current limitations of `LOOKUP JOIN`: |
| 122 | + |
| 123 | +* `LOOKUP JOIN` will be successful if the join field in the lookup index is a `KEYWORD` type. If the main index's join field is `TEXT` type, it must have an exact `.keyword` subfield that can be matched with the lookup index's `KEYWORD` field. |
| 124 | +* Indices in [lookup](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting) mode are always single-sharded. |
| 125 | +* Cross-cluster search is unsupported. Both source and lookup indices must be local. |
| 126 | +* `LOOKUP JOIN` can only use a single match field and a single index. Wildcards, aliases, date math, and data streams are not supported. |
| 127 | +* The name of the match field in `LOOKUP JOIN lu_idx ON match_field` must match an existing field in the query. This may require renames or evals to achieve. |
| 128 | +* The query will circuit break if there are too many matching documents in the lookup index, or if the documents are too large. More precisely, `LOOKUP JOIN` works in batches of, normally, about 10,000 rows; a large amount of heap space is needed if the matching documents from the lookup index for a batch are multiple megabytes or larger. This is roughly the same as for `ENRICH`. |
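| | + |
| | +For the field-name limitation above, a minimal sketch: assuming the `employees` index stores the language ID in a field named `language` (as in the sample table), a `RENAME` aligns it with the `language_code` field of the lookup index before joining. |
| | + |
| | +```esql |
| | +FROM employees |
| | +// Rename the query-side field so it matches the join field name in the lookup index |
| | +| RENAME language AS language_code |
| | +| LOOKUP JOIN languages_lookup_non_unique_key ON language_code |
| | +| KEEP emp_no, language_code, language_name, country |
| | +``` |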