Commit 6967ffb

Support tested lookup join examples in docs
1 parent 7fdf9c1 commit 6967ffb

20 files changed: +316 -50 lines changed
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM system_metrics
+| LOOKUP JOIN host_inventory ON host.name
+| LOOKUP JOIN ownerships ON host.name
+```
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM app_logs
+| LOOKUP JOIN service_owners ON service_id
+```
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+```
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+| WHERE threat_level IS NOT NULL
+```
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM employees
+| EVAL language_code = languages
+| WHERE emp_no >= 10091 AND emp_no < 10094
+| LOOKUP JOIN languages_lookup ON language_code
+```
+
+| emp_no:integer | language_code:integer | language_name:keyword |
+| --- | --- | --- |
+| 10091 | 3 | Spanish |
+| 10092 | 1 | English |
+| 10093 | 3 | Spanish |
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM employees
+| EVAL language_code = languages
+| LOOKUP JOIN languages_lookup ON language_code
+| WHERE emp_no >= 10091 AND emp_no < 10094
+```
+
+| emp_no:integer | language_code:integer | language_name:keyword |
+| --- | --- | --- |
+| 10091 | 3 | Spanish |
+| 10092 | 1 | English |
+| 10093 | 3 | Spanish |

docs/reference/query-languages/esql/_snippets/commands/layout/lookup-join.md

Lines changed: 14 additions & 30 deletions
@@ -52,53 +52,37 @@ In case of name collisions, the newly created columns will override existing col
 **IP Threat correlation**: This query would allow you to see if any source
 IPs match known malicious addresses.
 
-```esql
-FROM firewall_logs
-| LOOKUP JOIN threat_list ON source.IP
-```
+:::{include} ../examples/docs-lookup-join.csv-spec/lookupJoinSourceIp.md
+:::
 
 To filter only for those rows that have a matching `threat_list` entry, use `WHERE ... IS NOT NULL` with a field from the lookup index:
 
-```esql
-FROM firewall_logs
-| LOOKUP JOIN threat_list ON source.IP
-| WHERE threat_level IS NOT NULL
-```
+:::{include} ../examples/docs-lookup-join.csv-spec/lookupJoinSourceIpWhere.md
+:::
 
 **Host metadata correlation**: This query pulls in environment or
 ownership details for each host to correlate with your metrics data.
 
-```esql
-FROM system_metrics
-| LOOKUP JOIN host_inventory ON host.name
-| LOOKUP JOIN employees ON host.name
-```
+:::{include} ../examples/docs-lookup-join.csv-spec/lookupJoinHostNameTwice.md
+:::
 
 **Service ownership mapping**: This query would show logs with the owning
 team or escalation information for faster triage and incident response.
 
-```esql
-FROM app_logs
-| LOOKUP JOIN service_owners ON service_id
-```
+:::{include} ../examples/docs-lookup-join.csv-spec/lookupJoinServiceId.md
+:::
 
 `LOOKUP JOIN` is generally faster when there are fewer rows to join
 with. {{esql}} will try and perform any `WHERE` clause before the
 `LOOKUP JOIN` where possible.
 
-The two following examples will have the same results. The two examples
-have the `WHERE` clause before and after the `LOOKUP JOIN`. It does not
+The following two examples will have the same results. One has the
+`WHERE` clause before and the other after the `LOOKUP JOIN`. It does not
 matter how you write your query, our optimizer will move the filter
 before the lookup when possible.
 
-```esql
-FROM Left
-| WHERE Language IS NOT NULL
-| LOOKUP JOIN Right ON Key
-```
+:::{include} ../examples/lookup-join.csv-spec/filterOnLeftSide.md
+:::
 
-```esql
-FROM Left
-| LOOKUP JOIN Right ON Key
-| WHERE Language IS NOT NULL
-```
+:::{include} ../examples/lookup-join.csv-spec/filterOnRightSide.md
+:::
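
The documentation text in this hunk says a `WHERE` on left-side fields gives the same rows whether it is written before or after the `LOOKUP JOIN`. A minimal in-memory sketch of that equivalence follows. It is not how ES|QL implements the join or its optimizer; `LookupJoinSketch` and all of its members are hypothetical names, and the rows for `emp_no` 10090 and 10094 plus the lookup table contents are made-up illustration data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LookupJoinSketch {
    /** Left-side row: employee number plus the join key. */
    record Employee(int empNo, int languageCode) {}

    /** Joined row: left-side columns plus the looked-up name (null when there is no match). */
    record Joined(int empNo, int languageCode, String languageName) {}

    // Illustration data only: 10091..10093 mirror the example output, the rest is invented.
    static final List<Employee> EMPLOYEES = List.of(
        new Employee(10090, 2),
        new Employee(10091, 3),
        new Employee(10092, 1),
        new Employee(10093, 3),
        new Employee(10094, 5)
    );
    static final Map<Integer, String> LANGUAGES = Map.of(1, "English", 2, "French", 3, "Spanish", 4, "German");

    /** The filter from the examples: emp_no >= 10091 AND emp_no < 10094. */
    static boolean inRange(int empNo) {
        return empNo >= 10091 && empNo < 10094;
    }

    /** Left join: every left row is kept; a key missing from the lookup yields a null name. */
    static List<Joined> lookupJoin(List<Employee> left, Map<Integer, String> lookup) {
        List<Joined> out = new ArrayList<>();
        for (Employee e : left) {
            out.add(new Joined(e.empNo(), e.languageCode(), lookup.get(e.languageCode())));
        }
        return out;
    }

    /** WHERE written before the join: fewer rows reach the lookup. */
    static List<Joined> filterThenJoin(List<Employee> left, Map<Integer, String> lookup) {
        return lookupJoin(left.stream().filter(e -> inRange(e.empNo())).toList(), lookup);
    }

    /** WHERE written after the join: same rows, but every left row is looked up first. */
    static List<Joined> joinThenFilter(List<Employee> left, Map<Integer, String> lookup) {
        return lookupJoin(left, lookup).stream().filter(j -> inRange(j.empNo())).toList();
    }

    public static void main(String[] args) {
        // Both orders produce equal lists because the filter only reads left-side columns.
        System.out.println(filterThenJoin(EMPLOYEES, LANGUAGES).equals(joinThenFilter(EMPLOYEES, LANGUAGES)));
    }
}
```

The equivalence holds here only because the predicate touches left-side columns. A filter on a looked-up column such as `language_name` cannot be evaluated before the join, which is presumably why the docs say "when possible".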

x-pack/plugin/esql/qa/testFixtures/src/main/java/org/elasticsearch/xpack/esql/CsvTestsDataLoader.java

Lines changed: 36 additions & 9 deletions
@@ -63,15 +63,14 @@ public class CsvTestsDataLoader {
     private static final TestDataset APPS = new TestDataset("apps");
     private static final TestDataset APPS_SHORT = APPS.withIndex("apps_short").withTypeMapping(Map.of("id", "short"));
     private static final TestDataset LANGUAGES = new TestDataset("languages");
-    private static final TestDataset LANGUAGES_LOOKUP = LANGUAGES.withIndex("languages_lookup")
-        .withSetting("languages_lookup-settings.json");
+    private static final TestDataset LANGUAGES_LOOKUP = LANGUAGES.withIndex("languages_lookup").withSetting("lookup-settings.json");
     private static final TestDataset LANGUAGES_LOOKUP_NON_UNIQUE_KEY = LANGUAGES_LOOKUP.withIndex("languages_lookup_non_unique_key")
         .withData("languages_non_unique_key.csv");
     private static final TestDataset LANGUAGES_NESTED_FIELDS = new TestDataset(
         "languages_nested_fields",
         "mapping-languages_nested_fields.json",
         "languages_nested_fields.csv"
-    ).withSetting("languages_lookup-settings.json");
+    ).withSetting("lookup-settings.json");
     private static final TestDataset ALERTS = new TestDataset("alerts");
     private static final TestDataset UL_LOGS = new TestDataset("ul_logs");
     private static final TestDataset SAMPLE_DATA = new TestDataset("sample_data");
@@ -102,11 +101,17 @@ public class CsvTestsDataLoader {
         "partial_mapping_sample_data.csv"
     ).withSetting("source_parameters-settings.json");
     private static final TestDataset CLIENT_IPS = new TestDataset("clientips");
-    private static final TestDataset CLIENT_IPS_LOOKUP = CLIENT_IPS.withIndex("clientips_lookup")
-        .withSetting("clientips_lookup-settings.json");
+    private static final TestDataset CLIENT_IPS_LOOKUP = CLIENT_IPS.withIndex("clientips_lookup").withSetting("lookup-settings.json");
     private static final TestDataset MESSAGE_TYPES = new TestDataset("message_types");
     private static final TestDataset MESSAGE_TYPES_LOOKUP = MESSAGE_TYPES.withIndex("message_types_lookup")
-        .withSetting("message_types_lookup-settings.json");
+        .withSetting("lookup-settings.json");
+    private static final TestDataset FIREWALL_LOGS = new TestDataset("firewall_logs").noData();
+    private static final TestDataset THREAT_LIST = new TestDataset("threat_list").withSetting("lookup-settings.json").noData();
+    private static final TestDataset APP_LOGS = new TestDataset("app_logs").noData();
+    private static final TestDataset SERVICE_OWNERS = new TestDataset("service_owners").withSetting("lookup-settings.json").noData();
+    private static final TestDataset SYSTEM_METRICS = new TestDataset("system_metrics").noData();
+    private static final TestDataset HOST_INVENTORY = new TestDataset("host_inventory").withSetting("lookup-settings.json").noData();
+    private static final TestDataset OWNERSHIPS = new TestDataset("ownerships").withSetting("lookup-settings.json").noData();
     private static final TestDataset CLIENT_CIDR = new TestDataset("client_cidr");
     private static final TestDataset AGES = new TestDataset("ages");
     private static final TestDataset HEIGHTS = new TestDataset("heights");
@@ -160,6 +165,13 @@ public class CsvTestsDataLoader {
         Map.entry(CLIENT_IPS_LOOKUP.indexName, CLIENT_IPS_LOOKUP),
         Map.entry(MESSAGE_TYPES.indexName, MESSAGE_TYPES),
         Map.entry(MESSAGE_TYPES_LOOKUP.indexName, MESSAGE_TYPES_LOOKUP),
+        Map.entry(FIREWALL_LOGS.indexName, FIREWALL_LOGS),
+        Map.entry(THREAT_LIST.indexName, THREAT_LIST),
+        Map.entry(APP_LOGS.indexName, APP_LOGS),
+        Map.entry(SERVICE_OWNERS.indexName, SERVICE_OWNERS),
+        Map.entry(SYSTEM_METRICS.indexName, SYSTEM_METRICS),
+        Map.entry(HOST_INVENTORY.indexName, HOST_INVENTORY),
+        Map.entry(OWNERSHIPS.indexName, OWNERSHIPS),
         Map.entry(CLIENT_CIDR.indexName, CLIENT_CIDR),
         Map.entry(AGES.indexName, AGES),
         Map.entry(HEIGHTS.indexName, HEIGHTS),
@@ -459,11 +471,14 @@ private static URL getResource(String name) {
 
     private static void load(RestClient client, TestDataset dataset, Logger logger, IndexCreator indexCreator) throws IOException {
         URL mapping = getResource("/" + dataset.mappingFileName);
-        URL data = getResource("/data/" + dataset.dataFileName);
-
         Settings indexSettings = dataset.readSettingsFile();
         indexCreator.createIndex(client, dataset.indexName, readMappingFile(mapping, dataset.typeMapping), indexSettings);
-        loadCsvData(client, dataset.indexName, data, dataset.allowSubFields, logger);
+
+        // Some examples only test that the query and mappings are valid, and don't need example data. Use .noData() for those
+        if (dataset.dataFileName != null) {
+            URL data = getResource("/data/" + dataset.dataFileName);
+            loadCsvData(client, dataset.indexName, data, dataset.allowSubFields, logger);
+        }
     }
 
     private static String readMappingFile(URL resource, Map<String, String> typeMapping) throws IOException {
@@ -738,6 +753,18 @@ public TestDataset withData(String dataFileName) {
             );
         }
 
+        public TestDataset noData() {
+            return new TestDataset(
+                indexName,
+                mappingFileName,
+                null,
+                settingFileName,
+                allowSubFields,
+                typeMapping,
+                requiresInferenceEndpoint
+            );
+        }
+
         public TestDataset withSetting(String settingFileName) {
             return new TestDataset(
                 indexName,
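
The `noData()` change above follows the same copy-on-modify ("wither") pattern as `withData` and `withSetting`: it returns a new dataset whose data file name is `null`, and `load` then skips the CSV step. A trimmed-down sketch of that interaction is below. `NoDataSketch` and its `Dataset` record are hypothetical stand-ins, not the real `TestDataset`; in particular, the default data file name `<index>.csv` is an assumption of this sketch, and `load` only records the steps it would take instead of calling Elasticsearch.

```java
import java.util.ArrayList;
import java.util.List;

final class NoDataSketch {
    /** Hypothetical, reduced stand-in for TestDataset: immutable, modified only through copies. */
    record Dataset(String indexName, String dataFileName, String settingFileName) {
        /** Assumption of this sketch: the data file defaults to "<index>.csv" and there are no settings. */
        Dataset(String indexName) {
            this(indexName, indexName + ".csv", null);
        }

        Dataset withSetting(String settingFileName) {
            return new Dataset(indexName, dataFileName, settingFileName);
        }

        /** Same dataset, but with no data file, so only the index and mapping get created. */
        Dataset noData() {
            return new Dataset(indexName, null, settingFileName);
        }
    }

    /** Records the steps a loader would perform; the CSV step is skipped when there is no data file. */
    static List<String> load(Dataset dataset) {
        List<String> steps = new ArrayList<>();
        steps.add("createIndex " + dataset.indexName());
        if (dataset.dataFileName() != null) {
            steps.add("loadCsvData /data/" + dataset.dataFileName());
        }
        return steps;
    }

    public static void main(String[] args) {
        Dataset threatList = new Dataset("threat_list").withSetting("lookup-settings.json").noData();
        System.out.println(load(threatList));
        System.out.println(load(new Dataset("languages")));
    }
}
```

Because each wither returns a fresh copy, the call order of `withSetting(...)` and `noData()` does not matter, which matches how the new constants in this commit chain them.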
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+###########################################################
+# These tests were created specifically to satisfy the needs
+# of the docs, and the lookup-join.md file in particular.
+# Since those docs do not display output results, we only
+# need to ensure that the tests run without error.
+# This requires index mappings to be set up correctly,
+# but no data needs to be loaded into the indices.
+###########################################################
+
+# **IP Threat correlation**: This query would allow you to see if any source
+# IPs match known malicious addresses.
+
+lookupJoinSourceIp
+required_capability: join_lookup_v12
+
+// tag::lookupJoinSourceIp
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+// end::lookupJoinSourceIp
+;
+
+@timestamp:datetime | destination.IP:ip | message:keyword | source.IP:ip | threat_level:keyword
+;
+
+# To filter only for those rows that have a matching `threat_list` entry,
+# use `WHERE ... IS NOT NULL` with a field from the lookup index:
+
+lookupJoinSourceIpWhere
+required_capability: join_lookup_v12
+
+// tag::lookupJoinSourceIpWhere
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+| WHERE threat_level IS NOT NULL
+// end::lookupJoinSourceIpWhere
+;
+
+@timestamp:datetime | destination.IP:ip | message:keyword | source.IP:ip | threat_level:keyword
+;
+
+# **Host metadata correlation**: This query pulls in environment or
+# ownership details for each host to correlate with your metrics data.
+
+lookupJoinHostNameTwice
+required_capability: join_lookup_v12
+
+// tag::lookupJoinHostNameTwice
+FROM system_metrics
+| LOOKUP JOIN host_inventory ON host.name
+| LOOKUP JOIN ownerships ON host.name
+// end::lookupJoinHostNameTwice
+;
+
+count:long | details:keyword | host.name:keyword | description:keyword | host.os:keyword | host.version:keyword | owner.name:keyword
+;
+
+# **Service ownership mapping**: This query would show logs with the owning
+# team or escalation information for faster triage and incident response.
+
+lookupJoinServiceId
+required_capability: join_lookup_v12
+
+// tag::lookupJoinServiceId
+FROM app_logs
+| LOOKUP JOIN service_owners ON service_id
+// end::lookupJoinServiceId
+;
+
+@timestamp:datetime | message:keyword | service_id:keyword | owner:keyword
+;
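
The generated `.md` snippets earlier in this commit contain exactly the query lines that sit between a `// tag::<name>` line and its matching end marker in the spec file above. How the real docs generator extracts them is not shown in this commit, so the following is only a sketch of the idea under stated assumptions: `TaggedRegionSketch` and `extract` are hypothetical names, markers are matched as whole lines, and the end marker is accepted with either one or two colons.

```java
import java.util.ArrayList;
import java.util.List;

final class TaggedRegionSketch {
    /**
     * Returns the lines strictly between "// tag::name" and the matching end marker.
     * Whole-line matching keeps "lookupJoinSourceIp" from matching "lookupJoinSourceIpWhere".
     */
    static List<String> extract(List<String> lines, String name) {
        String open = "// tag::" + name;
        List<String> region = new ArrayList<>();
        boolean inside = false;
        for (String line : lines) {
            String trimmed = line.strip();
            if (!inside) {
                if (trimmed.equals(open)) {
                    inside = true;
                }
            } else if (trimmed.equals("// end::" + name) || trimmed.equals("// end:" + name)) {
                // Assumption of this sketch: tolerate a single-colon end marker as well.
                return region;
            } else {
                region.add(line);
            }
        }
        throw new IllegalArgumentException("no complete tagged region named " + name);
    }

    public static void main(String[] args) {
        List<String> spec = List.of(
            "lookupJoinSourceIp",
            "required_capability: join_lookup_v12",
            "",
            "// tag::lookupJoinSourceIp",
            "FROM firewall_logs",
            "| LOOKUP JOIN threat_list ON source.IP",
            "// end::lookupJoinSourceIp",
            ";"
        );
        extract(spec, "lookupJoinSourceIp").forEach(System.out::println);
    }
}
```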

x-pack/plugin/esql/qa/testFixtures/src/main/resources/languages_lookup-settings.json

Lines changed: 0 additions & 5 deletions
This file was deleted.
