Commit 9a9077a

Make LOOKUP JOIN docs examples fully tested (elastic#126622)
The current LOOKUP JOIN docs include examples that are not tested by the ES|QL tests, unlike most other examples in the documentation. This PR fixes that, changing two examples to use existing tests, and adding a new csv-spec file for the remaining four examples. These four are not required to show results, so the tests have empty data and do not require any results. This means we are testing only the syntax (parsing and semantic analysis), which is sufficient for the docs.
1 parent 4b2750e commit 9a9077a

20 files changed: +316 -50 lines changed
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM system_metrics
+| LOOKUP JOIN host_inventory ON host.name
+| LOOKUP JOIN ownerships ON host.name
+```
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM app_logs
+| LOOKUP JOIN service_owners ON service_id
+```
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+```
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+| WHERE threat_level IS NOT NULL
+```
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM employees
+| EVAL language_code = languages
+| WHERE emp_no >= 10091 AND emp_no < 10094
+| LOOKUP JOIN languages_lookup ON language_code
+```
+
+| emp_no:integer | language_code:integer | language_name:keyword |
+| --- | --- | --- |
+| 10091 | 3 | Spanish |
+| 10092 | 1 | English |
+| 10093 | 3 | Spanish |
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM employees
+| EVAL language_code = languages
+| LOOKUP JOIN languages_lookup ON language_code
+| WHERE emp_no >= 10091 AND emp_no < 10094
+```
+
+| emp_no:integer | language_code:integer | language_name:keyword |
+| --- | --- | --- |
+| 10091 | 3 | Spanish |
+| 10092 | 1 | English |
+| 10093 | 3 | Spanish |

docs/reference/query-languages/esql/_snippets/commands/layout/lookup-join.md

Lines changed: 14 additions & 30 deletions
@@ -52,53 +52,37 @@ In case of name collisions, the newly created columns will override existing col
 **IP Threat correlation**: This query would allow you to see if any source
 IPs match known malicious addresses.

-```esql
-FROM firewall_logs
-| LOOKUP JOIN threat_list ON source.IP
-```
+:::{include} ../examples/docs-lookup-join.csv-spec/lookupJoinSourceIp.md
+:::

 To filter only for those rows that have a matching `threat_list` entry, use `WHERE ... IS NOT NULL` with a field from the lookup index:

-```esql
-FROM firewall_logs
-| LOOKUP JOIN threat_list ON source.IP
-| WHERE threat_level IS NOT NULL
-```
+:::{include} ../examples/docs-lookup-join.csv-spec/lookupJoinSourceIpWhere.md
+:::

 **Host metadata correlation**: This query pulls in environment or
 ownership details for each host to correlate with your metrics data.

-```esql
-FROM system_metrics
-| LOOKUP JOIN host_inventory ON host.name
-| LOOKUP JOIN employees ON host.name
-```
+:::{include} ../examples/docs-lookup-join.csv-spec/lookupJoinHostNameTwice.md
+:::

 **Service ownership mapping**: This query would show logs with the owning
 team or escalation information for faster triage and incident response.

-```esql
-FROM app_logs
-| LOOKUP JOIN service_owners ON service_id
-```
+:::{include} ../examples/docs-lookup-join.csv-spec/lookupJoinServiceId.md
+:::

 `LOOKUP JOIN` is generally faster when there are fewer rows to join
 with. {{esql}} will try and perform any `WHERE` clause before the
 `LOOKUP JOIN` where possible.

-The two following examples will have the same results. The two examples
-have the `WHERE` clause before and after the `LOOKUP JOIN`. It does not
+The following two examples will have the same results. One has the
+`WHERE` clause before and the other after the `LOOKUP JOIN`. It does not
 matter how you write your query, our optimizer will move the filter
 before the lookup when possible.

-```esql
-FROM Left
-| WHERE Language IS NOT NULL
-| LOOKUP JOIN Right ON Key
-```
+:::{include} ../examples/lookup-join.csv-spec/filterOnLeftSide.md
+:::

-```esql
-FROM Left
-| LOOKUP JOIN Right ON Key
-| WHERE Language IS NOT NULL
-```
+:::{include} ../examples/lookup-join.csv-spec/filterOnRightSide.md
+:::
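
The claim in the docs hunk above, that a `WHERE` on left-side fields gives the same rows whether it is written before or after the `LOOKUP JOIN`, can be illustrated with a small in-memory sketch. This is not how ES|QL executes the query. The class and method names are hypothetical, the join is modeled as a plain map lookup with unique keys, rows 10091 to 10093 mirror the example output in this commit, and rows 10090 and 10094 are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: filter-then-join and join-then-filter agree when the
// filter only reads columns from the left side of the join.
class FilterPushdownSketch {

    static final class Row {
        final int empNo;
        final int languageCode;
        final String languageName; // null until joined, or when the lookup has no match

        Row(int empNo, int languageCode, String languageName) {
            this.empNo = empNo;
            this.languageCode = languageCode;
            this.languageName = languageName;
        }

        @Override
        public String toString() {
            return empNo + " | " + languageCode + " | " + languageName;
        }
    }

    // Rows 10091-10093 follow the example output; 10090 and 10094 are invented.
    static final List<Row> EMPLOYEES = List.of(
        new Row(10090, 1, null),
        new Row(10091, 3, null),
        new Row(10092, 1, null),
        new Row(10093, 3, null),
        new Row(10094, 1, null)
    );

    // Stand-in for the lookup index, assuming one entry per key.
    static final Map<Integer, String> LANGUAGES_LOOKUP = Map.of(1, "English", 3, "Spanish");

    static final Predicate<Row> EMP_NO_RANGE = r -> r.empNo >= 10091 && r.empNo < 10094;

    // Left join: every left row is kept, with the lookup value or null.
    static List<Row> lookupJoin(List<Row> left) {
        List<Row> out = new ArrayList<>();
        for (Row r : left) {
            out.add(new Row(r.empNo, r.languageCode, LANGUAGES_LOOKUP.get(r.languageCode)));
        }
        return out;
    }

    static List<Row> where(List<Row> rows, Predicate<Row> keep) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            if (keep.test(r)) {
                out.add(r);
            }
        }
        return out;
    }

    static List<String> render(List<Row> rows) {
        List<String> out = new ArrayList<>();
        for (Row r : rows) {
            out.add(r.toString());
        }
        return out;
    }

    // Joins 3 rows instead of 5 because the filter runs first.
    static List<String> filterThenJoin() {
        return render(lookupJoin(where(EMPLOYEES, EMP_NO_RANGE)));
    }

    static List<String> joinThenFilter() {
        return render(where(lookupJoin(EMPLOYEES), EMP_NO_RANGE));
    }

    public static void main(String[] args) {
        filterThenJoin().forEach(System.out::println);
        System.out.println("same rows either way: " + filterThenJoin().equals(joinThenFilter()));
    }
}
```

The equivalence relies on the predicate reading only left-side columns. A filter on a lookup field such as `threat_level IS NOT NULL` depends on the join result, so it cannot be moved ahead of the join in this way.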

x-pack/plugin/esql/qa/testFixtures/src/main/java/org/elasticsearch/xpack/esql/CsvTestsDataLoader.java

Lines changed: 36 additions & 9 deletions
@@ -63,15 +63,14 @@ public class CsvTestsDataLoader {
     private static final TestDataset APPS = new TestDataset("apps");
     private static final TestDataset APPS_SHORT = APPS.withIndex("apps_short").withTypeMapping(Map.of("id", "short"));
     private static final TestDataset LANGUAGES = new TestDataset("languages");
-    private static final TestDataset LANGUAGES_LOOKUP = LANGUAGES.withIndex("languages_lookup")
-        .withSetting("languages_lookup-settings.json");
+    private static final TestDataset LANGUAGES_LOOKUP = LANGUAGES.withIndex("languages_lookup").withSetting("lookup-settings.json");
     private static final TestDataset LANGUAGES_LOOKUP_NON_UNIQUE_KEY = LANGUAGES_LOOKUP.withIndex("languages_lookup_non_unique_key")
         .withData("languages_non_unique_key.csv");
     private static final TestDataset LANGUAGES_NESTED_FIELDS = new TestDataset(
         "languages_nested_fields",
         "mapping-languages_nested_fields.json",
         "languages_nested_fields.csv"
-    ).withSetting("languages_lookup-settings.json");
+    ).withSetting("lookup-settings.json");
     private static final TestDataset ALERTS = new TestDataset("alerts");
     private static final TestDataset UL_LOGS = new TestDataset("ul_logs");
     private static final TestDataset SAMPLE_DATA = new TestDataset("sample_data");
@@ -102,11 +101,17 @@ public class CsvTestsDataLoader {
         "partial_mapping_sample_data.csv"
     ).withSetting("source_parameters-settings.json");
     private static final TestDataset CLIENT_IPS = new TestDataset("clientips");
-    private static final TestDataset CLIENT_IPS_LOOKUP = CLIENT_IPS.withIndex("clientips_lookup")
-        .withSetting("clientips_lookup-settings.json");
+    private static final TestDataset CLIENT_IPS_LOOKUP = CLIENT_IPS.withIndex("clientips_lookup").withSetting("lookup-settings.json");
     private static final TestDataset MESSAGE_TYPES = new TestDataset("message_types");
     private static final TestDataset MESSAGE_TYPES_LOOKUP = MESSAGE_TYPES.withIndex("message_types_lookup")
-        .withSetting("message_types_lookup-settings.json");
+        .withSetting("lookup-settings.json");
+    private static final TestDataset FIREWALL_LOGS = new TestDataset("firewall_logs").noData();
+    private static final TestDataset THREAT_LIST = new TestDataset("threat_list").withSetting("lookup-settings.json").noData();
+    private static final TestDataset APP_LOGS = new TestDataset("app_logs").noData();
+    private static final TestDataset SERVICE_OWNERS = new TestDataset("service_owners").withSetting("lookup-settings.json").noData();
+    private static final TestDataset SYSTEM_METRICS = new TestDataset("system_metrics").noData();
+    private static final TestDataset HOST_INVENTORY = new TestDataset("host_inventory").withSetting("lookup-settings.json").noData();
+    private static final TestDataset OWNERSHIPS = new TestDataset("ownerships").withSetting("lookup-settings.json").noData();
     private static final TestDataset CLIENT_CIDR = new TestDataset("client_cidr");
     private static final TestDataset AGES = new TestDataset("ages");
     private static final TestDataset HEIGHTS = new TestDataset("heights");
@@ -160,6 +165,13 @@ public class CsvTestsDataLoader {
         Map.entry(CLIENT_IPS_LOOKUP.indexName, CLIENT_IPS_LOOKUP),
         Map.entry(MESSAGE_TYPES.indexName, MESSAGE_TYPES),
         Map.entry(MESSAGE_TYPES_LOOKUP.indexName, MESSAGE_TYPES_LOOKUP),
+        Map.entry(FIREWALL_LOGS.indexName, FIREWALL_LOGS),
+        Map.entry(THREAT_LIST.indexName, THREAT_LIST),
+        Map.entry(APP_LOGS.indexName, APP_LOGS),
+        Map.entry(SERVICE_OWNERS.indexName, SERVICE_OWNERS),
+        Map.entry(SYSTEM_METRICS.indexName, SYSTEM_METRICS),
+        Map.entry(HOST_INVENTORY.indexName, HOST_INVENTORY),
+        Map.entry(OWNERSHIPS.indexName, OWNERSHIPS),
         Map.entry(CLIENT_CIDR.indexName, CLIENT_CIDR),
         Map.entry(AGES.indexName, AGES),
         Map.entry(HEIGHTS.indexName, HEIGHTS),
@@ -418,11 +430,14 @@ private static URL getResource(String name) {

     private static void load(RestClient client, TestDataset dataset, Logger logger, IndexCreator indexCreator) throws IOException {
         URL mapping = getResource("/" + dataset.mappingFileName);
-        URL data = getResource("/data/" + dataset.dataFileName);
-
         Settings indexSettings = dataset.readSettingsFile();
         indexCreator.createIndex(client, dataset.indexName, readMappingFile(mapping, dataset.typeMapping), indexSettings);
-        loadCsvData(client, dataset.indexName, data, dataset.allowSubFields, logger);
+
+        // Some examples only test that the query and mappings are valid, and don't need example data. Use .noData() for those
+        if (dataset.dataFileName != null) {
+            URL data = getResource("/data/" + dataset.dataFileName);
+            loadCsvData(client, dataset.indexName, data, dataset.allowSubFields, logger);
+        }
     }

     private static String readMappingFile(URL resource, Map<String, String> typeMapping) throws IOException {
@@ -697,6 +712,18 @@ public TestDataset withData(String dataFileName) {
             );
         }

+        public TestDataset noData() {
+            return new TestDataset(
+                indexName,
+                mappingFileName,
+                null,
+                settingFileName,
+                allowSubFields,
+                typeMapping,
+                requiresInferenceEndpoint
+            );
+        }
+
         public TestDataset withSetting(String settingFileName) {
             return new TestDataset(
                 indexName,
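
The `noData()` change above boils down to an immutable copy with a `null` data file plus a guard in the loader. A minimal standalone sketch of that pattern follows, assuming a trimmed-down `Dataset` class with only three fields and a made-up default file-naming rule. `NoDataLoaderSketch`, `Dataset`, and `of` are hypothetical names, and the loader only records the steps it would take rather than talking to a cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "index without data" pattern; not the real CsvTestsDataLoader.
class NoDataLoaderSketch {

    // Trimmed-down stand-in for TestDataset; the real type carries more fields.
    static final class Dataset {
        final String indexName;
        final String mappingFileName;
        final String dataFileName; // null means: create the index, load nothing

        Dataset(String indexName, String mappingFileName, String dataFileName) {
            this.indexName = indexName;
            this.mappingFileName = mappingFileName;
            this.dataFileName = dataFileName;
        }

        // File-name convention assumed for this sketch only.
        static Dataset of(String indexName) {
            return new Dataset(indexName, "mapping-" + indexName + ".json", indexName + ".csv");
        }

        // Returns a copy with the data file cleared, leaving the original untouched.
        Dataset noData() {
            return new Dataset(indexName, mappingFileName, null);
        }
    }

    // Records the steps a loader would take instead of calling a cluster.
    static List<String> load(Dataset dataset) {
        List<String> steps = new ArrayList<>();
        steps.add("create index " + dataset.indexName + " from " + dataset.mappingFileName);
        // Only datasets that still have a data file get a load step.
        if (dataset.dataFileName != null) {
            steps.add("load /data/" + dataset.dataFileName);
        }
        return steps;
    }

    public static void main(String[] args) {
        load(Dataset.of("languages")).forEach(System.out::println);
        load(Dataset.of("firewall_logs").noData()).forEach(System.out::println);
    }
}
```

Using `null` as the marker keeps the change small, which matches the diff. An `Optional<String>` field would make the empty case more explicit, at the cost of touching every constructor call.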
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+###########################################################
+# These tests were created specifically to satisfy the needs
+# of the docs, and the lookup-join.md file in particular.
+# Since those docs do not display output results, we only
+# need to ensure that the tests run without error.
+# This requires index mappings to be set up correctly,
+# but no data needs to be loaded into the indices.
+###########################################################
+
+# **IP Threat correlation**: This query would allow you to see if any source
+# IPs match known malicious addresses.
+
+lookupJoinSourceIp
+required_capability: join_lookup_v12
+
+// tag::lookupJoinSourceIp[]
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+// end::lookupJoinSourceIp[]
+;
+
+@timestamp:datetime | destination.IP:ip | message:keyword | source.IP:ip | threat_level:keyword
+;
+
+# To filter only for those rows that have a matching `threat_list` entry,
+# use `WHERE ... IS NOT NULL` with a field from the lookup index:
+
+lookupJoinSourceIpWhere
+required_capability: join_lookup_v12
+
+// tag::lookupJoinSourceIpWhere[]
+FROM firewall_logs
+| LOOKUP JOIN threat_list ON source.IP
+| WHERE threat_level IS NOT NULL
+// end::lookupJoinSourceIpWhere[]
+;
+
+@timestamp:datetime | destination.IP:ip | message:keyword | source.IP:ip | threat_level:keyword
+;
+
+# **Host metadata correlation**: This query pulls in environment or
+# ownership details for each host to correlate with your metrics data.
+
+lookupJoinHostNameTwice
+required_capability: join_lookup_v12
+
+// tag::lookupJoinHostNameTwice[]
+FROM system_metrics
+| LOOKUP JOIN host_inventory ON host.name
+| LOOKUP JOIN ownerships ON host.name
+// end::lookupJoinHostNameTwice[]
+;
+
+count:long | details:keyword | host.name:keyword | description:keyword | host.os:keyword | host.version:keyword | owner.name:keyword
+;
+
+# **Service ownership mapping**: This query would show logs with the owning
+# team or escalation information for faster triage and incident response.
+
+lookupJoinIpServiceId
+required_capability: join_lookup_v12
+
+// tag::lookupJoinServiceId[]
+FROM app_logs
+| LOOKUP JOIN service_owners ON service_id
+// end::lookupJoinServiceId[]
+;
+
+@timestamp:datetime | message:keyword | service_id:keyword | owner:keyword
+;

x-pack/plugin/esql/qa/testFixtures/src/main/resources/languages_lookup-settings.json

Lines changed: 0 additions & 5 deletions
This file was deleted.
