
Commit c8250e6

Merge branch 'main' into bugfix/address-source-confirmed-npe

2 parents 557f051 + 83300ea

File tree: 22 files changed, +318 -241 lines


BUILDING.md

Lines changed: 1 addition & 1 deletion
@@ -144,7 +144,7 @@ To wire this registered cluster into a `TestClusterAware` task (e.g. `RestIntegT
 Additional integration tests for a certain Elasticsearch modules that are specific to certain cluster configuration can be declared in a separate so called `qa` subproject of your module.
 
 The benefit of a dedicated project for these tests are:
-- `qa` projects are dedicated two specific use-cases and easier to maintain
+- `qa` projects are dedicated to specific use-cases and easier to maintain
 - It keeps the specific test logic separated from the common test logic.
 - You can run those tests in parallel to other projects of the build.
 
docs/changelog/125922.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 125922
+summary: Fix text structure NPE when fields in list have null value
+area: Machine Learning
+type: bug
+issues: []

docs/changelog/127229.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 127229
+summary: Return BAD_REQUEST when a field scorer references a missing field
+area: Ranking
+type: bug
+issues:
+ - 127162

docs/reference/enrich-processor/date-processor.md

Lines changed: 63 additions & 2 deletions
@@ -6,7 +6,6 @@ mapped_pages:
 
 # Date processor [date-processor]
 
-
 Parses dates from fields, and then uses the date or timestamp as the timestamp for the document. By default, the date processor adds the parsed date as a new field called `@timestamp`. You can specify a different field by setting the `target_field` configuration parameter. Multiple date formats are supported as part of the same date processor definition. They will be used sequentially to attempt parsing the date field, in the same order they were defined as part of the processor definition.
 
 $$$date-options$$$
@@ -16,7 +15,7 @@ $$$date-options$$$
 | `field` | yes | - | The field to get the date from. |
 | `target_field` | no | @timestamp | The field that will hold the parsed date. |
 | `formats` | yes | - | An array of the expected date formats. Can be a [java time pattern](/reference/elasticsearch/mapping-reference/mapping-date-format.md) or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N. |
-| `timezone` | no | UTC | The timezone to use when parsing the date. Supports [template snippets](docs-content://manage-data/ingest/transform-enrich/ingest-pipelines.md#template-snippets). |
+| `timezone` | no | UTC | The default [timezone](#date-processor-timezones) used by the processor. Supports [template snippets](docs-content://manage-data/ingest/transform-enrich/ingest-pipelines.md#template-snippets). |
 | `locale` | no | ENGLISH | The locale to use when parsing the date, relevant when parsing month names or week days. Supports [template snippets](docs-content://manage-data/ingest/transform-enrich/ingest-pipelines.md#template-snippets). |
 | `output_format` | no | `yyyy-MM-dd'T'HH:mm:ss.SSSXXX` | The format to use when writing the date to `target_field`. Must be a valid [java time pattern](/reference/elasticsearch/mapping-reference/mapping-date-format.md). |
 | `description` | no | - | Description of the processor. Useful for describing the purpose of the processor or its configuration. |
@@ -25,6 +24,20 @@ $$$date-options$$$
 | `on_failure` | no | - | Handle failures for the processor. See [Handling pipeline failures](docs-content://manage-data/ingest/transform-enrich/ingest-pipelines.md#handling-pipeline-failures). |
 | `tag` | no | - | Identifier for the processor. Useful for debugging and metrics. |
 
+## Timezones [date-processor-timezones]
+
+The `timezone` option may have two effects on the behavior of the processor:
+- If the string being parsed matches a format representing a local date-time, such as `yyyy-MM-dd HH:mm:ss`, it will be assumed to be in the timezone specified by this option. This is not applicable if the string matches a format representing a zoned date-time, such as `yyyy-MM-dd HH:mm:ss zzz`: in that case, the timezone parsed from the string will be used. It is also not applicable if the string matches an absolute time format, such as `epoch_millis`.
+- The date-time will be converted into the timezone given by this option before it is formatted and written into the target field. This is not applicable if the `output_format` is an absolute time format such as `epoch_millis`.
+
+::::{warning}
+We recommend avoiding the use of short abbreviations for timezone names, since they can be ambiguous. For example, one JDK might interpret `PST` as `America/Tijuana`, i.e. Pacific (Standard) Time, while another JDK might interpret it as `Asia/Manila`, i.e. Philippine Standard Time. If your input data contains such abbreviations, you should convert them into either standard full names or UTC offsets before parsing them, using your own knowledge of what each abbreviation means in your data. See [below](#date-processor-short-timezone-example) for an example. (This does not apply to `UTC`, which is safe.)
+::::
+
+## Examples [date-processor-examples]
+
+### Simple example [date-processor-simple-example]
+
 Here is an example that adds the parsed date to the `timestamp` field based on the `initial_date` field:
 
 ```js
@@ -43,6 +56,8 @@ Here is an example that adds the parsed date to the `timestamp` field based on t
 }
 ```
 
+### Example using templated parameters [date-processor-templated-example]
+
 The `timezone` and `locale` processor parameters are templated. This means that their values can be extracted from fields within documents. The example below shows how to extract the locale/timezone details from existing fields, `my_timezone` and `my_locale`, in the ingested document that contain the timezone and locale values.
 
 ```js
@@ -62,3 +77,49 @@ The `timezone` and `locale` processor parameters are templated. This means that
 }
 ```
 
+### Example dealing with short timezone abbreviations safely [date-processor-short-timezone-example]
+
+In the example below, the `message` field in the input is expected to be a string formed of a local date-time in `yyyyMMddHHmmss` format, a timezone abbreviated to one of `PST`, `CET`, or `JST` representing Pacific, Central European, or Japan time, and a payload. This field is split up using a `grok` processor, then the timezones are converted into full names using a `script` processor, then the date-time is parsed using a `date` processor, and finally the unwanted fields are discarded using a `remove` processor.
+
+```js
+{
+  "description" : "...",
+  "processors": [
+    {
+      "grok": {
+        "field": "message",
+        "patterns": ["%{DATESTAMP_EVENTLOG:local_date_time} %{TZ:short_tz} %{GREEDYDATA:payload}"],
+        "pattern_definitions": {
+          "TZ": "[A-Z]{3}"
+        }
+      }
+    },
+    {
+      "script": {
+        "source": "ctx['full_tz'] = params['tz_map'][ctx['short_tz']]",
+        "params": {
+          "tz_map": {
+            "PST": "America/Los_Angeles",
+            "CET": "Europe/Amsterdam",
+            "JST": "Asia/Tokyo"
+          }
+        }
+      }
+    },
+    {
+      "date": {
+        "field": "local_date_time",
+        "formats": ["yyyyMMddHHmmss"],
+        "timezone": "{{{full_tz}}}"
+      }
+    },
+    {
+      "remove": {
+        "field": ["message", "local_date_time", "short_tz", "full_tz"]
+      }
+    }
+  ]
+}
+```
+
+With this pipeline, a `message` field with the value `20250102123456 PST Hello world` will result in a `@timestamp` field with the value `2025-01-02T12:34:56.000-08:00` and a `payload` field with the value `Hello world`. (Note: A `@timestamp` field will normally be mapped to a `date` type, and therefore it will be indexed as an integer representing milliseconds since the epoch, although the original format and timezone may be preserved in the `_source`.)
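
A quick way to observe the timezone behavior described in the Timezones section above, without indexing anything, is the simulate pipeline API. The sketch below is illustrative only: the field names, sample timestamp, and chosen zone are not part of this commit, and it assumes the default `output_format`.

```js
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "date": {
          "field": "initial_date",
          "target_field": "timestamp",
          "formats": ["yyyy-MM-dd HH:mm:ss"],
          "timezone": "Europe/Amsterdam"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "initial_date": "2025-01-02 12:34:56" } }
  ]
}
```

Because `yyyy-MM-dd HH:mm:ss` is a local date-time format, the parsed value is assumed to be in `Europe/Amsterdam`, so `timestamp` should come back as `2025-01-02T12:34:56.000+01:00`.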

modules/data-streams/src/internalClusterTest/java/org/elasticsearch/datastreams/DataStreamIT.java

Lines changed: 9 additions & 12 deletions
@@ -202,8 +202,8 @@ public void testBasicScenario() throws Exception {
         int numDocsFoo = randomIntBetween(2, 16);
         indexDocs("metrics-foo", numDocsFoo);
 
-        verifyDocs("metrics-bar", numDocsBar, 1, 1);
-        verifyDocs("metrics-foo", numDocsFoo, 1, 1);
+        verifyDocs("metrics-bar", numDocsBar);
+        verifyDocs("metrics-foo", numDocsFoo);
 
         RolloverResponse fooRolloverResponse = indicesAdmin().rolloverIndex(new RolloverRequest("metrics-foo", null)).get();
         assertThat(fooRolloverResponse.getNewIndex(), backingIndexEqualTo("metrics-foo", 2));
@@ -234,8 +234,8 @@ public void testBasicScenario() throws Exception {
         int numDocsFoo2 = randomIntBetween(2, 16);
         indexDocs("metrics-foo", numDocsFoo2);
 
-        verifyDocs("metrics-bar", numDocsBar + numDocsBar2, 1, 2);
-        verifyDocs("metrics-foo", numDocsFoo + numDocsFoo2, 1, 2);
+        verifyDocs("metrics-bar", numDocsBar + numDocsBar2);
+        verifyDocs("metrics-foo", numDocsFoo + numDocsFoo2);
 
         DeleteDataStreamAction.Request deleteDataStreamRequest = new DeleteDataStreamAction.Request(TEST_REQUEST_TIMEOUT, "metrics-*");
         client().execute(DeleteDataStreamAction.INSTANCE, deleteDataStreamRequest).actionGet();
@@ -469,7 +469,7 @@ public void testComposableTemplateOnlyMatchingWithDataStreamName() throws Except
 
         int numDocs = randomIntBetween(2, 16);
         indexDocs(dataStreamName, numDocs);
-        verifyDocs(dataStreamName, numDocs, 1, 1);
+        verifyDocs(dataStreamName, numDocs);
 
         GetDataStreamAction.Request getDataStreamRequest = new GetDataStreamAction.Request(TEST_REQUEST_TIMEOUT, new String[] { "*" });
         GetDataStreamAction.Response getDataStreamResponse = client().execute(GetDataStreamAction.INSTANCE, getDataStreamRequest)
@@ -504,7 +504,7 @@ public void testComposableTemplateOnlyMatchingWithDataStreamName() throws Except
 
         int numDocs2 = randomIntBetween(2, 16);
         indexDocs(dataStreamName, numDocs2);
-        verifyDocs(dataStreamName, numDocs + numDocs2, 1, 2);
+        verifyDocs(dataStreamName, numDocs + numDocs2);
 
         getDataStreamRequest = new GetDataStreamAction.Request(TEST_REQUEST_TIMEOUT, new String[] { "*" });
         getDataStreamResponse = client().execute(GetDataStreamAction.INSTANCE, getDataStreamRequest).actionGet();
@@ -959,7 +959,7 @@ public void testAliasActionsFailOnDataStreamBackingIndices() throws Exception {
         );
         client().execute(CreateDataStreamAction.INSTANCE, createDataStreamRequest).get();
 
-        String backingIndex = DataStream.getDefaultBackingIndexName(dataStreamName, 1);
+        String backingIndex = getDataStreamBackingIndexNames(dataStreamName).getFirst();
         AliasActions addAction = new AliasActions(AliasActions.Type.ADD).index(backingIndex).aliases("first_gen");
         IndicesAliasesRequest aliasesAddRequest = new IndicesAliasesRequest(TEST_REQUEST_TIMEOUT, TEST_REQUEST_TIMEOUT);
         aliasesAddRequest.addAliasAction(addAction);
@@ -2050,11 +2050,8 @@ static void verifyDocs(String dataStream, long expectedNumHits, List<String> exp
         });
     }
 
-    static void verifyDocs(String dataStream, long expectedNumHits, long minGeneration, long maxGeneration) {
-        List<String> expectedIndices = new ArrayList<>();
-        for (long k = minGeneration; k <= maxGeneration; k++) {
-            expectedIndices.add(DataStream.getDefaultBackingIndexName(dataStream, k));
-        }
+    static void verifyDocs(String dataStream, long expectedNumHits) {
+        List<String> expectedIndices = getDataStreamBackingIndexNames(dataStream);
         verifyDocs(dataStream, expectedNumHits, expectedIndices);
     }
 
modules/data-streams/src/internalClusterTest/java/org/elasticsearch/datastreams/DataStreamsSnapshotsIT.java

Lines changed: 5 additions & 5 deletions
@@ -35,8 +35,8 @@
 import org.elasticsearch.action.support.IndicesOptions;
 import org.elasticsearch.action.support.master.AcknowledgedResponse;
 import org.elasticsearch.client.internal.Client;
-import org.elasticsearch.cluster.metadata.DataStream;
 import org.elasticsearch.cluster.metadata.DataStreamAlias;
+import org.elasticsearch.cluster.metadata.DataStreamTestHelper;
 import org.elasticsearch.common.settings.Settings;
 import org.elasticsearch.common.unit.ByteSizeUnit;
 import org.elasticsearch.index.Index;
@@ -301,8 +301,8 @@ public void testSnapshotAndRestoreInPlace() {
         RolloverRequest rolloverRequest = new RolloverRequest("ds", null);
         RolloverResponse rolloverResponse = client.admin().indices().rolloverIndex(rolloverRequest).actionGet();
         assertThat(rolloverResponse.isRolledOver(), is(true));
-        String backingIndexAfterSnapshot = DataStream.getDefaultBackingIndexName("ds", 2);
-        assertThat(rolloverResponse.getNewIndex(), equalTo(backingIndexAfterSnapshot));
+        String backingIndexAfterSnapshot = getDataStreamBackingIndexNames("ds").getFirst();
+        assertThat(rolloverResponse.getNewIndex(), DataStreamTestHelper.backingIndexEqualTo("ds", 2));
 
         // Close all backing indices of ds data stream:
         CloseIndexRequest closeIndexRequest = new CloseIndexRequest(".ds-ds-*");
@@ -337,7 +337,7 @@ public void testSnapshotAndRestoreInPlace() {
         rolloverRequest = new RolloverRequest("ds", null);
         rolloverResponse = client.admin().indices().rolloverIndex(rolloverRequest).actionGet();
         assertThat(rolloverResponse.isRolledOver(), is(true));
-        assertThat(rolloverResponse.getNewIndex(), equalTo(DataStream.getDefaultBackingIndexName("ds", 3)));
+        assertThat(rolloverResponse.getNewIndex(), DataStreamTestHelper.backingIndexEqualTo("ds", 3));
     }
 
     public void testFailureStoreSnapshotAndRestore() {
@@ -915,7 +915,7 @@ public void testDataStreamAndBackingIndicesAreRenamedUsingRegex() {
         List<GetDataStreamAction.Response.DataStreamInfo> dataStreamInfos = getDataStreamInfo("test-ds");
         assertThat(
             dataStreamInfos.get(0).getDataStream().getIndices().get(0).getName(),
-            is(DataStream.getDefaultBackingIndexName("test-ds", 1L))
+            DataStreamTestHelper.backingIndexEqualTo("test-ds", 1)
         );
 
         // data stream "ds" should still exist in the system

plugins/examples/rescore/src/yamlRestTest/resources/rest-api-spec/test/example-rescore/30_factor_field.yml

Lines changed: 18 additions & 0 deletions
@@ -48,6 +48,24 @@ setup:
   - match: { hits.hits.1._score: 20 }
   - match: { hits.hits.2._score: 10 }
 
+---
+"referencing a missing field returns bad request":
+  - requires:
+      cluster_features: [ "search.rescorer.missing.field.bad.request" ]
+      reason: "Testing the behaviour change with this feature"
+  - do:
+      catch: bad_request
+      search:
+        index: test
+        body:
+          rescore:
+            example:
+              factor: 1
+              factor_field: missing
+  - match: { status: 400 }
+  - match: { error.root_cause.0.type: "illegal_argument_exception" }
+  - match: { error.root_cause.0.reason: "Missing value for field [missing]" }
+
 ---
 "sorted based on a numeric field and rescored based on a factor field using a window size":
   - do:
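
For reference, the request this new YAML test exercises looks roughly like the following. It is a sketch only: it assumes the example-rescore plugin from this repository is installed and that the `test` index has no `missing` field. With the change in this commit, the request should be rejected up front with a 400 rather than surfacing a server-side error.

```js
POST /test/_search
{
  "rescore": {
    "example": {
      "factor": 1,
      "factor_field": "missing"
    }
  }
}
```

The expected response carries `status: 400` and an `illegal_argument_exception` root cause whose reason is `Missing value for field [missing]`, matching the assertions above.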

qa/smoke-test-http/src/internalClusterTest/java/org/elasticsearch/http/SearchErrorTraceIT.java

Lines changed: 17 additions & 61 deletions
@@ -127,33 +127,7 @@ public void testSearchFailingQueryErrorTraceFalse() throws IOException {
         assertFalse(hasStackTrace.getAsBoolean());
     }
 
-    public void testDataNodeDoesNotLogStackTraceWhenErrorTraceTrue() throws IOException {
-        setupIndexWithDocs();
-
-        Request searchRequest = new Request("POST", "/_search");
-        searchRequest.setJsonEntity("""
-            {
-                "query": {
-                    "simple_query_string" : {
-                        "query": "foo",
-                        "fields": ["field"]
-                    }
-                }
-            }
-            """);
-
-        String errorTriggeringIndex = "test2";
-        int numShards = getNumShards(errorTriggeringIndex).numPrimaries;
-        try (var mockLog = MockLog.capture(SearchService.class)) {
-            ErrorTraceHelper.addUnseenLoggingExpectations(numShards, mockLog, errorTriggeringIndex);
-
-            searchRequest.addParameter("error_trace", "true");
-            getRestClient().performRequest(searchRequest);
-            mockLog.assertAllExpectationsMatched();
-        }
-    }
-
-    public void testDataNodeLogsStackTraceWhenErrorTraceFalseOrEmpty() throws IOException {
+    public void testDataNodeLogsStackTrace() throws IOException {
         setupIndexWithDocs();
 
         Request searchRequest = new Request("POST", "/_search");
@@ -173,10 +147,14 @@ public void testDataNodeLogsStackTraceWhenErrorTraceFalseOrEmpty() throws IOExce
         try (var mockLog = MockLog.capture(SearchService.class)) {
             ErrorTraceHelper.addSeenLoggingExpectations(numShards, mockLog, errorTriggeringIndex);
 
-            // error_trace defaults to false so we can test both cases with some randomization
-            if (randomBoolean()) {
+            // No matter the value of error_trace (empty, true, or false) we should see stack traces logged
+            int errorTraceValue = randomIntBetween(0, 2);
+            if (errorTraceValue == 0) {
+                searchRequest.addParameter("error_trace", "true");
+            } else if (errorTraceValue == 1) {
                 searchRequest.addParameter("error_trace", "false");
-            }
+            } // else empty
+
             getRestClient().performRequest(searchRequest);
             mockLog.assertAllExpectationsMatched();
         }
@@ -233,32 +211,7 @@ public void testMultiSearchFailingQueryErrorTraceFalse() throws IOException {
         assertFalse(hasStackTrace.getAsBoolean());
     }
 
-    public void testDataNodeDoesNotLogStackTraceWhenErrorTraceTrueMultiSearch() throws IOException {
-        setupIndexWithDocs();
-
-        XContentType contentType = XContentType.JSON;
-        MultiSearchRequest multiSearchRequest = new MultiSearchRequest().add(
-            new SearchRequest("test*").source(new SearchSourceBuilder().query(simpleQueryStringQuery("foo").field("field")))
-        );
-        Request searchRequest = new Request("POST", "/_msearch");
-        byte[] requestBody = MultiSearchRequest.writeMultiLineFormat(multiSearchRequest, contentType.xContent());
-        searchRequest.setEntity(
-            new NByteArrayEntity(requestBody, ContentType.create(contentType.mediaTypeWithoutParameters(), (Charset) null))
-        );
-
-        searchRequest.addParameter("error_trace", "true");
-
-        String errorTriggeringIndex = "test2";
-        int numShards = getNumShards(errorTriggeringIndex).numPrimaries;
-        try (var mockLog = MockLog.capture(SearchService.class)) {
-            ErrorTraceHelper.addUnseenLoggingExpectations(numShards, mockLog, errorTriggeringIndex);
-
-            getRestClient().performRequest(searchRequest);
-            mockLog.assertAllExpectationsMatched();
-        }
-    }
-
-    public void testDataNodeLogsStackTraceWhenErrorTraceFalseOrEmptyMultiSearch() throws IOException {
+    public void testDataNodeLogsStackTraceMultiSearch() throws IOException {
         setupIndexWithDocs();
 
         XContentType contentType = XContentType.JSON;
@@ -271,16 +224,19 @@ public void testDataNodeLogsStackTraceWhenErrorTraceFalseOrEmptyMultiSearch() th
             new NByteArrayEntity(requestBody, ContentType.create(contentType.mediaTypeWithoutParameters(), (Charset) null))
         );
 
-        // error_trace defaults to false so we can test both cases with some randomization
-        if (randomBoolean()) {
-            searchRequest.addParameter("error_trace", "false");
-        }
-
         String errorTriggeringIndex = "test2";
         int numShards = getNumShards(errorTriggeringIndex).numPrimaries;
         try (var mockLog = MockLog.capture(SearchService.class)) {
             ErrorTraceHelper.addSeenLoggingExpectations(numShards, mockLog, errorTriggeringIndex);
 
+            // No matter the value of error_trace (empty, true, or false) we should see stack traces logged
+            int errorTraceValue = randomIntBetween(0, 2);
+            if (errorTraceValue == 0) {
+                searchRequest.addParameter("error_trace", "true");
+            } else if (errorTraceValue == 1) {
+                searchRequest.addParameter("error_trace", "false");
+            } // else empty
+
             getRestClient().performRequest(searchRequest);
             mockLog.assertAllExpectationsMatched();
         }
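
For context, the behavior these tests pin down can be reproduced by hand with a request along these lines against an index set up to fail at the shard level (the index and field names here are illustrative, not taken from this commit). Whichever value `error_trace` takes, or if it is omitted, the data node is expected to log the shard-level stack trace; the parameter only controls whether the stack trace is echoed back in the REST response.

```js
POST /_search?error_trace=true
{
  "query": {
    "simple_query_string": {
      "query": "foo",
      "fields": ["field"]
    }
  }
}
```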
