Commit 9d3a44f

Merge branch 'main' into add_esql_hash
2 parents: 8d76d16 + d411ad8

File tree

154 files changed: +7596, -1064 lines changed


docs/changelog/117589.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 117589
+summary: "Add Inference Unified API for chat completions for OpenAI"
+area: Machine Learning
+type: enhancement
+issues: []

docs/changelog/117657.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 117657
+summary: Ignore cancellation exceptions
+area: ES|QL
+type: bug
+issues: []

docs/changelog/118064.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 118064
+summary: Add Highlighter for Semantic Text Fields
+area: Highlighting
+type: feature
+issues: []

docs/plugins/analysis-kuromoji.asciidoc

Lines changed: 36 additions & 0 deletions
@@ -750,3 +750,39 @@ Which results in:
 ]
 }
 --------------------------------------------------
+
+[[analysis-kuromoji-completion]]
+==== `kuromoji_completion` token filter
+
+The `kuromoji_completion` token filter adds Japanese romanized tokens to the term attributes along with the original tokens (surface forms).
+
+[source,console]
+--------------------------------------------------
+GET _analyze
+{
+  "analyzer": "kuromoji_completion",
+  "text": "寿司" <1>
+}
+--------------------------------------------------
+
+<1> Returns `寿司`, `susi` (Kunrei-shiki) and `sushi` (Hepburn-shiki).
+
+The `kuromoji_completion` token filter accepts the following settings:
+
+`mode`::
++
+--
+
+The tokenization mode determines how the tokenizer handles compound and
+unknown words. It can be set to:
+
+`index`::
+
+Simple romanization. Expected to be used when indexing.
+
+`query`::
+
+Input Method aware romanization. Expected to be used when querying.
+
+Defaults to `index`.
+--
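
The diff above documents the `mode` setting but does not show it in use. The following is a hedged sketch of wiring a `kuromoji_completion` filter with `mode: query` into a custom analyzer; the index name `kuromoji_sample`, analyzer name `completion_query_analyzer`, and filter name `my_romaji_filter` are illustrative, and the choice of `kuromoji_tokenizer` as the tokenizer is an assumption rather than something this commit prescribes:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "completion_query_analyzer": {
            "type": "custom",
            "tokenizer": "kuromoji_tokenizer",
            "filter": [ "my_romaji_filter" ]
          }
        },
        "filter": {
          "my_romaji_filter": {
            "type": "kuromoji_completion",
            "mode": "query"
          }
        }
      }
    }
  }
}
```

Per the setting descriptions above, a query-side analyzer would use `mode: query` for Input-Method-aware romanization, while an index-side analyzer would omit `mode` to get the `index` default.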

docs/reference/mapping/types/semantic-text.asciidoc

Lines changed: 24 additions & 31 deletions
@@ -112,50 +112,43 @@ Trying to <<delete-inference-api,delete an {infer} endpoint>> that is used on a
 {infer-cap} endpoints have a limit on the amount of text they can process.
 To allow for large amounts of text to be used in semantic search, `semantic_text` automatically generates smaller passages if needed, called _chunks_.
 
-Each chunk will include the text subpassage and the corresponding embedding generated from it.
+Each chunk refers to a passage of the text and the corresponding embedding generated from it.
 When querying, the individual passages will be automatically searched for each document, and the most relevant passage will be used to compute a score.
 
 For more details on chunking and how to configure chunking settings, see <<infer-chunking-config, Configuring chunking>> in the Inference API documentation.
 
+Refer to <<semantic-search-semantic-text,this tutorial>> to learn more about
+semantic search using `semantic_text` and the `semantic` query.
 
 [discrete]
-[[semantic-text-structure]]
-==== `semantic_text` structure
+[[semantic-text-highlighting]]
+==== Extracting Relevant Fragments from Semantic Text
 
-Once a document is ingested, a `semantic_text` field will have the following structure:
+You can extract the most relevant fragments from a semantic text field by using the <<highlighting,highlight parameter>> in the <<search-search-api-request-body,Search API>>.
 
-[source,console-result]
+[source,console]
 ------------------------------------------------------------
-"inference_field": {
-  "text": "these are not the droids you're looking for", <1>
-  "inference": {
-    "inference_id": "my-elser-endpoint", <2>
-    "model_settings": { <3>
-      "task_type": "sparse_embedding"
+POST test-index/_search
+{
+  "query": {
+    "semantic": {
+      "field": "my_semantic_field"
+    }
   },
-  "chunks": [ <4>
-    {
-      "text": "these are not the droids you're looking for",
-      "embeddings": {
-        (...)
+  "highlight": {
+    "fields": {
+      "my_semantic_field": {
+        "type": "semantic",
+        "number_of_fragments": 2, <1>
+        "order": "score" <2>
+      }
     }
-      }
-    }
-  ]
-}
+  }
 }
 ------------------------------------------------------------
-// TEST[skip:TBD]
-<1> The field will become an object structure to accommodate both the original
-text and the inference results.
-<2> The `inference_id` used to generate the embeddings.
-<3> Model settings, including the task type and dimensions/similarity if
-applicable.
-<4> Inference results will be grouped in chunks, each with its corresponding
-text and embeddings.
-
-Refer to <<semantic-search-semantic-text,this tutorial>> to learn more about
-semantic search using `semantic_text` and the `semantic` query.
-
+// TEST[skip:Requires inference endpoint]
+<1> Specifies the maximum number of fragments to return.
+<2> Sorts highlighted fragments by score when set to `score`. By default, fragments are output in the order in which they appear in the field (`order`: `none`).
 
 [discrete]
 [[custom-indexing]]

modules/repository-s3/src/javaRestTest/java/org/elasticsearch/repositories/s3/AbstractRepositoryS3RestTestCase.java

Lines changed: 15 additions & 17 deletions
@@ -19,6 +19,7 @@
 import org.elasticsearch.rest.RestStatus;
 import org.elasticsearch.test.ESTestCase;
 import org.elasticsearch.test.rest.ESRestTestCase;
+import org.elasticsearch.test.rest.ObjectPath;
 
 import java.io.Closeable;
 import java.io.IOException;
@@ -27,7 +28,6 @@
 import java.util.function.UnaryOperator;
 import java.util.stream.Collectors;
 
-import static org.hamcrest.Matchers.allOf;
 import static org.hamcrest.Matchers.containsString;
 import static org.hamcrest.Matchers.equalTo;
 
@@ -152,10 +152,9 @@ private void testNonexistentBucket(Boolean readonly) throws Exception {
 
         final var responseException = expectThrows(ResponseException.class, () -> client().performRequest(registerRequest));
         assertEquals(RestStatus.INTERNAL_SERVER_ERROR.getStatus(), responseException.getResponse().getStatusLine().getStatusCode());
-        assertThat(
-            responseException.getMessage(),
-            allOf(containsString("repository_verification_exception"), containsString("is not accessible on master node"))
-        );
+        final var responseObjectPath = ObjectPath.createFromResponse(responseException.getResponse());
+        assertThat(responseObjectPath.evaluate("error.type"), equalTo("repository_verification_exception"));
+        assertThat(responseObjectPath.evaluate("error.reason"), containsString("is not accessible on master node"));
     }
 
     public void testNonexistentClient() throws Exception {
@@ -181,15 +180,11 @@ private void testNonexistentClient(Boolean readonly) throws Exception {
 
         final var responseException = expectThrows(ResponseException.class, () -> client().performRequest(registerRequest));
         assertEquals(RestStatus.INTERNAL_SERVER_ERROR.getStatus(), responseException.getResponse().getStatusLine().getStatusCode());
-        assertThat(
-            responseException.getMessage(),
-            allOf(
-                containsString("repository_verification_exception"),
-                containsString("is not accessible on master node"),
-                containsString("illegal_argument_exception"),
-                containsString("Unknown s3 client name")
-            )
-        );
+        final var responseObjectPath = ObjectPath.createFromResponse(responseException.getResponse());
+        assertThat(responseObjectPath.evaluate("error.type"), equalTo("repository_verification_exception"));
+        assertThat(responseObjectPath.evaluate("error.reason"), containsString("is not accessible on master node"));
+        assertThat(responseObjectPath.evaluate("error.caused_by.type"), equalTo("illegal_argument_exception"));
+        assertThat(responseObjectPath.evaluate("error.caused_by.reason"), containsString("Unknown s3 client name"));
     }
 
     public void testNonexistentSnapshot() throws Exception {
@@ -212,21 +207,24 @@ private void testNonexistentSnapshot(Boolean readonly) throws Exception {
         final var getSnapshotRequest = new Request("GET", "/_snapshot/" + repositoryName + "/" + randomIdentifier());
         final var getSnapshotException = expectThrows(ResponseException.class, () -> client().performRequest(getSnapshotRequest));
         assertEquals(RestStatus.NOT_FOUND.getStatus(), getSnapshotException.getResponse().getStatusLine().getStatusCode());
-        assertThat(getSnapshotException.getMessage(), containsString("snapshot_missing_exception"));
+        final var getResponseObjectPath = ObjectPath.createFromResponse(getSnapshotException.getResponse());
+        assertThat(getResponseObjectPath.evaluate("error.type"), equalTo("snapshot_missing_exception"));
 
         final var restoreRequest = new Request("POST", "/_snapshot/" + repositoryName + "/" + randomIdentifier() + "/_restore");
         if (randomBoolean()) {
             restoreRequest.addParameter("wait_for_completion", Boolean.toString(randomBoolean()));
         }
         final var restoreException = expectThrows(ResponseException.class, () -> client().performRequest(restoreRequest));
         assertEquals(RestStatus.INTERNAL_SERVER_ERROR.getStatus(), restoreException.getResponse().getStatusLine().getStatusCode());
-        assertThat(restoreException.getMessage(), containsString("snapshot_restore_exception"));
+        final var restoreResponseObjectPath = ObjectPath.createFromResponse(restoreException.getResponse());
+        assertThat(restoreResponseObjectPath.evaluate("error.type"), equalTo("snapshot_restore_exception"));
 
         if (readonly != Boolean.TRUE) {
             final var deleteRequest = new Request("DELETE", "/_snapshot/" + repositoryName + "/" + randomIdentifier());
             final var deleteException = expectThrows(ResponseException.class, () -> client().performRequest(deleteRequest));
             assertEquals(RestStatus.NOT_FOUND.getStatus(), deleteException.getResponse().getStatusLine().getStatusCode());
-            assertThat(deleteException.getMessage(), containsString("snapshot_missing_exception"));
+            final var deleteResponseObjectPath = ObjectPath.createFromResponse(deleteException.getResponse());
+            assertThat(deleteResponseObjectPath.evaluate("error.type"), equalTo("snapshot_missing_exception"));
         }
     }
 }

muted-tests.yml

Lines changed: 40 additions & 6 deletions
@@ -117,9 +117,6 @@ tests:
 - class: org.elasticsearch.xpack.deprecation.DeprecationHttpIT
   method: testDeprecatedSettingsReturnWarnings
   issue: https://github.com/elastic/elasticsearch/issues/108628
-- class: org.elasticsearch.action.search.SearchQueryThenFetchAsyncActionTests
-  method: testBottomFieldSort
-  issue: https://github.com/elastic/elasticsearch/issues/116249
 - class: org.elasticsearch.xpack.shutdown.NodeShutdownIT
   method: testAllocationPreventedForRemoval
   issue: https://github.com/elastic/elasticsearch/issues/116363
@@ -242,12 +239,12 @@ tests:
 - class: org.elasticsearch.packaging.test.ConfigurationTests
   method: test30SymlinkedDataPath
   issue: https://github.com/elastic/elasticsearch/issues/118111
-- class: org.elasticsearch.datastreams.ResolveClusterDataStreamIT
-  method: testClusterResolveWithDataStreamsUsingAlias
-  issue: https://github.com/elastic/elasticsearch/issues/118124
 - class: org.elasticsearch.packaging.test.KeystoreManagementTests
   method: test30KeystorePasswordFromFile
   issue: https://github.com/elastic/elasticsearch/issues/118123
+- class: org.elasticsearch.packaging.test.KeystoreManagementTests
+  method: test31WrongKeystorePasswordFromFile
+  issue: https://github.com/elastic/elasticsearch/issues/118123
 - class: org.elasticsearch.packaging.test.ArchiveTests
   method: test41AutoconfigurationNotTriggeredWhenNodeCannotContainData
   issue: https://github.com/elastic/elasticsearch/issues/118110
@@ -260,6 +257,43 @@ tests:
 - class: org.elasticsearch.xpack.remotecluster.CrossClusterEsqlRCS2UnavailableRemotesIT
   method: testEsqlRcs2UnavailableRemoteScenarios
   issue: https://github.com/elastic/elasticsearch/issues/117419
+- class: org.elasticsearch.packaging.test.DebPreservationTests
+  method: test40RestartOnUpgrade
+  issue: https://github.com/elastic/elasticsearch/issues/118170
+- class: org.elasticsearch.xpack.inference.DefaultEndPointsIT
+  method: testInferDeploysDefaultRerank
+  issue: https://github.com/elastic/elasticsearch/issues/118184
+- class: org.elasticsearch.xpack.esql.action.EsqlActionTaskIT
+  method: testCancelRequestWhenFailingFetchingPages
+  issue: https://github.com/elastic/elasticsearch/issues/118193
+- class: org.elasticsearch.packaging.test.MemoryLockingTests
+  method: test20MemoryLockingEnabled
+  issue: https://github.com/elastic/elasticsearch/issues/118195
+- class: org.elasticsearch.packaging.test.ArchiveTests
+  method: test42AutoconfigurationNotTriggeredWhenNodeCannotBecomeMaster
+  issue: https://github.com/elastic/elasticsearch/issues/118196
+- class: org.elasticsearch.packaging.test.ArchiveTests
+  method: test43AutoconfigurationNotTriggeredWhenTlsAlreadyConfigured
+  issue: https://github.com/elastic/elasticsearch/issues/118202
+- class: org.elasticsearch.packaging.test.ArchiveTests
+  method: test44AutoConfigurationNotTriggeredOnNotWriteableConfDir
+  issue: https://github.com/elastic/elasticsearch/issues/118208
+- class: org.elasticsearch.packaging.test.ArchiveTests
+  method: test51AutoConfigurationWithPasswordProtectedKeystore
+  issue: https://github.com/elastic/elasticsearch/issues/118212
+- class: org.elasticsearch.xpack.inference.InferenceCrudIT
+  method: testUnifiedCompletionInference
+  issue: https://github.com/elastic/elasticsearch/issues/118210
+- class: org.elasticsearch.ingest.common.IngestCommonClientYamlTestSuiteIT
+  issue: https://github.com/elastic/elasticsearch/issues/118215
+- class: org.elasticsearch.datastreams.DataStreamsClientYamlTestSuiteIT
+  method: test {p0=data_stream/120_data_streams_stats/Multiple data stream}
+  issue: https://github.com/elastic/elasticsearch/issues/118217
+- class: org.elasticsearch.xpack.security.operator.OperatorPrivilegesIT
+  method: testEveryActionIsEitherOperatorOnlyOrNonOperator
+  issue: https://github.com/elastic/elasticsearch/issues/118220
+- class: org.elasticsearch.validation.DotPrefixClientYamlTestSuiteIT
+  issue: https://github.com/elastic/elasticsearch/issues/118224
 
 # Examples:
 #
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+{
+  "migrate.reindex":{
+    "documentation":{
+      "url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/data-stream-reindex.html",
+      "description":"This API reindexes all legacy backing indices for a data stream. It does this in a persistent task. The persistent task id is returned immediately, and the reindexing work is completed in that task"
+    },
+    "stability":"experimental",
+    "visibility":"private",
+    "headers":{
+      "accept": [ "application/json"],
+      "content_type": ["application/json"]
+    },
+    "url":{
+      "paths":[
+        {
+          "path":"/_migration/reindex",
+          "methods":[
+            "POST"
+          ]
+        }
+      ]
+    },
+    "body":{
+      "description":"The body contains the fields `mode` and `source.index`, where the only mode currently supported is `upgrade`, and the `source.index` must be a data stream name",
+      "required":true
+    }
+  }
+}
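
Given the body description in this spec (a `mode` field, currently only `upgrade`, and a `source.index` field naming a data stream), a request to this endpoint would look roughly like the following sketch; the data stream name `my-data-stream` is purely illustrative:

```console
POST /_migration/reindex
{
  "mode": "upgrade",
  "source": {
    "index": "my-data-stream"
  }
}
```

Per the documentation description above, the call returns a persistent task id immediately, and the reindexing work completes asynchronously in that task.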

server/src/main/java/org/elasticsearch/common/xcontent/ChunkedToXContentHelper.java

Lines changed: 9 additions & 0 deletions
@@ -12,6 +12,7 @@
 import org.elasticsearch.common.collect.Iterators;
 import org.elasticsearch.xcontent.ToXContent;
 
+import java.util.Collections;
 import java.util.Iterator;
 
 public enum ChunkedToXContentHelper {
@@ -53,6 +54,14 @@ public static Iterator<ToXContent> field(String name, String value) {
         return Iterators.single(((builder, params) -> builder.field(name, value)));
     }
 
+    public static Iterator<ToXContent> optionalField(String name, String value) {
+        if (value == null) {
+            return Collections.emptyIterator();
+        } else {
+            return field(name, value);
+        }
+    }
+
     /**
      * Creates an Iterator of a single ToXContent object that serializes the given object as a single chunk. Just wraps {@link
      * Iterators#single}, but still useful because it avoids any type ambiguity.

server/src/main/java/org/elasticsearch/inference/InferenceService.java

Lines changed: 17 additions & 0 deletions
@@ -112,6 +112,23 @@ void infer(
     );
 
     /**
+     * Perform completion inference on the model using the unified schema.
+     *
+     * @param model The model
+     * @param request Parameters for the request
+     * @param timeout The timeout for the request
+     * @param listener Inference result listener
+     */
+    void unifiedCompletionInfer(
+        Model model,
+        UnifiedCompletionRequest request,
+        TimeValue timeout,
+        ActionListener<InferenceServiceResults> listener
+    );
+
+    /**
+     * Chunk long text.
+     *
      * @param model The model
      * @param query Inference query, mainly for re-ranking
      * @param input Inference input

0 commit comments
