Commit de3160d

Merge remote-tracking branch 'upstream/main' into structured_source_benchmark
2 parents ee797b3 + ccf9893 commit de3160d


95 files changed

+4001
-2895
lines changed


build-tools/src/main/java/org/elasticsearch/gradle/test/TestBuildInfoPlugin.java

Lines changed: 4 additions & 8 deletions
@@ -57,13 +57,9 @@ public void apply(Project project) {
             task.into("META-INF", copy -> copy.from(testBuildInfoTask));
         });
 
-        if (project.getRootProject().getName().equals("elasticsearch")) {
-            project.getTasks()
-                .withType(Test.class)
-                .matching(test -> List.of("test", "internalClusterTest").contains(test.getName()))
-                .configureEach(test -> {
-                    test.systemProperty("es.entitlement.enableForTests", "true");
-                });
-        }
+        project.getTasks()
+            .withType(Test.class)
+            .matching(test -> List.of("test", "internalClusterTest").contains(test.getName()))
+            .configureEach(test -> test.getSystemProperties().putIfAbsent("es.entitlement.enableForTests", "true"));
     }
 }
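
The hunk above replaces an unconditional `systemProperty(...)` call with `getSystemProperties().putIfAbsent(...)`. A minimal sketch of the difference, using a plain `java.util.Map` as a stand-in for the task's system-property map: the class and method names below are hypothetical and are not part of the commit, and the sketch says nothing about the order in which Gradle runs configuration actions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: a plain Map stands in for Test#getSystemProperties().
class EntitlementDefaultSketch {
    static final String KEY = "es.entitlement.enableForTests";

    // Old behaviour in the diff: the value is always set to "true",
    // replacing anything that was configured earlier.
    static Map<String, Object> applyUnconditionally(Map<String, Object> existing) {
        Map<String, Object> props = new LinkedHashMap<>(existing);
        props.put(KEY, "true");
        return props;
    }

    // New behaviour in the diff: "true" is only a default,
    // so a value that is already present is left untouched.
    static Map<String, Object> applyAsDefault(Map<String, Object> existing) {
        Map<String, Object> props = new LinkedHashMap<>(existing);
        props.putIfAbsent(KEY, "true");
        return props;
    }

    public static void main(String[] args) {
        Map<String, Object> optedOut = Map.of(KEY, "false");
        System.out.println(applyUnconditionally(optedOut).get(KEY)); // true
        System.out.println(applyAsDefault(optedOut).get(KEY));       // false
        System.out.println(applyAsDefault(Map.of()).get(KEY));       // true
    }
}
```

With `putIfAbsent`, a value that is already in the map when the plugin's action runs (such as the `'false'` set in `plugins/examples/build.gradle` later in this commit) is not replaced by the plugin default.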

docs/changelog/113949.yaml

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+pr: 113949
+summary: Support kNN filter on nested metadata
+area: Vector Search
+type: enhancement
+issues:
+ - 128803
+ - 106994

docs/changelog/131517.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 131517
+summary: Refresh potential lost connections at query start for field caps
+area: Search
+type: enhancement
+issues: []

docs/changelog/131937.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 131937
+summary: Fix race condition in `RemoteClusterService.collectNodes()`
+area: Distributed
+type: bug
+issues: []

docs/reference/elasticsearch/mapping-reference/semantic-text.md

Lines changed: 39 additions & 0 deletions
@@ -37,6 +37,14 @@ the embedding generation, indexing, and query to use.
 [quantized](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization)
 to `bbq_hnsw` automatically.
 
+## Default and custom endpoints
+
+You can use either preconfigured endpoints in your `semantic_text` fields which
+are ideal for most use cases or create custom endpoints and reference them in
+the field mappings.
+
+### Using the default ELSER endpoint
+
 If you use the preconfigured `.elser-2-elasticsearch` endpoint, you can set up
 `semantic_text` with the following API request:
 
@@ -53,6 +61,8 @@ PUT my-index-000001
 }
 ```
 
+### Using a custom endpoint
+
 To use a custom {{infer}} endpoint instead of the default
 `.elser-2-elasticsearch`, you
 must [Create {{infer}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put)
@@ -96,6 +106,35 @@ PUT my-index-000003
 }
 ```
 
+### Using ELSER on EIS
+
+```{applies_to}
+stack: preview 9.1
+serverless: preview
+```
+
+If you use the preconfigured `.elser-2-elastic` endpoint that utilizes the ELSER model as a service through the Elastic Inference Service ([ELSER on EIS](docs-content://explore-analyze/elastic-inference/eis.md#elser-on-eis)), you can
+set up `semantic_text` with the following API request:
+
+```console
+PUT my-index-000001
+{
+  "mappings": {
+    "properties": {
+      "inference_field": {
+        "type": "semantic_text",
+        "inference_id": ".elser-2-elastic"
+      }
+    }
+  }
+}
+```
+
+::::{note}
+While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.
+
+::::
+
 ## Parameters for `semantic_text` fields [semantic-text-params]
 
 `inference_id`

docs/reference/query-languages/esql/_snippets/commands/layout/completion.md

Lines changed: 52 additions & 6 deletions
@@ -9,10 +9,26 @@ The `COMPLETION` command allows you to send prompts and context to a Large Langu
 
 **Syntax**
 
+::::{tab-set}
+
+:::{tab-item} 9.2.0+
+
 ```esql
-COMPLETION [column =] prompt WITH inference_id
+COMPLETION [column =] prompt WITH { "inference_id" : "my_inference_endpoint" }
 ```
 
+:::
+
+:::{tab-item} 9.1.x only
+
+```esql
+COMPLETION [column =] prompt WITH my_inference_endpoint
+```
+
+:::
+
+::::
+
 **Parameters**
 
 `column`
@@ -24,7 +40,7 @@ COMPLETION [column =] prompt WITH inference_id
 : The input text or expression used to prompt the LLM.
   This can be a string literal or a reference to a column containing text.
 
-`inference_id`
+`my_inference_endpoint`
 : The ID of the [inference endpoint](docs-content://explore-analyze/elastic-inference/inference-api.md) to use for the task.
   The inference endpoint must be configured with the `completion` task type.
 
@@ -46,16 +62,46 @@ including:
 **Requirements**
 
 To use this command, you must deploy your LLM model in Elasticsearch as
-an [inference endpoint](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put) with the
+an [inference endpoint](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put) with the
 task type `completion`.
 
+#### Handling timeouts
+
+`COMPLETION` commands may time out when processing large datasets or complex prompts. The default timeout is 10 minutes, but you can increase this limit if necessary.
+
+How you increase the timeout depends on your deployment type:
+
+::::{tab-set}
+:::{tab-item} {{ech}}
+* You can adjust {{es}} settings in the [Elastic Cloud Console](docs-content://deploy-manage/deploy/elastic-cloud/edit-stack-settings.md)
+* You can also adjust the `search.default_search_timeout` cluster setting using [Kibana's Advanced settings](kibana://reference/advanced-settings.md#kibana-search-settings)
+:::
+
+:::{tab-item} Self-managed
+* You can configure at the cluster level by setting `search.default_search_timeout` in `elasticsearch.yml` or updating via [Cluster Settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings)
+* You can also adjust the `search:timeout` setting using [Kibana's Advanced settings](kibana://reference/advanced-settings.md#kibana-search-settings)
+* Alternatively, you can add timeout parameters to individual queries
+:::
+
+:::{tab-item} {{serverless-full}}
+* Requires a manual override from Elastic Support because you cannot modify timeout settings directly
+:::
+::::
+
+If you don't want to increase the timeout limit, try the following:
+
+* Reduce data volume with `LIMIT` or more selective filters before the `COMPLETION` command
+* Split complex operations into multiple simpler queries
+* Configure your HTTP client's response timeout (Refer to [HTTP client configuration](/reference/elasticsearch/configuration-reference/networking-settings.md#_http_client_configuration))
+
+
 **Examples**
 
 Use the default column name (results stored in `completion` column):
 
 ```esql
 ROW question = "What is Elasticsearch?"
-| COMPLETION question WITH test_completion_model
+| COMPLETION question WITH { "inference_id" : "my_inference_endpoint" }
 | KEEP question, completion
 ```
 
@@ -67,7 +113,7 @@ Specify the output column (results stored in `answer` column):
 
 ```esql
 ROW question = "What is Elasticsearch?"
-| COMPLETION answer = question WITH test_completion_model
+| COMPLETION answer = question WITH { "inference_id" : "my_inference_endpoint" }
 | KEEP question, answer
 ```
 
@@ -87,7 +133,7 @@ FROM movies
    "Synopsis: ", synopsis, "\n",
    "Actors: ", MV_CONCAT(actors, ", "), "\n",
   )
-| COMPLETION summary = prompt WITH test_completion_model
+| COMPLETION summary = prompt WITH { "inference_id" : "my_inference_endpoint" }
 | KEEP title, summary, rating
 ```
 
docs/reference/query-languages/query-dsl/query-dsl-knn-query.md

Lines changed: 51 additions & 8 deletions
@@ -203,10 +203,19 @@ POST my-image-index/_search
 `knn` query can be used inside a nested query. The behaviour here is similar to [top level nested kNN search](docs-content://solutions/search/vector/knn.md#nested-knn-search):
 
 * kNN search over nested dense_vectors diversifies the top results over the top-level document
-* `filter` over the top-level document metadata is supported and acts as a pre-filter
-* `filter` over `nested` field metadata is not supported
+* `filter` both over the top-level document metadata and `nested` is supported and acts as a pre-filter
+
+::::{note}
+To ensure correct results: each individual filter must be either over
+the top-level metadata or `nested` metadata. However, a single knn query
+supports multiple filters, where some filters can be over the top-level
+metadata and some over nested.
+::::
 
-A sample query can look like below:
+
+Below is a sample query with filter over nested metadata.
+For scoring parents' documents, this query only considers vectors that
+have "paragraph.language" set to "EN".
 
 ```json
 {
@@ -215,12 +224,46 @@ A sample query can look like below:
       "path" : "paragraph",
       "query" : {
         "knn": {
-          "query_vector": [
-            0.45,
-            45
-          ],
+          "query_vector": [0.45, 0.50],
           "field": "paragraph.vector",
-          "num_candidates": 2
+          "filter": {
+            "match": {
+              "paragraph.language": "EN"
+            }
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+Below is a sample query with two filters: one over nested metadata
+and another over the top level metadata. For scoring parents' documents,
+this query only considers vectors whose parent's title contain "essay"
+word and have "paragraph.language" set to "EN".
+
+```json
+{
+  "query" : {
+    "nested" : {
+      "path" : "paragraph",
+      "query" : {
+        "knn": {
+          "query_vector": [0.45, 0.50],
+          "field": "paragraph.vector",
+          "filter": [
+            {
+              "match": {
+                "paragraph.language": "EN"
+              }
+            },
+            {
+              "match": {
+                "title": "essay"
+              }
+            }
+          ]
         }
       }
     }
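
The documentation change above says that a single `knn` query inside `nested` accepts several filters, each one over either top-level or nested metadata. As a rough sketch of how a caller might assemble such a request body, the following builds the JSON by hand. `NestedKnnFilterSketch`, `MatchFilter`, and `nestedKnnQuery` are hypothetical names, not an Elasticsearch client API, and the string values are not JSON-escaped, so this is for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds the nested kNN request body shown in the docs as a JSON string.
class NestedKnnFilterSketch {

    // One match filter. Per the note in the docs, each filter targets either
    // a top-level field (e.g. "title") or a nested field (e.g. "paragraph.language").
    record MatchFilter(String field, String value) {
        String toJson() {
            // No JSON escaping: assumes field and value contain no quotes or backslashes.
            return "{\"match\":{\"" + field + "\":\"" + value + "\"}}";
        }
    }

    static String nestedKnnQuery(String path, String vectorField, double[] queryVector, List<MatchFilter> filters) {
        String vector = Arrays.stream(queryVector)
            .mapToObj(Double::toString)
            .collect(Collectors.joining(",", "[", "]"));
        // Multiple filters are sent as a JSON array under "filter".
        String filterJson = filters.stream()
            .map(MatchFilter::toJson)
            .collect(Collectors.joining(",", "[", "]"));
        return "{\"query\":{\"nested\":{\"path\":\"" + path + "\",\"query\":{\"knn\":{"
            + "\"query_vector\":" + vector + ","
            + "\"field\":\"" + vectorField + "\","
            + "\"filter\":" + filterJson
            + "}}}}}";
    }

    public static void main(String[] args) {
        String body = nestedKnnQuery(
            "paragraph",
            "paragraph.vector",
            new double[] { 0.45, 0.50 },
            List.of(
                new MatchFilter("paragraph.language", "EN"), // filter over nested metadata
                new MatchFilter("title", "essay")            // filter over top-level metadata
            )
        );
        System.out.println(body);
    }
}
```

Keeping each `MatchFilter` tied to a single field mirrors the documented constraint that an individual filter must not mix top-level and nested metadata.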

muted-tests.yml

Lines changed: 34 additions & 18 deletions
@@ -407,9 +407,6 @@ tests:
 - class: org.elasticsearch.xpack.esql.analysis.VerifierTests
   method: testMatchInsideEval
   issue: https://github.com/elastic/elasticsearch/issues/131336
-- class: org.elasticsearch.packaging.test.DockerTests
-  method: test022InstallPluginsFromLocalArchive
-  issue: https://github.com/elastic/elasticsearch/issues/116866
 - class: org.elasticsearch.packaging.test.DockerTests
   method: test071BindMountCustomPathWithDifferentUID
   issue: https://github.com/elastic/elasticsearch/issues/120917
@@ -479,21 +476,9 @@ tests:
 - class: org.elasticsearch.compute.lucene.read.SortedSetOrdinalsBuilderTests
   method: testReader
   issue: https://github.com/elastic/elasticsearch/issues/131573
-- class: org.elasticsearch.search.SearchWithIndexBlocksIT
-  method: testSearchShardsOnIndicesWithIndexRefreshBlocks
-  issue: https://github.com/elastic/elasticsearch/issues/131662
-- class: org.elasticsearch.search.SearchWithIndexBlocksIT
-  method: testSearchIndicesWithIndexRefreshBlocks
-  issue: https://github.com/elastic/elasticsearch/issues/131693
-- class: org.elasticsearch.search.SearchWithIndexBlocksIT
-  method: testOpenPITOnIndicesWithIndexRefreshBlocks
-  issue: https://github.com/elastic/elasticsearch/issues/131695
 - class: org.elasticsearch.xpack.esql.ccq.MultiClustersIT
   method: testLookupJoinAliasesSkipOld
   issue: https://github.com/elastic/elasticsearch/issues/131697
-- class: org.elasticsearch.search.SearchWithIndexBlocksIT
-  method: testMultiSearchIndicesWithIndexRefreshBlocks
-  issue: https://github.com/elastic/elasticsearch/issues/131698
 - class: org.elasticsearch.indices.cluster.RemoteSearchForceConnectTimeoutIT
   method: testTimeoutSetting
   issue: https://github.com/elastic/elasticsearch/issues/131656
@@ -515,9 +500,6 @@ tests:
 - class: org.elasticsearch.test.rest.yaml.RcsCcsCommonYamlTestSuiteIT
   method: test {p0=search/600_flattened_ignore_above/flattened ignore_above multi-value field}
   issue: https://github.com/elastic/elasticsearch/issues/131967
-- class: org.elasticsearch.search.routing.SearchReplicaSelectionIT
-  method: testNodeSelection
-  issue: https://github.com/elastic/elasticsearch/issues/132017
 - class: org.elasticsearch.xpack.remotecluster.CrossClusterEsqlRCS1EnrichUnavailableRemotesIT
   method: testEsqlEnrichWithSkipUnavailable
   issue: https://github.com/elastic/elasticsearch/issues/132078
@@ -599,6 +581,40 @@ tests:
 - class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
   method: testDenseVectorMappingUpdate {initialType=int8_flat updateType=bbq_hnsw}
   issue: https://github.com/elastic/elasticsearch/issues/132141
+- class: org.elasticsearch.index.engine.MergeWithLowDiskSpaceIT
+  method: testRelocationWhileForceMerging
+  issue: https://github.com/elastic/elasticsearch/issues/131789
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: testDenseVectorMappingUpdate {initialType=flat updateType=int4_hnsw}
+  issue: https://github.com/elastic/elasticsearch/issues/132149
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: testDenseVectorMappingUpdate {initialType=int4_flat updateType=hnsw}
+  issue: https://github.com/elastic/elasticsearch/issues/132150
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: testDenseVectorMappingUpdate {initialType=int8_flat updateType=bbq_flat}
+  issue: https://github.com/elastic/elasticsearch/issues/132151
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: "testDenseVectorMappingUpdate {initialType=bbq_hnsw updateType=bbq_disk #2}"
+  issue: https://github.com/elastic/elasticsearch/issues/132152
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: testDenseVectorMappingUpdate {initialType=hnsw updateType=int8_hnsw}
+  issue: https://github.com/elastic/elasticsearch/issues/132164
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: testDenseVectorMappingUpdate {initialType=int4_hnsw updateType=bbq_disk}
+  issue: https://github.com/elastic/elasticsearch/issues/132165
+- class: org.elasticsearch.indices.cluster.FieldCapsForceConnectTimeoutIT
+  method: testTimeoutSetting
+  issue: https://github.com/elastic/elasticsearch/issues/132179
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: "testDenseVectorMappingUpdate {initialType=bbq_flat updateType=bbq_disk #2}"
+  issue: https://github.com/elastic/elasticsearch/issues/132184
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: testDenseVectorMappingUpdate {initialType=bbq_hnsw updateType=bbq_disk}
+  issue: https://github.com/elastic/elasticsearch/issues/132188
+- class: org.elasticsearch.index.mapper.vectors.DenseVectorFieldIndexTypeUpdateIT
+  method: "testDenseVectorMappingUpdate {initialType=int8_flat updateType=bbq_disk #2}"
+  issue: https://github.com/elastic/elasticsearch/issues/132189
+
 
 # Examples:
 #

plugins/examples/build.gradle

Lines changed: 5 additions & 0 deletions
@@ -20,6 +20,11 @@ subprojects {
     targetCompatibility = 21
   }
 
+  test {
+    // testing with entitlements doesn't work for example plugins ES-12453
+    systemProperty 'es.entitlement.enableForTests', 'false'
+  }
+
   repositories {
     // Only necessary when building plugins against SNAPSHOT versions of Elasticsearch
     if (gradle.includedBuilds.isEmpty()) {

qa/multi-data-path/build.gradle

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+apply plugin: 'elasticsearch.internal-yaml-rest-test'
+
+// This subproject verifies MDP continues to work with entitlements.
+
+restResources {
+  restApi {
+    include '_common', 'capabilities', 'index', 'indices', 'indices.create'
+  }
+}
+