Commit 0e09b67

Merge remote-tracking branch 'origin/main' into ai21-chat-completion
# Conflicts:
#   server/src/main/java/org/elasticsearch/TransportVersions.java
2 parents 2e4b869 + 9ad04ff commit 0e09b67

82 files changed (+1974 -465 lines changed)

docs/changelog/130847.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 130847
+summary: "Pipelines: Add `created_date` and `modified_date`"
+area: Ingest Node
+type: enhancement
+issues: []

docs/changelog/131027.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 131027
+summary: Handle structured log messages
+area: Ingest Node
+type: feature
+issues:
+- 130333

docs/changelog/131658.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 131658
+summary: Fix `aggregate_metric_double` sorting and `mv_expand` issues
+area: ES|QL
+type: bug
+issues: []

docs/reference/elasticsearch/configuration-reference/security-settings.md

Lines changed: 3 additions & 3 deletions
@@ -1933,7 +1933,7 @@ You can configure the following TLS/SSL settings.
 `xpack.security.transport.ssl.trust_restrictions.x509_fields` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on Elastic Cloud Hosted")
 : Specifies which field(s) from the TLS certificate is used to match for the restricted trust management that is used for remote clusters connections. This should only be set when a self managed cluster can not create certificates that follow the Elastic Cloud pattern. The default value is ["subjectAltName.otherName.commonName"], the Elastic Cloud pattern. "subjectAltName.dnsName" is also supported and can be configured in addition to or in replacement of the default.
 
-`xpack.security.transport.ssl.handshake_timeout`
+`xpack.security.transport.ssl.handshake_timeout` {applies_to}`stack: ga 9.2`
 : Specifies the timeout for a TLS handshake when opening a transport connection. Defaults to `10s`.
 
 ### Transport TLS/SSL key and trusted certificate settings [security-transport-tls-ssl-key-trusted-certificate-settings]
@@ -2133,7 +2133,7 @@ You can configure the following TLS/SSL settings.
 
 For more information, see Oracle’s [Java Cryptography Architecture documentation](https://docs.oracle.com/en/java/javase/11/security/java-cryptography-architecture-jca-reference-guide.html).
 
-`xpack.security.remote_cluster_server.ssl.handshake_timeout`
+`xpack.security.remote_cluster_server.ssl.handshake_timeout` {applies_to}`stack: ga 9.2`
 : Specifies the timeout for a TLS handshake when handling an inbound remote-cluster connection. Defaults to `10s`.
 
 
@@ -2265,7 +2265,7 @@ You can configure the following TLS/SSL settings.
 
 For more information, see Oracle’s [Java Cryptography Architecture documentation](https://docs.oracle.com/en/java/javase/11/security/java-cryptography-architecture-jca-reference-guide.html).
 
-`xpack.security.remote_cluster_client.ssl.handshake_timeout`
+`xpack.security.remote_cluster_client.ssl.handshake_timeout` {applies_to}`stack: ga 9.2`
 : Specifies the timeout for a TLS handshake when opening a remote-cluster connection. Defaults to `10s`.
 
 

docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md

Lines changed: 7 additions & 2 deletions
@@ -35,7 +35,12 @@ $$$search-throttled$$$`search_throttled`
 `write`
 : For write operations and ingest processors. Thread pool type is `fixed` with a size of [`# of allocated processors`](#node.processors), queue_size of `max(10000, (`[`# of allocated processors`](#node.processors)`* 750))`. The maximum size for this pool is `1 + `[`# of allocated processors`](#node.processors).
 
-`write_coordination`
+:::{note}
+In {{stack}} 9.0 and earlier, the `write` thread pool was also used for bulk requests.
+In {{stack}} 9.1 and earlier, the queue_size was 10000.
+:::
+
+`write_coordination` {applies_to}`stack: ga 9.1`
 : For bulk request coordination operations. Thread pool type is `fixed` with a size of [`# of allocated processors`](#node.processors), queue_size of `10000`. The maximum size for this pool is `1 + `[`# of allocated processors`](#node.processors).
 
 `snapshot`
@@ -74,7 +79,7 @@ $$$search-throttled$$$`search_throttled`
 `system_write`
 : For write operations on system indices. Thread pool type is `fixed` with a default maximum size of `min(5, (`[`# of allocated processors`](#node.processors)`) / 2)`.
 
-`system_write_coordination`
+`system_write_coordination` {applies_to}`stack: ga 9.1`
 : For bulk request coordination operations on system indices. Thread pool type is `fixed` with a default maximum size of `min(5, (`[`# of allocated processors`](#node.processors)`) / 2)`.
 
 `system_critical_read`
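
The sizing rule quoted for the `write` pool in the hunk above can be checked with a small sketch. This is only an illustration of the documented formula, not Elasticsearch code; the function names are hypothetical.

```python
def write_queue_size(allocated_processors: int) -> int:
    """Queue size for the `write` pool as documented: max(10000, processors * 750)."""
    if allocated_processors < 1:
        raise ValueError("allocated_processors must be at least 1")
    return max(10000, allocated_processors * 750)


def write_pool_max_size(allocated_processors: int) -> int:
    """Maximum pool size as documented: 1 + number of allocated processors."""
    if allocated_processors < 1:
        raise ValueError("allocated_processors must be at least 1")
    return 1 + allocated_processors
```

Under this formula the queue only grows beyond 10000 once a node has 14 or more allocated processors, since 13 * 750 = 9750 is still below the floor.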

docs/reference/elasticsearch/index-settings/index-modules.md

Lines changed: 2 additions & 2 deletions
@@ -256,8 +256,8 @@ $$$index-final-pipeline$$$
 $$$index-hidden$$$ `index.hidden` {applies_to}`serverless: all`
 : Indicates whether the index should be hidden by default. Hidden indices are not returned by default when using a wildcard expression. This behavior is controlled per request through the use of the `expand_wildcards` parameter. Possible values are `true` and `false` (default).
 
-$$$index-dense-vector-hnsw-filter-heuristic$$$ `index.dense_vector.hnsw_filter_heuristic` {applies_to}`serverless: all`
-: The heuristic to utilize when executing a filtered search against vectors in an HNSW graph. This setting is in technical preview may be changed or removed in a future release. It can be set to:
+$$$index-dense-vector-hnsw-filter-heuristic$$$ `index.dense_vector.hnsw_filter_heuristic` {applies_to}`serverless: preview` {applies_to}`stack: preview 9.1`
+: The heuristic to utilize when executing a filtered search against vectors in an HNSW graph. It can be set to:
 
 * `acorn` (default) - Only vectors that match the filter criteria are searched. This is the fastest option, and generally provides faster searches at similar recall to `fanout`, but `num_candidates` might need to be increased for exceptionally high recall requirements.
 * `fanout` - All vectors are compared with the query vector, but only those passing the criteria are added to the search results. Can be slower than `acorn`, but may yield higher recall.

docs/reference/elasticsearch/jvm-settings.md

Lines changed: 18 additions & 2 deletions
@@ -139,9 +139,25 @@ If you are running {{es}} as a Windows service, you can change the heap size usi
 
 ## JVM heap dump path setting [heap-dump-path-setting]
 
-By default, {{es}} configures the JVM to dump the heap on out of memory exceptions to the default logs directory. On [RPM](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md) and [Debian](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md) packages, the logs directory is `/var/log/elasticsearch`. On [Linux and MacOS](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md) and [Windows](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md) distributions, the `logs` directory is located under the root of the {{es}} installation.
+Depending on your stack version, {{es}} configures the JVM to dump the heap on out of memory exceptions to the following location by default:
+
+* {applies_to}`stack: ga 9.1` The default logs directory
+* {applies_to}`stack: ga 9.0` The default data directory
+
+Directory location:
+
+::::{tab-set}
+:::{tab-item} Logs directory
+* [RPM](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md) and [Debian](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md) packages: `/var/log/elasticsearch`
+* [Linux and MacOS](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md) and [Windows](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md) distributions: The `logs` directory at the root of the {{es}} installation
+:::
+:::{tab-item} Data directory
+* [RPM](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md) and [Debian](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md) packages: `/var/lib/elasticsearch`
+* [Linux and MacOS](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md) and [Windows](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md) distributions: The `data` directory at the root of the {{es}} installation
+:::
+::::
 
-If this path is not suitable for receiving heap dumps, add the `-XX:HeapDumpPath=...` entry in [`jvm.options`](#set-jvm-options):
+If this path is not suitable for receiving heap dumps, modify or add the `-XX:HeapDumpPath=...` entry in [`jvm.options`](#set-jvm-options):
 
 * If you specify a directory, the JVM will generate a filename for the heap dump based on the PID of the running instance.
 * If you specify a fixed filename instead of a directory, the file must not exist when the JVM needs to perform a heap dump on an out of memory exception. Otherwise, the heap dump will fail.
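
The version and package rules added in this hunk can be summarized as a lookup. The sketch below is an illustration only: the function name, the `install_type` labels, and the `es_home` default are hypothetical, and treating versions before 9.0 like 9.0 is an assumption the text does not state.

```python
from pathlib import PurePosixPath
from typing import Tuple


def default_heap_dump_dir(
    stack_version: Tuple[int, int],
    install_type: str,
    es_home: str = "/usr/share/elasticsearch",  # hypothetical install root
) -> PurePosixPath:
    """Return the default heap dump directory described in the updated docs."""
    # 9.1 and later: logs directory. 9.0: data directory.
    # Assumption: anything earlier than 9.1 is treated like 9.0.
    use_logs = stack_version >= (9, 1)
    if install_type in ("rpm", "deb"):
        return PurePosixPath(
            "/var/log/elasticsearch" if use_logs else "/var/lib/elasticsearch"
        )
    if install_type == "archive":
        # Archive installs keep `logs` and `data` under the installation root.
        return PurePosixPath(es_home) / ("logs" if use_logs else "data")
    raise ValueError(f"unknown install_type: {install_type!r}")
```

For example, `default_heap_dump_dir((9, 0), "deb")` gives the data directory path, which is the location a 9.0 user would check before adding a `-XX:HeapDumpPath=...` override.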

docs/reference/elasticsearch/mapping-reference/dense-vector.md

Lines changed: 26 additions & 5 deletions
@@ -55,7 +55,14 @@ In many cases, a brute-force kNN search is not efficient enough. For this reason
 
 Unmapped array fields of float elements with size between 128 and 4096 are dynamically mapped as `dense_vector` with a default similarity of `cosine`. You can override the default similarity by explicitly mapping the field as `dense_vector` with the desired similarity.
 
-Indexing is enabled by default for dense vector fields and indexed as `bbq_hnsw` if dimensions are greater than or equal to 384, otherwise they are indexed as `int8_hnsw`. When indexing is enabled, you can define the vector similarity to use in kNN search:
+Indexing is enabled by default for dense vector fields and indexed as `bbq_hnsw` if dimensions are greater than or equal to 384, otherwise they are indexed as `int8_hnsw`. {applies_to}`stack: ga 9.1`
+
+:::{note}
+In {{stack}} 9.0, dense vector fields are always indexed as `int8_hnsw`.
+:::
+
+
+When indexing is enabled, you can define the vector similarity to use in kNN search:
 
 ```console
 PUT my-index-2
@@ -107,6 +114,10 @@ When using a quantized format, you may want to oversample and rescore the result
 
 To use a quantized index, you can set your index type to `int8_hnsw`, `int4_hnsw`, or `bbq_hnsw`. When indexing `float` vectors, the current default index type is `bbq_hnsw` for vectors with greater than or equal to 384 dimensions, otherwise it's `int8_hnsw`.
 
+:::{note}
+In {{stack}} 9.0, dense vector fields are always indexed as `int8_hnsw`.
+:::
+
 Quantized vectors can use [oversampling and rescoring](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) to improve accuracy on approximate kNN search results.
 
 ::::{note}
@@ -255,9 +266,16 @@ $$$dense-vector-index-options$$$
 `type`
 : (Required, string) The type of kNN algorithm to use. Can be either any of:
 * `hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) for scalable approximate kNN search. This supports all `element_type` values.
-* `int8_hnsw` - The default index type for float vectors with less than 384 dimensions. This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+* `int8_hnsw` - The default index type for some float vectors:
+
+* {applies_to}`stack: ga 9.1` Default for float vectors with less than 384 dimensions.
+* {applies_to}`stack: ga 9.0` Default for all float vectors.
+
+This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
 * `int4_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 8x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
-* `bbq_hnsw` - The default index type for float vectors with greater than or equal to 384 dimensions. This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+* `bbq_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+
+{applies_to}`stack: ga 9.1` `bbq_hnsw` is the default index type for float vectors with greater than or equal to 384 dimensions.
 * `flat` - This utilizes a brute-force search algorithm for exact kNN search. This supports all `element_type` values.
 * `int8_flat` - This utilizes a brute-force search algorithm in addition to automatically scalar quantization. Only supports `element_type` of `float`.
 * `int4_flat` - This utilizes a brute-force search algorithm in addition to automatically half-byte scalar quantization. Only supports `element_type` of `float`.
@@ -273,11 +291,14 @@ $$$dense-vector-index-options$$$
 : (Optional, float) Only applicable to `int8_hnsw`, `int4_hnsw`, `int8_flat`, and `int4_flat` index types. The confidence interval to use when quantizing the vectors. Can be any value between and including `0.90` and `1.0` or exactly `0`. When the value is `0`, this indicates that dynamic quantiles should be calculated for optimized quantization. When between `0.90` and `1.0`, this value restricts the values used when calculating the quantization thresholds. For example, a value of `0.95` will only use the middle 95% of the values when calculating the quantization thresholds (e.g. the highest and lowest 2.5% of values will be ignored). Defaults to `1/(dims + 1)` for `int8` quantized vectors and `0` for `int4` for dynamic quantile calculation.
 
 
-`rescore_vector`
+`rescore_vector` {applies_to}`stack: preview 9.0, ga 9.1`
 : (Optional, object) An optional section that configures automatic vector rescoring on knn queries for the given field. Only applicable to quantized index types.
 :::::{dropdown} Properties of rescore_vector
 `oversample`
-: (required, float) The amount to oversample the search results by. This value should be greater than `1.0` and less than `10.0` or exactly `0` to indicate no oversampling & rescoring should occur. The higher the value, the more vectors will be gathered and rescored with the raw values per shard.
+: (required, float) The amount to oversample the search results by. This value should be one of the following:
+* Greater than `1.0` and less than `10.0`
+* Exactly `0` to indicate no oversampling and rescoring should occur {applies_to}`stack: ga 9.1`
+: The higher the value, the more vectors will be gathered and rescored with the raw values per shard.
 : In case a knn query specifies a `rescore_vector` parameter, the query `rescore_vector` parameter will be used instead.
 : See [oversampling and rescoring quantized vectors](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) for details.
 :::::

docs/reference/elasticsearch/rest-apis/api-conventions.md

Lines changed: 5 additions & 2 deletions
@@ -421,15 +421,18 @@ GET /_nodes/ra*:2
 GET /_nodes/ra*:2*
 ```
 
-### Component Selectors [api-component-selectors]
+### Component selectors [api-component-selectors]
+```{applies_to}
+stack: ga 9.1
+```
 
 A data stream component is a logical grouping of indices that help organize data inside a data stream. All data streams contain a `data` component by default. The `data` component comprises the data stream's backing indices. When searching, managing, or indexing into a data stream, the `data` component is what you are interacting with by default.
 
 Some data stream features are exposed as additional components alongside its `data` component. These other components are comprised of separate sets of backing indices. These additional components store supplemental data independent of the data stream's regular backing indices. An example of another component is the `failures` component exposed by the data stream [failure store](docs-content://manage-data/data-store/data-streams/failure-store.md) feature, which captures documents that fail to be ingested in a separate set of backing indices on the data stream.
 
 Some APIs that accept a `<data-stream>`, `<index>`, or `<target>` request path parameter also support *selector syntax* which defines which component on a data stream the API should operate on. To use a selector, it is appended to the index or data stream name. Selectors can be combined with other index pattern syntax like [date math](#api-date-math-index-names) and wildcards.
 
-There are currently two selector suffixes supported by {{es}} APIs:
+There are two selector suffixes supported by {{es}} APIs:
 
 `::data`
 : Refers to a data stream's backing indices containing regular data. Data streams always contain a data component.
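
Appending a selector to a target name, as the text above describes, can be sketched as a small helper. This is an illustration only; the function name is hypothetical, and the `::failures` suffix is an assumption inferred from the `failures` component named above, since only `::data` is visible in this excerpt.

```python
# `data` is shown in the excerpt; `failures` is assumed from the component name.
VALID_SELECTORS = ("data", "failures")


def with_selector(target: str, selector: str) -> str:
    """Append a component selector suffix to an index or data stream name."""
    if selector not in VALID_SELECTORS:
        raise ValueError(f"unsupported selector: {selector!r}")
    if "::" in target:
        raise ValueError("target already carries a selector")
    return f"{target}::{selector}"
```

Because selectors are plain suffixes, the same helper works for wildcard patterns such as `logs-*`.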

docs/reference/elasticsearch/rest-apis/retrievers/knn-retriever.md

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ A kNN retriever returns top documents from a [k-nearest neighbor search (kNN)](d
 Read more here: [knn similarity search](docs-content://solutions/search/vector/knn.md#knn-similarity-search)
 
 
-`rescore_vector`
+`rescore_vector` {applies_to}`stack: preview 9.0, ga 9.1`
 : (Optional, object) Apply oversampling and rescoring to quantized vectors.
 
 ::::{note}
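
The `dense-vector.md` changes in this commit describe two version-dependent rules: the default index type for float vectors and the accepted range for `rescore_vector.oversample`. The sketch below restates them as documented. It is an illustration, not Elasticsearch's implementation, and the function names and the `(major, minor)` version tuple are hypothetical.

```python
from typing import Tuple


def default_index_type(dims: int, stack_version: Tuple[int, int] = (9, 1)) -> str:
    """Default index type for float dense vectors, per the documented rules."""
    # 9.1 and later: bbq_hnsw for 384 or more dimensions, otherwise int8_hnsw.
    # 9.0: always int8_hnsw.
    if stack_version >= (9, 1) and dims >= 384:
        return "bbq_hnsw"
    return "int8_hnsw"


def is_valid_oversample(value: float, stack_version: Tuple[int, int] = (9, 1)) -> bool:
    """Check `oversample` against the documented bounds."""
    if value == 0:
        # Exactly 0 (no oversampling and rescoring) is documented for 9.1 and later.
        return stack_version >= (9, 1)
    # Otherwise the value must be greater than 1.0 and less than 10.0.
    return 1.0 < value < 10.0
```

The strict inequalities follow the wording "greater than `1.0` and less than `10.0`", so under this reading the boundary values themselves are rejected.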
