Commit eeeeb50

Merge branch 'main' into ES-12330-prevent-auto-shard-on-lookup-index
2 parents 50805b1 + 7673059 commit eeeeb50

94 files changed: +5346 -520 lines changed


docs/changelog/130847.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 130847
+summary: "Pipelines: Add `created_date` and `modified_date`"
+area: Ingest Node
+type: enhancement
+issues: []

docs/changelog/131027.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 131027
+summary: Handle structured log messages
+area: Ingest Node
+type: feature
+issues:
+- 130333

docs/changelog/131658.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 131658
+summary: Fix `aggregate_metric_double` sorting and `mv_expand` issues
+area: ES|QL
+type: bug
+issues: []

docs/changelog/131733.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 131733
+summary: Replace `RoundTo` linear search evaluator with manual evaluators
+area: ES|QL
+type: enhancement
+issues: []

docs/changelog/131917.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 131917
+summary: Fix NPE on empty to_lower/to_upper call
+area: ES|QL
+type: bug
+issues:
+- 131913

docs/reference/elasticsearch/configuration-reference/security-settings.md

Lines changed: 3 additions & 3 deletions
@@ -1933,7 +1933,7 @@ You can configure the following TLS/SSL settings.
 `xpack.security.transport.ssl.trust_restrictions.x509_fields` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on Elastic Cloud Hosted")
 : Specifies which field(s) from the TLS certificate is used to match for the restricted trust management that is used for remote clusters connections. This should only be set when a self managed cluster can not create certificates that follow the Elastic Cloud pattern. The default value is ["subjectAltName.otherName.commonName"], the Elastic Cloud pattern. "subjectAltName.dnsName" is also supported and can be configured in addition to or in replacement of the default.

-`xpack.security.transport.ssl.handshake_timeout`
+`xpack.security.transport.ssl.handshake_timeout` {applies_to}`stack: ga 9.2`
 : Specifies the timeout for a TLS handshake when opening a transport connection. Defaults to `10s`.

 ### Transport TLS/SSL key and trusted certificate settings [security-transport-tls-ssl-key-trusted-certificate-settings]
@@ -2133,7 +2133,7 @@ You can configure the following TLS/SSL settings.

 For more information, see Oracle’s [Java Cryptography Architecture documentation](https://docs.oracle.com/en/java/javase/11/security/java-cryptography-architecture-jca-reference-guide.html).

-`xpack.security.remote_cluster_server.ssl.handshake_timeout`
+`xpack.security.remote_cluster_server.ssl.handshake_timeout` {applies_to}`stack: ga 9.2`
 : Specifies the timeout for a TLS handshake when handling an inbound remote-cluster connection. Defaults to `10s`.


@@ -2265,7 +2265,7 @@ You can configure the following TLS/SSL settings.

 For more information, see Oracle’s [Java Cryptography Architecture documentation](https://docs.oracle.com/en/java/javase/11/security/java-cryptography-architecture-jca-reference-guide.html).

-`xpack.security.remote_cluster_client.ssl.handshake_timeout`
+`xpack.security.remote_cluster_client.ssl.handshake_timeout` {applies_to}`stack: ga 9.2`
 : Specifies the timeout for a TLS handshake when opening a remote-cluster connection. Defaults to `10s`.



docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md

Lines changed: 7 additions & 2 deletions
@@ -35,7 +35,12 @@ $$$search-throttled$$$`search_throttled`
 `write`
 : For write operations and ingest processors. Thread pool type is `fixed` with a size of [`# of allocated processors`](#node.processors), queue_size of `max(10000, (`[`# of allocated processors`](#node.processors)`* 750))`. The maximum size for this pool is `1 + `[`# of allocated processors`](#node.processors).

-`write_coordination`
+:::{note}
+In {{stack}} 9.0 and earlier, the `write` thread pool was also used for bulk requests.
+In {{stack}} 9.1 and earlier, the queue_size was 10000.
+:::
+
+`write_coordination` {applies_to}`stack: ga 9.1`
 : For bulk request coordination operations. Thread pool type is `fixed` with a size of [`# of allocated processors`](#node.processors), queue_size of `10000`. The maximum size for this pool is `1 + `[`# of allocated processors`](#node.processors).

 `snapshot`
@@ -74,7 +79,7 @@ $$$search-throttled$$$`search_throttled`
 `system_write`
 : For write operations on system indices. Thread pool type is `fixed` with a default maximum size of `min(5, (`[`# of allocated processors`](#node.processors)`) / 2)`.

-`system_write_coordination`
+`system_write_coordination` {applies_to}`stack: ga 9.1`
 : For bulk request coordination operations on system indices. Thread pool type is `fixed` with a default maximum size of `min(5, (`[`# of allocated processors`](#node.processors)`) / 2)`.

 `system_critical_read`
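
The sizing rules quoted for the `write` pool in the hunk above can be checked with a little arithmetic. This is a minimal Python sketch, not Elasticsearch code: the function names are hypothetical, and `allocated_processors` stands in for the `node.processors` value. Per the added note, the scaled queue formula is assumed to apply only to versions after 9.1, where the queue was a fixed 10000.

```python
def write_pool_size(allocated_processors: int) -> int:
    """Fixed pool size for `write`: one thread per allocated processor."""
    return allocated_processors


def write_pool_max_size(allocated_processors: int) -> int:
    """Upper bound quoted in the docs: 1 + number of allocated processors."""
    return 1 + allocated_processors


def write_queue_size(allocated_processors: int) -> int:
    """Queue size quoted in the docs: max(10000, processors * 750)."""
    return max(10000, allocated_processors * 750)


if __name__ == "__main__":
    # The queue only grows past 10000 once processors * 750 exceeds it,
    # which first happens at 14 processors (14 * 750 = 10500).
    for processors in (4, 13, 14, 32):
        print(processors, write_pool_size(processors), write_queue_size(processors))
```

With 13 or fewer allocated processors the result stays at the 10000 floor, so the formula change only affects larger nodes.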

docs/reference/elasticsearch/index-settings/index-modules.md

Lines changed: 2 additions & 2 deletions
@@ -256,8 +256,8 @@ $$$index-final-pipeline$$$
 $$$index-hidden$$$ `index.hidden` {applies_to}`serverless: all`
 : Indicates whether the index should be hidden by default. Hidden indices are not returned by default when using a wildcard expression. This behavior is controlled per request through the use of the `expand_wildcards` parameter. Possible values are `true` and `false` (default).

-$$$index-dense-vector-hnsw-filter-heuristic$$$ `index.dense_vector.hnsw_filter_heuristic` {applies_to}`serverless: all`
-: The heuristic to utilize when executing a filtered search against vectors in an HNSW graph. This setting is in technical preview may be changed or removed in a future release. It can be set to:
+$$$index-dense-vector-hnsw-filter-heuristic$$$ `index.dense_vector.hnsw_filter_heuristic` {applies_to}`serverless: preview` {applies_to}`stack: preview 9.1`
+: The heuristic to utilize when executing a filtered search against vectors in an HNSW graph. It can be set to:

 * `acorn` (default) - Only vectors that match the filter criteria are searched. This is the fastest option, and generally provides faster searches at similar recall to `fanout`, but `num_candidates` might need to be increased for exceptionally high recall requirements.
 * `fanout` - All vectors are compared with the query vector, but only those passing the criteria are added to the search results. Can be slower than `acorn`, but may yield higher recall.

docs/reference/elasticsearch/jvm-settings.md

Lines changed: 18 additions & 2 deletions
@@ -139,9 +139,25 @@ If you are running {{es}} as a Windows service, you can change the heap size usi

 ## JVM heap dump path setting [heap-dump-path-setting]

-By default, {{es}} configures the JVM to dump the heap on out of memory exceptions to the default logs directory. On [RPM](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md) and [Debian](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md) packages, the logs directory is `/var/log/elasticsearch`. On [Linux and MacOS](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md) and [Windows](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md) distributions, the `logs` directory is located under the root of the {{es}} installation.
+Depending on your stack version, {{es}} configures the JVM to dump the heap on out of memory exceptions to the following location by default:
+
+* {applies_to}`stack: ga 9.1` The default logs directory
+* {applies_to}`stack: ga 9.0` The default data directory
+
+Directory location:
+
+::::{tab-set}
+:::{tab-item} Logs directory
+* [RPM](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md) and [Debian](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md) packages: `/var/log/elasticsearch`
+* [Linux and MacOS](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md) and [Windows](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md) distributions: The `logs` directory at the root of the {{es}} installation
+:::
+:::{tab-item} Data directory
+* [RPM](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md) and [Debian](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md) packages: `/var/lib/elasticsearch`
+* [Linux and MacOS](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md) and [Windows](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md) distributions: The `data` directory at the root of the {{es}} installation
+:::
+::::

-If this path is not suitable for receiving heap dumps, add the `-XX:HeapDumpPath=...` entry in [`jvm.options`](#set-jvm-options):
+If this path is not suitable for receiving heap dumps, modify or add the `-XX:HeapDumpPath=...` entry in [`jvm.options`](#set-jvm-options):

 * If you specify a directory, the JVM will generate a filename for the heap dump based on the PID of the running instance.
 * If you specify a fixed filename instead of a directory, the file must not exist when the JVM needs to perform a heap dump on an out of memory exception. Otherwise, the heap dump will fail.
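
The version and package rules in this hunk amount to a small lookup, sketched here in Python. The helper names, the `install_type` labels (`rpm`, `deb`, `archive`) and the example installation root are all hypothetical. The sketch also assumes that `ga 9.1` covers 9.1 and later, and it joins paths with POSIX separators for illustration only. Only the `-XX:HeapDumpPath=` flag and the two `/var/...` paths come from the text.

```python
from pathlib import PurePosixPath

# Package-managed locations quoted in the documentation change.
PACKAGE_DIRS = {"logs": "/var/log/elasticsearch", "data": "/var/lib/elasticsearch"}


def default_heap_dump_dir(stack_version, install_type, es_home="/opt/elasticsearch"):
    """Return the default heap dump directory described in the hunk."""
    major_minor = tuple(stack_version[:2])
    if major_minor >= (9, 1):
        kind = "logs"  # 9.1 and later (assumed): default logs directory
    elif major_minor == (9, 0):
        kind = "data"  # 9.0: default data directory
    else:
        raise ValueError("versions before 9.0 are not covered by this text")

    if install_type in ("rpm", "deb"):
        return PACKAGE_DIRS[kind]
    if install_type == "archive":
        # Archive and zip installs keep the directory under the install root.
        return str(PurePosixPath(es_home) / kind)
    raise ValueError(f"unknown install type: {install_type}")


def heap_dump_option(path):
    """Build the jvm.options entry that overrides the default location."""
    return f"-XX:HeapDumpPath={path}"


if __name__ == "__main__":
    print(default_heap_dump_dir((9, 1), "rpm"))
    print(default_heap_dump_dir((9, 0), "archive", es_home="/opt/es"))
    print(heap_dump_option("/mnt/dumps"))
```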

docs/reference/elasticsearch/mapping-reference/dense-vector.md

Lines changed: 26 additions & 5 deletions
@@ -55,7 +55,14 @@ In many cases, a brute-force kNN search is not efficient enough. For this reason

 Unmapped array fields of float elements with size between 128 and 4096 are dynamically mapped as `dense_vector` with a default similariy of `cosine`. You can override the default similarity by explicitly mapping the field as `dense_vector` with the desired similarity.

-Indexing is enabled by default for dense vector fields and indexed as `bbq_hnsw` if dimensions are greater than or equal to 384, otherwise they are indexed as `int8_hnsw`. When indexing is enabled, you can define the vector similarity to use in kNN search:
+Indexing is enabled by default for dense vector fields and indexed as `bbq_hnsw` if dimensions are greater than or equal to 384, otherwise they are indexed as `int8_hnsw`. {applies_to}`stack: ga 9.1`
+
+:::{note}
+In {{stack}} 9.0, dense vector fields are always indexed as `int8_hnsw`.
+:::
+
+
+When indexing is enabled, you can define the vector similarity to use in kNN search:

 ```console
 PUT my-index-2
@@ -107,6 +114,10 @@ When using a quantized format, you may want to oversample and rescore the result

 To use a quantized index, you can set your index type to `int8_hnsw`, `int4_hnsw`, or `bbq_hnsw`. When indexing `float` vectors, the current default index type is `bbq_hnsw` for vectors with greater than or equal to 384 dimensions, otherwise it's `int8_hnsw`.

+:::{note}
+In {{stack}} 9.0, dense vector fields are always indexed as `int8_hnsw`.
+:::
+
 Quantized vectors can use [oversampling and rescoring](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) to improve accuracy on approximate kNN search results.

 ::::{note}
@@ -255,9 +266,16 @@ $$$dense-vector-index-options$$$
 `type`
 : (Required, string) The type of kNN algorithm to use. Can be either any of:
 * `hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) for scalable approximate kNN search. This supports all `element_type` values.
-* `int8_hnsw` - The default index type for float vectors with less than 384 dimensions. This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+* `int8_hnsw` - The default index type for some float vectors:
+
+* {applies_to}`stack: ga 9.1` Default for float vectors with less than 384 dimensions.
+* {applies_to}`stack: ga 9.0` Default for float all vectors.
+
+This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
 * `int4_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 8x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
-* `bbq_hnsw` - The default index type for float vectors with greater than or equal to 384 dimensions. This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+* `bbq_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+
+{applies_to}`stack: ga 9.1` `bbq_hnsw` is the default index type for float vectors with greater than or equal to 384 dimensions.
 * `flat` - This utilizes a brute-force search algorithm for exact kNN search. This supports all `element_type` values.
 * `int8_flat` - This utilizes a brute-force search algorithm in addition to automatically scalar quantization. Only supports `element_type` of `float`.
 * `int4_flat` - This utilizes a brute-force search algorithm in addition to automatically half-byte scalar quantization. Only supports `element_type` of `float`.
@@ -273,11 +291,14 @@ $$$dense-vector-index-options$$$
 : (Optional, float) Only applicable to `int8_hnsw`, `int4_hnsw`, `int8_flat`, and `int4_flat` index types. The confidence interval to use when quantizing the vectors. Can be any value between and including `0.90` and `1.0` or exactly `0`. When the value is `0`, this indicates that dynamic quantiles should be calculated for optimized quantization. When between `0.90` and `1.0`, this value restricts the values used when calculating the quantization thresholds. For example, a value of `0.95` will only use the middle 95% of the values when calculating the quantization thresholds (e.g. the highest and lowest 2.5% of values will be ignored). Defaults to `1/(dims + 1)` for `int8` quantized vectors and `0` for `int4` for dynamic quantile calculation.


-`rescore_vector`
+`rescore_vector` {applies_to}`stack: preview 9.0, ga 9.1`
 : (Optional, object) An optional section that configures automatic vector rescoring on knn queries for the given field. Only applicable to quantized index types.
 :::::{dropdown} Properties of rescore_vector
 `oversample`
-: (required, float) The amount to oversample the search results by. This value should be greater than `1.0` and less than `10.0` or exactly `0` to indicate no oversampling & rescoring should occur. The higher the value, the more vectors will be gathered and rescored with the raw values per shard.
+: (required, float) The amount to oversample the search results by. This value should be one of the following:
+* Greater than `1.0` and less than `10.0`
+* Exactly `0` to indicate no oversampling and rescoring should occur {applies_to}`stack: ga 9.1`
+: The higher the value, the more vectors will be gathered and rescored with the raw values per shard.
 : In case a knn query specifies a `rescore_vector` parameter, the query `rescore_vector` parameter will be used instead.
 : See [oversampling and rescoring quantized vectors](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) for details.
 :::::
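
The version-dependent defaults documented in these hunks can be summarised as a short Python sketch. Both function names are hypothetical, and versions are passed as `(major, minor)` tuples purely for illustration. The `oversample` bounds are taken literally from the wording in the diff, so the server's actual validation may differ at the edges.

```python
def default_index_type(dims, stack_version):
    """Default index type for float dense vectors, per the documented rules."""
    major_minor = tuple(stack_version[:2])
    if major_minor < (9, 0):
        raise ValueError("versions before 9.0 are not covered by this text")
    # From 9.1 on (assumed from the `ga 9.1` tag), vectors with 384 or more
    # dimensions default to bbq_hnsw. 9.0 always uses int8_hnsw.
    if major_minor >= (9, 1) and dims >= 384:
        return "bbq_hnsw"
    return "int8_hnsw"


def is_valid_oversample(value, stack_version):
    """Check `rescore_vector.oversample` against the documented ranges."""
    major_minor = tuple(stack_version[:2])
    if value == 0:
        # Exactly 0 (no oversampling or rescoring) is tagged `ga 9.1`.
        return major_minor >= (9, 1)
    return 1.0 < value < 10.0


if __name__ == "__main__":
    print(default_index_type(384, (9, 1)))
    print(default_index_type(1024, (9, 0)))
    print(is_valid_oversample(2.0, (9, 0)), is_valid_oversample(0, (9, 0)))
```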
