Commit 0ba6198

Merge branch 'main' into pkar/msearch-flatworld-search-project-routing

2 parents: 2d61c55 + 2fd70f2

320 files changed: +8964 -7183 lines

benchmarks/src/main/java/org/elasticsearch/benchmark/_nightly/esql/QueryPlanningBenchmark.java

Lines changed: 2 additions & 4 deletions

```diff
@@ -71,7 +71,6 @@ public class QueryPlanningBenchmark {
     }

     private PlanTelemetry telemetry;
-    private EsqlParser defaultParser;
     private Analyzer manyFieldsAnalyzer;
     private LogicalPlanOptimizer defaultOptimizer;
     private Configuration config;
@@ -111,7 +110,6 @@ public void setup() {
         TransportVersion minimumVersion = TransportVersion.current();

         telemetry = new PlanTelemetry(functionRegistry);
-        defaultParser = new EsqlParser();
         manyFieldsAnalyzer = new Analyzer(
             new AnalyzerContext(
                 config,
@@ -128,14 +126,14 @@ public void setup() {
     }

     private LogicalPlan plan(EsqlParser parser, Analyzer analyzer, LogicalPlanOptimizer optimizer, String query) {
-        var parsed = parser.createStatement(query, new QueryParams(), telemetry);
+        var parsed = parser.parseQuery(query, new QueryParams(), telemetry);
         var analyzed = analyzer.analyze(parsed);
         var optimized = optimizer.optimize(analyzed);
         return optimized;
     }

     @Benchmark
     public void manyFields(Blackhole blackhole) {
-        blackhole.consume(plan(defaultParser, manyFieldsAnalyzer, defaultOptimizer, "FROM test | LIMIT 10"));
+        blackhole.consume(plan(EsqlParser.INSTANCE, manyFieldsAnalyzer, defaultOptimizer, "FROM test | LIMIT 10"));
     }
 }
```
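The benchmark diff above replaces a per-benchmark `defaultParser` field with a shared `EsqlParser.INSTANCE`. A minimal sketch of that singleton pattern, assuming (as the change implies) the parser holds no mutable state; the `Parser` class below is illustrative, not the real EsqlParser API:

```java
// Sketch of the refactor: a stateless parser exposed as a single shared
// instance, instead of each caller constructing its own. Safe only when
// the object carries no per-call mutable state.
public class SingletonParserSketch {
    static final class Parser {
        static final Parser INSTANCE = new Parser();

        private Parser() {}

        // Stand-in for the real parse entry point.
        String parse(String query) {
            return "parsed:" + query.trim();
        }
    }

    public static void main(String[] args) {
        // Before: new Parser() per benchmark. After: everyone shares INSTANCE.
        String plan = Parser.INSTANCE.parse(" FROM test | LIMIT 10 ");
        System.out.println(plan); // parsed:FROM test | LIMIT 10
    }
}
```

Sharing one instance removes a field and an allocation from every benchmark setup, which is why the diff nets 2 additions against 4 deletions.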

docs/changelog/138465.yaml

Lines changed: 5 additions & 0 deletions

```yaml
pr: 138465
summary: "Allocation: add duration and count metrics for write load hotspot"
area: Allocation
type: enhancement
issues: []
```

docs/changelog/138492.yaml

Lines changed: 32 additions & 0 deletions

```yaml
pr: 138492
summary: Enable bfloat16 and on-disk rescoring for dense vectors
area: Vector Search
type: feature
issues: []
highlight:
  title: New dense_vector options for storing bfloat16 vectors and utilising on-disk rescoring
  body: |-
    New options have been added to the `dense_vector` field type.

    The first is support for storing vectors in bfloat16 format.
    This is a floating-point format that utilises two bytes per value rather than four, halving the storage space
    required compared to `element_type: float`. This can be specified with `element_type: bfloat16`
    when creating the index, for all `dense_vector` indexing types.

    Float values are automatically rounded to two bytes when writing to disk, so this format can be used
    with original source vectors at two- or four-byte precision. BFloat16 values are zero-expanded back to four-byte floats
    when read into memory. Using `bfloat16` will cause a loss of precision compared to
    the original vector values, as well as a small performance hit due to converting between `bfloat16` and `float`
    when reading and writing vectors; however this may be counterbalanced by a corresponding decrease in I/O,
    depending on your workload.

    The second option is to enable on-disk rescoring. When rescoring vectors during kNN searches, the raw vectors
    are read into memory. When the vector data is larger than the amount of available RAM, this might cause the OS
    to evict some in-memory pages that then need to be paged back in immediately afterwards. This can cause
    a significant slowdown in search speed. Enabling on-disk rescoring causes rescoring to use raw vector data
    on-disk during rescoring, and to not read it into memory first. This can significantly increase search performance
    in such low-memory situations.

    Enable on-disk rescoring using the `on_disk_rescore: true` index option.
  notable: true
```
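The rounding and zero-expansion that this changelog describes can be sketched in plain Java. bfloat16 keeps the top 16 bits of an IEEE-754 float32 (sign, 8-bit exponent, 7-bit mantissa), so it preserves float's range while dropping mantissa precision. This illustrates the format itself, not Elasticsearch's implementation; the round-to-nearest-even step is an assumption, since the changelog only says values are "rounded":

```java
// Round-trip a float through a 2-byte bfloat16 representation.
// NaN and overflow handling are omitted for brevity.
public class Bfloat16Sketch {
    // Truncate a float32 to bfloat16, rounding half to even on the cut bits.
    static short toBfloat16(float f) {
        int bits = Float.floatToIntBits(f);
        int rounding = 0x7FFF + ((bits >>> 16) & 1); // round-to-nearest-even
        return (short) ((bits + rounding) >>> 16);
    }

    // "Zero-expand" back: the 16 stored bits become the high half of a float32.
    static float fromBfloat16(short bf) {
        return Float.intBitsToFloat((bf & 0xFFFF) << 16);
    }

    public static void main(String[] args) {
        float original = 0.1f;
        float roundTripped = fromBfloat16(toBfloat16(original));
        System.out.println(original);
        System.out.println(roundTripped); // close to 0.1, but not bit-identical
    }
}
```

Values like 1.0 that fit in a 7-bit mantissa survive the round trip exactly; most others lose a few low mantissa bits, which is the precision loss the changelog warns about.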

docs/changelog/138726.yaml

Lines changed: 5 additions & 0 deletions

```yaml
pr: 136624
summary: Added Azure OpenAI chat_completion support to the Inference Plugin
area: Machine Learning
type: enhancement
issues: []
```

docs/changelog/138776.yaml

Lines changed: 6 additions & 0 deletions

```yaml
pr: 138776
summary: "[Inference API] Support chunking settings for sparse embeddings in custom\
  \ service"
area: Machine Learning
type: bug
issues: []
```

docs/changelog/139053.yaml

Lines changed: 5 additions & 0 deletions

```yaml
pr: 139053
summary: Minimize doc values fetches in TSDBSyntheticIdFieldsProducer
area: TSDB
type: enhancement
issues: []
```

docs/changelog/139075.yaml

Lines changed: 5 additions & 0 deletions

```yaml
pr: 139075
summary: Bump jruby/joni to 2.2.6
area: Ingest Node
type: enhancement
issues: []
```

docs/changelog/139084.yaml

Lines changed: 5 additions & 0 deletions

```yaml
pr: 139084
summary: "GPU: Restrict GPU indexing to FLOAT element types"
area: Vector Search
type: enhancement
issues: []
```

docs/reference/elasticsearch/mapping-reference/dense-vector.md

Lines changed: 15 additions & 9 deletions

```diff
@@ -156,7 +156,7 @@ This setting is compatible with synthetic `_source`, where the entire `_source`

 ### Rehydration and precision

-When vector values are rehydrated (e.g., for reindex, recovery, or explicit `_source` requests), they are restored from their internal format. Internally, vectors are stored at float precision, so if they were originally indexed as higher-precision types (e.g., `double` or `long`), the rehydrated values will have reduced precision. This lossy representation is intended to save space while preserving search quality.
+When vector values are rehydrated (e.g., for reindex, recovery, or explicit `_source` requests), they are restored from their internal format. By default, vectors are stored at float precision, so if they were originally indexed as higher-precision types (e.g., `double` or `long`), the rehydrated values will have reduced precision. This lossy representation is intended to save space while preserving search quality. Additionally, using an `element_type` of `bfloat16` will cause a further loss in precision in restored vectors.

 ### Storing original vectors in `_source`

@@ -283,12 +283,15 @@ The following mapping parameters are accepted:
 $$$dense-vector-element-type$$$

 `element_type`
-: (Optional, string) The data type used to encode vectors. The supported data types are `float` (default), `byte`, and `bit`.
+: (Optional, string) The data type used to encode vectors.

 ::::{dropdown} Valid values for element_type
 `float`
 : indexes a 4-byte floating-point value per dimension. This is the default value.

+`bfloat16` {applies_to}`stack: ga 9.3`
+: indexes a 2-byte floating-point value per dimension. This uses the bfloat16 encoding, _not_ IEEE-754 float16, to maintain the same value range as 4-byte floats. Using `bfloat16` is likely to cause a loss of precision in the stored values compared to `float`.
+
 `byte`
 : indexes a 1-byte integer value per dimension.

@@ -353,16 +356,16 @@ $$$dense-vector-index-options$$$
 * `int8_hnsw` - The default index type for some float vectors:
   * {applies_to}`stack: ga 9.1` Default for float vectors with less than 384 dimensions.
   * {applies_to}`stack: ga 9.0` Default for float all vectors.
-  This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
-* `int4_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 8x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
-* `bbq_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+  This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float` or `bfloat16`. This can reduce the memory footprint by 4x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+* `int4_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically scalar quantization for scalable approximate kNN search with `element_type` of `float` or `bfloat16`. This can reduce the memory footprint by 8x at the cost of some accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).
+* `bbq_hnsw` - This utilizes the [HNSW algorithm](https://arxiv.org/abs/1603.09320) in addition to automatically binary quantization for scalable approximate kNN search with `element_type` of `float` or `bfloat16`. This can reduce the memory footprint by 32x at the cost of accuracy. See [Automatically quantize vectors for kNN search](#dense-vector-quantization).

   {applies_to}`stack: ga 9.1` `bbq_hnsw` is the default index type for float vectors with greater than or equal to 384 dimensions.
 * `flat` - This utilizes a brute-force search algorithm for exact kNN search. This supports all `element_type` values.
-* `int8_flat` - This utilizes a brute-force search algorithm in addition to automatic scalar quantization. Only supports `element_type` of `float`.
-* `int4_flat` - This utilizes a brute-force search algorithm in addition to automatic half-byte scalar quantization. Only supports `element_type` of `float`.
-* `bbq_flat` - This utilizes a brute-force search algorithm in addition to automatic binary quantization. Only supports `element_type` of `float`.
-* {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a variant of [k-means clustering algorithm](https://en.wikipedia.org/wiki/K-means_clustering) in addition to automatic binary quantization to partition vectors and search subspaces rather than an entire graph structure as in with HNSW. Only supports `element_type` of `float`. This combines the benefits of BBQ quantization with partitioning to further reduces the required memory overhead when compared with HNSW and can effectively be run at the smallest possible RAM and heap sizes when HNSW would otherwise cause swapping and grind to a halt. DiskBBQ largely scales linearly with the total RAM. And search performance is enhanced at scale as a subset of the total vector space is loaded. This requires an [Enterprise subscription](https://www.elastic.co/subscriptions).
+* `int8_flat` - This utilizes a brute-force search algorithm in addition to automatic scalar quantization. Only supports `element_type` of `float` or `bfloat16`.
+* `int4_flat` - This utilizes a brute-force search algorithm in addition to automatic half-byte scalar quantization. Only supports `element_type` of `float` or `bfloat16`.
+* `bbq_flat` - This utilizes a brute-force search algorithm in addition to automatic binary quantization. Only supports `element_type` of `float` or `bfloat16`.
+* {applies_to}`stack: ga 9.2` `bbq_disk` - This utilizes a variant of [k-means clustering algorithm](https://en.wikipedia.org/wiki/K-means_clustering) in addition to automatic binary quantization to partition vectors and search subspaces rather than an entire graph structure as in with HNSW. Only supports `element_type` of `float` or `bfloat16`. This combines the benefits of BBQ quantization with partitioning to further reduces the required memory overhead when compared with HNSW and can effectively be run at the smallest possible RAM and heap sizes when HNSW would otherwise cause swapping and grind to a halt. DiskBBQ largely scales linearly with the total RAM. And search performance is enhanced at scale as a subset of the total vector space is loaded. This requires an [Enterprise subscription](https://www.elastic.co/subscriptions).

 `m`
 : (Optional, integer) The number of neighbors each node will be connected to in the HNSW graph. Defaults to `16`. Only applicable to `hnsw`, `int8_hnsw`, `int4_hnsw` and `bbq_hnsw` index types.
@@ -390,6 +393,9 @@ $$$dense-vector-index-options$$$
 : In case a knn query specifies a `rescore_vector` parameter, the query `rescore_vector` parameter will be used instead.
 : See [oversampling and rescoring quantized vectors](docs-content://solutions/search/vector/knn.md#dense-vector-knn-search-rescoring) for details.
 :::::
+
+`on_disk_rescore` {applies_to}`stack: preview 9.3` {applies_to}`serverless: unavailable`
+: (Optional, boolean) Only applicable to quantized HNSW and `bbq_disk` index types. When `true`, vector rescoring will read the raw vector data directly from disk, and will not copy it in memory. This can improve performance when vector data is larger than the amount of available RAM. This setting only applies to newly-indexed vectors; after changing this setting, the vectors must be reindexed or force-merged to apply the new setting to the whole index. Defaults to `false`.
 ::::
```
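Taken together, the two new options from this doc change could appear in a mapping like the following sketch. The field name and `dims` value are illustrative; the option names and their placement under `index_options` are inferred from the parameter list above:

```json
{
  "mappings": {
    "properties": {
      "embedding": {
        "type": "dense_vector",
        "dims": 768,
        "element_type": "bfloat16",
        "index_options": {
          "type": "bbq_hnsw",
          "on_disk_rescore": true
        }
      }
    }
  }
}
```

Per the added doc text, `on_disk_rescore` only affects newly-indexed vectors, so enabling it on an existing index would require a reindex or force-merge to cover older segments.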

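The 4x, 8x, and 32x footprint ratios quoted in the diff above translate into concrete sizes. A back-of-the-envelope calculation, ignoring HNSW graph overhead and any quantization correction terms (both assumptions; the docs only state the ratios):

```java
// Rough raw-vector memory estimate: float32 is 4 bytes per dimension;
// int8 quantization is ~4x smaller, int4 ~8x, BBQ ~32x (about 1 bit/dim).
public class VectorMemorySketch {
    static long rawBytes(long numVectors, int dims) {
        return numVectors * dims * 4L; // float32 baseline
    }

    public static void main(String[] args) {
        long vectors = 10_000_000L;
        int dims = 768;
        long raw = rawBytes(vectors, dims);
        System.out.println("float32:   " + raw / (1 << 20) + " MiB");      // 29296 MiB
        System.out.println("int8_hnsw: " + raw / 4 / (1 << 20) + " MiB");  // 7324 MiB
        System.out.println("bbq_hnsw:  " + raw / 32 / (1 << 20) + " MiB"); // 915 MiB
    }
}
```

At these sizes the raw vectors alone exceed typical node RAM, which is the low-memory scenario the new `on_disk_rescore` option and `bbq_disk` index type target.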
docs/reference/query-languages/esql/kibana/definition/settings/project_routing.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.
