Commit 6a757e9

Author: elasticsearchmachine
Merge remote-tracking branch 'origin/main' into lucene_snapshot
2 parents: 470da3f + d0f71fc

124 files changed, +2748 -841 lines


.buildkite/packer_cache.sh
Lines changed: 1 addition & 1 deletion

@@ -29,6 +29,6 @@ for branch in "${branches[@]}"; do
   fi

   export JAVA_HOME="$HOME/.java/$ES_BUILD_JAVA"
-  "checkout/${branch}/gradlew" --project-dir "$CHECKOUT_DIR" --parallel -s resolveAllDependencies -Dorg.gradle.warning.mode=none -DisCI
+  "checkout/${branch}/gradlew" --project-dir "$CHECKOUT_DIR" --parallel -s resolveAllDependencies -Dorg.gradle.warning.mode=none -DisCI --max-workers=4
   rm -rf "checkout/${branch}"
 done

docs/changelog/114681.yaml
Lines changed: 6 additions & 0 deletions

@@ -0,0 +1,6 @@
+pr: 114681
+summary: "Support for unsigned 64 bit numbers in Cpu stats"
+area: Infra/Core
+type: enhancement
+issues:
+ - 112274

docs/changelog/114855.yaml
Lines changed: 5 additions & 0 deletions

@@ -0,0 +1,5 @@
+pr: 114855
+summary: Add query rules retriever
+area: Relevance
+type: enhancement
+issues: [ ]

docs/changelog/115656.yaml
Lines changed: 5 additions & 0 deletions

@@ -0,0 +1,5 @@
+pr: 115656
+summary: Fix stream support for `TaskType.ANY`
+area: Machine Learning
+type: bug
+issues: []

docs/changelog/115715.yaml
Lines changed: 5 additions & 0 deletions

@@ -0,0 +1,5 @@
+pr: 115715
+summary: Avoid `catch (Throwable t)` in `AmazonBedrockStreamingChatProcessor`
+area: Machine Learning
+type: bug
+issues: []

docs/changelog/115721.yaml
Lines changed: 5 additions & 0 deletions

@@ -0,0 +1,5 @@
+pr: 115721
+summary: Change Reindexing metrics unit from millis to seconds
+area: Reindex
+type: enhancement
+issues: []

docs/reference/cat/shards.asciidoc
Lines changed: 1 addition & 1 deletion

@@ -33,7 +33,7 @@ For <<data-streams,data streams>>, the API returns information about the stream'
 * If the {es} {security-features} are enabled, you must have the `monitor` or
 `manage` <<privileges-list-cluster,cluster privilege>> to use this API.
 You must also have the `monitor` or `manage` <<privileges-list-indices,index privilege>>
-for any data stream, index, or alias you retrieve.
+to view the full information for any data stream, index, or alias you retrieve.
 
 [[cat-shards-path-params]]
 ==== {api-path-parms-title}

docs/reference/esql/esql-kibana.asciidoc
Lines changed: 35 additions & 5 deletions

@@ -171,14 +171,44 @@ FROM kibana_sample_data_logs
 [[esql-kibana-time-filter]]
 === Time filtering
 
-To display data within a specified time range, use the
-{kibana-ref}/set-time-filter.html[time filter]. The time filter is only enabled
-when the indices you're querying have a field called `@timestamp`.
+To display data within a specified time range, you can use the standard time filter,
+custom time parameters, or a WHERE command.
 
-If your indices do not have a timestamp field called `@timestamp`, you can limit
-the time range using the <<esql-where>> command and the <<esql-now>> function.
+[discrete]
+==== Standard time filter
+The standard {kibana-ref}/set-time-filter.html[time filter] is enabled
+when the indices you're querying have a field named `@timestamp`.
+
+[discrete]
+==== Custom time parameters
+If your indices do not have a field named `@timestamp`, you can use
+the `?_tstart` and `?_tend` parameters to specify a time range. These parameters
+work with any timestamp field and automatically sync with the {kibana-ref}/set-time-filter.html[time filter].
+
+[source,esql]
+----
+FROM my_index
+| WHERE custom_timestamp >= ?_tstart AND custom_timestamp < ?_tend
+----
+
+You can also use the `?_tstart` and `?_tend` parameters with the <<esql-bucket>> function
+to create auto-incrementing time buckets in {esql} <<esql-kibana-visualizations,visualizations>>.
+For example:
+
+[source,esql]
+----
+FROM kibana_sample_data_logs
+| STATS average_bytes = AVG(bytes) BY BUCKET(@timestamp, 50, ?_tstart, ?_tend)
+----
+
+This example uses `50` buckets, which is the maximum number of buckets.
+
+[discrete]
+==== WHERE command
+You can also limit the time range using the <<esql-where>> command and the <<esql-now>> function.
 For example, if the timestamp field is called `timestamp`, to query the last 15
 minutes of data:
+
 [source,esql]
 ----
 FROM kibana_sample_data_logs

docs/reference/how-to/knn-search.asciidoc
Lines changed: 21 additions & 9 deletions

@@ -16,10 +16,11 @@ structures. So these same recommendations also help with indexing speed.
 The default <<dense-vector-element-type,`element_type`>> is `float`. But this
 can be automatically quantized during index time through
 <<dense-vector-quantization,`quantization`>>. Quantization will reduce the
-required memory by 4x, but it will also reduce the precision of the vectors and
-increase disk usage for the field (by up to 25%). Increased disk usage is a
+required memory by 4x, 8x, or as much as 32x, but it will also reduce the precision of the vectors and
+increase disk usage for the field (by up to 25%, 12.5%, or 3.125%, respectively). Increased disk usage is a
 result of {es} storing both the quantized and the unquantized vectors.
-For example, when quantizing 40GB of floating point vectors an extra 10GB of data will be stored for the quantized vectors. The total disk usage amounts to 50GB, but the memory usage for fast search will be reduced to 10GB.
+For example, when int8 quantizing 40GB of floating point vectors an extra 10GB of data will be stored for the quantized vectors.
+The total disk usage amounts to 50GB, but the memory usage for fast search will be reduced to 10GB.
 
 For `float` vectors with `dim` greater than or equal to `384`, using a
 <<dense-vector-quantization,`quantized`>> index is highly recommended.
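The disk and memory figures in the int8 example above (40GB of float vectors, a 4x memory reduction) work out as plain arithmetic; a minimal sketch to check them:

```python
# Worked arithmetic for the int8 quantization example quoted in the diff above.
# int8 quantization shrinks each float32 value from 4 bytes to 1 byte (4x).
raw_gb = 40.0                          # original float vectors, from the example
quantized_gb = raw_gb / 4              # int8 copy: 10 GB
total_disk_gb = raw_gb + quantized_gb  # both copies are kept on disk: 50 GB
search_ram_gb = quantized_gb           # only the quantized copy must be cached in RAM

print(quantized_gb, total_disk_gb, search_ram_gb)
```

This also shows where the "up to 25% extra disk" figure comes from: the int8 copy is one quarter the size of the raw float data.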
@@ -68,12 +69,23 @@ Another option is to use <<synthetic-source,synthetic `_source`>>.
 kNN search. HNSW is a graph-based algorithm which only works efficiently when
 most vector data is held in memory. You should ensure that data nodes have at
 least enough RAM to hold the vector data and index structures. To check the
-size of the vector data, you can use the <<indices-disk-usage>> API. As a
-loose rule of thumb, and assuming the default HNSW options, the bytes used will
-be `num_vectors * 4 * (num_dimensions + 12)`. When using the `byte` <<dense-vector-element-type,`element_type`>>
-the space required will be closer to `num_vectors * (num_dimensions + 12)`. Note that
-the required RAM is for the filesystem cache, which is separate from the Java
-heap.
+size of the vector data, you can use the <<indices-disk-usage>> API.
+
+Here are estimates for different element types and quantization levels:
++
+--
+`element_type: float`: `num_vectors * num_dimensions * 4`
+`element_type: float` with `quantization: int8`: `num_vectors * (num_dimensions + 4)`
+`element_type: float` with `quantization: int4`: `num_vectors * (num_dimensions/2 + 4)`
+`element_type: float` with `quantization: bbq`: `num_vectors * (num_dimensions/8 + 12)`
+`element_type: byte`: `num_vectors * num_dimensions`
+`element_type: bit`: `num_vectors * (num_dimensions/8)`
+--
+
+If utilizing HNSW, the graph must also be in memory; to estimate the required bytes use `num_vectors * 4 * HNSW.m`. The
+default value for `HNSW.m` is 16, so by default `num_vectors * 4 * 16`.
+
+Note that the required RAM is for the filesystem cache, which is separate from the Java heap.
 
 The data nodes should also leave a buffer for other ways that RAM is needed.
 For example your index might also include text fields and numerics, which also
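The size estimates added in the hunk above are easy to tabulate. A minimal sketch using those formulas; the example workload (one million 384-dimensional vectors) is an illustrative assumption, not from the diff:

```python
# Memory estimates (in bytes) for vector data, per the formulas in the doc diff above.
def vector_bytes(num_vectors: int, num_dimensions: int, kind: str = "float") -> float:
    formulas = {
        "float": num_vectors * num_dimensions * 4,           # raw float32
        "int8":  num_vectors * (num_dimensions + 4),         # float + int8 quantization
        "int4":  num_vectors * (num_dimensions / 2 + 4),     # float + int4 quantization
        "bbq":   num_vectors * (num_dimensions / 8 + 12),    # float + bbq quantization
        "byte":  num_vectors * num_dimensions,               # byte element_type
        "bit":   num_vectors * (num_dimensions / 8),         # bit element_type
    }
    return formulas[kind]

def hnsw_graph_bytes(num_vectors: int, m: int = 16) -> int:
    # The HNSW graph must also fit in memory: num_vectors * 4 * HNSW.m (default m=16).
    return num_vectors * 4 * m

# Illustrative workload: 1 million 384-dimensional vectors.
n, d = 1_000_000, 384
print(vector_bytes(n, d, "float"))  # ~1.54 GB of raw vectors
print(vector_bytes(n, d, "int8"))   # ~0.39 GB with int8 quantization
print(hnsw_graph_bytes(n))          # 64 MB for the default HNSW graph
```

Note these are filesystem-cache estimates, not Java heap; the quantized formulas give the memory needed for fast search, while disk still holds the unquantized copy as well.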

docs/reference/inference/inference-apis.asciidoc
Lines changed: 12 additions & 12 deletions

@@ -35,21 +35,21 @@ Elastic –, then create an {infer} endpoint by the <<put-inference-api>>.
 Now use <<semantic-search-semantic-text, semantic text>> to perform
 <<semantic-search, semantic search>> on your data.
 
-[discrete]
-[[default-enpoints]]
-=== Default {infer} endpoints
+//[discrete]
+//[[default-enpoints]]
+//=== Default {infer} endpoints
 
-Your {es} deployment contains some preconfigured {infer} endpoints that makes it easier for you to use them when defining `semantic_text` fields or {infer} processors.
-The following list contains the default {infer} endpoints listed by `inference_id`:
+//Your {es} deployment contains some preconfigured {infer} endpoints that makes it easier for you to use them when defining `semantic_text` fields or {infer} processors.
+//The following list contains the default {infer} endpoints listed by `inference_id`:
 
-* `.elser-2-elasticsearch`: uses the {ml-docs}/ml-nlp-elser.html[ELSER] built-in trained model for `sparse_embedding` tasks (recommended for English language texts)
-* `.multilingual-e5-small-elasticsearch`: uses the {ml-docs}/ml-nlp-e5.html[E5] built-in trained model for `text_embedding` tasks (recommended for non-English language texts)
+//* `.elser-2-elasticsearch`: uses the {ml-docs}/ml-nlp-elser.html[ELSER] built-in trained model for `sparse_embedding` tasks (recommended for English language texts)
+//* `.multilingual-e5-small-elasticsearch`: uses the {ml-docs}/ml-nlp-e5.html[E5] built-in trained model for `text_embedding` tasks (recommended for non-English language texts)
 
-Use the `inference_id` of the endpoint in a <<semantic-text,`semantic_text`>> field definition or when creating an <<inference-processor,{infer} processor>>.
-The API call will automatically download and deploy the model which might take a couple of minutes.
-Default {infer} enpoints have {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations] enabled.
-For these models, the minimum number of allocations is `0`.
-If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.
+//Use the `inference_id` of the endpoint in a <<semantic-text,`semantic_text`>> field definition or when creating an <<inference-processor,{infer} processor>>.
+//The API call will automatically download and deploy the model which might take a couple of minutes.
+//Default {infer} enpoints have {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations] enabled.
+//For these models, the minimum number of allocations is `0`.
+//If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.
 
 
 [discrete]
