Commit a6bdc90

Merge branch 'main' into markjhoy/default_token_pruning_sparse_vector
2 parents 8f6672f + 52bc94e commit a6bdc90

167 files changed, +5313 -603 lines changed


build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/ElasticsearchTestBasePlugin.java

Lines changed: 1 addition & 0 deletions
@@ -120,6 +120,7 @@ public void execute(Task t) {
 "--add-opens=java.base/java.nio.file=ALL-UNNAMED",
 "--add-opens=java.base/java.time=ALL-UNNAMED",
 "--add-opens=java.management/java.lang.management=ALL-UNNAMED",
+"--enable-native-access=ALL-UNNAMED",
 "-XX:+HeapDumpOnOutOfMemoryError"
 );

docs/changelog/126581.yaml

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+pr: 126581
+summary: "Optimize shared blob cache evictions on shard removal
+Shared blob cache evictions occur on the cluster applier thread when shards are
+removed from a node. These can be expensive if a large number of shards are
+being removed. This change uses the context of the removal to avoid unnecessary
+evictions that might hold up the applier thread.
+"
+area: Snapshot/Restore
+type: enhancement
+issues: []

docs/changelog/128241.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 128241
+summary: Adding VoyageAI's v3.5 models
+area: Machine Learning
+type: enhancement
+issues: []

docs/changelog/128259.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 128259
+summary: Added geometry validation for GEO types to exit early on invalid latitudes
+area: Geo
+type: bug
+issues:
+- 128234

docs/changelog/128263.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 128263
+summary: Allow lookup join on mixed numeric fields
+area: ES|QL
+type: enhancement
+issues: []

docs/changelog/128449.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 128449
+summary: "[Draft] Support concurrent multipart uploads in Azure"
+area: Snapshot/Restore
+type: enhancement
+issues: []

docs/changelog/128473.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 128473
+summary: Conditionally force sequential reading in `LuceneSyntheticSourceChangesSnapshot`
+area: Logs
+type: enhancement
+issues: []

docs/reference/aggregations/search-aggregations-bucket-significantterms-aggregation.md

Lines changed: 15 additions & 15 deletions
@@ -253,8 +253,8 @@ Like most design decisions, this is the basis of a trade-off in which we have ch
 The JLH score can be used as a significance score by adding the parameter

 ```js
-"jlh": {
-}
+"jlh": {
+}
 ```

 The scores are derived from the doc frequencies in *foreground* and *background* sets. The *absolute* change in popularity (foregroundPercent - backgroundPercent) would favor common terms whereas the *relative* change in popularity (foregroundPercent/ backgroundPercent) would favor rare terms. Rare vs common is essentially a precision vs recall balance and so the absolute and relative changes are multiplied to provide a sweet spot between precision and recall.
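
Reviewer note: the JLH description in the context line above (absolute change multiplied by relative change) can be illustrated with a minimal Python sketch. The function and parameter names are hypothetical and are not taken from the Elasticsearch source. Returning 0 for terms that are not more popular in the foreground set is an assumption of this sketch, not something the quoted text states.

```python
def jlh_score(subset_freq, subset_size, superset_freq, superset_size):
    """Illustrative JLH-style score: (fg% - bg%) * (fg% / bg%).

    subset_* describe the foreground set, superset_* the background set.
    All names here are hypothetical.
    """
    # Guard against empty sets and division by zero.
    if subset_size == 0 or superset_size == 0 or superset_freq == 0:
        return 0.0
    foreground_percent = subset_freq / subset_size
    background_percent = superset_freq / superset_size
    absolute_change = foreground_percent - background_percent
    if absolute_change <= 0:
        # Assumption: a term that is not more popular in the foreground scores 0.
        return 0.0
    relative_change = foreground_percent / background_percent
    return absolute_change * relative_change


# A rare term concentrated in the foreground versus a common term that is
# only slightly over-represented there.
rare = jlh_score(subset_freq=50, subset_size=100, superset_freq=100, superset_size=10000)
common = jlh_score(subset_freq=900, subset_size=1000, superset_freq=8000, superset_size=10000)
print(rare > common)
```

Multiplying the two changes means a term must do well on both measures to rank highly, which is the precision/recall balance the documentation text describes.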
@@ -265,9 +265,9 @@ The scores are derived from the doc frequencies in *foreground* and *background*
 Mutual information as described in "Information Retrieval", Manning et al., Chapter 13.5.1 can be used as significance score by adding the parameter

 ```js
-"mutual_information": {
-"include_negatives": true
-}
+"mutual_information": {
+"include_negatives": true
+}
 ```

 Mutual information does not differentiate between terms that are descriptive for the subset or for documents outside the subset. The significant terms therefore can contain terms that appear more or less frequent in the subset than outside the subset. To filter out the terms that appear less often in the subset than in documents outside the subset, `include_negatives` can be set to `false`.
@@ -284,8 +284,8 @@ Per default, the assumption is that the documents in the bucket are also contain
 Chi square as described in "Information Retrieval", Manning et al., Chapter 13.5.2 can be used as significance score by adding the parameter

 ```js
-"chi_square": {
-}
+"chi_square": {
+}
 ```

 Chi square behaves like mutual information and can be configured with the same parameters `include_negatives` and `background_is_superset`.
@@ -296,8 +296,8 @@ Chi square behaves like mutual information and can be configured with the same p
 Google normalized distance as described in ["The Google Similarity Distance", Cilibrasi and Vitanyi, 2007](https://arxiv.org/pdf/cs/0412098v3.pdf) can be used as significance score by adding the parameter

 ```js
-"gnd": {
-}
+"gnd": {
+}
 ```

 `gnd` also accepts the `background_is_superset` parameter.
@@ -394,8 +394,8 @@ The benefit of this heuristic is that the scoring logic is simple to explain to
 It would be hard for a seasoned boxer to win a championship if the prize was awarded purely on the basis of percentage of fights won - by these rules a newcomer with only one fight under their belt would be impossible to beat. Multiple observations are typically required to reinforce a view so it is recommended in these cases to set both `min_doc_count` and `shard_min_doc_count` to a higher value such as 10 in order to filter out the low-frequency terms that otherwise take precedence.

 ```js
-"percentage": {
-}
+"percentage": {
+}
 ```

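
Reviewer note: the boxer analogy in the context line above can be made concrete with a small Python sketch. It assumes the percentage heuristic is the foreground count divided by the background count, and that `min_doc_count` filters on the foreground count. Both are assumptions of this sketch, and the function names are hypothetical.

```python
def percentage_score(subset_freq, superset_freq):
    """Share of all documents containing the term that fall in the foreground set."""
    if superset_freq == 0:
        return 0.0
    return subset_freq / superset_freq


def rank_terms(term_counts, min_doc_count=10):
    """Rank terms by percentage score, dropping low-frequency terms first.

    term_counts maps term -> (subset_freq, superset_freq).
    """
    kept = {
        term: percentage_score(subset_freq, superset_freq)
        for term, (subset_freq, superset_freq) in term_counts.items()
        if subset_freq >= min_doc_count  # assumed meaning of min_doc_count
    }
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)


counts = {
    "newcomer": (1, 1),    # one fight, one win
    "veteran": (45, 50),   # many fights, most of them won
}
unfiltered = rank_terms(counts, min_doc_count=1)
filtered = rank_terms(counts, min_doc_count=10)
```

With a threshold of 1 the single-observation term ranks first. Raising the threshold to 10 removes it, which is the effect the recommendation in the documentation text is after.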

@@ -413,11 +413,11 @@ If none of the above measures suits your usecase than another option is to imple
 Customized scores can be implemented via a script:

 ```js
-"script_heuristic": {
+"script_heuristic": {
 "script": {
-"lang": "painless",
-"source": "params._subset_freq/(params._superset_freq - params._subset_freq + 1)"
-}
+"lang": "painless",
+"source": "params._subset_freq/(params._superset_freq - params._subset_freq + 1)"
+}
 }
 ```
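
Reviewer note: the arithmetic in the `source` expression above can be checked outside Elasticsearch with a short Python sketch. The function name is hypothetical. The sketch uses floating-point division, whereas whether the Painless expression divides as integers or as floating-point values depends on the runtime types of the `params` entries, which this diff does not show.

```python
def script_heuristic(subset_freq, superset_freq):
    """Python rendering of:
    params._subset_freq / (params._superset_freq - params._subset_freq + 1)
    """
    # The +1 keeps the denominator non-zero when every occurrence of the
    # term is in the subset (subset_freq == superset_freq).
    return subset_freq / (superset_freq - subset_freq + 1)


# A term found only in the subset scores its own frequency.
only_in_subset = script_heuristic(subset_freq=10, superset_freq=10)
# A term that is mostly outside the subset scores much lower.
mostly_outside = script_heuristic(subset_freq=10, superset_freq=109)
```

The score grows with the subset frequency and shrinks as more occurrences fall outside the subset.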

docs/reference/elasticsearch-plugins/discovery-azure-classic-scale.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ mapped_pages:
 - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-azure-classic-scale.html
 ---

-# Scaling out! [discovery-azure-classic-scale]
+# Scaling out [discovery-azure-classic-scale]

 You need first to create an image of your previous machine. Disconnect from your machine and run locally the following commands:

docs/reference/elasticsearch/mapping-reference/parent-join.md

Lines changed: 2 additions & 1 deletion
@@ -1,12 +1,13 @@
 ---
+applies_to:
+serverless: unavailable
 navigation_title: "Join"
 mapped_pages:
 - https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html
 ---

 # Join field type [parent-join]

-
 The `join` data type is a special field that creates parent/child relation within documents of the same index. The `relations` section defines a set of possible relations within the documents, each relation being a parent name and a child name.

 ::::{warning}
