Commit a089586

Merge branch 'main' into ml-eis-integration-jbc
2 parents deab545 + c7b61bd commit a089586

106 files changed: +2116 -1305 lines changed

docs/changelog/117949.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 117949
+summary: Move `SlowLogFieldProvider` instantiation to node construction
+area: Infra/Logging
+type: bug
+issues: []

docs/changelog/118804.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+pr: 118804
+summary: Add new experimental `rank_vectors` mapping for late-interaction second order
+  ranking
+area: Vector Search
+type: feature
+issues: []
+highlight:
+  title: Add new experimental `rank_vectors` mapping for late-interaction second order
+    ranking
+  body:
+    Late-interaction models are powerful rerankers. While their size and overall
+    cost doesn't lend itself for HNSW indexing, utilizing them as second order reranking
+    can provide excellent boosts in relevance. The new `rank_vectors` mapping allows for rescoring
+    over new and novel multi-vector late-interaction models like ColBERT or ColPali.
+  notable: true
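
For readers unfamiliar with late-interaction scoring, the following is a minimal sketch of the MaxSim operator used by ColBERT-style models, applied as a second-stage rescorer over a small candidate set. It is not Elasticsearch's implementation and does not use the `rank_vectors` API. The function names `max_sim` and `rescore`, the choice of dot-product similarity, and the toy vectors are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """Late-interaction (MaxSim) score.

    For each query token vector, take its best similarity against any
    document token vector, then sum those maxima over the query.
    """
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def rescore(
    query_vectors: Sequence[Vector],
    candidates: Dict[str, Sequence[Vector]],
) -> List[str]:
    """Reorder first-stage candidates (hypothetical doc id -> token vectors)
    by descending MaxSim score."""
    return sorted(
        candidates,
        key=lambda doc_id: max_sim(query_vectors, candidates[doc_id]),
        reverse=True,
    )
```

Because every query vector is compared with every document vector, the cost grows with the product of the two token counts, which is one reason to apply this kind of scoring only to a short list of first-stage hits.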

docs/plugins/discovery-ec2.asciidoc

Lines changed: 2 additions & 2 deletions
@@ -241,7 +241,7 @@ The `discovery-ec2` plugin can automatically set the `aws_availability_zone`
 node attribute to the availability zone of each node. This node attribute
 allows you to ensure that each shard has copies allocated redundantly across
 multiple availability zones by using the
-{ref}/modules-cluster.html#shard-allocation-awareness[Allocation Awareness]
+{ref}/shard-allocation-awareness.html#[Allocation Awareness]
 feature.
 
 In order to enable the automatic definition of the `aws_availability_zone`
@@ -333,7 +333,7 @@ labelled as `Moderate` or `Low`.
 
 * It is a good idea to distribute your nodes across multiple
 https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html[availability
-zones] and use {ref}/modules-cluster.html#shard-allocation-awareness[shard
+zones] and use {ref}/shard-allocation-awareness.html[shard
 allocation awareness] to ensure that each shard has copies in more than one
 availability zone.
 

docs/reference/analysis.asciidoc

Lines changed: 1 addition & 2 deletions
@@ -9,8 +9,7 @@
 --
 
 _Text analysis_ is the process of converting unstructured text, like
-the body of an email or a product description, into a structured format that's
-optimized for search.
+the body of an email or a product description, into a structured format that's <<full-text-search,optimized for search>>.
 
 [discrete]
 [[when-to-configure-analysis]]

docs/reference/analysis/tokenizers.asciidoc

Lines changed: 8 additions & 0 deletions
@@ -1,6 +1,14 @@
 [[analysis-tokenizers]]
 == Tokenizer reference
 
+.Difference between {es} tokenization and neural tokenization
+[NOTE]
+====
+{es}'s tokenization process produces linguistic tokens, optimized for search and retrieval.
+This differs from neural tokenization in the context of machine learning and natural language processing. Neural tokenizers translate strings into smaller, subword tokens, which are encoded into vectors for consumptions by neural networks.
+{es} does not have built-in neural tokenizers.
+====
+
 A _tokenizer_ receives a stream of characters, breaks it up into individual
 _tokens_ (usually individual words), and outputs a stream of _tokens_. For
 instance, a <<analysis-whitespace-tokenizer,`whitespace`>> tokenizer breaks
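
The distinction drawn in the added note can be illustrated with a small sketch. It contrasts word-level splitting on whitespace with a greedy longest-match subword split in the style of WordPiece. Neither function reflects how {es} or any particular neural tokenizer is implemented; the function names, the `##` continuation prefix convention, the `[UNK]` fallback, and the three-entry vocabulary are assumptions chosen for illustration.

```python
from typing import List, Set


def whitespace_tokenize(text: str) -> List[str]:
    """Word-level tokens: split on runs of whitespace."""
    return text.split()


def subword_tokenize(word: str, vocab: Set[str], unk: str = "[UNK]") -> List[str]:
    """Greedy longest-match subword split (WordPiece-style sketch).

    Pieces that do not start the word carry a '##' prefix. If no vocabulary
    entry matches at some position, the whole word maps to `unk`.
    """
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, for illustration only.
TOY_VOCAB = {"token", "##izer", "##s"}
```

With this toy vocabulary, `whitespace_tokenize("fast tokenizers")` keeps `tokenizers` as one token, while `subword_tokenize("tokenizers", TOY_VOCAB)` breaks it into three pieces that a model would then map to vectors.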

docs/reference/cat/nodeattrs.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ console. They are _not_ intended for use by applications. For application
 consumption, use the <<cluster-nodes-info,nodes info API>>.
 ====
 
-Returns information about <<shard-allocation-filtering,custom node attributes>>.
+Returns information about <<custom-node-attributes,custom node attributes>>.
 
 [[cat-nodeattrs-api-request]]
 ==== {api-request-title}

docs/reference/cluster.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ one of the following:
 master-eligible nodes, all data nodes, all ingest nodes, all voting-only
 nodes, all machine learning nodes, and all coordinating-only nodes.
 * a pair of patterns, using `*` wildcards, of the form `attrname:attrvalue`,
-which adds to the subset all nodes with a custom node attribute whose name
+which adds to the subset all nodes with a <<custom-node-attributes,custom node attribute>> whose name
 and value match the respective patterns. Custom node attributes are
 configured by setting properties in the configuration file of the form
 `node.attr.attrname: attrvalue`.
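
The `attrname:attrvalue` selection rule described in this hunk can be sketched as follows. This is only an illustration of the matching semantics as the documentation states them (a node is selected when some custom attribute's name and value both match their respective `*` patterns). It is not the {es} implementation, and the helper names `wildcard_match` and `select_nodes` and the sample node data are hypothetical.

```python
import re
from typing import Dict, List


def wildcard_match(pattern: str, text: str) -> bool:
    """Match `text` against `pattern`, where `*` stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text) is not None


def select_nodes(nodes: Dict[str, Dict[str, str]], node_filter: str) -> List[str]:
    """Return the names of nodes that have at least one custom attribute whose
    name and value match the two patterns in `attrname:attrvalue`."""
    name_pattern, value_pattern = node_filter.split(":", 1)
    return sorted(
        node
        for node, attrs in nodes.items()
        if any(
            wildcard_match(name_pattern, name) and wildcard_match(value_pattern, value)
            for name, value in attrs.items()
        )
    )


# Hypothetical nodes, as if configured with `node.attr.rack: r1` and so on.
SAMPLE_NODES = {
    "node-1": {"rack": "r1"},
    "node-2": {"rack": "r2"},
    "node-3": {"zone": "a"},
}
```

Under these assumptions, `select_nodes(SAMPLE_NODES, "ra*:r*")` picks the two rack-tagged nodes, and `select_nodes(SAMPLE_NODES, "rack:r1")` picks only `node-1`.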

docs/reference/commands/node-tool.asciidoc

Lines changed: 2 additions & 2 deletions
@@ -23,8 +23,8 @@ bin/elasticsearch-node repurpose|unsafe-bootstrap|detach-cluster|override-versio
 This tool has a number of modes:
 
 * `elasticsearch-node repurpose` can be used to delete unwanted data from a
-node if it used to be a <<data-node,data node>> or a
-<<master-node,master-eligible node>> but has been repurposed not to have one
+node if it used to be a <<data-node-role,data node>> or a
+<<master-node-role,master-eligible node>> but has been repurposed not to have one
 or other of these roles.
 
 * `elasticsearch-node remove-settings` can be used to remove persistent settings

docs/reference/data-management.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ Data older than this period can be deleted by {es} at a later time.
 
 **Elastic Curator** is a tool that allows you to manage your indices and snapshots using user-defined filters and predefined actions. If ILM provides the functionality to manage your index lifecycle, and you have at least a Basic license, consider using ILM in place of Curator. Many stack components make use of ILM by default. {curator-ref-current}/ilm.html[Learn more].
 
-NOTE: <<xpack-rollup,Data rollup>> is a deprecated Elasticsearch feature that allows you to manage the amount of data that is stored in your cluster, similar to the downsampling functionality of {ilm-init} and data stream lifecycle. This feature should not be used for new deployments.
+NOTE: <<xpack-rollup,Data rollup>> is a deprecated {es} feature that allows you to manage the amount of data that is stored in your cluster, similar to the downsampling functionality of {ilm-init} and data stream lifecycle. This feature should not be used for new deployments.
 
 [TIP]
 ====

docs/reference/data-management/migrate-index-allocation-filters.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 [[migrate-index-allocation-filters]]
 == Migrate index allocation filters to node roles
 
-If you currently use custom node attributes and
+If you currently use <<custom-node-attributes,custom node attributes>> and
 <<shard-allocation-filtering, attribute-based allocation filters>> to
 move indices through <<data-tiers, data tiers>> in a
 https://www.elastic.co/blog/implementing-hot-warm-cold-in-elasticsearch-with-index-lifecycle-management[hot-warm-cold architecture],
