Skip to content

Commit 4089209

Browse files
authored
Merge branch 'main' into repository-hdfs-commons-lang3-version
2 parents 6fae26d + ef8a008 commit 4089209

File tree

180 files changed

+5513
-1749
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

180 files changed

+5513
-1749
lines changed

distribution/docker/src/docker/dockerfiles/cloud_ess_fips/Dockerfile

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@
2525
# Extract Elasticsearch artifact
2626
################################################################################
2727
28-
FROM docker.elastic.co/wolfi/chainguard-base-fips:latest@sha256:a02c67d96cd6ec1b50a055f1f5515e0987b643d001d436088cefc02e5b786bf9 AS builder
28+
FROM docker.elastic.co/wolfi/chainguard-base-fips:latest@sha256:f30b871373f23e31c9b083625d9e77075e6cde801227520b10b081b749b7f8c1 AS builder
2929
3030
# Install required packages to extract the Elasticsearch distribution
3131
RUN <%= retry.loop(package_manager, "export DEBIAN_FRONTEND=noninteractive && ${package_manager} update && ${package_manager} update && ${package_manager} add --no-cache curl") %>
@@ -104,7 +104,7 @@ WORKDIR /usr/share/elasticsearch/config
104104
# Add entrypoint
105105
################################################################################
106106

107-
FROM docker.elastic.co/wolfi/chainguard-base-fips:latest@sha256:a02c67d96cd6ec1b50a055f1f5515e0987b643d001d436088cefc02e5b786bf9
107+
FROM docker.elastic.co/wolfi/chainguard-base-fips:latest@sha256:f30b871373f23e31c9b083625d9e77075e6cde801227520b10b081b749b7f8c1
108108

109109
RUN <%= retry.loop(package_manager,
110110
"export DEBIAN_FRONTEND=noninteractive && \n" +

docs/changelog/137367.yaml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
pr: 137367
2+
summary: GROUP BY ALL
3+
area: ES|QL
4+
type: enhancement
5+
issues: []

docs/changelog/138023.yaml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
pr: 138023
2+
summary: Push down COUNT(*) BY DATE_TRUNC
3+
area: ES|QL
4+
type: feature
5+
issues: []

docs/changelog/138140.yaml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
pr: 138140
2+
summary: "Fix semantic highlighting when using a `knn` query with minimum `similarity` and when using `bbq_disk`"
3+
area: Relevance
4+
type: bug
5+
issues: []

docs/changelog/138539.yaml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
pr: 138539
2+
summary: Handle serialization of null blocks in `AggregateMetricDoubleBlock`
3+
area: ES|QL
4+
type: bug
5+
issues: []

docs/internal/DistributedArchitectureGuide.md

Lines changed: 221 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,227 @@ A guide to the general Elasticsearch components can be found [here](https://gith
1212

1313
# Networking
1414

15+
Every elasticsearch node maintains various networking clients and servers,
16+
protocols, and synchronous/asynchronous handling. Our public docs cover user
17+
facing settings and some internal aspects - [Network Settings](https://www.elastic.co/docs/reference/elasticsearch/configuration-reference/networking-settings).
18+
19+
## HTTP Server
20+
21+
The HTTP Server is a single entry point for all external clients (excluding
22+
cross-cluster communication). Management, ingestion, search, and all other
23+
external operations pass through the HTTP server.
24+
25+
Elasticsearch works over HTTP 1.1 and supports features such as TLS, chunked
26+
transfer encoding, content compression, and pipelining. While attempting to
27+
be HTTP spec compliant, Elasticsearch is not a webserver. ES Supports `GET`
28+
requests with a payload (though some old proxies may drop content) and
29+
`POST` for clients unable to send `GET-with-body`. Requests cannot be cached
30+
by middle boxes.
31+
32+
There is no connection limit, but a limit on payload size exists. The default
33+
maximum payload is 100MB after compression. It's a very large number and almost
34+
never a good target that the client should approach. See
35+
`HttpTransportSettings` class.
36+
37+
Security features, including basic security: authentication(authc),
38+
authorization(authz), Transport Layer Security (TLS) are available in the free
39+
tier and achieved with separate x-pack modules.
40+
41+
The HTTP server provides two options for content processing: full aggregation
42+
and incremental processing. Aggregated content is a preferable choice for small
43+
messages that do not fit for incremental parsing (e.g., JSON). Aggregation has
44+
drawbacks: it requires more memory, which is reserved until all bytes are
45+
received. Concurrent incomplete requests can lead to unbounded memory growth and
46+
potential OOMs. Large delimited content, such as bulk indexing, which is
47+
processed in byte chunks, provides better control over memory usage but is more
48+
complicated for application code.
49+
50+
Incremental bulk indexing includes a back-pressure feature. See `org.
51+
elasticsearch.index.IndexingPressure`. When memory pressure grows high
52+
(`LOW_WATERMARK`), reading bytes from TCP sockets is paused for some
53+
connections, allowing only a few to proceed until the pressure is resolved.
54+
When memory grows too high (`HIGH_WATERMARK`) bulk items are rejected with 429.
55+
This mechanism protects against unbounded memory usage and `OutOfMemory`
56+
errors (OOMs).
57+
58+
ES supports multiple `Content-Type`s for the payload. These are
59+
implementations of `MediaType` interface. A common implementation is called
60+
`XContentType`, including CBOR, JSON, SMILE, YAML, and their versioned types.
61+
X-pack extensions includes PLAIN_TEXT, CSV, etc. Classes that implement
62+
`ToXContent` and friends can be serialized and sent over HTTP.
63+
64+
HTTP routing is based on a combination of Method and URI. For example,
65+
`RestCreateIndexAction` handler uses `("PUT", "/{index}")`, where curly braces
66+
indicate path variables. `RestBulkAction` specifies a list of routes
67+
68+
```java
69+
@Override
70+
public List<Route> routes() {
71+
return List.of(
72+
new Route(POST, "/_bulk"),
73+
new Route(PUT, "/_bulk"),
74+
new Route(POST, "/{index}/_bulk"),
75+
new Route(PUT, "/{index}/_bulk")
76+
);
77+
}
78+
```
79+
80+
Every REST handler must be declared in the `ActionModule` class in the
81+
`initRestHandlers` method. Plugins implementing `ActionPlugin` can extend the
82+
list of handlers via the `getRestHandlers` override. Every REST handler
83+
should extend `BaseRestHandler`.
84+
85+
The REST handler’s job is to parse and validate the HTTP request and construct a
86+
typed version of the request, often a Transport request. When security is
87+
enabled, the HTTP layer handles authentication (based on headers), and the
88+
Transport layer handles authorization.
89+
90+
Request handling flow from Java classes view goes as:
91+
92+
```
93+
(if security enabled) Security.getHttpServerTransportWithHeadersValidator
94+
-> `Netty4HttpServerTransport`
95+
-> `AbstractHttpServerTransport`
96+
-> `RestController`
97+
-> `BaseRestHandler`
98+
-> `Rest{Some}Action`
99+
```
100+
101+
`Netty4HttpServerTransport` is a single implementation of
102+
`AbstractHttpServerTransport` from the `transport-netty4`
103+
module. The `x-pack/security` module injects TLS and headers validator.
104+
105+
## Transport
106+
107+
Transport is the term for node-to-node communication, utilizing a TCP-based
108+
custom binary protocol. Every node acts as both a client and a server.
109+
Node-to-node communication almost never uses HTTP transport (except for
110+
reindex-from-remote).
111+
112+
`Netty4Transport` is the sole implementation of TCP transport, initializing
113+
both the Transport client and server. The `x-pack/security` plugin provides
114+
a secure version: `SecurityNetty4Transport` (with TLS and authentication).
115+
116+
A `Connection` between nodes is a pool of `Channel`s, where each channel is a
117+
non-blocking TCP connection (Java NIO terminology). Once a cluster is
118+
discovered, a `Connection` (pool of `Channel`s) is opened to every other node,
119+
and every other node opens a `Connection` back. This results in two
120+
`Connection`s between any two nodes `(A→B and B→A)`. A node sends requests only
121+
on the `Connection` it opens (acting as a client). The default pool is around 13
122+
`Channel`s, divided into sub-pools for different purposes (e.g., ping,
123+
node-state, bulks). The pool structure is defined in the `ConnectionProfile`
124+
class.
125+
126+
ES never behaves incorrectly (e.g. loses data) in the face of network outages
127+
but it may become unavailable unless the network is stable. Network stability
128+
between nodes is assumed, though connectivity issues remain a constant
129+
challenge.
130+
131+
Request timeouts are discouraged, as Transport requests are guaranteed to
132+
eventually receive a response, even without a timeout. `SO_KEEPALIVE` helps
133+
detect and tear down dead connections. When a connection closes with an error,
134+
the entire pool is closed, outstanding requests fail, and the pool is
135+
reconnected.
136+
137+
There are no retries on the Transport layer itself. The application layer
138+
decides when and how to retry (e.g., via `RetryableAction` or
139+
`TransportMasterNodeAction`). In the future Transport framework might support
140+
retries #95100.
141+
142+
Transport can multiplex requests and responses in a single `Channel`, but
143+
cannot multiplex parts of messages. Each transport message must be fully
144+
dispatched before the next can be sent. Proper application-layer sizing/chunking
145+
of messages is recommended to ensure fairness of delivery across multiple
146+
senders. A Transport message cannot be larger than 30% of heap (
147+
`org.elasticsearch.transport.TcpTransport#THIRTY_PER_HEAP_SIZE`) or 2GB (due to
148+
`org.elasticsearch.transport.Header#networkMessageSize` being an `int`).
149+
150+
The `TransportMessage` family tree includes various types (node-to-node,
151+
broadcast, master node acknowledged) to ensure correct dispatch and response
152+
handling. For example when a message must be accepted on all nodes.
153+
154+
## Other networking stacks
155+
156+
Snapshotting to remote repositories involves different networking clients
157+
and SDKs. For example AWS SDK comes with Apache or Netty HTTP client, Azure
158+
with Netty-based Project-Reactor, GCP uses default Java HTTP client.
159+
Underlying clients may be reused between repositories, with varying levels of
160+
control over networking settings.
161+
162+
There are other features such as SAML/JWT metadata reloading, Watcher HTTP
163+
action, reindex and ML related features such as inference that also use HTTP
164+
clients.
165+
166+
## Sync/Async IO and threading
167+
168+
ES handles a mix of I/O operations (disk, HTTP server,
169+
Transport client/server, repositories), resulting in a combination of
170+
synchronous and asynchronous styles. Asynchronous IO utilizes a small set of
171+
threads by running small tasks, minimizing context switch. Synchronous IO
172+
uses many threads and relies on an OS scheduler. ES typically runs with 100+
173+
threads, where Async and Sync threads compete for resources.
174+
175+
## Netty
176+
177+
Netty is a networking framework/toolkit used extensively for HTTP and Transport
178+
networks, providing foundational building blocks for networking applications.
179+
180+
### Event-Loop (Transport-Thread)
181+
182+
Netty is an Async IO framework, it runs with a few threads. An event-loop is
183+
a thread that processes events for one or many `Channels` (TCP connections).
184+
Every `Channel` has exactly one, unchanging event-loop, eliminating the need to
185+
synchronize events within that `Channel`. A single, CPU-bound `Transport
186+
ThreadPool` (e.g.,4 threads for 4 cores) serves all HTTP and Transport
187+
servers and clients, handling potentially hundreds or thousands of connections.
188+
189+
Event-loop threads serve many connections each, it's critical to not block
190+
threads for a long time. Fork any blocking operation or heavy computation to
191+
another thread pool. Forking, however, comes with overhead. Do not fork
192+
simple requests that can be served from memory and do not require heavy
193+
computations (milliseconds).
194+
195+
Transport threads are monitored by `ThreadWatchdog`. A warning log appears if a
196+
single task runs longer than 5 seconds. Slowness can be caused by blocking, GC
197+
pauses, or CPU starvation from other thread pools.
198+
199+
### ByteBuf - byte buffers and reference counting
200+
201+
Netty's controlled memory allocation provides a performance edge by managing and
202+
reusing byte buffer pools (e.g., pools of 1MiB byte chunks sliced into 16KiB
203+
pages). Some pages might not be in use while taking up heap space and show up in
204+
the heap dump.
205+
206+
Netty reads socket bytes into direct buffers, and ES copies them into pooled
207+
byte-buffers (`CopyBytesSocketChannel`). The application is responsible for
208+
retaining (increasing ref-count) and releasing (decreasing ref-count) for
209+
pooled buffers.
210+
211+
Reference counting introduces two primary problems:
212+
213+
1. Use after release (free): Accessing a buffer after it has been explicitly
214+
released.
215+
2. Never release (leak): Failing to release a buffer, leading to memory leaks.
216+
217+
The compiler does not help detect these issues. They require careful testing
218+
using Netty's LeakDetector with a Paranoid level. It's enabled by default in
219+
all tests.
220+
221+
### Async methods return futures
222+
223+
Every asynchronous operation in Netty returns a future. It is easy to forget
224+
to check the result, as a following call always succeeds:
225+
226+
```java
227+
ctx.write(message)
228+
```
229+
230+
Check the result of an async operation:
231+
232+
```java
233+
ctx.write(message).addListener(f -> { if (f.isSuccess() ...)});
234+
```
235+
15236
### ThreadPool
16237

17238
(We have many thread pools, what and why)
@@ -28,8 +249,6 @@ See the [Javadocs for `ActionListener`](https://github.com/elastic/elasticsearch
28249

29250
### Performance
30251

31-
### Netty
32-
33252
(long running actions should be forked off of the Netty thread. Keep short operations to avoid forking costs)
34253

35254
### Work Queues

docs/reference/elasticsearch/rest-apis/retrievers/diversify-retriever.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ This is useful when you want to maximize diversity by preventing similar documen
1111
Practical use cases include:
1212
- **eCommerce applications**: Show users a wider variety of products rather than multiple similar items
1313
- **Retrieval augmented generation (RAG) workflows**: Provide more diverse context to the LLM, reducing redundancy in the prompt
14-
-
14+
1515
The retriever uses [MMR (Maximum Marginal Relevance)](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) diversification to discard results that are too similar to each other.
1616
Similarity is determined based on the `field` parameter and the optionally provided `query_vector`.
1717
:::{note}

docs/reference/query-languages/esql/_snippets/commands/layout/lookup-join.md

Lines changed: 6 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -13,28 +13,20 @@ Refer to [the high-level landing page](../../../../esql/esql-lookup-join.md) for
1313

1414
```esql
1515
FROM <source_index>
16-
| LOOKUP JOIN <lookup_index> ON <field_name>
17-
```
18-
19-
```esql
20-
FROM <source_index>
21-
| LOOKUP JOIN <lookup_index> ON <field_name1>, <field_name2>, <field_name3>
22-
```
23-
```esql
24-
FROM <source_index>
25-
| LOOKUP JOIN <lookup_index> ON <left_field1> >= <lookup_field1> AND <left_field2> == <lookup_field2>
16+
| LOOKUP JOIN <lookup_index> ON <join_condition>
2617
```
2718

2819
**Parameters**
2920

3021
`<lookup_index>`
3122
: The name of the lookup index. This must be a specific index name - wildcards, aliases, and remote cluster references are not supported. Indices used for lookups must be configured with the [`lookup` index mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting).
3223

33-
`<field_name>` or `<field_name1>, <field_name2>, <field_name3>` or `<left_field1> >= <lookup_field1> AND <left_field2> == <lookup_field2>`
34-
: The join condition. Can be one of the following:
24+
`<join_condition>`
25+
: Can be one of the following:
3526
* A single field name
36-
* A comma-separated list of field names {applies_to}`stack: ga 9.2`
37-
* An expression with one or more join conditions linked by `AND`. Each condition compares a field from the left index with a field from the lookup index using [binary operators](/reference/query-languages/esql/functions-operators/operators.md#esql-binary-operators) (`==`, `>=`, `<=`, `>`, `<`, `!=`). Each field name in the join condition must exist in only one of the indexes. Use RENAME to resolve naming conflicts. {applies_to}`stack: preview 9.2` {applies_to}`serverless: preview`
27+
* A comma-separated list of field names, for example `<field1>, <field2>, <field3>` {applies_to}`stack: ga 9.2`
28+
* An expression with one or more predicates linked by `AND`, for example `<left_field1> >= <lookup_field1> AND <left_field2> == <lookup_field2>`. Each predicate compares a field from the left index with a field from the lookup index using [binary operators](/reference/query-languages/esql/functions-operators/operators.md#esql-binary-operators) (`==`, `>=`, `<=`, `>`, `<`, `!=`). Each field name in the join condition must exist in only one of the indexes. Use RENAME to resolve naming conflicts. {applies_to}`stack: preview 9.2` {applies_to}`serverless: preview`
29+
* An expression that includes [full text functions](/reference/query-languages/esql/functions-operators/search-functions.md) and other Lucene-pushable functions, for example `MATCH(<lookup_field>, "search term") AND <left_field> == <lookup_field>`. These functions can be combined with binary operators and logical operators (`AND`, `OR`, `NOT`) to create complex join conditions. At least one condition that relates the lookup index fields to the left side of the join fields is still required. {applies_to}`stack: preview 9.3` {applies_to}`serverless: preview`
3830
: If using join on a single field or a field list, the fields used must exist in both your current query results and in the lookup index. If the fields contains multi-valued entries, those entries will not match anything (the added fields will contain `null` for those rows).
3931

4032

docs/reference/query-languages/esql/_snippets/functions/parameters/count_over_time.md

Lines changed: 3 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

docs/reference/query-languages/esql/_snippets/functions/parameters/first_over_time.md

Lines changed: 3 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)