diff --git a/distribution/docker/src/docker/dockerfiles/cloud_ess_fips/Dockerfile b/distribution/docker/src/docker/dockerfiles/cloud_ess_fips/Dockerfile
Lines changed: 2 additions & 2 deletions
diff --git a/docs/changelog/137367.yaml b/docs/changelog/137367.yaml
Lines changed: 5 additions & 0 deletions
diff --git a/docs/changelog/138023.yaml b/docs/changelog/138023.yaml
Lines changed: 5 additions & 0 deletions
diff --git a/docs/changelog/138140.yaml b/docs/changelog/138140.yaml
Lines changed: 5 additions & 0 deletions
diff --git a/docs/changelog/138539.yaml b/docs/changelog/138539.yaml
Lines changed: 5 additions & 0 deletions
diff --git a/docs/internal/DistributedArchitectureGuide.md b/docs/internal/DistributedArchitectureGuide.md
Lines changed: 221 additions & 2 deletions
diff --git a/docs/reference/elasticsearch/rest-apis/retrievers/diversify-retriever.md b/docs/reference/elasticsearch/rest-apis/retrievers/diversify-retriever.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/reference/query-languages/esql/_snippets/commands/layout/lookup-join.md b/docs/reference/query-languages/esql/_snippets/commands/layout/lookup-join.md
Lines changed: 6 additions & 14 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/functions/parameters/count_over_time.md b/docs/reference/query-languages/esql/_snippets/functions/parameters/count_over_time.md
Lines changed: 3 additions & 0 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/functions/parameters/first_over_time.md b/docs/reference/query-languages/esql/_snippets/functions/parameters/first_over_time.md
Lines changed: 3 additions & 0 deletions
@@ -25,7 +25,7 @@
 # Extract Elasticsearch artifact
 ################################################################################
 
-FROM docker.elastic.co/wolfi/chainguard-base-fips:latest@sha256:a02c67d96cd6ec1b50a055f1f5515e0987b643d001d436088cefc02e5b786bf9 AS builder
+FROM docker.elastic.co/wolfi/chainguard-base-fips:latest@sha256:f30b871373f23e31c9b083625d9e77075e6cde801227520b10b081b749b7f8c1 AS builder
 
 # Install required packages to extract the Elasticsearch distribution
 RUN <%= retry.loop(package_manager, "export DEBIAN_FRONTEND=noninteractive && ${package_manager} update && ${package_manager} update && ${package_manager} add --no-cache curl") %>
@@ -104,7 +104,7 @@ WORKDIR /usr/share/elasticsearch/config
 # Add entrypoint
 ################################################################################
 
-FROM docker.elastic.co/wolfi/chainguard-base-fips:latest@sha256:a02c67d96cd6ec1b50a055f1f5515e0987b643d001d436088cefc02e5b786bf9
+FROM docker.elastic.co/wolfi/chainguard-base-fips:latest@sha256:f30b871373f23e31c9b083625d9e77075e6cde801227520b10b081b749b7f8c1
 
 RUN <%= retry.loop(package_manager,
           "export DEBIAN_FRONTEND=noninteractive && \n" +
 
@@ -0,0 +1,5 @@
+pr: 137367
+summary: GROUP BY ALL
+area: ES|QL
+type: enhancement
+issues: []
@@ -0,0 +1,5 @@
+pr: 138023
+summary: Push down COUNT(*) BY DATE_TRUNC
+area: ES|QL
+type: feature
+issues: []
@@ -0,0 +1,5 @@
+pr: 138140
+summary: "Fix semantic highlighting when using a `knn` query with minimum `similarity` and when using `bbq_disk`"
+area: Relevance
+type: bug
+issues: []
@@ -0,0 +1,5 @@
+pr: 138539
+summary: Handle serialization of null blocks in `AggregateMetricDoubleBlock`
+area: ES|QL
+type: bug
+issues: []
@@ -12,6 +12,227 @@ A guide to the general Elasticsearch components can be found [here](https://gith
 
 # Networking
 
+Every Elasticsearch node maintains various networking clients and servers,
+protocols, and synchronous/asynchronous handling. Our public docs cover
+user-facing settings and some internal aspects: [Network Settings](https://www.elastic.co/docs/reference/elasticsearch/configuration-reference/networking-settings).
+
+## HTTP Server
+
+The HTTP Server is a single entry point for all external clients (excluding
+cross-cluster communication). Management, ingestion, search, and all other
+external operations pass through the HTTP server.
+
+Elasticsearch works over HTTP 1.1 and supports features such as TLS, chunked
+transfer encoding, content compression, and pipelining. While attempting to
+be HTTP spec compliant, Elasticsearch is not a web server. ES supports `GET`
+requests with a payload (though some old proxies may drop the content) and
+accepts `POST` from clients unable to send `GET-with-body`. Requests cannot be
+cached by middle boxes.
+
+There is no connection limit, but there is a limit on payload size. The default
+maximum payload is 100MB after compression. This limit is very large, and
+clients should almost never approach it. See the `HttpTransportSettings` class.
+
+Basic security features, including authentication (authc), authorization
+(authz), and Transport Layer Security (TLS), are available in the free tier and
+are implemented in separate x-pack modules.
+
+The HTTP server provides two options for content processing: full aggregation
+and incremental processing. Aggregation is preferable for small messages that
+are not suited to incremental parsing (e.g., JSON). Aggregation has drawbacks:
+it requires more memory, which is reserved until all bytes are received.
+Concurrent incomplete requests can lead to unbounded memory growth and
+potential OOMs. Incremental processing of large delimited content, such as bulk
+indexing, handles the payload in byte chunks. It provides better control over
+memory usage but is more complicated for application code.
+
+Incremental bulk indexing includes a back-pressure feature. See
+`org.elasticsearch.index.IndexingPressure`. When memory pressure grows high
+(`LOW_WATERMARK`), reading bytes from TCP sockets is paused for some
+connections, allowing only a few to proceed until the pressure is resolved.
+When memory pressure grows too high (`HIGH_WATERMARK`), bulk items are rejected
+with HTTP 429. This mechanism protects against unbounded memory usage and
+`OutOfMemory` errors (OOMs).
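Editorial sketch of the two-watermark idea described above. The class `PressureGate`, its `Decision` enum, and the byte thresholds are hypothetical names chosen for illustration; this is not the `IndexingPressure` API, only one way the low/high watermark decision could look.

```java
// Hypothetical illustration; not the IndexingPressure implementation.
public class PressureGate {
    public enum Decision { ACCEPT, PAUSE_READS, REJECT_429 }

    private final long lowWatermarkBytes;
    private final long highWatermarkBytes;

    public PressureGate(long lowWatermarkBytes, long highWatermarkBytes) {
        if (lowWatermarkBytes > highWatermarkBytes) {
            throw new IllegalArgumentException("low watermark must not exceed high watermark");
        }
        this.lowWatermarkBytes = lowWatermarkBytes;
        this.highWatermarkBytes = highWatermarkBytes;
    }

    // Decide what to do given the bytes currently held by in-flight bulk requests.
    public Decision decide(long inFlightBytes) {
        if (inFlightBytes >= highWatermarkBytes) {
            return Decision.REJECT_429;   // too much memory in use: reject new bulk items
        }
        if (inFlightBytes >= lowWatermarkBytes) {
            return Decision.PAUSE_READS;  // stop reading from some sockets until pressure drops
        }
        return Decision.ACCEPT;
    }
}
```

Pausing reads before rejecting lets TCP flow control slow the senders down, so rejection with 429 becomes the last resort rather than the first response to load.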
+
+ES supports multiple `Content-Type`s for the payload. These are
+implementations of the `MediaType` interface. A common implementation is
+`XContentType`, which includes CBOR, JSON, SMILE, YAML, and their versioned
+types. X-pack extensions include PLAIN_TEXT, CSV, etc. Classes that implement
+`ToXContent` and friends can be serialized and sent over HTTP.
+
+HTTP routing is based on a combination of method and URI. For example, the
+`RestCreateIndexAction` handler uses `("PUT", "/{index}")`, where curly braces
+indicate path variables. `RestBulkAction` specifies a list of routes:
+
+```java
+@Override
+public List<Route> routes() {
+    return List.of(
+        new Route(POST, "/_bulk"),
+        new Route(PUT, "/_bulk"),
+        new Route(POST, "/{index}/_bulk"),
+        new Route(PUT, "/{index}/_bulk")
+    );
+}
+```
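Editorial sketch of how a method plus a path template with `{variables}` can be matched against an incoming request. `RouteMatcher` and its `match` method are hypothetical and far simpler than the real routing code; they only illustrate the "method + URI with path variables" idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of method + path-template matching; not the RestController code.
public class RouteMatcher {
    // Returns the extracted path variables if the request matches the route, otherwise empty.
    public static Optional<Map<String, String>> match(
            String routeMethod, String routeTemplate, String requestMethod, String requestPath) {
        if (!routeMethod.equals(requestMethod)) {
            return Optional.empty();
        }
        String[] templateParts = routeTemplate.split("/");
        String[] pathParts = requestPath.split("/");
        if (templateParts.length != pathParts.length) {
            return Optional.empty();
        }
        Map<String, String> params = new HashMap<>();
        for (int i = 0; i < templateParts.length; i++) {
            String t = templateParts[i];
            if (t.startsWith("{") && t.endsWith("}")) {
                // Curly braces mark a path variable: capture the actual segment.
                params.put(t.substring(1, t.length() - 1), pathParts[i]);
            } else if (!t.equals(pathParts[i])) {
                return Optional.empty();
            }
        }
        return Optional.of(params);
    }
}
```

For instance, matching `POST /logs/_bulk` against the template `/{index}/_bulk` yields a map with `index` set to `logs`, while the same path does not match `/_bulk`.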
+
+Every REST handler must be declared in the `ActionModule` class in the
+`initRestHandlers` method. Plugins implementing `ActionPlugin` can extend the
+list of handlers via the `getRestHandlers` override. Every REST handler
+should extend `BaseRestHandler`.
+
+The REST handler’s job is to parse and validate the HTTP request and construct a
+typed version of the request, often a Transport request. When security is
+enabled, the HTTP layer handles authentication (based on headers), and the
+Transport layer handles authorization.
+
+From the perspective of Java classes, request handling flows as follows:
+
+```
+(if security enabled) Security.getHttpServerTransportWithHeadersValidator
+-> Netty4HttpServerTransport
+-> AbstractHttpServerTransport
+-> RestController
+-> BaseRestHandler
+-> Rest{Some}Action
+```
+
+`Netty4HttpServerTransport` is the only implementation of
+`AbstractHttpServerTransport` and comes from the `transport-netty4`
+module. The `x-pack/security` module injects TLS and a headers validator.
+
+## Transport
+
+Transport is the term for node-to-node communication, utilizing a TCP-based
+custom binary protocol. Every node acts as both a client and a server.
+Node-to-node communication almost never uses HTTP transport (except for
+reindex-from-remote).
+
+`Netty4Transport` is the sole implementation of TCP transport, initializing
+both the Transport client and server. The `x-pack/security` plugin provides
+a secure version: `SecurityNetty4Transport` (with TLS and authentication).
+
+A `Connection` between nodes is a pool of `Channel`s, where each channel is a
+non-blocking TCP connection (Java NIO terminology). Once a cluster is
+discovered, a `Connection` (pool of `Channel`s) is opened to every other node,
+and every other node opens a `Connection` back. This results in two
+`Connection`s between any two nodes `(A→B and B→A)`. A node sends requests only
+on the `Connection` it opens (acting as a client). The default pool is around 13
+`Channel`s, divided into sub-pools for different purposes (e.g., ping,
+node-state, bulks). The pool structure is defined in the `ConnectionProfile`
+class.
+
+ES never behaves incorrectly (e.g. loses data) in the face of network outages
+but it may become unavailable unless the network is stable. Network stability
+between nodes is assumed, though connectivity issues remain a constant
+challenge.
+
+Request timeouts are discouraged, as Transport requests are guaranteed to
+eventually receive a response, even without a timeout. `SO_KEEPALIVE` helps
+detect and tear down dead connections. When a connection closes with an error,
+the entire pool is closed, outstanding requests fail, and the pool is
+reconnected.
+
+There are no retries on the Transport layer itself. The application layer
+decides when and how to retry (e.g., via `RetryableAction` or
+`TransportMasterNodeAction`). In the future, the Transport framework might
+support retries (see #95100).
+
+Transport can multiplex requests and responses in a single `Channel`, but
+cannot multiplex parts of messages. Each transport message must be fully
+dispatched before the next can be sent. Proper application-layer sizing/chunking
+of messages is recommended to ensure fairness of delivery across multiple
+senders. A Transport message cannot be larger than 30% of heap
+(`org.elasticsearch.transport.TcpTransport#THIRTY_PER_HEAP_SIZE`) or 2GB (due to
+`org.elasticsearch.transport.Header#networkMessageSize` being an `int`).
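Editorial sketch of how the two size limits combine. `MessageSizeLimit` and `maxMessageBytes` are hypothetical names, and the integer arithmetic is an assumption made for illustration; the actual check in `TcpTransport` may be computed differently.

```java
// Hypothetical illustration; the class and method names are assumptions, not TcpTransport code.
public class MessageSizeLimit {
    // Upper bound on a single transport message: 30% of heap, capped at what an int can hold.
    public static long maxMessageBytes(long heapBytes) {
        long thirtyPercentOfHeap = heapBytes / 10 * 3; // integer math avoids floating-point rounding
        return Math.min(thirtyPercentOfHeap, Integer.MAX_VALUE);
    }
}
```

On small heaps the 30% rule is the binding limit. Once 30% of the heap exceeds `Integer.MAX_VALUE` bytes, the `int` length field becomes the cap.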
+
+The `TransportMessage` family tree includes various types (node-to-node,
+broadcast, master node acknowledged) to ensure correct dispatch and response
+handling, for example when a message must be accepted on all nodes.
+
+## Other networking stacks
+
+Snapshotting to remote repositories involves different networking clients
+and SDKs. For example, the AWS SDK comes with an Apache or Netty HTTP client,
+Azure uses the Netty-based Project Reactor, and GCP uses the default Java HTTP
+client. Underlying clients may be reused between repositories, with varying
+levels of control over networking settings.
+
+Other features also use HTTP clients, such as SAML/JWT metadata reloading, the
+Watcher HTTP action, reindex, and ML-related features such as inference.
+
+## Sync/Async IO and threading
+
+ES handles a mix of I/O operations (disk, HTTP server,
+Transport client/server, repositories), resulting in a combination of
+synchronous and asynchronous styles. Asynchronous IO uses a small set of
+threads that run small tasks, minimizing context switches. Synchronous IO
+uses many threads and relies on the OS scheduler. ES typically runs with 100+
+threads, where async and sync threads compete for resources.
+
+## Netty
+
+Netty is a networking framework/toolkit used extensively for HTTP and Transport
+networks, providing foundational building blocks for networking applications.
+
+### Event-Loop (Transport-Thread)
+
+Netty is an async IO framework that runs with a few threads. An event-loop is
+a thread that processes events for one or many `Channel`s (TCP connections).
+Every `Channel` has exactly one, unchanging event-loop, eliminating the need to
+synchronize events within that `Channel`. A single, CPU-bound
+`Transport ThreadPool` (e.g., 4 threads for 4 cores) serves all HTTP and
+Transport servers and clients, handling potentially hundreds or thousands of
+connections.
+
+Because each event-loop thread serves many connections, it's critical not to
+block these threads for a long time. Fork any blocking operation or heavy
+computation to another thread pool. Forking, however, comes with overhead. Do
+not fork simple requests that can be served from memory and complete within
+milliseconds without heavy computation.
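Editorial sketch of the fork-or-run-inline decision using only the JDK. `ForkExample`, `handle`, and the `expensive` flag are hypothetical; real code would use the Elasticsearch thread pools and `ActionListener`s rather than `CompletableFuture`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical illustration; not Elasticsearch or Netty code.
public class ForkExample {
    // Runs cheap work inline on the calling (event-loop) thread and forks expensive work
    // to a separate worker pool so the caller is never blocked by it.
    public static CompletableFuture<Long> handle(int n, boolean expensive, ExecutorService workers) {
        if (!expensive) {
            return CompletableFuture.completedFuture(sum(n)); // no forking overhead
        }
        return CompletableFuture.supplyAsync(() -> sum(n), workers);
    }

    // Stand-in for the actual request work.
    private static long sum(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }
}
```

Both paths return a future, so the caller handles the result the same way regardless of where the work ran.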
+
+Transport threads are monitored by `ThreadWatchdog`. A warning log appears if a
+single task runs longer than 5 seconds. Slowness can be caused by blocking, GC
+pauses, or CPU starvation from other thread pools.
+
+### ByteBuf - byte buffers and reference counting
+
+Netty's controlled memory allocation provides a performance edge by managing and
+reusing byte buffer pools (e.g., pools of 1MiB byte chunks sliced into 16KiB
+pages). Some pages might not be in use while taking up heap space and show up in
+the heap dump.
+
+Netty reads socket bytes into direct buffers, and ES copies them into pooled
+byte-buffers (`CopyBytesSocketChannel`). The application is responsible for
+retaining (increasing ref-count) and releasing (decreasing ref-count) for
+pooled buffers.
+
+Reference counting introduces two primary problems:
+
+1. Use after release (free): Accessing a buffer after it has been explicitly
+   released.
+2. Never release (leak): Failing to release a buffer, leading to memory leaks.
+
+The compiler does not help detect these issues. They require careful testing
+using Netty's `ResourceLeakDetector` at the `PARANOID` level, which is enabled
+by default in all tests.
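Editorial sketch of reference counting and the two failure modes listed above. `CountedBuffer` is a hypothetical stand-in written with plain JDK classes; it is not Netty's `ByteBuf` and it omits pooling entirely.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of reference counting; not Netty's ByteBuf.
public class CountedBuffer {
    private final AtomicInteger refCount = new AtomicInteger(1); // creator holds the first reference
    private final byte[] bytes;

    public CountedBuffer(int size) {
        this.bytes = new byte[size];
    }

    // Takes an additional reference; only a live buffer (count > 0) may be retained.
    public CountedBuffer retain() {
        refCount.updateAndGet(c -> {
            if (c <= 0) throw new IllegalStateException("retain after release");
            return c + 1;
        });
        return this;
    }

    // Drops one reference; returns true when the last reference is gone.
    public boolean release() {
        int remaining = refCount.updateAndGet(c -> {
            if (c <= 0) throw new IllegalStateException("released too many times");
            return c - 1;
        });
        return remaining == 0;
    }

    // Fails loudly on use after release instead of reading recycled memory.
    public byte get(int index) {
        if (refCount.get() <= 0) {
            throw new IllegalStateException("use after release");
        }
        return bytes[index];
    }

    public int refCount() {
        return refCount.get();
    }
}
```

A leak corresponds to a `CountedBuffer` whose count never reaches zero. Nothing in this sketch detects that case, which is why a leak detector is needed in tests.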
+
+### Async methods return futures
+
+Every asynchronous operation in Netty returns a future. It is easy to forget
+to check the result, because the following call returns normally even if the
+write later fails:
+
+```java
+ctx.write(message);
+```
+
+Check the result of an async operation:
+
+```java
+ctx.write(message).addListener(f -> { if (f.isSuccess() == false) { /* handle f.cause() */ } });
+```
+
 ### ThreadPool
 
 (We have many thread pools, what and why)
@@ -28,8 +249,6 @@ See the [Javadocs for `ActionListener`](https://github.com/elastic/elasticsearch
 
 ### Performance
 
-### Netty
-
 (long running actions should be forked off of the Netty thread. Keep short operations to avoid forking costs)
 
 ### Work Queues
 
@@ -11,7 +11,7 @@ This is useful when you want to maximize diversity by preventing similar documen
 Practical use cases include:
 - **eCommerce applications**: Show users a wider variety of products rather than multiple similar items
 - **Retrieval augmented generation (RAG) workflows**: Provide more diverse context to the LLM, reducing redundancy in the prompt
--
+
 The retriever uses [MMR (Maximum Marginal Relevance)](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) diversification to discard results that are too similar to each other.
 Similarity is determined based on the `field` parameter and the optionally provided `query_vector`.
 :::{note}
 
@@ -13,28 +13,20 @@ Refer to [the high-level landing page](../../../../esql/esql-lookup-join.md) for
 
 ```esql
 FROM <source_index>
-| LOOKUP JOIN <lookup_index> ON <field_name>
-```
-
-```esql
-FROM <source_index>
-| LOOKUP JOIN <lookup_index> ON <field_name1>, <field_name2>, <field_name3>
-```
-```esql
-FROM <source_index>
-| LOOKUP JOIN <lookup_index> ON <left_field1> >= <lookup_field1> AND <left_field2> == <lookup_field2>
+| LOOKUP JOIN <lookup_index> ON <join_condition>
 ```
 
 **Parameters**
 
 `<lookup_index>`
 :   The name of the lookup index. This must be a specific index name - wildcards, aliases, and remote cluster references are not supported. Indices used for lookups must be configured with the [`lookup` index mode](/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting).
 
-`<field_name>` or `<field_name1>, <field_name2>, <field_name3>` or `<left_field1> >= <lookup_field1> AND <left_field2> == <lookup_field2>`
-:   The join condition. Can be one of the following:
+`<join_condition>`
+:   Can be one of the following:
    * A single field name
-   * A comma-separated list of field names {applies_to}`stack: ga 9.2`
-   * An expression with one or more join conditions linked by `AND`. Each condition compares a field from the left index with a field from the lookup index using [binary operators](/reference/query-languages/esql/functions-operators/operators.md#esql-binary-operators) (`==`, `>=`, `<=`, `>`, `<`, `!=`). Each field name in the join condition must exist in only one of the indexes. Use RENAME to resolve naming conflicts. {applies_to}`stack: preview 9.2` {applies_to}`serverless: preview`
+   * A comma-separated list of field names, for example `<field1>, <field2>, <field3>`  {applies_to}`stack: ga 9.2`
+   * An expression with one or more predicates linked by `AND`, for example `<left_field1> >= <lookup_field1> AND <left_field2> == <lookup_field2>`. Each predicate compares a field from the left index with a field from the lookup index using [binary operators](/reference/query-languages/esql/functions-operators/operators.md#esql-binary-operators) (`==`, `>=`, `<=`, `>`, `<`, `!=`). Each field name in the join condition must exist in only one of the indexes. Use RENAME to resolve naming conflicts. {applies_to}`stack: preview 9.2` {applies_to}`serverless: preview`
+   * An expression that includes [full text functions](/reference/query-languages/esql/functions-operators/search-functions.md) and other Lucene-pushable functions, for example `MATCH(<lookup_field>, "search term") AND <left_field> == <lookup_field>`. These functions can be combined with binary operators and logical operators (`AND`, `OR`, `NOT`) to create complex join conditions. At least one condition that relates the lookup index fields to the left side of the join fields is still required. {applies_to}`stack: preview 9.3` {applies_to}`serverless: preview`
 :   If using join on a single field or a field list, the fields used must exist in both your current query results and in the lookup index. If the fields contains multi-valued entries, those entries will not match anything (the added fields will contain `null` for those rows).