Implement chunked fetch streaming with circuit breaker integration #139124
drempapis wants to merge 428 commits into elastic:main
@elasticmachine run elasticsearch-ci/part-2
DaveCTurner left a comment
Couple of thoughts about blocking of threads.
…rch into chunked_fetch_phase
@elasticmachine run elasticsearch-ci/part-1
@elasticmachine run elasticsearch-ci/part-2
Buildkite benchmark this with pmc-3n-4g please
Buildkite benchmark this with wikipedia please
Buildkite benchmark this with geoshape please
Buildkite benchmark this with geoshape please
Buildkite benchmark this with sql-3n-4g please
@elasticmachine run elasticsearch-ci/part-1
Buildkite benchmark this with sql-3n-4g please
Buildkite benchmark this with sql-3n-4g please
Buildkite benchmark this with noaa-3n-2g please
Buildkite benchmark this with noaa-3n-2g please
Buildkite benchmark this with esql please
Buildkite benchmark this with geoshape please
⏳ Build in-progress
This build attempts two geoshape benchmarks to evaluate the performance impact of this PR. To estimate benchmark completion time, inspect previous nightly runs.
Currently, when Elasticsearch executes a search query that returns a large number of documents, the fetch phase retrieves the document content from each shard and holds it in memory until the shard responds, which can put significant memory pressure on data nodes.
This PR implements chunked streaming for the fetch phase to reduce memory pressure when handling large result sets. Instead of accumulating all search hits in memory on the data node before sending them to the coordinator, hits are streamed in configurable chunks (default: 256 KB) as they are produced. Memory usage is bounded by circuit breakers on both the data and coordinator nodes.
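The chunking idea described above can be illustrated with a minimal, single-threaded sketch. It is not the code from this PR: `ChunkedFetchSketch`, `SimpleBreaker`, and `streamInChunks` are hypothetical names, the breaker is a plain byte counter rather than the Elasticsearch circuit breaker, and hits are modeled as strings. The real implementation is asynchronous and would presumably release bytes only once a chunk has been acknowledged.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ChunkedFetchSketch {

    /** Hypothetical stand-in for a circuit breaker: counts reserved bytes against a fixed limit. */
    static final class SimpleBreaker {
        private final long limitBytes;
        private long usedBytes;

        SimpleBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        /** Reserves bytes, or throws if the reservation would exceed the limit. */
        void reserve(long bytes) {
            if (usedBytes + bytes > limitBytes) {
                throw new IllegalStateException(
                    "breaker tripped: " + (usedBytes + bytes) + " > " + limitBytes);
            }
            usedBytes += bytes;
        }

        void release(long bytes) {
            usedBytes -= bytes;
        }

        long used() {
            return usedBytes;
        }
    }

    /**
     * Streams hits to the sink in chunks of roughly chunkSizeBytes instead of
     * accumulating the full result set. Returns the number of chunks sent.
     */
    static int streamInChunks(List<String> hits, int chunkSizeBytes,
                              SimpleBreaker breaker, Consumer<List<String>> sink) {
        List<String> chunk = new ArrayList<>();
        long chunkBytes = 0;
        int chunksSent = 0;
        try {
            for (String hit : hits) {
                long size = hit.getBytes(StandardCharsets.UTF_8).length;
                breaker.reserve(size); // account for the hit before buffering it
                chunk.add(hit);
                chunkBytes += size;
                if (chunkBytes >= chunkSizeBytes) {
                    sink.accept(chunk); // hand the full chunk to the consumer
                    breaker.release(chunkBytes);
                    chunk = new ArrayList<>();
                    chunkBytes = 0;
                    chunksSent++;
                }
            }
            if (!chunk.isEmpty()) {
                sink.accept(chunk); // flush the final, possibly partial chunk
                chunksSent++;
            }
            return chunksSent;
        } finally {
            // Release whatever is still reserved, including on failure.
            breaker.release(chunkBytes);
        }
    }
}
```

In this sketch the reserved memory never exceeds about one chunk plus one hit, regardless of how many hits the query returns, which is the property the description attributes to chunked streaming.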
How OOM is Prevented on the Data Node
How OOM is Prevented on the Coordinator Node
Flow Diagram
The implementation follows the paradigm of TransportRepositoryVerifyIntegrityCoordinationAction, but it streams only between the coordinator and the data nodes.