@Mikep86 (Contributor) commented Mar 24, 2025

We have observed many OOMs due to the memory required to inject chunked inference results for semantic_text fields. This PR uses coordinating indexing pressure to account for this memory usage. When indexing pressure memory usage exceeds the threshold set by indexing_pressure.memory.limit, chunked inference result injection will be suspended to prevent OOMs.
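
The accounting described above can be pictured with a minimal sketch. This is not the code from this PR and not Elasticsearch's `IndexingPressure` API: the names `MemoryBudget`, `Reservation`, and `PressureRejectedException` are hypothetical, and the fixed byte limit only stands in for the threshold derived from `indexing_pressure.memory.limit`. The idea shown is to reserve the bytes a chunked inference result needs before injecting it, reject the reservation when the limit would be exceeded, and release the bytes once the request is done.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: a simplified stand-in for coordinating indexing pressure accounting. */
public class ChunkInjectionPressureSketch {

    /** Hypothetical exception raised when a reservation would exceed the limit. */
    static class PressureRejectedException extends RuntimeException {
        PressureRejectedException(String message) {
            super(message);
        }
    }

    /** Hypothetical handle; closing it gives the reserved bytes back. */
    interface Reservation extends AutoCloseable {
        @Override
        void close();
    }

    /** Tracks bytes currently reserved against a fixed limit (assumed to come from configuration). */
    static class MemoryBudget {
        private final long limitBytes;
        private final AtomicLong usedBytes = new AtomicLong();

        MemoryBudget(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        /** Reserves bytes for a chunked inference result, or throws if the limit would be exceeded. */
        Reservation reserve(long bytes) {
            long updated = usedBytes.addAndGet(bytes);
            if (updated > limitBytes) {
                // Roll back so a rejected request does not hold memory it never used.
                usedBytes.addAndGet(-bytes);
                throw new PressureRejectedException(
                    "rejected: " + updated + " bytes would exceed limit of " + limitBytes + " bytes");
            }
            AtomicBoolean released = new AtomicBoolean();
            // Releasing is idempotent: only the first close() subtracts the bytes.
            return () -> {
                if (released.compareAndSet(false, true)) {
                    usedBytes.addAndGet(-bytes);
                }
            };
        }

        long usedBytes() {
            return usedBytes.get();
        }
    }

    public static void main(String[] args) {
        MemoryBudget budget = new MemoryBudget(100);

        // First result fits, so injection can proceed while the reservation is held.
        try (Reservation first = budget.reserve(60)) {
            System.out.println("used after first reservation: " + budget.usedBytes());
            try {
                // Second result would push usage past the limit, so injection is suspended.
                budget.reserve(50);
            } catch (PressureRejectedException e) {
                System.out.println(e.getMessage());
            }
        }
        System.out.println("used after release: " + budget.usedBytes());
    }
}
```

The add-then-check-then-roll-back pattern keeps the sketch lock-free, at the cost of the counter briefly overshooting the limit under contention.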

@Mikep86 added the labels >non-issue, :ml (Machine learning), :SearchOrg/Relevance (Label for the Search (solution/org) Relevance team), and v8.19.0 on Mar 24, 2025
@Mikep86 requested a review from kderusso on March 24, 2025 16:59
@Mikep86 (Contributor, Author) commented Apr 8, 2025

@elasticmachine update branch

@kderusso (Member) left a comment

LGTM, thanks for iterating

@davidkyle (Member) left a comment

LGTM

Thanks @Mikep86

@elasticsearchmachine (Collaborator) commented

Hi @Mikep86, I've created a changelog YAML for you.

@jimczi added the :Distributed Indexing/Engine label (Anything around managing Lucene and the Translog in an open shard.) on Apr 11, 2025

@elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@jimczi (Contributor) left a comment

Thanks, @Mikep86.
Let’s update the PR title and summary to reflect the use of indexing pressure.
It would be great if someone from @elastic/es-distributed-indexing could review this section.

@Mikep86 changed the title from "Add Semantic Text Chunking OOM Circuit Breaker" to "Semantic Text Chunking Indexing Pressure" on Apr 11, 2025
@Mikep86 merged commit 85713f7 into elastic:main on Apr 14, 2025
17 checks passed
@elasticsearchmachine (Collaborator) commented

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 125517

@Mikep86 (Contributor, Author) commented Apr 28, 2025

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

Mikep86 added a commit to Mikep86/elasticsearch that referenced this pull request Apr 28, 2025
We have observed many OOMs due to the memory required to inject chunked inference results for semantic_text fields. This PR uses coordinating indexing pressure to account for this memory usage. When indexing pressure memory usage exceeds the threshold set by indexing_pressure.memory.limit, chunked inference result injection will be suspended to prevent OOMs.

(cherry picked from commit 85713f7)

# Conflicts:
#	server/src/main/java/org/elasticsearch/node/NodeConstruction.java
#	server/src/main/java/org/elasticsearch/node/PluginServiceInstances.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilterTests.java
elasticsearchmachine pushed a commit that referenced this pull request Apr 28, 2025
* Semantic Text Chunking Indexing Pressure (#125517)

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <[email protected]>

Labels

auto-backport (Automatically create backport pull requests when merged)
:Distributed Indexing/Engine (Anything around managing Lucene and the Translog in an open shard.)
>enhancement
:ml (Machine learning)
:SearchOrg/Relevance (Label for the Search (solution/org) Relevance team)
v8.19.0
v9.1.0

9 participants