valeriy42
Contributor

@valeriy42 valeriy42 commented Sep 1, 2025

This PR ensures that when a node has insufficient memory, the internal IllegalArgumentException thrown by assignModelToNode does not leak to the upper layers of the architecture: the code first checks that the model can be assigned to the node. The canAssign() check is now performed inside assignModelToNode().
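
The check-before-assign idea described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Elasticsearch code: the class `PlanSketch`, its fields, and the methods `assignOrThrow`, `tryAssign` and `freeMemory` are hypothetical names invented for this example, and memory is modelled as a plain byte count per node.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for an assignment plan builder.
// It is NOT the real AssignmentPlan class; all names here are assumptions.
final class PlanSketch {
    private final Map<String, Long> freeMemoryBytes = new HashMap<>();
    private final Map<String, String> assignments = new HashMap<>(); // model id -> node id

    PlanSketch addNode(String nodeId, long memoryBytes) {
        freeMemoryBytes.put(nodeId, memoryBytes);
        return this;
    }

    // True when the node is known and has enough free memory for the model.
    boolean canAssign(String nodeId, long modelBytes) {
        Long free = freeMemoryBytes.get(nodeId);
        return free != null && modelBytes <= free;
    }

    // Strict variant: throws when the model does not fit. Calling this without
    // a prior check is how an internal exception can reach upper layers.
    void assignOrThrow(String modelId, String nodeId, long modelBytes) {
        if (!canAssign(nodeId, modelBytes)) {
            throw new IllegalArgumentException(
                "not enough memory on node [" + nodeId + "] for model [" + modelId + "]");
        }
        freeMemoryBytes.merge(nodeId, -modelBytes, Long::sum); // reserve the memory
        assignments.put(modelId, nodeId);
    }

    // Guarded variant: checks first and reports failure as a return value,
    // so callers never see the IllegalArgumentException.
    boolean tryAssign(String modelId, String nodeId, long modelBytes) {
        if (!canAssign(nodeId, modelBytes)) {
            return false;
        }
        assignOrThrow(modelId, nodeId, modelBytes);
        return true;
    }

    long freeMemory(String nodeId) {
        return freeMemoryBytes.getOrDefault(nodeId, 0L);
    }

    public static void main(String[] args) {
        PlanSketch plan = new PlanSketch().addNode("node-1", 100L);
        System.out.println(plan.tryAssign("model-a", "node-1", 60L)); // true
        System.out.println(plan.tryAssign("model-b", "node-1", 60L)); // false, only 40 bytes left
        System.out.println(plan.freeMemory("node-1"));                // 40
    }
}
```

In this sketch the guarded method leaves the plan unchanged when the model does not fit, which lets a caller record the missing allocation and move on rather than handle an exception.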

@valeriy42 valeriy42 added >bug :ml Machine learning v9.2.0 labels Sep 1, 2025
@elasticsearchmachine
Collaborator

Hi @valeriy42, I've created a changelog YAML for you.

@valeriy42 valeriy42 marked this pull request as ready for review September 3, 2025 13:57
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Sep 3, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes model assignment error handling to prevent internal IllegalArgumentException from leaking to upper layers when there's insufficient memory on a node. The fix adds defensive checks using the canAssign method before attempting to assign models to nodes.

  • Adds canAssign checks before model assignments to prevent memory-related exceptions
  • Changes the visibility of canAssign method from package-private to public for broader access
  • Updates test naming to better reflect the test's purpose of explaining missing allocations

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| AssignmentPlan.java | Changes canAssign method visibility from package-private to public |
| TrainedModelAssignmentRebalancer.java | Adds canAssign check before assignModelToNode call with proper control flow |
| ZoneAwareAssignmentPlanner.java | Adds canAssign check before assignModelToNode call with proper control flow |
| TrainedModelAssignmentRebalancerTests.java | Updates test method name to better reflect its purpose |
| 133916.yaml | Adds changelog entry documenting the bug fix |

Contributor

@jan-elastic jan-elastic left a comment

Generally LGTM; just a small issue

@valeriy42 valeriy42 merged commit e5c91ca into elastic:main Sep 9, 2025
34 checks passed
@valeriy42 valeriy42 deleted the fix/not-enough-memory-exception branch September 9, 2025 11:16
rjernst pushed a commit to rjernst/elasticsearch that referenced this pull request Sep 9, 2025
…eneration (elastic#133916)

Kubik42 pushed a commit to Kubik42/elasticsearch that referenced this pull request Sep 9, 2025
…eneration (elastic#133916)

Labels

>bug :ml Machine learning Team:ML Meta label for the ML team v9.2.0

4 participants