- 
                Notifications
    
You must be signed in to change notification settings  - Fork 25.6k
 
[ML] Fix test failure updating model deployment with stale cluster state. #128667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| 
           Pinging @elastic/ml-core (Team:ML)  | 
    
| 
               | 
          ||
| private static final Logger logger = LogManager.getLogger(TrainedModelAssignmentClusterService.class); | ||
| 
               | 
          ||
| private static final TransportVersion RENAME_ALLOCATION_TO_ASSIGNMENT_TRANSPORT_VERSION = TransportVersions.V_8_3_0; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These version checks are redundant in 9.0 and 9.1. The 8.x backports will need to keep them however.
| ActionListener<ClusterState> updatedStateListener = ActionListener.wrap( | ||
| updatedState -> submitUnbatchedTask("update model deployment", new ClusterStateUpdateTask() { | ||
| ActionListener<TrainedModelAssignmentMetadata.Builder> updatedAssignmentListener = ActionListener.wrap( | ||
| updatedAssignment -> submitUnbatchedTask("update model deployment", new ClusterStateUpdateTask() { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the fix, here the new assignment state is passed rather than the updated cluster state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the TrainedModelAssignmentMetadata changed in the meantime?
That's a similar bug to the existing one, right? But much smaller, because the TrainedModelAssignmentMetadata changes less often than the ClusterState.
Should we protect against that? For example, only replace the TrainedModelAssignmentMetadata in the ClusterState if it's identical to the one we started the update computation with? And if it has changed, try this process again. More or less this paradigm:
https://github.com/elastic/elasticsearch/blob/main/test/framework/src/main/java/org/elasticsearch/common/util/MockBigArrays.java#L728-L742.
Maybe that's overkill and too complicated though. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're deciding not to fix this, let's leave a comment about this small issue with this implementation and call it a day
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the areClusterStatesCompatibleForRebalance function it also checks that the TrainedModelAssignmentMetadata has not changed.
Line 610 in ffd02c5
| && TrainedModelAssignmentMetadata.fromState(source).equals(TrainedModelAssignmentMetadata.fromState(target)); | 
By comparing the starting state with the latest state before applying the updated TrainedModelAssignmentMetadata (which is a lightweight operation and can be done in the ClusterStateUpdateTask) the code is effectively performing a "compare and swap" paradigm as linked above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, cool! I missed that part of the code. Guess this all works then as is
| final boolean isResetMode = MlMetadata.getMlMetadata(event.state()).isResetMode(); | ||
| TrainedModelAssignmentMetadata modelAssignmentMetadata = TrainedModelAssignmentMetadata.fromState(event.state()); | ||
| final String currentNode = event.state().nodes().getLocalNodeId(); | ||
| final boolean isNewAllocationSupported = event.state() | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another version change that is irrelevant for 9
# Conflicts: # muted-tests.yml # x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/assignment/TrainedModelAssignmentClusterService.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left one comment. Certainly a big improvement!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
When updating a model deployment (for example changing the number of allocations) calculating the new deployment is an expensive operation so it is done in a separate thread outside of the
ClusterStateUpdateTask. However, if there was another clusterstate update while computing the new deployment then submitting the cluster state fails because the version of the cluster state used to calculate the update is now lower than the version of the latest state.The fix is quite easy, compute the model deployment update outside of the ClusterStateUpdateTask then merge it with the latest state when executing the task. The code already has a check that the deployment update is compatible with the new state (
areClusterStatesCompatibleForRebalance(...)) making it safe to merge the new state.Closes #121165