
Conversation

prwhelan
Member

@prwhelan prwhelan commented Jul 22, 2025

Flag updates from Inference so Serverless can detect them.
Swap tests to set adaptive allocations rather than num_allocations so they pass in serverless.

@prwhelan prwhelan added the >test, :ml, Team:ML, and v9.2.0 labels Jul 22, 2025
@elasticsearchmachine elasticsearchmachine added the serverless-linked label Jul 22, 2025
@prwhelan prwhelan marked this pull request as ready for review July 22, 2025 18:55
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

Contributor

@jan-elastic jan-elastic left a comment


LGTM

@prwhelan prwhelan changed the title [ML] Use adaptive allocations in test [ML] Flag updates from Inference Jul 23, 2025
@prwhelan prwhelan requested a review from jan-elastic July 23, 2025 20:54

public void setFromInference(boolean fromInference) {
    this.fromInference = fromInference;
    this.isInternal = fromInference;
}
Contributor


It looks confusing that setFromInference also sets isInternal.

import org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper;

import java.io.IOException;
import java.util.Objects;
Contributor


I'm missing a bit of context: why do we need to distinguish between these cases?

Contributor


Is there a corresponding Serverless PR?

Member Author


Yeah, let me ping you with the internal documentation

Member Author


> I'm missing a bit of context: why do we need to distinguish between these cases?

We need to allow updates to num_allocations in serverless that originate from the AdaptiveAllocationsScalerService (ADAPTIVE_ALLOCATIONS), but we want to disallow updates from users (API and INFERENCE). The only alternative I thought of was refactoring AdaptiveAllocationsScalerService to update directly rather than through the API, but that felt more intrusive.
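
For illustration, a rough sketch of the kind of serverless-side gate this enables (the validator and accessor names here are hypothetical; the real check lives in the internal Serverless codebase):

    // Hypothetical serverless-side validation enabled by the new Source flag:
    // autoscaler-driven updates to num_allocations pass, while user-originated
    // updates (REST API or Inference API) are rejected.
    void validateNumAllocationsUpdate(UpdateTrainedModelDeploymentAction.Request request) {
        if (request.getNumberOfAllocations() != null
            && request.getSource() != UpdateTrainedModelDeploymentAction.Source.ADAPTIVE_ALLOCATIONS) {
            throw new ElasticsearchStatusException(
                "[number_of_allocations] cannot be set in serverless",
                RestStatus.BAD_REQUEST
            );
        }
    }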

// we changed over from a boolean to an enum
// when it was a boolean, true came from adaptive allocations and false came from the rest api
// treat "inference" as if it came from the api
out.writeBoolean(isInternal());
Contributor


Do we need to determine if source == Source.ADAPTIVE_ALLOCATIONS here, since this will return true for Source.INFERENCE as well?

Member Author


Previously, we set the boolean to true if the source was either the inference update API or the adaptive allocations autoscaler. out.writeBoolean(isInternal()) preserves this logic (I think). It means the stream reader will treat an inference API call as an adaptive allocations call, but that only affects serverless, which is only a mixed cluster during a rolling update.
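
For what it's worth, a sketch of the wire mapping being described (the version-check call shape is assumed from the usual pattern, not copied from this PR):

    // Serializing: new nodes get the full enum; old nodes get the legacy
    // boolean, where both INFERENCE and ADAPTIVE_ALLOCATIONS collapse to true.
    if (out.getTransportVersion().supports(INFERENCE_UPDATE_ML)) {
        out.writeEnum(source);
    } else {
        out.writeBoolean(isInternal());
    }

    // Deserializing from an old node: true can only be read back as
    // ADAPTIVE_ALLOCATIONS, which is why an inference call looks like an
    // adaptive allocations call in a mixed cluster.
    source = in.getTransportVersion().supports(INFERENCE_UPDATE_ML)
        ? in.readEnum(Source.class)
        : (in.readBoolean() ? Source.ADAPTIVE_ALLOCATIONS : Source.API);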


public boolean isInternal() {
-    return isInternal;
+    return source == Source.INFERENCE || source == Source.ADAPTIVE_ALLOCATIONS;
}
Contributor


Can you confirm that we do want Source.INFERENCE here for all the usages of isInternal() below?

Member Author


Confirmed! Yeah, the inference update code previously set isInternal to true (back when the boolean existed).
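
For context, the before/after at the call sites looks roughly like this (setter names assumed for illustration):

    // Before: the inference update code flipped the bare boolean.
    request.setIsInternal(true);

    // After: callers declare their origin explicitly.
    request.setSource(UpdateTrainedModelDeploymentAction.Source.INFERENCE);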


public static final ParseField TIMEOUT = new ParseField("timeout");

private static final TransportVersion INFERENCE_UPDATE_ML = TransportVersion.fromName("inference_update_ml");
Contributor


I think this name INFERENCE_UPDATE_ML isn't particularly clear.

What about something like UPDATE_TRAINED_MODEL_DEPLOYMENT_REQUEST_SOURCE or so? 🤷

import static org.elasticsearch.xpack.core.ml.action.StartTrainedModelDeploymentAction.Request.NUMBER_OF_ALLOCATIONS;

public class UpdateTrainedModelDeploymentAction extends ActionType<CreateTrainedModelAssignmentAction.Response> {
public enum Source {
Contributor


Isn't it cleaner to move this into Request (so UpdateTrainedModelDeploymentAction.Request.Source)?

Contributor

@jan-elastic jan-elastic left a comment


LGTM, just nitpicking

public enum Source {
    API,
    ADAPTIVE_ALLOCATIONS,
    INFERENCE
Member


nit: INFERENCE is pretty vague; from the usage it appears to mean the request comes from the Inference API.

Suggested change:
-    INFERENCE
+    INFERENCE_API

@prwhelan prwhelan merged commit 4290a8e into elastic:main Oct 13, 2025
34 checks passed
