[ML] Wait for allocation on scale up #114719
Conversation
Pinging @elastic/ml-core (Team:ML)

Hi @davidkyle, I've created a changelog YAML for you.
Resolved review threads on:
...RestTest/java/org/elasticsearch/xpack/ml/integration/AdaptiveAllocationsScaleFromZeroIT.java
...s/src/javaRestTest/java/org/elasticsearch/xpack/ml/integration/PyTorchModelRestTestCase.java
...in/ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportInternalInferModelAction.java
...plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/InferenceWaitForAllocation.java
...c/main/java/org/elasticsearch/xpack/core/ml/inference/assignment/TrainedModelAssignment.java
jan-elastic left a comment:

Generally LGTM. Some minor comments.

Please have another look at the logging. It looks like there's some leftover debug logging (logged at info).
Force-pushed from 5f9dddc to a959890
  CHUNKING_SETTINGS_ENABLED("es.inference_chunking_settings_feature_flag_enabled=true", Version.fromString("8.16.0"), null),
- INFERENCE_DEFAULT_ELSER("es.inference_default_elser_feature_flag_enabled=true", Version.fromString("8.16.0"), null);
+ INFERENCE_DEFAULT_ELSER("es.inference_default_elser_feature_flag_enabled=true", Version.fromString("8.16.0"), null),
+ ML_SCALE_FROM_ZERO("es.ml_scale_from_zero_feature_flag_enabled=true", Version.fromString("8.16.0"), null);
I'd prefix with INFERENCE instead of ML.
Very subtle naming btw: we used to have scale_to_zero_feature_flag, and now scale_from_zero_feature_flag :)
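For context, flags of this kind are backed by a JVM system property, as the string in the diff suggests. A minimal, hypothetical sketch of how such a property-backed flag could gate the new behavior; the class and method names here are illustrative, not the PR's actual code:

// Hypothetical sketch: a feature flag backed by the JVM system property
// shown in the diff above. Class and method names are illustrative only.
final class ScaleFromZeroFeatureFlag {

    // Read es.ml_scale_from_zero_feature_flag_enabled once, at class load.
    private static final boolean ENABLED = Boolean.parseBoolean(
        System.getProperty("es.ml_scale_from_zero_feature_flag_enabled", "false"));

    private ScaleFromZeroFeatureFlag() {}

    static boolean isEnabled() {
        return ENABLED;
    }
}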
jan-elastic left a comment:

LGTM
Changes TransportInternalInferModelAction to queue inference requests when a model deployment is scaling up from 0 allocations. Incoming inference requests are stored in a queue and then sent to the model once it is deployed on a node.

Adds a setting to control the adaptive allocations scale-to-zero period, for the purpose of running tests.
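To illustrate the queue-then-forward idea described above, here is a minimal, hypothetical sketch: requests are buffered while the deployment has zero allocations and drained once an allocation starts. The types and method names (PendingRequest, onAllocationStarted) are invented for illustration and are not the API of the PR's InferenceWaitForAllocation class.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical, simplified sketch of queuing inference requests while a
// deployment scales up from zero allocations. Not the PR's actual code.
class WaitForAllocationSketch {

    // A stored request: the target deployment and the action that forwards
    // the request to a node once the model is allocated there.
    record PendingRequest(String deploymentId, Runnable forwardToNode) {}

    private final Queue<PendingRequest> pending = new ConcurrentLinkedQueue<>();

    // Called when a request arrives and the deployment has 0 allocations.
    void queue(PendingRequest request) {
        pending.add(request);
    }

    // Called when cluster state shows the deployment now has a started
    // allocation on some node: drain and forward the matching requests.
    void onAllocationStarted(String deploymentId) {
        pending.removeIf(request -> {
            if (request.deploymentId().equals(deploymentId)) {
                request.forwardToNode().run();
                return true;
            }
            return false;
        });
    }
}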