Skip to content

Conversation

@davidkyle
Copy link
Member

Changes TransportInternalInferModelAction to queue inference requests when an model deployment is scaling up from 0 allocations. Incoming inference requests are stored in a queue then send to the model once it is deployed on a node.

Adds a setting to control the adaptive allocations scale to zero period for the purpose of running tests.

@davidkyle davidkyle added >enhancement :ml Machine learning auto-backport Automatically create backport pull requests when merged cloud-deploy Publish cloud docker image for Cloud-First-Testing v8.16.0 v9.0.0 labels Oct 14, 2024
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Oct 14, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Copy link
Collaborator

Hi @davidkyle, I've created a changelog YAML for you.

Copy link
Contributor

@jan-elastic jan-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. Some some minor comments.

Please have another look at the logging. It looks like there's some leftover debug logging (logged at info).

@davidkyle davidkyle requested review from a team as code owners October 15, 2024 10:27
@davidkyle davidkyle force-pushed the adaptive-allocations-from-0 branch from 5f9dddc to a959890 Compare October 15, 2024 10:31
@davidkyle davidkyle removed request for a team October 15, 2024 10:31
CHUNKING_SETTINGS_ENABLED("es.inference_chunking_settings_feature_flag_enabled=true", Version.fromString("8.16.0"), null),
INFERENCE_DEFAULT_ELSER("es.inference_default_elser_feature_flag_enabled=true", Version.fromString("8.16.0"), null);
INFERENCE_DEFAULT_ELSER("es.inference_default_elser_feature_flag_enabled=true", Version.fromString("8.16.0"), null),
ML_SCALE_FROM_ZERO("es.ml_scale_from_zero_feature_flag_enabled=true", Version.fromString("8.16.0"), null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefix with INFERENCE instead of ML.

Very subtle naming btw: we used to have scale_to_zero_feature_flag, and now scale_from_zero_feature_flag :)

Copy link
Contributor

@jan-elastic jan-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@davidkyle davidkyle removed the auto-backport Automatically create backport pull requests when merged label Oct 15, 2024
@davidkyle davidkyle enabled auto-merge (squash) October 15, 2024 10:46
@davidkyle davidkyle merged commit bd6eeca into main Oct 15, 2024
18 checks passed
@davidkyle davidkyle deleted the adaptive-allocations-from-0 branch October 15, 2024 11:38
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cloud-deploy Publish cloud docker image for Cloud-First-Testing >enhancement :ml Machine learning Team:ML Meta label for the ML team v8.16.0 v9.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants