@jonathan-buttner (Contributor) commented Oct 27, 2025

WIP

This PR only shows the changes I made in order to test issue #137134.

To make the reproduction faster, I temporarily changed the code to allow shorter times:

PUT /_cluster/settings
{
  "persistent": {
    "xpack.ml.trained_models.adaptive_allocations.scale_to_zero_time": "10s",
    "xpack.ml.trained_models.adaptive_allocations.scale_up_cooldown_time": "10s",
    "logger.org.elasticsearch.xpack.ml.inference.assignment": "DEBUG"
  }
}
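
Since these values are only meant for the reproduction, it may be worth resetting them afterwards. Here is a minimal Python sketch of the reset body, assuming the usual Elasticsearch behavior that assigning `null` to a persistent setting removes the override. `build_reset_body` is a hypothetical helper name, and the resulting JSON would be sent with `PUT /_cluster/settings`:

```python
import json

# Setting names copied from the request above.
REPRO_SETTINGS = [
    "xpack.ml.trained_models.adaptive_allocations.scale_to_zero_time",
    "xpack.ml.trained_models.adaptive_allocations.scale_up_cooldown_time",
    "logger.org.elasticsearch.xpack.ml.inference.assignment",
]


def build_reset_body(setting_names):
    """Return a cluster-settings body that resets each named persistent setting."""
    # None serializes to JSON null, which asks Elasticsearch to drop the override.
    return {"persistent": {name: None for name in setting_names}}


if __name__ == "__main__":
    print(json.dumps(build_reset_body(REPRO_SETTINGS), indent=2))
```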

Then we can follow the steps in the issue to reproduce, which are:

  1. Create a deployment by creating an inference endpoint
PUT _inference/rerank/mytest-old
{
    "service": "elasticsearch",
    "service_settings": {
        "num_threads": 1,
        "model_id": ".rerank-v1",
        "adaptive_allocations": {
            "enabled": true,
            "min_number_of_allocations": 0,
            "max_number_of_allocations": 2
        }
    }
}
  2. Wait for mytest-old to scale to zero (~10 seconds)
GET _ml/trained_models/_stats
  3. Create a new deployment via another inference endpoint. mytest-old should still exist, but it will now have an allocation, which is not intended.
PUT _inference/rerank/mytest-new3
{
    "service": "elasticsearch",
    "service_settings": {
        "num_threads": 1,
        "model_id": ".rerank-v1",
        "adaptive_allocations": {
            "enabled": true,
            "min_number_of_allocations": 0,
            "max_number_of_allocations": 2
        }
    }
}
GET _ml/trained_models/_stats
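
The check in the last step can be automated. Below is a minimal Python sketch that reads a `GET _ml/trained_models/_stats` response and reports which deployments hold allocations when they are expected to be at zero. It assumes the response has a `trained_model_stats` array whose entries carry `deployment_stats.deployment_id` and `deployment_stats.number_of_allocations`, and that the deployment ID matches the inference endpoint ID. Both assumptions should be verified against the real output. The function names and the sample payload are hypothetical.

```python
def allocations_by_deployment(stats):
    """Map deployment ID -> number_of_allocations from a _stats response dict."""
    result = {}
    for entry in stats.get("trained_model_stats", []):
        deployment = entry.get("deployment_stats")
        if not deployment:
            # Entries without a deployment_stats section are skipped.
            continue
        result[deployment["deployment_id"]] = deployment.get("number_of_allocations", 0)
    return result


def unexpected_allocations(stats, expected_zero):
    """Return the deployments in expected_zero that still hold allocations."""
    counts = allocations_by_deployment(stats)
    return {d: n for d, n in counts.items() if d in expected_zero and n > 0}


# Hand-written payload illustrating the symptom described above, not real output.
SAMPLE_STATS = {
    "trained_model_stats": [
        {"model_id": ".rerank-v1",
         "deployment_stats": {"deployment_id": "mytest-old", "number_of_allocations": 1}},
        {"model_id": ".rerank-v1",
         "deployment_stats": {"deployment_id": "mytest-new3", "number_of_allocations": 1}},
    ]
}

if __name__ == "__main__":
    print(unexpected_allocations(SAMPLE_STATS, {"mytest-old"}))
```

An empty result after step 3 would mean mytest-old stayed at zero allocations, which is the intended behavior.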

@elasticsearchmachine (Collaborator)

Hi @jonathan-buttner, I've created a changelog YAML for you.


Labels: >bug, :ml, Team:ML, v9.3.0
