Cannot update min_number_of_allocations for ELSER inference endpoint. #128456

@vladislav-marochkin

Elasticsearch Version

8.18.1

Installed Plugins

No response

Java Version

OpenJDK 24

OS Version

Ubuntu, Linux kernel 5.4.0-208-generic

Problem Description

Expected behavior:
I expect to be able to update the min_number_of_allocations parameter for an existing ELSER inference endpoint with adaptive allocations enabled, using the _update API. The update should apply the new value without error, as long as the request is valid and consistent with the endpoint’s configuration.

Actual behavior:
When attempting to update the min_number_of_allocations parameter via the _update API, the request fails with a parsing error related to num_allocations. The error message indicates a failure to parse [num_allocations] in the update request, even though only min_number_of_allocations and max_number_of_allocations are being set under adaptive_allocations.

If I explicitly include num_allocations in the request, the update instead fails with a validation error stating that [number_of_allocations] cannot be set if adaptive allocations is enabled.

This makes it impossible to update min_number_of_allocations for an existing ELSER inference endpoint with adaptive allocations enabled, even though this should be a supported operation.
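The dead end described above can be summarized as a small decision table. This is only a toy model of the two responses observed in this report, not the server's actual validation code; the function name `observed_update_outcome` is hypothetical, and the first returned string is just the leading part of the real error reason.

```python
def observed_update_outcome(service_settings):
    """Return the error reason observed in this report for a given request shape.

    Models only the two update requests shown in this issue; it is not
    Elasticsearch's implementation.
    """
    adaptive = service_settings.get("adaptive_allocations", {})
    if "num_allocations" not in service_settings:
        # Observed when only adaptive_allocations is sent (prefix of the real reason).
        return "Failed to parse [num_allocations] of update request"
    if adaptive.get("enabled"):
        # Observed when num_allocations is added alongside adaptive allocations.
        return ("Validation Failed: 1: [number_of_allocations] cannot be set "
                "if adaptive allocations is enabled;")
    return None  # request shapes not covered by this report


adaptive = {"enabled": True, "min_number_of_allocations": 0, "max_number_of_allocations": 10}
without_num = observed_update_outcome({"adaptive_allocations": adaptive})
with_num = observed_update_outcome({"num_allocations": 1, "adaptive_allocations": adaptive})
```

Both branches end in a 400, which is why neither form of the request can change min_number_of_allocations.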

We are using ECK (Elastic Cloud on Kubernetes).

Steps to Reproduce

  1. Create an ELSER inference endpoint.
PUT _inference/elser_model_2_search
{
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".elser_model_2_linux-x86_64",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 1,
          "max_number_of_allocations": 10
        }
      },
      "chunking_settings": {
        "strategy": "sentence",
        "max_chunk_size": 250,
        "sentence_overlap": 1
      }
}

  2. Try to update "min_number_of_allocations".
PUT _inference/elser_model_2_search/_update
{
      "task_type": "sparse_embedding",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".elser_model_2_linux-x86_64",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 10
        }
      }
}

This returns:

{
  "error": {
    "root_cause": [
      {
        "type": "status_exception",
        "reason": """Failed to parse [num_allocations] of update request [{      "task_type": "sparse_embedding",      "service_settings": {        "num_threads": 1,        "model_id": ".elser_model_2_linux-x86_64",        "adaptive_allocations": {          "enabled": true,          "min_number_of_allocations": 0,          "max_number_of_allocations": 10        }      }}
]"""
      }
    ],
    "type": "status_exception",
    "reason": """Failed to parse [num_allocations] of update request [{      "task_type": "sparse_embedding",      "service_settings": {        "num_threads": 1,        "model_id": ".elser_model_2_linux-x86_64",        "adaptive_allocations": {          "enabled": true,          "min_number_of_allocations": 0,          "max_number_of_allocations": 10        }      }}
]"""
  },
  "status": 400
}

  3. Try to include "num_allocations" in the update request.
PUT _inference/elser_model_2_search/_update
{
      "task_type": "sparse_embedding",
      "service_settings": {
        "num_threads": 1,
        "num_allocations": 1,
        "model_id": ".elser_model_2_linux-x86_64",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 10
        }
      }
}

This returns:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Validation Failed: 1: [number_of_allocations] cannot be set if adaptive allocations is enabled;"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Validation Failed: 1: [number_of_allocations] cannot be set if adaptive allocations is enabled;"
  },
  "status": 400
}
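For anyone reproducing this outside the Kibana console, steps 2 and 3 can be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the cluster URL and the absence of authentication are assumptions (an ECK deployment will normally need TLS and credentials), and the helper names `build_update_body` and `put_update` are hypothetical. The endpoint ID, path, and request bodies are taken from the steps above.

```python
import json
import urllib.error
import urllib.request

# Assumptions, not from the report: cluster address and no authentication.
ES_URL = "http://localhost:9200"
ENDPOINT_ID = "elser_model_2_search"  # inference endpoint ID used in the steps above


def build_update_body(min_allocations, max_allocations, num_allocations=None):
    """Build the _update request body from step 2 (or step 3 if num_allocations is given)."""
    service_settings = {
        "num_threads": 1,
        "model_id": ".elser_model_2_linux-x86_64",
        "adaptive_allocations": {
            "enabled": True,
            "min_number_of_allocations": min_allocations,
            "max_number_of_allocations": max_allocations,
        },
    }
    if num_allocations is not None:
        service_settings["num_allocations"] = num_allocations
    return {"task_type": "sparse_embedding", "service_settings": service_settings}


def put_update(body):
    """Send PUT _inference/<id>/_update and return (status, parsed JSON response)."""
    request = urllib.request.Request(
        f"{ES_URL}/_inference/{ENDPOINT_ID}/_update",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as err:
        # Error responses such as the 400s above also carry a JSON body.
        return err.code, json.load(err)


step2_body = build_update_body(0, 10)
step3_body = build_update_body(0, 10, num_allocations=1)
```

Calling `put_update(step2_body)` corresponds to step 2 and `put_update(step3_body)` to step 3; on an affected cluster both would be expected to return status 400 with the error bodies shown above.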

Logs (if relevant)

No response

Labels

:ml (Machine learning), >bug, Team:ML (Meta label for the ML team)
