Configurable Inference Timeout #129880

Samiul-TheSoccerFan · 2025-06-23T20:54:05Z

This PR introduces a user-configurable inference timeout setting and uses it as the timeout during inference calls. The timeout is currently hard-coded to 10s; the goal is to make it configurable.
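As a rough illustration of the intended behaviour (not the code in this PR), the sketch below falls back to the current 10s default when no timeout has been configured. The class and method names are hypothetical, and treating a zero or negative value as "not configured" is an assumption made for the example.

```java
import java.time.Duration;

// Hypothetical sketch; none of these names come from the Elasticsearch codebase.
final class InferenceTimeoutResolver {
    // The value the PR description says is currently hard-coded.
    static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(10);

    // Returns the configured timeout when it is present and positive,
    // otherwise the 10s default (assumption: non-positive means "unset").
    static Duration resolve(Duration configured) {
        if (configured == null || configured.isZero() || configured.isNegative()) {
            return DEFAULT_TIMEOUT;
        }
        return configured;
    }

    public static void main(String[] args) {
        // No configured value: the default applies.
        System.out.println(resolve(null));
        // A configured value such as "1s" takes precedence.
        System.out.println(resolve(Duration.ofSeconds(1)));
    }
}
```

Keeping the default in one place means callers never need to know whether a value was configured.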

Setup

PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}

PUT my-semantic-index-5
{
  "mappings": {
    "properties": {
      "writer": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "reader": {
        "type": "semantic_text"
      }
    }
  }
}

PUT my-semantic-index-6
{
  "mappings": {
    "properties": {
      "writer": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "reader": {
        "type": "semantic_text"
      }
    }
  }
}

POST my-semantic-index-5/_doc/1
{
  "writer": "Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-5/_doc/2
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-5/_doc/3
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/1
{
  "writer": "Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/2
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/3
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

GET the default settings:

GET /my-semantic-index-5/_settings

GET /my-semantic-index-5/_settings?include_defaults=true

GET /my-semantic-index-6/_settings

GET /my-semantic-index-6/_settings?include_defaults=true

Update the inference timeout value:

PUT /my-semantic-index-6/_settings
{
  "index": {
    "semantic_text": {
      "inference_timeout": "1s"
    }
  }
}

GET the updated settings:

GET /my-semantic-index-5/_settings

GET /my-semantic-index-5/_settings?include_defaults=true

GET /my-semantic-index-6/_settings

GET /my-semantic-index-6/_settings?include_defaults=true


Mikep86

Nice start! This initial commit gives us some useful info about what the scope of the solution should be:

The inference timeout is also hard-coded for sparse_vector (see here) and knn (see here) queries. Let's expand the scope so that one setting controls the inference timeout for those queries as well as for the semantic query.
The machinations you had to go through to get the setting value for an individual index indicate that we should make this a cluster setting instead.
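One way to picture the "one setting for all query types" idea is a single holder that every query path reads and that is updated whenever the cluster-level value changes. This is only a sketch under assumptions: the class, its methods, and the validation rule are hypothetical and do not reflect how Elasticsearch registers or propagates cluster settings.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of one cluster-wide timeout shared by several query types.
final class SharedInferenceTimeout {
    // Starts at the 10s value that is hard-coded today.
    private final AtomicReference<Duration> current =
        new AtomicReference<>(Duration.ofSeconds(10));

    // Would be invoked when the (hypothetical) cluster setting is updated.
    void update(Duration newValue) {
        if (newValue == null || newValue.isZero() || newValue.isNegative()) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        current.set(newValue);
    }

    Duration get() {
        return current.get();
    }

    // Stand-in for the semantic, sparse_vector and knn query paths,
    // which all read the same value instead of their own constants.
    String describe(String queryType) {
        return queryType + " inference timeout = " + get().getSeconds() + "s";
    }

    public static void main(String[] args) {
        SharedInferenceTimeout timeout = new SharedInferenceTimeout();
        timeout.update(Duration.ofSeconds(1));
        for (String type : new String[] {"semantic", "sparse_vector", "knn"}) {
            System.out.println(timeout.describe(type));
        }
    }
}
```

Because every query type reads from the same holder, a single update changes the timeout everywhere, with no per-index lookup.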


Samiul-TheSoccerFan · 2025-07-18T17:08:27Z

Closing in favor of #131551.

Adding settings for query time inference config and applying during s…

56cef20

…emantic query build

elasticsearchmachine added the v9.1.0 label Jun 23, 2025

Samiul-TheSoccerFan added >enhancement v8.19.0 v9.1.0 and removed v9.1.0 labels Jun 23, 2025

Mikep86 reviewed Jun 25, 2025

View reviewed changes

elasticsearchmachine added v9.2.0 and removed v9.1.0 labels Jun 26, 2025

Samiul-TheSoccerFan and others added 15 commits July 9, 2025 11:50

moving inference timeout from index settings to cluster settings

4b0aded

inference timeout for internal services

31d17bc

adding inference timeout for third party inference timeout

b8a5405

propagate clustersettings to sageMaker

faebbef

remove unnecessary blank line

24d50a0

[CI] Auto commit changes from spotless

d8fc4af

Updating cluster settings to generalized term

6fe0bc1

supply inference context to all inference services and added a constr…

dd86d48

…uctor for testing purposes

revert back previous changes

9bccc5d

sending null so infernece queries will pick up cluster timeout settings

906b1e9

update tests

c1345c4

replacing clusterservice with inference context in tests

a57c470

linting for all changed files

fad271b

removed unwanted testing constructor

93fd152

[CI] Auto commit changes from spotless

cef35c0

Samiul-TheSoccerFan removed the v8.19.0 label Jul 14, 2025

Samiul-TheSoccerFan mentioned this pull request Jul 14, 2025

[Inference Timeout] Supply inference context to all third party services #131251

Merged

Samiul-TheSoccerFan closed this Jul 18, 2025
