@Samiul-TheSoccerFan
Contributor

This PR introduces a user-configurable inference timeout setting and uses it as the timeout for inference calls. The timeout is currently hardcoded to 10s; the goal is to make it configurable.

Setup

PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}

PUT my-semantic-index-5
{
  "mappings": {
    "properties": {
      "writer": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "reader": {
        "type": "semantic_text"
      }
    }
  }
}

PUT my-semantic-index-6
{
  "mappings": {
    "properties": {
      "writer": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "reader": {
        "type": "semantic_text"
      }
    }
  }
}

POST my-semantic-index-5/_doc/1
{
  "writer": "Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-5/_doc/2
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-5/_doc/3
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/1
{
  "writer": "Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/2
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/3
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

GET the default settings:

GET /my-semantic-index-5/_settings

GET /my-semantic-index-5/_settings?include_defaults=true

GET /my-semantic-index-6/_settings

GET /my-semantic-index-6/_settings?include_defaults=true

Update the inference timeout value:

PUT /my-semantic-index-6/_settings
{
  "index": {
    "semantic_text": {
      "inference_timeout": "1s"
    }
  }
}

GET the updated settings:

GET /my-semantic-index-5/_settings

GET /my-semantic-index-5/_settings?include_defaults=true

GET /my-semantic-index-6/_settings

GET /my-semantic-index-6/_settings?include_defaults=true
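
One possible way to exercise the updated timeout (a sketch, not part of the original test steps): run a semantic query against my-semantic-index-6, which triggers an inference call at query time. The query text below is illustrative, and whether the 1s timeout is actually hit depends on how quickly the model responds.

```
# Illustrative only: a semantic query on the "writer" field of the index
# whose inference timeout was lowered to 1s above.
GET my-semantic-index-6/_search
{
  "query": {
    "semantic": {
      "field": "writer",
      "query": "riding hood"
    }
  }
}
```

Running the same query against my-semantic-index-5 should still use the default timeout, since only my-semantic-index-6 was updated.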

Contributor

@Mikep86 left a comment

Nice start! This initial commit gives us some useful info about what the scope of the solution should be:

  • The inference timeout is also hard-coded for sparse_vector (see here) and knn (see here) queries. Let's expand the scope to have one setting that controls the inference timeout for those + the semantic query inference timeout.
  • The machinations you had to go through to get the setting value for an individual index indicate that we should make this a cluster setting instead.

@Samiul-TheSoccerFan
Contributor Author

closing in favor of #131551
