Min similarity is not handled through the semantic text highlighter #136056

@Mikep86

Elasticsearch Version

main

Installed Plugins

No response

Java Version

bundled

OS Version

n/a

Problem Description

knn queries that specify a similarity value and target a semantic_text field do not work with semantic highlighting.

I debugged this and traced it to SemanticTextHighlighter#extractDenseVectorQueries, which does not handle VectorSimilarityQuery. I created a quick branch to prove out a fix: https://github.com/Mikep86/elasticsearch/tree/semantic-text_debug-highlighting-with-min-similarity
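
To illustrate the failure mode, here is a minimal, self-contained sketch. It assumes the similarity threshold is applied by wrapping the knn query in another query object, so an extractor that only recognizes bare knn queries finds nothing to highlight with. All types below (Query, KnnQuery, SimilarityWrapperQuery) and both extract methods are hypothetical stand-ins written for this example; they are not the real Lucene or Elasticsearch classes, and this is not the code from the linked branch.

```java
import java.util.ArrayList;
import java.util.List;

public class HighlighterSketch {
    // Hypothetical stand-in for a query type; NOT the real Lucene Query class.
    interface Query {}

    // Hypothetical stand-in for a plain knn query on a single field.
    static final class KnnQuery implements Query {
        final String field;
        final float[] vector;

        KnnQuery(String field, float[] vector) {
            this.field = field;
            this.vector = vector;
        }
    }

    // Hypothetical stand-in for a wrapper that applies a minimum similarity
    // to an inner knn query (the role assumed for VectorSimilarityQuery).
    static final class SimilarityWrapperQuery implements Query {
        final Query inner;
        final float minSimilarity;

        SimilarityWrapperQuery(Query inner, float minSimilarity) {
            this.inner = inner;
            this.minSimilarity = minSimilarity;
        }
    }

    // Models the reported behavior: only a bare KnnQuery is recognized,
    // so a wrapped query yields an empty list.
    static List<KnnQuery> extractWithoutUnwrapping(Query query, String field) {
        List<KnnQuery> found = new ArrayList<>();
        if (query instanceof KnnQuery) {
            KnnQuery knn = (KnnQuery) query;
            if (knn.field.equals(field)) {
                found.add(knn);
            }
        }
        return found;
    }

    // Models one possible fix: look inside the wrapper before matching.
    static List<KnnQuery> extractWithUnwrapping(Query query, String field) {
        if (query instanceof SimilarityWrapperQuery) {
            return extractWithUnwrapping(((SimilarityWrapperQuery) query).inner, field);
        }
        return extractWithoutUnwrapping(query, field);
    }

    public static void main(String[] args) {
        Query knn = new KnnQuery("inference_field", new float[] {0.1f, 0.2f});
        Query wrapped = new SimilarityWrapperQuery(knn, 0.40f);

        // Without unwrapping, the wrapped knn query is missed.
        System.out.println(extractWithoutUnwrapping(wrapped, "inference_field").size()); // prints 0
        // With unwrapping, the inner knn query is found.
        System.out.println(extractWithUnwrapping(wrapped, "inference_field").size()); // prints 1
    }
}
```

In this sketch the wrapper is unwrapped recursively, so nested wrappers would also be handled; whether the real fix needs that is not established here.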

Steps to Reproduce

  1. Create an index with a semantic_text field that uses a dense vector model:
PUT test-index
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": ".multilingual-e5-small-elasticsearch"
      }
    }
  }
}
  2. Add some docs to the index

  3. Query the field using a knn query with a similarity value and highlighting enabled:

GET test-index/_search
{
  "query": {
    "knn": {
      "similarity": 0.40,
      "field": "inference_field",
      "query_vector_builder": {
        "text_embedding": {
          "model_text": "foo"
        }
      },
      "k": 10,
      "num_candidates": 100
    }
  },
  "highlight": {
    "fields": {
      "inference_field": {
        "order": "score",
        "number_of_fragments": 1
      }
    }
  }
}

Logs (if relevant)

No response

Labels

:SearchOrg/Relevance, >bug, Team:Search - Relevance
