kderusso
Member

We ran into a confusing use case with the semantic highlighter.

PUT my-test-index
{
  "mappings": {
    "properties": {
      "text": {
        "type": "semantic_text"
      }
    }
  }
}

PUT my-test-index/_doc/1
{
  "text": [
    "puggles are pugs and beagles",
    "chiweenies are chihuahuas and dachshunds"
  ]
}

GET my-test-index/_search
{
  "query": {
    "match_all": {}
  },
  "highlight": {
    "fields": {
      "text": {
        "number_of_fragments": 1
      }
    }
  }
}

With the above search, we expected the first snippet to be returned because no order was specified, but the second snippet was returned instead. This is because we asked for only one snippet, so only the top-scoring snippet was returned.
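The behavior can be modeled with a small Python sketch. This is a toy illustration, not the highlighter's actual implementation: the function name `select_fragments` and the scores are hypothetical, and the sketch assumes fragments are first selected by score and only then listed in document order.

```python
def select_fragments(chunks, scores, number_of_fragments):
    """Keep the top-scoring chunks, then list them in document order."""
    # Rank chunk positions by score, best first.
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    # Keep only as many chunks as were requested.
    kept = ranked[:number_of_fragments]
    # Present the surviving chunks in the order they appear in the document.
    return [chunks[i] for i in sorted(kept)]


chunks = [
    "puggles are pugs and beagles",
    "chiweenies are chihuahuas and dachshunds",
]
# Hypothetical scores: assume the second chunk happened to score higher.
scores = [0.2, 0.9]

one = select_fragments(chunks, scores, 1)
both = select_fragments(chunks, scores, 2)
print(one)
print(both)
```

Under these assumed scores, asking for one fragment yields the second chunk, while asking for two yields both chunks in document order.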

This PR clarifies the documentation and shows an alternate way to get semantic highlighter chunks in the order in which they appear in the document:

GET my-test-index/_search
{
  "query": {
    "match_all": {}
  },
  "highlight": {
    "fields": {
      "text": {
        "number_of_fragments": 1
      }
    },
    "highlight_query": {
      "match": {
        "text": "chihuahua"
      }
    }
  }
}

@kderusso added the >docs, :SearchOrg/Relevance, and v9.2.0 labels on Jul 24, 2025
@elasticsearchmachine added the Team:Docs, Team:SearchOrg, and Team:Search - Relevance labels on Jul 24, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@elasticsearchmachine
Collaborator

Pinging @elastic/search-eng (Team:SearchOrg)

@elasticsearchmachine
Collaborator

Pinging @elastic/search-relevance (Team:Search - Relevance)

POST test-index/_search
{
  "query": {
    "match_all": {}
Member Author

@Samiul-TheSoccerFan what version should this be tagged with?

Contributor

It was backported all the way to 8.18 and 9.0. If you are looking for specific versions, I would say from 8.18.4 and 9.0.4.

Contributor

github-actions bot commented Jul 24, 2025

Contributor

@Mikep86 left a comment

Thanks for the quick turnaround on this, and adding match_all at the same time!

"highlight": {
  "fields": {
    "my_semantic_field": {
      "number_of_fragments": 5
Contributor

Should we call out adjusting number_of_fragments here to a larger number to get all chunks in a doc, and how that will be variable?
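For illustration, a minimal Python sketch of what raising `number_of_fragments` could look like when building the request body. The helper `build_highlight_search` and the value 50 are hypothetical; because the number of chunks varies per document, no single value is guaranteed to cover every document.

```python
import json


def build_highlight_search(field, number_of_fragments):
    """Build a match_all search body that highlights `field` (hypothetical helper)."""
    return {
        "query": {"match_all": {}},
        "highlight": {
            "fields": {
                field: {"number_of_fragments": number_of_fragments}
            }
        },
    }


# 50 is an arbitrary upper bound chosen for this sketch, not a recommended value.
body = build_highlight_search("my_semantic_field", 50)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a `POST test-index/_search` request.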

Member Author

Updated in 8cae0f6. Let me know what you think.

Contributor

LGTM!

# Highlighting [highlighting]

- Highlighters enable you to get highlighted snippets from one or more fields in your search results so you can show users where the query matches are. When you request highlights, the response contains an additional `highlight` element for each search hit that includes the highlighted fields and the highlighted fragments.
+ Highlighters enable you to retrieve the best-matching highlighted snippets from one or more fields in your search results so you can show users where the query matches are. When you request highlights, the response contains an additional `highlight` element for each search hit that includes the highlighted fields and the highlighted fragments.
Contributor

Nice light-touch change 👍

Contributor

@Samiul-TheSoccerFan left a comment

LGTM

@kderusso added the v8.18.4, v8.19.1, v9.0.5, v9.1.1, and auto-backport labels on Jul 29, 2025
@kderusso merged commit b3510c1 into elastic:main on Jul 29, 2025
10 checks passed
kderusso added a commit to kderusso/elasticsearch that referenced this pull request Jul 29, 2025
@elasticsearchmachine
Collaborator

💔 Backport failed

| Status | Branch | Result |
|--------|--------|--------|
| | 8.18 | Commit could not be cherrypicked due to conflicts |
| | 9.1 | |
| | 8.19 | Commit could not be cherrypicked due to conflicts |
| | 9.0 | Commit could not be cherrypicked due to conflicts |

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 131871`

@kderusso
Member Author

💔 Some backports could not be created

| Status | Branch | Result |
|--------|--------|--------|
| | 9.0 | |
| | 8.19 | Conflict resolution was aborted by the user |
| | 8.18 | Conflict resolution was aborted by the user |

Manual backport

To create the backport manually run:

backport --pr 131871

Questions?

Please refer to the Backport tool documentation

@kderusso
Member Author

Removed the backport labels for 8.18 and 8.19, since the Markdown docs don't exist on those branches.

Labels

auto-backport, >docs, :SearchOrg/Relevance, Team:Docs, Team:Search - Relevance, Team:SearchOrg, v9.0.5, v9.1.1, v9.2.0


4 participants