@piergm
Member

@piergm piergm commented Jun 26, 2023

constant_keyword fields are now treated like standard keyword fields
as far as highlighting is concerned.
This is achieved by passing the original QueryBuilder to the
highlighting context and by adding a toHighlightQuery method that is
overridden only where necessary to return a matching Lucene Query.
That query replaces the MatchAllDocsQuery returned during the
search phase, which prevented highlighting.
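
A minimal sketch of the idea described above, not the code from this PR. Every type below is a simplified, hypothetical stand-in (these are not the real Lucene or Elasticsearch classes), and the field name `data_stream.type` is only an illustrative example. It shows a builder whose search query collapses to match-all while its highlight query keeps the term:

```java
// All types here are simplified, hypothetical stand-ins, not the real
// Lucene or Elasticsearch classes of the same names.
final class HighlightQuerySketch {

    interface Query {}

    static final class MatchAllDocsQuery implements Query {
        @Override public String toString() { return "*:*"; }
    }

    static final class MatchNoDocsQuery implements Query {
        @Override public String toString() { return "MatchNoDocsQuery"; }
    }

    static final class TermQuery implements Query {
        private final String field;
        private final String value;
        TermQuery(String field, String value) { this.field = field; this.value = value; }
        @Override public String toString() { return field + ":" + value; }
    }

    abstract static class QueryBuilder {
        abstract Query toQuery();
        // Default: highlight with the same query that is used for searching.
        Query toHighlightQuery() { return toQuery(); }
    }

    static final class ConstantKeywordTermQueryBuilder extends QueryBuilder {
        private final String field;
        private final String value;          // value the user searched for
        private final String constantValue;  // single value configured in the mapping

        ConstantKeywordTermQueryBuilder(String field, String value, String constantValue) {
            this.field = field;
            this.value = value;
            this.constantValue = constantValue;
        }

        @Override Query toQuery() {
            // Every document has the same value, so the search query collapses
            // to "all documents" or "no documents" and loses the term.
            return value.equals(constantValue) ? new MatchAllDocsQuery() : new MatchNoDocsQuery();
        }

        @Override Query toHighlightQuery() {
            // Overridden only here: keep the field and term for the highlighter.
            return new TermQuery(field, value);
        }
    }

    public static void main(String[] args) {
        QueryBuilder builder = new ConstantKeywordTermQueryBuilder("data_stream.type", "logs", "logs");
        System.out.println("search:    " + builder.toQuery());
        System.out.println("highlight: " + builder.toHighlightQuery());
    }
}
```

Builders that do not override `toHighlightQuery` keep their existing behavior, which is why only the constant keyword case needs a change in this sketch.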

closes #85596

piergm added 30 commits June 9, 2023 09:55
constant_keyword are now treated like standard keyword and can be
highlighted in kibana.

closes elastic#85596
@piergm piergm requested a review from romseygeek July 10, 2023 07:08
@jimczi
Contributor

jimczi commented Jul 12, 2023

I am concerned by the complexity of the change compared to the benefit. In general, highlighting a keyword field is way too complex as it is implemented today. Can we look at alternatives? We could, for instance, have a different path for keyword fields, regardless of whether they are constant or not. We don't need the full UnifiedHighlighter complexity to highlight a keyword field; that's just nonsense. It should be a simple true/false path.
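
To make the "simple true/false path" concrete, here is a hypothetical sketch, not Elasticsearch code. The helper name and tag parameters are assumptions for illustration. Because a keyword value is a single token, it is either highlighted in full or not at all:

```java
// Hypothetical helper, not part of Elasticsearch: a keyword value is a single
// token, so it is either highlighted in full or not highlighted at all.
final class KeywordHighlightSketch {

    /** Returns the whole value wrapped in tags on an exact match, otherwise null. */
    static String highlight(String fieldValue, String queryTerm, String preTag, String postTag) {
        if (fieldValue == null || !fieldValue.equals(queryTerm)) {
            return null; // "false" path: nothing to highlight
        }
        return preTag + fieldValue + postTag; // "true" path: highlight everything
    }

    public static void main(String[] args) {
        System.out.println(highlight("logs", "logs", "<em>", "</em>")); // prints <em>logs</em>
    }
}
```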

@jimczi
Contributor

jimczi commented Jul 12, 2023

One alternative, for instance, would be to use a custom MatchAllDocsQuery that keeps the field and term information when a constant keyword query rewrites to match all. Could we then expose this information when visit and matches are called through the Lucene API?
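
A rough sketch of what this alternative could look like, under stated assumptions: the interfaces below are simplified stand-ins that only imitate the visitor idea, they are not Lucene's `Query` or `QueryVisitor`, and `ConstantFieldMatchAllQuery` is a hypothetical name. The match-all query remembers the field and term it was rewritten from and reports them when visited:

```java
// Simplified stand-ins that imitate a query visitor; these are not Lucene's
// Query or QueryVisitor, and ConstantFieldMatchAllQuery is a hypothetical name.
final class MatchAllWithTermSketch {

    interface TermVisitor {
        void consumeTerm(String field, String value);
    }

    interface Query {
        // By default a query exposes no terms to the visitor.
        default void visit(TermVisitor visitor) {}
    }

    /** An ordinary match-all query: nothing to report to a highlighter. */
    static final class PlainMatchAllQuery implements Query {}

    /** A match-all query that remembers the field and term it was rewritten from. */
    static final class ConstantFieldMatchAllQuery implements Query {
        private final String field;
        private final String value;

        ConstantFieldMatchAllQuery(String field, String value) {
            this.field = field;
            this.value = value;
        }

        @Override public void visit(TermVisitor visitor) {
            visitor.consumeTerm(field, value);
        }
    }

    /** Collects whatever term the query exposes, or an empty string if none. */
    static String extractTerm(Query query) {
        StringBuilder collected = new StringBuilder();
        query.visit((field, value) -> collected.append(field).append(':').append(value));
        return collected.toString();
    }

    public static void main(String[] args) {
        System.out.println(extractTerm(new ConstantFieldMatchAllQuery("data_stream.type", "logs")));
    }
}
```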

Member Author

@piergm piergm left a comment

Thank you @jimczi for your feedback. I understand your concerns about the complexity of the solution relative to its benefits. It is somewhat my concern too. On the other hand, this solution makes future custom highlighting easy: one only has to implement the toHighlightQuery method where needed. That is what reconciled me to the solution.

Using a custom MatchAllDocsQuery was my first approach to this problem too. The issue I found is that the MatchAllDocsQuery class is final. This prevented me from extending it with further information and from making this a simple true/false path.
The solutions considered given this limitation were:

  • Wrapping MatchAllDocsQuery in order to simulate extending a final class, but this would break all the instanceof checks down the line.
  • Opening a PR to Lucene with one of the following changes:
    • Removing final from MatchAllDocsQuery (Is this change useful for Lucene, or just for us?)
    • Adding originalQuery to the MatchAllDocsQuery constructor (Is this change useful for Lucene, or just for us?)
    • Adding a reason String to the MatchAllDocsQuery constructor, as is done for MatchNoDocsQuery. I think this is the only change the Lucene community would accept, but it would leave us with a hacky string-matching solution to decide which highlighting path to take.
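
The instanceof problem with the wrapping option can be illustrated with a small sketch. The classes are hypothetical stand-ins, not the real Lucene types. A wrapper that carries the extra information is no longer recognized by code that checks for the final class:

```java
// Hypothetical stand-ins, not the real Lucene classes.
final class WrapperInstanceofSketch {

    interface Query {}

    /** Final, as in the discussion above, so it cannot be subclassed. */
    static final class MatchAllDocsQuery implements Query {}

    /** Wraps a match-all query to carry the extra field and term information. */
    static final class WrappedMatchAll implements Query {
        final MatchAllDocsQuery delegate = new MatchAllDocsQuery();
        final String field;
        final String value;

        WrappedMatchAll(String field, String value) {
            this.field = field;
            this.value = value;
        }
    }

    /** The kind of check existing code performs further down the line. */
    static boolean isMatchAll(Query query) {
        return query instanceof MatchAllDocsQuery;
    }

    public static void main(String[] args) {
        System.out.println(isMatchAll(new MatchAllDocsQuery()));                         // prints true
        System.out.println(isMatchAll(new WrappedMatchAll("data_stream.type", "logs"))); // prints false
    }
}
```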

That said, I appreciate the suggestion in your comment and will evaluate the feasibility of exposing the needed information when visit and matches are called!
Thanks

@jimczi
Contributor

jimczi commented Jul 14, 2023

OK, thanks for explaining, @piergm. I guess my next question is whether we should support highlighting on constant keyword fields at all. Considering the complexity and the required changes (not rewriting the shard request, adding a new toHighlightQuery, ...), I am not sure we should pursue this path. Highlighting is geared towards text fields, so not supporting constant_keyword fields feels like a good trade-off to me.

@javanna
Member

javanna commented Jul 14, 2023

I tend to agree that this adds quite a bit of complexity for something that should be simpler. We did go through different options, and really the main goal here was for @piergm to get familiar with the codebase (Success!) and possibly address a real-life issue at the same time. This issue was reported by Kibana, which highlights keyword and text fields alike. I can see how, from a UI perspective, if you can highlight keyword fields, you should be able to highlight constant keywords too (and it should be easy to do so, because they are either fully highlighted or not at all!). We will re-discuss this with the team and figure out a viable way forward, keeping in mind that this is not a high priority for us at the moment.

@quux00 quux00 added v8.11.0 and removed v8.10.0 labels Aug 16, 2023
@mattc58 mattc58 added v8.12.0 and removed v8.11.0 labels Oct 4, 2023
@javanna javanna added Team:Search Relevance and removed Team:Search labels Jul 16, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

Labels

>bug, :Search Relevance/Highlighting, Team:Search Relevance

Development

Successfully merging this pull request may close these issues.

Highlighting in Kibana doesn't work for constant_keyword field type

9 participants