Member

@carlosdelest carlosdelest commented Dec 11, 2024

Follow up to #116663.

Adds docs for vector rescoring.

Search your data / knn search section:

image
image

knn query / knn retriever parameters section:

image

@carlosdelest carlosdelest added the >docs General docs changes label Dec 11, 2024
@carlosdelest carlosdelest force-pushed the feature/knn-vector-rescore-query-docs branch from 10c98d3 to 09a9156 Compare December 11, 2024 09:48
@carlosdelest carlosdelest force-pushed the feature/knn-vector-rescore-query-docs branch from 09a9156 to 8ade227 Compare December 11, 2024 09:54
@carlosdelest carlosdelest added v8.18.0 auto-backport Automatically create backport pull requests when merged labels Dec 11, 2024
@carlosdelest carlosdelest added Team:Docs Meta label for docs team :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels Dec 11, 2024
@carlosdelest carlosdelest marked this pull request as ready for review December 11, 2024 10:28
@elasticsearchmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

There are two main ways to oversample and rescore. The first is to utilize the <<rescore, rescore section>> in the `_search` request.
[discrete]
[[dense-vector-knn-search-reranking-rescore-section]]
===== Use the `rescore_vector` section for top-level kNN search
Member

The following example doesn't use rescore_vector; instead, it uses the top-level rescore.

I am not sure we need to specifically call out ALL the ways that you can rescore. There are 3 ways now (rescore_vector, script score, and "rescore" at the top level). All have their uses, but our docs should be "normal user" prescriptive. Let's focus on rescore_vector as that is the easiest for oversampling and rescoring vectors.
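
For readers following the thread, here is a minimal sketch of the request shape this comment recommends, written as a Python helper that builds the `_search` body. The field name `my_vector`, the helper name, and the `oversample` key inside `rescore_vector` are assumptions for illustration; the thread itself does not spell out the parameter's fields, so check the merged docs before relying on them.

```python
import json


def knn_with_rescore_vector(field, query_vector, k, num_candidates, oversample):
    """Build a top-level kNN search body that asks for oversampling and rescoring.

    Hypothetical helper: it only assembles a dict, it does not call Elasticsearch.
    """
    return {
        "knn": {
            "field": field,
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
            # Assumed shape of the tech-preview parameter discussed in this PR.
            "rescore_vector": {"oversample": oversample},
        }
    }


body = knn_with_rescore_vector(
    field="my_vector",  # hypothetical quantized dense_vector field
    query_vector=[0.1, -0.2, 0.3],
    k=10,
    num_candidates=100,
    oversample=2.0,
)
print(json.dumps(body, indent=2))
```

The point of the sketch is that oversampling and rescoring stay inside the `knn` clause, so no separate rescore section or script is needed.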

Member Author

I think we should keep the existing ways of rescoring until we make rescore_vector GA.

We can group the other ways under an "other ways of doing rescoring" kind of section, but I think it's worth keeping them until rescore_vector is out of tech preview.

WDYT?

Member

I suppose calling these out is ok, we just need to be perfectly clear what each one means and when to use each one.

Member Author

I've set rescore_vector as the way to do it, and an additional section for the other methods, along with when you should use them, in ff7f906.

LMKWYT!

Comment on lines 1191 to 1197

[discrete]
[[dense-vector-knn-search-reranking-script-score]]
===== Use a `script_score` query to rescore per shard

You can rescore per shard with the <<query-dsl-knn-query, knn query>> and <<query-dsl-script-score-query, script_score query >>.
Generally, this means that there will be more rescoring per shard, but this can increase overall recall at the cost of compute.
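
A rough sketch of the per-shard approach described in the diff above, again as a Python helper that only builds the request body. The field name `my_int4_vector`, the helper name, and the `+ 1.0` offset in the script are illustrative assumptions, not taken from this thread. The `knn` query collects candidates on each shard, and the surrounding `script_score` query rescores every one of them with a dot product.

```python
import json


def script_score_rescore(field, query_vector, num_candidates, size):
    """Wrap a knn query in script_score so each shard rescores its candidates.

    Hypothetical helper: it only assembles a dict, it does not call Elasticsearch.
    """
    return {
        "size": size,
        "query": {
            "script_score": {
                # Inner knn query: gathers num_candidates per shard.
                "query": {
                    "knn": {
                        "field": field,
                        "query_vector": query_vector,
                        "num_candidates": num_candidates,
                    }
                },
                # Script: recomputes the score for each candidate.
                "script": {
                    "source": f"dotProduct(params.queryVector, '{field}') + 1.0",
                    "params": {"queryVector": query_vector},
                },
            }
        },
    }


per_shard_body = script_score_rescore(
    field="my_int4_vector",  # hypothetical quantized dense_vector field
    query_vector=[0.1, -0.2, 0.3],
    num_candidates=20,
    size=10,
)
print(json.dumps(per_shard_body, indent=2))
```

Because the script runs on every shard's candidates rather than on a single global window, this does more rescoring work overall, which is the recall-versus-compute trade-off the diff text mentions.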
Member

Let's not bother calling out script_score. I say remove this section and focus on "rescore_vector".

Comment on lines 1093 to 1097
There are three main ways to oversample and rescore:

* <<dense-vector-knn-search-reranking-rescore-parameter>>
* <<dense-vector-knn-search-reranking-rescore-section>>
* <<dense-vector-knn-search-reranking-script-score>>
Member

Let's focus on rescore_vector and remove the other sections.

+
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-similarity]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-rescore-vector]
Member

We also need to put it in search-api-knn in search.asciidoc.

Member Author

Thanks for the catch - done in 9c00564

Contributor

@ChrisHegarty ChrisHegarty left a comment

LGTM

Contributor

@mayya-sharipova mayya-sharipova left a comment

Thanks Carlos, nice addition!

@carlosdelest
Member Author

@benwtrent this is ready for review; it includes the latest changes for vector rescoring in #119835.

@carlosdelest carlosdelest requested a review from a team January 15, 2025 06:59
@carlosdelest carlosdelest merged commit aea4853 into elastic:main Jan 17, 2025
5 checks passed
@elasticsearchmachine
Collaborator

💚 Backport successful

Branch: 8.x

Labels

auto-backport (Automatically create backport pull requests when merged)
>docs (General docs changes)
:Search Relevance/Vectors (Vector search)
Team:Docs (Meta label for docs team)
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
v8.18.0
v9.0.0
