Add DirectIO bulk rescoring #135380
Hi @benwtrent, I've created a changelog YAML for you.
Pinging @elastic/es-search-relevance (Team:Search Relevance)
...ava/org/elasticsearch/index/codec/vectors/es93/DirectIOCapableLucene99FlatVectorsFormat.java
}
}

static class Lucene99FlatBulkScoringVectorsReader extends FlatVectorsReader {
So much ceremony required...
@thecoop yes, some of it should go away with Lucene 10.4. The nice thing is that the top-level format name remains unchanged, so it's an easy removal once the new Lucene is released.
Is there a PR for the corresponding changes on lucene_snapshot?
@thecoop no, not yet.
server/src/main/java/org/elasticsearch/search/vectors/RescoreKnnVectorQuery.java
<inspection_tool class="jol" enabled="false" level="WARNING" enabled_by_default="false" />
</profile>
</component>
</component>
revert this
This adds DirectIO bulk rescoring, where vectors are prefetched in batches and then bulk scored with the random vector scorer.
Without #134803 this doesn't do much.
I haven't really optimized the batch sizes; I am sure we can pick something better given knowledge of the IO capabilities of the underlying system.
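For readers who want the shape of the idea without the Lucene plumbing, here is a minimal sketch of prefetch-then-bulk-score. Everything in it is an assumption made for illustration: `VectorSource`, `InMemorySource`, `rescore`, and the `batchSize` parameter are hypothetical names, not the classes or methods in this PR, and a plain dot product stands in for the random vector scorer.

```java
// Hypothetical sketch only: none of these types are Lucene or Elasticsearch APIs.
final class BatchedRescoreSketch {

    /** Assumed stand-in for a vector reader that accepts prefetch hints. */
    interface VectorSource {
        /** Hint that ords[from..to) will be read soon; may be a no-op. */
        void prefetch(int[] ords, int from, int to);

        /** Returns the vector stored at the given ordinal. */
        float[] vector(int ord);
    }

    /** In-memory source that only counts prefetch hints, to keep the sketch runnable. */
    static final class InMemorySource implements VectorSource {
        private final float[][] vectors;
        int prefetchCalls = 0;

        InMemorySource(float[][] vectors) {
            this.vectors = vectors;
        }

        @Override
        public void prefetch(int[] ords, int from, int to) {
            prefetchCalls++; // a real reader would issue asynchronous reads here
        }

        @Override
        public float[] vector(int ord) {
            return vectors[ord];
        }
    }

    /** Dot product, used here in place of the real similarity function. */
    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * Scores every candidate ordinal against the query, one batch at a time:
     * first hint the whole batch to the source, then score the batch.
     */
    static float[] rescore(VectorSource source, float[] query, int[] ords, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        float[] scores = new float[ords.length];
        for (int from = 0; from < ords.length; from += batchSize) {
            int to = Math.min(from + batchSize, ords.length);
            source.prefetch(ords, from, to);
            for (int i = from; i < to; i++) {
                scores[i] = dot(query, source.vector(ords[i]));
            }
        }
        return scores;
    }
}
```

In this sketch, `batchSize` is the knob the description refers to: a larger batch gives the IO layer more reads to overlap, while a smaller one keeps fewer vectors in flight. The right value presumably depends on the storage device, which is why it is left as a parameter here.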