@iverase iverase commented Aug 29, 2025

Currently, if we open a file with a directory other than Mmap, we fall back to the scalar implementation. We can still vectorize in this situation by reading the vectors on-heap. This commit does just that: it adds an OnHeapES91OSQVectorsScorer, which reads the vectors into a byte array and then runs the scorer using the Java Vector API.
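A minimal sketch of the on-heap pattern described above, not the code in this PR. It copies each stored vector from a (possibly off-heap) `ByteBuffer` into a reusable heap `byte[]` and then scores from that array. The class and method names (`OnHeapScoreSketch`, `scoreAll`, `int4BitDotProduct`) are hypothetical, the 4-bit-query by 1-bit-document bit-plane layout is an assumption, and scalar `Integer.bitCount` stands in for the Java Vector API so the example runs without the incubator module.

```java
import java.nio.ByteBuffer;

public class OnHeapScoreSketch {

    /**
     * Bit dot product between a 4-bit query, assumed to be stored as four
     * consecutive bit planes of `length` bytes each, and a 1-bit document vector.
     */
    static long int4BitDotProduct(byte[] query, byte[] doc, int length) {
        long total = 0;
        for (int plane = 0; plane < 4; plane++) {
            long planeSum = 0;
            for (int i = 0; i < length; i++) {
                // Mask to 8 bits so sign extension does not inflate the popcount.
                planeSum += Integer.bitCount((query[plane * length + i] & doc[i]) & 0xFF);
            }
            // Plane p carries weight 2^p.
            total += planeSum << plane;
        }
        return total;
    }

    /** Copies each document vector into a reusable heap array, then scores it. */
    static long[] scoreAll(ByteBuffer source, byte[] query, int length, int count) {
        byte[] scratch = new byte[length]; // reused across documents
        long[] scores = new long[count];
        for (int d = 0; d < count; d++) {
            source.get(scratch, 0, length); // the on-heap copy step
            scores[d] = int4BitDotProduct(query, scratch, length);
        }
        return scores;
    }

    public static void main(String[] args) {
        byte[] query = {(byte) 0xFF, 0x0F, 0x00, (byte) 0xFF, 0x01, 0x00, (byte) 0x80, (byte) 0x80};
        // A direct buffer stands in for data that does not live on the Java heap.
        ByteBuffer docs = ByteBuffer.allocateDirect(4);
        docs.put(new byte[] {(byte) 0xFF, (byte) 0xFF, 0x00, 0x00});
        docs.flip();
        for (long score : scoreAll(docs, query, 2, 2)) {
            System.out.println(score);
        }
    }
}
```

The extra copy into `scratch` is the cost that the Mmap path avoids, which is consistent with the gap between the Mmap and Niofs rows in the benchmarks.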

Here is how it looks:
In 128 bits:

Benchmark                                                           (dims)   Mode  Cnt   Score   Error   Units
OSQScorerBenchmark.scoreFromMemorySegmentAllBulkMmapScalar            1024  thrpt    5   4.188 ± 0.025  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentAllBulkMmapVect              1024  thrpt    5  31.017 ± 0.176  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentAllBulkNiofsScalar           1024  thrpt    5   2.709 ± 0.030  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentAllBulkNfiosVect             1024  thrpt    5   6.310 ± 0.150  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorBulkMmapScalar     1024  thrpt    5   4.318 ± 0.016  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorBulkMmapVect       1024  thrpt    5  25.878 ± 1.576  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorBulkNiofsScalar    1024  thrpt    5   2.500 ± 0.088  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorBulkNiofsVect      1024  thrpt    5   6.027 ± 0.983  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorMmapScalar         1024  thrpt    5   4.032 ± 0.009  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorMmapVect           1024  thrpt    5  22.166 ± 0.437  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorNiofsScalar        1024  thrpt    5   2.507 ± 0.468  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorNiofsVect          1024  thrpt    5   4.898 ± 0.304  ops/ms

In 256 bits:

Benchmark                                                           (dims)   Mode  Cnt   Score    Error   Units
OSQScorerBenchmark.scoreFromMemorySegmentAllBulkMmapScalar            1024  thrpt    5   8.309 ±  0.122  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentAllBulkMmapVect              1024  thrpt    5  20.956 ±  0.315  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentAllBulkNiofsScalar           1024  thrpt    5   2.438 ±  0.017  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentAllBulkNiofsVect             1024  thrpt    5   4.234 ±  0.235  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorBulkMmapScalar     1024  thrpt    5   7.897 ±  0.073  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorBulkMmapVect       1024  thrpt    5  15.071 ±  4.142  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorBulkNiofsScalar    1024  thrpt    5   2.321 ±  0.118  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorBulkNiofsVect      1024  thrpt    5   3.943 ±  0.067  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorMmapScalar         1024  thrpt    5   7.890 ±  0.088  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorMmapVect           1024  thrpt    5  12.665 ±  0.576  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorNiofsScalar        1024  thrpt    5   1.818 ±  1.615  ops/ms
OSQScorerBenchmark.scoreFromMemorySegmentOnlyVectorNiofsVect          1024  thrpt    5   3.141 ±  0.041  ops/ms

The differences between Mmap and Niofs are expected, as Mmap runs on off-heap memory all the time. This change gets roughly a 2x speedup when Mmap is not available (about 1.7x to 2.4x for the Niofs rows in the runs above).

@iverase iverase requested a review from benwtrent August 29, 2025 17:32
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance label Aug 29, 2025
@elasticsearchmachine
Pinging @elastic/es-search-relevance (Team:Search Relevance)

@benwtrent benwtrent left a comment

Simple copy and paste with some changes. Seems nice enough to me, though mmap is certainly the way to go when possible.

@ChrisHegarty what do you think?

iverase commented Sep 2, 2025

Ping @ChrisHegarty, do you have any concerns here?

@ChrisHegarty ChrisHegarty left a comment

While it's not ideal to have several Panama implementations, they add little cognitive load, and they're certainly worth it performance-wise. And we have good tests. So LGTM. Thanks @iverase

@iverase iverase merged commit 80f4e75 into elastic:main Sep 2, 2025
33 checks passed
@iverase iverase deleted the OnHeapES91OSQVectorsScorer branch September 2, 2025 15:00

Labels

>non-issue, :Search Relevance/Vectors, Team:Search Relevance, v9.2.0

4 participants