Remove soar duplicate checking #132617
Conversation
Pinging @elastic/es-search-relevance (Team:Search Relevance)
server/src/main/java/org/elasticsearch/index/codec/vectors/IVFVectorsReader.java
I like that the change removes the need for the visitedDocs bitset, but it worries me that some of the calls to common APIs will no longer be correct. For example, the following call:
if (scoredDocs > 0) {
    knnCollector.incVisitedCount(scoredDocs);
}
It won't be correct because we are counting some documents twice. Is that a problem?
We may visit the same doc twice, but I think that is ok. We are using "visited" as a stand-in for "number of vector ops", which is correct and exposed via profiling. The top-hit count is still just being exposed as the … What do you think?
I saw in another PR that we might move from visiting x nProbes to a visited ratio (which I think is the right approach, as clusters are not balanced). This change will have an effect on that ratio. Anyway, I do like removing the visitedDocs BitSet, so I am good with this and with trying to make the ratio work considering that we might visit a document twice.
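To make the trade-off discussed above concrete, here is a minimal, self-contained sketch of the two counting strategies. All names are illustrative assumptions, not the actual Elasticsearch or Lucene code: a simplified Collector interface stands in for the real KnnCollector.

import java.util.BitSet;

class VisitedCountSketch {
    interface Collector {
        void collect(int doc, float score);
        void incVisitedCount(int count);
    }

    // Before (sketch): a bitset skips overspilled duplicates, so the visited
    // count equals the number of unique documents scored.
    static void scoreDeduplicated(int[] docs, float[] scores, Collector collector, BitSet visitedDocs) {
        int scoredDocs = 0;
        for (int i = 0; i < docs.length; i++) {
            if (visitedDocs.get(docs[i])) {
                continue; // skip a doc already scored via another cluster
            }
            visitedDocs.set(docs[i]);
            collector.collect(docs[i], scores[i]);
            scoredDocs++;
        }
        if (scoredDocs > 0) {
            collector.incVisitedCount(scoredDocs);
        }
    }

    // After (sketch): every vector in the block is scored, duplicates included,
    // so "visited" counts vector ops rather than unique documents.
    static void scoreWholeBlock(int[] docs, float[] scores, Collector collector) {
        for (int i = 0; i < docs.length; i++) {
            collector.collect(docs[i], scores[i]); // an overspilled doc may be scored twice
        }
        if (docs.length > 0) {
            collector.incVisitedCount(docs.length);
        }
    }
}

In the second variant, "visited" measures scoring work (vector ops) rather than unique documents, which matches how the count is surfaced in profiling.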
LGTM
Through our various benchmarking runs, I have noticed that we do a silly amount of work just handling duplicate vectors for overspill. When it comes to block scoring, it is likely much better to just score the duplicates and deduplicate later. This is indeed the case, and the performance gain grows as the number of vector ops increases.
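As a rough illustration of why this can win (a sketch under assumed names, not the PR's code): filtering duplicates inside the scoring loop adds a per-vector branch and bitset lookup, whereas scoring the whole block keeps the hot loop tight and defers deduplication to the much smaller candidate set.

import java.util.HashMap;
import java.util.Map;

class BlockScoringSketch {
    // Score every vector in a contiguous block, duplicates included. The inner
    // loop stays branch-free and bulk-friendly (amenable to SIMD in practice).
    static float[] scoreBlock(float[][] blockVectors, float[] query) {
        float[] scores = new float[blockVectors.length];
        for (int i = 0; i < blockVectors.length; i++) {
            float dot = 0f;
            for (int d = 0; d < query.length; d++) {
                dot += blockVectors[i][d] * query[d];
            }
            scores[i] = dot;
        }
        return scores;
    }

    // Deduplicate afterwards, over the candidate hits rather than every scored
    // vector: keep the best score per doc ID.
    static Map<Integer, Float> dedupe(int[] docs, float[] scores) {
        Map<Integer, Float> best = new HashMap<>();
        for (int i = 0; i < docs.length; i++) {
            best.merge(docs[i], scores[i], Math::max);
        }
        return best;
    }
}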
Multi-segment Cohere-wiki-768 8M
I ran every nProbe setting 5 times and picked the fastest run.
[Benchmark tables: CANDIDATE vs. BASELINE]
Single segment Cohere-wiki-1024 1M
My thinking was that larger vectors might make block scoring more expensive, so picking individual vectors would be better. Same methodology as above.
[Benchmark tables: Candidate vs. Baseline]