
Conversation

@benwtrent
Member

This PR is pretty basic: right now we don't enforce any ordering at all for our IVF postings lists.

It seems like we should, at a minimum, make sure they are in doc-ID order.

If we decide to change this in the future, at least we will have a consistent ordering.
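For illustration only (hypothetical names, not the actual Lucene writer code): enforcing doc-ID order on one posting list just means sorting the cluster's vector ordinals by the doc ID each ordinal maps to, via the usual ordinal-to-doc lookup.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;

class PostingListOrder {
    // Sort the vector ordinals of one cluster so the doc ids they map
    // to come out ascending. `ordToDoc` is the ordinal -> docId lookup.
    static int[] sortByDocId(int[] ordinals, IntUnaryOperator ordToDoc) {
        Integer[] boxed = Arrays.stream(ordinals).boxed().toArray(Integer[]::new);
        Arrays.sort(boxed, (a, b) ->
            Integer.compare(ordToDoc.applyAsInt(a), ordToDoc.applyAsInt(b)));
        int[] out = new int[boxed.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = boxed[i];
        }
        return out;
    }
}
```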

@elasticsearchmachine added the Team:Search Relevance label (meta label for the Search Relevance team in Elasticsearch) on Jun 18, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@benwtrent
Member Author

I did some benchmarking; it doesn't give us much space savings (yet), but it didn't hurt performance.

@iverase
Contributor

iverase commented Jun 18, 2025

The idea behind not sorting by docId was to keep the vectors that were not SOAR vectors together, so that bulk scoring is more effective.

@benwtrent
Member Author

I need to benchmark this in highly filtered scenarios (e.g. when we will search more centroids), to ensure this doesn't hurt search performance.

@benwtrent
Member Author

I ran some higher-recall filtered search scenarios and there is basically zero increase in query latency.

baseline:

index_name                      index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
------------------------------  ----------  --------  --------------  --------------------  ------------
cohere-wikipedia-docs-768d.vec         ivf   2000000          160346                243206             0
cohere-wikipedia-docs-768d.vec         ivf   2000000               0                     0             0
cohere-wikipedia-docs-768d.vec         ivf   2000000               0                     0             0

index_name                      index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall   visited  filter_selectivity
------------------------------  ----------  -------  -----------  ----------------  -------------  ------  ------  --------  ------------------
cohere-wikipedia-docs-768d.vec         ivf      200         3.68              0.00           0.00  271.74    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         3.69              0.00           0.00  271.00    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         3.65              0.00           0.00  273.97    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         3.68              0.00           0.00  271.74    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         3.68              0.00           0.00  271.74    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         6.26              0.00           0.00  159.74    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.19              0.00           0.00  192.68    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.19              0.00           0.00  192.68    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.48              0.00           0.00  182.48    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.28              0.00           0.00  189.39    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.23              0.00           0.00  191.20    0.94  12030.14                0.10
cohere-wikipedia-docs-768d.vec         ivf      200         5.18              0.00           0.00  193.05    0.94  12030.14                0.10
cohere-wikipedia-docs-768d.vec         ivf      200         5.13              0.00           0.00  194.93    0.94  12030.14                0.10
cohere-wikipedia-docs-768d.vec         ivf      200         5.56              0.00           0.00  179.86    0.94  12030.14                0.10
cohere-wikipedia-docs-768d.vec         ivf      200         5.03              0.00           0.00  198.81    0.94  12030.14                0.10

this PR:

index_name                      index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
------------------------------  ----------  --------  --------------  --------------------  ------------
cohere-wikipedia-docs-768d.vec         ivf   2000000          154027                237642             0
cohere-wikipedia-docs-768d.vec         ivf   2000000               0                     0             0
cohere-wikipedia-docs-768d.vec         ivf   2000000               0                     0             0

index_name                      index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall   visited  filter_selectivity
------------------------------  ----------  -------  -----------  ----------------  -------------  ------  ------  --------  ------------------
cohere-wikipedia-docs-768d.vec         ivf      200         3.80              0.00           0.00  263.16    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         3.75              0.00           0.00  266.67    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         3.96              0.00           0.00  252.53    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         3.68              0.00           0.00  271.74    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         3.70              0.00           0.00  270.27    0.94  46351.83                1.00
cohere-wikipedia-docs-768d.vec         ivf      200         5.34              0.00           0.00  187.27    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.17              0.00           0.00  193.42    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.21              0.00           0.00  191.94    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.18              0.00           0.00  193.05    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.17              0.00           0.00  193.42    0.94  29174.39                0.40
cohere-wikipedia-docs-768d.vec         ivf      200         5.10              0.00           0.00  196.08    0.94  12030.14                0.10
cohere-wikipedia-docs-768d.vec         ivf      200         5.07              0.00           0.00  197.24    0.94  12030.14                0.10
cohere-wikipedia-docs-768d.vec         ivf      200         5.17              0.00           0.00  193.42    0.94  12030.14                0.10
cohere-wikipedia-docs-768d.vec         ivf      200         5.08              0.00           0.00  196.85    0.94  12030.14                0.10
cohere-wikipedia-docs-768d.vec         ivf      200         5.08              0.00           0.00  196.85    0.94  12030.14                0.10

Comment on lines 104 to 152
// keeping them in the same file indicates we pull the entire file into cache
docIdsWriter.writeDocIds(j -> floatVectorValues.ordToDoc(cluster[j]), size, postingsOutput);
// write the primary doc ids, then the overspill doc ids, kept as separate runs
postingsOutput.writeGroupVInts(docIds, size);
postingsOutput.writeGroupVInts(spillDocIds, overspillCluster.length);
// write primary vectors
onHeapQuantizedVectors.reset(centroid, size, j -> cluster[finalOrds[j]]);
bulkWriter.writeVectors(onHeapQuantizedVectors);
// write overspill vectors
onHeapQuantizedVectors.reset(centroid, overspillCluster.length, j -> overspillCluster[finalSpillOrds[j]]);
bulkWriter.writeVectors(onHeapQuantizedVectors);
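One reason a consistent doc-ID order pairs well with the `writeGroupVInts` calls above: a sorted list can be stored as small gaps rather than absolute IDs, which variable-length integer encodings compress well. A minimal sketch of the delta step (an assumption for illustration, not what this writer necessarily does internally):

```java
class DocIdDeltas {
    // Turn an ascending doc-id list into gaps: the first entry keeps its
    // absolute value, every later entry stores the distance to its
    // predecessor. Sorted input guarantees the gaps are non-negative.
    static int[] toDeltas(int[] sortedDocIds) {
        int[] deltas = new int[sortedDocIds.length];
        int prev = 0;
        for (int i = 0; i < sortedDocIds.length; i++) {
            deltas[i] = sortedDocIds[i] - prev;
            prev = sortedDocIds[i];
        }
        return deltas;
    }
}
```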
Member Author

Is this sort of what you had in mind, @iverase?

Contributor

Yes. My idea is that we can use this information to first score the primary assignments of all the clusters we want to visit, so the posting lists are guaranteed unique and the visiting logic stays simple (and fast). Later we visit the spill assignments, where we would need more complex (slower) logic to skip already-visited posting lists. The downside of this approach is that it requires more hops in the posting-list files, which breaks the disk-friendly nature of this type of index a bit.

Do you think this is something doable? It complicates the search logic quite a bit, and maybe the benefits are too small. What do you think?
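A doc-level sketch of the two-phase idea above (hypothetical names; the real dedup would apply to posting lists and vectors rather than a per-doc set): primary assignments are unique across clusters, so phase one needs no checks, while spill assignments go through the slower already-seen path.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TwoPhaseVisit {
    static List<Integer> visitOrder(int[][] primary, int[][] spill) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        // Phase 1: primary assignments are unique, score them directly.
        for (int[] cluster : primary) {
            for (int doc : cluster) {
                order.add(doc);
                seen.add(doc);
            }
        }
        // Phase 2: spill assignments may repeat already-visited entries,
        // so they pay for a membership check before scoring.
        for (int[] cluster : spill) {
            for (int doc : cluster) {
                if (seen.add(doc)) {
                    order.add(doc);
                }
            }
        }
        return order;
    }
}
```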

Member Author

@iverase the main benefit of SOAR, and overspilling in general, is that fewer centroids (a lower nProbe) need to be gathered. I would expect us to score both the regular and the overspill assignments up to some fraction of nProbe.

Contributor

I am not saying that we would not score both, just that we would do one after the other: if there are no deletes, scoring the unique posting lists would be faster (no need to process docIds before scoring them). I can see that it gets hairy and would require different logic branches, which is not great.

If you see no performance impact, I think we can just order all docs. We can make the distinction later on if it is ever required.

Member Author

So, all the vectors are already in doc order. But when we combine the initial and secondary assignments into one grouping, that is when we get a partial ordering.

Each assignment array is already in vector-ordinal order, which also means it is already in doc-ID order.

This PR now just keeps them separate (no sorting required).
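Since each of the two arrays is already sorted, a reader that ever needs a single doc-ID-ordered stream can produce one with a plain two-pointer merge, with no sort at search time. A hypothetical sketch:

```java
class MergeSortedPostings {
    // Merge two ascending doc-id arrays (e.g. a cluster's primary and
    // overspill lists) into one ascending array in linear time.
    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }
}
```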

Overall, there don't seem to be significant performance gains.

I noticed slightly lower performance when no filters are provided.

I noticed higher performance when very restrictive filters are used with a high nProbe, but I don't really know why.

I wouldn't expect doc ID decoding to be a significant issue.

I am gonna leave this as it is and we can revisit it at a later time, unless you have a better intuition around this.

@benwtrent
Member Author

done elsewhere

@benwtrent benwtrent closed this Aug 13, 2025

Labels

>non-issue, :Search Relevance/Vectors (vector search), Team:Search Relevance (meta label for the Search Relevance team in Elasticsearch), v9.2.0
