@tteofili
Contributor

@tteofili tteofili commented Jul 25, 2025

This is a first attempt at dynamically adapting each segment's nProbe param to the distribution of the centroids with respect to a specific query vector.
It seeks to detect whether a query vector lies close to most of the nProbe centroids; if so, it might be worth exploring more centroids.

See #129290.

This patch also adds a merge_policy param to KnnIndexTester so that we can better test with different merge policies (for this patch it was useful to set "merge_policy" : "no", "force_merge" : false, to search with multiple segments).

@tteofili
Contributor Author

The first implementation calculates the percentage difference between the score of the closest centroid and the score of the nProbe-th centroid, and doubles nProbe when that difference is very small (< 0.01%).
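
A minimal sketch of that heuristic, for illustration only and not the code in this patch. The function name `adapt_n_probe`, the `threshold_pct` parameter, and the cap at the number of available centroids are hypothetical. It also assumes that a higher score means a closer centroid and that the difference is measured relative to the best score.

```python
from typing import Sequence


def adapt_n_probe(centroid_scores: Sequence[float], n_probe: int,
                  threshold_pct: float = 0.01) -> int:
    """Return n_probe, doubled when the top n_probe centroid scores are nearly tied.

    Assumptions (not taken from the patch): higher score = closer centroid,
    the gap is relative to the best score, and the result is capped at the
    number of centroids in the segment.
    """
    if n_probe <= 0 or not centroid_scores:
        return n_probe

    # Rank centroids from closest to farthest for this query.
    ranked = sorted(centroid_scores, reverse=True)
    k = min(n_probe, len(ranked))
    best, kth = ranked[0], ranked[k - 1]
    if best == 0:
        # Relative difference is undefined; leave n_probe unchanged.
        return n_probe

    # Percentage gap between the closest and the n_probe-th centroid.
    diff_pct = abs(best - kth) / abs(best) * 100.0

    # A tiny gap suggests the query is about equally close to all probed
    # centroids, so probing more of them may help recall.
    if diff_pct < threshold_pct:
        return min(2 * n_probe, len(ranked))
    return n_probe
```

With scores `[1.0, 0.99999, 0.99998, 0.5, 0.4]` and `n_probe=2`, the gap is about 0.001%, so the sketch returns 4. With `[0.9, 0.5, 0.1]` the gap is large and `n_probe` stays at 2.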

@tteofili
Contributor Author

For example, with multiple segments on the Wikipedia dataset with Cohere-1024 embeddings:

index_name       index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
---------------  ----------  --------  --------------  --------------------  ------------  
wiki1024en.docs         ivf   1000000           16278                     0            16

current

index_name       index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall    visited  filter_selectivity
---------------  ----------  -------  -----------  ----------------  -------------  ------  ------  ---------  ------------------  
wiki1024en.docs         ivf       10         4.78             26.85           5.62  209.21    0.98   91297.74                1.00
wiki1024en.docs         ivf       20         7.11             45.40           6.39  140.65    0.99  154945.14                1.00
wiki1024en.docs         ivf       30         9.13             59.32           6.50  109.53    0.99  200248.30                1.00
wiki1024en.docs         ivf       40        10.56             69.94           6.62   94.70    0.99  230295.45                1.00
wiki1024en.docs         ivf       50        12.24             76.11           6.22   81.70    0.99  254517.65                1.00
wiki1024en.docs         ivf       60        12.73             83.15           6.53   78.55    0.99  272494.89                1.00
wiki1024en.docs         ivf       70        13.05             85.80           6.57   76.63    0.99  285934.05                1.00
wiki1024en.docs         ivf       80        13.54             87.76           6.48   73.86    0.99  296885.67                1.00
wiki1024en.docs         ivf       90        14.23             89.82           6.31   70.27    0.99  306887.09                1.00
wiki1024en.docs         ivf      100        15.06             91.67           6.09   66.40    0.99  316524.32                1.00
wiki1024en.docs         ivf      200        22.81            106.47           4.67   43.84    1.00  397371.73                1.00
wiki1024en.docs         ivf      500        34.71            126.37           3.64   28.81    1.00  499179.57                1.00

candidate

index_name       index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall    visited  filter_selectivity
---------------  ----------  -------  -----------  ----------------  -------------  ------  ------  ---------  ------------------  
wiki1024en.docs         ivf       10         1.20              5.28           4.40  833.33    0.98   81883.08                1.00
wiki1024en.docs         ivf       20         1.40              7.96           5.69  714.29    0.99  146391.44                1.00
wiki1024en.docs         ivf       30         1.70              9.86           5.80  588.24    0.99  194012.73                1.00
wiki1024en.docs         ivf       40         2.04             12.07           5.92  490.20    0.99  234710.38                1.00
wiki1024en.docs         ivf       50         2.41             14.57           6.05  414.94    0.99  261959.87                1.00
wiki1024en.docs         ivf       60         2.43             15.14           6.23  411.52    0.99  280331.17                1.00
wiki1024en.docs         ivf       70         2.55             15.63           6.13  392.16    0.99  294235.55                1.00
wiki1024en.docs         ivf       80         2.66             15.98           6.01  375.94    0.99  305556.75                1.00
wiki1024en.docs         ivf       90         2.83             16.56           5.85  353.36    1.00  315484.57                1.00
wiki1024en.docs         ivf      100         2.85             17.05           5.98  350.88    1.00  325032.13                1.00
wiki1024en.docs         ivf      200         4.26             19.03           4.47  234.74    1.00  404936.88                1.00
wiki1024en.docs         ivf      500         8.59             24.23           2.82  116.41    1.00  499500.00                1.00

@tteofili
Contributor Author

This is being superseded by #132396.

@tteofili tteofili closed this Aug 21, 2025