tteofili
Contributor

@tteofili tteofili commented Aug 4, 2025

As discussed here, we might favor or penalize exploration of certain segments based on query-to-segment affinity.
This PR does that by leveraging information about segment density (vectors per cluster) and query-to-global-centroid similarity.
Segments with higher affinity get an increased visited_ratio, whereas segments with lower affinity get a decreased one; optionally, segments with very low affinity might not be explored at all.
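
To make the three cases above concrete, here is a minimal sketch of a per-segment adjustment. It is not the code in this PR: the class name, the thresholds, and the multipliers are all hypothetical, and the affinity score is assumed to be already computed and normalized to [0, 1].

```java
public class AffinityVisitRatioSketch {
    // Hypothetical thresholds and multipliers, chosen only for illustration.
    static final float SKIP_THRESHOLD = 0.05f;
    static final float LOW_THRESHOLD = 0.3f;
    static final float HIGH_THRESHOLD = 0.7f;
    static final float LOW_SCALE = 0.5f;
    static final float HIGH_SCALE = 1.5f;

    /**
     * Returns the visit ratio to use for one segment.
     * A return value of 0 means the segment is not explored.
     */
    static float adjust(float visitRatio, float affinity) {
        if (affinity < SKIP_THRESHOLD) {
            return 0f; // very low affinity: optionally skip the segment
        }
        if (affinity <= LOW_THRESHOLD) {
            return visitRatio * LOW_SCALE; // low affinity: explore less
        }
        if (affinity >= HIGH_THRESHOLD) {
            return Math.min(visitRatio * HIGH_SCALE, 1f); // high affinity: explore more, capped at 100%
        }
        return visitRatio; // middle band: leave the configured ratio alone
    }

    public static void main(String[] args) {
        float configured = 0.1f;
        for (float affinity : new float[] {0.01f, 0.2f, 0.5f, 0.9f}) {
            System.out.printf("affinity=%.2f -> visit ratio %.3f%n", affinity, adjust(configured, affinity));
        }
    }
}
```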

@tteofili tteofili changed the title DiskBBQ - Adapt nProbe based on query - segment affinity in multi segment scenario DiskBBQ - Adapt visited_ratio based on query - segment affinity in multi segment scenario Aug 21, 2025
@benwtrent
Member

I benchmarked this change, just using regular indexing & merge policy. I wonder if there is a bug? For very low percentage visits (1% or less), it's worse.

Then for higher visit percentages, it's only marginally better.

I also benchmarked with a 10% selectivity filter, and there is zero difference between the two implementations.

index_name                      index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
------------------------------  ----------  --------  --------------  --------------------  ------------
cohere-wikipedia-docs-768d.vec         ivf   8000000               0                     0            24
corpus-ann-gist-1M.fvec                ivf   1000000               0                     0            21
wiki1024en.train                       ivf   1000000               0                     0            18

BASELINE

index_name                      index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall    visited  filter_selectivity
------------------------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ---------  ------------------
cohere-wikipedia-docs-768d.vec         ivf                 0.50         4.24              0.00           0.00  235.85    0.91   87476.40                1.00
cohere-wikipedia-docs-768d.vec         ivf                 1.00         6.92              0.00           0.00  144.51    0.93  166970.99                1.00
cohere-wikipedia-docs-768d.vec         ivf                 2.00        10.91              0.00           0.00   91.66    0.94  327095.79                1.00
corpus-ann-gist-1M.fvec                ivf                 0.50         1.23              0.00           0.00  813.01    0.88   15789.88                1.00
corpus-ann-gist-1M.fvec                ivf                 1.00         1.69              0.00           0.00  591.72    0.90   25413.18                1.00
corpus-ann-gist-1M.fvec                ivf                 2.00         2.39              0.00           0.00  418.41    0.90   45134.49                1.00
corpus-ann-gist-1M.fvec                ivf                 3.00         3.16              0.00           0.00  316.46    0.90   64945.82                1.00
corpus-ann-gist-1M.fvec                ivf                 4.00         4.01              0.00           0.00  249.38    0.90   85914.20                1.00
corpus-ann-gist-1M.fvec                ivf                 5.00         4.54              0.00           0.00  220.26    0.90  105406.93                1.00
wiki1024en.train                       ivf                 0.50         1.02              0.00           0.00  980.39    0.79   17694.47                1.00
wiki1024en.train                       ivf                 1.00         1.26              0.00           0.00  793.65    0.84   26217.51                1.00
wiki1024en.train                       ivf                 2.00         1.90              0.00           0.00  526.32    0.90   47006.85                1.00
wiki1024en.train                       ivf                 3.00         2.43              0.00           0.00  411.52    0.93   66709.42                1.00
wiki1024en.train                       ivf                 4.00         3.06              0.00           0.00  326.80    0.94   86651.58                1.00
wiki1024en.train                       ivf                 5.00         3.64              0.00           0.00  274.73    0.95  106544.51                1.00

CANDIDATE:

index_name                      index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall    visited  filter_selectivity
------------------------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ---------  ------------------
cohere-wikipedia-docs-768d.vec         ivf                 0.50         6.68              0.00           0.00  149.70    0.92  129804.80                1.00
cohere-wikipedia-docs-768d.vec         ivf                 1.00         8.48              0.00           0.00  117.92    0.93  166970.99                1.00
cohere-wikipedia-docs-768d.vec         ivf                 2.00        10.36              0.00           0.00   96.53    0.94  241959.52                1.00
corpus-ann-gist-1M.fvec                ivf                 0.50         1.89              0.00           0.00  529.10    0.90   23801.30                1.00
corpus-ann-gist-1M.fvec                ivf                 1.00         2.05              0.00           0.00  487.80    0.90   25413.18                1.00
corpus-ann-gist-1M.fvec                ivf                 2.00         2.06              0.00           0.00  485.44    0.90   28470.45                1.00
corpus-ann-gist-1M.fvec                ivf                 3.00         2.88              0.00           0.00  347.22    0.90   40225.97                1.00
corpus-ann-gist-1M.fvec                ivf                 4.00         3.13              0.00           0.00  319.49    0.90   52354.58                1.00
corpus-ann-gist-1M.fvec                ivf                 5.00         3.66              0.00           0.00  273.22    0.90   63736.07                1.00
wiki1024en.train                       ivf                 0.50         1.20              0.00           0.00  833.33    0.81   20492.74                1.00
wiki1024en.train                       ivf                 1.00         1.53              0.00           0.00  653.59    0.84   26217.51                1.00
wiki1024en.train                       ivf                 2.00         1.83              0.00           0.00  546.45    0.88   38854.70                1.00
wiki1024en.train                       ivf                 3.00         2.40              0.00           0.00  416.67    0.91   55507.61                1.00
wiki1024en.train                       ivf                 4.00         3.34              0.00           0.00  299.40    0.93   71875.22                1.00
wiki1024en.train                       ivf                 5.00         3.45              0.00           0.00  289.86    0.94   87559.65                1.00

@john-wagster
Contributor

> I benchmarked this change, just using regular indexing & merge policy. I wonder if there is a bug? For very low percentage visits (1% or less), it's worse.

This makes sense. As the PR currently stands, we floor the visit ratio at 1%. So at a configured 1% we explore every segment at a ratio of at least 1%, and possibly more, which means we're exploring too much. I can try to deal with this by skipping this logic entirely when the configured ratio is below, say, 5%. At 1% it would then revert to the behavior on main.

        // for low affinity scores, decrease visited ratio
        if (affinityScore <= affinityThreshold) {
            return Math.max(visitRatio * 0.5f, 0.01f);
        }
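
As a rough illustration of why the floor hurts at small ratios, the sketch below mirrors the low-affinity branch quoted above and adds the guard described in this comment. The class name, method names, and the 5% cutoff parameter are assumptions for illustration, not the PR's code. With a configured ratio of 0.5% the floor raises the ratio to 1%, and at 1% the branch leaves it unchanged, which is at least consistent with the candidate visiting more vectors at 0.5% and the same number at 1% in the tables above.

```java
public class VisitRatioFloorSketch {
    // Constants taken from the quoted snippet.
    static final float LOW_AFFINITY_SCALE = 0.5f;
    static final float MIN_VISIT_RATIO = 0.01f;

    // Mirrors the quoted low-affinity branch: halve the ratio, but never go below 1%.
    static float lowAffinityRatio(float visitRatio) {
        return Math.max(visitRatio * LOW_AFFINITY_SCALE, MIN_VISIT_RATIO);
    }

    // Hypothetical guard: below the cutoff, keep the configured ratio untouched (behavior on main).
    static float guardedLowAffinityRatio(float visitRatio, float cutoff) {
        if (visitRatio < cutoff) {
            return visitRatio;
        }
        return lowAffinityRatio(visitRatio);
    }

    public static void main(String[] args) {
        for (float configured : new float[] {0.005f, 0.01f, 0.02f, 0.05f}) {
            System.out.printf(
                "configured=%.3f floored=%.4f guarded=%.4f%n",
                configured,
                lowAffinityRatio(configured),
                guardedLowAffinityRatio(configured, 0.05f)
            );
        }
    }
}
```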

> Then for higher visit percentages, it's only marginally better.

That surprises me given what @tteofili was showing me. Let me see if I can replicate and dig into those numbers.

> I also benchmarked with a 10% selectivity filter, and there is zero difference between the two implementations.

I'll try to run with this as well.

…tio, lower all thresholds to favor smaller visit ratios
@john-wagster
Contributor

john-wagster commented Aug 25, 2025

I cleaned up the magic numbers a bit. I'm seeing decent improvements with dbpedia. I'll run some additional datasets here later tonight.

I moved the cutoff thresholds to below half a percent and lowered the min affinity from 1% to 0.1%, so you should see an actual impact at the levels you were testing, @benwtrent.

dbpedia 4m (21-24 segments produced)

# baseline
index_name                             index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall     visited  filter_selectivity
-------------------------------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ----------  ------------------  
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 0.50         1.83              0.00           0.00  546.45    0.62    53792.41                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 1.00         3.04              0.00           0.00  329.22    0.70   100197.52                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 2.00         3.88              0.00           0.00  257.40    0.78   193119.21                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 3.00         4.86              0.00           0.00  205.87    0.81   285863.72                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 4.00         6.38              0.00           0.00  156.74    0.84   378586.81                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 5.00         7.26              0.00           0.00  137.65    0.86   471071.43                1.00

# candidate
index_name                             index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall     visited  filter_selectivity
-------------------------------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ----------  ------------------  
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 0.50         1.61              0.00           0.00  622.08    0.61    44239.54                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 1.00         2.87              0.00           0.00  348.74    0.68    78858.63                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 2.00         3.35              0.00           0.00  298.73    0.75   149642.79                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 3.00         4.60              0.00           0.00  217.27    0.79   219955.92                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 4.00         5.34              0.00           0.00  187.35    0.81   290437.33                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 5.00         5.64              0.00           0.00  177.38    0.84   360858.71                1.00

@john-wagster
Contributor

john-wagster commented Aug 26, 2025

I ran dbpedia again, this time with the same ingest. The numbers still look good overall, but some of them are confusing: low visit percentages have poor QPS even though they visit less, which doesn't make any sense. I'll look into that further:

# baseline
index_name                             index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
-------------------------------------  ----------  --------  --------------  --------------------  ------------  
corpus-dbpedia-entity-E5-small-0.fvec         ivf  10000000               0                     0            24

index_name                             index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall     visited  filter_selectivity
-------------------------------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ----------  ------------------  
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 0.50         1.79              0.00           0.00  558.66    0.63    54701.69                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 1.00         2.42              0.00           0.00  413.22    0.71   101428.57                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 2.00         3.58              0.00           0.00  279.72    0.78   194028.14                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 3.00         4.80              0.00           0.00  208.22    0.82   286612.36                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 4.00         5.89              0.00           0.00  169.78    0.84   379281.03                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 5.00         7.11              0.00           0.00  140.75    0.86   472046.09                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                10.00        12.31              0.00           0.00   81.27    0.91   935829.19                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                30.00        33.58              0.00           0.00   29.78    0.96  2790057.65                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                50.00        54.65              0.00           0.00   18.30    0.97  4644486.01                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                70.00        75.55              0.00           0.00   13.24    0.98  6498687.84                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf               100.00       107.76              0.00           0.00    9.28    0.98  9271902.00                1.00


# candidate
index_name                             index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
-------------------------------------  ----------  --------  --------------  --------------------  ------------  
corpus-dbpedia-entity-E5-small-0.fvec         ivf  10000000               0                     0            24

index_name                             index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall     visited  filter_selectivity
-------------------------------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ----------  ------------------  
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 0.50         2.14              0.00           0.00  466.74    0.61    44239.54                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 1.00         2.15              0.00           0.00  465.12    0.68    78858.63                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 2.00         3.04              0.00           0.00  328.68    0.75   149642.79                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 3.00         4.01              0.00           0.00  249.07    0.79   219955.92                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 4.00         4.86              0.00           0.00  205.97    0.81   290437.33                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 5.00         5.55              0.00           0.00  180.10    0.84   360858.71                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                10.00         9.81              0.00           0.00  101.88    0.89   713512.61                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                30.00        26.00              0.00           0.00   38.47    0.94  2123247.04                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                50.00        42.69              0.00           0.00   23.42    0.96  3533203.26                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                70.00        58.48              0.00           0.00   17.10    0.97  4943038.51                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf               100.00        83.27              0.00           0.00   12.01    0.98  7053592.12                1.00
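
The mismatch is easier to see as relative changes. The sketch below only does arithmetic on the 0.50% and 1.00% rows copied from the two tables above; the class and helper names are made up for illustration.

```java
public class RelativeChangeSketch {
    // Percentage change going from the baseline value to the candidate value.
    static double pctChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Columns: visit %, baseline QPS, candidate QPS, baseline visited, candidate visited.
        double[][] rows = {
            {0.50, 558.66, 466.74, 54701.69, 44239.54},
            {1.00, 413.22, 465.12, 101428.57, 78858.63},
        };
        for (double[] r : rows) {
            System.out.printf(
                "visit=%.2f%% QPS %+.1f%% visited %+.1f%%%n",
                r[0], pctChange(r[1], r[2]), pctChange(r[3], r[4])
            );
        }
    }
}
```

At 0.50% the candidate visits roughly 19% fewer vectors yet QPS drops by roughly 16%, whereas at 1.00% fewer visits do come with higher QPS.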

However, when I started to run cohere I got results similar to what you originally reported, @benwtrent. With filtering I'm seeing very little difference between candidate and baseline, and without filtering the baseline sometimes does better. I'm going to run it a few more times tomorrow. I see a lot of variation with the default merge policy across a few runs I did, which was a little surprising. I'll also see whether anything can be improved here and post some additional numbers.

@benwtrent
Member

> I see a lot of variation with the default merge policy across a few runs I did, which was a little surprising.

The concurrent merge scheduler may or may not kick off a merge, depending on thread availability and memory pressure. So it's not totally surprising that the number of segments jumps between tier sizes (e.g. 5-10 in count).

@tteofili
Contributor Author

tteofili commented Sep 5, 2025

I managed to stabilize the results so that this works both with the default (tiered) merge policy and with no merge policy, without breaking recall.
One problem in the earlier implementation was that density was used to boost affinity, whereas the number of vectors in a segment is a more important factor, especially when segment sizes have a higher variance (e.g., the typical log-like distributions produced by the tiered merge policy).
As a side note, to get comparable runs the index has to be the same (hence I had to set reindex=false in the JSON config).

Cohere

Main

index_name       index_type  num_docs  index_time(ms)  force_merge_time(ms)  num_segments
---------------  ----------  --------  --------------  --------------------  ------------  
wiki1024en.docs         ivf   1000000               0                     0            18

index_name       index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall    visited  filter_selectivity
---------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  ---------  ------------------  
wiki1024en.docs         ivf                 1.00         0.90              2.97           3.30  1111.11    0.91   30116.00                1.00
wiki1024en.docs         ivf                 5.00         0.85              3.77           4.44  1176.47    0.97   59804.29                1.00
wiki1024en.docs         ivf                10.00         1.21              5.23           4.32   826.45    0.99  108649.49                1.00
wiki1024en.docs         ivf                30.00         2.23              9.35           4.19   448.43    1.00  307106.92                1.00
wiki1024en.docs         ivf                50.00         3.42             13.91           4.07   292.40    1.00  506324.92                1.00
wiki1024en.docs         ivf                70.00         4.93             19.10           3.87   202.84    1.00  704843.20                1.00
wiki1024en.docs         ivf               100.00         6.44             25.29           3.93   155.28    1.00  998499.00                1.00

Candidate

index_name       index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall    visited  filter_selectivity
---------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  ---------  ------------------  
wiki1024en.docs         ivf                 1.00         0.77              2.82           3.66  1298.70    0.91   30118.81                1.00
wiki1024en.docs         ivf                 5.00         0.82              3.44           4.20  1219.51    0.97   55186.97                1.00
wiki1024en.docs         ivf                10.00         1.12              4.11           3.67   892.86    0.98   98530.78                1.00
wiki1024en.docs         ivf                30.00         2.21              7.05           3.19   452.49    1.00  274222.32                1.00
wiki1024en.docs         ivf                50.00         3.38             10.53           3.12   295.86    1.00  451666.17                1.00
wiki1024en.docs         ivf                70.00         4.66             13.67           2.93   214.59    1.00  628783.98                1.00
wiki1024en.docs         ivf               100.00         6.54             18.75           2.87   152.91    1.00  891313.12                1.00

DBPedia

Main

index_name                             index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall     visited  filter_selectivity
-------------------------------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ----------  ------------------  
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 1.00         2.01              0.00           0.00  498.13    0.67    99594.35                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 5.00         6.45              0.00           0.00  155.10    0.84   470410.90                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                10.00        11.79              0.00           0.00   84.84    0.89   933995.92                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                30.00        33.64              0.00           0.00   29.73    0.93  2788274.12                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                50.00        53.81              0.00           0.00   18.58    0.94  4642552.50                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                70.00        74.87              0.00           0.00   13.36    0.95  6496735.59                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf               100.00       108.33              0.00           0.00    9.23    0.95  9271553.00                1.00

Candidate

index_name                             index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall     visited  filter_selectivity
-------------------------------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ----------  ------------------  
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 1.00         1.76              0.00           0.00  568.18    0.66    95574.59                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                 5.00         5.92              0.00           0.00  168.99    0.84   450293.62                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                10.00        11.27              0.00           0.00   88.69    0.89   893695.94                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                30.00        31.63              0.00           0.00   31.62    0.93  2667241.42                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                50.00        52.16              0.00           0.00   19.17    0.94  4440804.43                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf                70.00        71.50              0.00           0.00   13.99    0.95  6214189.25                1.00
corpus-dbpedia-entity-E5-small-0.fvec         ivf               100.00        99.96              0.00           0.00   10.00    0.95  8722690.54                1.00

The final result is that the reduction in visited_nodes is less evident, so while the idea seemed interesting, it's currently probably not worth the complexity.
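
To put a rough number on "less evident", the sketch below compares the reduction in visited vectors at a 100% visit percentage, using values copied from the tables in this thread. The class and helper names are made up for illustration, and the runs used different indices, so the comparison is only indicative.

```java
public class VisitedReductionSketch {
    // Percentage by which the candidate reduces the visited count relative to the baseline.
    static double reductionPct(double baselineVisited, double candidateVisited) {
        return (baselineVisited - candidateVisited) / baselineVisited * 100.0;
    }

    public static void main(String[] args) {
        String[] runs = {"dbpedia, Aug 26", "dbpedia, Sep 5", "wiki1024en.docs, Sep 5"};
        // Columns: baseline (main) visited, candidate visited, both at visit_percentage 100.
        double[][] visited = {
            {9271902.00, 7053592.12},
            {9271553.00, 8722690.54},
            {998499.00, 891313.12},
        };
        for (int i = 0; i < runs.length; i++) {
            System.out.printf(
                "%-24s visited reduced by %.1f%%%n",
                runs[i], reductionPct(visited[i][0], visited[i][1])
            );
        }
    }
}
```

On dbpedia the reduction goes from about 24% in the Aug 26 run to about 6% in the Sep 5 run; the Sep 5 wiki1024en.docs run shows about 11%.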

@tteofili tteofili closed this Sep 5, 2025

Labels

>enhancement, :Search Relevance/Vectors (Vector search), Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch), v9.2.0

5 participants