Investigate BBQ quantizing centroids & rescoring

### Description

Digging into https://github.com/elastic/elasticsearch/pull/129950, it's tricky to get absolutely correct.

As a sibling issue to improve centroid vector scoring (which could be done in addition to indexing the centroids), we should consider quantizing centroids to a single bit and reranking with 4-bit quantization.

When the number of centroids gets large, I would expect this to have a pretty significant impact. 

Some things to consider:

 - The smallest centroid quantization size should be dictated by the smallest size used in the postings lists. That is, if the postings lists are 2-bit, centroids should be 2-bit.
 - We will definitely need to oversample and rescore. We can do something simple and use 3x oversampling by default for single bit, then do something cleverer later.
 - The initial bit quantizations should be 1-bit and 4-bit. Though, once we get to 2-bit, I think we should just jump to 7-bit for higher fidelity.
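
To make the oversample-and-rescore flow above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how BBQ quantizes in Elasticsearch: the 1-bit step is plain sign quantization, the 4-bit step is uniform scalar quantization over a shared range, and all function names (`quantize_1bit`, `quantize_4bit`, `nearest_centroids`) are hypothetical. Only the 3x oversampling default comes from the list above.

```python
import heapq

OVERSAMPLE = 3  # default 3x oversampling for single bit, as suggested above


def quantize_1bit(vec):
    """Pack component signs into an int: bit i is set if vec[i] > 0.

    Assumption: plain sign quantization, used here only for illustration.
    """
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def quantize_4bit(vec, lo, hi):
    """Uniform scalar quantization of each component to an integer in 0..15."""
    scale = 15.0 / (hi - lo) if hi > lo else 0.0
    return [min(15, max(0, round((x - lo) * scale))) for x in vec]


def dequantize_4bit(codes, lo, hi):
    """Map 4-bit codes back to approximate float values."""
    step = (hi - lo) / 15.0
    return [lo + c * step for c in codes]


def nearest_centroids(query, centroids, k, oversample=OVERSAMPLE):
    """Return indices of the k best centroids: 1-bit coarse pass, 4-bit rescore."""
    dims = len(query)
    lo = min(min(c) for c in centroids)
    hi = max(max(c) for c in centroids)
    one_bit = [quantize_1bit(c) for c in centroids]
    four_bit = [quantize_4bit(c, lo, hi) for c in centroids]

    # Coarse pass: score by the number of sign bits that agree with the query.
    q_bits = quantize_1bit(query)

    def coarse(i):
        return dims - bin(q_bits ^ one_bit[i]).count("1")

    n_candidates = min(len(centroids), k * oversample)
    candidates = heapq.nlargest(n_candidates, range(len(centroids)), key=coarse)

    # Rescore only the oversampled candidates with the 4-bit approximation.
    def fine(i):
        approx = dequantize_4bit(four_bit[i], lo, hi)
        return sum(q * a for q, a in zip(query, approx))

    return heapq.nlargest(k, candidates, key=fine)


centroids = [
    [1.0, 1.0, 1.0, 1.0],
    [-1.0, -1.0, -1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
    [0.9, 0.8, 1.0, 0.7],
]
top = nearest_centroids([1.0, 1.0, 1.0, 1.0], centroids, k=2)
```

In a real implementation the quantized centroids would be computed once at index time rather than per query; they are built inside `nearest_centroids` here only to keep the sketch self-contained.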

Investigate BBQ quantizing centroids & rescoring #131234