Can we support vectors to be loaded with direct I/O for full precision re-ranking? #14746

@dungba88

Description

Spin-off from the discussion in #14708. One concern with full-precision (FP) re-ranking of quantized vectors is that an off-heap vector reader will page in the FP vector data, which can then compete for memory with the quantized vector data used for HNSW graph search. Since HNSW performance suffers greatly when its vectors are not in memory, for instance when memory is limited, can we support a mode that loads the FP vectors with direct I/O? (Or is this already possible?)

For integrating with the existing quantized vectors codec, is my understanding correct that we would need to create a new codec/vector reader that extends the existing reader and uses a different raw vector format?

I can try this, but I am wondering what the community thinks about it. Are there other use cases that need an on-heap direct I/O vector reader as well?
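
To make the access pattern concrete, here is a minimal sketch of the two-phase search described above. It is written in Python for brevity and is not Lucene code. All names (`write_vectors`, `quantize`, `read_fp_vector`, `search_with_rerank`) and the crude int8 quantization are hypothetical, and a brute-force scan stands in for the HNSW graph search. The quantized vectors stay in memory, while each FP vector is fetched by an explicit positioned read only for the candidates being re-ranked. Note that the plain reads used here still go through the OS page cache; an actual direct I/O reader would have to open the file with something like `O_DIRECT` and use aligned buffers, which this sketch leaves out.

```python
import os
import struct
import tempfile

DIM = 4                   # vector dimension (assumed for the example)
VEC_BYTES = DIM * 4       # float32 little-endian, fixed-size records


def write_vectors(path, vectors):
    """Write full-precision vectors as fixed-size float32 records."""
    with open(path, "wb") as f:
        for v in vectors:
            f.write(struct.pack(f"<{DIM}f", *v))


def quantize(v):
    """Crude scalar quantization to int8 range, assuming values in [-1, 1]."""
    return [max(-127, min(127, round(x * 127))) for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def read_fp_vector(f, ordinal):
    """Fetch one full-precision vector by ordinal with an explicit read.

    This is the call a direct I/O reader would replace; here it is an
    ordinary (page-cached) read.
    """
    f.seek(ordinal * VEC_BYTES)
    return struct.unpack(f"<{DIM}f", f.read(VEC_BYTES))


def search_with_rerank(query, quantized, f, k, oversample):
    """Phase 1: score in-memory quantized vectors (stand-in for HNSW).
    Phase 2: re-score only the top k * oversample candidates with FP data."""
    qq = quantize(query)
    candidates = sorted(
        range(len(quantized)),
        key=lambda i: dot(qq, quantized[i]),
        reverse=True,
    )[: k * oversample]
    rescored = sorted(
        ((dot(query, read_fp_vector(f, i)), i) for i in candidates),
        reverse=True,
    )
    return [i for _, i in rescored[:k]]


if __name__ == "__main__":
    vectors = [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]
    path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
    write_vectors(path, vectors)
    quantized = [quantize(v) for v in vectors]
    with open(path, "rb", buffering=0) as f:
        print(search_with_rerank([1.0, 0.0, 0.0, 0.0], quantized, f, k=2, oversample=2))
```

The point of the sketch is that the FP file is touched only `k * oversample` times per query, at random offsets, which is the kind of access where bypassing the page cache might be worthwhile.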
