Skip to content

Investigate dynamic biased connections for HNSW when using index sorting #15471

@benwtrent

Description

@benwtrent

Description

When index sorting is configured, its generally assumed that the search requests will be filtered according to that sort criteria.

Currently, HNSW is "filter agnostic", and does nothing around handling connections that may or may not be included in some future filter.

Could we add additional "biased" connections within the lowest level of the HNSW graph? These biased connections would be in addition to the true nearest neighbor connections, but would be required to be within some ordinal range.

For example, consider vector ordinal 0, which is connected to the traditionally nearest neighbors of 148, 511, 513, 1044, 1667. In addition to these connections, we have a restriction that each ordinal must consider nearest neighbors within an ordinal range of 100. Now vector 0 will have new connections 10, 11, 56, 148, 511, 513, 1044, 1667.

Since the index is sorted according to a commonly filtered criteria, it is likely that ordinals that are near each other would also pass the same filters (I suppose it depends on cardinality, etc....).

QDrant does something sort of like this, but they have the luxury of knowing filter criteria directly: (https://qdrant.tech/articles/filtrable-hnsw/)

I think we can substitute index sorting to apply additional connections in a similar, but maybe not as robust way.

Combined with ACORN (which would pair very well with this), this could really help heavily filtered search when the criteria matches the index sort.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions