64 changes: 39 additions & 25 deletions qdrant-landing/content/documentation/concepts/indexing.md
@@ -288,6 +288,45 @@ The HNSW parameters can also be configured on a collection and named vector
level by setting [`hnsw_config`](/documentation/concepts/indexing/#vector-index) to fine-tune search
performance.

### Filterable HNSW Index

On their own, neither a payload index nor a vector index can fully address the challenges of filtered search.

For weak filters, which still match most points, you can use the HNSW index as it is.
For strict filters, which match only a few points, you can use the payload index and do a complete rescore.
However, for cases in the middle, neither approach works well.
On the one hand, a full scan is too slow when many vectors match the filter.
On the other hand, the HNSW graph starts to fall apart when the filter is too strict.

![HNSW fail](/docs/precision_by_m.png)

<!-- ![hnsw graph](/docs/graph.gif) -->

Qdrant solves this problem by extending the HNSW graph with additional edges based on indexed payload values.
Extra edges allow you to efficiently search for nearby vectors using the HNSW index and apply filters as you search in the graph.
You can find more information on this approach in our [article](/articles/filterable-hnsw/).

#### The ACORN Search Algorithm

*Available as of v1.16.0*

In some cases, the additional edges built for Qdrant's filterable HNSW may not be sufficient.
These extra edges are added for each payload index separately, but not for every possible combination of payload indices.
As a result, a combination of two or more strict filters might still lead to disconnected graph components.
The same can happen when there are a large number of soft-deleted points in the graph.
In such cases, use the [ACORN Search Algorithm](/documentation/concepts/search/#acorn-search-algorithm).
During graph traversal, ACORN explores not only direct neighbors (first hop) but also neighbors of neighbors (second hop) when direct neighbors are filtered out. This improves search accuracy at the cost of performance.
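
The two-hop idea can be illustrated with a toy adjacency list. This is a simplified sketch, not Qdrant's implementation: the `expand_candidates` function, the `graph` dictionary, and the `allowed` set are all hypothetical names chosen for illustration, and a real traversal would also order candidates by distance and walk the HNSW layers.

```python
def expand_candidates(graph, node, passes_filter):
    """Collect filter-passing candidates reachable from `node`.

    A direct neighbor (first hop) that passes the filter is kept.
    When a direct neighbor is filtered out, its own neighbors
    (second hop) are checked in its place.
    """
    candidates = []
    added = {node}  # never return the start node or duplicates

    def consider(candidate):
        if candidate not in added and passes_filter(candidate):
            added.add(candidate)
            candidates.append(candidate)

    for neighbor in graph[node]:
        if passes_filter(neighbor):
            consider(neighbor)
        else:
            # First hop is filtered out: fall back to the second hop.
            for second_hop in graph[neighbor]:
                consider(second_hop)
    return candidates


# Hypothetical toy graph: node 2 is filtered out, which would cut off
# nodes 4 and 5 if only direct neighbors were explored.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 4, 5], 3: [1], 4: [2], 5: [2]}
allowed = {0, 1, 4, 5}

print(expand_candidates(graph, 0, lambda n: n in allowed))  # [1, 4, 5]
```

With first-hop exploration only, node 0 would see just node 1. The second hop through the filtered-out node 2 recovers nodes 4 and 5, at the price of inspecting more nodes per step.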

#### Disable the Creation of Extra Edges for Payload Fields

*Available as of v1.17.0*

Not all payload indices may be intended for use with dense vector search. For example, when a collection contains both dense and sparse vectors, some payload fields may only be used to filter sparse vector searches. Since sparse vector search does not use the HNSW index, it is unnecessary to build extra edges in the HNSW graph for these fields. Creating extra edges adds indexing latency and increases the size of the HNSW graph, which consumes memory as well as disk space, so you may want to disable it for fields that do not require it.

You can disable the creation of extra edges for an indexed payload field by setting `enable_hnsw` to `false` when configuring a payload index:

{{< code-snippet path="/documentation/headless/snippets/create-payload-index/disable-hnsw/" >}}
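
For illustration, the same REST request can be assembled with only the Python standard library. This is a minimal sketch under stated assumptions: `build_index_request` is a hypothetical helper, `my_collection` and `category` are placeholder names, and the base URL assumes a local instance listening on the default REST port 6333. The request body mirrors the HTTP form of the snippet.

```python
import json
import urllib.request


def build_index_request(base_url, collection_name, field_name):
    """Build (without sending) the PUT request that creates a keyword
    payload index with extra HNSW edges disabled."""
    body = {
        "field_name": field_name,
        "field_schema": {
            "type": "keyword",
            "enable_hnsw": False,  # do not build extra HNSW edges for this field
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/collections/{collection_name}/index",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder values, assumed for this example only.
request = build_index_request("http://localhost:6333", "my_collection", "category")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the payload before it reaches the server.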

## Sparse Vector Index

*Available as of v1.7.0*
@@ -340,28 +379,3 @@ Where:

- `N` is the total number of documents in the collection.
- `n` is the number of documents containing non-zero values for the given vector element.

## Filterable Index

Separately, a payload index and a vector index cannot solve the problem of search using the filter completely.

In the case of high-selectivity (weak) filters, you can use the HNSW index as it is.
In the case of low-selectivity (strict) filters, you can use the payload index and complete rescore.

However, for cases in the middle, this approach does not work well.
On the one hand, we cannot apply a full scan on too many vectors.
On the other hand, the HNSW graph starts to fall apart when using too strict filters.

![HNSW fail](/docs/precision_by_m.png)

<!-- ![hnsw graph](/docs/graph.gif) -->

Qdrant solves this problem by extending the HNSW graph with additional edges based on the stored payload values.
Extra edges allow you to efficiently search for nearby vectors using the HNSW index and apply filters as you search in the graph.
You can find more information on this approach in our [article](/articles/filterable-hnsw/).

However, in some cases, these additional edges might not be enough.
These extra edges are added per each payload index separately, but not per each possible combination of them.
So, a combination of two or more strict filters still might lead to disconnected graph components.
The same may happen when having a large number of soft-deleted points in the graph.
In such cases, the [ACORN Search Algorithm](/documentation/concepts/search/#acorn-search-algorithm) can be used.
@@ -0,0 +1 @@
This code snippet demonstrates how to create a payload index for a keyword field while disabling the creation of extra HNSW graph edges for that index by setting the `enable_hnsw` parameter to `false`.
@@ -0,0 +1,23 @@
using Qdrant.Client;
using Qdrant.Client.Grpc;

public class Snippet
{
    public static async Task Run()
    {
        var client = new QdrantClient("localhost", 6334);

        await client.CreatePayloadIndexAsync(
            collectionName: "{collection_name}",
            fieldName: "name_of_the_field_to_index",
            schemaType: PayloadSchemaType.Keyword,
            indexParams: new PayloadIndexParams
            {
                KeywordIndexParams = new KeywordIndexParams
                {
                    EnableHnsw = false
                }
            }
        );
    }
}
@@ -0,0 +1,19 @@
```csharp
using Qdrant.Client;
using Qdrant.Client.Grpc;

var client = new QdrantClient("localhost", 6334);

await client.CreatePayloadIndexAsync(
    collectionName: "{collection_name}",
    fieldName: "name_of_the_field_to_index",
    schemaType: PayloadSchemaType.Keyword,
    indexParams: new PayloadIndexParams
    {
        KeywordIndexParams = new KeywordIndexParams
        {
            EnableHnsw = false
        }
    }
);
```
@@ -0,0 +1,22 @@
```go
import (
	"context"

	"github.com/qdrant/go-client/qdrant"
)

client, err := qdrant.NewClient(&qdrant.Config{
	Host: "localhost",
	Port: 6334,
})

client.CreateFieldIndex(context.Background(), &qdrant.CreateFieldIndexCollection{
	CollectionName: "{collection_name}",
	FieldName: "name_of_the_field_to_index",
	FieldType: qdrant.FieldType_FieldTypeKeyword.Enum(),
	FieldIndexParams: qdrant.NewPayloadIndexParamsKeyword(
		&qdrant.KeywordIndexParams{
			EnableHnsw: qdrant.PtrOf(false),
		}),
})
```
@@ -0,0 +1,26 @@
```java
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Collections.KeywordIndexParams;
import io.qdrant.client.grpc.Collections.PayloadIndexParams;
import io.qdrant.client.grpc.Collections.PayloadSchemaType;

QdrantClient client =
    new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());

client
    .createPayloadIndexAsync(
        "{collection_name}",
        "name_of_the_field_to_index",
        PayloadSchemaType.Keyword,
        PayloadIndexParams.newBuilder()
            .setKeywordIndexParams(
                KeywordIndexParams.newBuilder()
                    .setEnableHnsw(false)
                    .build())
            .build(),
        null,
        null,
        null)
    .get();
```
@@ -0,0 +1,19 @@
```rust
use qdrant_client::qdrant::{
    CreateFieldIndexCollectionBuilder, FieldType, KeywordIndexParamsBuilder,
};
use qdrant_client::Qdrant;

let client = Qdrant::from_url("http://localhost:6334").build()?;

client
    .create_field_index(
        CreateFieldIndexCollectionBuilder::new(
            "{collection_name}",
            "name_of_the_field_to_index",
            FieldType::Keyword,
        )
        .field_index_params(KeywordIndexParamsBuilder::default().enable_hnsw(false)),
    )
    .await?;
```
@@ -0,0 +1,30 @@
package snippet

import (
	"context"

	"github.com/qdrant/go-client/qdrant"
)

func Main() {
	client, err := qdrant.NewClient(&qdrant.Config{
		Host: "localhost",
		Port: 6334,
	})

	// @hide-start
	if err != nil {
		panic(err)
	}
	// @hide-end

	client.CreateFieldIndex(context.Background(), &qdrant.CreateFieldIndexCollection{
		CollectionName: "{collection_name}",
		FieldName: "name_of_the_field_to_index",
		FieldType: qdrant.FieldType_FieldTypeKeyword.Enum(),
		FieldIndexParams: qdrant.NewPayloadIndexParamsKeyword(
			&qdrant.KeywordIndexParams{
				EnableHnsw: qdrant.PtrOf(false),
			}),
	})
}
@@ -0,0 +1,10 @@
```http
PUT /collections/{collection_name}/index
{
    "field_name": "name_of_the_field_to_index",
    "field_schema": {
        "type": "keyword",
        "enable_hnsw": false
    }
}
```
@@ -0,0 +1,30 @@
package com.example.snippets_amalgamation;

import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Collections.KeywordIndexParams;
import io.qdrant.client.grpc.Collections.PayloadIndexParams;
import io.qdrant.client.grpc.Collections.PayloadSchemaType;

public class Snippet {
  public static void run() throws Exception {
    QdrantClient client =
        new QdrantClient(QdrantGrpcClient.newBuilder("localhost", 6334, false).build());

    client
        .createPayloadIndexAsync(
            "{collection_name}",
            "name_of_the_field_to_index",
            PayloadSchemaType.Keyword,
            PayloadIndexParams.newBuilder()
                .setKeywordIndexParams(
                    KeywordIndexParams.newBuilder()
                        .setEnableHnsw(false)
                        .build())
                .build(),
            null,
            null,
            null)
        .get();
  }
}
@@ -0,0 +1,21 @@
use qdrant_client::qdrant::{
    CreateFieldIndexCollectionBuilder, FieldType, KeywordIndexParamsBuilder,
};
use qdrant_client::Qdrant;

pub async fn main() -> anyhow::Result<()> {
    let client = Qdrant::from_url("http://localhost:6334").build()?;

    client
        .create_field_index(
            CreateFieldIndexCollectionBuilder::new(
                "{collection_name}",
                "name_of_the_field_to_index",
                FieldType::Keyword,
            )
            .field_index_params(KeywordIndexParamsBuilder::default().enable_hnsw(false)),
        )
        .await?;

    Ok(())
}