Commit 3ac3e02
Merge pull request #396 from jmazanec15/knn-docs-refactor: Refactor k-NN documentation

File tree: 9 files changed, +705, -356 lines changed

docs/knn/api.md

---
layout: default
title: API
nav_order: 4
parent: k-NN
has_children: false
---

# API
The k-NN plugin adds two API operations that help users manage the plugin's functionality.

## Stats
The k-NN stats API provides information about the current status of the k-NN plugin. The plugin keeps track of both cluster-level and node-level stats. Cluster-level stats have a single value for the entire cluster. Node-level stats have a single value for each node in the cluster. You can filter the query by `nodeId` and `statName`:
```
GET /_opendistro/_knn/nodeId1,nodeId2/stats/statName1,statName2
```

Statistic | Description
:--- | :---
`circuit_breaker_triggered` | Indicates whether the circuit breaker is triggered. This is only relevant to approximate k-NN search.
`total_load_time` | The time in nanoseconds that k-NN has taken to load graphs into the cache. This is only relevant to approximate k-NN search.
`eviction_count` | The number of graphs that have been evicted from the cache due to memory constraints or idle time. Note: explicit evictions that occur because of index deletion are not counted. This is only relevant to approximate k-NN search.
`hit_count` | The number of cache hits. A cache hit occurs when a user queries a graph that is already loaded into memory. This is only relevant to approximate k-NN search.
`miss_count` | The number of cache misses. A cache miss occurs when a user queries a graph that has not yet been loaded into memory. This is only relevant to approximate k-NN search.
`graph_memory_usage` | The current cache size (total size of all graphs in memory) in kilobytes. This is only relevant to approximate k-NN search.
`graph_memory_usage_percentage` | The current weight of the cache as a percentage of the maximum cache capacity.
`graph_index_requests` | The number of requests to add the `knn_vector` field of a document to a graph.
`graph_index_errors` | The number of requests to add the `knn_vector` field of a document to a graph that have produced an error.
`graph_query_requests` | The number of graph queries that have been made.
`graph_query_errors` | The number of graph queries that have produced an error.
`knn_query_requests` | The number of k-NN query requests received.
`cache_capacity_reached` | Whether `knn.memory.circuit_breaker.limit` has been reached. This is only relevant to approximate k-NN search.
`load_success_count` | The number of times k-NN successfully loaded a graph into the cache. This is only relevant to approximate k-NN search.
`load_exception_count` | The number of times an exception occurred while trying to load a graph into the cache. This is only relevant to approximate k-NN search.
`indices_in_cache` | For each index that has graphs in the cache, the number of graphs for that index and the total `graph_memory_usage` the index is using, in kilobytes.
`script_compilations` | The number of times the k-NN script has been compiled. This value should usually be 1 or 0, but if the cache containing the compiled scripts is filled, the k-NN script might be recompiled. This is only relevant to k-NN score script search.
`script_compilation_errors` | The number of errors during script compilation. This is only relevant to k-NN score script search.
`script_query_requests` | The total number of script queries. This is only relevant to k-NN score script search.
`script_query_errors` | The number of errors during script queries. This is only relevant to k-NN score script search.

### Usage
```json
GET /_opendistro/_knn/stats?pretty
{
    "_nodes" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
    },
    "cluster_name" : "_run",
    "circuit_breaker_triggered" : false,
    "nodes" : {
        "HYMrXXsBSamUkcAjhjeN0w" : {
            "eviction_count" : 0,
            "miss_count" : 1,
            "graph_memory_usage" : 1,
            "graph_memory_usage_percentage" : 3.68,
            "graph_index_requests" : 7,
            "graph_index_errors" : 1,
            "knn_query_requests" : 4,
            "graph_query_requests" : 30,
            "graph_query_errors" : 15,
            "indices_in_cache" : {
                "myindex" : {
                    "graph_memory_usage" : 2,
                    "graph_memory_usage_percentage" : 3.68,
                    "graph_count" : 2
                }
            },
            "cache_capacity_reached" : false,
            "load_exception_count" : 0,
            "hit_count" : 0,
            "load_success_count" : 1,
            "total_load_time" : 2878745,
            "script_compilations" : 1,
            "script_compilation_errors" : 0,
            "script_query_requests" : 534,
            "script_query_errors" : 0
        }
    }
}
```

```json
GET /_opendistro/_knn/HYMrXXsBSamUkcAjhjeN0w/stats/circuit_breaker_triggered,graph_memory_usage?pretty
{
    "_nodes" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
    },
    "cluster_name" : "_run",
    "circuit_breaker_triggered" : false,
    "nodes" : {
        "HYMrXXsBSamUkcAjhjeN0w" : {
            "graph_memory_usage" : 1
        }
    }
}
```

## Warmup
The Hierarchical Navigable Small World (HNSW) graphs used to perform an approximate k-Nearest Neighbor (k-NN) search are stored as `.hnsw` files with the other Apache Lucene segment files. Before you can search these graphs with the k-NN plugin, the files must be loaded into native memory.

If the plugin has not loaded the graphs into native memory, it loads them when it receives a search request. This loading time can cause high latency during initial queries. To avoid this situation, users often run random queries during a warmup period. After this warmup period, the graphs are loaded into native memory and production workloads can begin. This loading process is indirect and requires extra effort.

As an alternative, you can avoid this latency issue by running the k-NN plugin warmup API operation on whatever indices you're interested in searching. This operation loads all the graphs for all of the shards (primaries and replicas) of all the indices specified in the request into native memory.

After the process finishes, you can start searching against the indices with no initial latency penalties. The warmup API operation is idempotent, so if a segment's graphs are already loaded into memory, this operation has no impact on those graphs. It only loads graphs that aren't currently in memory.

### Usage
This request performs a warmup on three indices:

```json
GET /_opendistro/_knn/warmup/index1,index2,index3?pretty
{
    "_shards" : {
        "total" : 6,
        "successful" : 6,
        "failed" : 0
    }
}
```

`total` indicates how many shards the k-NN plugin attempted to warm up. The response also includes the number of shards the plugin succeeded and failed to warm up.

The call does not return until the warmup operation is complete or the request times out. If the request times out, the operation still continues on the cluster. To monitor the warmup operation, use the Elasticsearch `_tasks` API:

```json
GET /_tasks
```
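
If the cluster is running many tasks, you can narrow the output with the standard task-management parameters (a sketch; `detailed` is a core Elasticsearch `_tasks` parameter, not something the k-NN plugin adds, and the exact action name reported for warmup tasks can vary):

```json
GET /_tasks?detailed=true
```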

After the operation has finished, use the [k-NN `_stats` API operation](#Stats) to see what the k-NN plugin loaded into the cache.

### Best practices
For the warmup API to function properly, follow these best practices.

First, don't run merge operations on indices that you want to warm up. During a merge, the k-NN plugin creates new segments, and old segments are (sometimes) deleted. For example, you could encounter a situation in which the warmup API operation loads graphs A and B into native memory, but then segment C is created from merging segments A and B. The graphs for A and B would no longer be in memory, and graph C would also not be in memory. In this case, the initial penalty for loading graph C is still present.

Second, confirm that all graphs you want to warm up can fit into native memory. For more information about the native memory limit, see the [knn.memory.circuit_breaker.limit setting](../settings/#cluster-settings). High graph memory usage causes cache thrashing, which can lead to operations constantly failing and attempting to run again.

Finally, don't index any documents that you want to load into the cache. Writing new information to segments prevents the warmup API operation from loading the graphs until they're searchable. This means that you would have to run the warmup operation again after indexing finishes.
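
If you do index documents after a warmup, one way to load the new graphs (a sketch, reusing `index1` from the example above; `_refresh` is the standard Elasticsearch refresh API, not part of the k-NN plugin) is to make the new segments searchable and then run the warmup operation again:

```json
POST /index1/_refresh
GET /_opendistro/_knn/warmup/index1?pretty
```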

docs/knn/approximate-knn.md

---
layout: default
title: Approximate Search
nav_order: 1
parent: k-NN
has_children: false
has_math: true
---

# Approximate k-NN Search

The approximate k-NN method uses [nmslib's](https://github.com/nmslib/nmslib/) implementation of the HNSW algorithm to power k-NN search. In this case, approximate means that for a given search, the neighbors returned are an estimate of the true k-nearest neighbors. Of the three methods, this method offers the best search scalability for large data sets. Generally speaking, once the data set grows into the hundreds of thousands of vectors, this approach should be preferred.

During indexing, the plugin builds an HNSW graph of the vectors for each `knn_vector` field/Lucene segment pair, which can be used to efficiently find the k-nearest neighbors to a query vector during search. These graphs are loaded into native memory during search and managed by a cache. To preload the graphs into memory, see the [warmup API](api#Warmup). To see which graphs are loaded in memory, as well as other stats, see the [stats API](api#Stats). To learn more about segments, see [Apache Lucene's documentation](https://lucene.apache.org/core/8_7_0/core/org/apache/lucene/codecs/lucene87/package-summary.html#package.description). Because the graphs are constructed during indexing, it is not possible to apply a filter on an index and then use this search method. All filters are applied to the results produced by the approximate nearest neighbor search.

## Get started with approximate k-NN

To use the k-NN plugin's approximate search functionality, you must first create a k-NN index with the index setting `index.knn` set to `true`. This setting tells the plugin to create HNSW graphs for the index.

Additionally, if you are using the approximate k-nearest neighbor method, set `index.knn.space_type` to the space you are interested in. This setting cannot be changed after it is set. See the [spaces section](#spaces) for the spaces we support. By default, `index.knn.space_type` is `l2`. For more information on index settings, such as algorithm parameters that can be tweaked to tune performance, see the [documentation](settings#IndexSettings).

Next, you must add one or more fields of the `knn_vector` data type. Here is an example that creates an index with two `knn_vector` fields and uses cosine similarity:

```json
PUT my-knn-index-1
{
  "settings": {
    "index": {
      "knn": true,
      "knn.space_type": "cosinesimil"
    }
  },
  "mappings": {
    "properties": {
      "my_vector1": {
        "type": "knn_vector",
        "dimension": 2
      },
      "my_vector2": {
        "type": "knn_vector",
        "dimension": 4
      }
    }
  }
}
```

The `knn_vector` data type supports a vector of floats that can have a dimension of up to 10,000, as set by the `dimension` mapping parameter.

In Elasticsearch, codecs handle the storage and retrieval of indices. The k-NN plugin uses a custom codec to write vector data to graphs so that the underlying k-NN search library can read it.
{: .tip }

After you create the index, you can add some data to it:

```json
POST _bulk
{ "index": { "_index": "my-knn-index-1", "_id": "1" } }
{ "my_vector1": [1.5, 2.5], "price": 12.2 }
{ "index": { "_index": "my-knn-index-1", "_id": "2" } }
{ "my_vector1": [2.5, 3.5], "price": 7.1 }
{ "index": { "_index": "my-knn-index-1", "_id": "3" } }
{ "my_vector1": [3.5, 4.5], "price": 12.9 }
{ "index": { "_index": "my-knn-index-1", "_id": "4" } }
{ "my_vector1": [5.5, 6.5], "price": 1.2 }
{ "index": { "_index": "my-knn-index-1", "_id": "5" } }
{ "my_vector1": [4.5, 5.5], "price": 3.7 }
{ "index": { "_index": "my-knn-index-1", "_id": "6" } }
{ "my_vector2": [1.5, 5.5, 4.5, 6.4], "price": 10.3 }
{ "index": { "_index": "my-knn-index-1", "_id": "7" } }
{ "my_vector2": [2.5, 3.5, 5.6, 6.7], "price": 5.5 }
{ "index": { "_index": "my-knn-index-1", "_id": "8" } }
{ "my_vector2": [4.5, 5.5, 6.7, 3.7], "price": 4.4 }
{ "index": { "_index": "my-knn-index-1", "_id": "9" } }
{ "my_vector2": [1.5, 5.5, 4.5, 6.4], "price": 8.9 }

```

Then you can execute an approximate nearest neighbor search on the data using the `knn` query type:

```json
GET my-knn-index-1/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2, 3, 5, 6],
        "k": 2
      }
    }
  }
}
```

`k` is the number of neighbors that the search of each graph returns. You must also include the `size` option, which determines how many results the query actually returns: the plugin returns `k` results for each shard (and each segment) and `size` results for the entire query. For example, if an index has three shards and `k` is 2, each shard contributes up to 2 neighbors, and the best `size` of those candidates are returned for the query as a whole. The plugin supports a maximum `k` value of 10,000.

### Using approximate k-NN with filters
If you use the `knn` query alongside filters or other clauses (e.g. `bool`, `must`, `match`), you might receive fewer than `k` results. In this example, `post_filter` reduces the number of results from 2 to 1:

```json
GET my-knn-index-1/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2, 3, 5, 6],
        "k": 2
      }
    }
  },
  "post_filter": {
    "range": {
      "price": {
        "gte": 5,
        "lte": 10
      }
    }
  }
}
```

## Spaces

A space corresponds to the function used to measure the distance between two points in order to determine the k-nearest neighbors. From the k-NN perspective, a lower score equates to a closer and better result. This is the opposite of how Elasticsearch scores results, where a higher score equates to a better result. To convert distances to Elasticsearch scores, we take 1 / (1 + distance). Currently, the k-NN plugin supports the following spaces:

<table>
  <thead style="text-align: left">
    <tr>
      <th>spaceType</th>
      <th>Distance Function</th>
      <th>Elasticsearch Score</th>
    </tr>
  </thead>
  <tr>
    <td>l2</td>
    <td>\[ Distance(X, Y) = \sum_{i=1}^n (X_i - Y_i)^2 \]</td>
    <td>1 / (1 + Distance Function)</td>
  </tr>
  <tr>
    <td>cosinesimil</td>
    <td>\[ {A &middot; B \over \|A\| &middot; \|B\|} =
    {\sum_{i=1}^n (A_i &middot; B_i) \over \sqrt{\sum_{i=1}^n A_i^2} &middot; \sqrt{\sum_{i=1}^n B_i^2}}\]
    where \(\|A\|\) and \(\|B\|\) represent the norms of vectors A and B respectively.</td>
    <td>1 / (1 + Distance Function)</td>
  </tr>
</table>
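
As a quick worked example (using made-up two-dimensional vectors, not documents from the index above), for \(X = (1, 2)\) and \(Y = (2, 3)\) with the `l2` space:

\[ Distance(X, Y) = (1 - 2)^2 + (2 - 3)^2 = 2, \qquad score = \frac{1}{1 + 2} \approx 0.33 \]

The closer the vectors, the smaller the distance and the higher the resulting Elasticsearch score.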
