Commit ec30b45

WIP - Add more docs

1 parent cc92e2e commit ec30b45

3 files changed: +151 additions, -133 deletions

docs/reference/mapping/types/dense-vector.asciidoc (1 addition, 1 deletion)

@@ -127,7 +127,7 @@ When using a quantized format, you may want to oversample and rescore the result
 To use a quantized index, you can set your index type to `int8_hnsw`, `int4_hnsw`, or `bbq_hnsw`. When indexing `float` vectors, the current default
 index type is `int8_hnsw`.
 
-Quantized vectors can use <<knn-quantized-vector-rescoring,rescoring>> to improve accuracy on approximate kNN search results.
+Quantized vectors can use <<dense-vector-knn-search-reranking,oversampling and rescoring>> to improve accuracy on approximate kNN search results.
 
 NOTE: Quantization will continue to keep the raw float vector values on disk for reranking, reindexing, and quantization improvements over the lifetime of the data.
 This means disk usage will increase by ~25% for `int8`, ~12.5% for `int4`, and ~3.1% for `bbq` due to the overhead of storing the quantized and raw vectors.
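The overhead percentages quoted in the NOTE above follow directly from bytes per dimension. A minimal sketch (not Elasticsearch code; the function name is ours) assuming raw vectors are float32 at 4 bytes per dimension, `int8` at 1 byte, `int4` at half a byte, and `bbq` at roughly 1 bit:

```python
# Sketch of where the quoted disk-overhead figures come from:
# the quantized copy is stored alongside the raw float32 vectors,
# so the extra disk is the quantized bytes relative to the raw bytes.
QUANTIZED_BYTES_PER_DIM = {"int8": 1.0, "int4": 0.5, "bbq": 1.0 / 8.0}
RAW_BYTES_PER_DIM = 4.0  # float32

def extra_disk_pct(index_type: str) -> float:
    """Quantized copy's size as a percentage of the raw float32 size."""
    return 100.0 * QUANTIZED_BYTES_PER_DIM[index_type] / RAW_BYTES_PER_DIM

for t in ("int8", "int4", "bbq"):
    print(f"{t}: ~{extra_disk_pct(t):.1f}% extra disk")  # ~25.0%, ~12.5%, ~3.1%
```

This reproduces the ~25% / ~12.5% / ~3.1% figures in the doc text; actual on-disk numbers also include index metadata, so treat these as back-of-the-envelope values.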

docs/reference/rest-api/common-parms.asciidoc (1 addition, 1 deletion)

@@ -1360,5 +1360,5 @@ The approximate kNN search will retrieve the top `k * oversample` candidates per
 and then use the original vectors for rescoring.
 The top `k` rescored candidates will be returned as results.
 
-See <<knn-quantized-vector-rescoring,rescoring quantized vectors>> for details.
+See <<dense-vector-knn-search-reranking,oversampling and rescoring quantized vectors>> for details.
 end::knn-rescore[]
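The docs elsewhere state that `num_candidates` is unaffected by `oversample` except that it must cover at least `k * oversample` candidates per shard. A tiny sketch of that rule (the function name and the `ceil` rounding are our assumptions, not an Elasticsearch API):

```python
import math

def effective_num_candidates(k: int, oversample: float, num_candidates: int) -> int:
    # `oversample` does not change `num_candidates`, except to guarantee
    # there are at least k * oversample rescoring candidates per shard.
    return max(num_candidates, math.ceil(k * oversample))

print(effective_num_candidates(10, 2.0, 100))  # 100 — already large enough
print(effective_num_candidates(10, 2.0, 15))   # 20  — raised to k * oversample
```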

docs/reference/search/search-your-data/knn-search.asciidoc (149 additions, 131 deletions)

@@ -781,7 +781,7 @@ What if you wanted to filter by some top-level document metadata? You can do this
 
 
 NOTE: `filter` will always be over the top-level document metadata. This means you cannot filter based on `nested`
-field metadata.
+field metadata.
 
 [source,console]
 ----
@@ -1012,55 +1012,6 @@ Now the result will contain the nearest found paragraph when searching.
 // TESTRESPONSE[s/"took": 4/"took" : "$body.took"/]
 
 
-[discrete]
-[[knn-quantized-vector-rescoring]]
-==== Rescoring results for quantized vectors
-
-When using <<dense-vector-quantization,quantized vectors>> for kNN search, you can optionally rescore results to balance performance and accuracy.
-Rescoring works by retrieving more results per shard using approximate kNN, and then use the original vector values for rescoring these results.
-As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines:
-- The performance and memory gains of approximate retrieval using quantized vectors on the top candidates.
-- The accuracy of using the original vectors for rescoring the top candidates.
-
-Rescoring won't be as accurate as an <<exact-knn,exact kNN search>>, as some of the top results may not be retrieved using approximate kNN search.
-But the results retrieved by rescoring from the top candidates will have the same score and relative ordering as would be retrieved using exact kNN search.
-
-You can use the `rescore` option to specify an `oversample` parameter.
-When `oversample` is specified, the approximate kNN search will retrieve the top `k * oversample` candidates per shard.
-It will then use the original vectors to rescore them, and return the top `k` results.
-
-`num_candidates` will not be affected by oversample, besides ensuring that there are at least `k * oversample` candidates per shard.
-
-Here is an example of using the `rescore` option with the `oversample` parameter:
-
-[source,console]
-----
-POST image-index/_search
-{
-  "knn": {
-    "field": "image-vector",
-    "query_vector": [-5, 9, -12],
-    "k": 10,
-    "num_candidates": 100,
-    "rescore": {
-      "oversample": 2.0
-    }
-  },
-  "fields": [ "title", "file-type" ]
-}
-----
-//TEST[continued]
-// TEST[s/"k": 10/"k": 3/]
-// TEST[s/"num_candidates": 100/"num_candidates": 3/]
-
-This example will effectively:
-- Search using approximate kNN with `num_candidates` set to 100.
-- Rescore the top 20 (`k * oversample`) candidates per shard using the original vectors.
-- Return the top 10 (`k`) results from the rescored candidates.
-
-NOTE: Rescoring only makes sense for quantized vectors; when <<dense-vector-quantization,quantization>> is not used, the original vectors are used for scoring.
-Rescore option will be ignored for non-quantized `dense_vector` fields.
-
 [discrete]
 [[knn-indexing-considerations]]
 ==== Indexing considerations
@@ -1117,100 +1068,78 @@ NOTE: Approximate kNN search always uses the
 the global top `k` matches across shards. You cannot set the
 `search_type` explicitly when running kNN search.
 
+
 [discrete]
-[[exact-knn]]
-=== Exact kNN
+[[dense-vector-knn-search-reranking]]
+==== Oversampling and rescoring for quantized vectors
 
-To run an exact kNN search, use a `script_score` query with a vector function.
+When using <<dense-vector-quantization,quantized vectors>> for kNN search, you can optionally rescore results to balance performance and accuracy, by doing:
+* Oversampling: Retrieve more candidates per shard using approximate kNN
+* Rescoring: Use the original vector values for re-calculating the score on the oversampled candidates.
 
-. Explicitly map one or more `dense_vector` fields. If you don't intend to use
-the field for approximate kNN, set the `index` mapping option to `false`. This
-can significantly improve indexing speed.
-+
-[source,console]
-----
-PUT product-index
-{
-  "mappings": {
-    "properties": {
-      "product-vector": {
-        "type": "dense_vector",
-        "dims": 5,
-        "index": false
-      },
-      "price": {
-        "type": "long"
-      }
-    }
-  }
-}
-----
+As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines:
 
-. Index your data.
-+
-[source,console]
-----
-POST product-index/_bulk?refresh=true
-{ "index": { "_id": "1" } }
-{ "product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599 }
-{ "index": { "_id": "2" } }
-{ "product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799 }
-{ "index": { "_id": "3" } }
-{ "product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099 }
-...
-----
-//TEST[continued]
-//TEST[s/\.\.\.//]
+* The performance and memory gains of approximate retrieval using quantized vectors on the top candidates.
+* The accuracy of using the original vectors for rescoring the top candidates.
+
+All forms of quantization will result in some accuracy loss and as the quantization level increases the accuracy loss will also increase.
+Generally, we have found that:
+
+* `int8` requires minimal if any rescoring
+* `int4` requires some rescoring for higher accuracy and larger recall scenarios. Generally, oversampling by 1.5x-2x recovers most of the accuracy loss.
+* `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that between 3x-5x oversampling is generally sufficient. But for fewer dimensions or vectors that do not quantize well, higher oversampling may be required.
+
+There are three main ways to oversample and rescore:
+
+* <<dense-vector-knn-search-reranking-rescore-parameter>>
+* <<dense-vector-knn-search-reranking-rescore-section>>
+* <<dense-vector-knn-search-reranking-script-score>>
+
+[discrete]
+[[dense-vector-knn-search-reranking-rescore-parameter]]
+===== Use the `rescore` option to rescore per shard
+
+preview:[]
+
+You can use the `rescore` option to automatically perform reranking.
+When a rescore `oversample` parameter is specified, the approximate kNN search will retrieve the top `k * oversample` candidates per shard.
+It will then use the original vectors to rescore them, and return the top `k` results.
+
+`num_candidates` will not be affected by oversample, besides ensuring that there are at least `k * oversample` candidates per shard.
+
+Here is an example of using the `rescore` option with the `oversample` parameter:
 
-. Use the <<search-search,search API>> to run a `script_score` query containing
-a <<vector-functions,vector function>>.
-+
-TIP: To limit the number of matched documents passed to the vector function, we
-recommend you specify a filter query in the `script_score.query` parameter. If
-needed, you can use a <<query-dsl-match-all-query,`match_all` query>> in this
-parameter to match all documents. However, matching all documents can
-significantly increase search latency.
-+
 [source,console]
 ----
-POST product-index/_search
+POST image-index/_search
 {
-  "query": {
-    "script_score": {
-      "query" : {
-        "bool" : {
-          "filter" : {
-            "range" : {
-              "price" : {
-                "gte": 1000
-              }
-            }
-          }
-        }
-      },
-      "script": {
-        "source": "cosineSimilarity(params.queryVector, 'product-vector') + 1.0",
-        "params": {
-          "queryVector": [-0.5, 90.0, -10, 14.8, -156.0]
-        }
-      }
+  "knn": {
+    "field": "image-vector",
+    "query_vector": [-5, 9, -12],
+    "k": 10,
+    "num_candidates": 100,
+    "rescore": {
+      "oversample": 2.0
     }
-  }
+  },
+  "fields": [ "title", "file-type" ]
 }
 ----
 //TEST[continued]
+// TEST[s/"k": 10/"k": 3/]
+// TEST[s/"num_candidates": 100/"num_candidates": 3/]
 
-[discrete]
-[[dense-vector-knn-search-reranking]]
-==== Oversampling and rescoring for quantized vectors
+This example will:
+* Search using approximate kNN with `num_candidates` set to 100.
+* Rescore the top 20 (`k * oversample`) candidates per shard using the original vectors.
+* Return the top 10 (`k`) results from the rescored candidates.
 
-All forms of quantization will result in some accuracy loss and as the quantization level increases the accuracy loss will also increase.
-Generally, we have found that:
-- `int8` requires minimal if any rescoring
-- `int4` requires some rescoring for higher accuracy and larger recall scenarios. Generally, oversampling by 1.5x-2x recovers most of the accuracy loss.
-- `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that between 3x-5x oversampling is generally sufficient. But for fewer dimensions or vectors that do not quantize well, higher oversampling may be required.
 
-There are two main ways to oversample and rescore. The first is to utilize the <<rescore, rescore section>> in the `_search` request.
+[discrete]
+[[dense-vector-knn-search-reranking-rescore-section]]
+===== Use the `rescore` section for top-level kNN search
+
+You can use the <<rescore, rescore section>> in the `_search` request to rescore the top results from a kNN search.
 
 Here is an example using the top level `knn` search with oversampling and using `rescore` to rerank the results:
 
@@ -1259,8 +1188,13 @@ gathering 20 nearest neighbors according to quantized scoring and rescoring with
 <5> The weight of the original query, here we simply throw away the original score
 <6> The weight of the rescore query, here we only use the rescore query
 
-The second way is to score per shard with the <<query-dsl-knn-query, knn query>> and <<query-dsl-script-score-query, script_score query >>. Generally, this means that there will be more rescoring per shard, but this
-can increase overall recall at the cost of compute.
+
+[discrete]
+[[dense-vector-knn-search-reranking-script-score]]
+===== Use a `script_score` query to rescore per shard
+
+You can rescore per shard with the <<query-dsl-knn-query, knn query>> and <<query-dsl-script-score-query, script_score query >>.
+Generally, this means that there will be more rescoring per shard, but this can increase overall recall at the cost of compute.
 
 [source,console]
 --------------------------------------------------
@@ -1292,3 +1226,87 @@ POST /my-index/_search
 <3> The number of candidates to use for the initial approximate `knn` search. This will search using the quantized vectors
 and return the top 20 candidates per shard to then be scored
 <4> The script to score the results. Script score will interact directly with the originally provided float32 vector.
+
+
+[discrete]
+[[exact-knn]]
+=== Exact kNN
+
+To run an exact kNN search, use a `script_score` query with a vector function.
+
+. Explicitly map one or more `dense_vector` fields. If you don't intend to use
+the field for approximate kNN, set the `index` mapping option to `false`. This
+can significantly improve indexing speed.
++
+[source,console]
+----
+PUT product-index
+{
+  "mappings": {
+    "properties": {
+      "product-vector": {
+        "type": "dense_vector",
+        "dims": 5,
+        "index": false
+      },
+      "price": {
+        "type": "long"
+      }
+    }
+  }
+}
+----
+
+. Index your data.
++
+[source,console]
+----
+POST product-index/_bulk?refresh=true
+{ "index": { "_id": "1" } }
+{ "product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599 }
+{ "index": { "_id": "2" } }
+{ "product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799 }
+{ "index": { "_id": "3" } }
+{ "product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099 }
+...
+----
+//TEST[continued]
+//TEST[s/\.\.\.//]
+
+. Use the <<search-search,search API>> to run a `script_score` query containing
+a <<vector-functions,vector function>>.
++
+TIP: To limit the number of matched documents passed to the vector function, we
+recommend you specify a filter query in the `script_score.query` parameter. If
+needed, you can use a <<query-dsl-match-all-query,`match_all` query>> in this
+parameter to match all documents. However, matching all documents can
+significantly increase search latency.
++
+[source,console]
+----
+POST product-index/_search
+{
+  "query": {
+    "script_score": {
+      "query" : {
+        "bool" : {
+          "filter" : {
+            "range" : {
+              "price" : {
+                "gte": 1000
+              }
+            }
+          }
+        }
+      },
+      "script": {
+        "source": "cosineSimilarity(params.queryVector, 'product-vector') + 1.0",
+        "params": {
+          "queryVector": [-0.5, 90.0, -10, 14.8, -156.0]
+        }
+      }
+    }
+  }
+}
+----
+//TEST[continued]
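The `script_score` source above computes `cosineSimilarity(query, vector) + 1.0`. A plain-Python equivalent of that scoring function (an illustration, not the Painless runtime; the vectors reused here come from the example above):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product normalized by both vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [-0.5, 90.0, -10, 14.8, -156.0]
doc_vector = [-0.5, 100.0, -13.0, 14.8, -156.0]  # _id 2 from the bulk example
# The "+ 1.0" in the script shifts cosine's [-1, 1] range to [0, 2],
# keeping document scores non-negative as Elasticsearch requires.
score = cosine_similarity(query, doc_vector) + 1.0
```

Because the two vectors are nearly parallel, the shifted score lands close to the maximum of 2.0.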
