@@ -40,12 +40,12 @@ based on a similarity metric, the better its match.
 
 {es} supports two methods for kNN search:
 
-* <<exact-knn,Exact, brute-force kNN>> using a `script_score` query with a
-vector function
-
 * <<approximate-knn,Approximate kNN>> using the `knn` search
 option
 
+* <<exact-knn,Exact, brute-force kNN>> using a `script_score` query with a
+vector function
+
 In most cases, you'll want to use approximate kNN. Approximate kNN offers lower
 latency at the cost of slower indexing and imperfect accuracy.
 
@@ -57,89 +57,6 @@ to limit the number of matching documents passed to the function. If you
 filter your data to a small subset of documents, you can get good search
 performance using this approach.
 
-[discrete]
-[[exact-knn]]
-=== Exact kNN
-
-To run an exact kNN search, use a `script_score` query with a vector function.
-
-. Explicitly map one or more `dense_vector` fields. If you don't intend to use
-the field for approximate kNN, omit the `index` mapping option or set it to
-`false`. This can significantly improve indexing speed.
-+
-[source,console]
-----
-PUT product-index
-{
-  "mappings": {
-    "properties": {
-      "product-vector": {
-        "type": "dense_vector",
-        "dims": 5,
-        "index": false
-      },
-      "price": {
-        "type": "long"
-      }
-    }
-  }
-}
-----
-
-. Index your data.
-+
-[source,console]
-----
-POST product-index/_bulk?refresh=true
-{ "index": { "_id": "1" } }
-{ "product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599 }
-{ "index": { "_id": "2" } }
-{ "product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799 }
-{ "index": { "_id": "3" } }
-{ "product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099 }
-...
-----
-//TEST[continued]
-//TEST[s/\.\.\.//]
-
-. Use the <<search-search,search API>> to run a `script_score` query containing
-a <<vector-functions,vector function>>.
-+
-TIP: To limit the number of matched documents passed to the vector function, we
-recommend you specify a filter query in the `script_score.query` parameter. If
-needed, you can use a <<query-dsl-match-all-query,`match_all` query>> in this
-parameter to match all documents. However, matching all documents can
-significantly increase search latency.
-+
-[source,console]
-----
-POST product-index/_search
-{
-  "query": {
-    "script_score": {
-      "query" : {
-        "bool" : {
-          "filter" : {
-            "range" : {
-              "price" : {
-                "gte": 1000
-              }
-            }
-          }
-        }
-      },
-      "script": {
-        "source": "cosineSimilarity(params.queryVector, 'product-vector') + 1.0",
-        "params": {
-          "queryVector": [-0.5, 90.0, -10, 14.8, -156.0]
-        }
-      }
-    }
-  }
-}
-----
-//TEST[continued]
-
 [discrete]
 [[approximate-knn]]
 === Approximate kNN
@@ -629,3 +546,86 @@ NOTE: Approximate kNN search always uses the
 the global top `k` matches across shards. You cannot set the
 `search_type` explicitly when running kNN search.
 
+[discrete]
+[[exact-knn]]
+=== Exact kNN
+
+To run an exact kNN search, use a `script_score` query with a vector function.
+
+. Explicitly map one or more `dense_vector` fields. If you don't intend to use
+the field for approximate kNN, omit the `index` mapping option or set it to
+`false`. This can significantly improve indexing speed.
++
+[source,console]
+----
+PUT product-index
+{
+  "mappings": {
+    "properties": {
+      "product-vector": {
+        "type": "dense_vector",
+        "dims": 5,
+        "index": false
+      },
+      "price": {
+        "type": "long"
+      }
+    }
+  }
+}
+----
+
+. Index your data.
++
+[source,console]
+----
+POST product-index/_bulk?refresh=true
+{ "index": { "_id": "1" } }
+{ "product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599 }
+{ "index": { "_id": "2" } }
+{ "product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799 }
+{ "index": { "_id": "3" } }
+{ "product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099 }
+...
+----
+//TEST[continued]
+//TEST[s/\.\.\.//]
+
+. Use the <<search-search,search API>> to run a `script_score` query containing
+a <<vector-functions,vector function>>.
++
+TIP: To limit the number of matched documents passed to the vector function, we
+recommend you specify a filter query in the `script_score.query` parameter. If
+needed, you can use a <<query-dsl-match-all-query,`match_all` query>> in this
+parameter to match all documents. However, matching all documents can
+significantly increase search latency.
++
+[source,console]
+----
+POST product-index/_search
+{
+  "query": {
+    "script_score": {
+      "query" : {
+        "bool" : {
+          "filter" : {
+            "range" : {
+              "price" : {
+                "gte": 1000
+              }
+            }
+          }
+        }
+      },
+      "script": {
+        "source": "cosineSimilarity(params.queryVector, 'product-vector') + 1.0",
+        "params": {
+          "queryVector": [-0.5, 90.0, -10, 14.8, -156.0]
+        }
+      }
+    }
+  }
+}
+----
+//TEST[continued]
+
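
For readers checking the relocated Exact kNN example, the following is a minimal Python sketch of what the `script_score` request in this diff appears to compute: keep only documents with `price >= 1000`, score each survivor as cosine similarity plus 1.0, and rank by score. It is an illustration under stated assumptions, not Elasticsearch's implementation. The names `docs`, `cosine_similarity`, and `exact_knn` are hypothetical, the three documents are the ones visible in the bulk request (the elided `...` entries are not reproduced), and `k=10` only mirrors the usual default page size.

```python
import math

# Documents copied from the bulk request shown in the diff.
# `docs` is a hypothetical in-memory stand-in for the index.
docs = {
    "1": {"product-vector": [230.0, 300.33, -34.8988, 15.555, -200.0], "price": 1599},
    "2": {"product-vector": [-0.5, 100.0, -13.0, 14.8, -156.0], "price": 799},
    "3": {"product-vector": [0.5, 111.3, -13.0, 14.8, -156.0], "price": 1099},
}
query_vector = [-0.5, 90.0, -10, 14.8, -156.0]


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(docs, query_vector, min_price, k):
    """Brute-force kNN: filter first, then score every remaining document."""
    # Assumption: this filter plays the role of the `range` filter on `price`.
    candidates = {
        doc_id: doc for doc_id, doc in docs.items() if doc["price"] >= min_price
    }
    # Adding 1.0 keeps scores non-negative, as in the script source in the diff.
    scored = [
        (cosine_similarity(query_vector, doc["product-vector"]) + 1.0, doc_id)
        for doc_id, doc in candidates.items()
    ]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


results = exact_knn(docs, query_vector, min_price=1000, k=10)
for doc_id, score in results:
    print(doc_id, round(score, 4))
```

Because every candidate is scored, the cost grows linearly with the number of documents that pass the filter, which is why the TIP in the diff recommends a restrictive filter query.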