Commit c0374c2
Reduce quantization optimization steps at ivf query time (elastic#130493)
Since we are quantizing once per posting list centroid, I think we can get
away with fewer optimization iterations.
Dropping from 5 to 2 reduces latency when hitting many centroids, with
no recall impact (at least on my data sets).
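To make the iteration count concrete, here is a minimal sketch of iterative interval refinement for scalar quantization. It is not the Elasticsearch or Lucene quantizer: the class and method names are hypothetical, the loss is plain squared error, and the input is a synthetic Gaussian vector. It only illustrates how an `iters` parameter bounds the work done per quantized vector.

```java
import java.util.Random;

// Hypothetical illustration only; not the Elasticsearch/Lucene implementation.
final class IntervalRefinementSketch {

    // Index of the grid point nearest to x on [a, b] (requires a < b).
    static long gridIndex(double x, double a, double b, double step) {
        double clamped = Math.min(Math.max(x, a), b);
        return Math.round((clamped - a) / step);
    }

    // Squared error when each component is snapped to the nearest of
    // `levels` evenly spaced points on [a, b].
    static double loss(float[] v, double a, double b, int levels) {
        double step = (b - a) / (levels - 1);
        double sum = 0;
        for (float x : v) {
            double r = a + gridIndex(x, a, b, step) * step;
            sum += (x - r) * (x - r);
        }
        return sum;
    }

    // Runs `iters` rounds of: assign components to grid points, then
    // re-fit (a, b) by least squares for those fixed assignments.
    static double[] refine(float[] v, double a, double b, int levels, int iters) {
        for (int it = 0; it < iters; it++) {
            double step = (b - a) / (levels - 1);
            double saa = 0, sab = 0, sbb = 0, sax = 0, sbx = 0;
            for (float x : v) {
                double s = (double) gridIndex(x, a, b, step) / (levels - 1);
                double wa = 1 - s, wb = s; // reconstruction = wa * a + wb * b
                saa += wa * wa; sab += wa * wb; sbb += wb * wb;
                sax += wa * x;  sbx += wb * x;
            }
            double det = saa * sbb - sab * sab;
            if (Math.abs(det) < 1e-12) break; // degenerate: keep current interval
            double newA = (sbb * sax - sab * sbx) / det;
            double newB = (saa * sbx - sab * sax) / det;
            a = newA;
            b = newB;
        }
        return new double[] { a, b };
    }

    // Starting interval: the vector's min and max.
    static double[] initialInterval(float[] v) {
        double lo = v[0], hi = v[0];
        for (float x : v) { lo = Math.min(lo, x); hi = Math.max(hi, x); }
        return new double[] { lo, hi };
    }

    // Synthetic data (assumption): seeded Gaussian components.
    static float[] sampleVector(int dim, long seed) {
        Random rnd = new Random(seed);
        float[] v = new float[dim];
        for (int i = 0; i < dim; i++) v[i] = (float) rnd.nextGaussian();
        return v;
    }

    public static void main(String[] args) {
        float[] v = sampleVector(768, 42L);
        double[] start = initialInterval(v);
        for (int iters : new int[] { 0, 2, 5 }) {
            double[] ab = refine(v, start[0], start[1], 16, iters);
            System.out.println(iters + " iterations -> loss " + loss(v, ab[0], ab[1], 16));
        }
    }
}
```

Each round costs one pass over the vector, so the cost scales with the number of centroids probed times the iteration count. If most of the loss reduction happens in the first rounds, later rounds buy little, which is the trade-off the benchmarks below measure.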
baseline:
```
index_name                      index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count  QPS     recall  visited
------------------------------  ----------  -------  -----------  ----------------  -------------  ------  ------  ---------
cohere-wikipedia-docs-768d.vec  ivf         100      2.43         0.00              0.00           411.52  0.91    23766.65
```
candidate:
```
index_name                      index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count  QPS     recall  visited
------------------------------  ----------  -------  -----------  ----------------  -------------  ------  ------  ---------
cohere-wikipedia-docs-768d.vec  ivf         100      1.84         0.00              0.00           543.48  0.91    23766.65
```
Here is a more extreme case (many segments):
baseline:
```
index_name                      index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count  QPS     recall  visited
------------------------------  ----------  -------  -----------  ----------------  -------------  ------  ------  ---------
cohere-wikipedia-docs-768d.vec  ivf         100      36.10        0.00              0.00           27.70   0.87    364480.37
```
candidate:
```
index_name                      index_type  n_probe  latency(ms)  net_cpu_time(ms)  avg_cpu_count  QPS     recall  visited
------------------------------  ----------  -------  -----------  ----------------  -------------  ------  ------  ---------
cohere-wikipedia-docs-768d.vec  ivf         100      24.94        0.00              0.00           40.10   0.87    364480.37
```
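The relative gains in the tables above can be checked with a few lines of arithmetic. The helper below is a hypothetical convenience, not part of the benchmark tooling; it takes the latency and QPS figures exactly as reported.

```java
import java.util.Locale;

// Hypothetical helper for reading the benchmark tables; not part of the commit.
final class BenchmarkDelta {

    // Percentage by which latency dropped relative to the baseline.
    static double latencyReductionPct(double baselineMs, double candidateMs) {
        return 100.0 * (baselineMs - candidateMs) / baselineMs;
    }

    // Percentage by which QPS rose relative to the baseline.
    static double qpsGainPct(double baselineQps, double candidateQps) {
        return 100.0 * (candidateQps - baselineQps) / baselineQps;
    }

    public static void main(String[] args) {
        // First run: latency 2.43 -> 1.84 ms, QPS 411.52 -> 543.48.
        System.out.println(String.format(Locale.ROOT,
            "first run:     latency -%.1f%%, QPS +%.1f%%",
            latencyReductionPct(2.43, 1.84), qpsGainPct(411.52, 543.48)));
        // Many segments: latency 36.10 -> 24.94 ms, QPS 27.70 -> 40.10.
        System.out.println(String.format(Locale.ROOT,
            "many segments: latency -%.1f%%, QPS +%.1f%%",
            latencyReductionPct(36.10, 24.94), qpsGainPct(27.70, 40.10)));
    }
}
```

With the reported figures this works out to roughly 24% lower latency in the first run and roughly 31% in the many-segments run, with recall unchanged at 0.91 and 0.87 respectively.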
Need to test against more data sets, but this is a nice improvement.
1 parent 7fac8ff commit c0374c2
1 file changed: 2 additions & 1 deletion
- server/src/main/java/org/elasticsearch/index/codec/vectors
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | 36 | + |
| 214 | | - |
| | 215 | + |