Commit 7ceee5d
Put sensible bounds on quantized scores (#15411)
Quantization loss can make the corrected dot product return unexpected values, some of them out of bounds.
This is more likely when the inputs are "extreme", meaning very far from the segment-level centroid.
Bound euclidean distance to a non-negative value -- negative values do not make any sense.
Clamp dot product/cosine score to [-1,1] as the normalized dot product should always return values in this range.
This works well enough for 4+ bit quantization but may not work as well for 1-bit quantization since the loss is so great.
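The bounding described above can be sketched roughly as follows. This is an illustration only, not the code changed in this commit: the class `ScoreBounds` and its method names are hypothetical, and the score mappings `(1 + dot) / 2` and `1 / (1 + d^2)` are assumed here as the usual conventions for dot-product and Euclidean similarity.

```java
// Hypothetical helper; names are illustrative and not taken from the commit.
final class ScoreBounds {
  private ScoreBounds() {}

  /** Clamp an estimated normalized dot product (or cosine) into [-1, 1]. */
  static float clampDotProduct(float estimated) {
    return Math.max(-1f, Math.min(1f, estimated));
  }

  /** Bound an estimated squared Euclidean distance below at zero. */
  static float boundSquaredDistance(float estimated) {
    return Math.max(0f, estimated);
  }

  /** Assumed mapping from a dot product to a score in [0, 1]: (1 + dot) / 2. */
  static float dotProductToScore(float estimatedDot) {
    return (1f + clampDotProduct(estimatedDot)) / 2f;
  }

  /** Assumed mapping from a squared distance to a score in (0, 1]: 1 / (1 + d^2). */
  static float euclideanToScore(float estimatedSquaredDistance) {
    return 1f / (1f + boundSquaredDistance(estimatedSquaredDistance));
  }
}
```

Under these assumptions, an unclamped estimate of -1.2 would map to (1 - 1.2) / 2 = -0.1, a negative score; with the clamp it maps to 0.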
Fix testSingleVectorCase to L2-normalize all vectors for DOT_PRODUCT similarity.
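For reference, L2 normalization scales a vector to unit length, which is what keeps the exact dot product of two such vectors within [-1,1]. The sketch below is a minimal, self-contained version; the class `L2Normalize` is hypothetical, and the helper the test actually calls is not shown on this page.

```java
// Hypothetical helper; not the code used by the test in this commit.
final class L2Normalize {
  private L2Normalize() {}

  /** Return a copy of v scaled to unit Euclidean length. */
  static float[] l2Normalize(float[] v) {
    double sumSquares = 0;
    for (float x : v) {
      sumSquares += (double) x * x;
    }
    if (sumSquares == 0) {
      throw new IllegalArgumentException("cannot normalize a zero-length vector");
    }
    float inverseNorm = (float) (1.0 / Math.sqrt(sumSquares));
    float[] out = new float[v.length];
    for (int i = 0; i < v.length; i++) {
      out[i] = v[i] * inverseNorm;
    }
    return out;
  }

  /** Plain dot product of two equal-length vectors. */
  static float dot(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("dimension mismatch");
    }
    float sum = 0;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }
}
```

For example, `l2Normalize(new float[] {3f, 4f})` yields approximately `{0.6, 0.8}`, and the dot product of that result with itself is approximately 1.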
This is a partial fix for #15408
1 parent efa5204 commit 7ceee5d
File tree
2 files changed, +21 -6 lines changed
- lucene/core/src
  - java/org/apache/lucene/codecs/lucene104
  - test/org/apache/lucene/codecs/lucene104
Lines changed: 7 additions & 2 deletions
| Original lines | New lines | Change |
|---|---|---|
| 266 | 266-268 | 1 line removed, 3 lines added |
| 277 | 279-282 | 1 line removed, 4 lines added |
Lines changed: 14 additions & 4 deletions
| Original lines | New lines | Change |
|---|---|---|
| (none) | 50 | 1 line added |
| (none) | 52 | 1 line added |
| 112 | 114-118 | 1 line removed, 5 lines added |
| 121 | 127-128 | 1 line removed, 2 lines added |
| 123-124 | 130-134 | 2 lines removed, 5 lines added |