| 1 | +### Vector Quantization for High Performance |
| 2 | + |
| 3 | +`sqlite-vector` supports **vector quantization**, a technique that accelerates vector search while reducing memory usage. You can quantize your vectors with: |
| 4 | + |
| 5 | +```sql |
| 6 | +SELECT vector_quantize('my_table', 'my_column'); |
| 7 | +``` |
| 8 | + |
| 9 | +To further boost performance, quantized vectors can be **preloaded in memory** using: |
| 10 | + |
| 11 | +```sql |
| 12 | +SELECT vector_quantize_preload('my_table', 'my_column'); |
| 13 | +``` |
| 14 | + |
| 15 | +This can yield a **4×–5× speedup** on nearest-neighbor queries while keeping memory usage low. |
| 16 | + |
| 17 | +#### What is Quantization? |
| 18 | + |
| 19 | +Quantization compresses high-dimensional float vectors (e.g., `FLOAT32`) into compact representations using lower-precision formats (e.g., `UINT8`). This reduces the size of the data, often by a factor of 4 to 8 (going from `FLOAT32` to `UINT8` alone is a 4× reduction), making it practical to load large datasets entirely in memory, even on edge devices. |
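
To make the idea concrete, here is a minimal Python sketch of one common scheme, per-vector min/max scalar quantization from 32-bit floats to `UINT8`. It is an illustration only: the README does not describe the algorithm `sqlite-vector` uses internally, and the helper names `quantize_uint8` and `dequantize_uint8` are hypothetical.

```python
import struct

def quantize_uint8(vec):
    """Map each float to a code in 0..255 using the vector's own min and max.

    Generic scalar quantization, assumed for illustration; not necessarily
    what sqlite-vector does internally.
    """
    lo, hi = min(vec), max(vec)
    # Step between adjacent codes; fall back to 1.0 for constant vectors.
    scale = (hi - lo) / 255.0 or 1.0
    # Round to the nearest code and clamp to the valid byte range.
    codes = bytes(min(255, max(0, round((x - lo) / scale))) for x in vec)
    return codes, lo, scale

def dequantize_uint8(codes, lo, scale):
    """Approximate the original floats from the stored codes."""
    return [lo + c * scale for c in codes]

vec = [0.0, 0.25, 0.5, 1.0]                   # hypothetical 4-dimensional vector
raw = struct.pack(f"{len(vec)}f", *vec)       # FLOAT32 storage: 4 bytes per value
codes, lo, scale = quantize_uint8(vec)        # UINT8 storage: 1 byte per value
restored = dequantize_uint8(codes, lo, scale)

print(len(raw), len(codes))                   # byte sizes before and after
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is why distance rankings computed on the compact codes tend to stay close to the exact ones.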
| 20 | + |
| 21 | +#### Why is it Important? |
| 22 | + |
| 23 | +* **Faster Searches**: With preloaded quantized vectors, distance computations are up to 5× faster. |
| 24 | +* **Lower Memory Footprint**: Quantized vectors use significantly less RAM, allowing millions of vectors to fit in memory. |
| 25 | +* **Edge-ready**: The reduced size and in-memory access make this ideal for mobile, embedded, and on-device AI applications. |
| 26 | + |
| 27 | +#### Estimate Memory Usage |
| 28 | + |
| 29 | +Before preloading quantized vectors, you can **estimate the memory required** using: |
| 30 | + |
| 31 | +```sql |
| 32 | +SELECT vector_quantize_memory('my_table', 'my_column'); |
| 33 | +``` |
| 34 | + |
| 35 | +This gives you an approximate number of bytes needed to load the quantized vectors into memory. |
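
As a rough cross-check on that figure, the raw payload size can be worked out by hand. The Python sketch below assumes a hypothetical dataset of 1,000,000 vectors with 384 dimensions and counts only the vector components; any per-vector metadata the extension stores is ignored, so `vector_quantize_memory` remains the authoritative number.

```python
def estimate_bytes(num_vectors, dimensions, bytes_per_component):
    """Raw payload size: one component per dimension per vector."""
    return num_vectors * dimensions * bytes_per_component

# Hypothetical dataset, chosen only for illustration.
n, d = 1_000_000, 384

float32_bytes = estimate_bytes(n, d, 4)   # FLOAT32: 4 bytes per component
uint8_bytes = estimate_bytes(n, d, 1)     # UINT8: 1 byte per component

print(f"FLOAT32: {float32_bytes / 2**20:.0f} MiB")
print(f"UINT8:   {uint8_bytes / 2**20:.0f} MiB")
```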
| 36 | + |
| 37 | +#### Accuracy You Can Trust |
| 38 | + |
| 39 | +Despite the compression, our quantization algorithms are finely tuned to maintain high accuracy. You can expect **recall rates greater than 0.95**, ensuring that approximate searches closely match exact results in quality. |
| 40 | + |
| 41 | +#### Measuring Recall in SQLite-Vector |
| 42 | + |
| 43 | +You can evaluate the recall of quantized search compared to exact search using a single SQL query. For example, assuming a table `vec_examples` with an `embedding` column, use: |
| 44 | + |
| 45 | +```sql |
| 46 | +WITH |
| 47 | +exact_knn AS ( |
| 48 | + SELECT e.rowid |
| 49 | + FROM vec_examples AS e |
| 50 | + JOIN vector_full_scan('vec_examples', 'embedding', ?1, ?2) AS v |
| 51 | + ON e.rowid = v.rowid |
| 52 | +), |
| 53 | +approx_knn AS ( |
| 54 | + SELECT e.rowid |
| 55 | + FROM vec_examples AS e |
| 56 | + JOIN vector_quantize_scan('vec_examples', 'embedding', ?1, ?2) AS v |
| 57 | + ON e.rowid = v.rowid |
| 58 | +), |
| 59 | +matches AS ( |
| 60 | + SELECT COUNT(*) AS match_count |
| 61 | + FROM exact_knn |
| 62 | + WHERE rowid IN (SELECT rowid FROM approx_knn) |
| 63 | +), |
| 64 | +total AS ( |
| 65 | + SELECT COUNT(*) AS total_count |
| 66 | + FROM exact_knn |
| 67 | +) |
| 68 | +SELECT |
| 69 | + (SELECT match_count FROM matches) AS match_count, |
| 70 | + (SELECT total_count FROM total) AS total_count, |
| 71 | + CAST((SELECT match_count FROM matches) AS FLOAT) / |
| 72 | + CAST((SELECT total_count FROM total) AS FLOAT) AS recall; |
| 73 | +``` |
| 74 | + |
| 75 | +Here, `?1` is the input vector (as a BLOB) and `?2` is the number of nearest neighbors `k`. |
| 76 | +The query compares exact and quantized results and computes recall as the fraction of exact top-`k` rows that also appear in the quantized results, which helps you validate the quality of quantized search. |
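
If you collect the two result sets in application code instead, the same ratio takes a few lines of Python. This is a sketch under assumptions: `recall_at_k` is a hypothetical helper, and the rowids are made up to stand in for the output of `vector_full_scan` and `vector_quantize_scan`.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of exact nearest-neighbor rowids also returned by the approximate search."""
    exact = set(exact_ids)
    if not exact:
        return None  # mirrors the SQL query, which yields NULL when there are no exact rows
    return len(exact & set(approx_ids)) / len(exact)

# Hypothetical rowids for k = 5.
exact_ids = [1, 2, 3, 4, 5]
approx_ids = [1, 2, 3, 4, 9]

recall = recall_at_k(exact_ids, approx_ids)
print(recall)  # 4 of the 5 exact neighbors were found
```

Averaging this value over many query vectors gives a more stable estimate than a single query.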