Commit 8e8c8c1

Create QUANTIZATION.md
1 parent a20e44f commit 8e8c8c1

1 file changed: +76 -0 lines changed

QUANTIZATION.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
### Vector Quantization for High Performance

`sqlite-vector` supports **vector quantization**, a technique that speeds up vector search while reducing memory usage. To quantize the vectors stored in a column, run:

```sql
SELECT vector_quantize('my_table', 'my_column');
```

To further boost performance, quantized vectors can be **preloaded in memory** using:

```sql
SELECT vector_quantize_preload('my_table', 'my_column');
```

This can result in a **4× to 5× speedup** on nearest neighbor queries while keeping memory usage low.

#### What is Quantization?

Quantization compresses high-dimensional float vectors (e.g., `FLOAT32`) into compact representations using lower-precision formats (e.g., `UINT8`). This reduces the size of the data, often by a factor of 4 to 8, making it practical to load large datasets entirely in memory, even on edge devices.
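
To make the idea concrete, here is a minimal Python sketch of one common scheme, min/max scalar quantization from `FLOAT32` to `UINT8`. It is an illustration only: the function names `quantize_uint8` and `dequantize_uint8` are hypothetical, and the source does not say which algorithm `sqlite-vector` uses internally.

```python
import struct

def quantize_uint8(vec):
    """Map each float onto 0..255 using the vector's own min/max.

    Assumption: plain min/max scalar quantization, chosen for illustration;
    this is not necessarily what sqlite-vector does internally.
    """
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(round((x - lo) / scale) for x in vec)
    return codes, lo, scale

def dequantize_uint8(codes, lo, scale):
    """Approximately reconstruct the original floats from the byte codes."""
    return [lo + c * scale for c in codes]

vec = [0.12, -0.53, 0.98, 0.0, -1.0, 0.47, 0.33, -0.25]
raw = struct.pack(f"<{len(vec)}f", *vec)   # FLOAT32 storage: 4 bytes per dimension
codes, lo, scale = quantize_uint8(vec)     # UINT8 storage: 1 byte per dimension
restored = dequantize_uint8(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(vec, restored))

print(f"float32: {len(raw)} bytes, uint8: {len(codes)} bytes")
print(f"largest reconstruction error: {max_err:.5f}")
```

In this scheme each component shrinks from 4 bytes to 1, which accounts for the lower end of the size reduction described above, and the reconstruction error per component is bounded by half the quantization step (`scale / 2`).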

#### Why is it Important?

* **Faster Searches**: With preloaded quantized vectors, distance computations are up to 5× faster.
* **Lower Memory Footprint**: Quantized vectors use significantly less RAM, allowing millions of vectors to fit in memory.
* **Edge-ready**: The reduced size and in-memory access make this ideal for mobile, embedded, and on-device AI applications.

#### Estimate Memory Usage

Before preloading quantized vectors, you can **estimate the memory required** using:

```sql
SELECT vector_quantize_memory('my_table', 'my_column');
```

The result is the approximate number of bytes needed to load the quantized vectors into memory.
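
For a rough sanity check of that figure, the raw payload size can be worked out by hand. The Python sketch below assumes one byte per dimension for quantized vectors and four bytes per dimension for `FLOAT32`; the helper `estimate_bytes` and the dataset size are hypothetical, and the value reported by `vector_quantize_memory` may differ because the storage layout and any per-vector overhead are not described here.

```python
def estimate_bytes(n_vectors, dim, bytes_per_component):
    """Raw payload size only; ignores any metadata or allocator overhead."""
    return n_vectors * dim * bytes_per_component

# Hypothetical dataset: one million 384-dimensional embeddings.
n_vectors, dim = 1_000_000, 384

float32_bytes = estimate_bytes(n_vectors, dim, 4)  # 4 bytes per FLOAT32 component
uint8_bytes = estimate_bytes(n_vectors, dim, 1)    # 1 byte per UINT8 component

print(f"FLOAT32: {float32_bytes / 2**20:.0f} MiB")
print(f"UINT8:   {uint8_bytes / 2**20:.0f} MiB")
print(f"ratio:   {float32_bytes / uint8_bytes:.0f}x")
```

If the number returned by `vector_quantize_memory` is far from this back-of-the-envelope estimate, it is worth checking the vector dimension and row count before preloading.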

#### Accuracy You Can Trust

Despite the compression, our quantization algorithms are tuned to maintain high accuracy. You can expect **recall rates greater than 0.95**, meaning that more than 95% of the exact nearest neighbors also appear in the quantized results.

#### Measuring Recall in SQLite-Vector

You can evaluate the recall of quantized search compared to exact search using a single SQL query. For example, assuming a table `vec_examples` with an `embedding` column, use:

```sql
WITH
-- Ground truth: the k nearest neighbors from an exact full scan
exact_knn AS (
    SELECT e.rowid
    FROM vec_examples AS e
    JOIN vector_full_scan('vec_examples', 'embedding', ?1, ?2) AS v
        ON e.rowid = v.rowid
),
-- Candidates: the k nearest neighbors from the quantized scan
approx_knn AS (
    SELECT e.rowid
    FROM vec_examples AS e
    JOIN vector_quantize_scan('vec_examples', 'embedding', ?1, ?2) AS v
        ON e.rowid = v.rowid
),
-- Exact neighbors that the quantized scan also returned
matches AS (
    SELECT COUNT(*) AS match_count
    FROM exact_knn
    WHERE rowid IN (SELECT rowid FROM approx_knn)
),
-- Size of the ground-truth set
total AS (
    SELECT COUNT(*) AS total_count
    FROM exact_knn
)
SELECT
    (SELECT match_count FROM matches) AS match_count,
    (SELECT total_count FROM total) AS total_count,
    CAST((SELECT match_count FROM matches) AS FLOAT) /
        CAST((SELECT total_count FROM total) AS FLOAT) AS recall;
```

Here `?1` is the input vector (as a BLOB) and `?2` is the number of nearest neighbors `k`.
The query compares the exact and quantized result sets and computes the recall ratio (the fraction of exact neighbors that the quantized scan also returned), helping you validate the quality of quantized search.
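
The same ratio can be computed on the client side once both result sets have been fetched. The Python sketch below mirrors the logic of the query above; the function name `recall_at_k` and the sample rowids are hypothetical and only serve to illustrate the arithmetic.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of exact nearest neighbors that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return None  # undefined when there is no ground truth
    return len(exact & set(approx_ids)) / len(exact)

# Hypothetical rowids returned by the exact and quantized scans for k = 5.
exact_ids = [11, 42, 7, 93, 58]
approx_ids = [42, 11, 93, 64, 7]

print(recall_at_k(exact_ids, approx_ids))  # 4 of the 5 exact neighbors were found: 0.8
```

Recall for a single query vector is noisy, so averaging it over many query vectors gives a more reliable picture of search quality.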
