| 1 | | -# sqlite-vector |
| 1 | +# SQLite-Vector: Fast, Lightweight Vector Search Extension for SQLite |
| 2 | + |
| 3 | +**SQLite-Vector** is a cross-platform, ultra-efficient SQLite extension that brings vector search capabilities to your embedded database. It runs on **iOS, Android, Windows, Linux, and macOS** and uses just **30 MB of memory** by default. With support for **Float32, Float16, BFloat16, Int8, and UInt8** vectors and **highly optimized distance functions**, it's an ideal solution for **Edge AI** applications. |
| 4 | + |
| 5 | +## 🚀 Highlights |
| 6 | + |
| 7 | +* ✅ **No virtual tables required** – store vectors directly as `BLOB`s in ordinary tables |
| 8 | +* ✅ **Blazing fast** – optimized C implementation with SIMD acceleration |
| 9 | +* ✅ **Low memory footprint** – uses just 30 MB of RAM by default |
| 10 | +* ✅ **Zero preindexing needed** – no long preprocessing or index-building phases |
| 11 | +* ✅ **Works offline** – perfect for on-device, privacy-preserving AI workloads |
| 12 | +* ✅ **Plug-and-play** – drop into existing SQLite workflows with minimal effort |
| 13 | +* ✅ **Cross-platform** – works out of the box on all major OSes |
| 14 | + |
| 15 | +--- |
| 16 | + |
| 17 | +## 🧠 What Is Vector Search? |
| 18 | + |
| 19 | +Vector search is the process of finding the closest match(es) to a given vector (a point in high-dimensional space) based on a similarity or distance metric. It is essential for AI and machine learning applications where data is often encoded into vector embeddings. |
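To make the definition concrete, here is a minimal brute-force sketch of nearest-neighbor search in plain Python. It is not how SQLite-Vector is implemented; the `nearest` helper and the sample `items` are hypothetical and only illustrate "find the closest vectors under a distance metric" (Euclidean distance here).

```python
import heapq
import math

# Hypothetical sample data: (id, vector) pairs in a 2-dimensional space.
items = [
    ("a", [0.0, 0.0]),
    ("b", [1.0, 1.0]),
    ("c", [5.0, 5.0]),
]

def nearest(query, items, k=2):
    """Return the k (distance, id) pairs closest to `query` by Euclidean distance."""
    scored = ((math.dist(query, vec), item_id) for item_id, vec in items)
    return heapq.nsmallest(k, scored)

# The two stored vectors closest to the query point, nearest first.
result = nearest([0.9, 0.9], items, k=2)
```

A real workload would use hundreds of dimensions and far more rows, which is where an optimized native implementation matters.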
| 20 | + |
| 21 | +### Common Use Cases |
| 22 | + |
| 23 | +* **Semantic Search**: find documents, emails, or messages similar to a query |
| 24 | +* **Image Retrieval**: search for visually similar images |
| 25 | +* **Recommendation Systems**: match users with products, videos, or music |
| 26 | +* **Voice and Audio Search**: match voice queries or environmental sounds |
| 27 | +* **Anomaly Detection**: find outliers in real-time sensor data |
| 28 | +* **Robotics**: localize spatial features or behaviors using embedded observations |
| 29 | + |
| 30 | +In the AI era, embeddings are everywhere – from language models like GPT to vision transformers. Storing and searching them efficiently is the foundation of intelligent applications. |
| 31 | + |
| 32 | +--- |
| 33 | + |
| 34 | +## 🧩 Why Use SQLite-Vector? |
| 35 | + |
| 36 | +| Feature | SQLite-Vector | Traditional Solutions | |
| 37 | +| -------------------------- | ------------- | ------------------------------------------ | |
| 38 | +| Works with ordinary tables | ✅ | ❌ (usually require special virtual tables) | |
| 39 | +| Requires preindexing | ❌ | ✅ (can take hours for large datasets) | |
| 40 | +| Requires external server | ❌ | ✅ (often needs Redis/FAISS/Weaviate/etc.) | |
| 41 | +| Memory-efficient | ✅ | ❌ | |
| 42 | +| Easy to use SQL | ✅ | ❌ (often complex JOINs, subqueries) | |
| 43 | +| Offline/Edge ready | ✅ | ❌ | |
| 44 | +| Cross-platform | ✅ | ❌ | |
| 45 | + |
| 46 | +Unlike other vector databases or extensions that require complex setup, SQLite-Vector **just works** with your existing database schema and tools. |
| 47 | + |
| 48 | +--- |
| 49 | + |
| 50 | +## 🛠 Supported Vector Types |
| 51 | + |
| 52 | +You can store your vectors as `BLOB` columns in ordinary tables. Supported formats include: |
| 53 | + |
| 54 | +* `float32` (4 bytes per element) |
| 55 | +* `float16` (2 bytes per element) |
| 56 | +* `bfloat16` (2 bytes per element) |
| 57 | +* `int8` (1 byte per element) |
| 58 | +* `uint8` (1 byte per element) |
| 59 | + |
| 60 | +Simply insert a vector as a binary blob into your table. No special table types or schemas are required. |
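As a rough sketch of what "insert a vector as a binary blob" can look like from application code, the following uses only Python's standard `sqlite3` and `struct` modules. The little-endian `float32` layout and the helper names `pack_float32` / `unpack_float32` are assumptions made for illustration; check the extension's documentation for the exact byte layout it expects.

```python
import sqlite3
import struct

def pack_float32(values):
    """Serialize floats as consecutive little-endian float32 values (assumed layout)."""
    return struct.pack(f"<{len(values)}f", *values)

def unpack_float32(blob):
    """Inverse of pack_float32: 4 bytes per element."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, embedding BLOB, label TEXT)"
)

vector = [0.5] * 384          # 384 dimensions, as in the README example
blob = pack_float32(vector)   # 384 elements * 4 bytes = 1536 bytes

conn.execute(
    "INSERT INTO images (embedding, label) VALUES (?, ?)", (blob, "cat")
)
stored = conn.execute(
    "SELECT embedding FROM images WHERE label = 'cat'"
).fetchone()[0]
```

The same arithmetic gives the storage cost of the other types: a 384-dimensional `float16` or `bfloat16` vector takes 768 bytes, and an `int8` or `uint8` vector takes 384 bytes.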
| 61 | + |
| 62 | +--- |
| 63 | + |
| 64 | +## 📐 Supported Distance Metrics |
| 65 | + |
| 66 | +Optimized implementations are available for: |
| 67 | + |
| 68 | +* **L2 Distance (Euclidean)** |
| 69 | +* **Squared L2** |
| 70 | +* **L1 Distance (Manhattan)** |
| 71 | +* **Cosine Distance** |
| 72 | +* **Dot Product** |
| 73 | + |
| 74 | +These are implemented in pure C and use SIMD-optimized code paths when the hardware supports them, which keeps them fast on modern CPUs and mobile devices. |
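For reference, the metrics listed above have the following textbook definitions, sketched here in unoptimized Python. These are the standard mathematical forms, not the extension's code; whether SQLite-Vector applies any extra convention (for example, negating the dot product so that smaller means closer) is not stated here and should be treated as unknown.

```python
import math

def l2_squared(a, b):
    """Squared Euclidean distance: sum of squared element differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def l2(a, b):
    """Euclidean distance: square root of the squared L2 distance."""
    return math.sqrt(l2_squared(a, b))

def l1(a, b):
    """Manhattan distance: sum of absolute element differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity; undefined for zero-length vectors."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot(a, b) / (norm_a * norm_b)
```

Squared L2 skips the square root, so it is cheaper to compute and produces the same ranking as L2.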
| 75 | + |
| 76 | +--- |
| 77 | + |
| 78 | +## 🔍 Example Usage |
| 79 | + |
| 80 | +```sql |
| 81 | +-- Create a regular SQLite table |
| 82 | +CREATE TABLE images ( |
| 83 | + id INTEGER PRIMARY KEY, |
| 84 | + embedding BLOB, -- store Float32/UInt8/etc. |
| 85 | + label TEXT |
| 86 | +); |
| 87 | + |
| 88 | +-- Insert a vector (Float32, 384 dimensions) |
| 89 | +INSERT INTO images (embedding, label) VALUES (?, 'cat'); |
| 90 | + |
| 91 | +-- Initialize the vector column (element type and dimension) |
| 92 | +SELECT vector_init('images', 'embedding', 'type=FLOAT32,dimension=384'); |
| 93 | + |
| 94 | +-- Quantize the stored vectors |
| 95 | +SELECT vector_quantize('images', 'embedding'); |
| 96 | + |
| 97 | +-- Optionally preload the quantized vectors into memory |
| 98 | +SELECT vector_quantize_preload('images', 'embedding'); |
| 99 | + |
| 100 | +-- Run a nearest neighbor query (returns the 20 vectors closest to the query vector bound to ?) |
| 101 | +SELECT e.id, v.distance FROM images AS e |
| 102 | + JOIN vector_quantize_scan('images', 'embedding', ?, 20) AS v |
| 103 | + ON e.id = v.rowid; |
| 104 | +``` |
| 105 | + |
| 106 | +--- |
| 107 | + |
| 108 | +## 📦 Installation |
| 109 | + |
| 110 | +### Pre-built Binaries |
| 111 | + |
| 112 | +Download the appropriate pre-built binary for your platform from the official [Releases](https://github.com/sqliteai/sqlite-vector/releases) page: |
| 113 | + |
| 114 | +- Linux: x86 and ARM |
| 115 | +- macOS: x86 and ARM |
| 116 | +- Windows: x86 |
| 117 | +- Android |
| 118 | +- iOS |
| 119 | + |
| 120 | +### Loading the Extension |
| 121 | + |
| 122 | +```sql |
| 123 | +-- In SQLite CLI |
| 124 | +.load ./vector |
| 125 | + |
| 126 | +-- In SQL |
| 127 | +SELECT load_extension('./vector'); |
| 128 | +``` |
| 129 | + |
| 130 | +Or embed it directly into your application. |
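When the host application is written in Python, the same loading step might look like the sketch below, which uses the standard `sqlite3` module. The helper name `try_load_vector_extension` is hypothetical, and the sketch assumes the Python build's SQLite supports loadable extensions, which is not the case on every platform.

```python
import sqlite3

def try_load_vector_extension(conn, path="./vector"):
    """Try to load the extension at `path`; return True on success, False otherwise."""
    # Some Python builds omit extension loading entirely.
    if not hasattr(conn, "enable_load_extension"):
        return False
    try:
        conn.enable_load_extension(True)
        conn.load_extension(path)
        return True
    except sqlite3.OperationalError:
        # Raised when the file is missing or cannot be loaded.
        return False
    finally:
        # Turn extension loading back off once we are done.
        conn.enable_load_extension(False)

conn = sqlite3.connect(":memory:")
loaded = try_load_vector_extension(conn, "./vector")
```

Disabling extension loading again after the call limits the window in which SQL statements could load arbitrary libraries.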
| 131 | + |
| 132 | +## 🌍 Perfect for Edge AI |
| 133 | + |
| 134 | +SQLite-Vector is designed with the **Edge AI** use case in mind: |
| 135 | + |
| 136 | +* 📴 Runs offline – no internet required |
| 137 | +* 📱 Works on mobile devices – iOS/Android friendly |
| 138 | +* 🔒 Keeps data local – ideal for privacy-focused apps |
| 139 | +* ⚡ Extremely fast – real-time performance on device |
| 140 | + |
| 141 | +You can deploy powerful similarity search capabilities right inside your app or embedded system – **no cloud needed**. |