Home
Leonardo Xavier Kuffo Rivero edited this page Mar 25, 2026
SuperKMeans is a super-fast library for clustering high-dimensional vector embeddings. Its main use case is building partition-based indexes (e.g., IVF) over large vector collections.
We offer two clustering options:
- Vanilla Super K-Means: Classical Lloyd's k-means accelerated with dimension pruning. Clustering quality is equivalent to that of FAISS.
- Hierarchical Super K-Means: Hierarchical k-means accelerated with dimension pruning. Extremely fast for large collections (more than 1M embeddings) with minimal quality loss.
Use SuperKMeans if:
- You need to index a large collection of high-dimensional (d > 128) vector embeddings
- You need a lightweight and much faster alternative to FAISS clustering
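To give an intuition for what "Lloyd's k-means accelerated with dimension pruning" can mean, here is a minimal pure-Python sketch. It is not SuperKMeans code and does not use its API: the function names `nearest_centroid_pruned` and `lloyd_kmeans`, the `block` parameter, and the specific pruning rule (generic partial-distance early abandonment) are assumptions made for illustration. The library's actual pruning strategy may differ.

```python
import random


def nearest_centroid_pruned(x, centroids, block=4):
    """Return (index, squared_distance) of the centroid closest to x.

    Illustrative partial-distance pruning: the squared Euclidean distance
    is accumulated `block` dimensions at a time, and a candidate centroid
    is abandoned as soon as its partial sum reaches the best full distance
    found so far (squared terms can only increase the sum).
    """
    best_idx, best_dist = -1, float("inf")
    for idx, c in enumerate(centroids):
        partial = 0.0
        pruned = False
        for start in range(0, len(x), block):
            for a, b in zip(x[start:start + block], c[start:start + block]):
                diff = a - b
                partial += diff * diff
            if partial >= best_dist:
                pruned = True  # remaining dimensions cannot make it better
                break
        if not pruned:
            best_idx, best_dist = idx, partial
    return best_idx, best_dist


def lloyd_kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means that uses the pruned search for assignment."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        assignments = [nearest_centroid_pruned(p, centroids)[0] for p in points]
        # Update step: move each centroid to the mean of its points.
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p, a in zip(points, assignments):
            counts[a] += 1
            for j, v in enumerate(p):
                sums[a][j] += v
        for i in range(k):
            if counts[i]:  # keep the old centroid if a cluster is empty
                centroids[i] = [s / counts[i] for s in sums[i]]
    # Final assignment so the labels match the returned centroids.
    assignments = [nearest_centroid_pruned(p, centroids)[0] for p in points]
    return centroids, assignments
```

Calling `lloyd_kmeans(points, k)` on a list of equal-length float lists returns the centroids and one cluster label per point. The pruning never changes the result of the nearest-centroid search; it only skips work, and the savings tend to grow with dimensionality because more blocks can be skipped per abandoned candidate.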
For installation instructions, see INSTALL.md.