diff --git a/.github/ignore-notebooks.txt b/.github/ignore-notebooks.txt
index 55052688..61ba17de 100644
--- a/.github/ignore-notebooks.txt
+++ b/.github/ignore-notebooks.txt
@@ -7,4 +7,6 @@
02_semantic_cache_optimization
spring_ai_redis_rag.ipynb
00_litellm_proxy_redis.ipynb
-04_redisvl_benchmarking_basics.ipynb
\ No newline at end of file
+04_redisvl_benchmarking_basics.ipynb
+06_hnsw_to_svs_vamana_migration.ipynb
+07_flat_to_svs_vamana_migration.ipynb
\ No newline at end of file
diff --git a/README.md b/README.md
index a01de17f..6425baf0 100644
--- a/README.md
+++ b/README.md
@@ -69,6 +69,8 @@ Need quickstarts to begin your Redis AI journey?
| ๐ข **Data Type Support** - Shows how to convert a float32 index to float16 or integer dataypes | [](python-recipes/vector-search/03_dtype_support.ipynb) | [](https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/vector-search/03_dtype_support.ipynb) |
| ๐ **Benchmarking Basics** - Overview of search benchmarking basics with RedisVL and Python multiprocessing | [](python-recipes/vector-search/04_redisvl_benchmarking_basics.ipynb) | [](https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/vector-search/04_redisvl_benchmarking_basics.ipynb) |
| ๐ **Multi Vector Search** - Overview of multi vector queries with RedisVL | [](python-recipes/vector-search/05_multivector_search.ipynb) | [](https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/vector-search/05_multivector_search.ipynb) |
+| ๐๏ธ **HNSW to SVS-VAMANA Migration** - Showcase how to migrate HNSW indices to SVS-VAMANA and demonstrate 50-75% memory savings benefits | [](python-recipes/vector-search/06_hnsw_to_svs_vamana_migration.ipynb) | [](https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/vector-search/06_hnsw_to_svs_vamana_migration.ipynb) |
+| ๐๏ธ **FLAT to SVS-VAMANA Migration** - Showcase how to migrate FLAT indices to SVS-VAMANA and demonstrate significant memory reduction benefits | [](python-recipes/vector-search/07_flat_to_svs_vamana_migration.ipynb) | [](https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/vector-search/07_flat_to_svs_vamana_migration.ipynb) |
### Retrieval Augmented Generation (RAG)
diff --git a/python-recipes/vector-search/06_hnsw_to_svs_vamana_migration.ipynb b/python-recipes/vector-search/06_hnsw_to_svs_vamana_migration.ipynb
new file mode 100644
index 00000000..dbe20a7a
--- /dev/null
+++ b/python-recipes/vector-search/06_hnsw_to_svs_vamana_migration.ipynb
@@ -0,0 +1,1314 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "# Migrating from HNSW to SVS-VAMANA\n",
+ "\n",
+ "## Let's Begin!\n",
+ "
\n",
+ "\n",
+ "This notebook demonstrates how to migrate existing HNSW vector indices to SVS-VAMANA for improved memory efficiency while maintaining search quality.\n",
+ "\n",
+ "## What You'll Learn\n",
+ "\n",
+ "- How to assess your current HNSW index for migration\n",
+ "- Step-by-step migration from HNSW to SVS-VAMANA\n",
+ "- Memory usage comparison and cost analysis\n",
+ "- Search quality validation between HNSW and SVS-VAMANA\n",
+ "- Performance benchmarking and recall comparison\n",
+ "- Migration decision framework for production systems\n",
+ "\n",
+ "## Prerequisites\n",
+ "\n",
+ "- Redis Stack 8.2.0+ with RediSearch 2.8.10+\n",
+ "- Existing HNSW index with substantial data (1000+ documents recommended)\n",
+ "- High-dimensional vectors (768+ dimensions for best compression benefits)\n",
+ "\n",
+ "## HNSW vs SVS-VAMANA\n",
+ "\n",
+ "**HNSW (Hierarchical Navigable Small World):**\n",
+ "- Excellent search quality and recall\n",
+ "- Fast query performance\n",
+ "- Higher memory usage (stores full-precision vectors)\n",
+ "- Good for applications prioritizing search quality\n",
+ "\n",
+ "**SVS-VAMANA:**\n",
+ "- Competitive search quality with compression\n",
+ "- Significant memory savings (50-75% reduction)\n",
+ "- Built-in vector compression (LeanVec, quantization)\n",
+ "- Ideal for large-scale deployments with cost constraints"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## ๐ฆ Installation & Setup\n",
+ "\n",
+ "This notebook requires **sentence-transformers** for generating embeddings and **Redis Stack** running in Docker.\n",
+ "\n",
+ "**Requirements:**\n",
+ "- Redis Stack 8.2.0+ with RediSearch 2.8.10+\n",
+ "- sentence-transformers (for generating embeddings)\n",
+ "- numpy (for vector operations)\n",
+ "- redisvl (should be available in your environment)\n",
+ "\n",
+ "**๐ณ Docker Setup (Required):**\n",
+ "\n",
+ "Before running this notebook, make sure Redis Stack is running in Docker:\n",
+ "\n",
+ "```bash\n",
+ "# Start Redis Stack with Docker\n",
+ "docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest\n",
+ "```\n",
+ "\n",
+ "Or if you prefer using docker-compose, create a `docker-compose.yml` file:\n",
+ "\n",
+ "```yaml\n",
+ "version: '3.8'\n",
+ "services:\n",
+ " redis:\n",
+ " image: redis/redis-stack:latest\n",
+ " ports:\n",
+ " - \"6379:6379\"\n",
+ " - \"8001:8001\"\n",
+ "```\n",
+ "\n",
+ "Then run: `docker-compose up -d`\n",
+ "\n",
+ "**๐ Python Dependencies Installation:**\n",
+ "\n",
+ "Install the required Python packages:\n",
+ "\n",
+ "```bash\n",
+ "# Install core dependencies\n",
+ "pip install redisvl numpy sentence-transformers\n",
+ "\n",
+ "# Or install with specific versions for compatibility\n",
+ "pip install redisvl>=0.2.0 numpy>=1.21.0 sentence-transformers>=2.2.0\n",
+ "```\n",
+ "\n",
+ "**For Google Colab users, run this cell:**\n",
+ "\n",
+ "```python\n",
+ "!pip install redisvl sentence-transformers numpy\n",
+ "```\n",
+ "\n",
+ "**For Conda users:**\n",
+ "\n",
+ "```bash\n",
+ "conda install numpy\n",
+ "pip install redisvl sentence-transformers\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# # Install dependencies if needed\n",
+ "# import sys\n",
+ "# import subprocess\n",
+ "\n",
+ "# def install_if_missing(package):\n",
+ "# try:\n",
+ "# __import__(package)\n",
+ "# except ImportError:\n",
+ "# print(f\"Installing {package}...\")\n",
+ "# subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", package])\n",
+ "\n",
+ "# # Check and install required packages\n",
+ "# install_if_missing(\"sentence-transformers\")\n",
+ "# install_if_missing(\"redisvl\")\n",
+ "\n",
+ "# print(\"โ
All dependencies are ready!\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Libraries imported successfully!\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Import required libraries\n",
+ "import os\n",
+ "import json\n",
+ "import numpy as np\n",
+ "import time\n",
+ "from typing import List, Dict, Any\n",
+ "\n",
+ "# Redis and RedisVL imports\n",
+ "import redis\n",
+ "from redisvl.index import SearchIndex\n",
+ "from redisvl.query import VectorQuery\n",
+ "from redisvl.redis.utils import array_to_buffer, buffer_to_array\n",
+ "from redisvl.utils import CompressionAdvisor\n",
+ "from redisvl.redis.connection import supports_svs\n",
+ "\n",
+ "# Configuration\n",
+ "REDIS_URL = \"redis://localhost:6379\"\n",
+ "\n",
+ "print(\"๐ Libraries imported successfully!\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 1: Verify Redis and SVS Support\n",
+ "\n",
+ "First, let's ensure Redis Stack is running and supports SVS-VAMANA."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "โ
Redis connection successful\n",
+ "๐ Redis version: 8.2.2\n",
+ "โ
SVS-VAMANA supported\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Test Redis connection and SVS support\n",
+ "try:\n",
+ " client = redis.Redis.from_url(REDIS_URL)\n",
+ " client.ping()\n",
+ " print(\"โ
Redis connection successful\")\n",
+ " \n",
+ " # Check Redis version\n",
+ " redis_info = client.info()\n",
+ " redis_version = redis_info['redis_version']\n",
+ " print(f\"๐ Redis version: {redis_version}\")\n",
+ " \n",
+ " # Check SVS support\n",
+ " if supports_svs(client):\n",
+ " print(\"โ
SVS-VAMANA supported\")\n",
+ " else:\n",
+ " print(\"โ SVS-VAMANA not supported\")\n",
+ " print(\"Please ensure you're using Redis Stack 8.2.0+ with RediSearch 2.8.10+\")\n",
+ " \n",
+ "except Exception as e:\n",
+ " print(f\"โ Redis connection failed: {e}\")\n",
+ " print(\"Please ensure Redis Stack is running on localhost:6379\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 2: Load Sample Data\n",
+ "\n",
+ "We'll use the movie dataset to demonstrate the migration process."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ฝ๏ธ Loaded 20 movie records\n",
+ "Sample movie: Explosive Pursuit\n",
+ "Genres available: {'comedy', 'action'}\n",
+ "\n",
+ "๐ง Configuration:\n",
+ "Vector dimensions: 1024\n",
+ "Dataset size: 20 movie documents\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Load the movies dataset\n",
+ "with open('resources/movies.json', 'r') as f:\n",
+ " movies_data = json.load(f)\n",
+ "\n",
+ "print(\n",
+ " f\"๐ฝ๏ธ Loaded {len(movies_data)} movie records\",\n",
+ " f\"Sample movie: {movies_data[0]['title']}\",\n",
+ " f\"Genres available: {set(movie['genre'] for movie in movies_data)}\",\n",
+ " sep=\"\\n\"\n",
+ ")\n",
+ "\n",
+ "# Configuration for demonstration \n",
+ "dims = 1024 # sentence-transformers/all-roberta-large-v1 - 1024 dims\n",
+ "num_docs = len(movies_data) # Use actual dataset size\n",
+ "\n",
+ "print(\n",
+ " f\"\\n๐ง Configuration:\",\n",
+ " f\"Vector dimensions: {dims}\",\n",
+ " f\"Dataset size: {num_docs} movie documents\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 3: Create HNSW Index\n",
+ "\n",
+ "First, we'll create an HNSW index with typical production settings."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Creating HNSW index with optimized settings...\n",
+ "โ
Created HNSW index: hnsw_demo_index\n",
+ "\n",
+ "๐ง HNSW Configuration:\n",
+ "M (connections per node): 16\n",
+ "EF Construction: 200\n",
+ "EF Runtime: 10\n",
+ "Distance metric: cosine\n",
+ "Data type: float32\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Create HNSW schema with production-like settings\n",
+ "hnsw_schema = {\n",
+ " \"index\": {\n",
+ " \"name\": \"hnsw_demo_index\",\n",
+ " \"prefix\": \"demo:hnsw:\",\n",
+ " },\n",
+ " \"fields\": [\n",
+ " {\"name\": \"movie_id\", \"type\": \"tag\"},\n",
+ " {\"name\": \"title\", \"type\": \"text\"},\n",
+ " {\"name\": \"genre\", \"type\": \"tag\"},\n",
+ " {\"name\": \"rating\", \"type\": \"numeric\"},\n",
+ " {\"name\": \"description\", \"type\": \"text\"},\n",
+ " {\n",
+ " \"name\": \"embedding\",\n",
+ " \"type\": \"vector\",\n",
+ " \"attrs\": {\n",
+ " \"dims\": dims,\n",
+ " \"algorithm\": \"hnsw\",\n",
+ " \"datatype\": \"float32\",\n",
+ " \"distance_metric\": \"cosine\",\n",
+ " \"m\": 16, # Number of bi-directional links for each node\n",
+ " \"ef_construction\": 200, # Size of dynamic candidate list\n",
+ " \"ef_runtime\": 10 # Size of dynamic candidate list during search\n",
+ " }\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "\n",
+ "print(\"Creating HNSW index with optimized settings...\")\n",
+ "hnsw_index = SearchIndex.from_dict(hnsw_schema, redis_url=REDIS_URL)\n",
+ "hnsw_index.create(overwrite=True)\n",
+ "print(f\"โ
Created HNSW index: {hnsw_index.name}\")\n",
+ "\n",
+ "# Display HNSW configuration\n",
+ "print(\n",
+ " \"\\n๐ง HNSW Configuration:\",\n",
+ " f\"M (connections per node): 16\",\n",
+ " f\"EF Construction: 200\",\n",
+ " f\"EF Runtime: 10\",\n",
+ " f\"Distance metric: cosine\",\n",
+ " f\"Data type: float32\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 4: Generate Embeddings and Load HNSW Index\n",
+ "\n",
+ "Generate embeddings for movie descriptions and populate the HNSW index."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Generating embeddings for movie descriptions...\n",
+ "14:40:35 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps\n",
+ "14:40:35 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: all-MiniLM-L6-v2\n"
+ ]
+ },
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "dfa2af21d4904b58845f57a9786706e3",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ "Batches: 0%| | 0/1 [00:00, ?it/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "โ
Generated real embeddings using SentenceTransformer\n",
+ "๐ Embedding shape: (20, 1024)\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Generate embeddings (synthetic for demonstration)\n",
+ "print(\"๐ Generating embeddings for movie descriptions...\")\n",
+ "\n",
+ "try:\n",
+ " # Try to use sentence-transformers if available\n",
+ " from sentence_transformers import SentenceTransformer\n",
+ " model = SentenceTransformer('all-MiniLM-L6-v2') # 384 dimensions\n",
+ " \n",
+ " # Generate real embeddings\n",
+ " descriptions = [movie['description'] for movie in movies_data]\n",
+ " embeddings = model.encode(descriptions, convert_to_numpy=True)\n",
+ " \n",
+ " # Pad to 1024 dimensions for demonstration\n",
+ " if embeddings.shape[1] < dims:\n",
+ " padding = np.zeros((embeddings.shape[0], dims - embeddings.shape[1]))\n",
+ " embeddings = np.concatenate([embeddings, padding], axis=1)\n",
+ " \n",
+ " print(f\"โ
Generated real embeddings using SentenceTransformer\")\n",
+ " \n",
+ "except ImportError:\n",
+ " # Fallback to synthetic embeddings\n",
+ " print(\"๐ SentenceTransformer not available, generating synthetic embeddings...\")\n",
+ " \n",
+ " np.random.seed(42) # For reproducible results\n",
+ " embeddings = []\n",
+ " \n",
+ " for i, movie in enumerate(movies_data):\n",
+ " # Create a pseudo-semantic embedding based on movie content\n",
+ " vector = np.random.random(dims).astype(np.float32)\n",
+ " \n",
+ " # Add some structure based on genre\n",
+ " if movie['genre'] == 'action':\n",
+ " vector[:50] += 0.3 # Action movies cluster\n",
+ " else: # comedy\n",
+ " vector[50:100] += 0.3 # Comedy movies cluster\n",
+ " \n",
+ " # Normalize\n",
+ " vector = vector / np.linalg.norm(vector)\n",
+ " embeddings.append(vector)\n",
+ " \n",
+ " embeddings = np.array(embeddings)\n",
+ " print(f\"โ
Generated {len(embeddings)} synthetic embeddings\")\n",
+ "\n",
+ "print(f\"๐ Embedding shape: {embeddings.shape}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ฆ Prepared 20 documents for indexing\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Prepare data for loading into HNSW index\n",
+ "sample_data = []\n",
+ "for i, movie in enumerate(movies_data):\n",
+ " sample_data.append({\n",
+ " 'movie_id': str(movie['id']),\n",
+ " 'title': movie['title'],\n",
+ " 'genre': movie['genre'],\n",
+ " 'rating': movie['rating'],\n",
+ " 'description': movie['description'],\n",
+ " 'embedding': array_to_buffer(embeddings[i].astype(np.float32), dtype='float32')\n",
+ " })\n",
+ "\n",
+ "print(f\"๐ฆ Prepared {len(sample_data)} documents for indexing\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ฅ Loading data into HNSW index...\n",
+ " Loaded 20/20 documents\n",
+ "โณ Waiting for HNSW indexing to complete...\n",
+ "\n",
+ "โ
HNSW index loaded with 20 documents\n",
+ "Index size: 4.225975036621094 MB\n",
+ "Indexing time: ~5 seconds (HNSW graph construction)\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Load data into HNSW index\n",
+ "print(\"๐ฅ Loading data into HNSW index...\")\n",
+ "batch_size = 100 # Process in batches\n",
+ "\n",
+ "for i in range(0, len(sample_data), batch_size):\n",
+ " batch = sample_data[i:i+batch_size]\n",
+ " hnsw_index.load(batch)\n",
+ " print(f\" Loaded {min(i+batch_size, len(sample_data))}/{len(sample_data)} documents\")\n",
+ "\n",
+ "# Wait for indexing to complete\n",
+ "print(\"โณ Waiting for HNSW indexing to complete...\")\n",
+ "time.sleep(5) # HNSW indexing takes longer than FLAT\n",
+ "\n",
+ "hnsw_info = hnsw_index.info()\n",
+ "print(\n",
+ " f\"\\nโ
HNSW index loaded with {hnsw_info['num_docs']} documents\",\n",
+ " f\"Index size: {hnsw_info.get('vector_index_sz_mb', 'N/A')} MB\",\n",
+ " f\"Indexing time: ~5 seconds (HNSW graph construction)\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 5: Get Compression Recommendation\n",
+ "\n",
+ "Use the CompressionAdvisor to get optimal SVS-VAMANA settings for our data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Analyzing data for optimal compression settings...\n",
+ "\n",
+ "๐ Compression Recommendations:\n",
+ "\n",
+ "๐๏ธ Memory Priority:\n",
+ " Algorithm: svs-vamana\n",
+ " Compression: LeanVec4x8\n",
+ " Datatype: float16\n",
+ " Dimensions: 1024 โ 512\n",
+ "\n",
+ "โ๏ธ Balanced Priority:\n",
+ " Algorithm: svs-vamana\n",
+ " Compression: LeanVec4x8\n",
+ " Datatype: float16\n",
+ " Dimensions: 1024 โ 512\n",
+ "\n",
+ "โก Performance Priority:\n",
+ " Algorithm: svs-vamana\n",
+ " Compression: LeanVec4x8\n",
+ " Datatype: float16\n",
+ " Dimensions: 1024 โ 512\n",
+ "\n",
+ "โ
Selected configuration: Memory Priority\n",
+ "Expected memory reduction: ~50.0% from dimension reduction\n",
+ "Additional savings from float16 compression\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Get compression recommendation\n",
+ "print(\"๐ Analyzing data for optimal compression settings...\")\n",
+ "\n",
+ "# Get recommendations for different priorities\n",
+ "memory_config = CompressionAdvisor.recommend(dims=dims, priority=\"memory\")\n",
+ "balanced_config = CompressionAdvisor.recommend(dims=dims, priority=\"balanced\")\n",
+ "performance_config = CompressionAdvisor.recommend(dims=dims, priority=\"performance\")\n",
+ "\n",
+ "print(\n",
+ " \"\\n๐ Compression Recommendations:\",\n",
+ " \"\",\n",
+ " \"๐๏ธ Memory Priority:\",\n",
+ " f\" Algorithm: {memory_config['algorithm']}\",\n",
+ " f\" Compression: {memory_config.get('compression', 'None')}\",\n",
+ " f\" Datatype: {memory_config['datatype']}\",\n",
+ " f\" Dimensions: {dims} โ {memory_config.get('reduce', dims)}\",\n",
+ " \"\",\n",
+ " \"โ๏ธ Balanced Priority:\",\n",
+ " f\" Algorithm: {balanced_config['algorithm']}\",\n",
+ " f\" Compression: {balanced_config.get('compression', 'None')}\",\n",
+ " f\" Datatype: {balanced_config['datatype']}\",\n",
+ " f\" Dimensions: {dims} โ {balanced_config.get('reduce', dims)}\",\n",
+ " \"\",\n",
+ " \"โก Performance Priority:\",\n",
+ " f\" Algorithm: {performance_config['algorithm']}\",\n",
+ " f\" Compression: {performance_config.get('compression', 'None')}\",\n",
+ " f\" Datatype: {performance_config['datatype']}\",\n",
+ " f\" Dimensions: {dims} โ {performance_config.get('reduce', dims)}\",\n",
+ " sep=\"\\n\"\n",
+ ")\n",
+ "\n",
+ "# Select configuration (using memory priority for maximum savings)\n",
+ "selected_config = memory_config\n",
+ "target_dims = selected_config.get('reduce', dims)\n",
+ "target_dtype = selected_config['datatype']\n",
+ "\n",
+ "print(\n",
+ " f\"\\nโ
Selected configuration: Memory Priority\",\n",
+ " f\"Expected memory reduction: ~{((dims - target_dims) / dims * 100):.1f}% from dimension reduction\",\n",
+ " f\"Additional savings from {selected_config['datatype']} compression\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 6: Create SVS-VAMANA Index\n",
+ "\n",
+ "Create the SVS-VAMANA index with the recommended compression settings."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Creating SVS-VAMANA index with compression...\n",
+ "โ
Created SVS-VAMANA index: svs_demo_index\n",
+ "Compression: LeanVec4x8\n",
+ "Datatype: float16\n",
+ "Dimensions: 1024 โ 512\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Create SVS-VAMANA schema with compression\n",
+ "svs_schema = {\n",
+ " \"index\": {\n",
+ " \"name\": \"svs_demo_index\",\n",
+ " \"prefix\": \"demo:svs:\",\n",
+ " },\n",
+ " \"fields\": [\n",
+ " {\"name\": \"movie_id\", \"type\": \"tag\"},\n",
+ " {\"name\": \"title\", \"type\": \"text\"},\n",
+ " {\"name\": \"genre\", \"type\": \"tag\"},\n",
+ " {\"name\": \"rating\", \"type\": \"numeric\"},\n",
+ " {\"name\": \"description\", \"type\": \"text\"},\n",
+ " {\n",
+ " \"name\": \"embedding\",\n",
+ " \"type\": \"vector\",\n",
+ " \"attrs\": {\n",
+ " \"dims\": target_dims, # Use reduced dimensions (512)\n",
+ " \"algorithm\": \"svs-vamana\",\n",
+ " \"datatype\": selected_config['datatype'],\n",
+ " \"distance_metric\": \"cosine\"\n",
+ " # Note: Don't include the full selected_config to avoid dims/reduce conflict\n",
+ " }\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "\n",
+ "print(\"Creating SVS-VAMANA index with compression...\")\n",
+ "svs_index = SearchIndex.from_dict(svs_schema, redis_url=REDIS_URL)\n",
+ "svs_index.create(overwrite=True)\n",
+ "print(\n",
+ " f\"โ
Created SVS-VAMANA index: {svs_index.name}\",\n",
+ " f\"Compression: {selected_config.get('compression', 'None')}\",\n",
+ " f\"Datatype: {selected_config['datatype']}\",\n",
+ " f\"Dimensions: {dims} โ {target_dims}\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 7: Migrate Data from HNSW to SVS-VAMANA\n",
+ "\n",
+ "Extract data from the HNSW index and migrate it to SVS-VAMANA with compression."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Extracting data from HNSW index...\n",
+ "Found 20 documents to migrate\n",
+ "Prepared 20 documents for migration\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Extract data from HNSW index\n",
+ "print(\"๐ Extracting data from HNSW index...\")\n",
+ "\n",
+ "client = redis.Redis.from_url(REDIS_URL)\n",
+ "keys = client.keys(\"demo:hnsw:*\")\n",
+ "print(f\"Found {len(keys)} documents to migrate\")\n",
+ "\n",
+ "# Process and transform data for SVS index\n",
+ "svs_data = []\n",
+ "\n",
+ "for key in keys:\n",
+ " doc_data = client.hgetall(key)\n",
+ " \n",
+ " if b'embedding' in doc_data:\n",
+ " # Extract original vector from HNSW index\n",
+ " original_vector = np.array(buffer_to_array(doc_data[b'embedding'], dtype='float32'))\n",
+ " \n",
+ " # Apply dimensionality reduction if needed (LeanVec)\n",
+ " if target_dims < dims:\n",
+ " vector = original_vector[:target_dims]\n",
+ " else:\n",
+ " vector = original_vector\n",
+ " \n",
+ " # Convert to target datatype\n",
+ " if target_dtype == 'float16':\n",
+ " vector = vector.astype(np.float16)\n",
+ " \n",
+ " svs_data.append({\n",
+ " \"movie_id\": doc_data[b'movie_id'].decode(),\n",
+ " \"title\": doc_data[b'title'].decode(),\n",
+ " \"genre\": doc_data[b'genre'].decode(),\n",
+ " \"rating\": int(doc_data[b'rating'].decode()),\n",
+ " \"description\": doc_data[b'description'].decode(),\n",
+ " \"embedding\": array_to_buffer(vector, dtype=target_dtype)\n",
+ " })\n",
+ "\n",
+ "print(f\"Prepared {len(svs_data)} documents for migration\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ฅ Loading data into SVS-VAMANA index...\n",
+ " Migrated 20/20 documents\n",
+ "โณ Waiting for SVS-VAMANA indexing to complete...\n",
+ "\n",
+ "โ
Migration complete! SVS index has 20 documents\n",
+ "Index size: 1.017791748046875 MB\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Load data into SVS index\n",
+ "print(\"๐ฅ Loading data into SVS-VAMANA index...\")\n",
+ "batch_size = 100 # Define batch size for migration\n",
+ "\n",
+ "if len(svs_data) > 0:\n",
+ " for i in range(0, len(svs_data), batch_size):\n",
+ " batch = svs_data[i:i+batch_size]\n",
+ " svs_index.load(batch)\n",
+ " print(f\" Migrated {min(i+batch_size, len(svs_data))}/{len(svs_data)} documents\")\n",
+ "\n",
+ " # Wait for indexing to complete\n",
+ " print(\"โณ Waiting for SVS-VAMANA indexing to complete...\")\n",
+ " time.sleep(5)\n",
+ "\n",
+ " svs_info = svs_index.info()\n",
+ " print(\n",
+ " f\"\\nโ
Migration complete! SVS index has {svs_info['num_docs']} documents\",\n",
+ " f\"Index size: {svs_info.get('vector_index_sz_mb', 'N/A')} MB\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ "else:\n",
+ " print(\"โ ๏ธ No data to migrate. Make sure the HNSW index was populated first.\")\n",
+ " print(\" Run the previous cells to load data into the HNSW index.\")\n",
+ " svs_info = svs_index.info()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 8: Compare Memory Usage\n",
+ "\n",
+ "Analyze the memory savings achieved through the HNSW to SVS-VAMANA migration."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Memory Usage Comparison\n",
+ "========================================\n",
+ "Original HNSW index: 4.23 MB\n",
+ "SVS-VAMANA index: 1.02 MB\n",
+ "\n",
+ "๐ฐ Memory savings: 75.9%\n",
+ "Absolute reduction: 3.21 MB\n",
+ "\n",
+ "๐ต Cost Impact Analysis:\n",
+ "Monthly cost reduction: $0.23\n",
+ "Annual cost reduction: $2.71\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Helper function to extract memory info\n",
+ "def get_memory_mb(index_info):\n",
+ " \"\"\"Extract memory usage in MB from index info\"\"\"\n",
+ " memory = index_info.get('vector_index_sz_mb', 0)\n",
+ " if isinstance(memory, str):\n",
+ " try:\n",
+ " return float(memory)\n",
+ " except ValueError:\n",
+ " return 0.0\n",
+ " return float(memory)\n",
+ "\n",
+ "# Get memory usage\n",
+ "hnsw_memory = get_memory_mb(hnsw_info)\n",
+ "svs_memory = get_memory_mb(svs_info)\n",
+ "\n",
+ "print(\n",
+ " \"๐ Memory Usage Comparison\",\n",
+ " \"=\" * 40,\n",
+ " f\"Original HNSW index: {hnsw_memory:.2f} MB\",\n",
+ " f\"SVS-VAMANA index: {svs_memory:.2f} MB\",\n",
+ " \"\",\n",
+ " sep=\"\\n\"\n",
+ ")\n",
+ "\n",
+ "if hnsw_memory > 0:\n",
+ " if svs_memory > 0:\n",
+ " savings = ((hnsw_memory - svs_memory) / hnsw_memory) * 100\n",
+ " print(\n",
+ " f\"๐ฐ Memory savings: {savings:.1f}%\",\n",
+ " f\"Absolute reduction: {hnsw_memory - svs_memory:.2f} MB\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ " else:\n",
+ " print(\"โณ SVS index still indexing - memory comparison pending\")\n",
+ " \n",
+ " # Cost analysis\n",
+ " print(\"\\n๐ต Cost Impact Analysis:\")\n",
+ " cost_per_gb_hour = 0.10 # Example cloud pricing\n",
+ " hours_per_month = 24 * 30\n",
+ " \n",
+ " hnsw_monthly_cost = (hnsw_memory / 1024) * cost_per_gb_hour * hours_per_month\n",
+ " if svs_memory > 0:\n",
+ " svs_monthly_cost = (svs_memory / 1024) * cost_per_gb_hour * hours_per_month\n",
+ " monthly_savings = hnsw_monthly_cost - svs_monthly_cost\n",
+ " print(\n",
+ " f\"Monthly cost reduction: ${monthly_savings:.2f}\",\n",
+ " f\"Annual cost reduction: ${monthly_savings * 12:.2f}\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ " else:\n",
+ " print(\n",
+ " f\"Current monthly cost: ${hnsw_monthly_cost:.2f}\",\n",
+ " \"Projected savings: Available after indexing completes\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ "else:\n",
+ " print(\"โ ๏ธ Memory information not available\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 9: Validate Search Quality\n",
+ "\n",
+ "Compare search quality between HNSW and SVS-VAMANA to ensure the migration maintains acceptable recall."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Generating test queries for quality validation...\n",
+ "Generated 10 test queries\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Generate test queries\n",
+ "print(\"๐ Generating test queries for quality validation...\")\n",
+ "\n",
+ "np.random.seed(123) # For reproducible test queries\n",
+ "num_test_queries = 10\n",
+ "test_queries = []\n",
+ "\n",
+ "for i in range(num_test_queries):\n",
+ " # Create test query vectors\n",
+ " query_vec = np.random.random(dims).astype(np.float32)\n",
+ " query_vec = query_vec / np.linalg.norm(query_vec) # Normalize\n",
+ " test_queries.append(query_vec)\n",
+ "\n",
+ "print(f\"Generated {len(test_queries)} test queries\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Testing HNSW search quality...\n",
+ "HNSW search completed in 0.007 seconds\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Test HNSW search quality\n",
+ "print(\"๐ Testing HNSW search quality...\")\n",
+ "\n",
+ "hnsw_results = []\n",
+ "hnsw_start = time.time()\n",
+ "\n",
+ "for query_vec in test_queries:\n",
+ " query = VectorQuery(\n",
+ " vector=query_vec,\n",
+ " vector_field_name=\"embedding\",\n",
+ " return_fields=[\"movie_id\", \"title\", \"genre\"],\n",
+ " dtype=\"float32\",\n",
+ " num_results=10\n",
+ " )\n",
+ " results = hnsw_index.query(query)\n",
+ " hnsw_results.append([doc[\"movie_id\"] for doc in results])\n",
+ "\n",
+ "hnsw_time = time.time() - hnsw_start\n",
+ "print(f\"HNSW search completed in {hnsw_time:.3f} seconds\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Testing SVS-VAMANA search quality...\n",
+ "SVS-VAMANA search completed in 0.006 seconds\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Test SVS-VAMANA search quality\n",
+ "print(\"๐ Testing SVS-VAMANA search quality...\")\n",
+ "\n",
+ "svs_results = []\n",
+ "svs_start = time.time()\n",
+ "\n",
+ "for i, query_vec in enumerate(test_queries):\n",
+ " # Adjust query vector for SVS index (handle dimensionality reduction)\n",
+ " if target_dims < dims:\n",
+ " svs_query_vec = query_vec[:target_dims]\n",
+ " else:\n",
+ " svs_query_vec = query_vec\n",
+ " \n",
+ " if target_dtype == 'float16':\n",
+ " svs_query_vec = svs_query_vec.astype(np.float16)\n",
+ " \n",
+ " query = VectorQuery(\n",
+ " vector=svs_query_vec,\n",
+ " vector_field_name=\"embedding\",\n",
+ " return_fields=[\"movie_id\", \"title\", \"genre\"],\n",
+ " dtype=target_dtype,\n",
+ " num_results=10\n",
+ " )\n",
+ " results = svs_index.query(query)\n",
+ " svs_results.append([doc[\"movie_id\"] for doc in results])\n",
+ "\n",
+ "svs_time = time.time() - svs_start\n",
+ "print(f\"SVS-VAMANA search completed in {svs_time:.3f} seconds\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Search Quality Comparison\n",
+ "========================================\n",
+ "Recall@5: 1.000 (100.0%)\n",
+ "Recall@10: 0.990 (99.0%)\n",
+ "\n",
+ "โฑ๏ธ Performance Comparison:\n",
+ "HNSW query time: 0.007s (0.7ms per query)\n",
+ "SVS-VAMANA query time: 0.006s (0.6ms per query)\n",
+ "Speed difference: +10.4%\n",
+ "\n",
+ "๐ฏ Quality Assessment: ๐ข Excellent - Minimal quality loss\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Calculate recall and performance metrics\n",
+ "def calculate_recall(reference_results, test_results, k=10):\n",
+ " \"\"\"Calculate recall@k between two result sets\"\"\"\n",
+ " if not reference_results or not test_results:\n",
+ " return 0.0\n",
+ " \n",
+ " total_recall = 0.0\n",
+ " for ref, test in zip(reference_results, test_results):\n",
+ " ref_set = set(ref[:k])\n",
+ " test_set = set(test[:k])\n",
+ " if len(ref_set) > 0:\n",
+ " recall = len(ref_set.intersection(test_set)) / len(ref_set)\n",
+ " total_recall += recall\n",
+ " \n",
+ " return total_recall / len(reference_results)\n",
+ "\n",
+ "# Calculate metrics\n",
+ "recall_at_5 = calculate_recall(hnsw_results, svs_results, k=5)\n",
+ "recall_at_10 = calculate_recall(hnsw_results, svs_results, k=10)\n",
+ "\n",
+ "print(\n",
+ " \"๐ Search Quality Comparison\",\n",
+ " \"=\" * 40,\n",
+ " f\"Recall@5: {recall_at_5:.3f} ({recall_at_5*100:.1f}%)\",\n",
+ " f\"Recall@10: {recall_at_10:.3f} ({recall_at_10*100:.1f}%)\",\n",
+ " \"\",\n",
+ " \"โฑ๏ธ Performance Comparison:\",\n",
+ " f\"HNSW query time: {hnsw_time:.3f}s ({hnsw_time/num_test_queries*1000:.1f}ms per query)\",\n",
+ " f\"SVS-VAMANA query time: {svs_time:.3f}s ({svs_time/num_test_queries*1000:.1f}ms per query)\",\n",
+ " f\"Speed difference: {((hnsw_time - svs_time) / hnsw_time * 100):+.1f}%\",\n",
+ " sep=\"\\n\"\n",
+ ")\n",
+ "\n",
+ "# Quality assessment\n",
+ "if recall_at_10 >= 0.95:\n",
+ " quality_assessment = \"๐ข Excellent - Minimal quality loss\"\n",
+ "elif recall_at_10 >= 0.90:\n",
+ " quality_assessment = \"๐ก Good - Acceptable quality for most applications\"\n",
+ "elif recall_at_10 >= 0.80:\n",
+ " quality_assessment = \"๐ Fair - Consider if quality requirements are flexible\"\n",
+ "else:\n",
+ " quality_assessment = \"๐ด Poor - Migration not recommended\"\n",
+ "\n",
+ "print(f\"\\n๐ฏ Quality Assessment: {quality_assessment}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 10: Migration Decision Framework\n",
+ "\n",
+ "Based on the analysis, determine if migration is recommended."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ค Migration Decision Analysis\n",
+ "========================================\n",
+ "\n",
+ "๐ Criteria Evaluation:\n",
+ "Memory savings: 75.9% โ
(threshold: 20%)\n",
+ "Search quality: 0.990 โ
(threshold: 0.85)\n",
+ "\n",
+ "๐ฏ Migration Recommendation: ๐ข RECOMMENDED\n",
+ "๐ญ Reasoning: Migration provides significant memory savings while maintaining good search quality.\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Migration decision logic\n",
+ "memory_savings_threshold = 20 # Minimum 20% memory savings\n",
+ "recall_threshold = 0.85 # Minimum 85% recall@10\n",
+ "\n",
+ "memory_savings_pct = ((hnsw_memory - svs_memory) / hnsw_memory * 100) if hnsw_memory > 0 and svs_memory > 0 else 0\n",
+ "meets_memory_threshold = memory_savings_pct >= memory_savings_threshold\n",
+ "meets_quality_threshold = recall_at_10 >= recall_threshold\n",
+ "\n",
+ "print(\n",
+ " \"๐ค Migration Decision Analysis\",\n",
+ " \"=\" * 40,\n",
+ " \"\",\n",
+ " \"๐ Criteria Evaluation:\",\n",
+ " f\"Memory savings: {memory_savings_pct:.1f}% {'โ
' if meets_memory_threshold else 'โ'} (threshold: {memory_savings_threshold}%)\",\n",
+ " f\"Search quality: {recall_at_10:.3f} {'โ
' if meets_quality_threshold else 'โ'} (threshold: {recall_threshold})\",\n",
+ " \"\",\n",
+ " sep=\"\\n\"\n",
+ ")\n",
+ "\n",
+ "if meets_memory_threshold and meets_quality_threshold:\n",
+ " recommendation = \"๐ข RECOMMENDED\"\n",
+ " reasoning = \"Migration provides significant memory savings while maintaining good search quality.\"\n",
+ "elif meets_memory_threshold and not meets_quality_threshold:\n",
+ " recommendation = \"๐ก CONDITIONAL\"\n",
+ " reasoning = \"Good memory savings but reduced search quality. Consider if your application can tolerate lower recall.\"\n",
+ "elif not meets_memory_threshold and meets_quality_threshold:\n",
+ " recommendation = \"๐ LIMITED BENEFIT\"\n",
+ " reasoning = \"Search quality is maintained but memory savings are minimal. Migration may not be worth the effort.\"\n",
+ "else:\n",
+ " recommendation = \"๐ด NOT RECOMMENDED\"\n",
+ " reasoning = \"Insufficient memory savings and/or poor search quality. Consider alternative optimization strategies.\"\n",
+ "\n",
+ "print(\n",
+ " f\"๐ฏ Migration Recommendation: {recommendation}\",\n",
+ " f\"๐ญ Reasoning: {reasoning}\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 11: Production Migration Checklist\n",
+ "\n",
+ "If migration is recommended, follow this checklist for production deployment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ HNSW to SVS-VAMANA Migration Checklist\n",
+ "==================================================\n",
+ "\n",
+ "PRE-MIGRATION:\n",
+ "โก Backup existing HNSW index data\n",
+ "โก Test migration on staging environment\n",
+ "โก Validate search quality with real queries\n",
+ "โก Measure baseline HNSW performance metrics\n",
+ "โก Plan rollback strategy\n",
+ "โก Document current HNSW parameters (M, EF_construction, EF_runtime)\n",
+ "\n",
+ "MIGRATION:\n",
+ "โก Create SVS-VAMANA index with tested configuration\n",
+ "โก Migrate data in batches during low-traffic periods\n",
+ "โก Monitor memory usage and indexing progress\n",
+ "โก Validate data integrity after migration\n",
+ "โก Test search functionality thoroughly\n",
+ "โก Compare recall metrics with baseline\n",
+ "\n",
+ "POST-MIGRATION:\n",
+ "โก Monitor search performance and quality\n",
+ "โก Track memory usage and cost savings\n",
+ "โก Update application configuration\n",
+ "โก Document new SVS-VAMANA settings\n",
+ "โก Clean up old HNSW index after validation period\n",
+ "โก Update monitoring and alerting thresholds\n",
+ "\n",
+ "๐ก HNSW-SPECIFIC TIPS:\n",
+ "โข HNSW indices are more complex to rebuild than FLAT\n",
+ "โข Consider the impact on applications using EF_runtime tuning\n",
+ "โข SVS-VAMANA may have different optimal query parameters\n",
+ "โข Test with your specific HNSW configuration (M, EF values)\n",
+ "โข Monitor for 48-72 hours before removing HNSW index\n",
+ "โข Keep compression settings documented for future reference\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\n",
+ " \"๐ HNSW to SVS-VAMANA Migration Checklist\",\n",
+ " \"=\" * 50,\n",
+ " \"\\nPRE-MIGRATION:\",\n",
+ " \"โก Backup existing HNSW index data\",\n",
+ " \"โก Test migration on staging environment\",\n",
+ " \"โก Validate search quality with real queries\",\n",
+ " \"โก Measure baseline HNSW performance metrics\",\n",
+ " \"โก Plan rollback strategy\",\n",
+ " \"โก Document current HNSW parameters (M, EF_construction, EF_runtime)\",\n",
+ " \"\\nMIGRATION:\",\n",
+ " \"โก Create SVS-VAMANA index with tested configuration\",\n",
+ " \"โก Migrate data in batches during low-traffic periods\",\n",
+ " \"โก Monitor memory usage and indexing progress\",\n",
+ " \"โก Validate data integrity after migration\",\n",
+ " \"โก Test search functionality thoroughly\",\n",
+ " \"โก Compare recall metrics with baseline\",\n",
+ " \"\\nPOST-MIGRATION:\",\n",
+ " \"โก Monitor search performance and quality\",\n",
+ " \"โก Track memory usage and cost savings\",\n",
+ " \"โก Update application configuration\",\n",
+ " \"โก Document new SVS-VAMANA settings\",\n",
+ " \"โก Clean up old HNSW index after validation period\",\n",
+ " \"โก Update monitoring and alerting thresholds\",\n",
+ " \"\\n๐ก HNSW-SPECIFIC TIPS:\",\n",
+ " \"โข HNSW indices are more complex to rebuild than FLAT\",\n",
+ " \"โข Consider the impact on applications using EF_runtime tuning\",\n",
+ " \"โข SVS-VAMANA may have different optimal query parameters\",\n",
+ " \"โข Test with your specific HNSW configuration (M, EF values)\",\n",
+ " \"โข Monitor for 48-72 hours before removing HNSW index\",\n",
+ " \"โข Keep compression settings documented for future reference\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 12: Cleanup\n",
+ "\n",
+ "Clean up the demonstration indices."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐งน Cleaning up demonstration indices...\n",
+ "โ
Deleted HNSW demonstration index\n",
+ "โ
Deleted SVS-VAMANA demonstration index\n",
+ "\n",
+ "๐ HNSW to SVS-VAMANA migration demonstration complete!\n",
+ "\n",
+ "Next steps:\n",
+ "1. Apply learnings to your production HNSW indices\n",
+ "2. Test with your actual query patterns and data\n",
+ "3. Monitor performance in your environment\n",
+ "4. Consider gradual rollout strategy\n",
+ "5. Evaluate impact on applications using HNSW-specific features\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"๐งน Cleaning up demonstration indices...\")\n",
+ "\n",
+ "# Clean up HNSW index\n",
+ "try:\n",
+ " hnsw_index.delete(drop=True)\n",
+ " print(\"โ
Deleted HNSW demonstration index\")\n",
+ "except Exception as e:\n",
+ " print(f\"โ ๏ธ Failed to delete HNSW index: {e}\")\n",
+ "\n",
+ "# Clean up SVS index\n",
+ "try:\n",
+ " svs_index.delete(drop=True)\n",
+ " print(\"โ
Deleted SVS-VAMANA demonstration index\")\n",
+ "except Exception as e:\n",
+ " print(f\"โ ๏ธ Failed to delete SVS index: {e}\")\n",
+ "\n",
+ "print(\n",
+ " \"\\n๐ HNSW to SVS-VAMANA migration demonstration complete!\",\n",
+ " \"\\nNext steps:\",\n",
+ " \"1. Apply learnings to your production HNSW indices\",\n",
+ " \"2. Test with your actual query patterns and data\",\n",
+ " \"3. Monitor performance in your environment\",\n",
+ " \"4. Consider gradual rollout strategy\",\n",
+ " \"5. Evaluate impact on applications using HNSW-specific features\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.12.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/python-recipes/vector-search/07_flat_to_svs_vamana_migration.ipynb b/python-recipes/vector-search/07_flat_to_svs_vamana_migration.ipynb
new file mode 100644
index 00000000..e52879c9
--- /dev/null
+++ b/python-recipes/vector-search/07_flat_to_svs_vamana_migration.ipynb
@@ -0,0 +1,1192 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "# Migrating from FLAT to SVS-VAMANA\n",
+ "\n",
+ "## Let's Begin!\n",
+ "
\n",
+ "\n",
+ "This notebook demonstrates how to migrate existing FLAT vector indices to SVS-VAMANA for improved memory efficiency and cost savings.\n",
+ "\n",
+ "## What You'll Learn\n",
+ "\n",
+ "- How to assess your current FLAT index for migration\n",
+ "- Step-by-step migration from FLAT to SVS-VAMANA\n",
+ "- Memory usage comparison and cost analysis\n",
+ "- Search quality validation\n",
+ "- Performance benchmarking\n",
+ "- Migration decision framework\n",
+ "\n",
+ "## Prerequisites\n",
+ "\n",
+ "- Redis Stack 8.2.0+ with RediSearch 2.8.10+\n",
+ "- Existing vector index with substantial data (1000+ documents recommended)\n",
+ "- Vector embeddings (768 dimensions using sentence-transformers/all-mpnet-base-v2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## ๐ฆ Installation & Setup\n",
+ "\n",
+ "This notebook requires **sentence-transformers** for generating embeddings and **Redis Stack** running in Docker.\n",
+ "\n",
+ "**Requirements:**\n",
+ "- Redis Stack 8.2.0+ with RediSearch 2.8.10+\n",
+ "- sentence-transformers (for generating embeddings)\n",
+ "- numpy (for vector operations)\n",
+ "- redisvl (should be available in your environment)\n",
+ "\n",
+ "**๐ณ Docker Setup (Required):**\n",
+ "\n",
+ "Before running this notebook, make sure Redis Stack is running in Docker:\n",
+ "\n",
+ "```bash\n",
+ "# Start Redis Stack with Docker\n",
+ "docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest\n",
+ "```\n",
+ "\n",
+ "Or if you prefer using docker-compose, create a `docker-compose.yml` file:\n",
+ "\n",
+ "```yaml\n",
+ "version: '3.8'\n",
+ "services:\n",
+ " redis:\n",
+ " image: redis/redis-stack:latest\n",
+ " ports:\n",
+ " - \"6379:6379\"\n",
+ " - \"8001:8001\"\n",
+ "```\n",
+ "\n",
+ "Then run: `docker-compose up -d`\n",
+ "\n",
+ "**๐ Python Dependencies Installation:**\n",
+ "\n",
+ "Install the required Python packages:\n",
+ "\n",
+ "```bash\n",
+ "# Install core dependencies\n",
+ "pip install redisvl numpy sentence-transformers\n",
+ "\n",
+ "# Or install with specific versions for compatibility\n",
+ "pip install redisvl>=0.2.0 numpy>=1.21.0 sentence-transformers>=2.2.0\n",
+ "```\n",
+ "\n",
+ "**For Google Colab users, run this cell:**\n",
+ "\n",
+ "```python\n",
+ "!pip install redisvl sentence-transformers numpy\n",
+ "```\n",
+ "\n",
+ "**For Conda users:**\n",
+ "\n",
+ "```bash\n",
+ "conda install numpy\n",
+ "pip install redisvl sentence-transformers\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 35,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Setup redis-vl environment\n",
+ "import os\n",
+ "import sys\n",
+ "import subprocess\n",
+ "# Required imports from redis-vl\n",
+ "import numpy as np\n",
+ "import time\n",
+ "from redisvl.index import SearchIndex\n",
+ "from redisvl.query import VectorQuery\n",
+ "from redisvl.redis.utils import array_to_buffer, buffer_to_array\n",
+ "from redisvl.utils import CompressionAdvisor\n",
+ "from redisvl.redis.connection import supports_svs\n",
+ "import redis\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 1: Verify SVS-VAMANA Support\n",
+ "\n",
+ "First, let's ensure your Redis environment supports SVS-VAMANA."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 36,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "โ
Redis connection successful\n",
+ "โ
SVS-VAMANA supported\n",
+ " Ready for migration!\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Check Redis connection and SVS support\n",
+ "REDIS_URL = \"redis://localhost:6379\"\n",
+ "\n",
+ "try:\n",
+ " client = redis.Redis.from_url(REDIS_URL)\n",
+ " client.ping()\n",
+ " print(\"โ
Redis connection successful\")\n",
+ " \n",
+ " if supports_svs(client):\n",
+ " print(\"โ
SVS-VAMANA supported\")\n",
+ " print(\" Ready for migration!\")\n",
+ " else:\n",
+ " print(\"โ SVS-VAMANA not supported\")\n",
+ " print(\" Requires Redis >= 8.2.0 with RediSearch >= 2.8.10\")\n",
+ " print(\" Please upgrade Redis Stack before proceeding\")\n",
+ " \n",
+ "except Exception as e:\n",
+ " print(f\"โ Redis connection failed: {e}\")\n",
+ " print(\" Please ensure Redis is running and accessible\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 2: Assess Your Current Index\n",
+ "\n",
+ "For this demonstration, we'll create a sample FLAT index. In practice, you would analyze your existing index."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 37,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ฅ Loading sample movie data...\n",
+ "Loaded 20 movie records\n",
+ "Sample movie: Explosive Pursuit - A daring cop chases a notorious criminal across the city in a high-stakes game of cat and mouse.\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Download sample data from redis-ai-resources\n",
+ "print(\"๐ฅ Loading sample movie data...\")\n",
+ "import os\n",
+ "import json\n",
+ "\n",
+ "# Load the movies dataset\n",
+ "url = \"resources/movies.json\"\n",
+ "with open(\"resources/movies.json\", \"r\") as f:\n",
+ " movies_data = json.load(f)\n",
+ "\n",
+ "print(f\"Loaded {len(movies_data)} movie records\")\n",
+ "print(f\"Sample movie: {movies_data[0]['title']} - {movies_data[0]['description']}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 38,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Migration Assessment\n",
+ "Vector dimensions: 768 (sentence-transformers/all-mpnet-base-v2)\n",
+ "Dataset size: 20 movie documents\n",
+ "Data includes: title, genre, rating, description\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Configuration for demonstration \n",
+ "dims = 768 # sentence-transformers/all-mpnet-base-v2 - 768 dims\n",
+ "\n",
+ "num_docs = len(movies_data) # Use actual dataset size\n",
+ "\n",
+ "print(\n",
+ " \"๐ Migration Assessment\",\n",
+ " f\"Vector dimensions: {dims} (sentence-transformers/all-mpnet-base-v2)\",\n",
+ " f\"Dataset size: {num_docs} movie documents\",\n",
+ " \"Data includes: title, genre, rating, description\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "Next, let's configure a smaple FLAT index. Notice the algorithm value, dims value, and datatype value under fields."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 39,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Creating sample FLAT index...\n",
+ "โ
Created FLAT index: migration_demo_flat\n"
+ ]
+ }
+ ],
+ "source": [
+ "flat_schema = {\n",
+ " \"index\": {\n",
+ " \"name\": \"migration_demo_flat\",\n",
+ " \"prefix\": \"demo:flat:\",\n",
+ " },\n",
+ " \"fields\": [\n",
+ " {\"name\": \"movie_id\", \"type\": \"tag\"},\n",
+ " {\"name\": \"title\", \"type\": \"text\"},\n",
+ " {\"name\": \"genre\", \"type\": \"tag\"},\n",
+ " {\"name\": \"rating\", \"type\": \"numeric\"},\n",
+ " {\"name\": \"description\", \"type\": \"text\"},\n",
+ " {\n",
+ " \"name\": \"embedding\",\n",
+ " \"type\": \"vector\",\n",
+ " \"attrs\": {\n",
+ " \"dims\": dims,\n",
+ " \"algorithm\": \"flat\",\n",
+ " \"datatype\": \"float32\",\n",
+ " \"distance_metric\": \"cosine\"\n",
+ " }\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "\n",
+ "# Create and populate FLAT index\n",
+ "print(\"Creating sample FLAT index...\")\n",
+ "flat_index = SearchIndex.from_dict(flat_schema, redis_url=REDIS_URL)\n",
+ "flat_index.create(overwrite=True)\n",
+ "print(f\"โ
Created FLAT index: {flat_index.name}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "Generate embeddings for movie descriptions\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 40,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Generating embeddings for movie descriptions...\n",
+ "๐ฆ Loading sentence transformer model...\n",
+ "14:45:27 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps\n",
+ "14:45:27 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2\n",
+ "โ
Loaded embedding model with 768 dimensions\n"
+ ]
+ },
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "0e06f2f860ec443e802a3fbf3961487c",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ "Batches: 0%| | 0/1 [00:00, ?it/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "โ
Generated 20 real embeddings using sentence-transformers\n"
+ ]
+ }
+ ],
+ "source": [
+ "from sentence_transformers import SentenceTransformer\n",
+ "\n",
+ "print(\"๐ Generating embeddings for movie descriptions...\")\n",
+ "embedding_model=\"sentence-transformers/all-mpnet-base-v2\"\n",
+ "\n",
+ "try:\n",
+ " # Try to use sentence-transformers for real embeddings\n",
+ " print(\"๐ฆ Loading sentence transformer model...\")\n",
+ " model = SentenceTransformer(embedding_model)\n",
+ " print(f\"โ
Loaded embedding model with {dims} dimensions\")\n",
+ " \n",
+ " # Generate real embeddings\n",
+ " descriptions = [movie['description'] for movie in movies_data]\n",
+ " embeddings = model.encode(descriptions, convert_to_numpy=True, normalize_embeddings=True)\n",
+ " print(f\"โ
Generated {len(embeddings)} real embeddings using sentence-transformers\")\n",
+ " \n",
+ "except ImportError:\n",
+ " # Fallback to synthetic embeddings\n",
+ " print(\"โ ๏ธ sentence-transformers not available, using synthetic embeddings\")\n",
+ " print(f\"๐ฆ Using {dims} dimensions for synthetic embeddings\")\n",
+ " \n",
+ " # Generate synthetic embeddings (normalized random vectors for demo)\n",
+ " np.random.seed(42) # For reproducible results\n",
+ " embeddings = []\n",
+ " for i, movie in enumerate(movies_data):\n",
+ " # Create a pseudo-semantic embedding based on movie content\n",
+ " vector = np.random.random(dims).astype(np.float32)\n",
+ " # Add some structure based on genre\n",
+ " if movie['genre'] == 'action':\n",
+ " vector[:50] += 0.3 # Action movies cluster\n",
+ " else: # comedy\n",
+ " vector[50:100] += 0.3 # Comedy movies cluster\n",
+ " \n",
+ " # Normalize\n",
+ " vector = vector / np.linalg.norm(vector)\n",
+ " embeddings.append(vector)\n",
+ " \n",
+ " embeddings = np.array(embeddings)\n",
+ " print(f\"โ
Generated {len(embeddings)} synthetic embeddings\")\n",
+ "\n",
+ "# Prepare data for loading\n",
+ "sample_data = []\n",
+ "for i, movie in enumerate(movies_data):\n",
+ " sample_data.append({\n",
+ " 'movie_id': str(movie['id']),\n",
+ " 'title': movie['title'],\n",
+ " 'genre': movie['genre'],\n",
+ " 'rating': movie['rating'],\n",
+ " 'description': movie['description'],\n",
+ " 'embedding': array_to_buffer(embeddings[i].astype(np.float32), dtype='float32')\n",
+ " })"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 41,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ฅ Loading data into FLAT index...\n",
+ " Loaded 20/20 documents\n",
+ "Waiting for indexing to complete...\n",
+ "\n",
+ "โ
FLAT index loaded with 20 documents\n",
+ "Index size: 3.0168838500976563 MB\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Load data into FLAT index\n",
+ "print(\"๐ฅ Loading data into FLAT index...\")\n",
+ "batch_size = 100 # Process in batches\n",
+ "\n",
+ "for i in range(0, len(sample_data), batch_size):\n",
+ " batch = sample_data[i:i+batch_size]\n",
+ " flat_index.load(batch)\n",
+ " print(f\" Loaded {min(i+batch_size, len(sample_data))}/{len(sample_data)} documents\")\n",
+ "\n",
+ "# Wait for indexing to complete\n",
+ "print(\"Waiting for indexing to complete...\")\n",
+ "time.sleep(3)\n",
+ "\n",
+ "flat_info = flat_index.info()\n",
+ "print(f\"\\nโ
FLAT index loaded with {flat_info['num_docs']} documents\")\n",
+ "print(f\"Index size: {flat_info.get('vector_index_sz_mb', 'N/A')} MB\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 3: Get Compression Recommendation\n",
+ "\n",
+ "The CompressionAdvisor analyzes your vector dimensions and provides optimal compression settings for SVS-VAMANA vector indices. It eliminates the guesswork from parameter tuning by providing intelligent recommendations based on your vector characteristics and performance priorities.\n",
+ "\n",
+ "## Configuration Strategy\n",
+ "**High-Dimensional Vectors (โฅ1024 dims)**: Uses **LeanVec4x8** compression with dimensionality reduction. Memory priority reduces dimensions by 50%, speed priority by\n",
+ "25%, balanced by 50%. Achieves 60-80% memory savings.\n",
+ "\n",
+ "**Lower-Dimensional Vectors (<1024 dims)**: Uses **LVQ compression** without dimensionality reduction. Memory priority uses LVQ4 (4 bits), speed uses LVQ4x8 (12 bits),\n",
+ "balanced uses LVQ4x4 (8 bits). Achieves 60-87% memory savings.\n",
+ "\n",
+ "**Our Configuration (768 dims)**: Will use **LVQ compression** as we're below the 1024 dimension threshold. This provides excellent compression without dimensionality reduction.\n",
+ "\n",
+ "## Available Compression Types\n",
+ "- **LVQ4/LVQ4x4/LVQ4x8**: 4/8/12 bits per dimension\n",
+ "- **LeanVec4x8/LeanVec8x8**: 12/16 bits + dimensionality reduction for high-dim vectors\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 42,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Analyzing compression options...\n",
+ "\n",
+ "MEMORY priority:\n",
+ " Algorithm: svs-vamana\n",
+ " Compression: LVQ4\n",
+ " Datatype: float32\n",
+ "\n",
+ "BALANCED priority:\n",
+ " Algorithm: svs-vamana\n",
+ " Compression: LVQ4x4\n",
+ " Datatype: float32\n",
+ "\n",
+ "PERFORMANCE priority:\n",
+ " Algorithm: svs-vamana\n",
+ " Compression: LVQ4x4\n",
+ " Datatype: float32\n",
+ "\n",
+ "๐ Selected configuration: LVQ4 with float32\n",
+ "Expected memory savings: Significant for 768-dimensional vectors\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Get compression recommendation\n",
+ "print(\"๐ Analyzing compression options...\")\n",
+ "print()\n",
+ "\n",
+ "# Try different priorities to show options\n",
+ "priorities = [\"memory\", \"balanced\", \"performance\"]\n",
+ "configs = {}\n",
+ "\n",
+ "for priority in priorities:\n",
+ " config = CompressionAdvisor.recommend(dims=dims, priority=priority)\n",
+ " configs[priority] = config\n",
+ " print(f\"{priority.upper()} priority:\")\n",
+ " print(f\" Algorithm: {config['algorithm']}\")\n",
+ " print(f\" Compression: {config.get('compression', 'None')}\")\n",
+ " print(f\" Datatype: {config['datatype']}\")\n",
+ " if 'reduce' in config:\n",
+ " reduction = ((dims - config['reduce']) / dims) * 100\n",
+ " print(f\" Dimensionality: {dims} โ {config['reduce']} ({reduction:.1f}% reduction)\")\n",
+ " print()\n",
+ "\n",
+ "# Select memory-optimized configuration for migration\n",
+ "selected_config = configs[\"memory\"]\n",
+ "print(f\"๐ Selected configuration: {selected_config['compression']} with {selected_config['datatype']}\")\n",
+ "print(f\"Expected memory savings: Significant for {dims}-dimensional vectors\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 4: Create SVS-VAMANA Index\n",
+ "\n",
+ "Now we'll create the new SVS-VAMANA index with the recommended compression settings."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 43,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Creating SVS-VAMANA index with compression...\n",
+ "โ
Created SVS-VAMANA index: migration_demo_svs\n",
+ "Compression: LVQ4\n",
+ "Datatype: float32\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Fallback configuration if not defined (for CI/CD compatibility)\n",
+ "if 'selected_config' not in locals():\n",
+ " from redisvl.utils import CompressionAdvisor\n",
+ " selected_config = CompressionAdvisor.recommend(dims=dims, priority=\"memory\")\n",
+ "\n",
+ "# Create SVS-VAMANA schema with compression\n",
+ "svs_schema = {\n",
+ " \"index\": {\n",
+ " \"name\": \"migration_demo_svs\",\n",
+ " \"prefix\": \"demo:svs:\",\n",
+ " },\n",
+ " \"fields\": [\n",
+ " {\"name\": \"movie_id\", \"type\": \"tag\"},\n",
+ " {\"name\": \"title\", \"type\": \"text\"},\n",
+ " {\"name\": \"genre\", \"type\": \"tag\"},\n",
+ " {\"name\": \"rating\", \"type\": \"numeric\"},\n",
+ " {\"name\": \"description\", \"type\": \"text\"},\n",
+ " {\n",
+ " \"name\": \"embedding\",\n",
+ " \"type\": \"vector\",\n",
+ " \"attrs\": {\n",
+ " \"dims\": selected_config.get('reduce', dims), # Use reduced dimensions (512)\n",
+ " \"algorithm\": \"svs-vamana\",\n",
+ " \"datatype\": selected_config['datatype'],\n",
+ " \"distance_metric\": \"cosine\"\n",
+ " # Note: Don't include the full selected_config to avoid dims/reduce conflict\n",
+ " }\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "\n",
+ "print(\"Creating SVS-VAMANA index with compression...\")\n",
+ "svs_index = SearchIndex.from_dict(svs_schema, redis_url=REDIS_URL)\n",
+ "svs_index.create(overwrite=True)\n",
+ "print(f\"โ
Created SVS-VAMANA index: {svs_index.name}\")\n",
+ "print(f\"Compression: {selected_config.get('compression', 'None')}\")\n",
+ "print(f\"Datatype: {selected_config['datatype']}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 5: Migrate Data\n",
+ "\n",
+ "Extract data from the original index and load it into the SVS-VAMANA index with compression applied."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 44,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Migrating data to SVS-VAMANA...\n",
+ "Target dimensions: 768 (from 768)\n",
+ "Target datatype: float32\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"๐ Migrating data to SVS-VAMANA...\")\n",
+ "\n",
+ "# Fallback configuration if not defined (for CI/CD compatibility)\n",
+ "if 'selected_config' not in locals():\n",
+ " from redisvl.utils import CompressionAdvisor\n",
+ " selected_config = CompressionAdvisor.recommend(dims=dims, priority=\"memory\")\n",
+ "\n",
+ "# Determine target vector dimensions (may be reduced by LeanVec)\n",
+ "target_dims = selected_config.get('reduce', dims)\n",
+ "target_dtype = selected_config['datatype']\n",
+ "\n",
+ "print(f\"Target dimensions: {target_dims} (from {dims})\")\n",
+ "print(f\"Target datatype: {target_dtype}\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 45,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Extracting data from original index...\n",
+ "Found 40 documents to migrate\n",
+ "Prepared 40 documents for migration\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Extract data from FLAT index\n",
+ "print(\"Extracting data from original index...\")\n",
+ "keys = client.keys(\"demo:flat:*\")\n",
+ "print(f\"Found {len(keys)} documents to migrate\")\n",
+ "\n",
+ "# Process and transform data for SVS index\n",
+ "svs_data = []\n",
+ "for i, key in enumerate(keys):\n",
+ " doc_data = client.hgetall(key)\n",
+ " \n",
+ " if b'embedding' in doc_data:\n",
+ " # Extract original vector\n",
+ " original_vector = np.array(buffer_to_array(doc_data[b'embedding'], dtype='float32'))\n",
+ " \n",
+ " # Apply dimensionality reduction if needed (LeanVec)\n",
+ " if target_dims < dims:\n",
+ " vector = original_vector[:target_dims]\n",
+ " else:\n",
+ " vector = original_vector\n",
+ " \n",
+ " # Convert to target datatype\n",
+ " if target_dtype == 'float16':\n",
+ " vector = vector.astype(np.float16)\n",
+ " \n",
+ " svs_data.append({\n",
+ " \"movie_id\": doc_data[b'movie_id'].decode(),\n",
+ " \"title\": doc_data[b'title'].decode(),\n",
+ " \"genre\": doc_data[b'genre'].decode(),\n",
+ " \"rating\": int(doc_data[b'rating'].decode()),\n",
+ " \"description\": doc_data[b'description'].decode(),\n",
+ " \"embedding\": array_to_buffer(vector, dtype=target_dtype)\n",
+ " })\n",
+ " \n",
+ " if (i + 1) % 500 == 0:\n",
+ " print(f\" Processed {i + 1}/{len(keys)} documents\")\n",
+ "\n",
+ "print(f\"Prepared {len(svs_data)} documents for migration\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 46,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Loading data into SVS-VAMANA index...\n",
+ " Migrated 40/40 documents\n",
+ "Waiting for indexing to complete...\n",
+ "\n",
+ "โ
Migration complete! SVS index has 20 documents\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Load data into SVS index\n",
+ "print(\"Loading data into SVS-VAMANA index...\")\n",
+ "batch_size = 100 # Define batch size for migration\n",
+ "\n",
+ "if len(svs_data) > 0:\n",
+ " for i in range(0, len(svs_data), batch_size):\n",
+ " batch = svs_data[i:i+batch_size]\n",
+ " svs_index.load(batch)\n",
+ " print(f\" Migrated {min(i+batch_size, len(svs_data))}/{len(svs_data)} documents\")\n",
+ "\n",
+ " # Wait for indexing to complete\n",
+ " print(\"Waiting for indexing to complete...\")\n",
+ " time.sleep(5)\n",
+ "\n",
+ " svs_info = svs_index.info()\n",
+ " print(f\"\\nโ
Migration complete! SVS index has {svs_info['num_docs']} documents\")\n",
+ "else:\n",
+ " print(\"โ ๏ธ No data to migrate. Make sure the FLAT index was populated first.\")\n",
+ " print(\" Run the previous cells to load data into the FLAT index.\")\n",
+ " svs_info = svs_index.info()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 6: Compare Memory Usage\n",
+ "\n",
+ "Let's analyze the memory savings achieved through compression. This is just an example on the small sample data. Use a larger dataset before deciding."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 47,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Memory Usage Comparison\n",
+ "========================================\n",
+ "Original FLAT index: 3.02 MB\n",
+ "SVS-VAMANA index: 3.02 MB\n",
+ "\n",
+ "๐ฐ Memory savings: -0.0%\n",
+ "Absolute reduction: -0.00 MB\n",
+ "\n",
+ "๐ต Cost Impact Analysis:\n",
+ "Monthly cost reduction: $-0.00\n",
+ "Annual cost reduction: $-0.00\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Helper function to extract memory info\n",
+ "def get_memory_mb(index_info):\n",
+ " \"\"\"Extract memory usage in MB from index info\"\"\"\n",
+ " memory = index_info.get('vector_index_sz_mb', 0)\n",
+ " if isinstance(memory, str):\n",
+ " try:\n",
+ " return float(memory)\n",
+ " except ValueError:\n",
+ " return 0.0\n",
+ " return float(memory)\n",
+ "\n",
+ "# Get memory usage\n",
+ "flat_memory = get_memory_mb(flat_info)\n",
+ "svs_memory = get_memory_mb(svs_info)\n",
+ "\n",
+ "print(\n",
+ " \"๐ Memory Usage Comparison\",\n",
+ " \"=\" * 40,\n",
+ " f\"Original FLAT index: {flat_memory:.2f} MB\",\n",
+ " f\"SVS-VAMANA index: {svs_memory:.2f} MB\",\n",
+ " \"\",\n",
+ " sep=\"\\n\"\n",
+ ")\n",
+ "\n",
+ "if flat_memory > 0:\n",
+ " if svs_memory > 0:\n",
+ " savings = ((flat_memory - svs_memory) / flat_memory) * 100\n",
+ " print(\n",
+ " f\"๐ฐ Memory savings: {savings:.1f}%\",\n",
+ " f\"Absolute reduction: {flat_memory - svs_memory:.2f} MB\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ " else:\n",
+ " print(\"โณ SVS index still indexing - memory comparison pending\")\n",
+ " \n",
+ " # Cost analysis\n",
+ " print(\"\\n๐ต Cost Impact Analysis:\")\n",
+ " cost_per_gb_hour = 0.10 # Example cloud pricing\n",
+ " hours_per_month = 24 * 30\n",
+ " \n",
+ " flat_monthly_cost = (flat_memory / 1024) * cost_per_gb_hour * hours_per_month\n",
+ " if svs_memory > 0:\n",
+ " svs_monthly_cost = (svs_memory / 1024) * cost_per_gb_hour * hours_per_month\n",
+ " monthly_savings = flat_monthly_cost - svs_monthly_cost\n",
+ " print(\n",
+ " f\"Monthly cost reduction: ${monthly_savings:.2f}\",\n",
+ " f\"Annual cost reduction: ${monthly_savings * 12:.2f}\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ " else:\n",
+ " print(\n",
+ " f\"Current monthly cost: ${flat_monthly_cost:.2f}\",\n",
+ " \"Projected savings: Available after indexing completes\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ "else:\n",
+ " print(\"โ ๏ธ Memory information not available\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 7: Validate Search Quality\n",
+ "\n",
+ "Test that the compressed index maintains good search quality."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 48,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Validating search quality...\n",
+ "Generated 5 test queries\n",
+ "\n",
+ "Testing original FLAT index...\n",
+ "FLAT search time: 0.012s (0.002s per query)\n",
+ "\n",
+ "Testing SVS-VAMANA index...\n",
+ "SVS search time: 0.017s (0.003s per query)\n",
+ "\n",
+ "๐ Average recall@10: 1.000 (100.0%)\n",
+ "โ
Excellent search quality maintained\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"๐ Validating search quality...\")\n",
+ "\n",
+ "# Create test queries\n",
+ "num_test_queries = 5\n",
+ "test_queries = []\n",
+ "\n",
+ "for i in range(num_test_queries):\n",
+ " # Generate normalized test vector\n",
+ " query_vec = np.random.random(dims).astype(np.float32)\n",
+ " query_vec = query_vec / np.linalg.norm(query_vec)\n",
+ " test_queries.append(query_vec)\n",
+ "\n",
+ "print(f\"Generated {num_test_queries} test queries\")\n",
+ "\n",
+ "# Test FLAT index (ground truth)\n",
+ "print(\"\\nTesting original FLAT index...\")\n",
+ "flat_results = []\n",
+ "flat_start = time.time()\n",
+ "\n",
+ "for query_vec in test_queries:\n",
+ " query = VectorQuery(\n",
+ " vector=query_vec,\n",
+ " vector_field_name=\"embedding\",\n",
+ " return_fields=[\"movie_id\", \"title\", \"genre\"],\n",
+ " dtype=\"float32\",\n",
+ " num_results=10\n",
+ " )\n",
+ " results = flat_index.query(query)\n",
+ " flat_results.append([doc[\"movie_id\"] for doc in results])\n",
+ "\n",
+ "flat_time = time.time() - flat_start\n",
+ "print(f\"FLAT search time: {flat_time:.3f}s ({flat_time/num_test_queries:.3f}s per query)\")\n",
+ "\n",
+ "# Test SVS-VAMANA index\n",
+ "print(\"\\nTesting SVS-VAMANA index...\")\n",
+ "svs_results = []\n",
+ "svs_start = time.time()\n",
+ "\n",
+ "for i, query_vec in enumerate(test_queries):\n",
+ " # Adjust query vector for SVS index (handle dimensionality reduction)\n",
+ " if target_dims < dims:\n",
+ " svs_query_vec = query_vec[:target_dims]\n",
+ " else:\n",
+ " svs_query_vec = query_vec\n",
+ " \n",
+ " if target_dtype == 'float16':\n",
+ " svs_query_vec = svs_query_vec.astype(np.float16)\n",
+ " \n",
+ " query = VectorQuery(\n",
+ " vector=svs_query_vec,\n",
+ " vector_field_name=\"embedding\",\n",
+ " return_fields=[\"movie_id\", \"title\", \"genre\"],\n",
+ " dtype=target_dtype,\n",
+ " num_results=10\n",
+ " )\n",
+ " \n",
+ " try:\n",
+ " results = svs_index.query(query)\n",
+ " svs_results.append([doc[\"movie_id\"] for doc in results])\n",
+ " except Exception as e:\n",
+ " print(f\"Query {i+1} failed: {e}\")\n",
+ " svs_results.append([])\n",
+ "\n",
+ "svs_time = time.time() - svs_start\n",
+ "print(f\"SVS search time: {svs_time:.3f}s ({svs_time/num_test_queries:.3f}s per query)\")\n",
+ "\n",
+ "# Calculate recall if we have results\n",
+ "if svs_results and any(svs_results):\n",
+ " recalls = []\n",
+ " for flat_res, svs_res in zip(flat_results, svs_results):\n",
+ " if flat_res and svs_res:\n",
+ " intersection = set(flat_res).intersection(set(svs_res))\n",
+ " recall = len(intersection) / len(flat_res)\n",
+ " recalls.append(recall)\n",
+ " \n",
+ " if recalls:\n",
+ " avg_recall = np.mean(recalls)\n",
+ " print(f\"\\n๐ Average recall@10: {avg_recall:.3f} ({avg_recall*100:.1f}%)\")\n",
+ " \n",
+ " if avg_recall >= 0.9:\n",
+ " print(\"โ
Excellent search quality maintained\")\n",
+ " elif avg_recall >= 0.8:\n",
+ " print(\"โ
Good search quality maintained\")\n",
+ " else:\n",
+ " print(\"โ ๏ธ Search quality may be impacted - consider adjusting compression\")\n",
+ "else:\n",
+ " print(\"โ ๏ธ SVS index may still be indexing - search quality test pending\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 8: Migration Decision Framework\n",
+ "\n",
+ "Based on the results, let's determine if migration is recommended."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 49,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ฏ Migration Analysis & Recommendation\n",
+ "==================================================\n",
+ "Dataset: 20 documents, 768-dimensional vectors\n",
+ "Compression: LVQ4\n",
+ "Datatype: float32 โ float32\n",
+ "\n",
+ "Memory savings: -0.0% (Modest)\n",
+ "Search quality: 1.0% recall (Acceptable)\n",
+ "Performance: 1.4x vs original (Acceptable)\n",
+ "\n",
+ "๐ RECOMMENDATION:\n",
+ "โ MIGRATION NOT RECOMMENDED\n",
+ " โข Insufficient benefits for current dataset\n",
+ " โข Consider larger dataset or different compression\n",
+ " โข SVS-VAMANA works best with high-dimensional data\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"๐ฏ Migration Analysis & Recommendation\")\n",
+ "print(\"=\" * 50)\n",
+ "\n",
+ "# Fallback configuration if not defined (for CI/CD compatibility)\n",
+ "if 'selected_config' not in locals():\n",
+ " from redisvl.utils import CompressionAdvisor\n",
+ " selected_config = CompressionAdvisor.recommend(dims=dims, priority=\"memory\")\n",
+ "\n",
+ "# Summarize configuration\n",
+ "print(f\"Dataset: {num_docs} documents, {dims}-dimensional vectors\")\n",
+ "print(f\"Compression: {selected_config.get('compression', 'None')}\")\n",
+ "print(f\"Datatype: float32 โ {selected_config['datatype']}\")\n",
+ "if 'reduce' in selected_config:\n",
+ " reduction = ((dims - selected_config['reduce']) / dims) * 100\n",
+ " print(f\"Dimensions: {dims} โ {selected_config['reduce']} ({reduction:.1f}% reduction)\")\n",
+ "print()\n",
+ "\n",
+ "# Decision criteria\n",
+ "memory_savings_significant = False\n",
+ "search_quality_acceptable = True\n",
+ "performance_acceptable = True\n",
+ "\n",
+ "if flat_memory > 0 and svs_memory > 0:\n",
+ " savings_pct = ((flat_memory - svs_memory) / flat_memory) * 100\n",
+ " memory_savings_significant = savings_pct > 25 # 25%+ savings considered significant\n",
+ " print(f\"Memory savings: {savings_pct:.1f}% ({'Significant' if memory_savings_significant else 'Modest'})\")\n",
+ "else:\n",
+ " print(\"Memory savings: Pending (SVS index still indexing)\")\n",
+ "\n",
+ "if 'recalls' in locals() and recalls:\n",
+ " avg_recall = np.mean(recalls)\n",
+ " search_quality_acceptable = avg_recall >= 0.8 # 80%+ recall considered acceptable\n",
+ " print(f\"Search quality: {avg_recall:.1f}% recall ({'Acceptable' if search_quality_acceptable else 'Needs improvement'})\")\n",
+ "else:\n",
+ " print(\"Search quality: Pending validation\")\n",
+ "\n",
+ "if 'flat_time' in locals() and 'svs_time' in locals():\n",
+ " performance_ratio = svs_time / flat_time if flat_time > 0 else 1\n",
+ " performance_acceptable = performance_ratio <= 2.0 # Allow up to 2x slower\n",
+ " print(f\"Performance: {performance_ratio:.1f}x vs original ({'Acceptable' if performance_acceptable else 'Slower than expected'})\")\n",
+ "else:\n",
+ " print(\"Performance: Pending comparison\")\n",
+ "\n",
+ "\n",
+ "# Final recommendation\n",
+ "print(\"\\n๐ RECOMMENDATION:\")\n",
+ "if memory_savings_significant and search_quality_acceptable and performance_acceptable:\n",
+ " print(\"โ
MIGRATE TO SVS-VAMANA\")\n",
+ " print(\" โข Significant memory savings achieved\")\n",
+ " print(\" โข Search quality maintained\")\n",
+ " print(\" โข Performance impact acceptable\")\n",
+ " print(\" โข Cost reduction benefits clear\")\n",
+ "elif memory_savings_significant and search_quality_acceptable:\n",
+ " print(\"โ ๏ธ CONSIDER MIGRATION WITH MONITORING\")\n",
+ " print(\" โข Good memory savings and search quality\")\n",
+ " print(\" โข Monitor performance in production\")\n",
+ " print(\" โข Consider gradual rollout\")\n",
+ "elif memory_savings_significant:\n",
+ " print(\"โ ๏ธ MIGRATION NEEDS TUNING\")\n",
+ " print(\" โข Memory savings achieved\")\n",
+ " print(\" โข Search quality or performance needs improvement\")\n",
+ " print(\" โข Try different compression settings\")\n",
+ "else:\n",
+ " print(\"โ MIGRATION NOT RECOMMENDED\")\n",
+ " print(\" โข Insufficient benefits for current dataset\")\n",
+ " print(\" โข Consider larger dataset or different compression\")\n",
+ " print(\" โข SVS-VAMANA works best with high-dimensional data\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 9: Production Migration Checklist\n",
+ "\n",
+ "If migration is recommended, follow this checklist for production deployment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 50,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐ Production Migration Checklist\n",
+ "========================================\n",
+ "\n",
+ "PRE-MIGRATION:\n",
+ "โก Backup existing index data\n",
+ "โก Test migration on staging environment\n",
+ "โก Validate search quality with real queries\n",
+ "โก Measure baseline performance metrics\n",
+ "โก Plan rollback strategy\n",
+ "\n",
+ "MIGRATION:\n",
+ "โก Create SVS-VAMANA index with tested configuration\n",
+ "โก Migrate data in batches during low-traffic periods\n",
+ "โก Monitor memory usage and indexing progress\n",
+ "โก Validate data integrity after migration\n",
+ "โก Test search functionality thoroughly\n",
+ "\n",
+ "POST-MIGRATION:\n",
+ "โก Monitor search performance and quality\n",
+ "โก Track memory usage and cost savings\n",
+ "โก Update application configuration\n",
+ "โก Document new index settings\n",
+ "โก Clean up old index after validation period\n",
+ "\n",
+ "๐ก TIPS:\n",
+ "โข Start with a subset of data for initial validation\n",
+ "โข Use blue-green deployment for zero-downtime migration\n",
+ "โข Monitor for 24-48 hours before removing old index\n",
+ "โข Keep compression settings documented for future reference\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\n",
+ " \"๐ Production Migration Checklist\",\n",
+ " \"=\" * 40,\n",
+ " \"\\nPRE-MIGRATION:\",\n",
+ " \"โก Backup existing index data\",\n",
+ " \"โก Test migration on staging environment\",\n",
+ " \"โก Validate search quality with real queries\",\n",
+ " \"โก Measure baseline performance metrics\",\n",
+ " \"โก Plan rollback strategy\",\n",
+ " \"\\nMIGRATION:\",\n",
+ " \"โก Create SVS-VAMANA index with tested configuration\",\n",
+ " \"โก Migrate data in batches during low-traffic periods\",\n",
+ " \"โก Monitor memory usage and indexing progress\",\n",
+ " \"โก Validate data integrity after migration\",\n",
+ " \"โก Test search functionality thoroughly\",\n",
+ " \"\\nPOST-MIGRATION:\",\n",
+ " \"โก Monitor search performance and quality\",\n",
+ " \"โก Track memory usage and cost savings\",\n",
+ " \"โก Update application configuration\",\n",
+ " \"โก Document new index settings\",\n",
+ " \"โก Clean up old index after validation period\",\n",
+ " \"\\n๐ก TIPS:\",\n",
+ " \"โข Start with a subset of data for initial validation\",\n",
+ " \"โข Use blue-green deployment for zero-downtime migration\",\n",
+ " \"โข Monitor for 24-48 hours before removing old index\",\n",
+ " \"โข Keep compression settings documented for future reference\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 10: Cleanup\n",
+ "\n",
+ "Clean up the demonstration indices."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 51,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "๐งน Cleaning up demonstration indices...\n",
+ "โ
Deleted FLAT demonstration index\n",
+ "โ
Deleted SVS-VAMANA demonstration index\n",
+ "\n",
+ "๐ Migration demonstration complete!\n",
+ "\n",
+ "Next steps:\n",
+ "1. Apply learnings to your production data\n",
+ "2. Test with your actual query patterns\n",
+ "3. Monitor performance in your environment\n",
+ "4. Consider gradual rollout strategy\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"๐งน Cleaning up demonstration indices...\")\n",
+ "\n",
+ "# Clean up FLAT index\n",
+ "try:\n",
+ " flat_index.delete(drop=True)\n",
+ " print(\"โ
Deleted FLAT demonstration index\")\n",
+ "except Exception as e:\n",
+ " print(f\"โ ๏ธ Failed to delete FLAT index: {e}\")\n",
+ "\n",
+ "# Clean up SVS index\n",
+ "try:\n",
+ " svs_index.delete(drop=True)\n",
+ " print(\"โ
Deleted SVS-VAMANA demonstration index\")\n",
+ "except Exception as e:\n",
+ " print(f\"โ ๏ธ Failed to delete SVS index: {e}\")\n",
+ "\n",
+ "print(\n",
+ " \"\\n๐ Migration demonstration complete!\",\n",
+ " \"\\nNext steps:\",\n",
+ " \"1. Apply learnings to your production data\",\n",
+ " \"2. Test with your actual query patterns\",\n",
+ " \"3. Monitor performance in your environment\",\n",
+ " \"4. Consider gradual rollout strategy\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.12.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/python-recipes/vector-search/07_vector_algorithm_benchmark.ipynb b/python-recipes/vector-search/07_vector_algorithm_benchmark.ipynb
new file mode 100644
index 00000000..9acb9c81
--- /dev/null
+++ b/python-recipes/vector-search/07_vector_algorithm_benchmark.ipynb
@@ -0,0 +1,959 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "# Vector Algorithm Benchmark: FLAT vs HNSW vs SVS-VAMANA\n",
+ "\n",
+ "## Let's Begin!\n",
+ "
\n",
+ "\n",
+ "This notebook benchmarks FLAT, HNSW, and SVS-VAMANA vector search algorithms using **real data from Hugging Face** across different embedding dimensions.\n",
+ "\n",
+ "## What You'll Learn\n",
+ "\n",
+ "- **Memory usage comparison** across algorithms and dimensions\n",
+ "- **Index creation performance** with real text data\n",
+ "- **Query performance** and latency analysis\n",
+ "- **Search quality** with recall metrics on real embeddings\n",
+ "- **Algorithm selection guidance** based on your requirements\n",
+ "\n",
+ "## Benchmark Configuration\n",
+ "\n",
+ "- **Dataset**: SQuAD (Stanford Question Answering Dataset) from Hugging Face\n",
+ "- **Algorithms**: FLAT, HNSW, SVS-VAMANA\n",
+ "- **Dimensions**: 384, 768, 1536 (native sentence-transformer embeddings)\n",
+ "- **Dataset Size**: 1,000 documents per dimension\n",
+ "- **Query Set**: 50 real questions per configuration\n",
+ "- **Focus**: Real-world performance with actual text embeddings\n",
+ "\n",
+ "## Prerequisites\n",
+ "\n",
+ "- Redis Stack 8.2.0+ with RediSearch 2.8.10+\n",
+ "- At least 4GB RAM for comfortable benchmarking\n",
+ "- Internet connection for downloading SQuAD dataset\n",
+ "- ~30-45 minutes runtime for complete benchmark"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## ๐ฆ Installation & Setup\n",
+ "\n",
+ "**๐ณ Docker Setup (Required):**\n",
+ "\n",
+ "Before running this notebook, make sure Redis Stack is running:\n",
+ "\n",
+ "```bash\n",
+ "# Start Redis Stack with Docker\n",
+ "docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Install dependencies if needed\n",
+ "import sys\n",
+ "import subprocess\n",
+ "\n",
+ "def install_if_missing(package):\n",
+ " try:\n",
+ " __import__(package.split('[')[0]) # Handle package[extras] format\n",
+ " except ImportError:\n",
+ " print(f\"Installing {package}...\")\n",
+ " subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", package])\n",
+ "\n",
+ "# Check and install required packages\n",
+ "install_if_missing(\"redisvl\")\n",
+ "install_if_missing(\"matplotlib\")\n",
+ "install_if_missing(\"seaborn\")\n",
+ "install_if_missing(\"pandas\")\n",
+ "install_if_missing(\"datasets\")\n",
+ "install_if_missing(\"sentence-transformers\")\n",
+ "\n",
+ "print(\"โ
All dependencies are ready!\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Import required libraries\n",
+ "import os\n",
+ "import json\n",
+ "import time\n",
+ "import psutil\n",
+ "import numpy as np\n",
+ "import pandas as pd\n",
+ "import matplotlib.pyplot as plt\n",
+ "import seaborn as sns\n",
+ "from typing import Dict, List, Tuple, Any\n",
+ "from dataclasses import dataclass\n",
+ "from collections import defaultdict\n",
+ "\n",
+ "# Redis and RedisVL imports\n",
+ "import redis\n",
+ "from redisvl.index import SearchIndex\n",
+ "from redisvl.query import VectorQuery\n",
+ "from redisvl.redis.utils import array_to_buffer, buffer_to_array\n",
+ "from redisvl.utils import CompressionAdvisor\n",
+ "from redisvl.redis.connection import supports_svs\n",
+ "\n",
+ "# Configuration\n",
+ "REDIS_URL = \"redis://localhost:6379\"\n",
+ "np.random.seed(42) # For reproducible results\n",
+ "\n",
+ "# Set up plotting style\n",
+ "plt.style.use('default')\n",
+ "sns.set_palette(\"husl\")\n",
+ "\n",
+ "print(\"๐ Libraries imported successfully!\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Benchmark configuration\n",
+ "@dataclass\n",
+ "class BenchmarkConfig:\n",
+ " dimensions: List[int]\n",
+ " algorithms: List[str]\n",
+ " docs_per_dimension: int\n",
+ " query_count: int\n",
+ " \n",
+ "# Initialize benchmark configuration\n",
+ "config = BenchmarkConfig(\n",
+ " dimensions=[384, 768, 1536],\n",
+ " algorithms=['flat', 'hnsw', 'svs-vamana'],\n",
+ " docs_per_dimension=1000,\n",
+ " query_count=50\n",
+ ")\n",
+ "\n",
+ "print(\n",
+ " \"๐ง Benchmark Configuration:\",\n",
+ " f\"Dimensions: {config.dimensions}\",\n",
+ " f\"Algorithms: {config.algorithms}\",\n",
+ " f\"Documents per dimension: {config.docs_per_dimension:,}\",\n",
+ " f\"Test queries: {config.query_count}\",\n",
+ " f\"Total documents: {len(config.dimensions) * config.docs_per_dimension:,}\",\n",
+ " f\"Dataset: SQuAD from Hugging Face\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 1: Verify Redis and SVS Support"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Test Redis connection and capabilities\n",
+ "try:\n",
+ " client = redis.Redis.from_url(REDIS_URL)\n",
+ " client.ping()\n",
+ " \n",
+ " redis_info = client.info()\n",
+ " redis_version = redis_info['redis_version']\n",
+ " \n",
+ " svs_supported = supports_svs(client)\n",
+ " \n",
+ " print(\n",
+ " \"โ
Redis connection successful\",\n",
+ " f\"๐ Redis version: {redis_version}\",\n",
+ " f\"๐ง SVS-VAMANA supported: {'โ
Yes' if svs_supported else 'โ No'}\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ " \n",
+ " if not svs_supported:\n",
+ " print(\"โ ๏ธ SVS-VAMANA not supported. Benchmark will skip SVS tests.\")\n",
+ " config.algorithms = ['flat', 'hnsw'] # Remove SVS from tests\n",
+ " \n",
+ "except Exception as e:\n",
+ " print(f\"โ Redis connection failed: {e}\")\n",
+ " print(\"Please ensure Redis Stack is running on localhost:6379\")\n",
+ " raise"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 2: Load Real Dataset from Hugging Face\n",
+ "\n",
+ "Load the SQuAD dataset and generate real embeddings using sentence-transformers."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def load_squad_dataset(num_docs: int) -> List[Dict[str, Any]]:\n",
+ " \"\"\"Load SQuAD dataset from Hugging Face\"\"\"\n",
+ " try:\n",
+ " from datasets import load_dataset\n",
+ " \n",
+ " print(\"๐ฅ Loading SQuAD dataset from Hugging Face...\")\n",
+ " \n",
+ " # Load SQuAD dataset\n",
+ " dataset = load_dataset(\"squad\", split=\"train\")\n",
+ " \n",
+ " # Take a subset for our benchmark\n",
+ " dataset = dataset.select(range(min(num_docs, len(dataset))))\n",
+ " \n",
+ " # Convert to our format\n",
+ " documents = []\n",
+ " for i, item in enumerate(dataset):\n",
+ " # Combine question and context for richer text\n",
+ " text = f\"{item['question']} {item['context']}\"\n",
+ " \n",
+ " documents.append({\n",
+ " 'doc_id': f'squad_{i:06d}',\n",
+ " 'title': item['title'],\n",
+ " 'question': item['question'],\n",
+ " 'context': item['context'][:500], # Truncate long contexts\n",
+ " 'text': text,\n",
+ " 'category': 'qa', # All are Q&A documents\n",
+ " 'score': 1.0\n",
+ " })\n",
+ " \n",
+ " print(f\"โ
Loaded {len(documents)} documents from SQuAD\")\n",
+ " return documents\n",
+ " \n",
+ " except ImportError:\n",
+ " print(\"โ ๏ธ datasets library not available, falling back to local data\")\n",
+ " return load_local_fallback_data(num_docs)\n",
+ " except Exception as e:\n",
+ " print(f\"โ ๏ธ Failed to load SQuAD dataset: {e}\")\n",
+ " print(\"Falling back to local data...\")\n",
+ " return load_local_fallback_data(num_docs)\n",
+ "\n",
+ "def load_local_fallback_data(num_docs: int) -> List[Dict[str, Any]]:\n",
+ " \"\"\"Fallback to local movie dataset if SQuAD is not available\"\"\"\n",
+ " try:\n",
+ " import json\n",
+ " with open('resources/movies.json', 'r') as f:\n",
+ " movies = json.load(f)\n",
+ " \n",
+ " # Expand the small movie dataset by duplicating with variations\n",
+ " documents = []\n",
+ " for i in range(num_docs):\n",
+ " movie = movies[i % len(movies)]\n",
+ " documents.append({\n",
+ " 'doc_id': f'movie_{i:06d}',\n",
+ " 'title': f\"{movie['title']} (Variant {i // len(movies) + 1})\",\n",
+ " 'question': f\"What is {movie['title']} about?\",\n",
+ " 'context': movie['description'],\n",
+ " 'text': f\"What is {movie['title']} about? {movie['description']}\",\n",
+ " 'category': movie['genre'],\n",
+ " 'score': movie['rating']\n",
+ " })\n",
+ " \n",
+ " print(f\"โ
Using local movie dataset: {len(documents)} documents\")\n",
+ " return documents\n",
+ " \n",
+ " except Exception as e:\n",
+ " print(f\"โ Failed to load local data: {e}\")\n",
+ " raise"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def generate_embeddings_for_texts(texts: List[str], dimensions: int) -> np.ndarray:\n",
+ " \"\"\"Generate embeddings for texts using sentence-transformers\"\"\"\n",
+ " try:\n",
+ " from sentence_transformers import SentenceTransformer\n",
+ " \n",
+ " # Choose model based on target dimensions\n",
+ " if dimensions == 384:\n",
+ " model_name = 'all-MiniLM-L6-v2'\n",
+ " elif dimensions == 768:\n",
+ " model_name = 'all-mpnet-base-v2'\n",
+ " elif dimensions == 1536:\n",
+ " # For 1536D, use gtr-t5-xl which produces native 1536D embeddings\n",
+ " model_name = 'sentence-transformers/gtr-t5-xl'\n",
+ " else:\n",
+ " model_name = 'all-MiniLM-L6-v2' # Default\n",
+ " \n",
+ " print(f\"๐ค Generating {dimensions}D embeddings using {model_name}...\")\n",
+ " \n",
+ " model = SentenceTransformer(model_name)\n",
+ " embeddings = model.encode(texts, convert_to_numpy=True, show_progress_bar=True)\n",
+ " \n",
+ " # Handle dimension adjustment\n",
+ " current_dims = embeddings.shape[1]\n",
+ " if current_dims < dimensions:\n",
+ " # Pad with small random values (better than zeros)\n",
+ " padding_size = dimensions - current_dims\n",
+ " padding = np.random.normal(0, 0.01, (embeddings.shape[0], padding_size))\n",
+ " embeddings = np.concatenate([embeddings, padding], axis=1)\n",
+ " elif current_dims > dimensions:\n",
+ " # Truncate\n",
+ " embeddings = embeddings[:, :dimensions]\n",
+ " \n",
+ " # Normalize embeddings\n",
+ " norms = np.linalg.norm(embeddings, axis=1, keepdims=True)\n",
+ " embeddings = embeddings / norms\n",
+ " \n",
+ " print(f\"โ
Generated embeddings: {embeddings.shape}\")\n",
+ " return embeddings.astype(np.float32)\n",
+ " \n",
+ " except ImportError:\n",
+ " print(f\"โ ๏ธ sentence-transformers not available, using synthetic embeddings\")\n",
+ " return generate_synthetic_embeddings(len(texts), dimensions)\n",
+ " except Exception as e:\n",
+ " print(f\"โ ๏ธ Error generating embeddings: {e}\")\n",
+ " print(\"Falling back to synthetic embeddings...\")\n",
+ " return generate_synthetic_embeddings(len(texts), dimensions)\n",
+ "\n",
+ "def generate_synthetic_embeddings(num_docs: int, dimensions: int) -> np.ndarray:\n",
+ " \"\"\"Generate synthetic embeddings as fallback\"\"\"\n",
+ " print(f\"๐ Generating {num_docs} synthetic {dimensions}D embeddings...\")\n",
+ " \n",
+ " # Create base random vectors\n",
+ " embeddings = np.random.normal(0, 1, (num_docs, dimensions)).astype(np.float32)\n",
+ " \n",
+ " # Add some clustering structure\n",
+ " cluster_size = num_docs // 3\n",
+ " embeddings[:cluster_size, :min(50, dimensions)] += 0.5\n",
+ " embeddings[cluster_size:2*cluster_size, min(50, dimensions):min(100, dimensions)] += 0.5\n",
+ " \n",
+ " # Normalize vectors\n",
+ " norms = np.linalg.norm(embeddings, axis=1, keepdims=True)\n",
+ " embeddings = embeddings / norms\n",
+ " \n",
+ " return embeddings\n",
+ "\n",
+ "# Load real dataset and generate embeddings\n",
+ "print(\"๐ Loading real dataset and generating embeddings...\")\n",
+ "\n",
+ "# Load the base dataset once\n",
+ "raw_documents = load_squad_dataset(config.docs_per_dimension)\n",
+ "texts = [doc['text'] for doc in raw_documents]\n",
+ "\n",
+ "# Generate separate query texts (use questions from SQuAD)\n",
+ "query_texts = [doc['question'] for doc in raw_documents[:config.query_count]]\n",
+ "\n",
+ "benchmark_data = {}\n",
+ "query_data = {}\n",
+ "\n",
+ "for dim in config.dimensions:\n",
+ " print(f\"\\n๐ Processing {dim}D embeddings...\")\n",
+ " \n",
+ " # Generate embeddings for documents\n",
+ " embeddings = generate_embeddings_for_texts(texts, dim)\n",
+ " \n",
+ " # Generate embeddings for queries\n",
+ " query_embeddings = generate_embeddings_for_texts(query_texts, dim)\n",
+ " \n",
+ " # Combine documents with embeddings\n",
+ " documents = []\n",
+ " for i, (doc, embedding) in enumerate(zip(raw_documents, embeddings)):\n",
+ " documents.append({\n",
+ " **doc,\n",
+ " 'embedding': array_to_buffer(embedding, dtype='float32')\n",
+ " })\n",
+ " \n",
+ " benchmark_data[dim] = documents\n",
+ " query_data[dim] = query_embeddings\n",
+ "\n",
+ "print(\n",
+ " f\"\\nโ
Generated benchmark data:\",\n",
+ " f\"Total documents: {sum(len(docs) for docs in benchmark_data.values()):,}\",\n",
+ " f\"Total queries: {sum(len(queries) for queries in query_data.values()):,}\",\n",
+ " f\"Dataset source: {'SQuAD (Hugging Face)' if 'squad_' in raw_documents[0]['doc_id'] else 'Local movies'}\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 3: Index Creation Benchmark\n",
+ "\n",
+ "Measure index creation time and memory usage for each algorithm and dimension."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def create_index_schema(algorithm: str, dimensions: int, prefix: str) -> Dict[str, Any]:\n",
+ " \"\"\"Create index schema for the specified algorithm\"\"\"\n",
+ " \n",
+ " base_schema = {\n",
+ " \"index\": {\n",
+ " \"name\": f\"benchmark_{algorithm}_{dimensions}d\",\n",
+ " \"prefix\": prefix,\n",
+ " },\n",
+ " \"fields\": [\n",
+ " {\"name\": \"doc_id\", \"type\": \"tag\"},\n",
+ " {\"name\": \"title\", \"type\": \"text\"},\n",
+ " {\"name\": \"category\", \"type\": \"tag\"},\n",
+ " {\"name\": \"score\", \"type\": \"numeric\"},\n",
+ " {\n",
+ " \"name\": \"embedding\",\n",
+ " \"type\": \"vector\",\n",
+ " \"attrs\": {\n",
+ " \"dims\": dimensions,\n",
+ " \"distance_metric\": \"cosine\",\n",
+ " \"datatype\": \"float32\"\n",
+ " }\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ " \n",
+ " # Algorithm-specific configurations\n",
+ " vector_field = base_schema[\"fields\"][-1][\"attrs\"]\n",
+ " \n",
+ " if algorithm == 'flat':\n",
+ " vector_field[\"algorithm\"] = \"flat\"\n",
+ " \n",
+ " elif algorithm == 'hnsw':\n",
+ " vector_field.update({\n",
+ " \"algorithm\": \"hnsw\",\n",
+ " \"m\": 16,\n",
+ " \"ef_construction\": 200,\n",
+ " \"ef_runtime\": 10\n",
+ " })\n",
+ " \n",
+ " elif algorithm == 'svs-vamana':\n",
+ " # Get compression recommendation\n",
+ " compression_config = CompressionAdvisor.recommend(dims=dimensions, priority=\"memory\")\n",
+ " \n",
+ " vector_field.update({\n",
+ " \"algorithm\": \"svs-vamana\",\n",
+ " \"datatype\": compression_config.get('datatype', 'float32')\n",
+ " })\n",
+ " \n",
+ " # Handle dimensionality reduction for high dimensions\n",
+ " if 'reduce' in compression_config:\n",
+ " vector_field[\"dims\"] = compression_config['reduce']\n",
+ " \n",
+ " return base_schema\n",
+ "\n",
+ "def benchmark_index_creation(algorithm: str, dimensions: int, documents: List[Dict]) -> Tuple[SearchIndex, float, float]:\n",
+ " \"\"\"Benchmark index creation and return index, build time, and memory usage\"\"\"\n",
+ " \n",
+ " prefix = f\"bench:{algorithm}:{dimensions}d:\"\n",
+ " \n",
+ " # Clean up any existing index\n",
+ " try:\n",
+ " client.execute_command('FT.DROPINDEX', f'benchmark_{algorithm}_{dimensions}d')\n",
+ " except:\n",
+ " pass\n",
+ " \n",
+ " # Create schema and index\n",
+ " schema = create_index_schema(algorithm, dimensions, prefix)\n",
+ " \n",
+ " start_time = time.time()\n",
+ " \n",
+ " # Create index\n",
+ " index = SearchIndex.from_dict(schema, redis_url=REDIS_URL)\n",
+ " index.create(overwrite=True)\n",
+ " \n",
+ " # Load data in batches\n",
+ " batch_size = 100\n",
+ " for i in range(0, len(documents), batch_size):\n",
+ " batch = documents[i:i+batch_size]\n",
+ " index.load(batch)\n",
+ " \n",
+ " # Wait for indexing to complete\n",
+ " if algorithm == 'hnsw':\n",
+ " time.sleep(3) # HNSW needs more time for graph construction\n",
+ " else:\n",
+ " time.sleep(1)\n",
+ " \n",
+ " build_time = time.time() - start_time\n",
+ " \n",
+ " # Get index info for memory usage\n",
+ " try:\n",
+ " index_info = index.info()\n",
+ " index_size_mb = float(index_info.get('vector_index_sz_mb', 0))\n",
+ " except:\n",
+ " index_size_mb = 0.0\n",
+ " \n",
+ " return index, build_time, index_size_mb\n",
+ "\n",
+ "# Run index creation benchmarks\n",
+ "print(\"๐๏ธ Running index creation benchmarks...\")\n",
+ "\n",
+ "creation_results = {}\n",
+ "indices = {}\n",
+ "\n",
+ "for dim in config.dimensions:\n",
+ " print(f\"\\n๐ Benchmarking {dim}D embeddings:\")\n",
+ " \n",
+ " for algorithm in config.algorithms:\n",
+ " print(f\" Creating {algorithm.upper()} index...\")\n",
+ " \n",
+ " try:\n",
+ " index, build_time, index_size_mb = benchmark_index_creation(\n",
+ " algorithm, dim, benchmark_data[dim]\n",
+ " )\n",
+ " \n",
+ " creation_results[f\"{algorithm}_{dim}\"] = {\n",
+ " 'algorithm': algorithm,\n",
+ " 'dimensions': dim,\n",
+ " 'build_time_sec': build_time,\n",
+ " 'index_size_mb': index_size_mb,\n",
+ " 'num_docs': len(benchmark_data[dim])\n",
+ " }\n",
+ " \n",
+ " indices[f\"{algorithm}_{dim}\"] = index\n",
+ " \n",
+ " print(\n",
+ " f\" โ
{algorithm.upper()}: {build_time:.2f}s, {index_size_mb:.2f}MB\"\n",
+ " )\n",
+ " \n",
+ " except Exception as e:\n",
+ " print(f\" โ {algorithm.upper()} failed: {e}\")\n",
+ " creation_results[f\"{algorithm}_{dim}\"] = None\n",
+ "\n",
+ "print(\"\\nโ
Index creation benchmarks complete!\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 4: Query Performance Benchmark\n",
+ "\n",
+ "Measure query latency and search quality for each algorithm."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def calculate_recall(retrieved_ids: List[str], ground_truth_ids: List[str], k: int) -> float:\n",
+ " \"\"\"Calculate recall@k between retrieved and ground truth results\"\"\"\n",
+ " if not ground_truth_ids or not retrieved_ids:\n",
+ " return 0.0\n",
+ " \n",
+ " retrieved_set = set(retrieved_ids[:k])\n",
+ " ground_truth_set = set(ground_truth_ids[:k])\n",
+ " \n",
+ " if len(ground_truth_set) == 0:\n",
+ " return 0.0\n",
+ " \n",
+ " intersection = len(retrieved_set.intersection(ground_truth_set))\n",
+ " return intersection / len(ground_truth_set)\n",
+ "\n",
+ "def benchmark_query_performance(index: SearchIndex, query_vectors: np.ndarray, \n",
+ " algorithm: str, dimensions: int) -> Dict[str, float]:\n",
+ " \"\"\"Benchmark query performance and quality\"\"\"\n",
+ " \n",
+ " latencies = []\n",
+ " all_results = []\n",
+ " \n",
+ " # Get ground truth from FLAT index (if available)\n",
+ " ground_truth_results = []\n",
+ " flat_index_key = f\"flat_{dimensions}\"\n",
+ " \n",
+ " if flat_index_key in indices and algorithm != 'flat':\n",
+ " flat_index = indices[flat_index_key]\n",
+ " for query_vec in query_vectors:\n",
+ " query = VectorQuery(\n",
+ " vector=query_vec,\n",
+ " vector_field_name=\"embedding\",\n",
+ " return_fields=[\"doc_id\"],\n",
+ " dtype=\"float32\",\n",
+ " num_results=10\n",
+ " )\n",
+ " results = flat_index.query(query)\n",
+ " ground_truth_results.append([doc[\"doc_id\"] for doc in results])\n",
+ " \n",
+ " # Benchmark the target algorithm\n",
+ " for i, query_vec in enumerate(query_vectors):\n",
+ " # Adjust query vector for SVS if needed\n",
+ " if algorithm == 'svs-vamana':\n",
+ " compression_config = CompressionAdvisor.recommend(dims=dimensions, priority=\"memory\")\n",
+ " \n",
+ " if 'reduce' in compression_config:\n",
+ " target_dims = compression_config['reduce']\n",
+ " if target_dims < dimensions:\n",
+ " query_vec = query_vec[:target_dims]\n",
+ " \n",
+ " if compression_config.get('datatype') == 'float16':\n",
+ " query_vec = query_vec.astype(np.float16)\n",
+ " dtype = 'float16'\n",
+ " else:\n",
+ " dtype = 'float32'\n",
+ " else:\n",
+ " dtype = 'float32'\n",
+ " \n",
+ " # Execute query with timing\n",
+ " start_time = time.time()\n",
+ " \n",
+ " query = VectorQuery(\n",
+ " vector=query_vec,\n",
+ " vector_field_name=\"embedding\",\n",
+ " return_fields=[\"doc_id\", \"title\", \"category\"],\n",
+ " dtype=dtype,\n",
+ " num_results=10\n",
+ " )\n",
+ " \n",
+ " results = index.query(query)\n",
+ " latency = time.time() - start_time\n",
+ " \n",
+ " latencies.append(latency * 1000) # Convert to milliseconds\n",
+ " all_results.append([doc[\"doc_id\"] for doc in results])\n",
+ " \n",
+ " # Calculate metrics\n",
+ " avg_latency = np.mean(latencies)\n",
+ " \n",
+ " # Calculate recall if we have ground truth\n",
+ " if ground_truth_results and algorithm != 'flat':\n",
+ " recall_5_scores = []\n",
+ " recall_10_scores = []\n",
+ " \n",
+ " for retrieved, ground_truth in zip(all_results, ground_truth_results):\n",
+ " recall_5_scores.append(calculate_recall(retrieved, ground_truth, 5))\n",
+ " recall_10_scores.append(calculate_recall(retrieved, ground_truth, 10))\n",
+ " \n",
+ " recall_at_5 = np.mean(recall_5_scores)\n",
+ " recall_at_10 = np.mean(recall_10_scores)\n",
+ " else:\n",
+ " # FLAT is our ground truth, so perfect recall\n",
+ " recall_at_5 = 1.0 if algorithm == 'flat' else 0.0\n",
+ " recall_at_10 = 1.0 if algorithm == 'flat' else 0.0\n",
+ " \n",
+ " return {\n",
+ " 'avg_query_time_ms': avg_latency,\n",
+ " 'recall_at_5': recall_at_5,\n",
+ " 'recall_at_10': recall_at_10,\n",
+ " 'num_queries': len(query_vectors)\n",
+ " }\n",
+ "\n",
+ "# Run query performance benchmarks\n",
+ "print(\"๐ Running query performance benchmarks...\")\n",
+ "\n",
+ "query_results = {}\n",
+ "\n",
+ "for dim in config.dimensions:\n",
+ " print(f\"\\n๐ Benchmarking {dim}D queries:\")\n",
+ " \n",
+ " for algorithm in config.algorithms:\n",
+ " index_key = f\"{algorithm}_{dim}\"\n",
+ " \n",
+ " if index_key in indices:\n",
+ " print(f\" Testing {algorithm.upper()} queries...\")\n",
+ " \n",
+ " try:\n",
+ " performance = benchmark_query_performance(\n",
+ " indices[index_key], \n",
+ " query_data[dim], \n",
+ " algorithm, \n",
+ " dim\n",
+ " )\n",
+ " \n",
+ " query_results[index_key] = performance\n",
+ " \n",
+ " print(\n",
+ " f\" โ
{algorithm.upper()}: {performance['avg_query_time_ms']:.2f}ms avg, \"\n",
+ " f\"R@5: {performance['recall_at_5']:.3f}, R@10: {performance['recall_at_10']:.3f}\"\n",
+ " )\n",
+ " \n",
+ " except Exception as e:\n",
+ " print(f\" โ {algorithm.upper()} query failed: {e}\")\n",
+ " query_results[index_key] = None\n",
+ " else:\n",
+ " print(f\" โญ๏ธ Skipping {algorithm.upper()} (index creation failed)\")\n",
+ "\n",
+ "print(\"\\nโ
Query performance benchmarks complete!\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 5: Results Analysis and Visualization\n",
+ "\n",
+ "Analyze and visualize the benchmark results with real data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Combine results into comprehensive dataset\n",
+ "def create_results_dataframe() -> pd.DataFrame:\n",
+ " \"\"\"Combine all benchmark results into a pandas DataFrame\"\"\"\n",
+ " \n",
+ " results = []\n",
+ " \n",
+ " for dim in config.dimensions:\n",
+ " for algorithm in config.algorithms:\n",
+ " key = f\"{algorithm}_{dim}\"\n",
+ " \n",
+ " if key in creation_results and creation_results[key] is not None:\n",
+ " creation_data = creation_results[key]\n",
+ " query_data_item = query_results.get(key, {})\n",
+ " \n",
+ " result = {\n",
+ " 'algorithm': algorithm,\n",
+ " 'dimensions': dim,\n",
+ " 'num_docs': creation_data['num_docs'],\n",
+ " 'build_time_sec': creation_data['build_time_sec'],\n",
+ " 'index_size_mb': creation_data['index_size_mb'],\n",
+ " 'avg_query_time_ms': query_data_item.get('avg_query_time_ms', 0),\n",
+ " 'recall_at_5': query_data_item.get('recall_at_5', 0),\n",
+ " 'recall_at_10': query_data_item.get('recall_at_10', 0)\n",
+ " }\n",
+ " \n",
+ " results.append(result)\n",
+ " \n",
+ " return pd.DataFrame(results)\n",
+ "\n",
+ "# Create results DataFrame\n",
+ "df_results = create_results_dataframe()\n",
+ "\n",
+ "print(\"๐ Real Data Benchmark Results Summary:\")\n",
+ "print(df_results.to_string(index=False, float_format='%.3f'))\n",
+ "\n",
+ "# Display key insights\n",
+ "if not df_results.empty:\n",
+ " print(f\"\\n๐ฏ Key Insights from Real Data:\")\n",
+ " \n",
+ " # Memory efficiency\n",
+ " best_memory = df_results.loc[df_results['index_size_mb'].idxmin()]\n",
+ " print(f\"๐ Most memory efficient: {best_memory['algorithm'].upper()} at {best_memory['dimensions']}D ({best_memory['index_size_mb']:.2f}MB)\")\n",
+ " \n",
+ " # Query speed\n",
+ " best_speed = df_results.loc[df_results['avg_query_time_ms'].idxmin()]\n",
+ " print(f\"โก Fastest queries: {best_speed['algorithm'].upper()} at {best_speed['dimensions']}D ({best_speed['avg_query_time_ms']:.2f}ms)\")\n",
+ " \n",
+ " # Search quality\n",
+ " best_quality = df_results.loc[df_results['recall_at_10'].idxmax()]\n",
+ " print(f\"๐ฏ Best search quality: {best_quality['algorithm'].upper()} at {best_quality['dimensions']}D (R@10: {best_quality['recall_at_10']:.3f})\")\n",
+ " \n",
+ " # Dataset info\n",
+ " dataset_source = 'SQuAD (Hugging Face)' if 'squad_' in raw_documents[0]['doc_id'] else 'Local movies'\n",
+ " print(f\"\\n๐ Dataset: {dataset_source}\")\n",
+ " print(f\"๐ Total documents tested: {df_results['num_docs'].iloc[0]:,}\")\n",
+ " print(f\"๐ Total queries per dimension: {config.query_count}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Create visualizations for real data results\n",
+ "def create_real_data_visualizations(df: pd.DataFrame):\n",
+ " \"\"\"Create visualizations for real data benchmark results\"\"\"\n",
+ " \n",
+ " if df.empty:\n",
+ " print(\"โ ๏ธ No results to visualize\")\n",
+ " return\n",
+ " \n",
+ " # Set up the plotting area\n",
+ " fig, axes = plt.subplots(2, 2, figsize=(15, 10))\n",
+ " fig.suptitle('Real Data Vector Algorithm Benchmark Results', fontsize=16, fontweight='bold')\n",
+ " \n",
+ " # 1. Memory Usage Comparison\n",
+ " ax1 = axes[0, 0]\n",
+ " pivot_memory = df.pivot(index='dimensions', columns='algorithm', values='index_size_mb')\n",
+ " pivot_memory.plot(kind='bar', ax=ax1, width=0.8)\n",
+ " ax1.set_title('Index Size by Algorithm (Real Data)')\n",
+ " ax1.set_xlabel('Dimensions')\n",
+ " ax1.set_ylabel('Index Size (MB)')\n",
+ " ax1.legend(title='Algorithm')\n",
+ " ax1.tick_params(axis='x', rotation=0)\n",
+ " \n",
+ " # 2. Query Performance\n",
+ " ax2 = axes[0, 1]\n",
+ " pivot_query = df.pivot(index='dimensions', columns='algorithm', values='avg_query_time_ms')\n",
+ " pivot_query.plot(kind='bar', ax=ax2, width=0.8)\n",
+ " ax2.set_title('Average Query Time (Real Embeddings)')\n",
+ " ax2.set_xlabel('Dimensions')\n",
+ " ax2.set_ylabel('Query Time (ms)')\n",
+ " ax2.legend(title='Algorithm')\n",
+ " ax2.tick_params(axis='x', rotation=0)\n",
+ " \n",
+ " # 3. Search Quality\n",
+ " ax3 = axes[1, 0]\n",
+ " pivot_recall = df.pivot(index='dimensions', columns='algorithm', values='recall_at_10')\n",
+ " pivot_recall.plot(kind='bar', ax=ax3, width=0.8)\n",
+ " ax3.set_title('Search Quality (Recall@10)')\n",
+ " ax3.set_xlabel('Dimensions')\n",
+ " ax3.set_ylabel('Recall@10')\n",
+ " ax3.legend(title='Algorithm')\n",
+ " ax3.tick_params(axis='x', rotation=0)\n",
+ " ax3.set_ylim(0, 1.1)\n",
+ " \n",
+ " # 4. Memory Efficiency\n",
+ " ax4 = axes[1, 1]\n",
+ " df['docs_per_mb'] = df['num_docs'] / df['index_size_mb']\n",
+ " pivot_efficiency = df.pivot(index='dimensions', columns='algorithm', values='docs_per_mb')\n",
+ " pivot_efficiency.plot(kind='bar', ax=ax4, width=0.8)\n",
+ " ax4.set_title('Memory Efficiency (Real Data)')\n",
+ " ax4.set_xlabel('Dimensions')\n",
+ " ax4.set_ylabel('Documents per MB')\n",
+ " ax4.legend(title='Algorithm')\n",
+ " ax4.tick_params(axis='x', rotation=0)\n",
+ " \n",
+ " plt.tight_layout()\n",
+ " plt.show()\n",
+ "\n",
+ "# Create visualizations\n",
+ "create_real_data_visualizations(df_results)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 6: Real Data Insights and Recommendations\n",
+ "\n",
+ "Generate insights based on real data performance."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Generate real data specific recommendations\n",
+ "if not df_results.empty:\n",
+ " dataset_source = 'SQuAD (Hugging Face)' if 'squad_' in raw_documents[0]['doc_id'] else 'Local movies'\n",
+ " \n",
+ " print(\n",
+ " f\"๐ฏ Real Data Benchmark Insights\",\n",
+ " f\"Dataset: {dataset_source}\",\n",
+ " f\"Documents: {df_results['num_docs'].iloc[0]:,} per dimension\",\n",
+ " f\"Embedding Models: sentence-transformers\",\n",
+ " \"=\" * 50,\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ " \n",
+ " for dim in config.dimensions:\n",
+ " dim_data = df_results[df_results['dimensions'] == dim]\n",
+ " \n",
+ " if not dim_data.empty:\n",
+ " print(f\"\\n๐ {dim}D Embeddings Analysis:\")\n",
+ " \n",
+ " for _, row in dim_data.iterrows():\n",
+ " algo = row['algorithm'].upper()\n",
+ " print(\n",
+ " f\" {algo}:\",\n",
+ " f\" Index: {row['index_size_mb']:.2f}MB\",\n",
+ " f\" Query: {row['avg_query_time_ms']:.2f}ms\",\n",
+ " f\" Recall@10: {row['recall_at_10']:.3f}\",\n",
+ " f\" Efficiency: {row['docs_per_mb']:.1f} docs/MB\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ " \n",
+ " print(\n",
+ " f\"\\n๐ก Key Takeaways with Real Data:\",\n",
+ " \"โข Real embeddings show different performance characteristics than synthetic\",\n",
+ " \"โข Sentence-transformer models provide realistic vector distributions\",\n",
+ " \"โข SQuAD Q&A pairs offer diverse semantic content for testing\",\n",
+ " \"โข Results are more representative of production workloads\",\n",
+ " \"โข Consider testing with your specific embedding models and data\",\n",
+ " sep=\"\\n\"\n",
+ " )\n",
+ "else:\n",
+ " print(\"โ ๏ธ No results available for analysis\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 7: Cleanup\n",
+ "\n",
+ "Clean up benchmark indices to free memory."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Clean up all benchmark indices\n",
+ "print(\"๐งน Cleaning up benchmark indices...\")\n",
+ "\n",
+ "cleanup_count = 0\n",
+ "for index_key, index in indices.items():\n",
+ " try:\n",
+ " index.delete(drop=True)\n",
+ " cleanup_count += 1\n",
+ " print(f\" โ
Deleted {index_key}\")\n",
+ " except Exception as e:\n",
+ " print(f\" โ ๏ธ Failed to delete {index_key}: {e}\")\n",
+ "\n",
+ "dataset_source = 'SQuAD (Hugging Face)' if 'squad_' in raw_documents[0]['doc_id'] else 'Local movies'\n",
+ "\n",
+ "print(\n",
+ " f\"\\n๐ Real Data Benchmark Complete!\",\n",
+ " f\"Dataset: {dataset_source}\",\n",
+ " f\"Cleaned up {cleanup_count} indices\",\n",
+ " f\"\\nNext steps:\",\n",
+ " \"1. Review the real data performance characteristics above\",\n",
+ " \"2. Compare with synthetic data results if available\",\n",
+ " \"3. Test with your specific embedding models and datasets\",\n",
+ " \"4. Scale up with larger datasets for production insights\",\n",
+ " \"5. Consider the impact of real text diversity on algorithm performance\",\n",
+ " sep=\"\\n\"\n",
+ ")"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.5"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}