|
25 | 25 | "\n", |
26 | 26 | "- Redis Stack 8.2.0+ with RediSearch 2.8.10+\n", |
27 | 27 | "- Existing vector index with substantial data (1000+ documents recommended)\n", |
28 | | - "- Vector embeddings (384 dimensions using sentence-transformers/all-MiniLM-L6-v2)" |
| 28 | + "- Vector embeddings (768 dimensions using sentence-transformers/all-mpnet-base-v2)" |
29 | 29 | ] |
30 | 30 | }, |
31 | 31 | { |
|
193 | 193 | ], |
194 | 194 | "source": [ |
195 | 195 | "# Configuration for demonstration \n", |
196 | | - "dims = 384 # sentence-transformers/all-MiniLM-L6-v2 - 384 dims\n", |
| 196 | + "dims = 768 # sentence-transformers/all-mpnet-base-v2 - 768 dims\n", |
197 | 197 | "\n", |
198 | 198 | "num_docs = len(movies_data) # Use actual dataset size\n", |
199 | 199 | "\n", |
200 | 200 | "print(\n", |
201 | 201 | " \"📊 Migration Assessment\",\n", |
202 | | - " f\"Vector dimensions: {dims} (sentence-transformers/all-MiniLM-L6-v2)\",\n", |
| 202 | + " f\"Vector dimensions: {dims} (sentence-transformers/all-mpnet-base-v2)\",\n", |
203 | 203 | " f\"Dataset size: {num_docs} movie documents\",\n", |
204 | 204 | " \"Data includes: title, genre, rating, description\",\n", |
205 | 205 | " sep=\"\\n\"\n", |
|
311 | 311 | "from sentence_transformers import SentenceTransformer\n", |
312 | 312 | "\n", |
313 | 313 | "print(\"🔄 Generating embeddings for movie descriptions...\")\n", |
314 | | - "embedding_model=\"sentence-transformers/all-MiniLM-L6-v2\"\n", |
| 314 | + "embedding_model=\"sentence-transformers/all-mpnet-base-v2\"\n", |
315 | 315 | "\n", |
316 | 316 | "try:\n", |
317 | 317 | " # Try to use sentence-transformers for real embeddings\n", |
|
413 | 413 | "**Lower-Dimensional Vectors (<1024 dims)**: Uses **LVQ compression** without dimensionality reduction. Memory priority uses LVQ4 (4 bits), speed uses LVQ4x8 (12 bits),\n", |
414 | 414 | "balanced uses LVQ4x4 (8 bits). Achieves 60-87% memory savings.\n", |
415 | 415 | "\n", |
416 | | - "**Our Configuration (384 dims)**: Will use **LVQ compression** as we're below the 1024 dimension threshold. This provides excellent compression without dimensionality reduction.\n", |
| 416 | + "**Our Configuration (768 dims)**: Will use **LVQ compression** as we're below the 1024 dimension threshold. This provides excellent compression without dimensionality reduction.\n", |
417 | 417 | "\n", |
418 | 418 | "## Available Compression Types\n", |
419 | 419 | "- **LVQ4/LVQ4x4/LVQ4x8**: 4/8/12 bits per dimension\n", |
|
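The compression arithmetic in the last hunk can be checked with a short sketch. It assumes the uncompressed baseline is float32 (32 bits per dimension) and ignores any per-vector metadata the index may store, so the byte counts are payload-only estimates. The helper names `pick_lvq`, `payload_bytes`, and `savings_vs_float32` are hypothetical and are not part of the notebook or of any Redis client API. Only the sub-1024-dimension LVQ case described in this diff is covered.

```python
# Bits per dimension for each LVQ variant, as listed in the notebook text.
LVQ_BITS = {"LVQ4": 4, "LVQ4x4": 8, "LVQ4x8": 12}

# Priority -> LVQ variant, as described in the notebook for dims < 1024.
LVQ_BY_PRIORITY = {"memory": "LVQ4", "balanced": "LVQ4x4", "speed": "LVQ4x8"}

FLOAT32_BITS = 32  # assumption: uncompressed vectors are stored as float32


def pick_lvq(dims: int, priority: str) -> str:
    """Return the LVQ variant for a priority (hypothetical helper)."""
    if dims >= 1024:
        # The >=1024 case is outside this excerpt, so it is not guessed here.
        raise ValueError("this sketch only covers the <1024-dimension LVQ case")
    return LVQ_BY_PRIORITY[priority]


def payload_bytes(dims: int, bits_per_dim: int) -> int:
    """Payload size of one vector in bytes, ignoring per-vector metadata."""
    return dims * bits_per_dim // 8


def savings_vs_float32(bits_per_dim: int) -> float:
    """Fraction of payload memory saved relative to float32 storage."""
    return 1 - bits_per_dim / FLOAT32_BITS


if __name__ == "__main__":
    dims = 768  # all-mpnet-base-v2, per the updated notebook
    print(f"float32: {payload_bytes(dims, FLOAT32_BITS)} bytes per vector")
    for priority in ("memory", "balanced", "speed"):
        name = pick_lvq(dims, priority)
        bits = LVQ_BITS[name]
        print(
            f"{priority:>8}: {name:<7} {payload_bytes(dims, bits)} bytes "
            f"({savings_vs_float32(bits):.1%} saved)"
        )
```

Under these assumptions the three variants save 87.5%, 75%, and 62.5% of the float32 payload, which is consistent with the 60-87% range quoted in the notebook. Moving from 384 to 768 dimensions doubles every per-vector byte count but leaves the percentages unchanged.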