Commit 725af07

docs(migrate): add migration guides and field attributes reference
- Add conceptual guide: how migrations work (Diataxis explanation)
- Add task guide: step-by-step migration walkthrough (Diataxis how-to)
- Expand field-attributes.md with migration support matrix
- Add vector datatypes table with algorithm compatibility
- Update navigation indexes to include new guides
- Normalize SVS-VAMANA naming throughout docs
1 parent b06e949 commit 725af07

File tree

8 files changed

+657
-8
lines changed

docs/concepts/field-attributes.md

Lines changed: 90 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -267,7 +267,7 @@ Key vector attributes:
267267
- `dims`: Vector dimensionality (required)
268268
- `algorithm`: `flat`, `hnsw`, or `svs-vamana`
269269
- `distance_metric`: `COSINE`, `L2`, or `IP`
270-
- `datatype`: `float16`, `float32`, `float64`, or `bfloat16`
270+
- `datatype`: Vector precision (see table below)
271271
- `index_missing`: Allow searching for documents without vectors
272272

273273
```yaml
@@ -281,6 +281,48 @@ Key vector attributes:
281281
index_missing: true # Handle documents without embeddings
282282
```
283283

284+
### Vector Datatypes
285+
286+
The `datatype` attribute controls how vector components are stored. Smaller datatypes reduce memory usage but may affect precision.
287+
288+
| Datatype | Bits | Memory (768 dims) | Use Case |
289+
|----------|------|-------------------|----------|
290+
| `float32` | 32 | 3 KB | Default. Best precision for most applications. |
291+
| `float16` | 16 | 1.5 KB | Good balance of memory and precision. Recommended for large-scale deployments. |
292+
| `bfloat16` | 16 | 1.5 KB | Better dynamic range than float16. Useful when embeddings have large value ranges. |
293+
| `float64` | 64 | 6 KB | Maximum precision. Rarely needed. |
294+
| `int8` | 8 | 768 B | Integer quantization. Significant memory savings with some precision loss. |
295+
| `uint8` | 8 | 768 B | Unsigned integer quantization. For embeddings with non-negative values. |
296+
297+
**Algorithm Compatibility:**
298+
299+
| Datatype | FLAT | HNSW | SVS-VAMANA |
300+
|----------|------|------|------------|
301+
| `float32` | Yes | Yes | Yes |
302+
| `float16` | Yes | Yes | Yes |
303+
| `bfloat16` | Yes | Yes | No |
304+
| `float64` | Yes | Yes | No |
305+
| `int8` | Yes | Yes | No |
306+
| `uint8` | Yes | Yes | No |
307+
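The compatibility table above can be encoded as a small lookup for validating a schema before it is applied. This is a sketch built only from the table; `is_supported` is a hypothetical helper, not a RedisVL function:

```python
# Datatypes and algorithms listed in the compatibility table above.
ALL_DATATYPES = {"float32", "float16", "bfloat16", "float64", "int8", "uint8"}
SVS_VAMANA_DATATYPES = {"float16", "float32"}
ALGORITHMS = {"flat", "hnsw", "svs-vamana"}


def is_supported(algorithm: str, datatype: str) -> bool:
    """Return True if the table lists this algorithm/datatype pair as 'Yes'."""
    algorithm = algorithm.lower()
    datatype = datatype.lower()
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    if datatype not in ALL_DATATYPES:
        raise ValueError(f"unknown datatype: {datatype!r}")
    if algorithm == "svs-vamana":
        return datatype in SVS_VAMANA_DATATYPES
    # FLAT and HNSW accept every datatype in the table.
    return True


print(is_supported("svs-vamana", "bfloat16"))
print(is_supported("hnsw", "int8"))
```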
308+
**Choosing a Datatype:**
309+
310+
- **Start with `float32`** unless you have memory constraints
311+
- **Use `float16`** for production systems with millions of vectors (50% memory savings, minimal precision loss)
312+
- **Use `int8`/`uint8`** only after benchmarking recall on your specific dataset
313+
- **SVS-VAMANA users**: Must use `float16` or `float32`
314+
315+
**Quantization with the Migrator:**
316+
317+
You can change vector datatypes on existing indexes using the migration wizard:
318+
319+
```bash
320+
rvl migrate wizard --index my_index --url redis://localhost:6379
321+
# Select "Update field" > choose vector field > change datatype
322+
```
323+
324+
The migrator automatically re-encodes stored vectors to the new precision. See {doc}`/user_guide/how_to_guides/migrate-indexes` for details.
325+
284326
## Redis-Specific Subtleties
285327

286328
### Modifier Ordering
@@ -304,6 +346,53 @@ Not all attributes work with all field types:
304346
| `unf` | ✓ | ✗ | ✓ | ✗ | ✗ |
305347
| `withsuffixtrie` | ✓ | ✓ | ✗ | ✗ | ✗ |
306348

349+
### Migration Support
350+
351+
The migration wizard (`rvl migrate wizard`) supports updating field attributes on existing indexes. The tables below show which attributes the wizard can update and which require editing a schema patch by hand.
352+
353+
**Wizard Prompts:**
354+
355+
| Attribute | Text | Tag | Numeric | Geo | Vector |
356+
|-----------|------|-----|---------|-----|--------|
357+
| `sortable` | Wizard | Wizard | Wizard | Wizard | N/A |
358+
| `index_missing` | Wizard | Wizard | Wizard | Wizard | N/A |
359+
| `index_empty` | Wizard | Wizard | N/A | N/A | N/A |
360+
| `no_index` | Wizard | Wizard | Wizard | Wizard | N/A |
361+
| `unf` | Wizard* | N/A | Wizard* | N/A | N/A |
362+
| `separator` | N/A | Wizard | N/A | N/A | N/A |
363+
| `case_sensitive` | N/A | Wizard | N/A | N/A | N/A |
364+
| `no_stem` | Wizard | N/A | N/A | N/A | N/A |
365+
| `weight` | Wizard | N/A | N/A | N/A | N/A |
366+
| `algorithm` | N/A | N/A | N/A | N/A | Wizard |
367+
| `datatype` | N/A | N/A | N/A | N/A | Wizard |
368+
| `distance_metric` | N/A | N/A | N/A | N/A | Wizard |
369+
| `m`, `ef_construction` | N/A | N/A | N/A | N/A | Wizard |
370+
371+
*\* `unf` is only prompted when `sortable` is enabled.*
372+
373+
**Manual Schema Patch Required:**
374+
375+
| Attribute | Notes |
376+
|-----------|-------|
377+
| `phonetic_matcher` | Enable phonetic search |
378+
| `withsuffixtrie` | Suffix/contains search optimization |
379+
380+
**Example manual patch** for adding `index_missing` to a field:
381+
382+
```yaml
383+
# schema_patch.yaml
384+
version: 1
385+
changes:
386+
update_fields:
387+
- name: category
388+
attrs:
389+
index_missing: true
390+
```
391+
392+
```bash
393+
rvl migrate plan --index my_index --schema-patch schema_patch.yaml
394+
```
395+
307396
### JSON Path for Nested Fields
308397

309398
When using JSON storage, use the `path` attribute to index nested fields:

docs/concepts/index-migrations.md

Lines changed: 145 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,145 @@
1+
---
2+
myst:
3+
html_meta:
4+
"description lang=en": |
5+
Learn how RedisVL index migrations work and which schema changes are supported.
6+
---
7+
8+
# Index Migrations
9+
10+
Redis Search indexes are immutable. To change an index schema, you must drop the existing index and create a new one. RedisVL provides a migration workflow that automates this process while preserving your data.
11+
12+
This page explains how migrations work and which changes are supported. For step-by-step instructions, see the [migration guide](../user_guide/how_to_guides/migrate-indexes.md).
13+
14+
## Supported and blocked changes
15+
16+
The migrator classifies schema changes into two categories:
17+
18+
| Change | Status |
19+
|--------|--------|
20+
| Add or remove a field | Supported |
21+
| Change field options (sortable, separator) | Supported |
22+
| Change vector algorithm (FLAT, HNSW, SVS-VAMANA) | Supported |
23+
| Change distance metric (COSINE, L2, IP) | Supported |
24+
| Tune algorithm parameters (M, EF_CONSTRUCTION) | Supported |
25+
| Quantize vectors (float32 to float16) | Supported |
26+
| Change vector dimensions | Blocked |
27+
| Change key prefix | Blocked |
28+
| Rename a field | Blocked |
29+
| Change storage type (hash to JSON) | Blocked |
30+
| Add a new vector field | Blocked |
31+
32+
**Supported** changes can be applied automatically using `rvl migrate`. The migrator handles the index rebuild and any necessary data transformations.
33+
34+
**Blocked** changes require manual intervention because they involve incompatible data formats or missing data. The migrator will reject these changes and explain why.
35+
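The supported/blocked split can be pictured as a lookup over change kinds. The sketch below mirrors the table above; the change-kind labels and the `classify` function are hypothetical and are not the migrator's actual internals:

```python
# Hypothetical change-kind labels, one per row of the table above.
SUPPORTED = {
    "add_field",
    "remove_field",
    "change_field_options",
    "change_algorithm",
    "change_distance_metric",
    "tune_algorithm_params",
    "quantize_vectors",
}

# Blocked kinds mapped to the reason given on this page.
BLOCKED = {
    "change_dims": "vectors must be re-embedded with a different model",
    "change_prefix": "every document would have to be copied to a new key",
    "rename_field": "field names are stored in every document",
    "change_storage_type": "hash and JSON use different data layouts",
    "add_vector_field": "existing documents have no vectors for the new field",
}


def classify(changes):
    """Split requested changes into a supported list and a blocked dict."""
    supported, blocked = [], {}
    for change in changes:
        if change in SUPPORTED:
            supported.append(change)
        elif change in BLOCKED:
            blocked[change] = BLOCKED[change]
        else:
            raise ValueError(f"unknown change kind: {change!r}")
    return supported, blocked


ok, rejected = classify(["change_algorithm", "rename_field"])
print(ok)
print(rejected)
```

Keeping the reason next to each blocked kind matches the behavior described above, where a rejected change comes with an explanation.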
36+
## How the migrator works
37+
38+
The migrator uses a plan-first workflow:
39+
40+
1. **Plan**: Capture the current schema, classify your changes, and generate a migration plan
41+
2. **Review**: Inspect the plan before making any changes
42+
3. **Apply**: Drop the index, transform data if needed, and recreate with the new schema
43+
4. **Validate**: Verify the result matches expectations
44+
45+
This separation ensures you always know what will happen before any changes are made.
46+
47+
## Migration mode: drop_recreate
48+
49+
The `drop_recreate` mode rebuilds the index in place while preserving your documents.
50+
51+
The process:
52+
53+
1. Drop only the index structure (documents remain in Redis)
54+
2. For datatype changes, re-encode vectors to the target precision
55+
3. Recreate the index with the new schema
56+
4. Wait for Redis to re-index the existing documents
57+
5. Validate the result
58+
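Step 4 amounts to polling the index until background indexing finishes. A minimal sketch, assuming the index info mapping exposes a `percent_indexed` value that reaches 1 when indexing is complete (as `FT.INFO` reports); `wait_until_indexed` is an illustrative helper, not the migrator's implementation:

```python
import time
from typing import Callable, Mapping


def wait_until_indexed(
    get_info: Callable[[], Mapping],
    timeout_s: float = 600.0,
    poll_s: float = 1.0,
) -> float:
    """Poll get_info() until percent_indexed reaches 1.0 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        info = get_info()
        # The value may arrive as a string, so normalize it to float.
        percent = float(info.get("percent_indexed", 0))
        if percent >= 1.0:
            return percent
        if time.monotonic() >= deadline:
            raise TimeoutError(f"indexing stuck at {percent:.0%}")
        time.sleep(poll_s)
```

With redis-py, `get_info` could be `lambda: client.ft("my_index").info()`, assuming `client` is a connected `redis.Redis` instance. Queries issued before the helper returns may see partial results.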
59+
**Tradeoff**: The index is unavailable during the rebuild. The migrator requires explicit acknowledgment of this downtime before proceeding.
60+
61+
## Index only vs document dependent changes
62+
63+
Schema changes fall into two categories based on whether they require modifying stored data.
64+
65+
**Index only changes** affect how Redis Search indexes data, not the data itself:
66+
67+
- Algorithm changes: The stored vector bytes are identical. Only the index structure differs.
68+
- Distance metric changes: Same vectors, different similarity calculation.
69+
- Adding or removing fields: The documents already contain the data. The index just starts or stops indexing it.
70+
71+
These changes complete quickly because they only require rebuilding the index.
72+
73+
**Document dependent changes** require modifying the stored data:
74+
75+
- Datatype changes (float32 to float16): Stored vector bytes must be re-encoded.
76+
- Field renames: Stored field names must be updated in every document.
77+
- Dimension changes: Vectors must be re-embedded with a different model.
78+
79+
The migrator handles datatype changes automatically. Other document-dependent changes are blocked because they require application-level logic or external services.
80+
81+
## Vector quantization
82+
83+
Changing vector precision from float32 to float16 reduces memory usage at the cost of slight precision loss. The migrator handles this automatically by:
84+
85+
1. Reading all vectors from Redis
86+
2. Converting to the target precision
87+
3. Writing updated vectors back
88+
4. Recreating the index with the new schema
89+
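The conversion in step 2 can be illustrated with the standard library alone. This is a sketch, not the migrator's code, and it assumes vectors are stored as raw little-endian float32 bytes:

```python
import struct


def reencode_f32_to_f16(buf: bytes) -> bytes:
    """Convert a raw float32 vector buffer to float16 (half precision)."""
    if len(buf) % 4 != 0:
        raise ValueError("buffer length is not a multiple of 4 bytes")
    n = len(buf) // 4
    values = struct.unpack(f"<{n}f", buf)
    # 'e' is the IEEE 754 half-precision format; values outside the
    # float16 range raise OverflowError instead of being clipped.
    return struct.pack(f"<{n}e", *values)


original = struct.pack("<4f", 0.5, -1.0, 2.0, 0.25)
converted = reencode_f32_to_f16(original)
print(len(original), "->", len(converted), "bytes")
```

The output buffer is half the size of the input, which is where the memory savings come from. Values that float16 cannot represent exactly are rounded, which is the precision loss mentioned above.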
90+
Typical reductions:
91+
92+
| Metric | Value |
93+
|--------|-------|
94+
| Index size reduction | ~50% |
95+
| Memory reduction | ~35% |
96+
97+
Quantization time is proportional to document count. Plan for downtime accordingly.
98+
99+
## Why some changes are blocked
100+
101+
### Vector dimension changes
102+
103+
Vector dimensions are determined by your embedding model. A 384-dimensional vector from one model is incompatible with a 768-dimensional index that expects vectors from a different model. An existing embedding cannot be resized; it has to be regenerated.
104+
105+
**Resolution**: Re-embed your documents using the new model and load them into a new index.
106+
107+
### Prefix changes
108+
109+
Changing a prefix from `docs:` to `articles:` requires copying every document to a new key. This operation doubles storage temporarily and can leave orphaned keys if interrupted.
110+
111+
**Resolution**: Create a new index with the new prefix and reload your data.
112+
113+
### Field renames
114+
115+
Field names are stored in the documents themselves as hash field names or JSON keys. Renaming requires iterating through every document and updating the field name.
116+
117+
**Resolution**: Create a new index with the correct field name and reload your data.
118+
119+
### Storage type changes
120+
121+
Hash and JSON have different data layouts. Hash stores flat key value pairs. JSON stores nested structures. Converting between them requires understanding your schema and restructuring each document.
122+
123+
**Resolution**: Export your data, transform it to the new format, and reload into a new index.
124+
125+
### Adding a vector field
126+
127+
Adding a vector field means all existing documents need vectors for that field. The migrator cannot generate these vectors because it does not know which embedding model to use or what content to embed.
128+
129+
**Resolution**: Add vectors to your documents using your application, then run the migration.
130+
131+
## Downtime considerations
132+
133+
With `drop_recreate`, your index is unavailable between the drop and when re-indexing completes. Plan for:
134+
135+
- Search unavailability during the migration window
136+
- Partial results while indexing is in progress
137+
- Resource usage from the re-indexing process
138+
- Quantization time if changing vector datatypes
139+
140+
The duration depends on document count, field count, and vector dimensions. For large indexes, consider running migrations during low traffic periods.
141+
142+
## Learn more
143+
144+
- [Migration guide](../user_guide/how_to_guides/migrate-indexes.md): Step by step instructions
145+
- [Search and indexing](search-and-indexing.md): How Redis Search indexes work

docs/concepts/index.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,13 @@ How RedisVL components connect: schemas, indexes, queries, and extensions.
2626
Schemas, fields, documents, storage types, and query patterns.
2727
:::
2828

29+
:::{grid-item-card} 🔄 Index Migrations
30+
:link: index-migrations
31+
:link-type: doc
32+
33+
How RedisVL handles migration planning, rebuilds, and future shadow migration.
34+
:::
35+
2936
:::{grid-item-card} 🏷️ Field Attributes
3037
:link: field-attributes
3138
:link-type: doc
@@ -62,6 +69,7 @@ Pre-built patterns: caching, message history, and semantic routing.
6269
6370
architecture
6471
search-and-indexing
72+
index-migrations
6573
field-attributes
6674
queries
6775
utilities

docs/concepts/search-and-indexing.md

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -106,9 +106,14 @@ To change a schema, you create a new index with the updated configuration, reind
106106

107107
Planning your schema carefully upfront reduces the need for migrations, but the capability exists when requirements evolve.
108108

109-
---
109+
RedisVL now includes a dedicated migration workflow for this lifecycle:
110+
111+
- `drop_recreate` for document-preserving rebuilds, including vector quantization (`float32` to `float16`)
110112

111-
**Related concepts:** {doc}`field-attributes` explains how to configure field options like `sortable` and `index_missing`. {doc}`queries` covers the different query types available.
113+
That means schema evolution is no longer only a manual operational pattern. It is also a product surface in RedisVL with a planner, CLI, and validation artifacts.
114+
115+
---
112116

113-
**Learn more:** {doc}`/user_guide/01_getting_started` walks through building your first index. {doc}`/user_guide/05_hash_vs_json` compares storage options in depth. {doc}`/user_guide/02_complex_filtering` covers query composition.
117+
**Related concepts:** {doc}`field-attributes` explains how to configure field options like `sortable` and `index_missing`. {doc}`queries` covers the different query types available. {doc}`index-migrations` explains migration modes, supported changes, and architecture.
114118

119+
**Learn more:** {doc}`/user_guide/01_getting_started` walks through building your first index. {doc}`/user_guide/05_hash_vs_json` compares storage options in depth. {doc}`/user_guide/02_complex_filtering` covers query composition. {doc}`/user_guide/how_to_guides/migrate-indexes` shows how to use the migration CLI in practice.

docs/user_guide/cli.ipynb

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66
"source": [
77
"# The RedisVL CLI\n",
88
"\n",
9-
"RedisVL is a Python library with a dedicated CLI to help load and create vector search indices within Redis.\n",
9+
"RedisVL is a Python library with a dedicated CLI to help load, inspect, migrate, and create vector search indices within Redis.\n",
1010
"\n",
1111
"This notebook will walk through how to use the Redis Vector Library CLI (``rvl``).\n",
1212
"\n",
@@ -50,7 +50,12 @@
5050
"| `rvl index` | `delete --index` or `-i <index_name>` | remove the specified index, leaving the data still in Redis|\n",
5151
"| `rvl index` | `destroy --index` or `-i <index_name>`| remove the specified index, as well as the associated data|\n",
5252
"| `rvl stats` | `--index` or `-i <index_name>` | display the index statistics, including number of docs, average bytes per record, indexing time, etc|\n",
53-
"| `rvl stats` | `--schema` or `-s <schema.yaml>` | display the index statistics of a schema defined in <schema.yaml>. The index must have already been created within Redis|"
53+
"| `rvl stats` | `--schema` or `-s <schema.yaml>` | display the index statistics of a schema defined in <schema.yaml>. The index must have already been created within Redis|\n",
54+
"| `rvl migrate` | `helper` or `list` | show migration guidance and list indexes available for migration|\n",
55+
"| `rvl migrate` | `wizard` | interactively build a migration plan and schema patch|\n",
56+
"| `rvl migrate` | `plan` | generate `migration_plan.yaml` from a patch or target schema|\n",
57+
"| `rvl migrate` | `apply --allow-downtime` | execute a reviewed `drop_recreate` migration|\n",
58+
"| `rvl migrate` | `validate` | validate a completed migration and emit report artifacts|"
5459
]
5560
},
5661
{

docs/user_guide/how_to_guides/index.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@ How-to guides are **task-oriented** recipes that help you accomplish specific go
3434
:::{grid-item-card} 💾 Storage
3535

3636
- [Choose a Storage Type](../05_hash_vs_json.ipynb) -- Hash vs JSON formats and nested data
37+
- [Migrate an Index](migrate-indexes.md) -- use the migrator helper, wizard, plan, apply, and validate workflow
3738
:::
3839

3940
:::{grid-item-card} 💻 CLI Operations
@@ -59,6 +60,7 @@ How-to guides are **task-oriented** recipes that help you accomplish specific go
5960
| Optimize index performance | [Optimize Indexes with SVS-VAMANA](../09_svs_vamana.ipynb) |
6061
| Decide on storage format | [Choose a Storage Type](../05_hash_vs_json.ipynb) |
6162
| Manage indices from terminal | [Manage Indices with the CLI](../cli.ipynb) |
63+
| Plan and run a supported index migration | [Migrate an Index](migrate-indexes.md) |
6264

6365
```{toctree}
6466
:hidden:
@@ -74,4 +76,5 @@ Optimize Indexes with SVS-VAMANA <../09_svs_vamana>
7476
Cache Embeddings <../10_embeddings_cache>
7577
Use Advanced Query Types <../11_advanced_queries>
7678
Write SQL Queries for Redis <../12_sql_to_redis_queries>
79+
Migrate an Index <migrate-indexes>
7780
```
