The RedisVL validation system ensures that data written to Redis indexes conforms to the defined schema. It uses dynamic Pydantic model generation to validate objects before they are stored.

## Key Features

- **Schema-Based Validation**: Validates objects against your index schema definition
- **Dynamic Model Generation**: Creates Pydantic models on the fly based on your schema
- **Type Checking**: Ensures fields contain appropriate data types
- **Field-Specific Validation**:
  - Text and Tag fields must be strings
  - Numeric fields must be integers or floats
  - Geo fields must be properly formatted latitude/longitude strings
  - Vector fields must have the correct dimensions and data types
- **JSON Path Support**: Validates fields extracted from nested JSON structures
- **Fail-Fast Approach**: Stops processing at the first validation error
- **Performance Optimized**: Caches models for repeated validation
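To make the field rules above concrete, here is a minimal stdlib sketch of the per-type checks. It is an illustration only, not RedisVL's actual implementation (which generates Pydantic models dynamically), and `check_field` is a hypothetical helper:

```python
import re

def check_field(field_type, value, dims=None):
    """Toy per-type check; raises ValueError on the first violation (fail-fast)."""
    if field_type in ("text", "tag"):
        if not isinstance(value, str):
            raise ValueError(f"{field_type} fields must be strings")
    elif field_type == "numeric":
        # bool is a subclass of int, so exclude it explicitly
        if not isinstance(value, (int, float)) or isinstance(value, bool):
            raise ValueError("numeric fields must be ints or floats")
    elif field_type == "geo":
        # expects a "lat,long" string, e.g. "37.77,-122.42"
        if not isinstance(value, str) or not re.fullmatch(
            r"-?\d+(\.\d+)?,-?\d+(\.\d+)?", value
        ):
            raise ValueError("geo fields must be 'lat,long' strings")
    elif field_type == "vector":
        if not isinstance(value, (list, bytes)):
            raise ValueError("vector fields must be lists or bytes")
        # dimensions can only be checked for list input (see Limitations)
        if isinstance(value, list) and dims is not None and len(value) != dims:
            raise ValueError(f"expected {dims} dimensions, got {len(value)}")

check_field("geo", "37.77,-122.42")        # ok
check_field("vector", [0.1, 0.2], dims=2)  # ok
```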
## Usage

### Basic Validation

```python
from redisvl.schema.validation import validate_object

# Assuming you have a schema defined
validated_data = validate_object(schema, data)
```
### Storage Integration
The validation is automatically integrated with the storage classes, so objects are checked against the schema each time they are written.

## Performance

The validation system is optimized for performance:

- **Model Caching**: Pydantic models are cached by schema name to avoid regeneration
- **Lazy Validation**: Fields are validated only when needed
- **Fail-Fast Approach**: Processing stops at the first validation error
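The model-caching point can be sketched as a dictionary keyed by schema name; `get_validation_model` below is a hypothetical stand-in for the internal cache, not part of RedisVL's public API:

```python
# Cache of generated "models", keyed by schema name.
_model_cache = {}

def get_validation_model(schema_name, fields):
    """Return the model for a schema, generating it only on the first call."""
    if schema_name not in _model_cache:
        # Stand-in for dynamic Pydantic model generation.
        _model_cache[schema_name] = {"name": schema_name, "fields": dict(fields)}
    return _model_cache[schema_name]

m1 = get_validation_model("products", {"title": "text", "price": "numeric"})
m2 = get_validation_model("products", {"title": "text", "price": "numeric"})
assert m1 is m2  # second call reuses the cached model instead of regenerating
```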
For large datasets, validation can be a significant part of the processing time. If you need to write many objects with the same structure, consider validating a sample first to ensure correctness.
## Limitations

- **JSON Path**: The current implementation only supports simple dot notation paths (e.g., `$.field.subfield`). Array indexing is not supported.
- **Vector Bytes**: When vectors are provided as bytes, the dimensions cannot be validated.
- **Custom Validators**: The current implementation does not support custom user-defined validators.
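The dot-notation restriction can be illustrated with a small stdlib sketch; `extract_path` is a hypothetical helper, not RedisVL's extractor, but it shows why a path through a list (array indexing) fails:

```python
def extract_path(obj, path):
    """Follow a simple '$.a.b' dot path through nested dicts; None if missing."""
    parts = path.lstrip("$").strip(".").split(".")
    for part in parts:
        if not isinstance(obj, dict) or part not in obj:
            return None  # missing key, or a non-dict such as a list
        obj = obj[part]
    return obj

doc = {"user": {"profile": {"bio": "hello"}}, "tags": ["a", "b"]}
extract_path(doc, "$.user.profile.bio")  # simple dot path works
extract_path(doc, "$.tags.0")            # array indexing is not supported
```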
## Best Practices

1. **Define Clear Schemas**: Be explicit about field types and constraints
2. **Pre-validate Critical Data**: For large datasets, validate a sample before processing everything
3. **Handle Validation Errors**: Implement proper error handling for validation failures
4. **Use JSON Paths Carefully**: Test nested JSON extraction to ensure paths are correctly defined
5. **Consider Optional Fields**: Decide which fields are truly required for your application
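Practices 2 and 3 can be combined into a small pre-flight check. This is a hedged sketch: `validate` is a hypothetical stand-in for `validate_object`, and the error-reporting shape is an assumption, not RedisVL behavior:

```python
def validate(obj):
    # Stand-in for redisvl.schema.validation.validate_object
    if not isinstance(obj.get("title"), str):
        raise ValueError("title must be a string")
    return obj

def prevalidate_sample(data, sample_size=10):
    """Check only a leading sample; return (ok, first_error_message)."""
    for i, obj in enumerate(data[:sample_size]):
        try:
            validate(obj)
        except ValueError as e:
            return False, f"record {i}: {e}"
    return True, None

records = [{"title": "a"}, {"title": 123}, {"title": "c"}]
ok, err = prevalidate_sample(records)
# ok is False; err names the first failing record, so you can fix the
# data shape before attempting a large bulk write
```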
## Integration with Storage Classes
The validation system is fully integrated with the storage classes:

- **BaseStorage**: For hash-based storage, validates each field individually
- **JsonStorage**: For JSON storage, extracts and validates fields from nested structures
Each storage class automatically validates data before writing to Redis, ensuring data integrity.