diff --git a/content/integrate/redisvl/api/_index.md b/content/integrate/redisvl/api/_index.md
index 973df98c26..c52795d710 100644
--- a/content/integrate/redisvl/api/_index.md
+++ b/content/integrate/redisvl/api/_index.md
@@ -47,9 +47,11 @@ Reference documentation for the RedisVL API.
* [VoyageAIReranker](reranker/#voyageaireranker)
* [LLM Cache](cache/)
* [SemanticCache](cache/#semanticcache)
-* [LLM Session Manager](session_manager/)
- * [SemanticSessionManager](session_manager/#semanticsessionmanager)
- * [StandardSessionManager](session_manager/#standardsessionmanager)
+* [Embeddings Cache](cache/#embeddings-cache)
+ * [EmbeddingsCache](cache/#embeddingscache)
+* [LLM Message History](message_history/)
+ * [SemanticMessageHistory](message_history/#semanticmessagehistory)
+ * [MessageHistory](message_history/#messagehistory)
* [Semantic Router](router/)
* [Semantic Router](router/#semantic-router-api)
* [Routing Config](router/#routing-config)
diff --git a/content/integrate/redisvl/api/cache.md b/content/integrate/redisvl/api/cache.md
index b15bc6c3f1..f9cecf6dc2 100644
--- a/content/integrate/redisvl/api/cache.md
+++ b/content/integrate/redisvl/api/cache.md
@@ -82,17 +82,65 @@ response = await cache.acheck(
)
```
+#### `async aclear()`
+
+Async clear the cache of all keys.
+
+* **Return type:**
+ None
+
+#### `async adelete()`
+
+Async delete the cache and its index entirely.
+
+* **Return type:**
+ None
+
+#### `async adisconnect()`
+
+Asynchronously disconnect from Redis and search index.
+
+Closes all Redis connections and index connections.
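+
+A minimal cleanup sketch, assuming an existing `cache` SemanticCache instance:
+
+```python
+await cache.aclear()       # remove all cached entries, keep the index
+await cache.adelete()      # remove all entries and drop the index itself
+await cache.adisconnect()  # close the underlying Redis connections
+```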
+
#### `async adrop(ids=None, keys=None)`
-Async expire specific entries from the cache by id or specific
-Redis key.
+Async drop specific entries from the cache by ID or Redis key.
* **Parameters:**
- * **ids** (*Optional* *[* *str* *]*) – The document ID or IDs to remove from the cache.
- * **keys** (*Optional* *[* *str* *]*) – The Redis keys to remove from the cache.
+ * **ids** (*Optional* *[* *List* *[* *str* *]* *]*) – List of entry IDs to remove from the cache.
+ Entry IDs are the unique identifiers without the cache prefix.
+ * **keys** (*Optional* *[* *List* *[* *str* *]* *]*) – List of full Redis keys to remove from the cache.
+ Keys are the complete Redis keys including the cache prefix.
+* **Return type:**
+ None
+
+#### `NOTE`
+At least one of ids or keys must be provided.
+
+* **Raises:**
+ **ValueError** – If neither ids nor keys is provided.
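+
+A brief usage sketch; the entry ID and key below are illustrative:
+
+```python
+await cache.adrop(ids=["entry_123"])
+await cache.adrop(keys=["llmcache:entry_456"])
+```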
+
+#### `async aexpire(key, ttl=None)`
+
+Asynchronously set or refresh the expiration time for a key in the cache.
+
+* **Parameters:**
+ * **key** (*str*) – The Redis key to set the expiration on.
+ * **ttl** (*Optional* *[* *int* *]* *,* *optional*) – The time-to-live in seconds. If None,
+ uses the default TTL configured for this cache instance.
+ Defaults to None.
* **Return type:**
None
+#### `NOTE`
+If neither the provided TTL nor the default TTL is set (both are None),
+this method will have no effect.
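+
+For example, to give a stored entry a 60-second TTL (the key is illustrative):
+
+```python
+await cache.aexpire("llmcache:entry_123", ttl=60)
+```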
+
#### `async astore(prompt, response, vector=None, metadata=None, filters=None, ttl=None)`
Async stores the specified key-value pair in the cache along with metadata.
@@ -190,30 +238,63 @@ response = cache.check(
#### `clear()`
-Clear the cache of all keys while preserving the index.
+Clear the cache of all keys.
* **Return type:**
None
#### `delete()`
-Clear the semantic cache of all keys and remove the underlying search
-index.
+Delete the cache and its index entirely.
* **Return type:**
None
+#### `disconnect()`
+
+Disconnect from Redis and search index.
+
+Closes all Redis connections and index connections.
+
#### `drop(ids=None, keys=None)`
-Manually expire specific entries from the cache by id or specific
-Redis key.
+Drop specific entries from the cache by ID or Redis key.
* **Parameters:**
- * **ids** (*Optional* *[* *str* *]*) – The document ID or IDs to remove from the cache.
- * **keys** (*Optional* *[* *str* *]*) – The Redis keys to remove from the cache.
+ * **ids** (*Optional* *[* *List* *[* *str* *]* *]*) – List of entry IDs to remove from the cache.
+ Entry IDs are the unique identifiers without the cache prefix.
+ * **keys** (*Optional* *[* *List* *[* *str* *]* *]*) – List of full Redis keys to remove from the cache.
+ Keys are the complete Redis keys including the cache prefix.
* **Return type:**
None
+#### `NOTE`
+At least one of ids or keys must be provided.
+
+* **Raises:**
+ **ValueError** – If neither ids nor keys is provided.
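+
+A brief usage sketch; the entry ID is illustrative:
+
+```python
+cache.drop(ids=["entry_123"])
+```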
+
+#### `expire(key, ttl=None)`
+
+Set or refresh the expiration time for a key in the cache.
+
+* **Parameters:**
+ * **key** (*str*) – The Redis key to set the expiration on.
+ * **ttl** (*Optional* *[* *int* *]* *,* *optional*) – The time-to-live in seconds. If None,
+ uses the default TTL configured for this cache instance.
+ Defaults to None.
+* **Return type:**
+ None
+
+#### `NOTE`
+If neither the provided TTL nor the default TTL is set (both are None),
+this method will have no effect.
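+
+For example, to refresh a stored entry with a 60-second TTL (the key is illustrative):
+
+```python
+cache.expire("llmcache:entry_123", ttl=60)
+```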
+
#### `set_threshold(distance_threshold)`
Sets the semantic distance threshold for the cache.
@@ -235,6 +316,8 @@ Set the default TTL, in seconds, for entries in the cache.
for the cache, in seconds.
* **Raises:**
**ValueError** – If the time-to-live value is not an integer.
+* **Return type:**
+ None
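+
+For example, to default new cache entries to a five-minute TTL:
+
+```python
+cache.set_ttl(300)
+```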
#### `store(prompt, response, vector=None, metadata=None, filters=None, ttl=None)`
@@ -285,7 +368,6 @@ are passed, then only the document TTL is refreshed.
```python
key = cache.store('this is a prompt', 'this is a response')
cache.update(key, metadata={"hit_count": 1, "model_name": "Llama-2-7b"})
-)
```
#### `property aindex: `[`AsyncSearchIndex`]({{< relref "searchindex/#asyncsearchindex" >}})` | None`
@@ -318,3 +400,692 @@ The underlying SearchIndex for the cache.
#### `property ttl: int | None`
The default TTL, in seconds, for entries in the cache.
+
+# Embeddings Cache
+
+## EmbeddingsCache
+
+
+
+### `class EmbeddingsCache(name='embedcache', ttl=None, redis_client=None, redis_url='redis://localhost:6379', connection_kwargs={})`
+
+Bases: `BaseCache`
+
+Embeddings Cache for storing embedding vectors with exact key matching.
+
+Initialize an embeddings cache.
+
+* **Parameters:**
+ * **name** (*str*) – The name of the cache. Defaults to “embedcache”.
+ * **ttl** (*Optional* *[* *int* *]*) – The time-to-live for cached embeddings. Defaults to None.
+ * **redis_client** (*Optional* *[* *Redis* *]*) – Redis client instance. Defaults to None.
+ * **redis_url** (*str*) – Redis URL for connection. Defaults to “redis://localhost:6379”.
+ * **connection_kwargs** (*Dict* *[* *str* *,* *Any* *]*) – Redis connection arguments. Defaults to {}.
+* **Raises:**
+  **ValueError** – If vector dimensions are invalid.
+
+```python
+cache = EmbeddingsCache(
+ name="my_embeddings_cache",
+ ttl=3600, # 1 hour
+ redis_url="redis://localhost:6379"
+)
+```
+
+#### `async aclear()`
+
+Async clear the cache of all keys.
+
+* **Return type:**
+ None
+
+#### `async adisconnect()`
+
+Async disconnect from Redis.
+
+* **Return type:**
+ None
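+
+A minimal cleanup sketch, assuming an existing `cache` EmbeddingsCache instance:
+
+```python
+await cache.aclear()       # remove all cached embeddings
+await cache.adisconnect()  # close the underlying Redis connection
+```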
+
+#### `async adrop(text, model_name)`
+
+Async remove an embedding from the cache.
+
+Asynchronously removes an embedding from the cache.
+
+* **Parameters:**
+ * **text** (*str*) – The text input that was embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Return type:**
+ None
+
+```python
+await cache.adrop(
+ text="What is machine learning?",
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `async adrop_by_key(key)`
+
+Async remove an embedding from the cache by its Redis key.
+
+Asynchronously removes an embedding from the cache by its Redis key.
+
+* **Parameters:**
+ **key** (*str*) – The full Redis key for the embedding.
+* **Return type:**
+ None
+
+```python
+await cache.adrop_by_key("embedcache:1234567890abcdef")
+```
+
+#### `async aexists(text, model_name)`
+
+Async check if an embedding exists.
+
+Asynchronously checks if an embedding exists for the given text and model.
+
+* **Parameters:**
+ * **text** (*str*) – The text input that was embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Returns:**
+ True if the embedding exists in the cache, False otherwise.
+* **Return type:**
+ bool
+
+```python
+if await cache.aexists("What is machine learning?", "text-embedding-ada-002"):
+ print("Embedding is in cache")
+```
+
+#### `async aexists_by_key(key)`
+
+Async check if an embedding exists for the given Redis key.
+
+Asynchronously checks if an embedding exists for the given Redis key.
+
+* **Parameters:**
+ **key** (*str*) – The full Redis key for the embedding.
+* **Returns:**
+ True if the embedding exists in the cache, False otherwise.
+* **Return type:**
+ bool
+
+```python
+if await cache.aexists_by_key("embedcache:1234567890abcdef"):
+ print("Embedding is in cache")
+```
+
+#### `async aexpire(key, ttl=None)`
+
+Asynchronously set or refresh the expiration time for a key in the cache.
+
+* **Parameters:**
+ * **key** (*str*) – The Redis key to set the expiration on.
+ * **ttl** (*Optional* *[* *int* *]* *,* *optional*) – The time-to-live in seconds. If None,
+ uses the default TTL configured for this cache instance.
+ Defaults to None.
+* **Return type:**
+ None
+
+#### `NOTE`
+If neither the provided TTL nor the default TTL is set (both are None),
+this method will have no effect.
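+
+For example, to give a cached embedding a 60-second TTL:
+
+```python
+await cache.aexpire("embedcache:1234567890abcdef", ttl=60)
+```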
+
+#### `async aget(text, model_name)`
+
+Async get embedding by text and model name.
+
+Asynchronously retrieves a cached embedding for the given text and model name.
+If found, refreshes the TTL of the entry.
+
+* **Parameters:**
+ * **text** (*str*) – The text input that was embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Returns:**
+ Embedding cache entry or None if not found.
+* **Return type:**
+ Optional[Dict[str, Any]]
+
+```python
+embedding_data = await cache.aget(
+ text="What is machine learning?",
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `async aget_by_key(key)`
+
+Async get embedding by its full Redis key.
+
+Asynchronously retrieves a cached embedding for the given Redis key.
+If found, refreshes the TTL of the entry.
+
+* **Parameters:**
+ **key** (*str*) – The full Redis key for the embedding.
+* **Returns:**
+ Embedding cache entry or None if not found.
+* **Return type:**
+ Optional[Dict[str, Any]]
+
+```python
+embedding_data = await cache.aget_by_key("embedcache:1234567890abcdef")
+```
+
+#### `async amdrop(texts, model_name)`
+
+Async remove multiple embeddings from the cache by their texts and model name.
+
+Asynchronously removes multiple embeddings in a single operation.
+
+* **Parameters:**
+ * **texts** (*List* *[* *str* *]*) – List of text inputs that were embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Return type:**
+ None
+
+```python
+# Remove multiple embeddings asynchronously
+await cache.amdrop(
+ texts=["What is machine learning?", "What is deep learning?"],
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `async amdrop_by_keys(keys)`
+
+Async remove multiple embeddings from the cache by their Redis keys.
+
+Asynchronously removes multiple embeddings in a single operation.
+
+* **Parameters:**
+ **keys** (*List* *[* *str* *]*) – List of Redis keys to remove.
+* **Return type:**
+ None
+
+```python
+# Remove multiple embeddings asynchronously
+await cache.amdrop_by_keys(["embedcache:key1", "embedcache:key2"])
+```
+
+#### `async amexists(texts, model_name)`
+
+Async check if multiple embeddings exist by their texts and model name.
+
+Asynchronously checks existence of multiple embeddings in a single operation.
+
+* **Parameters:**
+ * **texts** (*List* *[* *str* *]*) – List of text inputs that were embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Returns:**
+ List of boolean values indicating whether each embedding exists.
+* **Return type:**
+ List[bool]
+
+```python
+# Check if multiple embeddings exist asynchronously
+exists_results = await cache.amexists(
+ texts=["What is machine learning?", "What is deep learning?"],
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `async amexists_by_keys(keys)`
+
+Async check if multiple embeddings exist by their Redis keys.
+
+Asynchronously checks existence of multiple keys in a single operation.
+
+* **Parameters:**
+ **keys** (*List* *[* *str* *]*) – List of Redis keys to check.
+* **Returns:**
+ List of boolean values indicating whether each key exists.
+ The order matches the input keys order.
+* **Return type:**
+ List[bool]
+
+```python
+# Check if multiple keys exist asynchronously
+exists_results = await cache.amexists_by_keys(["embedcache:key1", "embedcache:key2"])
+```
+
+#### `async amget(texts, model_name)`
+
+Async get multiple embeddings by their texts and model name.
+
+Asynchronously retrieves multiple cached embeddings in a single operation.
+If found, refreshes the TTL of each entry.
+
+* **Parameters:**
+ * **texts** (*List* *[* *str* *]*) – List of text inputs that were embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Returns:**
+ List of embedding cache entries or None for texts not found.
+* **Return type:**
+ List[Optional[Dict[str, Any]]]
+
+```python
+# Get multiple embeddings asynchronously
+embedding_data = await cache.amget(
+ texts=["What is machine learning?", "What is deep learning?"],
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `async amget_by_keys(keys)`
+
+Async get multiple embeddings by their Redis keys.
+
+Asynchronously retrieves multiple cached embeddings in a single network roundtrip.
+If found, refreshes the TTL of each entry.
+
+* **Parameters:**
+ **keys** (*List* *[* *str* *]*) – List of Redis keys to retrieve.
+* **Returns:**
+ List of embedding cache entries or None for keys not found.
+ The order matches the input keys order.
+* **Return type:**
+ List[Optional[Dict[str, Any]]]
+
+```python
+# Get multiple embeddings asynchronously
+embedding_data = await cache.amget_by_keys([
+ "embedcache:key1",
+ "embedcache:key2"
+])
+```
+
+#### `async amset(items, ttl=None)`
+
+Async store multiple embeddings in a batch operation.
+
+Each item in the input list should be a dictionary with the following fields:
+- ‘text’: The text input that was embedded
+- ‘model_name’: The name of the embedding model
+- ‘embedding’: The embedding vector
+- ‘metadata’: Optional metadata to store with the embedding
+
+* **Parameters:**
+ * **items** (*List* *[* *Dict* *[* *str* *,* *Any* *]* *]*) – List of dictionaries, each containing text, model_name, embedding, and optional metadata.
+ * **ttl** (*int* *|* *None*) – Optional TTL override for these entries.
+* **Returns:**
+ List of Redis keys where the embeddings were stored.
+* **Return type:**
+ List[str]
+
+```python
+# Store multiple embeddings asynchronously
+keys = await cache.amset([
+ {
+ "text": "What is ML?",
+ "model_name": "text-embedding-ada-002",
+ "embedding": [0.1, 0.2, 0.3],
+ "metadata": {"source": "user"}
+ },
+ {
+ "text": "What is AI?",
+ "model_name": "text-embedding-ada-002",
+ "embedding": [0.4, 0.5, 0.6],
+ "metadata": {"source": "docs"}
+ }
+])
+```
+
+#### `async aset(text, model_name, embedding, metadata=None, ttl=None)`
+
+Async store an embedding with its text and model name.
+
+Asynchronously stores an embedding with its text and model name.
+
+* **Parameters:**
+ * **text** (*str*) – The text input that was embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+ * **embedding** (*List* *[* *float* *]*) – The embedding vector to store.
+ * **metadata** (*Optional* *[* *Dict* *[* *str* *,* *Any* *]* *]*) – Optional metadata to store with the embedding.
+ * **ttl** (*Optional* *[* *int* *]*) – Optional TTL override for this specific entry.
+* **Returns:**
+ The Redis key where the embedding was stored.
+* **Return type:**
+ str
+
+```python
+key = await cache.aset(
+ text="What is machine learning?",
+ model_name="text-embedding-ada-002",
+ embedding=[0.1, 0.2, 0.3, ...],
+ metadata={"source": "user_query"}
+)
+```
+
+#### `clear()`
+
+Clear the cache of all keys.
+
+* **Return type:**
+ None
+
+#### `disconnect()`
+
+Disconnect from Redis.
+
+* **Return type:**
+ None
+
+#### `drop(text, model_name)`
+
+Remove an embedding from the cache.
+
+* **Parameters:**
+ * **text** (*str*) – The text input that was embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Return type:**
+ None
+
+```python
+cache.drop(
+ text="What is machine learning?",
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `drop_by_key(key)`
+
+Remove an embedding from the cache by its Redis key.
+
+* **Parameters:**
+ **key** (*str*) – The full Redis key for the embedding.
+* **Return type:**
+ None
+
+```python
+cache.drop_by_key("embedcache:1234567890abcdef")
+```
+
+#### `exists(text, model_name)`
+
+Check if an embedding exists for the given text and model.
+
+* **Parameters:**
+ * **text** (*str*) – The text input that was embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Returns:**
+ True if the embedding exists in the cache, False otherwise.
+* **Return type:**
+ bool
+
+```python
+if cache.exists("What is machine learning?", "text-embedding-ada-002"):
+ print("Embedding is in cache")
+```
+
+#### `exists_by_key(key)`
+
+Check if an embedding exists for the given Redis key.
+
+* **Parameters:**
+ **key** (*str*) – The full Redis key for the embedding.
+* **Returns:**
+ True if the embedding exists in the cache, False otherwise.
+* **Return type:**
+ bool
+
+```python
+if cache.exists_by_key("embedcache:1234567890abcdef"):
+ print("Embedding is in cache")
+```
+
+#### `expire(key, ttl=None)`
+
+Set or refresh the expiration time for a key in the cache.
+
+* **Parameters:**
+ * **key** (*str*) – The Redis key to set the expiration on.
+ * **ttl** (*Optional* *[* *int* *]* *,* *optional*) – The time-to-live in seconds. If None,
+ uses the default TTL configured for this cache instance.
+ Defaults to None.
+* **Return type:**
+ None
+
+#### `NOTE`
+If neither the provided TTL nor the default TTL is set (both are None),
+this method will have no effect.
+
+#### `get(text, model_name)`
+
+Get embedding by text and model name.
+
+Retrieves a cached embedding for the given text and model name.
+If found, refreshes the TTL of the entry.
+
+* **Parameters:**
+ * **text** (*str*) – The text input that was embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Returns:**
+ Embedding cache entry or None if not found.
+* **Return type:**
+ Optional[Dict[str, Any]]
+
+```python
+embedding_data = cache.get(
+ text="What is machine learning?",
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `get_by_key(key)`
+
+Get embedding by its full Redis key.
+
+Retrieves a cached embedding for the given Redis key.
+If found, refreshes the TTL of the entry.
+
+* **Parameters:**
+ **key** (*str*) – The full Redis key for the embedding.
+* **Returns:**
+ Embedding cache entry or None if not found.
+* **Return type:**
+ Optional[Dict[str, Any]]
+
+```python
+embedding_data = cache.get_by_key("embedcache:1234567890abcdef")
+```
+
+#### `mdrop(texts, model_name)`
+
+Remove multiple embeddings from the cache by their texts and model name.
+
+Efficiently removes multiple embeddings in a single operation.
+
+* **Parameters:**
+ * **texts** (*List* *[* *str* *]*) – List of text inputs that were embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Return type:**
+ None
+
+```python
+# Remove multiple embeddings
+cache.mdrop(
+ texts=["What is machine learning?", "What is deep learning?"],
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `mdrop_by_keys(keys)`
+
+Remove multiple embeddings from the cache by their Redis keys.
+
+Efficiently removes multiple embeddings in a single operation.
+
+* **Parameters:**
+ **keys** (*List* *[* *str* *]*) – List of Redis keys to remove.
+* **Return type:**
+ None
+
+```python
+# Remove multiple embeddings
+cache.mdrop_by_keys(["embedcache:key1", "embedcache:key2"])
+```
+
+#### `mexists(texts, model_name)`
+
+Check if multiple embeddings exist by their texts and model name.
+
+Efficiently checks existence of multiple embeddings in a single operation.
+
+* **Parameters:**
+ * **texts** (*List* *[* *str* *]*) – List of text inputs that were embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Returns:**
+ List of boolean values indicating whether each embedding exists.
+* **Return type:**
+ List[bool]
+
+```python
+# Check if multiple embeddings exist
+exists_results = cache.mexists(
+ texts=["What is machine learning?", "What is deep learning?"],
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `mexists_by_keys(keys)`
+
+Check if multiple embeddings exist by their Redis keys.
+
+Efficiently checks existence of multiple keys in a single operation.
+
+* **Parameters:**
+ **keys** (*List* *[* *str* *]*) – List of Redis keys to check.
+* **Returns:**
+ List of boolean values indicating whether each key exists.
+ The order matches the input keys order.
+* **Return type:**
+ List[bool]
+
+```python
+# Check if multiple keys exist
+exists_results = cache.mexists_by_keys(["embedcache:key1", "embedcache:key2"])
+```
+
+#### `mget(texts, model_name)`
+
+Get multiple embeddings by their texts and model name.
+
+Efficiently retrieves multiple cached embeddings in a single operation.
+If found, refreshes the TTL of each entry.
+
+* **Parameters:**
+ * **texts** (*List* *[* *str* *]*) – List of text inputs that were embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+* **Returns:**
+ List of embedding cache entries or None for texts not found.
+* **Return type:**
+ List[Optional[Dict[str, Any]]]
+
+```python
+# Get multiple embeddings
+embedding_data = cache.mget(
+ texts=["What is machine learning?", "What is deep learning?"],
+ model_name="text-embedding-ada-002"
+)
+```
+
+#### `mget_by_keys(keys)`
+
+Get multiple embeddings by their Redis keys.
+
+Efficiently retrieves multiple cached embeddings in a single network roundtrip.
+If found, refreshes the TTL of each entry.
+
+* **Parameters:**
+ **keys** (*List* *[* *str* *]*) – List of Redis keys to retrieve.
+* **Returns:**
+ List of embedding cache entries or None for keys not found.
+ The order matches the input keys order.
+* **Return type:**
+ List[Optional[Dict[str, Any]]]
+
+```python
+# Get multiple embeddings
+embedding_data = cache.mget_by_keys([
+ "embedcache:key1",
+ "embedcache:key2"
+])
+```
+
+#### `mset(items, ttl=None)`
+
+Store multiple embeddings in a batch operation.
+
+Each item in the input list should be a dictionary with the following fields:
+- ‘text’: The text input that was embedded
+- ‘model_name’: The name of the embedding model
+- ‘embedding’: The embedding vector
+- ‘metadata’: Optional metadata to store with the embedding
+
+* **Parameters:**
+ * **items** (*List* *[* *Dict* *[* *str* *,* *Any* *]* *]*) – List of dictionaries, each containing text, model_name, embedding, and optional metadata.
+ * **ttl** (*int* *|* *None*) – Optional TTL override for these entries.
+* **Returns:**
+ List of Redis keys where the embeddings were stored.
+* **Return type:**
+ List[str]
+
+```python
+# Store multiple embeddings
+keys = cache.mset([
+ {
+ "text": "What is ML?",
+ "model_name": "text-embedding-ada-002",
+ "embedding": [0.1, 0.2, 0.3],
+ "metadata": {"source": "user"}
+ },
+ {
+ "text": "What is AI?",
+ "model_name": "text-embedding-ada-002",
+ "embedding": [0.4, 0.5, 0.6],
+ "metadata": {"source": "docs"}
+ }
+])
+```
+
+#### `set(text, model_name, embedding, metadata=None, ttl=None)`
+
+Store an embedding with its text and model name.
+
+* **Parameters:**
+ * **text** (*str*) – The text input that was embedded.
+ * **model_name** (*str*) – The name of the embedding model.
+ * **embedding** (*List* *[* *float* *]*) – The embedding vector to store.
+ * **metadata** (*Optional* *[* *Dict* *[* *str* *,* *Any* *]* *]*) – Optional metadata to store with the embedding.
+ * **ttl** (*Optional* *[* *int* *]*) – Optional TTL override for this specific entry.
+* **Returns:**
+ The Redis key where the embedding was stored.
+* **Return type:**
+ str
+
+```python
+key = cache.set(
+ text="What is machine learning?",
+ model_name="text-embedding-ada-002",
+ embedding=[0.1, 0.2, 0.3, ...],
+ metadata={"source": "user_query"}
+)
+```
+
+#### `set_ttl(ttl=None)`
+
+Set the default TTL, in seconds, for entries in the cache.
+
+* **Parameters:**
+ **ttl** (*Optional* *[* *int* *]* *,* *optional*) – The optional time-to-live expiration
+ for the cache, in seconds.
+* **Raises:**
+ **ValueError** – If the time-to-live value is not an integer.
+* **Return type:**
+ None
+
+#### `property ttl: int | None`
+
+The default TTL, in seconds, for entries in the cache.
diff --git a/content/integrate/redisvl/api/message_history.md b/content/integrate/redisvl/api/message_history.md
new file mode 100644
index 0000000000..b07408a7e3
--- /dev/null
+++ b/content/integrate/redisvl/api/message_history.md
@@ -0,0 +1,275 @@
+---
+linkTitle: LLM message history
+title: LLM Message History
+type: integration
+---
+
+
+## SemanticMessageHistory
+
+
+
+### `class SemanticMessageHistory(name, session_tag=None, prefix=None, vectorizer=None, distance_threshold=0.3, redis_client=None, redis_url='redis://localhost:6379', connection_kwargs={}, overwrite=False, **kwargs)`
+
+Bases: `BaseMessageHistory`
+
+Initialize message history with an index.
+
+Semantic Message History stores the current and previous user text prompts
+and LLM responses to allow for enriching future prompts with session
+context. Message history is stored in individual user or LLM prompts and
+responses.
+
+* **Parameters:**
+ * **name** (*str*) – The name of the message history index.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag to be added to entries to link to a specific
+ conversation session. Defaults to instance ULID.
+ * **prefix** (*Optional* *[* *str* *]*) – Prefix for the keys for this message data.
+ Defaults to None and will be replaced with the index name.
+ * **vectorizer** (*Optional* *[* *BaseVectorizer* *]*) – The vectorizer used to create embeddings.
+ * **distance_threshold** (*float*) – The maximum semantic distance to be
+ included in the context. Defaults to 0.3.
+ * **redis_client** (*Optional* *[* *Redis* *]*) – A Redis client instance. Defaults to
+ None.
+  * **redis_url** (*str* *,* *optional*) – The Redis URL. Defaults to redis://localhost:6379.
+  * **connection_kwargs** (*Dict* *[* *str* *,* *Any* *]*) – The connection arguments
+    for the redis client. Defaults to empty {}.
+  * **overwrite** (*bool*) – Whether or not to force overwrite the schema for
+    the semantic message index. Defaults to False.
+
+The underlying schema supports a single vector embedding constructed
+from either the prompt or the response as a single string.
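+
+A minimal construction sketch; the import path and index name here are assumptions:
+
+```python
+from redisvl.extensions.message_history import SemanticMessageHistory
+
+# Uses the default vectorizer and a local Redis instance
+history = SemanticMessageHistory(name="chat_history")
+```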
+
+#### `add_message(message, session_tag=None)`
+
+Insert a single prompt or response into the message history.
+A timestamp is associated with it so that it can be later sorted
+in sequential ordering after retrieval.
+
+* **Parameters:**
+ * **message** (*Dict* *[* *str* *,**str* *]*) – The user prompt or LLM response.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag to be added to entry to link to a specific
+ conversation session. Defaults to instance ULID.
+* **Return type:**
+ None
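+
+For example, assuming a role/content message dict:
+
+```python
+history.add_message({"role": "user", "content": "What is RedisVL?"})
+```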
+
+#### `add_messages(messages, session_tag=None)`
+
+Insert a list of prompts and responses into the message history.
+A timestamp is associated with each so that they can be later sorted
+in sequential ordering after retrieval.
+
+* **Parameters:**
+ * **messages** (*List* *[* *Dict* *[* *str* *,* *str* *]* *]*) – The list of user prompts and LLM responses.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag to be added to entries to link to a specific
+ conversation session. Defaults to instance ULID.
+* **Return type:**
+ None
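+
+For example, assuming role/content message dicts:
+
+```python
+history.add_messages([
+    {"role": "user", "content": "What is vector search?"},
+    {"role": "llm", "content": "It retrieves items by embedding similarity."},
+])
+```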
+
+#### `clear()`
+
+Clears the message history.
+
+* **Return type:**
+ None
+
+#### `delete()`
+
+Clear all message keys and remove the search index.
+
+* **Return type:**
+ None
+
+#### `drop(id=None)`
+
+Remove a specific exchange from the message history.
+
+* **Parameters:**
+ **id** (*Optional* *[* *str* *]*) – The id of the message entry to delete.
+ If None then the last entry is deleted.
+* **Return type:**
+ None
+
+#### `get_recent(top_k=5, as_text=False, raw=False, session_tag=None)`
+
+Retrieve the recent message history in sequential order.
+
+* **Parameters:**
+ * **top_k** (*int*) – The number of previous exchanges to return. Default is 5.
+ * **as_text** (*bool*) – Whether to return the conversation as a single string,
+ or list of alternating prompts and responses.
+ * **raw** (*bool*) – Whether to return the full Redis hash entry or just the
+    prompt and response.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag of the entries linked to a specific
+ conversation session. Defaults to instance ULID.
+* **Returns:**
+  A single string transcription of the session,
+  or a list of strings if as_text is False.
+* **Return type:**
+ Union[str, List[str]]
+* **Raises:**
+ **ValueError** – if top_k is not an integer greater than or equal to 0.
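+
+For example, to fetch the three most recent messages as text:
+
+```python
+recent = history.get_recent(top_k=3, as_text=True)
+```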
+
+#### `get_relevant(prompt, as_text=False, top_k=5, fall_back=False, session_tag=None, raw=False, distance_threshold=None)`
+
+Searches the message history for information semantically related to
+the specified prompt.
+
+This method uses vector similarity search with a text prompt as input.
+It checks for semantically similar prompts and responses and gets
+the top k most relevant previous prompts or responses to include as
+context to the next LLM call.
+
+* **Parameters:**
+ * **prompt** (*str*) – The message text to search for in message history
+  * **as_text** (*bool*) – Whether to return the prompts and responses as text
+    or as JSON.
+ * **top_k** (*int*) – The number of previous messages to return. Default is 5.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag of the entries linked to a specific
+ conversation session. Defaults to instance ULID.
+ * **distance_threshold** (*Optional* *[* *float* *]*) – The threshold for semantic
+ vector distance.
+ * **fall_back** (*bool*) – Whether to drop back to recent conversation history
+ if no relevant context is found.
+ * **raw** (*bool*) – Whether to return the full Redis hash entry or just the
+ message.
+* **Returns:**
+  Either a list of strings, or a list of prompts and responses
+  in JSON, containing the most relevant matches.
+* **Return type:**
+  Union[List[str], List[Dict[str, str]]]
+* **Raises:**
+  **ValueError** – If top_k is not an integer greater than or equal to 0.
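+
+For example, to pull up to three semantically related prior messages:
+
+```python
+context = history.get_relevant("vector search", top_k=3)
+```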
+
+#### `store(prompt, response, session_tag=None)`
+
+Insert a prompt:response pair into the message history. A timestamp
+is associated with each message so that they can be later sorted
+in sequential ordering after retrieval.
+
+* **Parameters:**
+ * **prompt** (*str*) – The user prompt to the LLM.
+ * **response** (*str*) – The corresponding LLM response.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag to be added to entries to link to a specific
+ conversation session. Defaults to instance ULID.
+* **Return type:**
+ None
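+
+A brief usage sketch; the prompt and response are illustrative:
+
+```python
+history.store(
+    prompt="How do I create an index?",
+    response="Define a schema and call SearchIndex.create()."
+)
+```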
+
+#### `property messages: List[str] | List[Dict[str, str]]`
+
+Returns the full message history.
+
+## MessageHistory
+
+
+
+### `class MessageHistory(name, session_tag=None, prefix=None, redis_client=None, redis_url='redis://localhost:6379', connection_kwargs={}, **kwargs)`
+
+Bases: `BaseMessageHistory`
+
+Initialize message history
+
+Message History stores the current and previous user text prompts and
+LLM responses to allow for enriching future prompts with session
+context. Message history is stored in individual user or LLM prompts and
+responses.
+
+* **Parameters:**
+ * **name** (*str*) – The name of the message history index.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag to be added to entries to link to a specific
+ conversation session. Defaults to instance ULID.
+ * **prefix** (*Optional* *[* *str* *]*) – Prefix for the keys for this conversation data.
+ Defaults to None and will be replaced with the index name.
+ * **redis_client** (*Optional* *[* *Redis* *]*) – A Redis client instance. Defaults to
+ None.
+  * **redis_url** (*str* *,* *optional*) – The Redis URL. Defaults to redis://localhost:6379.
+ * **connection_kwargs** (*Dict* *[* *str* *,* *Any* *]*) – The connection arguments
+ for the redis client. Defaults to empty {}.
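+
+A minimal construction sketch; the import path and index name here are assumptions:
+
+```python
+from redisvl.extensions.message_history import MessageHistory
+
+history = MessageHistory(name="chat_history")
+```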
+
+#### `add_message(message, session_tag=None)`
+
+Insert a single prompt or response into the message history.
+A timestamp is associated with it so that it can be later sorted
+in sequential ordering after retrieval.
+
+* **Parameters:**
+ * **message** (*Dict* *[* *str* *,**str* *]*) – The user prompt or LLM response.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag to be added to entries to link to a specific
+ conversation session. Defaults to instance ULID.
+* **Return type:**
+ None
+
+#### `add_messages(messages, session_tag=None)`
+
+Insert a list of prompts and responses into the message history.
+A timestamp is associated with each so that they can be later sorted
+in sequential ordering after retrieval.
+
+* **Parameters:**
+ * **messages** (*List* *[* *Dict* *[* *str* *,* *str* *]* *]*) – The list of user prompts and LLM responses.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag to be added to entries to link to a specific
+ conversation session. Defaults to instance ULID.
+* **Return type:**
+ None
+
+#### `clear()`
+
+Clears the conversation message history.
+
+* **Return type:**
+ None
+
+#### `delete()`
+
+Clear all conversation keys and remove the search index.
+
+* **Return type:**
+ None
+
+#### `drop(id=None)`
+
+Remove a specific exchange from the conversation history.
+
+* **Parameters:**
+ **id** (*Optional* *[* *str* *]*) – The id of the message entry to delete.
+ If None then the last entry is deleted.
+* **Return type:**
+ None
+
+#### `get_recent(top_k=5, as_text=False, raw=False, session_tag=None)`
+
+Retrieve the recent message history in sequential order.
+
+* **Parameters:**
+ * **top_k** (*int*) – The number of previous messages to return. Default is 5.
+ * **as_text** (*bool*) – Whether to return the conversation as a single string,
+ or list of alternating prompts and responses.
+ * **raw** (*bool*) – Whether to return the full Redis hash entry or just the
+ prompt and response.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag of the entries linked to a specific
+ conversation session. Defaults to instance ULID.
+* **Returns:**
+  A single string transcription of the messages,
+  or a list of strings if as_text is False.
+* **Return type:**
+ Union[str, List[str]]
+* **Raises:**
+ **ValueError** – if top_k is not an integer greater than or equal to 0.
+
+#### `store(prompt, response, session_tag=None)`
+
+Insert a prompt:response pair into the message history. A timestamp
+is associated with each exchange so that they can be later sorted
+in sequential ordering after retrieval.
+
+* **Parameters:**
+ * **prompt** (*str*) – The user prompt to the LLM.
+ * **response** (*str*) – The corresponding LLM response.
+ * **session_tag** (*Optional* *[* *str* *]*) – Tag to be added to entries to link to a specific
+ conversation session. Defaults to instance ULID.
+* **Return type:**
+ None
+
+#### `property messages: List[str] | List[Dict[str, str]]`
+
+Returns the full message history.
diff --git a/content/integrate/redisvl/api/query.md b/content/integrate/redisvl/api/query.md
index f636006cc5..5144f57116 100644
--- a/content/integrate/redisvl/api/query.md
+++ b/content/integrate/redisvl/api/query.md
@@ -11,7 +11,7 @@ queries for different use cases. Each query class wraps the `redis-py` Query mod
## VectorQuery
-### `class VectorQuery(vector, vector_field_name, return_fields=None, filter_expression=None, dtype='float32', num_results=10, return_score=True, dialect=2, sort_by=None, in_order=False, hybrid_policy=None, batch_size=None, normalize_vector_distance=False)`
+### `class VectorQuery(vector, vector_field_name, return_fields=None, filter_expression=None, dtype='float32', num_results=10, return_score=True, dialect=2, sort_by=None, in_order=False, hybrid_policy=None, batch_size=None, ef_runtime=None, normalize_vector_distance=False)`
Bases: `BaseVectorQuery`, `BaseQuery`
@@ -49,6 +49,9 @@ expression.
of vectors to fetch in each batch. Larger values may improve performance
at the cost of memory usage. Only applies when hybrid_policy=”BATCHES”.
Defaults to None, which lets Redis auto-select an appropriate batch size.
+ * **ef_runtime** (*Optional* *[* *int* *]*) – Controls the size of the dynamic candidate list for HNSW
+ algorithm at query time. Higher values improve recall at the expense of
+ slower search performance. Defaults to None, which uses the index-defined value.
* **normalize_vector_distance** (*bool*) – Redis supports 3 distance metrics: L2 (euclidean),
IP (inner product), and COSINE. By default, L2 distance returns an unbounded value.
COSINE distance returns a value between 0 and 2. IP returns a value determined by
@@ -186,6 +189,17 @@ Set the batch size for the query.
* **TypeError** – If batch_size is not an integer
* **ValueError** – If batch_size is not positive
+#### `set_ef_runtime(ef_runtime)`
+
+Set the EF_RUNTIME parameter for the query.
+
+* **Parameters:**
+ **ef_runtime** (*int*) – The EF_RUNTIME value to use for HNSW algorithm.
+ Higher values improve recall at the expense of slower search.
+* **Raises:**
+ * **TypeError** – If ef_runtime is not an integer
+ * **ValueError** – If ef_runtime is not positive
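+
+For example, to trade some query latency for higher recall on an HNSW index (the value is illustrative):
+
+```python
+query.set_ef_runtime(128)
+```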
+
#### `set_filter(filter_expression=None)`
Set the filter expression for the query.
@@ -269,6 +283,15 @@ Return the batch size for the query.
* **Return type:**
Optional[int]
+#### `property ef_runtime: int | None`
+
+Return the EF_RUNTIME parameter for the query.
+
+* **Returns:**
+ The EF_RUNTIME value for the query.
+* **Return type:**
+ Optional[int]
+
#### `property filter: str | `[`FilterExpression`]({{< relref "filter/#filterexpression" >}})` `
The filter expression for the query.
diff --git a/content/integrate/redisvl/api/router.md b/content/integrate/redisvl/api/router.md
index f5dc477bcf..b39639696b 100644
--- a/content/integrate/redisvl/api/router.md
+++ b/content/integrate/redisvl/api/router.md
@@ -26,6 +26,19 @@ Initialize the SemanticRouter.
* **connection_kwargs** (*Dict* *[* *str* *,* *Any* *]*) – The connection arguments
for the redis client. Defaults to empty {}.
+#### `add_route_references(route_name, references)`
+
+Add one or more references to an existing route.
+
+* **Parameters:**
+  * **route_name** (*str*) – The name of the route to add references to.
+  * **references** (*Union* *[* *str* *,* *List* *[* *str* *]* *]*) – The reference or list of references to add.
+* **Returns:**
+  The list of keys for the added references.
+* **Return type:**
+ List[str]
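+
+A brief usage sketch; the route name and references are illustrative:
+
+```python
+keys = router.add_route_references(
+    route_name="greeting",
+    references=["hello", "hi there"]
+)
+```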
+
#### `clear()`
Flush all routes from the semantic router index.
@@ -40,6 +53,22 @@ Delete the semantic router index.
* **Return type:**
None
+#### `delete_route_references(route_name='', reference_ids=[], keys=[])`
+
+Delete references from an existing semantic router route.
+
+* **Parameters:**
+  * **route_name** (*str*) – The name of the route to delete references from.
+  * **reference_ids** (*List* *[* *str* *]*) – The reference IDs to delete.
+  * **keys** (*List* *[* *str* *]*) – List of fully qualified keys (prefix:router:reference_id) to delete.
+* **Returns:**
+  The number of deleted references.
+* **Return type:**
+ int
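+
+A brief usage sketch; the route name and reference ID are illustrative:
+
+```python
+deleted = router.delete_route_references(
+    route_name="greeting",
+    reference_ids=["ref_123"]
+)
+```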
+
#### `classmethod from_dict(data, **kwargs)`
Create a SemanticRouter from a dictionary.
@@ -63,6 +92,17 @@ router_data = {
router = SemanticRouter.from_dict(router_data)
```
+#### `classmethod from_existing(name, redis_client=None, redis_url='redis://localhost:6379', **kwargs)`
+
+Return SemanticRouter instance from existing index.
+
+* **Parameters:**
+ * **name** (*str*)
+ * **redis_client** (*Redis* *|* *None*)
+ * **redis_url** (*str*)
+* **Return type:**
+ [SemanticRouter](#semanticrouter)
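+
+For example, to reconnect to a previously created router index (the name is illustrative):
+
+```python
+router = SemanticRouter.from_existing(
+    name="my-router",
+    redis_url="redis://localhost:6379"
+)
+```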
+
#### `classmethod from_yaml(file_path, **kwargs)`
Create a SemanticRouter from a YAML file.
@@ -93,6 +133,21 @@ Get a route by its name.
* **Return type:**
Optional[[Route](#route)]
+#### `get_route_references(route_name='', reference_ids=[], keys=[])`
+
+Get references for an existing route.
+
+* **Parameters:**
+  * **route_name** (*str*) – The name of the route to fetch references for.
+  * **reference_ids** (*List* *[* *str* *]*) – The reference IDs to fetch.
+  * **keys** (*List* *[* *str* *]*) – List of fully qualified keys (prefix:router:reference_id) to fetch.
+* **Returns:**
+  The stored reference objects.
+* **Return type:**
+  List[Dict[str, Any]]
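+
+For example, to fetch all stored references for a route (the name is illustrative):
+
+```python
+refs = router.get_route_references(route_name="greeting")
+```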
+
#### `model_post_init(context, /)`
This function is meant to behave like a BaseModel method to initialise private attributes.
@@ -232,10 +287,10 @@ validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
* **Parameters:**
- * **max_k** (*Annotated* *[* *int* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=False* *,* *default=1* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
+ * **max_k** (*Annotated* *[* *int* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=True* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
* **aggregation_method** ([DistanceAggregationMethod](#distanceaggregationmethod))
-#### `max_k: Annotated[int, FieldInfo(annotation=NoneType, required=False, default=1, metadata=[Strict(strict=True), Gt(gt=0)])]`
+#### `max_k: Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]`
The maximum number of top matching routes to return.
diff --git a/content/integrate/redisvl/api/searchindex.md b/content/integrate/redisvl/api/searchindex.md
index 998796e646..d92838f364 100644
--- a/content/integrate/redisvl/api/searchindex.md
+++ b/content/integrate/redisvl/api/searchindex.md
@@ -75,7 +75,7 @@ to the redis-py ft().aggregate() method.
Execute a batch of queries and process results.
* **Parameters:**
- * **queries** (*List* *[* *BaseQuery* *]*)
+ * **queries** (*Sequence* *[* *BaseQuery* *]*)
* **batch_size** (*int*)
* **Return type:**
*List*[*List*[*Dict*[str, *Any*]]]
@@ -169,6 +169,20 @@ with the index.
Disconnect from the Redis database.
+#### `drop_documents(ids)`
+
+Remove documents from the index by their document IDs.
+
+This method converts document IDs to Redis keys automatically by applying
+the index’s key prefix and separator configuration.
+
+* **Parameters:**
+ **ids** (*Union* *[* *str* *,* *List* *[* *str* *]* *]*) – The document ID or IDs to remove from the index.
+* **Returns:**
+ Count of documents deleted from Redis.
+* **Return type:**
+ int
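+
+A brief usage sketch; the document IDs are illustrative:
+
+```python
+count = index.drop_documents(["doc1", "doc2"])
+```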
+
#### `drop_keys(keys)`
Remove a specific entry or entries from the index by its key ID.
@@ -596,6 +610,20 @@ Delete the search index.
Disconnect from the Redis database.
+#### `async drop_documents(ids)`
+
+Remove documents from the index by their document IDs.
+
+This method converts document IDs to Redis keys automatically by applying
+the index’s key prefix and separator configuration.
+
+* **Parameters:**
+ **ids** (*Union* *[* *str* *,* *List* *[* *str* *]* *]*) – The document ID or IDs to remove from the index.
+* **Returns:**
+ Count of documents deleted from Redis.
+* **Return type:**
+ int
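+
+A brief usage sketch; the document IDs are illustrative:
+
+```python
+count = await index.drop_documents(["doc1", "doc2"])
+```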
+
#### `async drop_keys(keys)`
Remove a specific entry or entries from the index by its key ID.
diff --git a/content/integrate/redisvl/api/vectorizer.md b/content/integrate/redisvl/api/vectorizer.md
index 0f15492f71..7133911960 100644
--- a/content/integrate/redisvl/api/vectorizer.md
+++ b/content/integrate/redisvl/api/vectorizer.md
@@ -9,30 +9,53 @@ type: integration
-### `class HFTextVectorizer(model='sentence-transformers/all-mpnet-base-v2', dtype='float32', *, dims=None)`
+### `class HFTextVectorizer(model='sentence-transformers/all-mpnet-base-v2', dtype='float32', cache=None, *, dims=None)`
Bases: `BaseVectorizer`
-The HFTextVectorizer class is designed to leverage the power of Hugging
-Face’s Sentence Transformers for generating text embeddings. This vectorizer
-is particularly useful in scenarios where advanced natural language
+The HFTextVectorizer class leverages Hugging Face’s Sentence Transformers
+for generating vector embeddings from text input.
+
+This vectorizer is particularly useful in scenarios where advanced natural language
processing and understanding are required, and ideal for running on your own
-hardware (for free).
+hardware without usage fees.
+
+You can optionally enable caching to improve performance when generating
+embeddings for repeated text inputs.
Utilizing this vectorizer involves specifying a pre-trained model from
Hugging Face’s vast collection of Sentence Transformers. These models are
trained on a variety of datasets and tasks, ensuring versatility and
-robust performance across different text embedding needs. Additionally,
-make sure the sentence-transformers library is installed with
-pip install sentence-transformers==2.2.2.
+robust performance across different embedding needs.
+
+Requirements:
+
+* The sentence-transformers library must be installed with pip.
```python
-# Embedding a single text
+# Basic usage
vectorizer = HFTextVectorizer(model="sentence-transformers/all-mpnet-base-v2")
embedding = vectorizer.embed("Hello, world!")
-# Embedding a batch of texts
-embeddings = vectorizer.embed_many(["Hello, world!", "How are you?"], batch_size=2)
+# With caching enabled
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+cache = EmbeddingsCache(name="my_embeddings_cache")
+
+vectorizer = HFTextVectorizer(
+ model="sentence-transformers/all-mpnet-base-v2",
+ cache=cache
+)
+
+# First call will compute and cache the embedding
+embedding1 = vectorizer.embed("Hello, world!")
+
+# Second call will retrieve from cache
+embedding2 = vectorizer.embed("Hello, world!")
+
+# Batch processing
+embeddings = vectorizer.embed_many(
+ ["Hello, world!", "How are you?"],
+ batch_size=2
+)
```
Initialize the Hugging Face text vectorizer.
@@ -44,51 +67,16 @@ Initialize the Hugging Face text vectorizer.
* **dtype** (*str*) – the default datatype to use when embedding text as byte arrays.
Used when setting as_buffer=True in calls to embed() and embed_many().
Defaults to ‘float32’.
- * **dims** (*int* *|* *None*)
+ * **cache** (*Optional* *[*[*EmbeddingsCache*]({{< relref "cache/#embeddingscache" >}}) *]*) – Optional EmbeddingsCache instance to cache embeddings for
+ better performance with repeated texts. Defaults to None.
+ * **\*\*kwargs** – Additional parameters to pass to the SentenceTransformer
+ constructor.
+ * **dims** (*Annotated* *[* *int* *|* *None* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=True* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
* **Raises:**
* **ImportError** – If the sentence-transformers library is not installed.
* **ValueError** – If there is an error setting the embedding model dimensions.
* **ValueError** – If an invalid dtype is provided.
-#### `embed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Embed a chunk of text using the Hugging Face sentence transformer.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing
- callable to perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the text.
-
-#### `embed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Asynchronously embed many chunks of texts using the Hugging Face
-sentence transformer.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing
- callable to perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. Defaults to 10.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the test.
-
#### `model_post_init(context, /)`
This function is meant to behave like a BaseModel method to initialise private attributes.
@@ -101,15 +89,19 @@ It takes context as an argument since that’s what pydantic-core passes when ca
* **Return type:**
None
-#### `model_config: ClassVar[ConfigDict] = {}`
+#### `model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}`
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+#### `property type: str`
+
+Return the type of vectorizer.
+
## OpenAITextVectorizer
-### `class OpenAITextVectorizer(model='text-embedding-ada-002', api_config=None, dtype='float32', *, dims=None)`
+### `class OpenAITextVectorizer(model='text-embedding-ada-002', api_config=None, dtype='float32', cache=None, *, dims=None)`
Bases: `BaseVectorizer`
@@ -127,14 +119,33 @@ The vectorizer supports both synchronous and asynchronous operations,
allowing for batch processing of texts and flexibility in handling
preprocessing tasks.
+You can optionally enable caching to improve performance when generating
+embeddings for repeated text inputs.
+
```python
-# Synchronous embedding of a single text
+# Basic usage with OpenAI embeddings
vectorizer = OpenAITextVectorizer(
model="text-embedding-ada-002",
api_config={"api_key": "your_api_key"} # OR set OPENAI_API_KEY in your env
)
embedding = vectorizer.embed("Hello, world!")
+# With caching enabled
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+cache = EmbeddingsCache(name="openai_embeddings_cache")
+
+vectorizer = OpenAITextVectorizer(
+ model="text-embedding-ada-002",
+ api_config={"api_key": "your_api_key"},
+ cache=cache
+)
+
+# First call will compute and cache the embedding
+embedding1 = vectorizer.embed("Hello, world!")
+
+# Second call will retrieve from cache
+embedding2 = vectorizer.embed("Hello, world!")
+
# Asynchronous batch embedding of multiple texts
embeddings = await vectorizer.aembed_many(
["Hello, world!", "How are you?"],
@@ -152,109 +163,27 @@ Initialize the OpenAI vectorizer.
* **dtype** (*str*) – the default datatype to use when embedding text as byte arrays.
Used when setting as_buffer=True in calls to embed() and embed_many().
Defaults to ‘float32’.
- * **dims** (*int* *|* *None*)
+ * **cache** (*Optional* *[*[*EmbeddingsCache*]({{< relref "cache/#embeddingscache" >}}) *]*) – Optional EmbeddingsCache instance to cache embeddings for
+ better performance with repeated texts. Defaults to None.
+ * **dims** (*Annotated* *[* *int* *|* *None* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=True* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
* **Raises:**
* **ImportError** – If the openai library is not installed.
* **ValueError** – If the OpenAI API key is not provided.
* **ValueError** – If an invalid dtype is provided.
-#### `aembed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Asynchronously embed a chunk of text using the OpenAI API.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the text.
-
-#### `aembed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Asynchronously embed many chunks of texts using the OpenAI API.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. Defaults to 10.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the text.
-
-#### `embed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Embed a chunk of text using the OpenAI API.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the text.
-
-#### `embed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Embed many chunks of texts using the OpenAI API.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing
- callable to perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. Defaults to 10.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the text.
-
-#### `model_post_init(context, /)`
-
-This function is meant to behave like a BaseModel method to initialise private attributes.
-
-It takes context as an argument since that’s what pydantic-core passes when calling it.
+#### `model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}`
-* **Parameters:**
- * **self** (*BaseModel*) – The BaseModel instance.
- * **context** (*Any*) – The context.
-* **Return type:**
- None
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-#### `model_config: ClassVar[ConfigDict] = {}`
+#### `property type: str`
-Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+Return the type of vectorizer.
## AzureOpenAITextVectorizer
-### `class AzureOpenAITextVectorizer(model='text-embedding-ada-002', api_config=None, dtype='float32', *, dims=None)`
+### `class AzureOpenAITextVectorizer(model='text-embedding-ada-002', api_config=None, dtype='float32', cache=None, *, dims=None)`
Bases: `BaseVectorizer`
@@ -273,8 +202,11 @@ The vectorizer supports both synchronous and asynchronous operations,
allowing for batch processing of texts and flexibility in handling
preprocessing tasks.
+You can optionally enable caching to improve performance when generating
+embeddings for repeated text inputs.
+
```python
-# Synchronous embedding of a single text
+# Basic usage
vectorizer = AzureOpenAITextVectorizer(
model="text-embedding-ada-002",
api_config={
@@ -285,6 +217,26 @@ vectorizer = AzureOpenAITextVectorizer(
)
embedding = vectorizer.embed("Hello, world!")
+# With caching enabled
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+cache = EmbeddingsCache(name="azureopenai_embeddings_cache")
+
+vectorizer = AzureOpenAITextVectorizer(
+ model="text-embedding-ada-002",
+ api_config={
+ "api_key": "your_api_key",
+ "api_version": "your_api_version",
+ "azure_endpoint": "your_azure_endpoint",
+ },
+ cache=cache
+)
+
+# First call will compute and cache the embedding
+embedding1 = vectorizer.embed("Hello, world!")
+
+# Second call will retrieve from cache
+embedding2 = vectorizer.embed("Hello, world!")
+
# Asynchronous batch embedding of multiple texts
embeddings = await vectorizer.aembed_many(
["Hello, world!", "How are you?"],
@@ -304,109 +256,27 @@ Initialize the AzureOpenAI vectorizer.
* **dtype** (*str*) – the default datatype to use when embedding text as byte arrays.
Used when setting as_buffer=True in calls to embed() and embed_many().
Defaults to ‘float32’.
- * **dims** (*int* *|* *None*)
+ * **cache** (*Optional* *[*[*EmbeddingsCache*]({{< relref "cache/#embeddingscache" >}}) *]*) – Optional EmbeddingsCache instance to cache embeddings for
+ better performance with repeated texts. Defaults to None.
+ * **dims** (*Annotated* *[* *int* *|* *None* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=True* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
* **Raises:**
* **ImportError** – If the openai library is not installed.
* **ValueError** – If the AzureOpenAI API key, version, or endpoint are not provided.
* **ValueError** – If an invalid dtype is provided.
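+
+As a brief illustration of the `dtype` and `as_buffer` behavior described above (a sketch that reuses the `vectorizer` from the example; the same pattern applies to every vectorizer class):
+
+```python
+# Returns raw bytes packed according to the configured dtype
+# ('float32' here), handy for writing into a Redis hash field
+buf = vectorizer.embed("Hello, world!", as_buffer=True)
+assert isinstance(buf, bytes)
+
+# An optional preprocess callable runs on the text before embedding
+clean = vectorizer.embed("  Hello, world!  ", preprocess=str.strip)
+```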
-#### `aembed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Asynchronously embed a chunk of text using the OpenAI API.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the test.
-
-#### `aembed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Asynchronously embed many chunks of texts using the AzureOpenAI API.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. Defaults to 10.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the test.
-
-#### `embed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Embed a chunk of text using the AzureOpenAI API.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the test.
-
-#### `embed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Embed many chunks of texts using the AzureOpenAI API.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing
- callable to perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. Defaults to 10.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the test.
-
-#### `model_post_init(context, /)`
+#### `model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}`
-This function is meant to behave like a BaseModel method to initialise private attributes.
-
-It takes context as an argument since that’s what pydantic-core passes when calling it.
-
-* **Parameters:**
- * **self** (*BaseModel*) – The BaseModel instance.
- * **context** (*Any*) – The context.
-* **Return type:**
- None
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-#### `model_config: ClassVar[ConfigDict] = {}`
+#### `property type: str`
-Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+Return the type of vectorizer.
## VertexAITextVectorizer
-### `class VertexAITextVectorizer(model='textembedding-gecko', api_config=None, dtype='float32', *, dims=None)`
+### `class VertexAITextVectorizer(model='textembedding-gecko', api_config=None, dtype='float32', cache=None, *, dims=None)`
Bases: `BaseVectorizer`
@@ -423,8 +293,11 @@ provided through the api_config dictionary or set the GOOGLE_APPLICATION_CREDENT
env var. Additionally, the vertexai python client must be
installed with pip install google-cloud-aiplatform>=1.26.
+You can optionally enable caching to improve performance when generating
+embeddings for repeated text inputs.
+
```python
-# Synchronous embedding of a single text
+# Basic usage
vectorizer = VertexAITextVectorizer(
model="textembedding-gecko",
api_config={
@@ -433,8 +306,27 @@ vectorizer = VertexAITextVectorizer(
})
embedding = vectorizer.embed("Hello, world!")
-# Asynchronous batch embedding of multiple texts
-embeddings = await vectorizer.embed_many(
+# With caching enabled
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+cache = EmbeddingsCache(name="vertexai_embeddings_cache")
+
+vectorizer = VertexAITextVectorizer(
+ model="textembedding-gecko",
+ api_config={
+ "project_id": "your_gcp_project_id",
+ "location": "your_gcp_location",
+ },
+ cache=cache
+)
+
+# First call will compute and cache the embedding
+embedding1 = vectorizer.embed("Hello, world!")
+
+# Second call will retrieve from cache
+embedding2 = vectorizer.embed("Hello, world!")
+
+# Batch embedding of multiple texts
+embeddings = vectorizer.embed_many(
["Hello, world!", "Goodbye, world!"],
batch_size=2
)
@@ -450,71 +342,27 @@ Initialize the VertexAI vectorizer.
* **dtype** (*str*) – the default datatype to use when embedding text as byte arrays.
Used when setting as_buffer=True in calls to embed() and embed_many().
Defaults to ‘float32’.
- * **dims** (*int* *|* *None*)
+ * **cache** (*Optional* *[*[*EmbeddingsCache*]({{< relref "cache/#embeddingscache" >}}) *]*) – Optional EmbeddingsCache instance to cache embeddings for
+ better performance with repeated texts. Defaults to None.
+ * **dims** (*Annotated* *[* *int* *|* *None* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=True* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
* **Raises:**
* **ImportError** – If the google-cloud-aiplatform library is not installed.
* **ValueError** – If the API key is not provided.
* **ValueError** – If an invalid dtype is provided.
-#### `embed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Embed a chunk of text using the VertexAI Embeddings API.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the test.
+#### `model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}`
-#### `embed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Embed many chunks of text using the VertexAI Embeddings API.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. Defaults to 10.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- **TypeError** – If the wrong input type is passed in for the test.
-
-#### `model_post_init(context, /)`
-
-This function is meant to behave like a BaseModel method to initialise private attributes.
-
-It takes context as an argument since that’s what pydantic-core passes when calling it.
-
-* **Parameters:**
- * **self** (*BaseModel*) – The BaseModel instance.
- * **context** (*Any*) – The context.
-* **Return type:**
- None
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-#### `model_config: ClassVar[ConfigDict] = {}`
+#### `property type: str`
-Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+Return the type of vectorizer.
## CohereTextVectorizer
-### `class CohereTextVectorizer(model='embed-english-v3.0', api_config=None, dtype='float32', *, dims=None)`
+### `class CohereTextVectorizer(model='embed-english-v3.0', api_config=None, dtype='float32', cache=None, *, dims=None)`
Bases: `BaseVectorizer`
@@ -531,9 +379,13 @@ client must be installed with pip install cohere.
The vectorizer supports only synchronous operations, allows for batch
processing of texts and flexibility in handling preprocessing tasks.
+You can optionally enable caching to improve performance when generating
+embeddings for repeated text inputs.
+
```python
from redisvl.utils.vectorize import CohereTextVectorizer
+# Basic usage
vectorizer = CohereTextVectorizer(
model="embed-english-v3.0",
api_config={"api_key": "your-cohere-api-key"} # OR set COHERE_API_KEY in your env
@@ -542,10 +394,32 @@ query_embedding = vectorizer.embed(
text="your input query text here",
input_type="search_query"
)
-doc_embeddings = cohere.embed_many(
+doc_embeddings = vectorizer.embed_many(
texts=["your document text", "more document text"],
input_type="search_document"
)
+
+# With caching enabled
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+cache = EmbeddingsCache(name="cohere_embeddings_cache")
+
+vectorizer = CohereTextVectorizer(
+ model="embed-english-v3.0",
+ api_config={"api_key": "your-cohere-api-key"},
+ cache=cache
+)
+
+# First call will compute and cache the embedding
+embedding1 = vectorizer.embed(
+ text="your input query text here",
+ input_type="search_query"
+)
+
+# Second call will retrieve from cache
+embedding2 = vectorizer.embed(
+ text="your input query text here",
+ input_type="search_query"
+)
```
Initialize the Cohere vectorizer.
@@ -560,111 +434,27 @@ Visit [https://cohere.ai/embed](https://cohere.ai/embed) to learn about embeddin
Used when setting as_buffer=True in calls to embed() and embed_many().
‘float32’ will use Cohere’s float embeddings, ‘int8’ and ‘uint8’ will map
to Cohere’s corresponding embedding types. Defaults to ‘float32’.
- * **dims** (*int* *|* *None*)
+ * **cache** (*Optional* *[*[*EmbeddingsCache*]({{< relref "cache/#embeddingscache" >}}) *]*) – Optional EmbeddingsCache instance to cache embeddings for
+ better performance with repeated texts. Defaults to None.
+ * **dims** (*Annotated* *[* *int* *|* *None* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=True* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
* **Raises:**
* **ImportError** – If the cohere library is not installed.
* **ValueError** – If the API key is not provided.
* **ValueError** – If an invalid dtype is provided.
-#### `embed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Embed a chunk of text using the Cohere Embeddings API.
+#### `model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}`
-Must provide the embedding input_type as a kwarg to this method
-that specifies the type of input you’re giving to the model.
-
-Supported input types:
-: - `search_document`: Used for embeddings stored in a vector database for search use-cases.
- - `search_query`: Used for embeddings of search queries run against a vector DB to find relevant documents.
- - `classification`: Used for embeddings passed through a text classifier
- - `clustering`: Used for the embeddings run through a clustering algorithm.
-
-When hydrating your Redis DB, the documents you want to search over
-should be embedded with input_type= “search_document” and when you are
-querying the database, you should set the input_type = “search query”.
-If you want to use the embeddings for a classification or clustering
-task downstream, you should set input_type= “classification” or
-“clustering”.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
- * **input_type** (*str*) – Specifies the type of input passed to the model.
- Required for embedding models v3 and higher.
-* **Returns:**
- - If as_buffer=True: Returns a bytes object
- - If as_buffer=False:
- - For dtype=”float32”: Returns a list of floats
- - For dtype=”int8” or “uint8”: Returns a list of integers
-* **Return type:**
- Union[List[float], List[int], bytes]
-* **Raises:**
- **TypeError** – In an invalid input_type is provided.
-
-#### `embed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Embed many chunks of text using the Cohere Embeddings API.
-
-Must provide the embedding input_type as a kwarg to this method
-that specifies the type of input you’re giving to the model.
-
-Supported input types:
-: - `search_document`: Used for embeddings stored in a vector database for search use-cases.
- - `search_query`: Used for embeddings of search queries run against a vector DB to find relevant documents.
- - `classification`: Used for embeddings passed through a text classifier
- - `clustering`: Used for the embeddings run through a clustering algorithm.
-
-When hydrating your Redis DB, the documents you want to search over
-should be embedded with input_type= “search_document” and when you are
-querying the database, you should set the input_type = “search query”.
-If you want to use the embeddings for a classification or clustering
-task downstream, you should set input_type= “classification” or
-“clustering”.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. Defaults to 10.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
- * **input_type** (*str*) – Specifies the type of input passed to the model.
- Required for embedding models v3 and higher.
-* **Returns:**
- - If as_buffer=True: Returns a list of bytes objects
- - If as_buffer=False:
- - For dtype=”float32”: Returns a list of lists of floats
- - For dtype=”int8” or “uint8”: Returns a list of lists of integers
-* **Return type:**
- Union[List[List[float]], List[List[int]], List[bytes]]
-* **Raises:**
- **TypeError** – In an invalid input_type is provided.
-
-#### `model_post_init(context, /)`
-
-This function is meant to behave like a BaseModel method to initialise private attributes.
-
-It takes context as an argument since that’s what pydantic-core passes when calling it.
-
-* **Parameters:**
- * **self** (*BaseModel*) – The BaseModel instance.
- * **context** (*Any*) – The context.
-* **Return type:**
- None
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-#### `model_config: ClassVar[ConfigDict] = {}`
+#### `property type: str`
-Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+Return the type of vectorizer.
## BedrockTextVectorizer
-### `class BedrockTextVectorizer(model='amazon.titan-embed-text-v2:0', api_config=None, dtype='float32', *, dims=None)`
+### `class BedrockTextVectorizer(model='amazon.titan-embed-text-v2:0', api_config=None, dtype='float32', cache=None, *, dims=None)`
Bases: `BaseVectorizer`
@@ -681,8 +471,11 @@ directly in the api_config dictionary or through environment variables:
The vectorizer supports synchronous operations with batch processing and
preprocessing capabilities.
+You can optionally enable caching to improve performance when generating
+embeddings for repeated text inputs.
+
```python
-# Initialize with explicit credentials
+# Basic usage with explicit credentials
vectorizer = AmazonBedrockTextVectorizer(
model="amazon.titan-embed-text-v2:0",
api_config={
@@ -692,11 +485,22 @@ vectorizer = AmazonBedrockTextVectorizer(
}
)
-# Initialize using environment variables
-vectorizer = AmazonBedrockTextVectorizer()
+# With environment variables and caching
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+cache = EmbeddingsCache(name="bedrock_embeddings_cache")
-# Generate embeddings
-embedding = vectorizer.embed("Hello, world!")
+vectorizer = AmazonBedrockTextVectorizer(
+ model="amazon.titan-embed-text-v2:0",
+ cache=cache
+)
+
+# First call will compute and cache the embedding
+embedding1 = vectorizer.embed("Hello, world!")
+
+# Second call will retrieve from cache
+embedding2 = vectorizer.embed("Hello, world!")
+
+# Generate batch embeddings
embeddings = vectorizer.embed_many(["Hello", "World"], batch_size=2)
```
@@ -710,66 +514,27 @@ Initialize the AWS Bedrock Vectorizer.
* **dtype** (*str*) – the default datatype to use when embedding text as byte arrays.
Used when setting as_buffer=True in calls to embed() and embed_many().
Defaults to ‘float32’.
- * **dims** (*int* *|* *None*)
+ * **cache** (*Optional* *[*[*EmbeddingsCache*]({{< relref "cache/#embeddingscache" >}}) *]*) – Optional EmbeddingsCache instance to cache embeddings for
+ better performance with repeated texts. Defaults to None.
+ * **dims** (*Annotated* *[* *int* *|* *None* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=True* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
* **Raises:**
* **ValueError** – If credentials are not provided in config or environment.
* **ImportError** – If boto3 is not installed.
* **ValueError** – If an invalid dtype is provided.
-#### `embed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Embed a chunk of text using the AWS Bedrock Embeddings API.
-
-* **Parameters:**
- * **text** (*str*) – Text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]*) – Optional preprocessing function.
- * **as_buffer** (*bool*) – Whether to return as byte buffer.
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If text is not a string.
-
-#### `embed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Embed many chunks of text using the AWS Bedrock Embeddings API.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of texts to embed.
- * **preprocess** (*Optional* *[* *Callable* *]*) – Optional preprocessing function.
- * **batch_size** (*int*) – Size of batches for processing. Defaults to 10.
- * **as_buffer** (*bool*) – Whether to return as byte buffers.
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- **TypeError** – If texts is not a list of strings.
-
-#### `model_post_init(context, /)`
-
-This function is meant to behave like a BaseModel method to initialise private attributes.
+#### `model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}`
-It takes context as an argument since that’s what pydantic-core passes when calling it.
-
-* **Parameters:**
- * **self** (*BaseModel*) – The BaseModel instance.
- * **context** (*Any*) – The context.
-* **Return type:**
- None
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-#### `model_config: ClassVar[ConfigDict] = {}`
+#### `property type: str`
-Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+Return the type of vectorizer.
## CustomTextVectorizer
-### `class CustomTextVectorizer(embed, embed_many=None, aembed=None, aembed_many=None, dtype='float32')`
+### `class CustomTextVectorizer(embed, embed_many=None, aembed=None, aembed_many=None, dtype='float32', cache=None)`
Bases: `BaseVectorizer`
@@ -782,13 +547,31 @@ The vectorizer may support both synchronous and asynchronous operations which
allows for batch processing of texts, but at a minimum only synchronous embedding
is required to satisfy the ‘embed()’ method.
+You can optionally enable caching to improve performance when generating
+embeddings for repeated text inputs.
+
```python
-# Synchronous embedding of a single text
+# Basic usage with a custom embedding function
vectorizer = CustomTextVectorizer(
embed = my_vectorizer.generate_embedding
)
embedding = vectorizer.embed("Hello, world!")
+# With caching enabled
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+cache = EmbeddingsCache(name="my_embeddings_cache")
+
+vectorizer = CustomTextVectorizer(
+ embed=my_vectorizer.generate_embedding,
+ cache=cache
+)
+
+# First call will compute and cache the embedding
+embedding1 = vectorizer.embed("Hello, world!")
+
+# Second call will retrieve from cache
+embedding2 = vectorizer.embed("Hello, world!")
+
# Asynchronous batch embedding of multiple texts
embeddings = await vectorizer.aembed_many(
["Hello, world!", "How are you?"],
@@ -800,97 +583,30 @@ Initialize the Custom vectorizer.
* **Parameters:**
* **embed** (*Callable*) – a Callable function that accepts a string object and returns a list of floats.
- * **embed_many** (*Optional* *[* *Callable*) – a Callable function that accepts a list of string objects and returns a list containing lists of floats. Defaults to None.
+ * **embed_many** (*Optional* *[* *Callable* *]*) – a Callable function that accepts a list of string objects and returns a list containing lists of floats. Defaults to None.
* **aembed** (*Optional* *[* *Callable* *]*) – an asynchronous Callable function that accepts a string object and returns a list of floats. Defaults to None.
* **aembed_many** (*Optional* *[* *Callable* *]*) – an asynchronous Callable function that accepts a list of string objects and returns a list containing lists of floats. Defaults to None.
* **dtype** (*str*) – the default datatype to use when embedding text as byte arrays.
Used when setting as_buffer=True in calls to embed() and embed_many().
Defaults to ‘float32’.
+ * **cache** (*Optional* *[*[*EmbeddingsCache*]({{< relref "cache/#embeddingscache" >}}) *]*) – Optional EmbeddingsCache instance to cache embeddings for
+ better performance with repeated texts. Defaults to None.
* **Raises:**
**ValueError** – if embedding validation fails.
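+
+For illustration, here is a minimal sketch of a callable that satisfies the `embed` contract. The `toy_embed` function is purely hypothetical and stands in for a real embedding model:
+
+```python
+from redisvl.utils.vectorize import CustomTextVectorizer
+
+def toy_embed(text: str) -> list[float]:
+    # Hypothetical stand-in: build a fixed-size pseudo-embedding
+    # from the first four bytes of the text
+    data = text.encode("utf-8")[:4].ljust(4, b"\0")
+    return [b / 255.0 for b in data]
+
+vectorizer = CustomTextVectorizer(embed=toy_embed)
+embedding = vectorizer.embed("Hello, world!")  # four floats in [0, 1]
+```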
-#### `async aembed(*args, **kwargs)`
-
-Asynchronously embed a chunk of text.
-
-* **Parameters:**
- * **text** – Text to embed
- * **preprocess** – Optional function to preprocess text
- * **as_buffer** – If True, returns a bytes object instead of a list
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-
-#### `async aembed_many(*args, **kwargs)`
-
-Asynchronously embed multiple chunks of text.
-
-* **Parameters:**
- * **texts** – List of texts to embed
- * **preprocess** – Optional function to preprocess text
- * **batch_size** – Number of texts to process in each batch
- * **as_buffer** – If True, returns each embedding as a bytes object
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-
-#### `embed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Generate an embedding for a single piece of text using your sync embed function.
-
-* **Parameters:**
- * **text** (*str*) – The text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]*) – An optional callable to preprocess the text.
- * **as_buffer** (*bool*) – If True, return the embedding as a byte buffer.
-* **Returns:**
- The embedding of the input text.
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If the input is not a string.
-
-#### `embed_many(texts, preprocess=None, batch_size=10, as_buffer=False, **kwargs)`
-
-Generate embeddings for multiple pieces of text in batches using your sync embed_many function.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – A list of texts to embed.
- * **preprocess** (*Optional* *[* *Callable* *]*) – Optional preprocessing for each text.
- * **batch_size** (*int*) – Number of texts per batch.
- * **as_buffer** (*bool*) – If True, convert each embedding to a byte buffer.
-* **Returns:**
- A list of embeddings, where each embedding is a list of floats or bytes.
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- * **TypeError** – If the input is not a list of strings.
- * **NotImplementedError** – If no embed_many function was provided.
-
-#### `model_post_init(context, /)`
-
-This function is meant to behave like a BaseModel method to initialise private attributes.
+#### `model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}`
-It takes context as an argument since that’s what pydantic-core passes when calling it.
-
-* **Parameters:**
- * **self** (*BaseModel*) – The BaseModel instance.
- * **context** (*Any*) – The context.
-* **Return type:**
- None
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-#### `model_config: ClassVar[ConfigDict] = {}`
+#### `property type: str`
-Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+Return the type of vectorizer.
## VoyageAITextVectorizer
-### `class VoyageAITextVectorizer(model='voyage-large-2', api_config=None, dtype='float32', *, dims=None)`
+### `class VoyageAITextVectorizer(model='voyage-large-2', api_config=None, dtype='float32', cache=None, *, dims=None)`
Bases: `BaseVectorizer`
@@ -907,9 +623,13 @@ client must be installed with pip install voyageai.
The vectorizer supports both synchronous and asynchronous operations, allows for batch
processing of texts and flexibility in handling preprocessing tasks.
+You can optionally enable caching to improve performance when generating
+embeddings for repeated text inputs.
+
```python
from redisvl.utils.vectorize import VoyageAITextVectorizer
+# Basic usage
vectorizer = VoyageAITextVectorizer(
model="voyage-large-2",
api_config={"api_key": "your-voyageai-api-key"} # OR set VOYAGE_API_KEY in your env
@@ -922,6 +642,28 @@ doc_embeddings = vectorizer.embed_many(
texts=["your document text", "more document text"],
input_type="document"
)
+
+# With caching enabled
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+cache = EmbeddingsCache(name="voyageai_embeddings_cache")
+
+vectorizer = VoyageAITextVectorizer(
+ model="voyage-large-2",
+ api_config={"api_key": "your-voyageai-api-key"},
+ cache=cache
+)
+
+# First call will compute and cache the embedding
+embedding1 = vectorizer.embed(
+ text="your input query text here",
+ input_type="query"
+)
+
+# Second call will retrieve from cache
+embedding2 = vectorizer.embed(
+ text="your input query text here",
+ input_type="query"
+)
```
Initialize the VoyageAI vectorizer.
@@ -935,153 +677,17 @@ Visit [https://docs.voyageai.com/docs/embeddings](https://docs.voyageai.com/docs
* **dtype** (*str*) – the default datatype to use when embedding text as byte arrays.
Used when setting as_buffer=True in calls to embed() and embed_many().
Defaults to ‘float32’.
- * **dims** (*int* *|* *None*)
+ * **cache** (*Optional* *[*[*EmbeddingsCache*]({{< relref "cache/#embeddingscache" >}}) *]*) – Optional EmbeddingsCache instance to cache embeddings for
+ better performance with repeated texts. Defaults to None.
+ * **dims** (*Annotated* *[* *int* *|* *None* *,* *FieldInfo* *(* *annotation=NoneType* *,* *required=True* *,* *metadata=* *[* *Strict* *(* *strict=True* *)* *,* *Gt* *(* *gt=0* *)* *]* *)* *]*)
* **Raises:**
* **ImportError** – If the voyageai library is not installed.
* **ValueError** – If the API key is not provided.
-#### `aembed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Embed a chunk of text using the VoyageAI Embeddings API.
-
-Can provide the embedding input_type as a kwarg to this method
-that specifies the type of input you’re giving to the model. For retrieval/search use cases,
-we recommend specifying this argument when encoding queries or documents to enhance retrieval quality.
-Embeddings generated with and without the input_type argument are compatible.
-
-Supported input types are `document` and `query`
-
-When hydrating your Redis DB, the documents you want to search over
-should be embedded with input_type=”document” and when you are
-querying the database, you should set the input_type=”query”.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
- * **input_type** (*str*) – Specifies the type of input passed to the model.
- * **truncation** (*bool*) – Whether to truncate the input texts to fit within the context length.
- Check [https://docs.voyageai.com/docs/embeddings](https://docs.voyageai.com/docs/embeddings)
-* **Returns:**
- Embedding.
-* **Return type:**
- List[float]
-* **Raises:**
- **TypeError** – In an invalid input_type is provided.
-
-#### `aembed_many(texts, preprocess=None, batch_size=None, as_buffer=False, **kwargs)`
-
-Embed many chunks of text using the VoyageAI Embeddings API.
-
-Can provide the embedding input_type as a kwarg to this method
-that specifies the type of input you’re giving to the model. For retrieval/search use cases,
-we recommend specifying this argument when encoding queries or documents to enhance retrieval quality.
-Embeddings generated with and without the input_type argument are compatible.
-
-Supported input types are `document` and `query`
-
-When hydrating your Redis DB, the documents you want to search over
-should be embedded with input_type=”document” and when you are
-querying the database, you should set the input_type=”query”.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. .
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
- * **input_type** (*str*) – Specifies the type of input passed to the model.
- * **truncation** (*bool*) – Whether to truncate the input texts to fit within the context length.
- Check [https://docs.voyageai.com/docs/embeddings](https://docs.voyageai.com/docs/embeddings)
-* **Returns:**
- List of embeddings.
-* **Return type:**
- List[List[float]]
-* **Raises:**
- **TypeError** – In an invalid input_type is provided.
-
-#### `embed(text, preprocess=None, as_buffer=False, **kwargs)`
-
-Embed a chunk of text using the VoyageAI Embeddings API.
-
-Can provide the embedding input_type as a kwarg to this method
-that specifies the type of input you’re giving to the model. For retrieval/search use cases,
-we recommend specifying this argument when encoding queries or documents to enhance retrieval quality.
-Embeddings generated with and without the input_type argument are compatible.
-
-Supported input types are `document` and `query`
-
-When hydrating your Redis DB, the documents you want to search over
-should be embedded with input_type=”document” and when you are
-querying the database, you should set the input_type=”query”.
-
-* **Parameters:**
- * **text** (*str*) – Chunk of text to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
- * **input_type** (*str*) – Specifies the type of input passed to the model.
- * **truncation** (*bool*) – Whether to truncate the input texts to fit within the context length.
- Check [https://docs.voyageai.com/docs/embeddings](https://docs.voyageai.com/docs/embeddings)
-* **Returns:**
- Embedding as a list of floats, or as a bytes
- object if as_buffer=True
-* **Return type:**
- Union[List[float], bytes]
-* **Raises:**
- **TypeError** – If an invalid input_type is provided.
-
-#### `embed_many(texts, preprocess=None, batch_size=None, as_buffer=False, **kwargs)`
-
-Embed many chunks of text using the VoyageAI Embeddings API.
+#### `model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}`
-Can provide the embedding input_type as a kwarg to this method
-that specifies the type of input you’re giving to the model. For retrieval/search use cases,
-we recommend specifying this argument when encoding queries or documents to enhance retrieval quality.
-Embeddings generated with and without the input_type argument are compatible.
-
-Supported input types are `document` and `query`
-
-When hydrating your Redis DB, the documents you want to search over
-should be embedded with input_type=”document” and when you are
-querying the database, you should set the input_type=”query”.
-
-* **Parameters:**
- * **texts** (*List* *[* *str* *]*) – List of text chunks to embed.
- * **preprocess** (*Optional* *[* *Callable* *]* *,* *optional*) – Optional preprocessing callable to
- perform before vectorization. Defaults to None.
- * **batch_size** (*int* *,* *optional*) – Batch size of texts to use when creating
- embeddings. .
- * **as_buffer** (*bool* *,* *optional*) – Whether to convert the raw embedding
- to a byte string. Defaults to False.
- * **input_type** (*str*) – Specifies the type of input passed to the model.
- * **truncation** (*bool*) – Whether to truncate the input texts to fit within the context length.
- Check [https://docs.voyageai.com/docs/embeddings](https://docs.voyageai.com/docs/embeddings)
-* **Returns:**
- List of embeddings as lists of floats,
- or as bytes objects if as_buffer=True
-* **Return type:**
- Union[List[List[float]], List[bytes]]
-* **Raises:**
- **TypeError** – If an invalid input_type is provided.
-
-#### `model_post_init(context, /)`
-
-This function is meant to behave like a BaseModel method to initialise private attributes.
-
-It takes context as an argument since that’s what pydantic-core passes when calling it.
-
-* **Parameters:**
- * **self** (*BaseModel*) – The BaseModel instance.
- * **context** (*Any*) – The context.
-* **Return type:**
- None
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-#### `model_config: ClassVar[ConfigDict] = {}`
+#### `property type: str`
-Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+Return the type of vectorizer.
diff --git a/content/integrate/redisvl/overview/cli.md b/content/integrate/redisvl/overview/cli.md
index 8aab0f9e3e..123b78986f 100644
--- a/content/integrate/redisvl/overview/cli.md
+++ b/content/integrate/redisvl/overview/cli.md
@@ -19,7 +19,7 @@ Before running this notebook, be sure to
!rvl version
```
- 09:58:03 [RedisVL] INFO RedisVL version 0.5.2
+ 19:16:18 [RedisVL] INFO RedisVL version 0.5.2
## Commands
@@ -74,8 +74,7 @@ fields:
!rvl index create -s schema.yaml
```
- 18:12:32 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
- 18:12:32 [RedisVL] INFO Index created successfully
+ 19:16:21 [RedisVL] INFO Index created successfully
@@ -84,9 +83,8 @@ fields:
!rvl index listall
```
- 18:12:35 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
- 18:12:35 [RedisVL] INFO Indices:
- 18:12:35 [RedisVL] INFO 1. vectorizers
+ 19:16:24 [RedisVL] INFO Indices:
+ 19:16:24 [RedisVL] INFO 1. vectorizers
@@ -95,22 +93,21 @@ fields:
!rvl index info -i vectorizers
```
- 18:12:37 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
Index Information:
- ╭──────────────┬────────────────┬────────────┬─────────────────┬────────────╮
- │ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │
- ├──────────────┼────────────────┼────────────┼─────────────────┼────────────┤
- │ vectorizers │ HASH │ ['doc'] │ [] │ 0 │
- ╰──────────────┴────────────────┴────────────┴─────────────────┴────────────╯
+ ╭───────────────┬───────────────┬───────────────┬───────────────┬───────────────╮
+ │ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │
+ ├───────────────┼───────────────┼───────────────┼───────────────┼───────────────┤
+ │ vectorizers │ HASH │ ['doc'] │ [] │ 0 │
+ ╰───────────────┴───────────────┴───────────────┴───────────────┴───────────────╯
Index Fields:
- ╭───────────┬─────────────┬────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮
- │ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │
- ├───────────┼─────────────┼────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼─────────────────┼────────────────┤
- │ sentence │ sentence │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
- │ embedding │ embedding │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 768 │ distance_metric │ COSINE │
- ╰───────────┴─────────────┴────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴─────────────────┴────────────────╯
+ ╭─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────╮
+ │ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │
+ ├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
+ │ sentence │ sentence │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
+ │ embedding │ embedding │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 768 │ distance_metric │ COSINE │
+ ╰─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────╯
@@ -119,8 +116,7 @@ fields:
!rvl index delete -i vectorizers
```
- 18:12:40 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
- 18:12:40 [RedisVL] INFO Index deleted successfully
+ 19:16:29 [RedisVL] INFO Index deleted successfully
@@ -129,8 +125,7 @@ fields:
!rvl index listall
```
- 18:12:43 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
- 18:12:43 [RedisVL] INFO Indices:
+ 19:16:32 [RedisVL] INFO Indices:
## Stats
@@ -144,9 +139,7 @@ The ``rvl stats`` command will return some basic information about the index. Th
!rvl index create -s schema.yaml
```
- 18:13:21 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
- 18:13:21 redisvl.index.index INFO Index already exists, not overwriting.
- 18:13:21 [RedisVL] INFO Index created successfully
+ 19:16:35 [RedisVL] INFO Index created successfully
@@ -155,9 +148,8 @@ The ``rvl stats`` command will return some basic information about the index. Th
!rvl index listall
```
- 18:13:25 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
- 18:13:25 [RedisVL] INFO Indices:
- 18:13:25 [RedisVL] INFO 1. vectorizers
+ 19:16:38 [RedisVL] INFO Indices:
+ 19:16:38 [RedisVL] INFO 1. vectorizers
@@ -166,32 +158,31 @@ The ``rvl stats`` command will return some basic information about the index. Th
!rvl stats -i vectorizers
```
- 18:13:31 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
Statistics:
- ╭─────────────────────────────┬─────────╮
- │ Stat Key │ Value │
- ├─────────────────────────────┼─────────┤
- │ num_docs │ 0 │
- │ num_terms │ 0 │
- │ max_doc_id │ 0 │
- │ num_records │ 0 │
- │ percent_indexed │ 1 │
- │ hash_indexing_failures │ 3 │
- │ number_of_uses │ 2 │
- │ bytes_per_record_avg │ nan │
- │ doc_table_size_mb │ 0 │
- │ inverted_sz_mb │ 0 │
- │ key_table_size_mb │ 0 │
- │ offset_bits_per_record_avg │ nan │
- │ offset_vectors_sz_mb │ 0 │
- │ offsets_per_term_avg │ nan │
- │ records_per_doc_avg │ nan │
- │ sortable_values_size_mb │ 0 │
- │ total_indexing_time │ 0.02 │
- │ total_inverted_index_blocks │ 0 │
- │ vector_index_sz_mb │ 0 │
- ╰─────────────────────────────┴─────────╯
+ ╭─────────────────────────────┬────────────╮
+ │ Stat Key │ Value │
+ ├─────────────────────────────┼────────────┤
+ │ num_docs │ 0 │
+ │ num_terms │ 0 │
+ │ max_doc_id │ 0 │
+ │ num_records │ 0 │
+ │ percent_indexed │ 1 │
+ │ hash_indexing_failures │ 0 │
+ │ number_of_uses │ 1 │
+ │ bytes_per_record_avg │ nan │
+ │ doc_table_size_mb │ 0 │
+ │ inverted_sz_mb │ 0 │
+ │ key_table_size_mb │ 0 │
+ │ offset_bits_per_record_avg │ nan │
+ │ offset_vectors_sz_mb │ 0 │
+ │ offsets_per_term_avg │ nan │
+ │ records_per_doc_avg │ nan │
+ │ sortable_values_size_mb │ 0 │
+ │ total_indexing_time │ 0 │
+ │ total_inverted_index_blocks │ 0 │
+ │ vector_index_sz_mb │ 0.00818634 │
+ ╰─────────────────────────────┴────────────╯
## Optional arguments
@@ -215,9 +206,8 @@ By default rvl first checks if you have `REDIS_URL` environment variable defined
!rvl index listall --host localhost --port 6379
```
- 18:13:36 [RedisVL] INFO Using Redis address from environment variable, REDIS_URL
- 18:13:36 [RedisVL] INFO Indices:
- 18:13:36 [RedisVL] INFO 1. vectorizers
+ 19:16:43 [RedisVL] INFO Indices:
+ 19:16:43 [RedisVL] INFO 1. vectorizers
### Using SSL encryption
@@ -230,7 +220,13 @@ You can similarly specify the username and password to construct the full Redis
!rvl index listall --user jane_doe -a password123 --ssl
```
+ 19:16:46 [RedisVL] ERROR Error 8 connecting to rediss:6379. nodename nor servname provided, or not known.
+
+
```python
!rvl index destroy -i vectorizers
```
+
+ 19:16:49 [RedisVL] INFO Index deleted successfully
+
diff --git a/content/integrate/redisvl/user_guide/_index.md b/content/integrate/redisvl/user_guide/_index.md
index 8c13cb5184..83e9a976ea 100644
--- a/content/integrate/redisvl/user_guide/_index.md
+++ b/content/integrate/redisvl/user_guide/_index.md
@@ -36,6 +36,17 @@ User guides provide helpful resources for using RedisVL and its different compon
* [Utilize TTL](llmcache/#utilize-ttl)
* [Simple Performance Testing](llmcache/#simple-performance-testing)
* [Cache Access Controls, Tags & Filters](llmcache/#cache-access-controls-tags-filters)
+* [Caching Embeddings](embeddings_cache/)
+ * [Setup](embeddings_cache/#setup)
+ * [Initializing the EmbeddingsCache](embeddings_cache/#initializing-the-embeddingscache)
+ * [Basic Usage](embeddings_cache/#basic-usage)
+ * [Advanced Usage](embeddings_cache/#advanced-usage)
+ * [Async Support](embeddings_cache/#async-support)
+ * [Real-World Example](embeddings_cache/#real-world-example)
+ * [Performance Benchmark](embeddings_cache/#performance-benchmark)
+ * [Common Use Cases for Embedding Caching](embeddings_cache/#common-use-cases-for-embedding-caching)
+ * [Cleanup](embeddings_cache/#cleanup)
+ * [Summary](embeddings_cache/#summary)
* [Vectorizers](vectorizers/)
* [Creating Text Embeddings](vectorizers/#creating-text-embeddings)
* [Search with Provider Embeddings](vectorizers/#search-with-provider-embeddings)
@@ -49,16 +60,19 @@ User guides provide helpful resources for using RedisVL and its different compon
* [Cleanup](hash_vs_json/#id1)
* [Rerankers](rerankers/)
* [Simple Reranking](rerankers/#simple-reranking)
-* [LLM Session Memory](session_manager/)
- * [Managing multiple users and conversations](session_manager/#managing-multiple-users-and-conversations)
- * [Semantic conversation memory](session_manager/#semantic-conversation-memory)
- * [Conversation control](session_manager/#conversation-control)
+* [LLM Message History](message_history/)
+ * [Managing multiple users and conversations](message_history/#managing-multiple-users-and-conversations)
+ * [Semantic message history](message_history/#semantic-message-history)
+ * [Conversation control](message_history/#conversation-control)
* [Semantic Routing](semantic_router/)
* [Define the Routes](semantic_router/#define-the-routes)
* [Initialize the SemanticRouter](semantic_router/#initialize-the-semanticrouter)
* [Simple routing](semantic_router/#simple-routing)
* [Update the routing config](semantic_router/#update-the-routing-config)
* [Router serialization](semantic_router/#router-serialization)
+ * [Add route references](semantic_router/#add-route-references)
+ * [Get route references](semantic_router/#get-route-references)
+ * [Delete route references](semantic_router/#delete-route-references)
* [Clean up the router](semantic_router/#clean-up-the-router)
* [Threshold Optimization](threshold_optimization/)
* [CacheThresholdOptimizer](threshold_optimization/#cachethresholdoptimizer)
diff --git a/content/integrate/redisvl/user_guide/embeddings_cache.md b/content/integrate/redisvl/user_guide/embeddings_cache.md
new file mode 100644
index 0000000000..731dba6588
--- /dev/null
+++ b/content/integrate/redisvl/user_guide/embeddings_cache.md
@@ -0,0 +1,534 @@
+---
+linkTitle: Caching embeddings
+title: Caching Embeddings
+type: integration
+weight: 10
+---
+
+
+RedisVL provides an `EmbeddingsCache` that makes it easy to store and retrieve embedding vectors with their associated text and metadata. This cache is particularly useful for applications that frequently compute the same embeddings, enabling you to:
+
+- Reduce computational costs by reusing previously computed embeddings
+- Decrease latency in applications that rely on embeddings
+- Store additional metadata alongside embeddings for richer applications
+
+This notebook will show you how to use the `EmbeddingsCache` effectively in your applications.
+
+## Setup
+
+First, let's import the necessary libraries. We'll use a text embedding model from HuggingFace to generate our embeddings.
+
+
+```python
+import os
+import time
+import numpy as np
+
+# Disable tokenizers parallelism to avoid deadlocks
+os.environ["TOKENIZERS_PARALLELISM"] = "False"
+
+# Import the EmbeddingsCache
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+from redisvl.utils.vectorize import HFTextVectorizer
+```
+
+Let's create a vectorizer to generate embeddings for our texts:
+
+
+```python
+# Initialize the vectorizer
+vectorizer = HFTextVectorizer(
+ model="redis/langcache-embed-v1",
+ cache_folder=os.getenv("SENTENCE_TRANSFORMERS_HOME")
+)
+```
+
+## Initializing the EmbeddingsCache
+
+Now let's initialize our `EmbeddingsCache`. The cache requires a Redis connection to store the embeddings and their associated data.
+
+
+```python
+# Initialize the embeddings cache
+cache = EmbeddingsCache(
+ name="embedcache", # name prefix for Redis keys
+ redis_url="redis://localhost:6379", # Redis connection URL
+ ttl=None # Optional TTL in seconds (None means no expiration)
+)
+```
+
+## Basic Usage
+
+### Storing Embeddings
+
+Let's store some text with its embedding in the cache. The `set` method takes the following parameters:
+- `text`: The input text that was embedded
+- `model_name`: The name of the embedding model used
+- `embedding`: The embedding vector
+- `metadata`: Optional metadata associated with the embedding
+- `ttl`: Optional time-to-live override for this specific entry
+
+
+```python
+# Text to embed
+text = "What is machine learning?"
+model_name = "redis/langcache-embed-v1"
+
+# Generate the embedding
+embedding = vectorizer.embed(text)
+
+# Optional metadata
+metadata = {"category": "ai", "source": "user_query"}
+
+# Store in cache
+key = cache.set(
+ text=text,
+ model_name=model_name,
+ embedding=embedding,
+ metadata=metadata
+)
+
+print(f"Stored with key: {key[:15]}...")
+```
+
+ Stored with key: embedcache:909f...
+
+
+### Retrieving Embeddings
+
+To retrieve an embedding from the cache, use the `get` method with the original text and model name:
+
+
+```python
+# Retrieve from cache
+
+if result := cache.get(text=text, model_name=model_name):
+ print(f"Found in cache: {result['text']}")
+ print(f"Model: {result['model_name']}")
+ print(f"Metadata: {result['metadata']}")
+ print(f"Embedding shape: {np.array(result['embedding']).shape}")
+else:
+ print("Not found in cache.")
+```
+
+ Found in cache: What is machine learning?
+ Model: redis/langcache-embed-v1
+ Metadata: {'category': 'ai', 'source': 'user_query'}
+ Embedding shape: (768,)
+
+
+### Checking Existence
+
+You can check if an embedding exists in the cache without retrieving it using the `exists` method:
+
+
+```python
+# Check if existing text is in cache
+exists = cache.exists(text=text, model_name=model_name)
+print(f"First query exists in cache: {exists}")
+
+# Check if a new text is in cache
+new_text = "What is deep learning?"
+exists = cache.exists(text=new_text, model_name=model_name)
+print(f"New query exists in cache: {exists}")
+```
+
+ First query exists in cache: True
+ New query exists in cache: False
+
+
+### Removing Entries
+
+To remove an entry from the cache, use the `drop` method:
+
+
+```python
+# Remove from cache
+cache.drop(text=text, model_name=model_name)
+
+# Verify it's gone
+exists = cache.exists(text=text, model_name=model_name)
+print(f"After dropping: {exists}")
+```
+
+ After dropping: False
+
+
+## Advanced Usage
+
+### Key-Based Operations
+
+The `EmbeddingsCache` also provides methods that work directly with Redis keys, which can be useful for advanced use cases:
+
+
+```python
+# Store an entry again
+key = cache.set(
+ text=text,
+ model_name=model_name,
+ embedding=embedding,
+ metadata=metadata
+)
+print(f"Stored with key: {key[:15]}...")
+
+# Check existence by key
+exists_by_key = cache.exists_by_key(key)
+print(f"Exists by key: {exists_by_key}")
+
+# Retrieve by key
+result_by_key = cache.get_by_key(key)
+print(f"Retrieved by key: {result_by_key['text']}")
+
+# Drop by key
+cache.drop_by_key(key)
+```
+
+ Stored with key: embedcache:909f...
+ Exists by key: True
+ Retrieved by key: What is machine learning?
+
+
+### Batch Operations
+
+When working with multiple embeddings, batch operations can significantly improve performance by reducing network roundtrips. The `EmbeddingsCache` provides methods prefixed with `m` (for "multi") that handle batches efficiently.
+
+
+```python
+# Create multiple embeddings
+texts = [
+ "What is machine learning?",
+ "How do neural networks work?",
+ "What is deep learning?"
+]
+embeddings = [vectorizer.embed(t) for t in texts]
+
+# Prepare batch items as dictionaries
+batch_items = [
+ {
+ "text": texts[0],
+ "model_name": model_name,
+ "embedding": embeddings[0],
+ "metadata": {"category": "ai", "type": "question"}
+ },
+ {
+ "text": texts[1],
+ "model_name": model_name,
+ "embedding": embeddings[1],
+ "metadata": {"category": "ai", "type": "question"}
+ },
+ {
+ "text": texts[2],
+ "model_name": model_name,
+ "embedding": embeddings[2],
+ "metadata": {"category": "ai", "type": "question"}
+ }
+]
+
+# Store multiple embeddings in one operation
+keys = cache.mset(batch_items)
+print(f"Stored {len(keys)} embeddings with batch operation")
+
+# Check if multiple embeddings exist in one operation
+exist_results = cache.mexists(texts, model_name)
+print(f"All embeddings exist: {all(exist_results)}")
+
+# Retrieve multiple embeddings in one operation
+results = cache.mget(texts, model_name)
+print(f"Retrieved {len(results)} embeddings in one operation")
+
+# Delete multiple embeddings in one operation
+cache.mdrop(texts, model_name)
+
+# Alternative: key-based batch operations
+# cache.mget_by_keys(keys) # Retrieve by keys
+# cache.mexists_by_keys(keys) # Check existence by keys
+# cache.mdrop_by_keys(keys) # Delete by keys
+```
+
+ Stored 3 embeddings with batch operation
+ All embeddings exist: True
+ Retrieved 3 embeddings in one operation
+
+
+Batch operations are particularly beneficial when working with large numbers of embeddings. They provide the same functionality as individual operations but with better performance by reducing network roundtrips.
+
+For asynchronous applications, async versions of all batch methods are also available with the `am` prefix (e.g., `amset`, `amget`, `amexists`, `amdrop`).
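+
+As a quick sketch, and assuming the `am`-prefixed methods mirror the signatures of their synchronous counterparts shown above, an async batch workflow (reusing `texts`, `model_name`, and `batch_items` from the previous cell) might look like this:
+
+```python
+async def async_batch_demo():
+    # Store the whole batch in one roundtrip
+    keys = await cache.amset(batch_items)
+    print(f"Stored {len(keys)} embeddings asynchronously")
+
+    # Check and retrieve the batch without blocking the event loop
+    all_present = all(await cache.amexists(texts, model_name))
+    results = await cache.amget(texts, model_name)
+    print(f"All present: {all_present}, retrieved: {len(results)}")
+
+    # Remove the batch when done
+    await cache.amdrop(texts, model_name)
+
+await async_batch_demo()
+```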
+
+### Working with TTL (Time-To-Live)
+
+You can set a global TTL when initializing the cache, or specify TTL for individual entries:
+
+
+```python
+# Create a cache with a default 5-second TTL
+ttl_cache = EmbeddingsCache(
+ name="ttl_cache",
+ redis_url="redis://localhost:6379",
+ ttl=5 # 5 second TTL
+)
+
+# Store an entry
+key = ttl_cache.set(
+ text=text,
+ model_name=model_name,
+ embedding=embedding
+)
+
+# Check if it exists
+exists = ttl_cache.exists_by_key(key)
+print(f"Immediately after setting: {exists}")
+
+# Wait for it to expire
+time.sleep(6)
+
+# Check again
+exists = ttl_cache.exists_by_key(key)
+print(f"After waiting: {exists}")
+```
+
+ Immediately after setting: True
+ After waiting: False
+
+
+You can also override the default TTL for individual entries:
+
+
+```python
+# Store an entry with a custom 1-second TTL
+key1 = ttl_cache.set(
+ text="Short-lived entry",
+ model_name=model_name,
+ embedding=embedding,
+ ttl=1 # Override with 1 second TTL
+)
+
+# Store another entry with the default TTL (5 seconds)
+key2 = ttl_cache.set(
+ text="Default TTL entry",
+ model_name=model_name,
+ embedding=embedding
+ # No TTL specified = uses the default 5 seconds
+)
+
+# Wait for 2 seconds
+time.sleep(2)
+
+# Check both entries
+exists1 = ttl_cache.exists_by_key(key1)
+exists2 = ttl_cache.exists_by_key(key2)
+
+print(f"Entry with custom TTL after 2 seconds: {exists1}")
+print(f"Entry with default TTL after 2 seconds: {exists2}")
+
+# Cleanup
+ttl_cache.drop_by_key(key2)
+```
+
+ Entry with custom TTL after 2 seconds: False
+ Entry with default TTL after 2 seconds: True
+
+
+## Async Support
+
+The `EmbeddingsCache` provides async versions of all methods for use in async applications. The async methods are prefixed with `a` (e.g., `aset`, `aget`, `aexists`, `adrop`).
+
+
+```python
+async def async_cache_demo():
+ # Store an entry asynchronously
+ key = await cache.aset(
+ text="Async embedding",
+ model_name=model_name,
+ embedding=embedding,
+ metadata={"async": True}
+ )
+
+ # Check if it exists
+ exists = await cache.aexists_by_key(key)
+ print(f"Async set successful? {exists}")
+
+ # Retrieve it
+ result = await cache.aget_by_key(key)
+ success = result is not None and result["text"] == "Async embedding"
+ print(f"Async get successful? {success}")
+
+ # Remove it
+ await cache.adrop_by_key(key)
+
+# Run the async demo
+await async_cache_demo()
+```
+
+ Async set successful? True
+ Async get successful? True
+
+
+## Real-World Example
+
+Let's build a simple embeddings caching system for a text classification task. We'll check the cache before computing new embeddings to save computation time.
+
+
+```python
+# Create a fresh cache for this example
+example_cache = EmbeddingsCache(
+ name="example_cache",
+ redis_url="redis://localhost:6379",
+ ttl=3600 # 1 hour TTL
+)
+
+vectorizer = HFTextVectorizer(
+ model=model_name,
+ cache=example_cache,
+ cache_folder=os.getenv("SENTENCE_TRANSFORMERS_HOME")
+)
+
+# Simulate processing a stream of queries
+queries = [
+ "What is artificial intelligence?",
+ "How does machine learning work?",
+ "What is artificial intelligence?", # Repeated query
+ "What are neural networks?",
+ "How does machine learning work?" # Repeated query
+]
+
+# Process the queries and track statistics
+total_queries = 0
+cache_hits = 0
+
+for query in queries:
+ total_queries += 1
+
+ # Check cache before computing
+ before = example_cache.exists(text=query, model_name=model_name)
+ if before:
+ cache_hits += 1
+
+ # Get embedding (will compute or use cache)
+ embedding = vectorizer.embed(query)
+
+# Report statistics
+cache_misses = total_queries - cache_hits
+hit_rate = (cache_hits / total_queries) * 100
+
+print("\nStatistics:")
+print(f"Total queries: {total_queries}")
+print(f"Cache hits: {cache_hits}")
+print(f"Cache misses: {cache_misses}")
+print(f"Cache hit rate: {hit_rate:.1f}%")
+
+# Cleanup
+for query in set(queries): # Use set to get unique queries
+ example_cache.drop(text=query, model_name=model_name)
+```
+
+
+ Statistics:
+ Total queries: 5
+ Cache hits: 2
+ Cache misses: 3
+ Cache hit rate: 40.0%
+
+
+## Performance Benchmark
+
+Let's run benchmarks to compare the performance of embedding with and without caching, as well as batch versus individual operations.
+
+
+```python
+# Text to use for benchmarking
+benchmark_text = "This is a benchmark text to measure the performance of embedding caching."
+
+# Create a fresh cache for benchmarking
+benchmark_cache = EmbeddingsCache(
+ name="benchmark_cache",
+ redis_url="redis://localhost:6379",
+ ttl=3600 # 1 hour TTL
+)
+vectorizer.cache = benchmark_cache
+
+# Number of iterations for the benchmark
+n_iterations = 10
+
+# Benchmark without caching
+print("Benchmarking without caching:")
+start_time = time.time()
+for _ in range(n_iterations):
+    embedding = vectorizer.embed(benchmark_text, skip_cache=True)
+no_cache_time = time.time() - start_time
+print(f"Time taken without caching: {no_cache_time:.4f} seconds")
+print(f"Average time per embedding: {no_cache_time/n_iterations:.4f} seconds")
+
+# Benchmark with caching
+print("\nBenchmarking with caching:")
+start_time = time.time()
+for _ in range(n_iterations):
+    embedding = vectorizer.embed(benchmark_text)
+cache_time = time.time() - start_time
+print(f"Time taken with caching: {cache_time:.4f} seconds")
+print(f"Average time per embedding: {cache_time/n_iterations:.4f} seconds")
+
+# Compare performance
+speedup = no_cache_time / cache_time
+latency_reduction = (no_cache_time/n_iterations) - (cache_time/n_iterations)
+print(f"\nPerformance comparison:")
+print(f"Speedup with caching: {speedup:.2f}x faster")
+print(f"Time saved: {no_cache_time - cache_time:.4f} seconds ({(1 - cache_time/no_cache_time) * 100:.1f}%)")
+print(f"Latency reduction: {latency_reduction:.4f} seconds per query")
+```
+
+ Benchmarking without caching:
+ Time taken without caching: 0.4735 seconds
+ Average time per embedding: 0.0474 seconds
+
+ Benchmarking with caching:
+ Time taken with caching: 0.0663 seconds
+ Average time per embedding: 0.0066 seconds
+
+ Performance comparison:
+ Speedup with caching: 7.14x faster
+ Time saved: 0.4073 seconds (86.0%)
+ Latency reduction: 0.0407 seconds per query
+
+
+## Common Use Cases for Embedding Caching
+
+Embedding caching is particularly useful in the following scenarios:
+
+1. **Search applications**: Cache embeddings for frequently searched queries to reduce latency
+2. **Content recommendation systems**: Cache embeddings for content items to speed up similarity calculations
+3. **API services**: Reduce costs and improve response times when generating embeddings through paid APIs (see the sketch after this list)
+4. **Batch processing**: Speed up processing of datasets that contain duplicate texts
+5. **Chatbots and virtual assistants**: Cache embeddings for common user queries to provide faster responses
+6. **Development workflows**: Avoid recomputing embeddings for the same inputs while iterating on prompts, datasets, or models
+
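+As one concrete sketch of the API-services case above, an `EmbeddingsCache` can be passed to a paid-API vectorizer the same way it was passed to `HFTextVectorizer` earlier. The `OpenAITextVectorizer` wiring below is an assumption to verify against your RedisVL version:
+
+
+```python
+import os
+
+from redisvl.extensions.cache.embeddings import EmbeddingsCache
+from redisvl.utils.vectorize import OpenAITextVectorizer
+
+api_cache = EmbeddingsCache(
+    name="openai_embeddings_cache",
+    redis_url="redis://localhost:6379",
+    ttl=86400  # keep paid-API embeddings for one day
+)
+
+oai_vectorizer = OpenAITextVectorizer(
+    model="text-embedding-3-small",
+    api_config={"api_key": os.getenv("OPENAI_API_KEY")},
+    cache=api_cache,
+)
+
+# The first call hits the OpenAI API; an identical second call is served from Redis
+vec1 = oai_vectorizer.embed("What is artificial intelligence?")
+vec2 = oai_vectorizer.embed("What is artificial intelligence?")
+```
+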
+## Cleanup
+
+Let's clean up our caches to avoid leaving data in Redis:
+
+
+```python
+# Clean up all caches
+cache.clear()
+ttl_cache.clear()
+example_cache.clear()
+benchmark_cache.clear()
+```
+
+## Summary
+
+The `EmbeddingsCache` provides an efficient way to store and retrieve embeddings with their associated text and metadata. Key features include:
+
+- Simple API for storing and retrieving individual embeddings (`set`/`get`)
+- Batch operations for working with multiple embeddings efficiently (`mset`/`mget`/`mexists`/`mdrop`)
+- Support for metadata storage alongside embeddings
+- Configurable time-to-live (TTL) for cache entries
+- Key-based operations for advanced use cases
+- Async support for use in asynchronous applications
+- Significant performance improvements (15-20x faster with batch operations)
+
+By using the `EmbeddingsCache`, you can reduce computational costs and improve the performance of applications that rely on embeddings.
diff --git a/content/integrate/redisvl/user_guide/getting_started.md b/content/integrate/redisvl/user_guide/getting_started.md
index f8604fdc13..bd9f294c86 100644
--- a/content/integrate/redisvl/user_guide/getting_started.md
+++ b/content/integrate/redisvl/user_guide/getting_started.md
@@ -162,13 +162,6 @@ index = SearchIndex.from_dict(schema, redis_url="redis://localhost:6379", valida
# connect to Redis at the default address "redis://localhost:6379".
```
-
-
-
-
-
-
-
### Create the index
Now that we are connected to Redis, we need to run the create command.
@@ -188,8 +181,8 @@ Use the `rvl` CLI to inspect the created index and its fields:
!rvl index listall
```
- 10:59:25 [RedisVL] INFO Indices:
- 10:59:25 [RedisVL] INFO 1. user_simple
+ 19:17:09 [RedisVL] INFO Indices:
+ 19:17:09 [RedisVL] INFO 1. user_simple
@@ -200,21 +193,21 @@ Use the `rvl` CLI to inspect the created index and its fields:
Index Information:
- ╭──────────────┬────────────────┬──────────────────────┬─────────────────┬────────────╮
- │ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │
- ├──────────────┼────────────────┼──────────────────────┼─────────────────┼────────────┤
- │ user_simple │ HASH │ ['user_simple_docs'] │ [] │ 0 │
- ╰──────────────┴────────────────┴──────────────────────┴─────────────────┴────────────╯
+ ╭──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────╮
+ │ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │
+ ├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
+    │ user_simple          │ HASH                 │ ['user_simple_docs'] │ []                   │ 0                    │
+ ╰──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────╯
Index Fields:
- ╭────────────────┬────────────────┬─────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮
- │ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │
- ├────────────────┼────────────────┼─────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼─────────────────┼────────────────┤
- │ user │ user │ TAG │ SEPARATOR │ , │ │ │ │ │ │ │
- │ credit_score │ credit_score │ TAG │ SEPARATOR │ , │ │ │ │ │ │ │
- │ job │ job │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
- │ age │ age │ NUMERIC │ │ │ │ │ │ │ │ │
- │ user_embedding │ user_embedding │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 3 │ distance_metric │ COSINE │
- ╰────────────────┴────────────────┴─────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴─────────────────┴────────────────╯
+ ╭─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────╮
+ │ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │
+ ├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
+ │ user │ user │ TAG │ SEPARATOR │ , │ │ │ │ │ │ │
+ │ credit_score │ credit_score │ TAG │ SEPARATOR │ , │ │ │ │ │ │ │
+ │ job │ job │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
+ │ age │ age │ NUMERIC │ │ │ │ │ │ │ │ │
+ │ user_embedding │ user_embedding │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 3 │ distance_metric │ COSINE │
+ ╰─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────╯
## Load Data to `SearchIndex`
@@ -231,7 +224,7 @@ keys = index.load(data)
print(keys)
```
- ['user_simple_docs:01JQ9FEZ4GAAYT9W7BWAF7CV18', 'user_simple_docs:01JQ9FEZ4JCE5FD1D5QY6BAJ0J', 'user_simple_docs:01JQ9FEZ4KF9AZYBKMYNMYBZ5A']
+ ['user_simple_docs:01JT4PPPNJZMSK2395RKD208T9', 'user_simple_docs:01JT4PPPNM63J55ZESZ4TV1VR8', 'user_simple_docs:01JT4PPPNM59RCKS2YQ58B1HQW']
By default, `load` will create a unique Redis key as a combination of the index key `prefix` and a random ULID. You can also customize the key by providing direct keys or pointing to a specified `id_field` on load.
@@ -246,15 +239,15 @@ This will raise a `SchemaValidationError` if `validate_on_load` is set to true i
keys = index.load([{"user_embedding": True}])
```
- 11:00:03 redisvl.index.index ERROR Schema validation error while loading data
+ 19:17:21 redisvl.index.index ERROR Schema validation error while loading data
Traceback (most recent call last):
- File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py", line 204, in _preprocess_and_validate_objects
+ File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/storage.py", line 204, in _preprocess_and_validate_objects
processed_obj = self._validate(processed_obj)
- File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py", line 160, in _validate
+ File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/storage.py", line 160, in _validate
return validate_object(self.index_schema, obj)
- File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/schema/validation.py", line 274, in validate_object
+ File "/Users/justin.cechmanek/Documents/redisvl/redisvl/schema/validation.py", line 276, in validate_object
validated = model_class.model_validate(flat_obj)
- File "/Users/tyler.hutcherson/Library/Caches/pypoetry/virtualenvs/redisvl-VnTEShF2-py3.13/lib/python3.13/site-packages/pydantic/main.py", line 627, in model_validate
+ File "/Users/justin.cechmanek/.pyenv/versions/3.13/envs/redisvl-dev/lib/python3.13/site-packages/pydantic/main.py", line 627, in model_validate
return cls.__pydantic_validator__.validate_python(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
obj, strict=strict, from_attributes=from_attributes, context=context
@@ -269,7 +262,7 @@ keys = index.load([{"user_embedding": True}])
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
- File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/index.py", line 586, in load
+ File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/index.py", line 686, in load
return self._storage.write(
~~~~~~~~~~~~~~~~~~~^
self._redis_client, # type: ignore
@@ -279,13 +272,13 @@ keys = index.load([{"user_embedding": True}])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
- File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py", line 265, in write
+ File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/storage.py", line 265, in write
prepared_objects = self._preprocess_and_validate_objects(
list(objects), # Convert Iterable to List
...<3 lines>...
validate=validate,
)
- File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py", line 211, in _preprocess_and_validate_objects
+ File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/storage.py", line 211, in _preprocess_and_validate_objects
raise SchemaValidationError(str(e), index=i) from e
redisvl.exceptions.SchemaValidationError: Validation failed for object at index 0: 1 validation error for user_simple__PydanticModel
user_embedding
@@ -298,24 +291,24 @@ keys = index.load([{"user_embedding": True}])
ValidationError Traceback (most recent call last)
- File ~/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py:204, in BaseStorage._preprocess_and_validate_objects(self, objects, id_field, keys, preprocess, validate)
+ File ~/Documents/redisvl/redisvl/index/storage.py:204, in BaseStorage._preprocess_and_validate_objects(self, objects, id_field, keys, preprocess, validate)
203 if validate:
--> 204 processed_obj = self._validate(processed_obj)
206 # Store valid object with its key for writing
- File ~/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py:160, in BaseStorage._validate(self, obj)
+ File ~/Documents/redisvl/redisvl/index/storage.py:160, in BaseStorage._validate(self, obj)
159 # Pass directly to validation function and let any errors propagate
--> 160 return validate_object(self.index_schema, obj)
- File ~/Documents/AppliedAI/redis-vl-python/redisvl/schema/validation.py:274, in validate_object(schema, obj)
- 273 # Validate against model
- --> 274 validated = model_class.model_validate(flat_obj)
- 275 return validated.model_dump(exclude_none=True)
+ File ~/Documents/redisvl/redisvl/schema/validation.py:276, in validate_object(schema, obj)
+ 275 # Validate against model
+ --> 276 validated = model_class.model_validate(flat_obj)
+ 277 return validated.model_dump(exclude_none=True)
- File ~/Library/Caches/pypoetry/virtualenvs/redisvl-VnTEShF2-py3.13/lib/python3.13/site-packages/pydantic/main.py:627, in BaseModel.model_validate(cls, obj, strict, from_attributes, context)
+ File ~/.pyenv/versions/3.13/envs/redisvl-dev/lib/python3.13/site-packages/pydantic/main.py:627, in BaseModel.model_validate(cls, obj, strict, from_attributes, context)
626 __tracebackhide__ = True
--> 627 return cls.__pydantic_validator__.validate_python(
628 obj, strict=strict, from_attributes=from_attributes, context=context
@@ -333,34 +326,35 @@ keys = index.load([{"user_embedding": True}])
SchemaValidationError Traceback (most recent call last)
- Cell In[16], line 1
- ----> 1 keys = index.load([{"user_embedding": True}])
+ Cell In[31], line 3
+ 1 # NBVAL_SKIP
+ ----> 3 keys = index.load([{"user_embedding": True}])
- File ~/Documents/AppliedAI/redis-vl-python/redisvl/index/index.py:586, in SearchIndex.load(self, data, id_field, keys, ttl, preprocess, batch_size)
- 556 """Load objects to the Redis database. Returns the list of keys loaded
- 557 to Redis.
- 558
+ File ~/Documents/redisvl/redisvl/index/index.py:686, in SearchIndex.load(self, data, id_field, keys, ttl, preprocess, batch_size)
+ 656 """Load objects to the Redis database. Returns the list of keys loaded
+ 657 to Redis.
+ 658
(...)
- 583 RedisVLError: If there's an error loading data to Redis.
- 584 """
- 585 try:
- --> 586 return self._storage.write(
- 587 self._redis_client, # type: ignore
- 588 objects=data,
- 589 id_field=id_field,
- 590 keys=keys,
- 591 ttl=ttl,
- 592 preprocess=preprocess,
- 593 batch_size=batch_size,
- 594 validate=self._validate_on_load,
- 595 )
- 596 except SchemaValidationError:
- 597 # Pass through validation errors directly
- 598 logger.exception("Schema validation error while loading data")
-
-
- File ~/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py:265, in BaseStorage.write(self, redis_client, objects, id_field, keys, ttl, preprocess, batch_size, validate)
+ 683 RedisVLError: If there's an error loading data to Redis.
+ 684 """
+ 685 try:
+ --> 686 return self._storage.write(
+ 687 self._redis_client, # type: ignore
+ 688 objects=data,
+ 689 id_field=id_field,
+ 690 keys=keys,
+ 691 ttl=ttl,
+ 692 preprocess=preprocess,
+ 693 batch_size=batch_size,
+ 694 validate=self._validate_on_load,
+ 695 )
+ 696 except SchemaValidationError:
+ 697 # Pass through validation errors directly
+ 698 logger.exception("Schema validation error while loading data")
+
+
+ File ~/Documents/redisvl/redisvl/index/storage.py:265, in BaseStorage.write(self, redis_client, objects, id_field, keys, ttl, preprocess, batch_size, validate)
262 return []
264 # Pass 1: Preprocess and validate all objects
--> 265 prepared_objects = self._preprocess_and_validate_objects(
@@ -374,7 +368,7 @@ keys = index.load([{"user_embedding": True}])
274 added_keys = []
- File ~/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py:211, in BaseStorage._preprocess_and_validate_objects(self, objects, id_field, keys, preprocess, validate)
+ File ~/Documents/redisvl/redisvl/index/storage.py:211, in BaseStorage._preprocess_and_validate_objects(self, objects, id_field, keys, preprocess, validate)
207 prepared_objects.append((key, processed_obj))
209 except ValidationError as e:
210 # Convert Pydantic ValidationError to SchemaValidationError with index context
@@ -408,7 +402,7 @@ keys = index.load(new_data)
print(keys)
```
- ['user_simple_docs:01JQ9FHCB1B64GXF6WPK127VZ6']
+ ['user_simple_docs:01JT4PPX63CH5YRN2BGEYB5TS2']
## Creating `VectorQuery` Objects
@@ -522,7 +516,7 @@ index.schema.add_fields([
await index.create(overwrite=True, drop=False)
```
- 11:01:30 redisvl.index.index INFO Index already exists, overwriting.
+ 19:17:29 redisvl.index.index INFO Index already exists, overwriting.
@@ -546,29 +540,29 @@ Use the `rvl` CLI to check the stats for the index:
Statistics:
- ╭─────────────────────────────┬─────────────╮
- │ Stat Key │ Value │
- ├─────────────────────────────┼─────────────┤
- │ num_docs │ 4 │
- │ num_terms │ 0 │
- │ max_doc_id │ 4 │
- │ num_records │ 20 │
- │ percent_indexed │ 1 │
- │ hash_indexing_failures │ 0 │
- │ number_of_uses │ 2 │
- │ bytes_per_record_avg │ 47.8 │
- │ doc_table_size_mb │ 0.000423431 │
- │ inverted_sz_mb │ 0.000911713 │
- │ key_table_size_mb │ 0.000165939 │
- │ offset_bits_per_record_avg │ nan │
- │ offset_vectors_sz_mb │ 0 │
- │ offsets_per_term_avg │ 0 │
- │ records_per_doc_avg │ 5 │
- │ sortable_values_size_mb │ 0 │
- │ total_indexing_time │ 6.529 │
- │ total_inverted_index_blocks │ 11 │
- │ vector_index_sz_mb │ 0.235947 │
- ╰─────────────────────────────┴─────────────╯
+ ╭─────────────────────────────┬────────────╮
+ │ Stat Key │ Value │
+ ├─────────────────────────────┼────────────┤
+ │ num_docs │ 4 │
+ │ num_terms │ 0 │
+ │ max_doc_id │ 4 │
+ │ num_records │ 20 │
+ │ percent_indexed │ 1 │
+ │ hash_indexing_failures │ 0 │
+ │ number_of_uses │ 2 │
+ │ bytes_per_record_avg │ 48.2000007 │
+ │ doc_table_size_mb │ 4.23431396 │
+ │ inverted_sz_mb │ 9.19342041 │
+ │ key_table_size_mb │ 1.93595886 │
+ │ offset_bits_per_record_avg │ nan │
+ │ offset_vectors_sz_mb │ 0 │
+ │ offsets_per_term_avg │ 0 │
+ │ records_per_doc_avg │ 5 │
+ │ sortable_values_size_mb │ 0 │
+ │ total_indexing_time │ 0.74400001 │
+ │ total_inverted_index_blocks │ 11 │
+ │ vector_index_sz_mb │ 0.23560333 │
+ ╰─────────────────────────────┴────────────╯
## Cleanup
diff --git a/content/integrate/redisvl/user_guide/llmcache.md b/content/integrate/redisvl/user_guide/llmcache.md
index acdbaa6b20..4e162f19db 100644
--- a/content/integrate/redisvl/user_guide/llmcache.md
+++ b/content/integrate/redisvl/user_guide/llmcache.md
@@ -43,6 +43,7 @@ def ask_openai(question: str) -> str:
print(ask_openai("What is the capital of France?"))
```
+ 19:17:51 httpx INFO HTTP Request: POST https://api.openai.com/v1/completions "HTTP/1.1 200 OK"
The capital of France is Paris.
@@ -52,16 +53,22 @@ print(ask_openai("What is the capital of France?"))
```python
-from redisvl.extensions.llmcache import SemanticCache
+from redisvl.extensions.cache.llm import SemanticCache
+from redisvl.utils.vectorize import HFTextVectorizer
llmcache = SemanticCache(
- name="llmcache", # underlying search index name
- redis_url="redis://localhost:6379", # redis connection url string
- distance_threshold=0.1 # semantic cache distance threshold
+ name="llmcache", # underlying search index name
+ redis_url="redis://localhost:6379", # redis connection url string
+ distance_threshold=0.1, # semantic cache distance threshold
+    vectorizer=HFTextVectorizer("redis/langcache-embed-v1"), # embedding model
)
```
- 22:11:38 redisvl.index.index INFO Index already exists, not overwriting.
+ 19:17:51 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps
+ 19:17:51 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1
+
+
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 17.57it/s]
@@ -73,21 +80,21 @@ llmcache = SemanticCache(
Index Information:
- ╭──────────────┬────────────────┬──────────────┬─────────────────┬────────────╮
- │ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │
- ├──────────────┼────────────────┼──────────────┼─────────────────┼────────────┤
- │ llmcache │ HASH │ ['llmcache'] │ [] │ 0 │
- ╰──────────────┴────────────────┴──────────────┴─────────────────┴────────────╯
+ ╭───────────────┬───────────────┬───────────────┬───────────────┬───────────────╮
+ │ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │
+ ├───────────────┼───────────────┼───────────────┼───────────────┼───────────────┤
+    │ llmcache      │ HASH          │ ['llmcache']  │ []            │ 0             │
+ ╰───────────────┴───────────────┴───────────────┴───────────────┴───────────────╯
Index Fields:
- ╭───────────────┬───────────────┬─────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮
- │ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │
- ├───────────────┼───────────────┼─────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼─────────────────┼────────────────┤
- │ prompt │ prompt │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
- │ response │ response │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
- │ inserted_at │ inserted_at │ NUMERIC │ │ │ │ │ │ │ │ │
- │ updated_at │ updated_at │ NUMERIC │ │ │ │ │ │ │ │ │
- │ prompt_vector │ prompt_vector │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 768 │ distance_metric │ COSINE │
- ╰───────────────┴───────────────┴─────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴─────────────────┴────────────────╯
+ ╭─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────╮
+ │ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │
+ ├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
+ │ prompt │ prompt │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
+ │ response │ response │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
+ │ inserted_at │ inserted_at │ NUMERIC │ │ │ │ │ │ │ │ │
+ │ updated_at │ updated_at │ NUMERIC │ │ │ │ │ │ │ │ │
+ │ prompt_vector │ prompt_vector │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 768 │ distance_metric │ COSINE │
+ ╰─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────╯
## Basic Cache Usage
@@ -106,9 +113,14 @@ else:
print("Empty cache")
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 18.30it/s]
+
Empty cache
+
+
+
Our initial cache check should be empty since we have not yet stored anything in the cache. Below, store the `question`,
proper `response`, and any arbitrary `metadata` (as a python dictionary object) in the cache.
@@ -122,6 +134,9 @@ llmcache.store(
)
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 26.10it/s]
+
+
@@ -140,6 +155,9 @@ else:
print("Empty cache")
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 12.36it/s]
+
+
[{'prompt': 'What is the capital of France?', 'response': 'Paris', 'metadata': {'city': 'Paris', 'country': 'france'}, 'key': 'llmcache:115049a298532be2f181edb03f766770c0db84c22aff39003fec340deaec7545'}]
@@ -150,6 +168,9 @@ question = "What actually is the capital of France?"
llmcache.check(prompt=question)[0]['response']
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 12.22it/s]
+
+
@@ -167,7 +188,7 @@ Fortunately, you can seamlessly adjust the threshhold at any point like below:
```python
# Widen the semantic distance threshold
-llmcache.set_threshold(0.3)
+llmcache.set_threshold(0.5)
```
@@ -178,6 +199,9 @@ question = "What is the capital city of the country in Europe that also has a ci
llmcache.check(prompt=question)[0]['response']
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 19.20it/s]
+
+
@@ -194,6 +218,9 @@ llmcache.clear()
llmcache.check(prompt=question)
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 26.71it/s]
+
+
@@ -220,6 +247,9 @@ llmcache.store("This is a TTL test", "This is a TTL test response")
time.sleep(6)
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 20.45it/s]
+
+
```python
# confirm that the cache has cleared by now on it's own
@@ -228,9 +258,14 @@ result = llmcache.check("This is a TTL test")
print(result)
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 17.02it/s]
+
[]
+
+
+
```python
# Reset the TTL to null (long lived data)
@@ -275,7 +310,14 @@ print(f"Without caching, a call to openAI to answer this simple question took {e
llmcache.store(prompt=question, response="George Washington")
```
- Without caching, a call to openAI to answer this simple question took 0.9034533500671387 seconds.
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 14.88it/s]
+
+
+ 19:18:04 httpx INFO HTTP Request: POST https://api.openai.com/v1/completions "HTTP/1.1 200 OK"
+ Without caching, a call to openAI to answer this simple question took 0.8826751708984375 seconds.
+
+
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 18.38it/s]
@@ -301,8 +343,22 @@ print(f"Avg time taken with LLM cache enabled: {avg_time_with_cache}")
print(f"Percentage of time saved: {round(((end - start) - avg_time_with_cache) / (end - start) * 100, 2)}%")
```
- Avg time taken with LLM cache enabled: 0.09753389358520508
- Percentage of time saved: 89.2%
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 13.65it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 27.94it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 27.19it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 27.53it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 28.12it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 27.38it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 25.39it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 26.34it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 28.07it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 27.35it/s]
+
+ Avg time taken with LLM cache enabled: 0.0463670015335083
+ Percentage of time saved: 94.75%
+
+
+
@@ -313,29 +369,29 @@ print(f"Percentage of time saved: {round(((end - start) - avg_time_with_cache) /
Statistics:
- ╭─────────────────────────────┬─────────────╮
- │ Stat Key │ Value │
- ├─────────────────────────────┼─────────────┤
- │ num_docs │ 1 │
- │ num_terms │ 19 │
- │ max_doc_id │ 6 │
- │ num_records │ 53 │
- │ percent_indexed │ 1 │
- │ hash_indexing_failures │ 0 │
- │ number_of_uses │ 45 │
- │ bytes_per_record_avg │ 45.0566 │
- │ doc_table_size_mb │ 0.000134468 │
- │ inverted_sz_mb │ 0.00227737 │
- │ key_table_size_mb │ 2.76566e-05 │
- │ offset_bits_per_record_avg │ 8 │
- │ offset_vectors_sz_mb │ 3.91006e-05 │
- │ offsets_per_term_avg │ 0.773585 │
- │ records_per_doc_avg │ 53 │
- │ sortable_values_size_mb │ 0 │
- │ total_indexing_time │ 19.454 │
- │ total_inverted_index_blocks │ 21 │
- │ vector_index_sz_mb │ 3.0161 │
- ╰─────────────────────────────┴─────────────╯
+ ╭─────────────────────────────┬────────────╮
+ │ Stat Key │ Value │
+ ├─────────────────────────────┼────────────┤
+ │ num_docs │ 1 │
+ │ num_terms │ 19 │
+ │ max_doc_id │ 3 │
+ │ num_records │ 29 │
+ │ percent_indexed │ 1 │
+ │ hash_indexing_failures │ 0 │
+ │ number_of_uses │ 19 │
+ │ bytes_per_record_avg │ 75.9655151 │
+ │ doc_table_size_mb │ 1.34468078 │
+ │ inverted_sz_mb │ 0.00210094 │
+ │ key_table_size_mb │ 2.76565551 │
+ │ offset_bits_per_record_avg │ 8 │
+ │ offset_vectors_sz_mb │ 2.09808349 │
+ │ offsets_per_term_avg │ 0.75862067 │
+ │ records_per_doc_avg │ 29 │
+ │ sortable_values_size_mb │ 0 │
+ │ total_indexing_time │ 3.875 │
+ │ total_inverted_index_blocks │ 21 │
+ │ vector_index_sz_mb │ 3.01609802 │
+ ╰─────────────────────────────┴────────────╯
@@ -369,10 +425,20 @@ private_cache.store(
)
```
+ 19:18:07 [RedisVL] WARNING The default vectorizer has changed from `sentence-transformers/all-mpnet-base-v2` to `redis/langcache-embed-v1` in version 0.6.0 of RedisVL. For more information about this model, please refer to https://arxiv.org/abs/2504.02268 or visit https://huggingface.co/redis/langcache-embed-v1. To continue using the old vectorizer, please specify it explicitly in the constructor as: vectorizer=HFTextVectorizer(model='sentence-transformers/all-mpnet-base-v2')
+ 19:18:07 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps
+ 19:18:07 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1
+
+
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 8.98it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 24.89it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 26.95it/s]
- 'private_cache:5de9d651f802d9cc3f62b034ced3466bf886a542ce43fe1c2b4181726665bf9c'
+
+
+ 'private_cache:2831a0659fb888e203cd9fedb9f65681bfa55e4977c092ed1bf87d42d2655081'
@@ -392,10 +458,15 @@ response = private_cache.check(
print(f"found {len(response)} entry \n{response[0]['response']}")
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 27.98it/s]
+
found 1 entry
The number on file is 123-555-0000
+
+
+
```python
# Cleanup
@@ -438,10 +509,22 @@ complex_cache.store(
)
```
+ 19:18:09 [RedisVL] WARNING The default vectorizer has changed from `sentence-transformers/all-mpnet-base-v2` to `redis/langcache-embed-v1` in version 0.6.0 of RedisVL. For more information about this model, please refer to https://arxiv.org/abs/2504.02268 or visit https://huggingface.co/redis/langcache-embed-v1. To continue using the old vectorizer, please specify it explicitly in the constructor as: vectorizer=HFTextVectorizer(model='sentence-transformers/all-mpnet-base-v2')
+ 19:18:09 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps
+ 19:18:09 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1
+
+
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 13.54it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 16.76it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 21.82it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 28.80it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 21.04it/s]
+
- 'account_data:d48ebb3a2efbdbc17930a8c7559c548a58b562b2572ef0be28f0bb4ece2382e1'
+
+ 'account_data:944f89729b09ca46b99923d223db45e0bccf584cfd53fcaf87d2a58f072582d3'
@@ -464,10 +547,15 @@ print(f'found {len(response)} entry')
print(response[0]["response"])
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 28.15it/s]
+
found 1 entry
Your most recent transaction was for $350
+
+
+
```python
# Cleanup
diff --git a/content/integrate/redisvl/user_guide/message_history.md b/content/integrate/redisvl/user_guide/message_history.md
new file mode 100644
index 0000000000..61a4628034
--- /dev/null
+++ b/content/integrate/redisvl/user_guide/message_history.md
@@ -0,0 +1,200 @@
+---
+linkTitle: LLM message history
+title: LLM Message History
+type: integration
+weight: 07
+---
+
+
+Large Language Models are inherently stateless and have no knowledge of previous interactions with a user, or even of previous parts of the current conversation. While this may not be noticeable when asking simple questions, it becomes a hindrance when engaging in long-running conversations that rely on conversational context.
+
+The solution to this problem is to append the previous conversation history to each subsequent call to the LLM.
+
+This notebook shows how to use Redis to structure, store, and retrieve this conversational message history.
+
+
+```python
+from redisvl.extensions.message_history import MessageHistory
+chat_history = MessageHistory(name='student tutor')
+```
+
+ 12:24:11 redisvl.index.index INFO Index already exists, not overwriting.
+
+
+To align with common LLM APIs, Redis stores messages with `role` and `content` fields.
+The supported roles are "system", "user", and "llm".
+
+You can store messages one at a time or all at once.
+
+
+```python
+chat_history.add_message({"role":"system", "content":"You are a helpful geography tutor, giving simple and short answers to questions about European countries."})
+chat_history.add_messages([
+ {"role":"user", "content":"What is the capital of France?"},
+ {"role":"llm", "content":"The capital is Paris."},
+ {"role":"user", "content":"And what is the capital of Spain?"},
+ {"role":"llm", "content":"The capital is Madrid."},
+ {"role":"user", "content":"What is the population of Great Britain?"},
+ {"role":"llm", "content":"As of 2023 the population of Great Britain is approximately 67 million people."},]
+ )
+```
+
+At any point we can retrieve the recent history of the conversation. It will be ordered by entry time.
+
+
+```python
+context = chat_history.get_recent()
+for message in context:
+ print(message)
+```
+
+ {'role': 'llm', 'content': 'The capital is Paris.'}
+ {'role': 'user', 'content': 'And what is the capital of Spain?'}
+ {'role': 'llm', 'content': 'The capital is Madrid.'}
+ {'role': 'user', 'content': 'What is the population of Great Britain?'}
+ {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
+
+
+In many LLM flows the conversation progresses in a series of prompt-and-response pairs. Message history offers a convenience method, `store()`, to add these pairs in a single call.
+
+
+```python
+prompt = "what is the size of England compared to Portugal?"
+response = "England is larger in land area than Portal by about 15000 square miles."
+chat_history.store(prompt, response)
+
+context = chat_history.get_recent(top_k=6)
+for message in context:
+ print(message)
+```
+
+ {'role': 'user', 'content': 'And what is the capital of Spain?'}
+ {'role': 'llm', 'content': 'The capital is Madrid.'}
+ {'role': 'user', 'content': 'What is the population of Great Britain?'}
+ {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
+ {'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
+    {'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
+
+
+## Managing multiple users and conversations
+
+For applications that need to handle multiple conversations concurrently, Redis supports tagging messages with a `session_tag` to keep conversations separated.
+
+
+```python
+chat_history.add_message({"role":"system", "content":"You are a helpful algebra tutor, giving simple answers to math problems."}, session_tag='student two')
+chat_history.add_messages([
+ {"role":"user", "content":"What is the value of x in the equation 2x + 3 = 7?"},
+ {"role":"llm", "content":"The value of x is 2."},
+ {"role":"user", "content":"What is the value of y in the equation 3y - 5 = 7?"},
+ {"role":"llm", "content":"The value of y is 4."}],
+ session_tag='student two'
+ )
+
+for math_message in chat_history.get_recent(session_tag='student two'):
+ print(math_message)
+```
+
+ {'role': 'system', 'content': 'You are a helpful algebra tutor, giving simple answers to math problems.'}
+ {'role': 'user', 'content': 'What is the value of x in the equation 2x + 3 = 7?'}
+ {'role': 'llm', 'content': 'The value of x is 2.'}
+ {'role': 'user', 'content': 'What is the value of y in the equation 3y - 5 = 7?'}
+ {'role': 'llm', 'content': 'The value of y is 4.'}
+
+
+## Semantic message history
+For longer conversations, our list of messages keeps growing. Since LLMs are stateless, we have to keep passing this conversation history on each subsequent call to ensure the LLM has the correct context.
+
+A typical flow looks like this:
+```python
+# Schematic loop; LLM_api_call is a placeholder for your model call
+while True:
+    prompt = input('enter your next question')
+    context = chat_history.get_recent()
+    response = LLM_api_call(prompt=prompt, context=context)
+    chat_history.store(prompt, response)
+```
+
+This works, but as context keeps growing so too does our LLM token count, which increases latency and cost.
+
+Conversation histories can be truncated, but that can lead to losing relevant information that appeared early on.
+
+A better solution is to pass only the relevant conversational context on each subsequent call.
+
+For this, RedisVL provides `SemanticMessageHistory`, which uses vector similarity search to return only the semantically relevant sections of the conversation.
+
+
+```python
+from redisvl.extensions.message_history import SemanticMessageHistory
+semantic_history = SemanticMessageHistory(name='tutor')
+
+semantic_history.add_messages(chat_history.get_recent(top_k=8))
+```
+
+ 12:24:15 redisvl.index.index INFO Index already exists, not overwriting.
+
+
+
+```python
+prompt = "what have I learned about the size of England?"
+semantic_history.set_distance_threshold(0.35)
+context = semantic_history.get_relevant(prompt)
+for message in context:
+ print(message)
+```
+
+ {'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
+    {'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
+
+
+You can adjust the degree of semantic similarity needed to be included in your context.
+
+Setting a distance threshold close to 0.0 will require an exact semantic match, while a distance threshold of 1.0 will include everything.
+
+
+```python
+semantic_history.set_distance_threshold(0.7)
+
+larger_context = semantic_history.get_relevant(prompt)
+for message in larger_context:
+ print(message)
+```
+
+ {'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
+    {'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
+ {'role': 'user', 'content': 'What is the population of Great Britain?'}
+ {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
+
+
+## Conversation control
+
+LLMs can hallucinate on occasion. When this happens, it can be useful to prune the incorrect information from the conversation history so it isn't passed along as context on subsequent calls.
+
+
+```python
+semantic_history.store(
+ prompt="what is the smallest country in Europe?",
+ response="Monaco is the smallest country in Europe at 0.78 square miles." # Incorrect. Vatican City is the smallest country in Europe
+ )
+
+# get the key of the incorrect message
+context = semantic_history.get_recent(top_k=1, raw=True)
+bad_key = context[0]['entry_id']
+semantic_history.drop(bad_key)
+
+corrected_context = semantic_history.get_recent()
+for message in corrected_context:
+ print(message)
+```
+
+ {'role': 'user', 'content': 'What is the population of Great Britain?'}
+ {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
+ {'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
+    {'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
+ {'role': 'user', 'content': 'what is the smallest country in Europe?'}
+
+
+
+```python
+chat_history.clear()
+semantic_history.clear()
+```
diff --git a/content/integrate/redisvl/user_guide/semantic_router.md b/content/integrate/redisvl/user_guide/semantic_router.md
index 91b8ed66bb..74ea12c04e 100644
--- a/content/integrate/redisvl/user_guide/semantic_router.md
+++ b/content/integrate/redisvl/user_guide/semantic_router.md
@@ -38,7 +38,7 @@ technology = Route(
"what's trending in tech?"
],
metadata={"category": "tech", "priority": 1},
- distance_threshold=1.0
+ distance_threshold=0.71
)
sports = Route(
@@ -51,7 +51,7 @@ sports = Route(
"basketball and football"
],
metadata={"category": "sports", "priority": 2},
- distance_threshold=0.5
+ distance_threshold=0.72
)
entertainment = Route(
@@ -90,25 +90,14 @@ router = SemanticRouter(
)
```
- /Users/robert.shelton/.pyenv/versions/3.11.9/lib/python3.11/site-packages/huggingface_hub/file_download.py:1142: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
- warnings.warn(
- /Users/robert.shelton/.pyenv/versions/3.11.9/lib/python3.11/site-packages/huggingface_hub/file_download.py:1142: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
- warnings.warn(
+ 19:18:32 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps
+ 19:18:32 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2
- 14:07:31 redisvl.index.index INFO Index already exists, overwriting.
-
-
-
-```python
-router.vectorizer
-```
-
-
-
-
- HFTextVectorizer(model='sentence-transformers/all-mpnet-base-v2', dims=768)
-
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 17.78it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 37.43it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 27.28it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 48.76it/s]
@@ -120,58 +109,63 @@ router.vectorizer
Index Information:
- ╭──────────────┬────────────────┬──────────────────┬─────────────────┬────────────╮
- │ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │
- ├──────────────┼────────────────┼──────────────────┼─────────────────┼────────────┤
- │ topic-router │ HASH │ ['topic-router'] │ [] │ 0 │
- ╰──────────────┴────────────────┴──────────────────┴─────────────────┴────────────╯
+ ╭──────────────────┬──────────────────┬──────────────────┬──────────────────┬──────────────────╮
+ │ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │
+ ├──────────────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┤
+    │ topic-router     │ HASH             │ ['topic-router'] │ []               │ 0                │
+ ╰──────────────────┴──────────────────┴──────────────────┴──────────────────┴──────────────────╯
Index Fields:
- ╭────────────┬─────────────┬────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮
- │ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │
- ├────────────┼─────────────┼────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼─────────────────┼────────────────┤
- │ route_name │ route_name │ TAG │ SEPARATOR │ , │ │ │ │ │ │ │
- │ reference │ reference │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
- │ vector │ vector │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 768 │ distance_metric │ COSINE │
- ╰────────────┴─────────────┴────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴─────────────────┴────────────────╯
+ ╭─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────╮
+ │ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │
+ ├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
+ │ reference_id │ reference_id │ TAG │ SEPARATOR │ , │ │ │ │ │ │ │
+ │ route_name │ route_name │ TAG │ SEPARATOR │ , │ │ │ │ │ │ │
+ │ reference │ reference │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │
+ │ vector │ vector │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 768 │ distance_metric │ COSINE │
+ ╰─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────╯
-## Simple routing
-
```python
-# Query the router with a statement
-route_match = router("Can you tell me about the latest in artificial intelligence?")
-route_match
+router._index.info()["num_docs"]
```
- RouteMatch(name='technology', distance=0.119614303112)
+ 11
+## Simple routing
+
```python
-# Query the router with a statement and return a miss
-route_match = router("are aliens real?")
+# Query the router with a statement
+route_match = router("Can you tell me about the latest in artificial intelligence?")
route_match
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 6.40it/s]
+
- RouteMatch(name=None, distance=None)
+
+ RouteMatch(name='technology', distance=0.419145842393)
```python
-# Toggle the runtime distance threshold
-route_match = router("Which basketball team will win the NBA finals?")
+# Query the router with a statement and return a miss
+route_match = router("are aliens real?")
route_match
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 39.83it/s]
+
+
@@ -184,14 +178,18 @@ We can also route a statement to many routes and order them by distance:
```python
# Perform multi-class classification with route_many() -- toggle the max_k and the distance_threshold
-route_matches = router.route_many("Lebron James", max_k=3)
+route_matches = router.route_many("How is AI used in basketball?", max_k=3)
route_matches
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 40.50it/s]
+
+
- []
+ [RouteMatch(name='technology', distance=0.556493878365),
+ RouteMatch(name='sports', distance=0.671060125033)]
@@ -200,14 +198,18 @@ route_matches
# Toggle the aggregation method -- note the different distances in the result
from redisvl.extensions.router.schema import DistanceAggregationMethod
-route_matches = router.route_many("Lebron James", aggregation_method=DistanceAggregationMethod.min, max_k=3)
+route_matches = router.route_many("How is AI used in basketball?", aggregation_method=DistanceAggregationMethod.min, max_k=3)
route_matches
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 66.18it/s]
- []
+
+
+ [RouteMatch(name='technology', distance=0.556493878365),
+ RouteMatch(name='sports', distance=0.629264354706)]
@@ -230,10 +232,13 @@ route_matches = router.route_many("Lebron James")
route_matches
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 41.89it/s]
+
- []
+
+ [RouteMatch(name='sports', distance=0.663254022598)]
@@ -252,27 +257,25 @@ router.to_dict()
'references': ['what are the latest advancements in AI?',
'tell me about the newest gadgets',
"what's trending in tech?"],
- 'metadata': {'category': 'tech', 'priority': '1'},
- 'distance_threshold': 1.0},
+ 'metadata': {'category': 'tech', 'priority': 1},
+ 'distance_threshold': 0.71},
{'name': 'sports',
'references': ['who won the game last night?',
'tell me about the upcoming sports events',
"what's the latest in the world of sports?",
'sports',
'basketball and football'],
- 'metadata': {'category': 'sports', 'priority': '2'},
- 'distance_threshold': 0.5},
+ 'metadata': {'category': 'sports', 'priority': 2},
+ 'distance_threshold': 0.72},
{'name': 'entertainment',
'references': ['what are the top movies right now?',
'who won the best actor award?',
"what's new in the entertainment industry?"],
- 'metadata': {'category': 'entertainment', 'priority': '3'},
+ 'metadata': {'category': 'entertainment', 'priority': 3},
'distance_threshold': 0.7}],
'vectorizer': {'type': 'hf',
'model': 'sentence-transformers/all-mpnet-base-v2'},
- 'routing_config': {'distance_threshold': 0.5,
- 'max_k': 3,
- 'aggregation_method': 'min'}}
+ 'routing_config': {'max_k': 3, 'aggregation_method': 'min'}}
@@ -283,7 +286,16 @@ router2 = SemanticRouter.from_dict(router.to_dict(), redis_url="redis://localhos
assert router2.to_dict() == router.to_dict()
```
- 14:07:34 redisvl.index.index INFO Index already exists, not overwriting.
+ 19:18:38 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps
+ 19:18:38 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2
+
+
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 54.94it/s]
+
+ 19:18:40 redisvl.index.index INFO Index already exists, not overwriting.
+
+
+
@@ -298,7 +310,116 @@ router3 = SemanticRouter.from_yaml("router.yaml", redis_url="redis://localhost:6
assert router3.to_dict() == router2.to_dict() == router.to_dict()
```
- 14:07:34 redisvl.index.index INFO Index already exists, not overwriting.
+ 19:18:40 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps
+ 19:18:40 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2
+
+
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 18.77it/s]
+
+ 19:18:41 redisvl.index.index INFO Index already exists, not overwriting.
+
+
+
+
+
+## Add route references
+
+
+```python
+router.add_route_references(route_name="technology", references=["latest AI trends", "new tech gadgets"])
+```
+
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 13.22it/s]
+
+
+
+
+
+ ['topic-router:technology:f243fb2d073774e81c7815247cb3013794e6225df3cbe3769cee8c6cefaca777',
+ 'topic-router:technology:7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f']
+
+
+
+## Get route references
+
+
+```python
+# by route name
+refs = router.get_route_references(route_name="technology")
+refs
+```
+
+
+
+
+ [{'id': 'topic-router:technology:7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f',
+ 'reference_id': '7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f',
+ 'route_name': 'technology',
+ 'reference': 'new tech gadgets'},
+ {'id': 'topic-router:technology:f243fb2d073774e81c7815247cb3013794e6225df3cbe3769cee8c6cefaca777',
+ 'reference_id': 'f243fb2d073774e81c7815247cb3013794e6225df3cbe3769cee8c6cefaca777',
+ 'route_name': 'technology',
+ 'reference': 'latest AI trends'},
+ {'id': 'topic-router:technology:851f51cce5a9ccfbbcb66993908be6b7871479af3e3a4b139ad292a1bf7e0676',
+ 'reference_id': '851f51cce5a9ccfbbcb66993908be6b7871479af3e3a4b139ad292a1bf7e0676',
+ 'route_name': 'technology',
+ 'reference': 'what are the latest advancements in AI?'},
+ {'id': 'topic-router:technology:149a9c9919c58534aa0f369e85ad95ba7f00aa0513e0f81e2aff2ea4a717b0e0',
+ 'reference_id': '149a9c9919c58534aa0f369e85ad95ba7f00aa0513e0f81e2aff2ea4a717b0e0',
+ 'route_name': 'technology',
+ 'reference': "what's trending in tech?"},
+ {'id': 'topic-router:technology:85cc73a1437df27caa2f075a29c497e5a2e532023fbb75378aedbae80779ab37',
+ 'reference_id': '85cc73a1437df27caa2f075a29c497e5a2e532023fbb75378aedbae80779ab37',
+ 'route_name': 'technology',
+ 'reference': 'tell me about the newest gadgets'}]
+
+
+
+
+```python
+# by reference id
+refs = router.get_route_references(reference_ids=[refs[0]["reference_id"]])
+refs
+```
+
+
+
+
+ [{'id': 'topic-router:technology:7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f',
+ 'reference_id': '7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f',
+ 'route_name': 'technology',
+ 'reference': 'new tech gadgets'}]
+
+
+
+## Delete route references
+
+
+```python
+# by route name
+deleted_count = router.delete_route_references(route_name="sports")
+deleted_count
+```
+
+
+
+
+ 5
+
+
+
+
+```python
+# by id
+deleted_count = router.delete_route_references(reference_ids=[refs[0]["reference_id"]])
+deleted_count
+```
+
+
+
+
+ 1
+
## Clean up the router
diff --git a/content/integrate/redisvl/user_guide/threshold_optimization.md b/content/integrate/redisvl/user_guide/threshold_optimization.md
index 4598f8e658..43722609ef 100644
--- a/content/integrate/redisvl/user_guide/threshold_optimization.md
+++ b/content/integrate/redisvl/user_guide/threshold_optimization.md
@@ -19,12 +19,14 @@ Let's say you setup the following semantic cache with a distance_threshold of `X
```python
-from redisvl.extensions.llmcache import SemanticCache
+from redisvl.extensions.cache.llm import SemanticCache
+from redisvl.utils.vectorize import HFTextVectorizer
sem_cache = SemanticCache(
- name="sem_cache", # underlying search index name
- redis_url="redis://localhost:6379", # redis connection url string
- distance_threshold=0.5 # semantic cache distance threshold
+ name="sem_cache", # underlying search index name
+ redis_url="redis://localhost:6379", # redis connection url string
+ distance_threshold=0.5, # semantic cache distance threshold
+ vectorizer=HFTextVectorizer("redis/langcache-embed-v1") # embedding model
)
paris_key = sem_cache.store(prompt="what is the capital of france?", response="paris")
@@ -32,6 +34,20 @@ rabat_key = sem_cache.store(prompt="what is the capital of morocco?", response="
```
+ /Users/justin.cechmanek/.pyenv/versions/3.13/envs/redisvl-dev/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
+ from .autonotebook import tqdm as notebook_tqdm
+
+
+ 16:16:11 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps
+ 16:16:11 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1
+
+
+ Batches: 0%| | 0/1 [00:00, ?it/s]Compiling the model with `torch.compile` and using a `torch.mps` device is not supported. Falling back to non-compiled mode.
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 3.38it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 1.02it/s]
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 25.04it/s]
+
+
This works well but we want to make sure the cache only applies for the appropriate questions. If we test the cache with a question we don't want a response to we see that the current distance_threshold is too high.
@@ -39,15 +55,18 @@ This works well but we want to make sure the cache only applies for the appropri
sem_cache.check("what's the capital of britain?")
```
+ Batches: 100%|██████████| 1/1 [00:00<00:00, 1.24it/s]
+
+
[{'entry_id': 'c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3',
'prompt': 'what is the capital of france?',
'response': 'paris',
- 'vector_distance': 0.421104669571,
- 'inserted_at': 1741039231.99,
- 'updated_at': 1741039231.99,
+ 'vector_distance': 0.335606634617,
+ 'inserted_at': 1746051375.81,
+ 'updated_at': 1746051375.81,
'key': 'sem_cache:c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3'}]
@@ -102,7 +121,16 @@ print(f"Distance threshold after: {sem_cache.distance_threshold} \n")
Distance threshold before: 0.5
- Distance threshold after: 0.13050847457627118
+ Distance threshold after: 0.10372881355932204
@@ -113,6 +141,9 @@ We can also see that we no longer match on the incorrect example:
sem_cache.check("what's the capital of britain?")
```
@@ -127,15 +158,18 @@ But still match on highly relevant prompts:
sem_cache.check("what's the capital city of france?")
```
[{'entry_id': 'c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3',
'prompt': 'what is the capital of france?',
'response': 'paris',
- 'vector_distance': 0.0835866332054,
- 'inserted_at': 1741039231.99,
- 'updated_at': 1741039231.99,
+ 'vector_distance': 0.043138384819,
+ 'inserted_at': 1746051375.81,
+ 'updated_at': 1746051375.81,
'key': 'sem_cache:c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3'}]
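These distances line up with the optimized threshold: the close paraphrase sits well below it, while the off-topic query sits above it. A quick sanity check using the values reported above:

```python
# Distances taken from the outputs above; threshold is the optimized value.
threshold = sem_cache.distance_threshold  # ~0.1037 after optimization

assert 0.043138384819 < threshold  # "capital city of france" -> cache hit
assert 0.335606634617 > threshold  # "capital of britain" -> cache miss
```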
@@ -186,6 +220,15 @@ router = SemanticRouter(
)
```
### Provide test_data
@@ -244,8 +287,684 @@ optimizer.optimize()
Route thresholds before: {'greeting': 0.5, 'farewell': 0.5}
- Eval metric F1: start 0.438, end 0.719
- Ending thresholds: {'greeting': 1.0858585858585856, 'farewell': 0.5545454545454545}
+ Eval metric F1: start 0.438, end 0.812
+ Ending thresholds: {'greeting': 0.5828282828282831, 'farewell': 0.7545454545454545}
### Test it out
@@ -257,10 +976,13 @@ route_match = router("hi there")
route_match
```
- RouteMatch(name='greeting', distance=0.295984119177)
+ RouteMatch(name='greeting', distance=0.295984089375)
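A query that fits neither route should return an empty match. A short sketch (the no-match behavior is an assumption here: a `RouteMatch` with `name=None` indicates that no route cleared its threshold):

```python
# Off-topic query; neither "greeting" nor "farewell" should clear its
# threshold, so the router is expected to return an empty RouteMatch.
router("what is the weather like today?")
```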
diff --git a/content/integrate/redisvl/user_guide/vectorizers.md b/content/integrate/redisvl/user_guide/vectorizers.md
index fe538de131..b50689c024 100644
--- a/content/integrate/redisvl/user_guide/vectorizers.md
+++ b/content/integrate/redisvl/user_guide/vectorizers.md
@@ -481,7 +481,7 @@ This enables the use of custom vectorizers with other RedisVL components
```python
-from redisvl.extensions.llmcache import SemanticCache
+from redisvl.extensions.cache.llm import SemanticCache
cache = SemanticCache(name="custom_cache", vectorizer=custom_vectorizer)
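
# Illustrative usage only (assumes custom_vectorizer implements RedisVL's
# vectorizer interface): the cache embeds prompts with it transparently.
cache.store(prompt="this is a test prompt", response="this is a test response")
cache.check(prompt="this is a test prompt")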