Commit 9acd518

update correct this time
1 parent 0e0a4f5 commit 9acd518

4 files changed

+383
-2
lines changed

content/develop/get-started/vector-database.md

Lines changed: 1 addition & 1 deletion
@@ -281,7 +281,7 @@ From the description, this bike is an excellent match for younger children, and
281281

282282
1. You can learn more about the query options, such as filters and vector range queries, by reading the [vector reference documentation]({{< relref "/develop/interact/search-and-query/advanced-concepts/vectors" >}}).
283283
2. The complete [Redis Query Engine documentation]({{< relref "/develop/interact/search-and-query/" >}}) might be interesting for you.
284-
3. If you want to follow the code examples more interactively, then you can use the [Jupyter notebook](https://github.com/RedisVentures/redis-vss-getting-started/blob/main/vector_similarity_with_redis.ipynb) that inspired this quick start guide.
284+
3. If you want to follow the code examples more interactively, then you can use the [Jupyter notebook](https://github.com/redis-developer/redis-ai-resources/blob/main/python-recipes/vector-search/00_redispy.ipynb) that inspired this quick start guide.
285285
4. If you want to see more advanced examples of a Redis vector database in action, visit the [Redis AI Resources](https://github.com/redis-developer/redis-ai-resources) page on GitHub.
286286

287287
## Continue learning with Redis University

content/integrate/amazon-bedrock/_index.md

Lines changed: 1 addition & 1 deletion
@@ -36,4 +36,4 @@ To fully set up Bedrock with Redis Cloud, you will need to do the following:
3636
## More info
3737

3838
- [Amazon Bedrock integration blog post](https://redis.io/blog/amazon-bedrock-integration-with-redis-enterprise/)
39-
- [Detailed steps](https://github.com/RedisVentures/aws-redis-bedrock-stack/blob/main/README.md)
39+
- [Detailed steps](https://github.com/redis-applied-ai/aws-redis-bedrock-stack/blob/main/README.md)
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
1+
---
2+
description: How to install RedisVL
3+
title: Install RedisVL
4+
type: integration
5+
---
6+
There are a few ways to install RedisVL. The easiest way is to use Python's `pip` command.
7+
8+
## Install RedisVL with pip
9+
10+
Install `redisvl` into your Python (>=3.8) environment using `pip`:
11+
12+
```bash
13+
$ pip install -U redisvl
14+
```
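To confirm that the install succeeded, you can query the installed package metadata. The following is a minimal sketch that uses only the standard library (`importlib.metadata`, available in Python 3.8 and later). The helper name `installed_version` is hypothetical and is not part of RedisVL.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if redisvl is installed in this environment, otherwise None.
print(installed_version("redisvl"))
```

If this prints `None`, the package is most likely installed into a different environment than the one running the interpreter.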
15+
16+
RedisVL comes with a few dependencies that are automatically installed. However, a few dependencies
17+
are optional and can be installed separately if needed:
18+
19+
```bash
20+
$ pip install redisvl[all] # install vectorizer dependencies
21+
$ pip install redisvl[dev] # install dev dependencies
22+
```
23+
24+
If you use Zsh, remember to escape the brackets:
25+
26+
```bash
27+
$ pip install redisvl\[all\]
28+
```
29+
30+
This library supports the use of [hiredis](https://redis.com/lp/hiredis/), so you can also install RedisVL by running:
31+
32+
```bash
33+
pip install redisvl[hiredis]
34+
```
35+
36+
## Install RedisVL from source
37+
38+
To install RedisVL from source, clone the repository and install the package using `pip`:
39+
40+
```bash
41+
$ git clone git@github.com:redis/redis-vl-python.git && cd redis-vl-python
42+
$ pip install .
43+
44+
# or for an editable installation (for developers of RedisVL)
45+
$ pip install -e .
46+
```
47+
48+
## Install Redis
49+
50+
RedisVL requires a distribution of Redis that supports the [search and query](https://redis.com/modules/redis-search/) capability, of which there are three:
51+
52+
1. [Redis Cloud](https://redis.com/try-free), a fully managed cloud offering that you can try for free.
53+
2. [Redis Stack]({{< relref "/operate/oss_and_stack/install/install-stack/docker" >}}), a local docker image for testing and development.
54+
3. [Redis Enterprise](https://redis.com/redis-enterprise/), a commercial self-hosted offering.
55+
56+
### Redis Cloud
57+
58+
Redis Cloud is the easiest way to get started with RedisVL. You can sign up for a free account [here](https://redis.com/try-free). Make sure to have the **Search and Query** capability enabled when creating your database.
59+
60+
### Redis Stack (local development)
61+
62+
For local development and testing, Redis Stack can be used. We recommend running Redis
63+
in a docker container. To do so, run the following command:
64+
65+
```bash
66+
docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
67+
```
68+
69+
This will also start the [Redis Insight application](https://redis.com/redis-enterprise/redis-insight/) at `http://localhost:8001`.
70+
71+
### Redis Enterprise (self-hosted)
72+
73+
Redis Enterprise is a commercial offering that can be self-hosted. You can download the latest version [here](https://redis.com/redis-enterprise-software/download-center/software/).
74+
75+
If you are considering a self-hosted Redis Enterprise deployment on Kubernetes, there is the [Redis Enterprise Operator](https://docs.redis.com/latest/kubernetes/) for Kubernetes. This will allow you to easily deploy and manage a Redis Enterprise cluster on Kubernetes.
Lines changed: 306 additions & 0 deletions
@@ -0,0 +1,306 @@
1+
---
2+
description: Storing JSON and hashes with RedisVL
3+
linkTitle: JSON vs. hash storage
4+
title: JSON vs. hash storage
5+
type: integration
6+
weight: 6
7+
---
8+
9+
Out of the box, Redis provides a [variety of data structures](https://redis.com/redis-enterprise/data-structures/) that can be used for your domain-specific applications and use cases.
10+
In this document, you will learn how to use RedisVL with both [hash]({{< relref "/develop/data-types/hashes" >}}) and [JSON]({{< relref "/develop/data-types/json/" >}}) data.
11+
12+
{{< note >}}
13+
This document is a converted form of [this Jupyter notebook](https://github.com/redis/redis-vl-python/blob/main/docs/user_guide/05_hash_vs_json.ipynb).
14+
{{< /note >}}
15+
16+
Before beginning, be sure of the following:
17+
18+
1. You have installed RedisVL and have that environment activated.
19+
1. You have a running Redis instance with the search and query capability.
20+
21+
```python
22+
# import necessary modules
23+
import pickle
24+
25+
from redisvl.redis.utils import buffer_to_array
26+
from jupyterutils import result_print, table_print
27+
from redisvl.index import SearchIndex
28+
29+
# load in the example data and printing utils
30+
with open("hybrid_example_data.pkl", "rb") as f:
    data = pickle.load(f)
31+
```
32+
33+
```python
34+
table_print(data)
35+
```
36+
37+
<table><tr><th>user</th><th>age</th><th>job</th><th>credit_score</th><th>office_location</th><th>user_embedding</th></tr><tr><td>john</td><td>18</td><td>engineer</td><td>high</td><td>-122.4194,37.7749</td><td>b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'</td></tr><tr><td>derrick</td><td>14</td><td>doctor</td><td>low</td><td>-122.4194,37.7749</td><td>b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'</td></tr><tr><td>nancy</td><td>94</td><td>doctor</td><td>high</td><td>-122.4194,37.7749</td><td>b'333?\xcd\xcc\xcc=\x00\x00\x00?'</td></tr><tr><td>tyler</td><td>100</td><td>engineer</td><td>high</td><td>-122.0839,37.3861</td><td>b'\xcd\xcc\xcc=\xcd\xcc\xcc>\x00\x00\x00?'</td></tr><tr><td>tim</td><td>12</td><td>dermatologist</td><td>high</td><td>-122.0839,37.3861</td><td>b'\xcd\xcc\xcc>\xcd\xcc\xcc>\x00\x00\x00?'</td></tr><tr><td>taimur</td><td>15</td><td>CEO</td><td>low</td><td>-122.0839,37.3861</td><td>b'\x9a\x99\x19?\xcd\xcc\xcc=\x00\x00\x00?'</td></tr><tr><td>joe</td><td>35</td><td>dentist</td><td>medium</td><td>-122.0839,37.3861</td><td>b'fff?fff?\xcd\xcc\xcc='</td></tr></table>
38+
39+
40+
## Hash or JSON - how to choose?
41+
42+
Both storage options offer a variety of features and tradeoffs. Below, you will work through a dummy dataset to learn when and how to use both data types.
43+
44+
### Working with hashes
45+
46+
Hashes in Redis are simple collections of field-value pairs. Think of a hash as a mutable, single-level dictionary in which each field-value pair is a "row":
47+
48+
```python
49+
{
50+
"model": "Deimos",
51+
"brand": "Ergonom",
52+
"type": "Enduro bikes",
53+
"price": 4972,
54+
}
55+
```
56+
57+
Hashes are best suited for use cases with the following characteristics:
58+
59+
- Performance (speed) and storage space (memory consumption) are top concerns.
60+
- Data can be easily normalized and modeled as a single-level dictionary.
61+
62+
> Hashes are typically the default recommendation.
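When your application data is nested, one way to make it fit a hash is to flatten it into a single-level dictionary first. The sketch below illustrates the idea in plain Python. The `flatten` helper and the underscore-joined field names are hypothetical choices for this example and are not something RedisVL does for you.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Collapse nested dictionaries into one level, joining key names with `sep`."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened fields.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


nested = {
    "name": "bike",
    "metadata": {
        "model": "Deimos",
        "brand": "Ergonom",
        "type": "Enduro bikes",
        "price": 4972,
    },
}

flat = flatten(nested)
print(flat)
```

If flattening like this feels forced, or the field names become hard to manage, that is usually a sign that the data is a better fit for JSON storage.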
63+
64+
```python
65+
# define the hash index schema
66+
hash_schema = {
67+
"index": {
68+
"name": "user-hash",
69+
"prefix": "user-hash-docs",
70+
"storage_type": "hash", # default setting -- HASH
71+
},
72+
"fields": [
73+
{"name": "user", "type": "tag"},
74+
{"name": "credit_score", "type": "tag"},
75+
{"name": "job", "type": "text"},
76+
{"name": "age", "type": "numeric"},
77+
{"name": "office_location", "type": "geo"},
78+
{
79+
"name": "user_embedding",
80+
"type": "vector",
81+
"attrs": {
82+
"dims": 3,
83+
"distance_metric": "cosine",
84+
"algorithm": "flat",
85+
"datatype": "float32"
86+
}
87+
}
88+
],
89+
}
90+
```
91+
92+
```python
93+
# construct a search index from the hash schema
94+
hindex = SearchIndex.from_dict(hash_schema)
95+
96+
# connect to local redis instance
97+
hindex.connect("redis://localhost:6379")
98+
99+
# create the index (no data yet)
100+
hindex.create(overwrite=True)
101+
```
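The connection URL above is hardcoded for a local instance. A small sketch of one way to make it configurable follows. The environment variable name `REDIS_URL` and the helper `get_redis_url` are assumptions made for this example, not RedisVL conventions.

```python
import os

# Fallback used when no environment variable is set (matches the local Docker setup).
DEFAULT_REDIS_URL = "redis://localhost:6379"


def get_redis_url() -> str:
    """Read the connection URL from the (hypothetical) REDIS_URL variable."""
    return os.environ.get("REDIS_URL", DEFAULT_REDIS_URL)


redis_url = get_redis_url()
print(redis_url)
```

The resulting string can then be passed to `hindex.connect(redis_url)` in place of the literal URL.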
102+
103+
```python
104+
# show the underlying storage type
105+
hindex.storage_type
106+
107+
<StorageType.HASH: 'hash'>
108+
```
109+
110+
#### Vectors as byte strings
111+
112+
One nuance when working with hashes in Redis is that all vectorized data must be passed as a byte string (for efficient storage, indexing, and processing). An example of this can be seen below:
113+
114+
115+
```python
116+
# show a single entry from the data that will be loaded
117+
data[0]
118+
119+
{'user': 'john',
120+
'age': 18,
121+
'job': 'engineer',
122+
'credit_score': 'high',
123+
'office_location': '-122.4194,37.7749',
124+
'user_embedding': b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'}
125+
```
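If your embeddings start out as Python lists of floats, they need to be packed into bytes before loading them into a hash. The following is a minimal sketch using the standard library `struct` module. It assumes little-endian `float32` values, which matches the `datatype` in the schema and the sample record shown above.

```python
import struct

embedding = [0.1, 0.1, 0.5]

# "<3f" means: three little-endian 32-bit floats, 4 bytes each.
as_bytes = struct.pack(f"<{len(embedding)}f", *embedding)

print(as_bytes)       # the same byte string as john's user_embedding
print(len(as_bytes))  # 12 bytes for a 3-dimensional float32 vector
```

With NumPy, `np.array(embedding, dtype=np.float32).tobytes()` produces the same bytes on a little-endian machine.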
126+
127+
```python
128+
# load hash data
129+
keys = hindex.load(data)
130+
```
131+
132+
```bash
133+
$ rvl stats -i user-hash
134+
135+
Statistics:
136+
╭─────────────────────────────┬─────────────╮
137+
│ Stat Key │ Value │
138+
├─────────────────────────────┼─────────────┤
139+
│ num_docs │ 7
140+
│ num_terms │ 6
141+
│ max_doc_id │ 7
142+
│ num_records │ 44
143+
│ percent_indexed │ 1
144+
│ hash_indexing_failures │ 0
145+
│ number_of_uses │ 1
146+
│ bytes_per_record_avg │ 3.40909
147+
│ doc_table_size_mb │ 0.000767708
148+
│ inverted_sz_mb │ 0.000143051
149+
│ key_table_size_mb │ 0.000248909
150+
│ offset_bits_per_record_avg │ 8
151+
│ offset_vectors_sz_mb │ 8.58307e-06
152+
│ offsets_per_term_avg │ 0.204545
153+
│ records_per_doc_avg │ 6.28571
154+
│ sortable_values_size_mb │ 0
155+
│ total_indexing_time │ 0.587
156+
│ total_inverted_index_blocks │ 18
157+
│ vector_index_sz_mb │ 0.0202332
158+
╰─────────────────────────────┴─────────────╯
159+
```
160+
161+
#### Performing queries
162+
163+
Once the index is created and data is loaded into the right format, you can run queries against the index:
164+
165+
```python
166+
from redisvl.query import VectorQuery
167+
from redisvl.query.filter import Tag, Text, Num
168+
169+
t = (Tag("credit_score") == "high") & (Text("job") % "enginee*") & (Num("age") > 17)
170+
171+
v = VectorQuery([0.1, 0.1, 0.5],
172+
"user_embedding",
173+
return_fields=["user", "credit_score", "age", "job", "office_location"],
174+
filter_expression=t)
175+
176+
177+
results = hindex.query(v)
178+
result_print(results)
179+
180+
```
181+
182+
<table><tr><th>vector_distance</th><th>user</th><th>credit_score</th><th>age</th><th>job</th><th>office_location</th></tr><tr><td>0</td><td>john</td><td>high</td><td>18</td><td>engineer</td><td>-122.4194,37.7749</td></tr><tr><td>0.109129190445</td><td>tyler</td><td>high</td><td>100</td><td>engineer</td><td>-122.0839,37.3861</td></tr></table>
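To see why only `john` and `tyler` are returned, it can help to express the filter in plain Python. The sketch below applies roughly the same three conditions to the sample records from the table at the top of this document. It is a simplified illustration, not how Redis evaluates the query: in particular, `str.startswith` only approximates the `enginee*` prefix match on a text field.

```python
# Sample records copied from the example data (embeddings omitted).
records = [
    {"user": "john", "age": 18, "job": "engineer", "credit_score": "high"},
    {"user": "derrick", "age": 14, "job": "doctor", "credit_score": "low"},
    {"user": "nancy", "age": 94, "job": "doctor", "credit_score": "high"},
    {"user": "tyler", "age": 100, "job": "engineer", "credit_score": "high"},
    {"user": "tim", "age": 12, "job": "dermatologist", "credit_score": "high"},
    {"user": "taimur", "age": 15, "job": "CEO", "credit_score": "low"},
    {"user": "joe", "age": 35, "job": "dentist", "credit_score": "medium"},
]


def passes_filter(record: dict) -> bool:
    """Approximate (Tag == "high") & (Text % "enginee*") & (Num > 17)."""
    return (
        record["credit_score"] == "high"
        and record["job"].lower().startswith("enginee")
        and record["age"] > 17
    )


matches = [r["user"] for r in records if passes_filter(r)]
print(matches)
```

The vector search then ranks only these matching documents by distance to the query vector.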
183+
184+
```python
185+
# clean up
186+
hindex.delete()
187+
```
188+
189+
### Working with JSON
190+
191+
Redis also supports native **JSON** objects. These can be multi-level (nested) objects, with full [JSONPath]({{< relref "/develop/data-types/json/" >}}path/) support for retrieving and updating sub-elements:
192+
193+
```python
194+
{
195+
"name": "bike",
196+
"metadata": {
197+
"model": "Deimos",
198+
"brand": "Ergonom",
199+
"type": "Enduro bikes",
200+
"price": 4972,
201+
}
202+
}
203+
```
204+
205+
JSON is best suited for use cases with the following characteristics:
206+
207+
- Ease of use and data model flexibility are top concerns.
208+
- Application data is already native JSON.
209+
- Replacing another document storage/database solution.
210+
211+
#### Full JSONPath support
212+
213+
Because Redis supports full JSONPath, each field in a JSON index schema is indexed and selected by its path. Give every field the `name` you want to query by and a `path` that points to where the data is located within the objects.
214+
215+
{{< note >}}
216+
By default, RedisVL will assume the path as `$.{name}` if not provided in JSON fields schema.
217+
{{< /note >}}
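The sketch below illustrates the default path rule and what a path such as `$.metadata.price` refers to in a nested document. It is plain Python for illustration only: the helpers `field_path` and `lookup` are hypothetical, `lookup` handles only simple dotted paths (not full JSONPath), and placing `path` directly in the field dictionary is an assumption about the schema layout.

```python
from typing import Any, Dict

doc = {
    "name": "bike",
    "metadata": {"model": "Deimos", "brand": "Ergonom", "price": 4972},
}

fields = [
    {"name": "name", "type": "tag"},  # no path given, so the default applies
    {"name": "price", "type": "numeric", "path": "$.metadata.price"},
]


def field_path(field: Dict[str, Any]) -> str:
    """Return the explicit path, or fall back to the default `$.{name}`."""
    return field.get("path", f"$.{field['name']}")


def lookup(document: Dict[str, Any], path: str) -> Any:
    """Resolve a simple dotted path like `$.a.b` against a nested dict."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    node: Any = document
    for part in path[2:].split("."):
        node = node[part]
    return node


for field in fields:
    path = field_path(field)
    print(field["name"], path, lookup(doc, path))
```

Fields that live at the top level of the document can rely on the default, while nested fields need an explicit `path`.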
218+
219+
```python
220+
# define the json index schema
221+
json_schema = {
222+
"index": {
223+
"name": "user-json",
224+
"prefix": "user-json-docs",
225+
"storage_type": "json", # JSON storage type
226+
},
227+
"fields": [
228+
{"name": "user", "type": "tag"},
229+
{"name": "credit_score", "type": "tag"},
230+
{"name": "job", "type": "text"},
231+
{"name": "age", "type": "numeric"},
232+
{"name": "office_location", "type": "geo"},
233+
{
234+
"name": "user_embedding",
235+
"type": "vector",
236+
"attrs": {
237+
"dims": 3,
238+
"distance_metric": "cosine",
239+
"algorithm": "flat",
240+
"datatype": "float32"
241+
}
242+
}
243+
],
244+
}
245+
```
246+
247+
```python
248+
# construct a search index from the JSON schema
249+
jindex = SearchIndex.from_dict(json_schema)
250+
251+
# connect to a local redis instance
252+
jindex.connect("redis://localhost:6379")
253+
254+
# create the index (no data yet)
255+
jindex.create(overwrite=True)
256+
```
257+
258+
```bash
259+
# note the multiple indices in the same database
260+
$ rvl index listall
261+
262+
20:23:08 [RedisVL] INFO Indices:
263+
20:23:08 [RedisVL] INFO 1. user-json
```
264+
265+
#### Vectors as float arrays
266+
267+
Vectorized data stored in JSON must be stored as a pure array (e.g., a Python list) of floats. Modify your sample data to account for this below:
268+
269+
```python
270+
import numpy as np
271+
272+
json_data = [dict(d) for d in data]  # copy each record so the byte-string originals in `data` stay unchanged
273+
274+
for d in json_data:
275+
d['user_embedding'] = buffer_to_array(d['user_embedding'], dtype=np.float32)
276+
```
277+
278+
```python
279+
# inspect a single JSON record
280+
json_data[0]
281+
```
282+
283+
{'user': 'john',
284+
'age': 18,
285+
'job': 'engineer',
286+
'credit_score': 'high',
287+
'office_location': '-122.4194,37.7749',
288+
'user_embedding': [0.10000000149011612, 0.10000000149011612, 0.5]}
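The long decimals in `user_embedding` are expected: `0.1` cannot be represented exactly as a 32-bit float, and the stored value is shown at full precision once it is widened to a Python float. The following standard-library sketch decodes the same bytes by hand. It assumes little-endian `float32` data and illustrates the conversion only; it is not RedisVL's implementation of `buffer_to_array`.

```python
import struct

raw = b"\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?"

# Each float32 occupies 4 bytes, so a 12-byte buffer holds 3 values.
count = len(raw) // 4
values = list(struct.unpack(f"<{count}f", raw))

print(values)
```

The decoded list matches the `user_embedding` array in the record above.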
289+
290+
291+
```python
292+
keys = jindex.load(json_data)
293+
```
294+
295+
```python
296+
# we can now run the exact same query as above
297+
result_print(jindex.query(v))
298+
```
299+
300+
<table><tr><th>vector_distance</th><th>user</th><th>credit_score</th><th>age</th><th>job</th><th>office_location</th></tr><tr><td>0</td><td>john</td><td>high</td><td>18</td><td>engineer</td><td>-122.4194,37.7749</td></tr><tr><td>0.109129190445</td><td>tyler</td><td>high</td><td>100</td><td>engineer</td><td>-122.0839,37.3861</td></tr></table>
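The `vector_distance` values are the same for both storage types because only the encoding of the vector differs, not the vector itself. As a sanity check, the sketch below recomputes the distances in plain Python, assuming that the `cosine` metric reports `1 - cosine similarity`. The vector for `tyler` is his byte string from the sample data decoded as little-endian `float32`, rounded to one decimal place.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [0.1, 0.1, 0.5]
john = [0.1, 0.1, 0.5]   # identical to the query vector
tyler = [0.1, 0.4, 0.5]  # decoded from tyler's user_embedding bytes

print(round(cosine_distance(query, john), 6))
print(round(cosine_distance(query, tyler), 6))
```

Small differences in the trailing digits compared to the table are expected, because Redis computes on `float32` values while this sketch uses Python's 64-bit floats.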
301+
302+
## Cleanup
303+
304+
```python
305+
jindex.delete()
306+
```
