Commit ca96355

DEV: enhance TAG docs per GH issues
1 parent 968fdda commit ca96355

2 files changed

+259
-31
lines changed

content/develop/ai/search-and-query/advanced-concepts/tags.md

Lines changed: 162 additions & 28 deletions
@@ -20,7 +20,16 @@ weight: 6
2020

2121
Tag fields provide exact match search capabilities with high performance and memory efficiency. Use tag fields when you need to filter documents by specific values without the complexity of full-text search tokenization.
2222

23-
Tag fields interpret text as a simple list of *tags* delimited by a [separator](#separator-options) character (comma "`,`" by default). This approach enables simpler [tokenization]({{< relref "/develop/ai/search-and-query/advanced-concepts/escaping/#tokenization-rules-for-tag-fields" >}}) and encoding, making tag indexes much more efficient than full-text indexes. Note: even though tag and text fields both use text, they are two separate field types and so you don't query them the same way.
23+
Tag fields interpret text as a simple list of *tags* delimited by a [separator](#separator-options) character. This approach enables simpler [tokenization]({{< relref "/develop/ai/search-and-query/advanced-concepts/escaping/#tokenization-rules-for-tag-fields" >}}) and encoding, making tag indexes much more efficient than full-text indexes. Note: even though tag and text fields both use text, they are two separate field types and so you don't query them the same way.
24+
25+
{{% alert title="Important: Different defaults for HASH vs JSON" color="warning" %}}
26+
- The default separator for hash documents is a comma (`,`).
27+
- There is no default separator for JSON documents. You must explicitly specify one if needed.
28+
29+
Specifying a tag from the text `"foo,bar"` behaves differently:
30+
- For hash documents, two tags are created: `"foo"` and `"bar"`.
31+
- For JSON documents, one tag is created: `"foo,bar"` (unless you add `SEPARATOR ","`).
32+
{{% /alert %}}
2433

2534
## Tag fields vs text fields
2635

@@ -69,9 +78,35 @@ FT.CREATE ... SCHEMA ... {field_name} TAG [SEPARATOR {sep}] [CASESENSITIVE]
6978

7079
### Separator options
7180

72-
- **Hash documents**: Default separator is comma (`,`). You can use any printable ASCII character
73-
- **JSON documents**: No default separator - you must specify one explicitly if needed
74-
- **Custom separators**: Use semicolon (`;`), pipe (`|`), or other characters as needed
81+
The separator behavior differs significantly between hash and JSON documents:
82+
83+
**Hash documents**
84+
85+
- The default separator is the comma (`,`).
86+
- Strings are automatically split at commas. For example,
87+
the string `"red,blue,green"` becomes three tags: `"red"`, `"blue"`, and `"green"`.
88+
- You can use any printable ASCII character as a custom separator.
89+
90+
**JSON documents**
91+
92+
- There is no default separator; it's effectively `null`.
93+
- The entire string is treated as a single tag unless you specify a separator with the `SEPARATOR` option. For example,
94+
the string `"red,blue,green"` becomes one tag: `"red,blue,green"`.
95+
- Add `SEPARATOR ","` to your schema to allow splitting.
96+
- Where possible, use JSON arrays instead of comma-separated strings.
97+
98+
**Why the difference?**
99+
100+
JSON has native array support, so the preferred approach is:
101+
102+
```json
103+
{"colors": ["red", "blue", "green"]} // Use with $.colors[*] AS colors TAG
104+
```
105+
Rather than:
106+
107+
```json
108+
{"colors": "red,blue,green"} // Requires SEPARATOR ","
109+
```
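
The splitting rules above can be approximated in a few lines of Python. This is only a sketch of the documented behavior, not the engine's tokenizer: `split_tags` is a hypothetical helper, and it assumes that whitespace around each tag is trimmed and that tags are lowercased unless the field is `CASESENSITIVE`.

```python
from typing import List, Optional


def split_tags(value: str, separator: Optional[str], case_sensitive: bool = False) -> List[str]:
    """Approximate how a TAG field turns a string into tags (illustrative only)."""
    # No separator (the JSON default): the whole string is one tag.
    parts = [value] if separator is None else value.split(separator)
    # Assumption: surrounding whitespace is trimmed and empty tags are dropped.
    tags = [p.strip() for p in parts if p.strip()]
    return tags if case_sensitive else [t.lower() for t in tags]


hash_tags = split_tags("red,blue,green", separator=",")   # hash default
json_tags = split_tags("red,blue,green", separator=None)  # JSON default
print(hash_tags)
print(json_tags)
```

Passing `separator=","` for a JSON field corresponds to adding `SEPARATOR ","` to the schema.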
75110

76111
### Case sensitivity
77112

@@ -80,33 +115,76 @@ FT.CREATE ... SCHEMA ... {field_name} TAG [SEPARATOR {sep}] [CASESENSITIVE]
80115

81116
### Examples
82117

83-
**Basic tag field with JSON:**
84-
```sql
85-
JSON.SET key:1 $ '{"colors": "red, orange, yellow"}'
86-
FT.CREATE idx ON JSON PREFIX 1 key: SCHEMA $.colors AS colors TAG SEPARATOR ","
87-
88-
> FT.SEARCH idx '@colors:{orange}'
89-
1) "1"
90-
2) "key:1"
91-
3) 1) "$"
92-
2) "{\"colors\":\"red, orange, yellow\"}"
93-
```
118+
**Hash examples**
94119

95-
**Case-sensitive tags with Hash:**
96-
```sql
97-
HSET product:1 categories "Electronics,Gaming,PC"
98-
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA categories TAG CASESENSITIVE
120+
1. Basic hash tag field (automatic comma splitting):
99121

100-
> FT.SEARCH products '@categories:{PC}'
101-
1) "1"
102-
2) "product:1"
103-
```
122+
```sql
123+
HSET product:1 categories "Electronics,Gaming,PC"
124+
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA categories TAG
104125

105-
**Custom separator:**
106-
```sql
107-
HSET book:1 genres "Fiction;Mystery;Thriller"
108-
FT.CREATE books ON HASH PREFIX 1 book: SCHEMA genres TAG SEPARATOR ";"
109-
```
126+
> FT.SEARCH products '@categories:{Gaming}'
127+
1) "1"
128+
2) "product:1"
129+
```
130+
131+
1. Hash with custom separator:
132+
133+
```sql
134+
HSET book:1 genres "Fiction;Mystery;Thriller"
135+
FT.CREATE books ON HASH PREFIX 1 book: SCHEMA genres TAG SEPARATOR ";"
136+
```
137+
138+
1. Case-sensitive hash tags:
139+
140+
```sql
141+
HSET product:1 categories "Electronics,Gaming,PC"
142+
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA categories TAG CASESENSITIVE
143+
144+
> FT.SEARCH products '@categories:{PC}' # Case matters
145+
1) "1"
146+
2) "product:1"
147+
```
148+
149+
**JSON examples**
150+
151+
1. JSON with string and explicit separator (not recommended):
152+
153+
```sql
154+
JSON.SET key:1 $ '{"colors": "red, orange, yellow"}'
155+
FT.CREATE idx ON JSON PREFIX 1 key: SCHEMA $.colors AS colors TAG SEPARATOR ","
156+
157+
> FT.SEARCH idx '@colors:{orange}'
158+
1) "1"
159+
2) "key:1"
160+
3) 1) "$"
161+
2) "{\"colors\":\"red, orange, yellow\"}"
162+
```
163+
164+
1. JSON with array of strings (recommended approach):
165+
166+
```sql
167+
JSON.SET key:1 $ '{"colors": ["red", "orange", "yellow"]}'
168+
FT.CREATE idx ON JSON PREFIX 1 key: SCHEMA $.colors[*] AS colors TAG
169+
170+
> FT.SEARCH idx '@colors:{orange}'
171+
1) "1"
172+
2) "key:1"
173+
3) 1) "$"
174+
2) "{\"colors\":[\"red\",\"orange\",\"yellow\"]}"
175+
```
176+
177+
1. JSON without separator (single tag):
178+
179+
```sql
180+
JSON.SET key:1 $ '{"category": "Electronics,Gaming"}'
181+
FT.CREATE idx ON JSON PREFIX 1 key: SCHEMA $.category AS category TAG
182+
# No SEPARATOR specified - entire string becomes one tag
183+
184+
> FT.SEARCH idx '@category:{Electronics\,Gaming}'  # Must match exactly; the comma is escaped in the query
185+
1) "1"
186+
2) "key:1"
187+
```
110188

111189
## Query tag fields
112190

@@ -271,6 +349,62 @@ FT.SEARCH products "@tags:{ Top\\ Rated\\ Product }"
271349

272350
See [Query syntax]({{< relref "/develop/ai/search-and-query/advanced-concepts/query_syntax#tag-filters" >}}) for complete escaping rules.
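
When queries are built in application code, escaping is easier to get right with a small helper. The sketch below is an assumption-laden illustration, not part of any client library: `escape_tag` is a hypothetical function that backslash-escapes every character that is not alphanumeric or an underscore, on the assumption that over-escaping is harmless. Check the escaping rules for your version before relying on it.

```python
def escape_tag(value: str) -> str:
    """Backslash-escape whitespace and punctuation in a tag value.

    Assumption: escaping any character that is not alphanumeric or an
    underscore is acceptable to the query parser.
    """
    return "".join(ch if ch.isalnum() or ch == "_" else "\\" + ch for ch in value)


# Build a tag filter for a multi-word tag.
query = "@tags:{%s}" % escape_tag("Top Rated Product")
print(query)
```

The string produced here contains single backslashes. The doubled backslashes in the `redis-cli` example above come from quoting inside a double-quoted CLI argument.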
273351

352+
## Performance and architecture considerations
353+
354+
### Multiple TAG fields versus a single TAG field
355+
356+
You can structure your data in two ways:
357+
358+
1. Multiple single-value TAG fields
359+
360+
```sql
361+
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA
362+
$.color AS color TAG
363+
$.brand AS brand TAG
364+
$.type AS type TAG
365+
366+
JSON.SET product:1 $ '{"color": "blue", "brand": "ASUS", "type": "laptop"}'
367+
368+
# Query specific fields
369+
FT.SEARCH products '@color:{blue} @brand:{ASUS}'
370+
```
371+
372+
1. Single multi-value TAG field
373+
374+
```sql
375+
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA
376+
$.tags[*] AS tags TAG
377+
378+
JSON.SET product:1 $ '{"tags": ["color:blue", "brand:ASUS", "type:laptop"]}'
379+
380+
# Query with prefixed values
381+
FT.SEARCH products '@tags:{color:blue} @tags:{brand:ASUS}'
382+
```
383+
384+
### Performance comparison
385+
386+
Both approaches have similar performance characteristics:
387+
388+
- Memory usage is comparable: TAG indexes are highly compressed regardless of structure.
389+
- Query speed is similar: both use the same underlying inverted index structure.
390+
- Index efficiency: TAG fields store only document IDs (1-2 bytes per entry).
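
To see why the two layouts behave alike, it can help to picture a tag index as a map from each tag to the set of documents that contain it. The following toy model is a sketch only: `build_tag_index` is a hypothetical helper, it uses plain key names instead of compressed numeric IDs, and it says nothing about the real on-disk or in-memory encoding.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_tag_index(docs: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each tag to the set of document keys that carry it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, tags in docs.items():
        for tag in tags:
            index[tag].add(doc_id)
    return index


docs = {"product:1": ["blue", "asus"], "product:2": ["blue", "acer"]}
index = build_tag_index(docs)

# An AND of two tag filters is an intersection of two posting sets.
matches = index["blue"] & index["asus"]
print(sorted(matches))
```

Whether the tags live in one field or several, a lookup is a set retrieval followed by an intersection, which is consistent with the similar query speed described above.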
391+
392+
### Choose TAG fields based on your use case
393+
394+
Use multiple TAG fields when:
395+
396+
- You need field-specific queries (`@color:{blue}` vs `@brand:{ASUS}`).
397+
- Your schema is stable and well-defined.
398+
- You want cleaner, more readable queries.
399+
- You need different configurations per field (for example, case-sensitive versus case-insensitive).
400+
401+
Use a single TAG field when:
402+
403+
- You have dynamic or unknown tag categories.
404+
- You want maximum flexibility for adding new tag types.
405+
- Your application manages tag prefixing/namespacing.
406+
- You have many sparse categorical fields.
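
If the application manages the namespacing, the prefixed values can be generated from ordinary attribute pairs. This is a minimal sketch under that assumption: `to_prefixed_tags` is a hypothetical helper, and the `category:value` convention is the one used in the example above rather than anything the index enforces.

```python
import json
from typing import Dict, List


def to_prefixed_tags(attributes: Dict[str, str]) -> List[str]:
    """Flatten category/value pairs into 'category:value' tag strings."""
    return [f"{name}:{value}" for name, value in attributes.items()]


attributes = {"color": "blue", "brand": "ASUS", "type": "laptop"}
document = {"tags": to_prefixed_tags(attributes)}

# JSON body suitable for storing under a key such as product:1.
payload = json.dumps(document)
print(payload)
```

New tag categories then require no schema change, only a new key in `attributes`.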
407+
274408
## An e-commerce use case
275409

276410
```sql

content/develop/ai/search-and-query/indexing/_index.md

Lines changed: 97 additions & 3 deletions
@@ -167,14 +167,71 @@ For more information about search queries, see [Search query syntax]({{< relref
167167
[`FT.SEARCH`]({{< relref "commands/ft.search/" >}}) queries require `attribute` modifiers. Don't use JSONPath expressions in queries because the query parser doesn't fully support them.
168168
{{% /alert %}}
169169

170+
## Understanding TAG field behavior: hash versus JSON
171+
172+
TAG fields behave differently depending on whether you're indexing hash or JSON documents. This difference is a common source of confusion.
173+
174+
### Hash documents
175+
176+
```sql
177+
# HASH: Comma is the default separator
178+
HSET product:1 category "Electronics,Gaming,PC"
179+
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA category TAG
180+
181+
# Result: Creates 3 separate tags: "Electronics", "Gaming", "PC"
182+
FT.SEARCH products '@category:{Gaming}' # ✅ Finds the document
183+
```
184+
185+
### JSON documents
186+
187+
```sql
188+
# JSON: No default separator - the entire string becomes one tag
189+
JSON.SET product:1 $ '{"category": "Electronics,Gaming,PC"}'
190+
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA $.category AS category TAG
191+
192+
# Result: Creates 1 tag: "Electronics,Gaming,PC"
193+
FT.SEARCH products '@category:{Gaming}' # ❌ Does NOT find the document
194+
FT.SEARCH products '@category:{Electronics\,Gaming\,PC}'  # ✅ Finds the document (commas escaped in the query)
195+
```
196+
197+
### Making JSON documents behave like hash documents
198+
199+
To get hash-like behavior in JSON, explicitly add `SEPARATOR ","`:
200+
201+
```sql
202+
JSON.SET product:1 $ '{"category": "Electronics,Gaming,PC"}'
203+
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA $.category AS category TAG SEPARATOR ","
204+
205+
# Result: Creates 3 separate tags: "Electronics", "Gaming", "PC"
206+
FT.SEARCH products '@category:{Gaming}' # ✅ Now finds the document
207+
```
208+
209+
### Recommended approach for JSON
210+
211+
Instead of comma-separated strings, use JSON arrays:
212+
213+
```sql
214+
JSON.SET product:1 $ '{"category": ["Electronics", "Gaming", "PC"]}'
215+
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA $.category[*] AS category TAG
216+
217+
# Result: Creates 3 separate tags: "Electronics", "Gaming", "PC"
218+
FT.SEARCH products '@category:{Gaming}' # ✅ Finds the document
219+
```
220+
170221
## Index JSON arrays as TAG
171222

172-
The preferred method for indexing a JSON field with multivalued terms is using JSON arrays. Each value of the array is indexed, and those values must be scalars. If you want to index string or boolean values as TAGs within a JSON array, use the [JSONPath]({{< relref "/develop/data-types/json/path" >}}) wildcard operator.
223+
For JSON documents, you have two approaches to create TAG fields with multiple values:
173224

174-
To index an item's list of available colors, specify the JSONPath `$.colors.*` in the `SCHEMA` definition during index creation:
225+
### Approach 1: JSON arrays (recommended)
226+
227+
The preferred method for indexing multiple tag values is using JSON arrays. Each array element becomes a separate tag value. Use the [JSONPath]({{< relref "/develop/data-types/json/path" >}}) wildcard operator `[*]` to index array elements.
175228

176229
```sql
177-
127.0.0.1:6379> FT.CREATE itemIdx2 ON JSON PREFIX 1 item: SCHEMA $.colors.* AS colors TAG $.name AS name TEXT $.description as description TEXT
230+
# Create index with array indexing
231+
127.0.0.1:6379> FT.CREATE itemIdx2 ON JSON PREFIX 1 item: SCHEMA $.colors[*] AS colors TAG $.name AS name TEXT $.description as description TEXT
232+
233+
# The JSON data uses arrays
234+
# Each array element ("black", "silver") becomes a separate tag
178235
```
179236

180237
Now you can search for silver headphones:
@@ -187,6 +244,43 @@ Now you can search for silver headphones:
187244
2) "{\"name\":\"Noise-cancelling Bluetooth headphones\",\"description\":\"Wireless Bluetooth headphones with noise-cancelling technology\",\"connection\":{\"wireless\":true,\"type\":\"Bluetooth\"},\"price\":99.98,\"stock\":25,\"colors\":[\"black\",\"silver\"]}"
188245
```
189246

247+
### Approach 2: strings with explicit separators
248+
249+
You can also use comma-separated strings, but you must explicitly specify the `SEPARATOR`:
250+
251+
```sql
252+
# JSON with comma-separated string
253+
JSON.SET item:1 $ '{"colors": "black,silver,gold"}'
254+
255+
# Index with explicit separator
256+
FT.CREATE itemIdx3 ON JSON PREFIX 1 item: SCHEMA $.colors AS colors TAG SEPARATOR ","
257+
258+
# Now you can search individual colors
259+
FT.SEARCH itemIdx3 "@colors:{silver}"
260+
```
261+
262+
{{% alert title="Important: JSON vs HASH behavior" color="warning" %}}
263+
- **JSON without SEPARATOR**: `"black,silver"` becomes one tag: `"black,silver"`.
264+
- **JSON with SEPARATOR ","**: `"black,silver"` becomes two tags: `"black"` and `"silver"`.
265+
- **Hash (default)**: `"black,silver"` becomes two tags: `"black"` and `"silver"`.
266+
267+
For JSON, always specify `SEPARATOR ","` if you want to split comma-separated strings, or use arrays instead.
268+
{{% /alert %}}
269+
270+
### Which approach to choose?
271+
272+
Use JSON arrays when:
273+
274+
- You control the data structure.
275+
- You want clean, structured data.
276+
- You need to store complex values (strings with spaces, punctuation).
277+
278+
Use strings with separators when:
279+
280+
- You're migrating from hashes to JSON.
281+
- You receive data as delimited strings.
282+
- You need compatibility with existing systems.
283+
190284
## Index JSON arrays as TEXT
191285
Starting with RediSearch v2.6.0, full text search can be done on an array of strings or on a JSONPath leading to multiple strings.
192286
