Merged
190 changes: 162 additions & 28 deletions content/develop/ai/search-and-query/advanced-concepts/tags.md
@@ -20,7 +20,16 @@ weight: 6

Tag fields provide exact match search capabilities with high performance and memory efficiency. Use tag fields when you need to filter documents by specific values without the complexity of full-text search tokenization.

Tag fields interpret text as a simple list of *tags* delimited by a [separator](#separator-options) character (comma "`,`" by default). This approach enables simpler [tokenization]({{< relref "/develop/ai/search-and-query/advanced-concepts/escaping/#tokenization-rules-for-tag-fields" >}}) and encoding, making tag indexes much more efficient than full-text indexes. Note: even though tag and text fields both use text, they are two separate field types and so you don't query them the same way.
Tag fields interpret text as a simple list of *tags* delimited by a [separator](#separator-options) character. This approach enables simpler [tokenization]({{< relref "/develop/ai/search-and-query/advanced-concepts/escaping/#tokenization-rules-for-tag-fields" >}}) and encoding, making tag indexes much more efficient than full-text indexes. Note: even though tag and text fields both use text, they are two separate field types and so you don't query them the same way.

{{% alert title="Important: Different defaults for HASH vs JSON" color="warning" %}}
- The default separator for hash documents is a comma (`,`).
- There is no default separator for JSON documents. You must explicitly specify one if needed.

Indexing the text `"foo,bar"` as a tag field produces different results:
- For hash documents, two tags are created: `"foo"` and `"bar"`.
- For JSON documents, one tag is created: `"foo,bar"` (unless you add `SEPARATOR ","`).
{{% /alert %}}

## Tag fields vs text fields

@@ -69,9 +78,35 @@ FT.CREATE ... SCHEMA ... {field_name} TAG [SEPARATOR {sep}] [CASESENSITIVE]

### Separator options

- **Hash documents**: Default separator is comma (`,`). You can use any printable ASCII character
- **JSON documents**: No default separator - you must specify one explicitly if needed
- **Custom separators**: Use semicolon (`;`), pipe (`|`), or other characters as needed
The separator behavior differs significantly between hash and JSON documents:

**Hash documents**

- The default separator is the comma (`,`).
- Strings are automatically split at commas. For example,
the string `"red,blue,green"` becomes three tags: `"red"`, `"blue"`, and `"green"`.
- You can use any printable ASCII character as a custom separator.

**JSON documents**

- There is no default separator; it's effectively `null`.
- The entire string is treated as a single tag unless you specify a separator with the `SEPARATOR` option. For example,
the string `"red,blue,green"` becomes one tag: `"red,blue,green"`.
- Add `SEPARATOR ","` to your schema to allow splitting.
- Where possible, use JSON arrays instead of comma-separated strings.

**Why the difference?**

JSON has native array support, so the preferred approach is to store tags as an array and index it with `$.colors[*] AS colors TAG`:

```json
{"colors": ["red", "blue", "green"]}
```

Rather than a delimited string, which requires `SEPARATOR ","`:

```json
{"colors": "red,blue,green"}
```
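
The splitting rules in this section can be modeled in a few lines of Python. This is a simplified sketch of the documented behavior, not RediSearch's implementation: the function name `split_tags` and its parameters are hypothetical, and it assumes that surrounding whitespace is trimmed and that tags are lowercased unless case sensitivity is requested.

```python
from typing import List, Optional

HASH_DEFAULT_SEPARATOR = ","   # documented default for hash documents
JSON_DEFAULT_SEPARATOR = None  # JSON documents have no default separator


def split_tags(value: str, separator: Optional[str],
               case_sensitive: bool = False) -> List[str]:
    """Model how one string value turns into tags (illustrative only)."""
    # Without a separator, the whole string is a single tag.
    parts = [value] if separator is None else value.split(separator)
    # Assumption: surrounding whitespace is trimmed and empty tags are dropped.
    tags = [part.strip() for part in parts if part.strip()]
    # Assumption: tags are case-insensitive unless CASESENSITIVE is set.
    return tags if case_sensitive else [tag.lower() for tag in tags]


print(split_tags("red,blue,green", HASH_DEFAULT_SEPARATOR))  # hash default
print(split_tags("red,blue,green", JSON_DEFAULT_SEPARATOR))  # JSON default
print(split_tags("red,blue,green", ","))                     # JSON with SEPARATOR ","
```

The first and third calls yield three tags, while the second yields the single tag `"red,blue,green"`, which mirrors the hash and JSON behavior described above.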

### Case sensitivity

@@ -80,33 +115,76 @@ FT.CREATE ... SCHEMA ... {field_name} TAG [SEPARATOR {sep}] [CASESENSITIVE]

### Examples

**Basic tag field with JSON:**
```sql
JSON.SET key:1 $ '{"colors": "red, orange, yellow"}'
FT.CREATE idx ON JSON PREFIX 1 key: SCHEMA $.colors AS colors TAG SEPARATOR ","

> FT.SEARCH idx '@colors:{orange}'
1) "1"
2) "key:1"
3) 1) "$"
2) "{\"colors\":\"red, orange, yellow\"}"
```
**Hash examples**

**Case-sensitive tags with Hash:**
```sql
HSET product:1 categories "Electronics,Gaming,PC"
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA categories TAG CASESENSITIVE
1. Basic hash tag field (automatic comma splitting):

> FT.SEARCH products '@categories:{PC}'
1) "1"
2) "product:1"
```
```sql
HSET product:1 categories "Electronics,Gaming,PC"
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA categories TAG

**Custom separator:**
```sql
HSET book:1 genres "Fiction;Mystery;Thriller"
FT.CREATE books ON HASH PREFIX 1 book: SCHEMA genres TAG SEPARATOR ";"
```
> FT.SEARCH products '@categories:{Gaming}'
1) "1"
2) "product:1"
```

1. Hash with custom separator:

```sql
HSET book:1 genres "Fiction;Mystery;Thriller"
FT.CREATE books ON HASH PREFIX 1 book: SCHEMA genres TAG SEPARATOR ";"
```

1. Case-sensitive hash tags:

```sql
HSET product:1 categories "Electronics,Gaming,PC"
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA categories TAG CASESENSITIVE

> FT.SEARCH products '@categories:{PC}' # Case matters
1) "1"
2) "product:1"
```

**JSON examples**

1. JSON with string and explicit separator (not recommended):

```sql
JSON.SET key:1 $ '{"colors": "red, orange, yellow"}'
FT.CREATE idx ON JSON PREFIX 1 key: SCHEMA $.colors AS colors TAG SEPARATOR ","

> FT.SEARCH idx '@colors:{orange}'
1) "1"
2) "key:1"
3) 1) "$"
2) "{\"colors\":\"red, orange, yellow\"}"
```

1. JSON with array of strings (recommended approach):

```sql
JSON.SET key:1 $ '{"colors": ["red", "orange", "yellow"]}'
FT.CREATE idx ON JSON PREFIX 1 key: SCHEMA $.colors[*] AS colors TAG

> FT.SEARCH idx '@colors:{orange}'
1) "1"
2) "key:1"
3) 1) "$"
2) "{\"colors\":[\"red\",\"orange\",\"yellow\"]}"
```

1. JSON without separator (single tag):

```sql
JSON.SET key:1 $ '{"category": "Electronics,Gaming"}'
FT.CREATE idx ON JSON PREFIX 1 key: SCHEMA $.category AS category TAG
# No SEPARATOR specified - entire string becomes one tag

> FT.SEARCH idx '@category:{Electronics,Gaming}' # Must match exactly
1) "1"
2) "key:1"
```
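
If an application builds its index definition programmatically, the choice between the JSON variants above could be made from a sample value. The sketch below is one possible way to do that. `tag_schema_args` is a hypothetical helper, not part of any Redis client library, and it only assembles the argument list that would follow `SCHEMA` in `FT.CREATE`.

```python
from typing import Any, List


def tag_schema_args(field: str, sample_value: Any, separator: str = ",") -> List[str]:
    """Return FT.CREATE schema arguments for one JSON tag field (hypothetical helper)."""
    if isinstance(sample_value, list):
        # Arrays: index every element with the [*] wildcard; no separator needed.
        return [f"$.{field}[*]", "AS", field, "TAG"]
    if isinstance(sample_value, str) and separator in sample_value:
        # Delimited string: JSON has no default separator, so state it explicitly.
        return [f"$.{field}", "AS", field, "TAG", "SEPARATOR", separator]
    # Any other value: index it as a single tag.
    return [f"$.{field}", "AS", field, "TAG"]


print(" ".join(tag_schema_args("colors", ["red", "orange", "yellow"])))
print(" ".join(tag_schema_args("colors", "red, orange, yellow")))
```

Note that this helper always adds a separator for delimited strings. If the whole string should remain one tag, as in the last example above, omit `SEPARATOR` instead.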

## Query tag fields

@@ -271,6 +349,62 @@ FT.SEARCH products "@tags:{ Top\\ Rated\\ Product }"

See [Query syntax]({{< relref "/develop/ai/search-and-query/advanced-concepts/query_syntax#tag-filters" >}}) for complete escaping rules.

## Performance and architecture considerations

### Multiple TAG fields versus a single TAG field

You can structure your data in two ways:

1. Multiple single-value TAG fields

```sql
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA
$.color AS color TAG
$.brand AS brand TAG
$.type AS type TAG

JSON.SET product:1 $ '{"color": "blue", "brand": "ASUS", "type": "laptop"}'

# Query specific fields
FT.SEARCH products '@color:{blue} @brand:{ASUS}'
```

1. Single multi-value TAG field

```sql
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA
$.tags[*] AS tags TAG

JSON.SET product:1 $ '{"tags": ["color:blue", "brand:ASUS", "type:laptop"]}'

# Query with prefixed values
FT.SEARCH products '@tags:{color:blue} @tags:{brand:ASUS}'
```

### Performance comparison

Both approaches have similar performance characteristics:

- Memory usage is comparable: TAG indexes are highly compressed regardless of structure.
- Query speed is similar: both use the same underlying inverted index structure.
- Index efficiency: TAG fields store only document IDs (1-2 bytes per entry).

### Choose TAG fields based on your use case

Use multiple TAG fields when:

- You need field-specific queries (`@color:{blue}` vs `@brand:{ASUS}`).
- Your schema is stable and well-defined.
- You want cleaner, more readable queries.
- You need different configurations per field (for example, case-sensitive versus case-insensitive).

Use a single TAG field when:

- You have dynamic or unknown tag categories.
- You want maximum flexibility for adding new tag types.
- Your application manages tag prefixing/namespacing.
- You have many sparse categorical fields.
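
When the application manages tag prefixing itself, the multi-value document from the single-field example can be derived from ordinary key-value attributes. The following is a minimal sketch of that step. The helper name `to_prefixed_tags` and the `name:value` convention are illustrative assumptions taken from the example above, not a RediSearch requirement.

```python
import json
from typing import Dict, List


def to_prefixed_tags(attributes: Dict[str, str]) -> List[str]:
    """Flatten attributes into 'name:value' tags (hypothetical convention)."""
    return [f"{name}:{value}" for name, value in attributes.items()]


attributes = {"color": "blue", "brand": "ASUS", "type": "laptop"}
document = {"tags": to_prefixed_tags(attributes)}

# The serialized document is what would be passed to JSON.SET.
print(json.dumps(document))
```

Adding a new category then only requires a new key in `attributes`, with no change to the index schema.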

## An e-commerce use case

```sql
100 changes: 97 additions & 3 deletions content/develop/ai/search-and-query/indexing/_index.md
@@ -167,14 +167,71 @@ For more information about search queries, see [Search query syntax]({{< relref
[`FT.SEARCH`]({{< relref "commands/ft.search/" >}}) queries require `attribute` modifiers. Don't use JSONPath expressions in queries because the query parser doesn't fully support them.
{{% /alert %}}

## Understanding TAG field behavior: hash versus JSON

TAG fields behave differently depending on whether you're indexing hash or JSON documents. This difference is a common source of confusion.

### Hash documents

```sql
# HASH: Comma is the default separator
HSET product:1 category "Electronics,Gaming,PC"
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA category TAG

# Result: Creates 3 separate tags: "Electronics", "Gaming", "PC"
FT.SEARCH products '@category:{Gaming}' # ✅ Finds the document
```

### JSON documents

```sql
# JSON: No default separator - the entire string becomes one tag
JSON.SET product:1 $ '{"category": "Electronics,Gaming,PC"}'
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA $.category AS category TAG

# Result: Creates 1 tag: "Electronics,Gaming,PC"
FT.SEARCH products '@category:{Gaming}' # ❌ Does NOT find the document
FT.SEARCH products '@category:{Electronics,Gaming,PC}' # ✅ Finds the document
```

### Making JSON documents behave like hash documents

To get hash-like behavior in JSON, explicitly add `SEPARATOR ","`:

```sql
JSON.SET product:1 $ '{"category": "Electronics,Gaming,PC"}'
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA $.category AS category TAG SEPARATOR ","

# Result: Creates 3 separate tags: "Electronics", "Gaming", "PC"
FT.SEARCH products '@category:{Gaming}' # ✅ Now finds the document
```

### Recommended approach for JSON

Instead of comma-separated strings, use JSON arrays:

```sql
JSON.SET product:1 $ '{"category": ["Electronics", "Gaming", "PC"]}'
FT.CREATE products ON JSON PREFIX 1 product: SCHEMA $.category[*] AS category TAG

# Result: Creates 3 separate tags: "Electronics", "Gaming", "PC"
FT.SEARCH products '@category:{Gaming}' # ✅ Finds the document
```
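
If existing documents already store tags as comma-separated strings, they can be converted to arrays before being written with `JSON.SET`. The sketch below shows one way to do the conversion on the client side. `delimited_to_array` is a hypothetical helper, and trimming whitespace and dropping empty items are assumptions about the desired result.

```python
import json
from typing import Any, Dict


def delimited_to_array(doc: Dict[str, Any], field: str, separator: str = ",") -> Dict[str, Any]:
    """Return a copy of doc with a delimited string field converted to an array."""
    value = doc.get(field)
    if not isinstance(value, str):
        # Already an array (or missing): return an unchanged copy.
        return dict(doc)
    # Assumption: whitespace around items is not significant.
    items = [item.strip() for item in value.split(separator) if item.strip()]
    return {**doc, field: items}


old_doc = {"category": "Electronics,Gaming,PC"}
new_doc = delimited_to_array(old_doc, "category")
print(json.dumps(new_doc))
```

The converted document can then be indexed with `$.category[*] AS category TAG`, without a `SEPARATOR` option.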

## Index JSON arrays as TAG

The preferred method for indexing a JSON field with multivalued terms is using JSON arrays. Each value of the array is indexed, and those values must be scalars. If you want to index string or boolean values as TAGs within a JSON array, use the [JSONPath]({{< relref "/develop/data-types/json/path" >}}) wildcard operator.
For JSON documents, you have two approaches to create TAG fields with multiple values:

To index an item's list of available colors, specify the JSONPath `$.colors.*` in the `SCHEMA` definition during index creation:
### Approach 1: JSON arrays (recommended)

The preferred method for indexing multiple tag values is using JSON arrays. Each array element becomes a separate tag value. Use the [JSONPath]({{< relref "/develop/data-types/json/path" >}}) wildcard operator `[*]` to index array elements.

```sql
127.0.0.1:6379> FT.CREATE itemIdx2 ON JSON PREFIX 1 item: SCHEMA $.colors.* AS colors TAG $.name AS name TEXT $.description as description TEXT
# Create index with array indexing
127.0.0.1:6379> FT.CREATE itemIdx2 ON JSON PREFIX 1 item: SCHEMA $.colors[*] AS colors TAG $.name AS name TEXT $.description as description TEXT

# The JSON data uses arrays
# Each array element ("black", "silver") becomes a separate tag
```

Now you can search for silver headphones:
@@ -187,6 +244,43 @@ Now you can search for silver headphones:
2) "{\"name\":\"Noise-cancelling Bluetooth headphones\",\"description\":\"Wireless Bluetooth headphones with noise-cancelling technology\",\"connection\":{\"wireless\":true,\"type\":\"Bluetooth\"},\"price\":99.98,\"stock\":25,\"colors\":[\"black\",\"silver\"]}"
```

### Approach 2: strings with explicit separators

You can also use comma-separated strings, but you must explicitly specify the `SEPARATOR`:

```sql
# JSON with comma-separated string
JSON.SET item:1 $ '{"colors": "black,silver,gold"}'

# Index with explicit separator
FT.CREATE itemIdx3 ON JSON PREFIX 1 item: SCHEMA $.colors AS colors TAG SEPARATOR ","

# Now you can search individual colors
FT.SEARCH itemIdx3 "@colors:{silver}"
```

{{% alert title="Important: JSON vs HASH behavior" color="warning" %}}
- **JSON without SEPARATOR**: `"black,silver"` becomes one tag: `"black,silver"`.
- **JSON with SEPARATOR ","**: `"black,silver"` becomes two tags: `"black"` and `"silver"`.
- **Hash (default)**: `"black,silver"` becomes two tags: `"black"` and `"silver"`.

For JSON, always specify `SEPARATOR ","` if you want to split comma-separated strings, or use arrays instead.
{{% /alert %}}

### Which approach to choose?

Use JSON arrays when:

- You control the data structure.
- You want clean, structured data.
- You need to store complex values (strings with spaces, punctuation).

Use strings with separators when:

- You're migrating from hashes to JSON.
- You receive data as delimited strings.
- You need compatibility with existing systems.

## Index JSON arrays as TEXT
Starting with RediSearch v2.6.0, full-text search can be done on an array of strings or on a JSONPath leading to multiple strings.
