From 3ef90cd2b7c1d30457afb314020016ec2a1e0264 Mon Sep 17 00:00:00 2001
From: "David W. Dougherty"
Date: Thu, 27 Feb 2025 08:38:55 -0800
Subject: [PATCH 1/2] DEV: document Unicode limitations

---
 .../field-and-type-options.md           | 29 ++++++++++++++++++-
 .../search-and-query/query/full-text.md | 29 ++++++++++++++++++-
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/content/develop/interact/search-and-query/basic-constructs/field-and-type-options.md b/content/develop/interact/search-and-query/basic-constructs/field-and-type-options.md
index 64e17d99d4..233efc2313 100644
--- a/content/develop/interact/search-and-query/basic-constructs/field-and-type-options.md
+++ b/content/develop/interact/search-and-query/basic-constructs/field-and-type-options.md
@@ -232,4 +232,31 @@ You can search for documents with specific text values using the `` or the
 - Search for a term only in the `title` attribute
  ```
  FT.SEARCH books-idx "@title:dogs"
- ```
\ No newline at end of file
+ ```
+
+## Unicode considerations
+
+Search and query only supports Unicode characters in the [basic multilingual plane](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane); U+0000 to U+FFFF. Unicode characters beyond U+FFFF, such as Emojis, are not supported and would not be retrieved by queries including such characters in the following use cases:
+
+* Querying TEXT fields with Prefix/Suffix/Infix
+* Querying TEXT fields with fuzzy
+
+Examples:
+
+```
+redis> FT.CREATE idx SCHEMA tag TAG text TEXT
+OK
+redis> HSET doc:1 tag '😀😁🙂' text '😀😁🙂'
+(integer) 2
+redis> HSET doc:2 tag '😀😁🙂abc' text '😀😁🙂abc'
+(integer) 2
+redis> FT.SEARCH idx '@text:(*😀😁🙂)' NOCONTENT
+1) (integer) 0
+redis> FT.SEARCH idx '@text:(*😀😁🙂*)' NOCONTENT
+1) (integer) 0
+redis> FT.SEARCH idx '@text:(😀😁🙂*)' NOCONTENT
+1) (integer) 0
+
+redis> FT.SEARCH idx '@text:(%😀😁🙃%)' NOCONTENT
+1) (integer) 0
+```
\ No newline at end of file
diff --git a/content/develop/interact/search-and-query/query/full-text.md b/content/develop/interact/search-and-query/query/full-text.md
index 98a2f27e00..cdd7043645 100644
--- a/content/develop/interact/search-and-query/query/full-text.md
+++ b/content/develop/interact/search-and-query/query/full-text.md
@@ -117,4 +117,31 @@ If you want to increase the maximum word distance to two, you can use the follow
 {{< clients-example query_ft ft5 >}}
 FT.SEARCH idx:bicycle "%%optamised%%"
-{{< /clients-example >}}
\ No newline at end of file
+{{< /clients-example >}}
+
+## Unicode considerations
+
+Search and query only supports Unicode characters in the [basic multilingual plane](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane); U+0000 to U+FFFF. Unicode characters beyond U+FFFF, such as Emojis, are not supported and would not be retrieved by queries including such characters in the following use cases:
+
+* Querying TEXT fields with Prefix/Suffix/Infix
+* Querying TEXT fields with fuzzy
+
+Examples:
+
+```
+redis> FT.CREATE idx SCHEMA tag TAG text TEXT
+OK
+redis> HSET doc:1 tag '😀😁🙂' text '😀😁🙂'
+(integer) 2
+redis> HSET doc:2 tag '😀😁🙂abc' text '😀😁🙂abc'
+(integer) 2
+redis> FT.SEARCH idx '@text:(*😀😁🙂)' NOCONTENT
+1) (integer) 0
+redis> FT.SEARCH idx '@text:(*😀😁🙂*)' NOCONTENT
+1) (integer) 0
+redis> FT.SEARCH idx '@text:(😀😁🙂*)' NOCONTENT
+1) (integer) 0
+
+redis> FT.SEARCH idx '@text:(%😀😁🙃%)' NOCONTENT
+1) (integer) 0
+```
\ No newline at end of file

From 5d33d33dbcc991770290e674d2fd106a8107a2ce Mon Sep 17 00:00:00 2001
From: "David W. Dougherty"
Date: Thu, 27 Feb 2025 08:55:24 -0800
Subject: [PATCH 2/2] Apply code review comments

---
 .../search-and-query/basic-constructs/field-and-type-options.md | 2 +-
 content/develop/interact/search-and-query/query/full-text.md    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/develop/interact/search-and-query/basic-constructs/field-and-type-options.md b/content/develop/interact/search-and-query/basic-constructs/field-and-type-options.md
index 233efc2313..783f1bab13 100644
--- a/content/develop/interact/search-and-query/basic-constructs/field-and-type-options.md
+++ b/content/develop/interact/search-and-query/basic-constructs/field-and-type-options.md
@@ -236,7 +236,7 @@ You can search for documents with specific text values using the `` or the
 
 ## Unicode considerations
 
-Search and query only supports Unicode characters in the [basic multilingual plane](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane); U+0000 to U+FFFF. Unicode characters beyond U+FFFF, such as Emojis, are not supported and would not be retrieved by queries including such characters in the following use cases:
+Redis Query Engine only supports Unicode characters in the [basic multilingual plane](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane); U+0000 to U+FFFF. Unicode characters beyond U+FFFF, such as Emojis, are not supported and would not be retrieved by queries including such characters in the following use cases:
 
 * Querying TEXT fields with Prefix/Suffix/Infix
 * Querying TEXT fields with fuzzy
diff --git a/content/develop/interact/search-and-query/query/full-text.md b/content/develop/interact/search-and-query/query/full-text.md
index cdd7043645..37fd35fe60 100644
--- a/content/develop/interact/search-and-query/query/full-text.md
+++ b/content/develop/interact/search-and-query/query/full-text.md
@@ -121,7 +121,7 @@ FT.SEARCH idx:bicycle "%%optamised%%"
 
 ## Unicode considerations
 
-Search and query only supports Unicode characters in the [basic multilingual plane](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane); U+0000 to U+FFFF. Unicode characters beyond U+FFFF, such as Emojis, are not supported and would not be retrieved by queries including such characters in the following use cases:
+Redis Query Engine only supports Unicode characters in the [basic multilingual plane](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane); U+0000 to U+FFFF. Unicode characters beyond U+FFFF, such as Emojis, are not supported and would not be retrieved by queries including such characters in the following use cases:
 
 * Querying TEXT fields with Prefix/Suffix/Infix
 * Querying TEXT fields with fuzzy
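
Reviewer note, not part of either patch: the documented limit (U+0000 to U+FFFF) can be checked on the client side before a prefix, suffix, infix, or fuzzy query on a TEXT field is sent. The following is a minimal Python sketch under that assumption. The helper names `non_bmp_chars` and `is_bmp_only` are hypothetical and do not belong to Redis or any client library; the sketch only inspects code points and does not talk to a server.

```python
def non_bmp_chars(text: str) -> list[str]:
    """Return the characters in `text` whose code point lies above U+FFFF.

    Hypothetical helper for illustration; Python 3 strings iterate by
    code point, so each emoji is seen as a single character here.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


def is_bmp_only(text: str) -> bool:
    """True when every character falls in the range U+0000 to U+FFFF."""
    return not non_bmp_chars(text)


if __name__ == "__main__":
    # The sample values mirror the terms used in the patch's examples.
    for term in ("dogs", "😀😁🙂", "😀😁🙂abc"):
        offending = non_bmp_chars(term)
        if offending:
            print(f"{term!r}: contains characters beyond U+FFFF: {offending}")
        else:
            print(f"{term!r}: basic multilingual plane only")
```

A caller could use a check like this to warn the user or to fall back to an exact-match query, depending on what the application needs.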