Commit 4a92c13

Explain trigram indexing for text-2.0 provider (#1152)
1 parent 0eb5038 commit 4a92c13

2 files changed: +14 -2 lines changed

modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc

Lines changed: 13 additions & 1 deletion
@@ -216,7 +216,7 @@ As of Neo4j 5.17, an informational notification is instead returned.
 Creating a text index can be done with the `CREATE TEXT INDEX` command.
 Note that the index name must be unique.
 
-Text indexes have no supported index configuration and, as of Neo4j 5.1, they have two index providers available, `text-2.0` (default) and `text-1.0` (deprecated).
+Text indexes have no supported index configuration and, as of Neo4j 5.1, they have two index providers available, `text-2.0` (default -- see xref:indexes/search-performance-indexes/managing-indexes.adoc#text-indexes-trigram-indexes[Trigram indexing] below for more information) and `text-1.0` (deprecated).
 
 [[text-indexes-supported-predicates]]
 [discrete]
@@ -286,6 +286,18 @@ See the section about xref:indexes/search-performance-indexes/using-indexes.adoc
 [TIP]
 Text indexes are only used for exact query matches. To perform approximate matches (including, for example, variations and typos), and to compute a similarity score between `STRING` values, use semantic xref:indexes/semantic-indexes/full-text-indexes.adoc[full-text indexes] instead.
 
+[[text-indexes-trigram-indexes]]
+==== Trigram indexing
+
+The default text index provider, `text-2.0`, uses trigram indexing.
+This means that `STRING` values are indexed into overlapping trigrams, each containing three Unicode code points.
+For example, the word `"developer"` would be indexed by the following trigrams: `["dev", "eve", "vel", "elo", "lop", "ope", "per"]`.
+
+This makes text indexes particularly suitable for substring (`CONTAINS`) and suffix (`ENDS WITH`) searches, as well as prefix searches (`STARTS WITH`).
+For example, searches like `CONTAINS "vel"` or `ENDS WITH "per"` can be efficiently performed by directly looking up the relevant trigrams in the index.
+By comparison, range indexes, which index `STRING` values lexicographically (see xref:indexes/search-performance-indexes/using-indexes.adoc#range-index-backed-order-by[Range index-backed `ORDER BY`] for more information) and are therefore more suited for prefix searches, would need to scan through all indexed values to check if `"vel"` existed anywhere within the text.
+For more information, see xref:indexes/search-performance-indexes/using-indexes.adoc#text-indexes[The impact of indexes on query performance -> Text indexes].
+
 [discrete]
 [[text-indexes-examples]]
 ==== Examples
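
As a rough illustration of the trigram decomposition described in the added section, the following Python sketch splits a string into overlapping three-code-point windows and uses a toy inverted index to narrow down `CONTAINS` candidates. It is not Neo4j's implementation: `trigrams`, `build_index`, and `contains` are hypothetical names, the sample values are made up, and the real `text-2.0` provider may store and verify matches differently.

```python
from collections import defaultdict


def trigrams(value: str) -> list[str]:
    """Return the overlapping three-code-point windows of value."""
    # Python strings are indexed by Unicode code point, as in the description.
    return [value[i:i + 3] for i in range(len(value) - 2)]


def build_index(values: list[str]) -> dict[str, set[str]]:
    """Toy inverted index (an assumption for illustration): trigram -> values."""
    index: dict[str, set[str]] = defaultdict(set)
    for value in values:
        for gram in trigrams(value):
            index[gram].add(value)
    return index


def contains(index: dict[str, set[str]], needle: str) -> set[str]:
    """Intersect the postings of every trigram in needle, then verify."""
    grams = trigrams(needle)
    if not grams:
        raise ValueError("needle is shorter than three code points")
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Sharing all trigrams does not prove a contiguous match, so re-check.
    return {c for c in candidates if needle in c}


index = build_index(["developer", "velocity", "graph"])
print(trigrams("developer"))
# ['dev', 'eve', 'vel', 'elo', 'lop', 'ope', 'per']
print(sorted(contains(index, "vel")))
# ['developer', 'velocity']
```

The lookup touches only the postings for the needle's trigrams, which is why substring and suffix predicates do not require a scan over every indexed value in this model.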

modules/ROOT/pages/indexes/search-performance-indexes/using-indexes.adoc

Lines changed: 1 addition & 1 deletion
@@ -232,7 +232,7 @@ Total database accesses: 7, total allocated memory: 312
 
 This is because range indexes store `STRING` values alphabetically.
 This means that, while they are very efficient for retrieving exact matches of a `STRING`, or for prefix matching, they are less efficient for suffix and contains searches, where they have to scan all relevant properties to filter any matches.
-Text indexes do not store `STRING` properties alphabetically, and are instead optimized for suffix and contains searches.
+Text indexes do not store `STRING` properties alphabetically, and are instead optimized for suffix and contains searches (for more information, see xref:indexes/search-performance-indexes/managing-indexes.adoc#text-indexes-trigram-indexes[Create a text index -> Trigram indexing]).
 That said, if no range index had been present on the name property, the previous query would still have been able to utilize the text index.
 It would have done so less efficiently than a range index, but it still would have been useful.
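
The contrast drawn in this passage can be sketched with a sorted list standing in for a range index: a prefix search becomes a binary search followed by a contiguous read, whereas a suffix search has to examine every value. This is a minimal Python sketch under that assumption; the helper names and sample names are hypothetical, and it does not reflect how Neo4j stores range indexes internally.

```python
from bisect import bisect_left


def starts_with(sorted_values: list[str], prefix: str) -> list[str]:
    """Prefix search: binary-search to the first candidate, then read a run."""
    start = bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # sorted order means no later value can match
        matches.append(value)
    return matches


def ends_with(sorted_values: list[str], suffix: str) -> list[str]:
    """Suffix search: the sort order does not help, so check every value."""
    return [value for value in sorted_values if value.endswith(suffix)]


names = sorted(["Andy", "Angela", "Anders", "Bob", "Sandy"])
print(starts_with(names, "An"))
# ['Anders', 'Andy', 'Angela']
print(ends_with(names, "ndy"))
# ['Andy', 'Sandy']
```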
