From b982959d728435883a803fcc68cd0aab8b094d08 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jens=20Pryce-=C3=85klundh?= <112686610+JPryce-Aklundh@users.noreply.github.com>
Date: Wed, 20 Nov 2024 09:57:15 +0100
Subject: [PATCH 1/4] add complete fulltext stop words and clarify populating state

---
 .../managing-indexes.adoc | 4 +++-
 .../semantic-indexes/full-text-indexes.adoc | 15 +++++++--------
 .../indexes/semantic-indexes/vector-indexes.adoc | 4 ++++
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc b/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
index bbca22600..ac10fba12 100644
--- a/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
+++ b/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
@@ -40,7 +40,9 @@ In those cases, nothing needs to be specified and the `OPTIONS` map should be om
 [TIP]
 Creating an index requires link:{neo4j-docs-base-uri}/operations-manual/{page-version}/authentication-authorization/database-administration/#access-control-database-administration-index[the `CREATE INDEX` privilege].
 
-A newly created index is not immediately available but is created in the background.
+[NOTE]
+An index cannot be queried if its `state` is `"POPULATING"` (i.e., if it was just created).
+To confirm the `state` of a full-text index (whether it is `"ONLINE"` or `"POPULATING"`), run the following command: `SHOW INDEXES YIELD *`.
 
 [[create-range-index]]
 === Create a range index
diff --git a/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc b/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
index 1ec0fdd97..3bac84b9e 100644
--- a/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
+++ b/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
@@ -3,7 +3,7 @@
 = Full-text indexes
 
 A full-text index is used to index nodes and relationships by `STRING` properties.
-Unlike xref:indexes/search-performance-indexes/managing-indexes.adoc#indexes-create-range-index[range] and xref:indexes/search-performance-indexes/managing-indexes.adoc#indexes-create-text-index[text] indexes, which can only perform limited `STRING` matching (exact, prefix, substring, or suffix matches), full-text indexes stores individual words in any given `STRING` property.
+Unlike xref:indexes/search-performance-indexes/managing-indexes.adoc#create-range-index[range] and xref:indexes/search-performance-indexes/managing-indexes.adoc#create-text-index[text] indexes, which can only perform limited `STRING` matching (exact, prefix, substring, or suffix matches), full-text indexes stores individual words in any given `STRING` property.
 This means that full-text indexes can be used to match within the _content_ of a `STRING` property.
 Full-text indexes also return a score of proximity between a given query string and the `STRING` values stored in the database, thus enabling them to semantically interpret data.
 
@@ -83,15 +83,15 @@ The default analyzer (`standard-no-stop-words`) analyzes both the indexed values
 Stop words are common words in a language that can be filtered out during information retrieval tasks since they are considered to be of little use when determining the meaning of a string.
 These words are typically short and frequently used across various contexts.
 
-For example, the following stop words are included in Lucene’s english analyzer: "a", "an", "and", "are", "as", "at", "be", "but”, and so on.
+For example, the following stop words are included in Lucene’s `english` analyzer: "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", and "with".
 
 Removing stop words can help reduce the size of stored data and thereby improve the efficiency of data retrieval.
 ====
 
 In some cases, using different analyzers for the indexed values and query string is more appropriate.
-For example, if handling `STRING` values written in Swedish, it may be beneficial to select the _swedish_ analyzer, which knows how to tokenize Swedish words, and will avoid indexing Swedish stop words.
+For example, if handling `STRING` values written in Swedish, it may be beneficial to select the `swedish` analyzer, which knows how to tokenize Swedish words, and will avoid indexing Swedish stop words.
 
-A complete list of all available analyzers is included in the result of the link:{neo4j-docs-base-uri}/operations-manual/{page-version}/reference/procedures/#procedure_db_index_fulltext_listavailableanalyzers[`db.index.fulltext.listAvailableAnalyzers`] procedure.
+The link:{neo4j-docs-base-uri}/operations-manual/{page-version}/reference/procedures/#procedure_db_index_fulltext_listavailableanalyzers[`db.index.fulltext.listAvailableAnalyzers()`] procedure shows all available analyzers.
 
 Neo4j also supports the use of custom analyzers.
 For more information, see the link:{neo4j-docs-base-uri}/java-reference/{page-version}/extending-neo4j/full-text-analyzer-provider[Java Reference Manual -> Full-text index analyzer providers].
@@ -134,13 +134,12 @@ For more information on how to configure full-text indexes, refer to the link:{n
 [[query-full-text-indexes]]
 == Query full-text indexes
 
+Unlike xref:indexes/search-performance-indexes/managing-indexes.adoc[search-performance indexes], full-text indexes are not automatically used by the xref:planning-and-tuning/execution-plans.adoc[Cypher query planner].
 To query a full-text index, use either the link:{neo4j-docs-base-uri}/operations-manual/{page-version}/reference/procedures/#procedure_db_index_fulltext_querynodes[`db.index.fulltext.queryNodes`] or the link:{neo4j-docs-base-uri}/operations-manual/{page-version}/reference/procedures/#procedure_db_index_fulltext_relationships[`db.index.fulltext.queryRelationships`] procedure.
 
 [NOTE]
-====
-Unlike other xref:indexes/search-performance-indexes/managing-indexes.adoc[search-performance indexes], full-text indexes are not automatically used by the xref:planning-and-tuning/execution-plans.adoc[Cypher query planner].
-To access full-text indexes, they must be explicitly called with the above-mentioned procedures.
-====
+An index cannot be queried if its `state` is `"POPULATING"` (i.e., if it was just created).
+To confirm the `state` of a full-text index (whether it is `"ONLINE"` or `"POPULATING"`), run the following command: `SHOW FULLTEXT INDEXES YIELD *`.
 
 This query uses the `db.index.fulltext.queryNodes` to look for `nils` in the previously created full-text index `namesAndTeams`:
 
diff --git a/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc b/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
index 61a986b33..3905fa353 100644
--- a/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
+++ b/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
@@ -196,6 +196,10 @@ Default value::: `100`
 
 To query a node vector index, use the link:{neo4j-docs-base-uri}/operations-manual/{page-version}/reference/procedures/#procedure_db_index_vector_queryNodes[`db.index.vector.queryNodes`] procedure.
 
+[NOTE]
+An index cannot be queried if its `state` is `"POPULATING"` (i.e., if it was just created).
+To confirm the `state` of a vector index (whether it is `"ONLINE"` or `"POPULATING"`), run the following command: `SHOW VECTOR INDEXES YIELD *`.
+
 .Signature for `db.index.vector.queryNodes`
 [source,syntax]
 ----

From 2291a4c9a163b550b37742daa6652f525624c6c2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jens=20Pryce-=C3=85klundh?= <112686610+JPryce-Aklundh@users.noreply.github.com>
Date: Wed, 20 Nov 2024 12:12:31 +0100
Subject: [PATCH 2/4] fix

---
 .../indexes/search-performance-indexes/managing-indexes.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc b/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
index ac10fba12..dc600c250 100644
--- a/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
+++ b/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
@@ -42,7 +42,7 @@ Creating an index requires link:{neo4j-docs-base-uri}/operations-manual/{page-ve
 
 [NOTE]
 An index cannot be queried if its `state` is `"POPULATING"` (i.e., if it was just created).
-To confirm the `state` of a full-text index (whether it is `"ONLINE"` or `"POPULATING"`), run the following command: `SHOW INDEXES YIELD *`.
+To confirm the `state` of an index (whether it is `"ONLINE"` or `"POPULATING"`), run the following command: `SHOW INDEXES YIELD *`.
 
 [[create-range-index]]
 === Create a range index

From e02c3a29b208926e013f1fde4c4bb51b5744edbf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jens=20Pryce-=C3=85klundh?= <112686610+JPryce-Aklundh@users.noreply.github.com>
Date: Mon, 25 Nov 2024 09:31:47 +0100
Subject: [PATCH 3/4] better wording

---
 .../indexes/search-performance-indexes/managing-indexes.adoc | 4 ++--
 .../pages/indexes/semantic-indexes/full-text-indexes.adoc | 4 ++--
 .../ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc b/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
index dc600c250..c6caa34e6 100644
--- a/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
+++ b/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
@@ -41,8 +41,8 @@ In those cases, nothing needs to be specified and the `OPTIONS` map should be om
 Creating an index requires link:{neo4j-docs-base-uri}/operations-manual/{page-version}/authentication-authorization/database-administration/#access-control-database-administration-index[the `CREATE INDEX` privilege].
 
 [NOTE]
-An index cannot be queried if its `state` is `"POPULATING"` (i.e., if it was just created).
-To confirm the `state` of an index (whether it is `"ONLINE"` or `"POPULATING"`), run the following command: `SHOW INDEXES YIELD *`.
+An index cannot be used while its `state` is `POPULATING`, which occurs immediately after it is created.
+To check the `state` of an index -- whether it is `ONLINE` (usable) or `POPULATING`(still being built) -- run the following command: `SHOW INDEXES YIELD *`.
 
 [[create-range-index]]
 === Create a range index
diff --git a/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc b/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
index 3bac84b9e..8bc0422a3 100644
--- a/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
+++ b/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
@@ -138,8 +138,8 @@ Unlike xref:indexes/search-performance-indexes/managing-indexes.adoc[search-perf
 To query a full-text index, use either the link:{neo4j-docs-base-uri}/operations-manual/{page-version}/reference/procedures/#procedure_db_index_fulltext_querynodes[`db.index.fulltext.queryNodes`] or the link:{neo4j-docs-base-uri}/operations-manual/{page-version}/reference/procedures/#procedure_db_index_fulltext_relationships[`db.index.fulltext.queryRelationships`] procedure.
 
 [NOTE]
-An index cannot be queried if its `state` is `"POPULATING"` (i.e., if it was just created).
-To confirm the `state` of a full-text index (whether it is `"ONLINE"` or `"POPULATING"`), run the following command: `SHOW FULLTEXT INDEXES YIELD *`.
+An index cannot be used while its `state` is `POPULATING`, which occurs immediately after it is created.
+To check the `state` of a full-text index -- whether it is `ONLINE` (usable) or `POPULATING`(still being built) -- run the following command: `SHOW FULLTEXT INDEXES YIELD *`.
 
 This query uses the `db.index.fulltext.queryNodes` to look for `nils` in the previously created full-text index `namesAndTeams`:
 
diff --git a/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc b/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
index 3905fa353..0067f33aa 100644
--- a/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
+++ b/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
@@ -197,8 +197,8 @@ Default value::: `100`
 To query a node vector index, use the link:{neo4j-docs-base-uri}/operations-manual/{page-version}/reference/procedures/#procedure_db_index_vector_queryNodes[`db.index.vector.queryNodes`] procedure.
 
 [NOTE]
-An index cannot be queried if its `state` is `"POPULATING"` (i.e., if it was just created).
-To confirm the `state` of a vector index (whether it is `"ONLINE"` or `"POPULATING"`), run the following command: `SHOW VECTOR INDEXES YIELD *`.
+An index cannot be used while its `state` is `POPULATING`, which occurs immediately after it is created.
+To check the `state` of a vector index -- whether it is `ONLINE` (usable) or `POPULATING`(still being built) -- run the following command: `SHOW VECTOR INDEXES YIELD *`.
 
 .Signature for `db.index.vector.queryNodes`
 [source,syntax]

From 4935fe1afc57c3fcc23dfd90b34ca2c27cf652dc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jens=20Pryce-=C3=85klundh?= <112686610+JPryce-Aklundh@users.noreply.github.com>
Date: Mon, 25 Nov 2024 09:58:43 +0100
Subject: [PATCH 4/4] add populationPercent

---
 .../indexes/search-performance-indexes/managing-indexes.adoc | 2 +-
 .../ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc | 2 +-
 modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc b/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
index c6caa34e6..53b35afa8 100644
--- a/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
+++ b/modules/ROOT/pages/indexes/search-performance-indexes/managing-indexes.adoc
@@ -42,7 +42,7 @@ Creating an index requires link:{neo4j-docs-base-uri}/operations-manual/{page-ve
 
 [NOTE]
 An index cannot be used while its `state` is `POPULATING`, which occurs immediately after it is created.
-To check the `state` of an index -- whether it is `ONLINE` (usable) or `POPULATING`(still being built) -- run the following command: `SHOW INDEXES YIELD *`.
+To check the `state` of an index -- whether it is `ONLINE` (usable) or `POPULATING` (still being built; the `populationPercent` column shows the progress of the index creation) -- run the following command: `SHOW INDEXES`.
 
 [[create-range-index]]
 === Create a range index
diff --git a/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc b/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
index 8bc0422a3..f2adb3aa6 100644
--- a/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
+++ b/modules/ROOT/pages/indexes/semantic-indexes/full-text-indexes.adoc
@@ -139,7 +139,7 @@ To query a full-text index, use either the link:{neo4j-docs-base-uri}/operations
 
 [NOTE]
 An index cannot be used while its `state` is `POPULATING`, which occurs immediately after it is created.
-To check the `state` of a full-text index -- whether it is `ONLINE` (usable) or `POPULATING`(still being built) -- run the following command: `SHOW FULLTEXT INDEXES YIELD *`.
+To check the `state` of a full-text index -- whether it is `ONLINE` (usable) or `POPULATING` (still being built; the `populationPercent` column shows the progress of the index creation) -- run the following command: `SHOW FULLTEXT INDEXES`.
 
 This query uses the `db.index.fulltext.queryNodes` to look for `nils` in the previously created full-text index `namesAndTeams`:
 
diff --git a/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc b/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
index 0067f33aa..b5d4f9bdb 100644
--- a/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
+++ b/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
@@ -198,7 +198,7 @@ To query a node vector index, use the link:{neo4j-docs-base-uri}/operations-manu
 
 [NOTE]
 An index cannot be used while its `state` is `POPULATING`, which occurs immediately after it is created.
-To check the `state` of a vector index -- whether it is `ONLINE` (usable) or `POPULATING`(still being built) -- run the following command: `SHOW VECTOR INDEXES YIELD *`.
+To check the `state` of a vector index -- whether it is `ONLINE` (usable) or `POPULATING` (still being built; the `populationPercent` column shows the progress of the index creation) -- run the following command: `SHOW VECTOR INDEXES`.
 
 .Signature for `db.index.vector.queryNodes`
 [source,syntax]