neo4j
diff --git a/modules/ROOT/pages/functions/vector.adoc b/modules/ROOT/pages/functions/vector.adoc
Lines changed: 124 additions & 4 deletions
diff --git a/modules/ROOT/pages/genai-integrations.adoc b/modules/ROOT/pages/genai-integrations.adoc
Lines changed: 133 additions & 13 deletions
diff --git a/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc b/modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc
Lines changed: 3 additions & 2 deletions
@@ -129,19 +129,139 @@ This returns the two nearest neighbors.
 
 ======
 
+
 [role=label--new-2025.xx]
 [[functions-vector_dimension_count]]
 == vector_dimension_count()
 
-* `(vector :: VECTOR) :: INTEGER`
-* Calculates the size of a vector.
+.Details
+|===
+| *Syntax* 3+| `vector_dimension_count(vector)`
+| *Description* 3+| Calculates the dimension of a `VECTOR`.
+.2+| *Arguments* | *Name* | *Type* | *Description*
+| `vector` | `VECTOR` | The vector to calculate the dimension of.
+| *Returns* 3+| `INTEGER`
+|===
+
+.Considerations
+|===
+
+| If `vector` is not a xref:values-and-types/vector.adoc[`VECTOR`] value, an error will be thrown.
+| This function is an alias of the xref:functions/scalar.adoc#functions-size[`size()`] function.
+
+|===
+
+.vector_dimension_count()
+=====
+
+.Calculate the size of a `VECTOR`
+[source, cypher]
+----
+RETURN vector_dimension_count(vector([1, 2, 3], 3, INTEGER)) AS size
+----
+
+
+.Result
+[role="queryresult",options="header,footer",cols="1*<m"]
+|===
+
+| size
+| 3
+
+1+d|Rows: 1
+
+|===
+=====
+
 
 [role=label--new-2025.xx]
 [[functions-vector_distance]]
 == vector_distance()
 
-* `(vector1 :: VECTOR, vector2 :: VECTOR, vectorDistanceMetric :: [EUCLIDEAN, EUCLIDEAN_SQUARED, MANHATTAN, COSINE, DOT, HAMMING]) :: FLOAT`
-* Returns a `FLOAT` representing the distance between the two vector values based on the selected `vectorDistanceMetric` algorithm.
+.Details
+|===
+| *Syntax* 3+| `vector_distance(vector1, vector2, vectorDistanceMetric)`
+| *Description* 3+| Returns a `FLOAT` representing the distance between the two vector values based on the selected `vectorDistanceMetric` algorithm.
+.4+| *Arguments* | *Name* | *Type* | *Description*
+| `vector1` | `VECTOR` | The first vector.
+| `vector2` | `VECTOR` | The second vector.
+| `vectorDistanceMetric` | `[EUCLIDEAN, EUCLIDEAN_SQUARED, MANHATTAN, COSINE, DOT, HAMMING]` | The vector distance algorithm to calculate the distance by.
+| *Returns* 3+| `FLOAT`
+|===
+
+.`vectorDistanceMetric` algorithms
+[cols="1,3", options="header"]
+|===
+| Distance Type | Formula
+
+| `EUCLIDEAN`
+| √( (A₁ - B₁)² + (A₂ - B₂)² + ... + (Aᴰ - Bᴰ)² )
+
+| `EUCLIDEAN_SQUARED`
+| (A₁ - B₁)² + (A₂ - B₂)² + ... + (Aᴰ - Bᴰ)²
+
+| `MANHATTAN`
+| \|A₁ - B₁\| + \|A₂ - B₂\| + ... + \|Aᴰ - Bᴰ\|
+
+| `COSINE`
+| 1 - ( (A₁×B₁ + A₂×B₂ + ... + Aᴰ×Bᴰ) / ( √(A₁² + A₂² + ... + Aᴰ²) × √(B₁² + B₂² + ... + Bᴰ²) ) )
+
+| `DOT`
+| - (A₁×B₁ + A₂×B₂ + ... + Aᴰ×Bᴰ)
+
+| `HAMMING`
+| Number of dimensions in which `vector1` and `vector2` differ.
+|===
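
The formulas in the table above can be checked outside the database. The following is a minimal Python sketch, not Neo4j's implementation: the function names are hypothetical, the vectors are plain Python lists, and results are computed in 64-bit floats, so the last digits may differ from what `vector_distance()` returns for lower-precision coordinate types.

```python
import math

# Illustrative helpers only; these names are assumptions, not part of any Neo4j API.

def euclidean_squared_distance(a, b):
    # (A1 - B1)^2 + ... + (AD - BD)^2
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean_distance(a, b):
    # Square root of the squared Euclidean distance
    return math.sqrt(euclidean_squared_distance(a, b))

def manhattan_distance(a, b):
    # |A1 - B1| + ... + |AD - BD|
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - (A . B) / (|A| * |B|)
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot_product / (norm_a * norm_b)

def dot_distance(a, b):
    # Negated dot product, so that smaller values mean more similar vectors
    return -sum(x * y for x, y in zip(a, b))

def hamming_distance(a, b):
    # Number of dimensions in which the two vectors differ
    return sum(1 for x, y in zip(a, b) if x != y)

if __name__ == "__main__":
    print(cosine_distance([1, 2, 3], [1, 2, 4]))
    print(euclidean_distance([1.0, 5.0, 3.0, 6.7], [5.0, 2.5, 3.1, 9.0]))
```

The two calls at the bottom use the same inputs as the `COSINE` and `EUCLIDEAN` examples on this page, so their output should agree with those results up to floating-point precision.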
+
+.Considerations
+|===
+
+| The smaller the returned number, the more similar the `VECTOR` values.
+The larger the number, the more distant the vectors.
+This is in contrast to the similarity functions, where a result closer to `1` indicates a higher degree of similarity.
+
+|===
+
+
+.vector_distance()
+=====
+
+.Calculate the distance between two `VECTOR` values using the `COSINE` vector distance algorithm
+[source, cypher]
+----
+RETURN vector_distance(vector([1, 2, 3], 3, INT), vector([1, 2, 4], 3, INT), COSINE) AS distance
+----
+
+.Result
+[role="queryresult",options="header,footer",cols="1*<m"]
+|===
+
+| distance
+| 0.008539795875549316
+
+1+d|Rows: 1
+
+|===
+
+.Calculate the distance between two `VECTOR` values using the `EUCLIDEAN` vector distance algorithm
+[source, cypher]
+----
+RETURN vector_distance(vector([1.0, 5.0, 3.0, 6.7], 4, FLOAT), vector([5.0, 2.5, 3.1, 9.0], 4, FLOAT), EUCLIDEAN) AS distance
+----
+
+.Result
+[role="queryresult",options="header,footer",cols="1*<m"]
+|===
+
+| distance
+| 5.248809388804284
+
+1+d|Rows: 1
+
+|===
+
+=====
+
 
 
 [role=label--new-2025.xx]
 
@@ -42,7 +42,7 @@ Dump files can be imported for both link:{neo4j-docs-base-uri}/aura/auradb/impor
 The embeddings on this are generated using link:https://platform.openai.com/docs/guides/embeddings[OpenAI] (model `text-embedding-ada-002`), producing 1536-dimensional vectors.
 
 [[single-embedding]]
-== Generate a single embedding and store it
+== Generate a single embedding
 
 Use the `genai.vector.encode()` function to generate a vector embedding for a single value.
 
@@ -66,15 +66,64 @@ This function sends one API request every time it is called, which may result in
 If you want to generate many embeddings at once, use xref:genai-integrations.adoc#multiple-embeddings[].
 ====
 
-Use the `db.create.setNodeVectorProperty` procedure to store an embedding to a node property.
+[role=label--new-2025.xx label--enterprise-edition]
+[[store-single-embedding-vector]]
+=== Store a single embedding as a vector property
+
+`genai.vector.encode()` returns a `LIST<FLOAT>`.
+To convert this value to a xref:values-and-types/vector.adoc[`VECTOR` value], use the xref:functions/vector.adoc#functions-vector[`vector()`] function.
+
+.Signature for `vector()` label:function[]
+[source]
+----
+vector(vectorValue :: STRING | LIST<INTEGER | FLOAT>, dimension :: INTEGER, coordinateType :: [INTEGER64, INTEGER32, INTEGER16, INTEGER8, FLOAT64, FLOAT32]) :: VECTOR
+----
+
+.Create an embedding from a single property and store it as a `VECTOR` property value
+====
+
+.Create a `VECTOR` embedding property for the Godfather
+[source,cypher,role=test-skip]
+----
+MATCH (m:Movie {title:'Godfather, The'})
+WHERE m.plot IS NOT NULL AND m.title IS NOT NULL
+WITH m, m.title || ' ' || m.plot AS titleAndPlot // <1>
+WITH m, genai.vector.encode(titleAndPlot, 'OpenAI', { token: $token }) AS propertyVector // <2>
+SET m.embedding = vector(propertyVector, 1536, FLOAT32) // <3>
+RETURN m.embedding AS embedding
+----
+
+<1> Concatenate the `title` and `plot` of the `Movie` into a single `STRING`.
+<2> Create a 1536-dimensional embedding from the `titleAndPlot`.
+<3> Store the `propertyVector` as a new `VECTOR` `embedding` property on The Godfather node.
+
+.Result
+[source, "queryresult"]
+----
++----------------------------------------------------------------------------------------------------+
+| embedding                                                                                          |
++----------------------------------------------------------------------------------------------------+
+| [0.005239539314061403, -0.039358530193567276, -0.0005175105179660022, -0.038706034421920776, ... ] |
++----------------------------------------------------------------------------------------------------+
+----
+
+[NOTE]
+This result only shows the first 4 of the 1536 numbers in the embedding.
+
+====
+
+[[store-single-embedding-list-float]]
+=== Store a single embedding as a list of floats property
+
+Use the `db.create.setNodeVectorProperty` procedure to store an embedding as a `LIST<FLOAT>` value to a node property.
 
 .Signature for `db.create.setNodeVectorProperty` label:procedure[]
 [source,syntax]
 ----
 db.create.setNodeVectorProperty(node :: NODE, key :: STRING, vector :: ANY)
 ----
 
-Use the `db.create.setRelationshipVectorProperty` procedure to store an embedding to a relationship property.
+Use the `db.create.setRelationshipVectorProperty` procedure to store an embedding as a `LIST<FLOAT>` value to a relationship property.
 
 .Signature for `db.create.setRelationshipVectorProperty` label:procedure[] 
 [source,syntax]
@@ -88,10 +137,10 @@ db.create.setRelationshipVectorProperty(relationship :: RELATIONSHIP, key :: STR
 
 The embeddings are stored as properties on nodes or relationships with the type `LIST<INTEGER | FLOAT>`.
 
-.Create an embedding from a single property and store it
+.Create an embedding from a single property and store it as a `LIST<FLOAT>` property value
 ====
 
-.Create an embedding property for the Godfather
+.Create a `LIST<FLOAT>` embedding property for the Godfather
 [source,cypher,role=test-skip]
 ----
 MATCH (m:Movie {title:'Godfather, The'})
@@ -104,7 +153,7 @@ RETURN m.embedding AS embedding
 
 <1> Concatenate the `title` and `plot` of the `Movie` into a single `STRING`.
 <2> Create a 1536 dimensional embedding from the `titleAndPlot`.
-<3> Store the `propertyVector` as a new `embedding` property on The Godfather node.
+<3> Store the `propertyVector` as a new `LIST<FLOAT>` `embedding` property on The Godfather node.
 
 .Result
 [source, "queryresult"]
@@ -118,12 +167,13 @@ RETURN m.embedding AS embedding
 
 [NOTE]
 This result only shows the first 4 of the 1536 numbers in the embedding.
+
 ====
 
 [[multiple-embeddings]]
-== Generating a batch of embeddings and store them
+== Generate a batch of embeddings
 
-Use the `genai.vector.encodeBatch` procedure to generate many vector embeddings with a single API request.
+Use the `genai.vector.encodeBatch()` procedure to generate many vector embeddings with a single API request.
 This procedure takes a list of resources as an input, and returns the same number of result rows, instead of a single one.
 
 [IMPORTANT]
@@ -132,7 +182,7 @@ This procedure attempts to generate embeddings for all supplied resources in a s
 Therefore, it is recommended to see the respective provider's documentation for details on, for example, the maximum number of embeddings that can be generated per request.
 ====
 
-.Signature for `genai.vector.encodeBatch` label:procedure[]
+.Signature for `genai.vector.encodeBatch()` label:procedure[]
 [source,syntax]
 ----
 genai.vector.encodeBatch(resources :: LIST<STRING>, provider :: STRING, configuration :: MAP = {}) :: (index :: INTEGER, resource :: STRING, vector :: LIST<FLOAT>)
@@ -152,7 +202,77 @@ Each returned row contains the following columns:
 * The `resource` (a `STRING`) is the name of the input resource.
 * The `vector` (a `LIST<FLOAT>`) is the generated vector embedding for this resource.
 
-.Create embeddings from a limited number of properties and store them
+[[store-multiple-embedding-vector]]
+[role=label--new-2025.xx label--enterprise-edition]
+=== Store multiple embeddings as vector properties
+
+`genai.vector.encodeBatch()` returns a `LIST<FLOAT>` `vector` value.
+To convert this value to a xref:values-and-types/vector.adoc[`VECTOR` value], use the xref:functions/vector.adoc#functions-vector[`vector()`] function.
+The full function signature can be seen xref:genai-integrations.adoc#store-single-embedding-vector[above].
+
+.Create embeddings from a limited number of properties and store them as `VECTOR` properties
+====
+
+[source, cypher, role=test-skip]
+----
+MATCH (m:Movie WHERE m.plot IS NOT NULL)
+WITH m
+LIMIT 20
+WITH collect(m) AS moviesList // <1>
+WITH moviesList, [movie IN moviesList | movie.title || ': ' || movie.plot] AS batch // <2>
+CALL genai.vector.encodeBatch(batch, 'OpenAI', { token: $token }) YIELD index, vector
+WITH moviesList[index] AS movie, vector
+SET movie.embedding = vector(vector, 1536, FLOAT32) // <3>
+----
+
+<1> xref:functions/aggregating.adoc#functions-collect[Collect] all 20 `Movie` nodes into a `LIST<NODE>`.
+<2> Use a xref:expressions/list-expressions.adoc#list-comprehension[list comprehension] (`[]`) to extract the `title` and `plot` properties of the movies in `moviesList` into a new `LIST<STRING>`.
+<3> Each `vector` returned by `genai.vector.encodeBatch()` is converted to a `VECTOR` value with `vector()` and stored as a property named `embedding` on the corresponding node.
+====
+
+.Create embeddings from a large number of properties and store them as `VECTOR` properties
+====
+[source, cypher, role=test-skip]
+----
+MATCH (m:Movie WHERE m.plot IS NOT NULL)
+WITH collect(m) AS moviesList, // <1>
+     count(*) AS total,
+     100 AS batchSize // <2>
+UNWIND range(0, total-1, batchSize) AS batchStart // <3>
+CALL (moviesList, batchStart, batchSize) { // <4>
+    WITH [movie IN moviesList[batchStart .. batchStart + batchSize] | movie.title || ': ' || movie.plot] AS batch // <5>
+    CALL genai.vector.encodeBatch(batch, 'OpenAI', { token: $token }) YIELD index, vector
+    WITH moviesList[batchStart + index] AS movie, vector
+    SET movie.embedding = vector(vector, 1536, FLOAT32) // <6>
+} IN CONCURRENT TRANSACTIONS OF 1 ROW <7>
+----
+
+<1> xref:functions/aggregating.adoc#functions-collect[Collect] all returned `Movie` nodes into a `LIST<NODE>`.
+<2> `batchSize` defines the number of nodes in `moviesList` to be processed at once.
+Because vector embeddings can be very large, a larger batch size may require significantly more memory on the Neo4j server.
+Too large a batch size may also exceed the provider's threshold.
+<3> Process `Movie` nodes in increments of `batchSize`.
+The end range `total-1` is due to `range` being inclusive on both ends.
+<4> A xref:subqueries/subqueries-in-transactions.adoc[`CALL` subquery] executes a separate transaction for each batch.
+Note that this `CALL` subquery uses a xref:subqueries/call-subquery.adoc#variable-scope-clause[variable scope clause].
+<5> `batch` is a list of strings, each being the concatenation of `title` and `plot` of one movie.
+<6> Convert `vector` to a `VECTOR` value with `vector()` and set it as the value of the property named `embedding` on the node at position `batchStart + index` in `moviesList`.
+<7> Set the number of batches to be processed at once to `1`.
+For more information on concurrency in transactions, see xref:subqueries/subqueries-in-transactions.adoc#concurrent-transactions[`CALL` subqueries -> Concurrent transactions].
+
+[NOTE]
+This example may not scale to larger datasets, as `collect(m)` requires the whole result set to be loaded in memory.
+For an alternative method more suitable to processing large amounts of data, see link:https://neo4j.com/docs/genai/tutorials/embeddings-vector-indexes/[GenAI documentation - Embeddings & Vector Indexes Tutorial -> Create embeddings with cloud AI providers].
+
+====
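
The index arithmetic in the large-batch example above (batch starts, slices, and the `batchStart + index` lookup) can be traced with a short Python sketch. This only illustrates the bookkeeping and makes no database or provider calls. The function name `plan_batches` and the sample data are hypothetical.

```python
def plan_batches(items, batch_size):
    """Yield (batch_start, index, global_position, item) for every item.

    Mirrors the Cypher pattern:
      UNWIND range(0, total-1, batchSize) AS batchStart
      moviesList[batchStart .. batchStart + batchSize]
    """
    total = len(items)
    # Cypher's range() is inclusive on both ends, hence `total - 1` there.
    # Python's range() excludes the end, so `total` gives the same batch starts.
    for batch_start in range(0, total, batch_size):
        # Slices exclude the upper bound in both Cypher and Python.
        batch = items[batch_start:batch_start + batch_size]
        # `index` is the position inside the batch, as yielded per result row.
        for index, item in enumerate(batch):
            yield batch_start, index, batch_start + index, item


if __name__ == "__main__":
    movies = ["m0", "m1", "m2", "m3", "m4", "m5", "m6"]  # placeholder data
    for batch_start, index, position, movie in plan_batches(movies, 3):
        print(batch_start, index, position, movie)
```

Every item is visited exactly once, and `batch_start + index` always points back at the item's position in the original list, which is why the example can write each embedding to the right node.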
+
+[[store-multiple-embeddings-list-float]]
+=== Store multiple embeddings as list of floats properties
+
+Use the `db.create.setNodeVectorProperty` procedure to store an embedding as a `LIST<FLOAT>` value to a node property.
+Use the `db.create.setRelationshipVectorProperty` procedure to store an embedding as a `LIST<FLOAT>` value to a relationship property.
+The full procedure signatures can be seen xref:genai-integrations.adoc#store-single-embedding-list-float[above].
+
+.Create embeddings from a limited number of properties and store them as `LIST<FLOAT>` properties
 ====
 
 [source, cypher, role=test-skip]
@@ -169,10 +289,10 @@ CALL db.create.setNodeVectorProperty(moviesList[index], 'embedding', vector) //
 
 <1> xref:functions/aggregating.adoc#functions-collect[Collect] all  20 `Movie` nodes into a `LIST<NODE>`.
 <2> Use a xref:expressions/list-expressions.adoc#list-comprehension[list comprehension] (`[]`) to extract the `title` and `plot` properties of the movies in `moviesList` into a new `LIST<STRING>`.
-<3> `db.create.setNodeVectorProperty` is run for each `vector` returned by `genai.vector.encodeBatch`, and stores that vector as a property named `embedding` on the corresponding node.
+<3> `db.create.setNodeVectorProperty` is run for each `vector` returned by `genai.vector.encodeBatch()`, and stores that vector as a property named `embedding` on the corresponding node.
 ====
 
-.Create embeddings from a large number of properties and store them
+.Create embeddings from a large number of properties and store them as `LIST<FLOAT>` values
 ====
 [source, cypher, role=test-skip]
 ----
@@ -211,7 +331,7 @@ For an alternative method more suitable to processing large amounts of data, see
 == GenAI providers
 
 The following GenAI providers are supported for generating vector embeddings.
-Each provider has its own configuration map that can be passed to `genai.vector.encode` or `genai.vector.encodeBatch`.
+Each provider has its own configuration map that can be passed to `genai.vector.encode()` or `genai.vector.encodeBatch()`.
 
 [[vertex-ai]]
 === Vertex AI
 
@@ -40,7 +40,8 @@ An embedding is a numerical representation of a data object, such as a text, ima
 Each word or token in a text is typically represented as high-dimensional vector where each dimension represents a certain aspect of the word’s meaning.
 
 The embedding for a particular data object can be created by both proprietary (such as https://cloud.google.com/vertex-ai[Vertex AI] or https://openai.com/[OpenAI]) and open source (such as https://github.com/UKPLab/sentence-transformers[sentence-transformers]) embedding generators, which can produce vector embeddings with dimensions such as 256, 768, 1536, and 3072.
-In Neo4j, vector embeddings are stored as `LIST<INTEGER | FLOAT>` properties on a node or relationship.
+Vector embeddings are stored as `LIST<INTEGER | FLOAT>` properties on a node or relationship.
+As of Neo4j 2025.xx, they can also be more efficiently stored as xref:values-and-types/vector.adoc[`VECTOR` types].
 
 [NOTE]
 ====
@@ -126,7 +127,7 @@ For more information about the values accepted by different index providers, see
 ==== `vector.dimensions`
 The dimensions of the vectors to be indexed.
 For more information, see xref:indexes/semantic-indexes/vector-indexes.adoc#embeddings[].
-This setting can be omitted, and any `LIST<INTEGER | FLOAT>` can be indexed and queried, separated by their dimensions, _though only vectors of the same dimension can be compared._
+This setting can be omitted, and any `LIST<INTEGER | FLOAT>` or, as of Neo4j 2025.xx, `VECTOR` value can be indexed and queried, separated by their dimensions, _though only vectors of the same dimension can be compared._
 Setting this value adds additional checks that ensure only vectors with the configured dimensions are indexed, and querying the index with a vector of a different dimensions returns an error.
 
 [NOTE]