Commit 34d821f

slice and dice
1 parent 106472f commit 34d821f

7 files changed

+265
-217
lines changed

modules/ROOT/pages/functions/index.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -793,7 +793,7 @@ Vector functions allow you to compute the similarity scores of vector pairs.
793793

794794
1.1+| xref::functions/vector.adoc#functions-vector[`vector()`]
795795
| `vector(vectorValue :: STRING \| LIST<INTEGER \| FLOAT>, dimension :: INTEGER, coordinateType :: [INTEGER64, INTEGER32, INTEGER16, INTEGER8, FLOAT64, FLOAT32]) :: VECTOR`
796-
| label:new[Introduced in Neo4j 2025.xx]
796+
| Constructs a `VECTOR` value with a dimension and coordinate type. label:new[Introduced in Neo4j 2025.xx]
797797

798798
1.1+| xref::functions/vector.adoc#functions-similarity-cosine[`vector.similarity.cosine()`]
799799
| `vector.similarity.cosine(a :: VECTOR \| LIST<INTEGER \| FLOAT>, b :: VECTOR \| LIST<INTEGER \| FLOAT>) :: FLOAT`

modules/ROOT/pages/functions/vector.adoc

Lines changed: 183 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,17 +1,128 @@
11
:description: Vector functions allow you to compute the similarity scores of vector pairs.
22
:table-caption!:
3-
43
:link-vector-indexes: xref:indexes/semantic-indexes/vector-indexes.adoc
5-
6-
[[query-functions-vector]]
74
= Vector functions
85

9-
Vector functions allow you to create `VECTOR` values, compute the similarity scores of vector pairs, and calculate the size of a vector.
6+
Vector functions allow you to construct xref:values-and-types/vector.adoc[`VECTOR` values], compute the similarity and distance scores of vector pairs, and calculate the size of a vector.
107

118
[role=label--new-2025.xx]
129
[[functions-vector]]
1310
== vector()
1411

12+
.Details
13+
|===
14+
| *Syntax* 3+| `vector(vectorValue[, dimension, coordinateType])`
15+
| *Description* 3+| Constructs a `VECTOR` value with a dimension and coordinate type.
16+
.4+| *Arguments* | *Name* | *Type* | *Description*
17+
| `vectorValue` | `STRING` \| `LIST<INTEGER \| FLOAT>` | The numeric values to create the vector coordinates from: a `LIST` of `INTEGER`
18+
or `FLOAT` values, or a `STRING` defining the coordinates in the resulting `VECTOR`.
19+
| `dimension` | `INTEGER` | The number of dimensions (coordinates) in the vector.
20+
| `coordinateType` | `[INTEGER64, INTEGER32, INTEGER16, INTEGER8, FLOAT64, FLOAT32]` | The type of each coordinate in the vector.
21+
| *Returns* 3+| `VECTOR`
22+
|===
23+
24+
[NOTE]
25+
The `VECTOR` values generated by the `vector()` function can be xref:values-and-types/vector.adoc#store-vector-properties[stored as properties].
26+
As such, the `vector()` function can be used to store the embeddings generated by Neo4j's xref:genai-integrations.adoc[GenAI plugin] as `VECTOR` property values.
27+
`VECTOR` properties can be semantically searched using a xref:indexes/semantic-indexes/vector-indexes.adoc[vector index].
28+
29+
30+
31+
.Considerations
32+
|===
33+
34+
| If a `STRING` is used in `vectorValue`, it must start and end with square brackets (`[]`).
35+
The values inside the brackets must be numbers, written in either decimal or scientific notation, and must be comma-separated.
36+
| `null`, NaN, and infinity values are not allowed in `vectorValue`.
37+
| If `vectorValue` contains elements that are not of the specified `coordinateType`, they will be coerced to that coordinate type if possible.
38+
This includes potentially lossy conversion in cases where a value of a larger type (for example, `INTEGER64`) does not fit into the specified type (for example, `FLOAT32`).
39+
| If `dimension` is omitted, it is calculated by taking the size of the `vectorValue`.
40+
For example, a `vectorValue` with 1024 elements generates a `VECTOR` value with the dimension `1024`.
41+
| `dimension` must be greater than `0` and less than or equal to `4096`.
42+
| If `coordinateType` is omitted, the type will be determined by Cypher.
43+
If the `LIST` used as `vectorValue` contains exclusively `INTEGER` values of mixed sizes, then the largest of those types will be set as the `coordinateType`.
44+
For example, `LIST<INTEGER64 \| INTEGER32>` generates a `VECTOR` value with a `coordinateType` of `INTEGER64`.
45+
If the `vectorValue` contains both `FLOAT` and `INTEGER` values, then the `coordinateType` will be that of the largest `FLOAT` present in `vectorValue`.
46+
For example, `LIST<INTEGER64 \| FLOAT64>` generates a `VECTOR` value with a `coordinateType` of `FLOAT64`.
47+
| A `null` `vectorValue`, `dimension`, or `coordinateType` will return `null`.
48+
|===
49+
50+
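The inference rules for an omitted `dimension` and `coordinateType` can be modelled outside the database. The following is a minimal Python sketch, not Neo4j's implementation: the function names `infer_coordinate_type` and `infer_dimension` are hypothetical, element types are represented as plain type-name strings, and the handling of a list containing only `FLOAT` values (largest `FLOAT` wins) is an assumption, since the table above only describes the all-`INTEGER` and mixed cases.

```python
# Hypothetical model of the documented inference rules; not Neo4j's implementation.
INTEGER_TYPES = ["INTEGER8", "INTEGER16", "INTEGER32", "INTEGER64"]  # ascending size
FLOAT_TYPES = ["FLOAT32", "FLOAT64"]  # ascending size


def infer_coordinate_type(element_types):
    """Return the coordinate type inferred from a list of element type names."""
    if not element_types:
        raise ValueError("vectorValue must contain at least one element")
    unknown = set(element_types) - set(INTEGER_TYPES) - set(FLOAT_TYPES)
    if unknown:
        raise ValueError(f"unsupported element types: {sorted(unknown)}")
    floats = [t for t in element_types if t in FLOAT_TYPES]
    if floats:
        # Any FLOAT present: the largest FLOAT type wins (FLOAT-only case is assumed).
        return max(floats, key=FLOAT_TYPES.index)
    # Exclusively INTEGER values: the largest INTEGER type wins.
    return max(element_types, key=INTEGER_TYPES.index)


def infer_dimension(vector_value):
    """Return the dimension used when `dimension` is omitted: the size of the value."""
    size = len(vector_value)
    if not 0 < size <= 4096:
        raise ValueError("dimension must be greater than 0 and at most 4096")
    return size
```

Under these assumptions, `infer_coordinate_type(["INTEGER64", "INTEGER32"])` gives `"INTEGER64"` and `infer_coordinate_type(["INTEGER64", "FLOAT64"])` gives `"FLOAT64"`, matching the two examples in the table.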
.vector()
51+
=====
52+
53+
.Construct a `VECTOR` value
54+
[source, cypher]
55+
----
56+
WITH vector([1, 2, 3], 3, INTEGER) AS vector
57+
RETURN vector, valueType(vector) AS vectorType
58+
----
59+
60+
.Result
61+
[role="queryresult",options="header,footer",cols="2*<m"]
62+
|===
63+
| vector | vectorType
64+
65+
| [1, 2, 3] | "VECTOR<INTEGER64 NOT NULL>(3) NOT NULL"
66+
67+
2+d|Rows: 1
68+
|===
69+
70+
71+
.Construct a `VECTOR` value with a `STRING` `vectorValue`
72+
[source, cypher]
73+
----
74+
WITH vector("[1.05000e+00, 0.123, 5]", 3, FLOAT32) AS vector
75+
RETURN vector, valueType(vector) AS vectorType
76+
----
77+
78+
.Result
79+
[role="queryresult",options="header,footer",cols="2*<m"]
80+
|===
81+
| vector | vectorType
82+
83+
| | "VECTOR<FLOAT32 NOT NULL>(3) NOT NULL"
84+
85+
2+d|Rows: 1
86+
|===
87+
88+
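The format rules for a `STRING` `vectorValue` (square brackets, comma-separated numbers in decimal or scientific notation, no `null`, NaN, or infinity) can be checked before a query is sent. This is a rough Python sketch under stated assumptions: `parse_vector_string` is a hypothetical helper, and Python's `float()` is somewhat more lenient than the documented format (it accepts, for instance, digit-group underscores), so it approximates the rules rather than reproducing Cypher's parser.

```python
import math


def parse_vector_string(text):
    """Parse a string such as '[1.05000e+00, 0.123, 5]' into a list of floats.

    Raises ValueError if the string breaks the documented format rules.
    """
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("vectorValue must start and end with square brackets")
    coordinates = []
    for token in text[1:-1].split(","):
        # float() raises ValueError for empty or non-numeric tokens such as 'null'.
        value = float(token)
        if not math.isfinite(value):
            raise ValueError("NaN and infinity values are not allowed")
        coordinates.append(value)
    return coordinates
```

Validating on the client side like this is a design choice, not a requirement: the database performs its own checks, and this sketch only gives earlier feedback.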
.Construct a `VECTOR` value omitting both `dimension` and `coordinateType`
89+
[source, cypher]
90+
----
91+
WITH vector([1, 2.5, 3]) AS vector
92+
RETURN vector, valueType(vector) AS vectorType
93+
----
94+
95+
.Result
96+
[role="queryresult",options="header,footer",cols="2*<m"]
97+
|===
98+
| vector | vectorType
99+
100+
| [1.0, 2.5, 3.0] | "VECTOR<FLOAT64 NOT NULL>(3) NOT NULL"
101+
102+
2+d|Rows: 1
103+
|===
104+
105+
When constructing a `VECTOR` value with the `vector()` function, a `null` `vectorValue`, `dimension`, or `coordinateType` returns `null`.
106+
107+
.`null` values
108+
[source, cypher]
109+
----
110+
RETURN vector(null, 3, FLOAT32) AS nullVectorValue,
111+
vector([1, 2, 3], null, INTEGER8) AS nullDimension,
112+
vector([1, 2, 3], 3, null) AS nullCoordinateType
113+
----
114+
115+
.Result
116+
[role="queryresult",options="header,footer",cols="3*<m"]
117+
|===
118+
| nullVectorValue | nullDimension | nullCoordinateType
119+
120+
| null | null | null
121+
122+
3+d|Rows: 1
123+
|===
124+
125+
=====
15126

16127
[[functions-similarity-cosine]]
17128
== vector.similarity.cosine()
@@ -90,9 +201,9 @@ To create the graph used in this example, run the following query in an empty Ne
90201
[source, cypher, role=test-setup]
91202
----
92203
CREATE
93-
(:Node { id: 1, vector: [1.0, 4.0, 2.0]}),
94-
(:Node { id: 2, vector: [3.0, -2.0, 1.0]}),
95-
(:Node { id: 3, vector: [2.0, 8.0, 3.0]});
204+
(:Node { id: 1, vector: vector([1.0, 4.0, 2.0], 3, FLOAT32) }),
205+
(:Node { id: 2, vector: vector([3.0, -2.0, 1.0], 3, FLOAT32) }),
206+
(:Node { id: 3, vector: vector([2.0, 8.0, 3.0], 3, FLOAT32) });
96207
----
97208
98209
Given a parameter `query` (here set to `[4.0, 5.0, 6.0]`), you can query for the two nearest neighbors of that query vector by Euclidean distance.
@@ -105,7 +216,7 @@ MATCH (node:Node)
105216
WITH node, vector.similarity.euclidean($query, node.vector) AS score
106217
RETURN node, score
107218
ORDER BY score DESCENDING
108-
LIMIT 2;
219+
LIMIT 2
109220
----
110221
111222
This returns the two nearest neighbors.
@@ -189,7 +300,7 @@ RETURN vector_dimension_count(vector([1, 2, 3], 3, INTEGER)) AS size
189300
| *Returns* 3+| `FLOAT`
190301
|===
191302

192-
.`vectorDistanceMetric` algorithms
303+
.Supported `vectorDistanceMetric` algorithms
193304
[cols="1,3", options="header"]
194305
|===
195306
| Distance Type | Formula
@@ -243,7 +354,7 @@ RETURN vector_distance(vector([1, 2, 3], 3, INT), vector([1, 2, 4], 3, INT), COS
243354
244355
|===
245356
246-
.Calculate the distance between two `VECTOR` values using the `EUCLIDEAN` vector distance algorithm
357+
.Calculate the distance between two `VECTOR` values using the `EUCLIDEAN` distance algorithm
247358
[source, cypher]
248359
----
249360
RETURN vector_distance(vector([1.0, 5.0, 3.0, 6.7], 4, FLOAT), vector([5.0, 2.5, 3.1, 9.0], 4, FLOAT), EUCLIDEAN)
@@ -262,11 +373,69 @@ RETURN vector_distance(vector([1.0, 5.0, 3.0, 6.7], 4, FLOAT), vector([5.0, 2.5,
262373
263374
=====
264375

265-
266-
267376
[role=label--new-2025.xx]
268377
[[functions-vector_norm]]
269378
== vector_norm()
270379

271-
* `vector_norm(vector :: VECTOR, vectorDistanceMetric :: [EUCLIDEAN, MANHATTAN]) :: FLOAT`
272-
* Returns a `FLOAT` representing the distance between the given vector and a vector of the same dimension with all coordinates set to zero, calculated using the specified `vectorDistanceMetric`.
380+
.Details
381+
|===
382+
| *Syntax* 3+| `vector_norm(vector, vectorDistanceMetric)`
383+
| *Description* 3+| Returns a `FLOAT` representing the norm (distance) between the given vector and an origin vector of the same dimension with all coordinates set to zero, calculated using the specified `vectorDistanceMetric`.
384+
.4+| *Arguments* | *Name* | *Type* | *Description*
385+
| `vector` | `VECTOR` | A vector for which the norm to the origin vector will be computed.
386+
| `vectorDistanceMetric` | `[EUCLIDEAN, MANHATTAN]` | The vector distance algorithm to calculate the distance by.
387+
| *Returns* 3+| `FLOAT`
388+
|===
389+
390+
.Supported `vectorDistanceMetric` algorithms
391+
[cols="1,3", options="header"]
392+
|===
393+
| Distance Type | Formula
394+
395+
| `EUCLIDEAN`
396+
| √( (A₁ - B₁)² + (A₂ - B₂)² + ... + (Aᴰ - Bᴰ)² )
397+
398+
| `MANHATTAN`
399+
| \|A₁ - B₁\| + \|A₂ - B₂\| + ... + \|Aᴰ - Bᴰ\|
400+
401+
|===
402+
403+
404+
.vector_norm()
405+
=====
406+
407+
.Measure the norm between a vector and an origin vector using the `EUCLIDEAN` distance algorithm
408+
[source, cypher]
409+
----
410+
RETURN vector_norm(vector([1.0, 5.0, 3.0, 6.7], 4, FLOAT), EUCLIDEAN) AS norm
411+
----
412+
413+
.Result
414+
[role="queryresult",options="header,footer",cols="1*<m"]
415+
|===
416+
417+
| norm
418+
| 8.93812060782355
419+
420+
1+d|Rows: 1
421+
422+
|===
423+
424+
.Measure the norm between a vector and an origin vector using the `MANHATTAN` distance algorithm
425+
[source, cypher]
426+
----
427+
RETURN vector_norm(vector([1.0, 5.0, 3.0, 6.7], 4, FLOAT), MANHATTAN) AS norm
428+
----
429+
430+
.Result
431+
[role="queryresult",options="header,footer",cols="1*<m"]
432+
|===
433+
434+
| norm
435+
| 15.7
436+
437+
1+d|Rows: 1
438+
439+
|===
440+
441+
=====
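Because the origin vector has every coordinate set to zero, the formulas in the table above reduce to √(A₁² + ... + Aᴰ²) for `EUCLIDEAN` and \|A₁\| + ... + \|Aᴰ\| for `MANHATTAN`. The following Python sketch recomputes the two norms from the examples as a cross-check; the helper names `euclidean_norm` and `manhattan_norm` are hypothetical, and the code works on plain Python floats rather than `VECTOR` values, so the last digits may differ from the database output.

```python
import math


def euclidean_norm(coordinates):
    """Euclidean distance from the origin: square root of the sum of squares."""
    return math.sqrt(sum(x * x for x in coordinates))


def manhattan_norm(coordinates):
    """Manhattan distance from the origin: sum of absolute values."""
    return sum(abs(x) for x in coordinates)


example = [1.0, 5.0, 3.0, 6.7]
print(euclidean_norm(example))  # close to the 8.93812060782355 shown above
print(manhattan_norm(example))  # close to the 15.7 shown above
```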

modules/ROOT/pages/genai-integrations.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -202,7 +202,7 @@ Each returned row contains the following columns:
202202
* The `resource` (a `STRING`) is the name of the input resource.
203203
* The `vector` (a `LIST<FLOAT>`) is the generated vector embedding for this resource.
204204

205-
[[store-multiple-embedding-vector]]
205+
[[store-multiple-embeddings-vector]]
206206
[role=label--new-2025.xx label--enterprise-edition]
207207
=== Store multiple embeddings as vector properties
208208

modules/ROOT/pages/indexes/semantic-indexes/vector-indexes.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,7 @@ Each word or token in a text is typically represented as high-dimensional vector
4141

4242
The embedding for a particular data object can be created by both proprietary (such as https://cloud.google.com/vertex-ai[Vertex AI] or https://openai.com/[OpenAI]) and open source (such as https://github.com/UKPLab/sentence-transformers[sentence-transformers]) embedding generators, which can produce vector embeddings with dimensions such as 256, 768, 1536, and 3072.
4343
Vector embeddings are stored as `LIST<INTEGER | FLOAT>` properties on a node or relationship.
44-
As of Neo4j 2025.xx, they can also be more efficiently stored as xref:values-and-types/vector.adoc[`VECTOR` types].
44+
As of Neo4j 2025.xx, they can also be more efficiently stored as xref:values-and-types/vector.adoc[`VECTOR`] property types.
4545

4646
[NOTE]
4747
====

modules/ROOT/pages/values-and-types/ordering-equality-comparison.adoc

Lines changed: 3 additions & 58 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,9 @@ For example, `1 > b` and `1 < b` are both `false` when `b` is NaN.
3838
* xref:values-and-types/spatial.adoc[Spatial values] and xref:values-and-types/vector.adoc[`VECTOR`] values cannot be compared using the operators `\<=`, `<`,`>=`, `>`.
3939
To compare spatial values within a specific range, use either the xref:functions/spatial.adoc#functions-withinBBox[`point.withinBBox()`] or the xref:functions/spatial.adoc#functions-point-wgs84-2d[`point()`] function.
4040

41+
[NOTE]
42+
See also xref:values-and-types/vector.adoc#ordering-vector[`VECTOR` values -> Ordering `VECTOR` values].
43+
4144
[[value-hierarchy]]
4245
=== Hierarchy of values
4346

@@ -146,61 +149,3 @@ If they have the same time and offset but different named time zones, they are s
146149
Since the length of a day, month, or year varies, Cypher does not define a strict ordering for durations.
147150
As a result, comparing two durations `(e.g, duration1 < duration2)` will always return `null`.
148151

149-
[role=label--new-2025.xx]
150-
[[ordering-vector]]
151-
=== Vector values
152-
153-
`VECTOR` values with a defined coordinate type and no dimension are ordered before values with only a defined dimension.
154-
Values with both a defined coordinate type and dimension are ordered according to the ordering of the vector coordinate types, listed in ascending order below:
155-
156-
* `INTEGER8`
157-
* `INTEGER16`
158-
* `INTEGER32`
159-
* `INTEGER64`
160-
* `FLOAT32`
161-
* `FLOAT64`
162-
163-
Within the same coordinate type, `VECTOR` values are ordered by their dimension, with smaller values first.
164-
`VECTOR` values of the same coordinate type and dimension are then ordered pairwise, similar to how `LIST` values are ordered.
165-
166-
.Ordering rules for `VECTOR` values
167-
[cols="3,3,2,6", options="header"]
168-
|===
169-
| A | B | Ordered As | Reason
170-
171-
| `VECTOR<FLOAT32>(12345)`
172-
| `VECTOR<FLOAT32>(123456)`
173-
| A < B
174-
| Same coordinate type, compare by dimension ascending.
175-
176-
| `VECTOR<INTEGER32>(1234)`
177-
| `VECTOR<FLOAT32>(1234)`
178-
| A < B
179-
| Coordinate type order: `INTEGER32` < `FLOAT32`
180-
181-
| `VECTOR<INTEGER8>`
182-
| `VECTOR(3)`
183-
| A < B
184-
| Coordinate type defined and no dimension < dimension defined and no coordinate type
185-
186-
| `VECTOR<INTEGER64>(123456)`
187-
| `VECTOR<FLOAT32>(3)`
188-
| A < B
189-
| Coordinate type order: `INTEGER64` < `FLOAT32`, compare coordinate type first.
190-
191-
| `VECTOR<FLOAT64>(1234)`
192-
| `VECTOR<FLOAT32>(1234)`
193-
| B < A
194-
| Coordinate type order: `FLOAT32` < `FLOAT64`
195-
196-
| `VECTOR<FLOAT32>([1, 2])`
197-
| `VECTOR<FLOAT32>([2, 1])`
198-
| A < B
199-
| Same coordinate type and dimension, pairwise value comparison.
200-
201-
| `VECTOR<INTEGER16>(1234)`
202-
| `LIST<INTEGER>`
203-
| A < B
204-
| `VECTOR` values are ordered before `LIST` values
205-
206-
|===

modules/ROOT/pages/values-and-types/property-structural-constructed.adoc

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,6 @@
33
:description: This section provides an overview of the property, structural, and constructed data types supported by Cypher.
44
:page-aliases: values-and-types/property-structural-composite.adoc
55

6-
76
Cypher provides first class support for a number of data value types.
87
These fall into the following three categories: *property*, *structural*, and *constructed*.
98
This section will first provide a brief overview of each type, and then go into more detail about the property data type.
