|
2 | 2 | :description: Information about Cypher's `VECTOR` type. |
3 | 3 | :page-role: new-neo4j-2025.10 |
4 | 4 |
|
5 | | -Cypher supports the construction of a `VECTOR` value that can be stored as xref:indexes/semantic-indexes/vector-indexes.adoc#embeddings[embedding] properties on nodes and relationships and used for efficient semantic retrieval using Neo4j's xref:indexes/semantic-indexes/vector-indexes.adoc[vector indexes] and xref:genai-integrations.adoc[GenAI plugin]. |
6 | | -`VECTOR` values can also be measured and compared (in terms of similarity, distance, and norm) using Cypher's xref:functions/vector.adoc[vector functions]. |
| 5 | +`VECTOR` values can be created and stored as xref:indexes/semantic-indexes/vector-indexes.adoc#embeddings[embedding] properties on nodes and relationships, and used for efficient semantic retrieval using xref:indexes/semantic-indexes/vector-indexes.adoc[vector indexes] and the xref:genai-integrations.adoc[GenAI plugin]. |
| 6 | +`VECTOR` values can also be measured and compared (in terms of similarity, distance, and norm) using xref:functions/vector.adoc[vector functions]. |
7 | 7 |
|
8 | 8 | [[vector-type]] |
9 | 9 | == The vector type |
10 | 10 |
|
11 | | -The `VECTOR` type is a fixed-length, ordered collection of numeric coordinate values (`INTEGER` or `FLOAT`) stored as a single unit. |
12 | | -It is defined by its dimension, which sets how many numeric values it holds, and its coordinate type, which specifies the specific numeric data type for those numbers. |
13 | | -Each numeric value in the list represents a coordinate along one of the vector’s defined dimensions. |
14 | | -The coordinate type controls the type of each coordinate in the `VECTOR`, influencing precision and storage size. |
15 | | -A `VECTOR` value must have a dimension and a coordinate type. |
| 11 | +The `VECTOR` type is a fixed-length, ordered collection of numeric values (`INTEGER` or `FLOAT`) stored as a single unit. |
| 12 | +The type of a value is defined by: |
16 | 13 |
|
17 | | -.Example `VECTOR` type |
| 14 | +- *Dimension* -- The number of values it contains. |
| 15 | +- *Coordinate type* -- The data type of the entries, determining precision and storage size. |
| 16 | + |
| 17 | +.An example `VECTOR` value |
18 | 18 | [source] |
19 | 19 | ---- |
20 | | -VECTOR([1.05, 0.123, 5], 3, FLOAT32 NOT NULL) |
| 20 | +vector([1.05, 0.123, 5], 3, FLOAT32) |
21 | 21 | ---- |
22 | 22 |
|
23 | | -In this example, `[1.05, 0.123, 5]` is the coordinates of the `VECTOR`, `3` its dimensionality, and `FLOAT32` the data type for each coordinate value. |
24 | | - |
25 | | -.`VECTOR` and `LIST` values |
26 | | -[NOTE] |
27 | | -The dimensions and the formatting of a `VECTOR` is similar to a xref:values-and-types/lists.adoc[`LIST`] value. |
28 | | -However, whereas elements in `LIST` values can be accessed individually with xref:expressions/list-expressions.adoc[list expressions], operations on a `VECTOR` must operate on the entire `VECTOR`: it is not possible to access or slice individual elements of a `VECTOR` value. |
29 | | -Additionally, Neo4j cannot store lists containing `VECTOR` values. |
| 23 | +In this example, `[1.05, 0.123, 5]` is the list of values, `3` its dimension, and `FLOAT32` the data type of the individual entries. + |
| 24 | +Each number in the list can also be seen as a coordinate along one of the vector's dimensions. |
30 | 25 |
|
31 | | -The dimension of a `VECTOR` value must be larger than `0` and less than or equal to `4096`. |
32 | 26 |
|
33 | | -The following coordinate types are supported: |
| 27 | +[[valid-values]] |
| 28 | +=== Valid values |
34 | 29 |
|
35 | | -.Vector coordinate types |
| 30 | +- A `VECTOR` value must have a dimension and a coordinate type. |
| 31 | +- The dimension of a `VECTOR` value must be larger than `0` and less than or equal to `4096`. |
| 32 | +- Vectors cannot contain lists as elements. |
| 33 | +- Supported coordinate types are: + |
| 34 | ++ |
36 | 35 | [options="header",cols="2*<m"] |
37 | 36 | |=== |
38 | 37 | | Default name | Alias |
39 | 38 |
|
40 | 39 | | `FLOAT` | `FLOAT64` |
41 | 40 | | `FLOAT32` | |
42 | | -| `INTEGER` | `INT`, `INT64`, `SIGNED INTEGER`, `INTEGER64` |
| 41 | +| `INTEGER` | `INT`, `INT64`, `INTEGER64`, `SIGNED INTEGER` |
43 | 42 | | `INTEGER8` | `INT8` |
44 | 43 | | `INTEGER16`| `INT16` |
45 | 44 | | `INTEGER32` | `INT32` |
46 | 45 |
|
47 | 46 | |=== |
48 | 47 |
|
49 | | -`VECTOR` is a supertype of `VECTOR<TYPE>(DIMENSION)` types. |
50 | | -The same applies for `VECTOR` types that define only a coordinate type or a dimension: |
51 | | - |
52 | | -- `VECTOR` with only a defined dimension is a supertype of all `VECTOR` values of that dimension regardless of the coordinate type. |
53 | | -For example, `VECTOR(4)` is a supertype of `VECTOR<FLOAT>(4)` and `VECTOR<INT8>(4)`. |
54 | | -- `VECTOR` with only a defined coordinate type is a supertype of all `VECTOR` values with that coordinate type regardless of the dimension. |
55 | | -For example, `VECTOR<INT>` is a supertype of `VECTOR<INT>(3)` and `VECTOR<INT>(1024)`. |
56 | | - |
57 | | -All of these supertypes can be used in xref:expressions/predicates/type-predicate-expressions.adoc#type-predicate-vector[type predicate expressions] to verify `VECTOR` type values. |
58 | | - |
59 | | -See also: |
60 | | - |
61 | | -* xref:values-and-types/ordering-equality-comparison.adoc#ordering-and-comparison[Equality, ordering, and comparison of value types -> Ordering vector types] |
62 | | -* xref:values-and-types/property-structural-constructed.adoc#vector-type-normalization[Property, structural, and constructed values -> Vector type normalization] |
63 | 48 |
|
64 | 49 | [[construct-vector-values]] |
65 | 50 | == Construct vector values |
@@ -88,26 +73,6 @@ RETURN vector |
88 | 73 |
|
89 | 74 | ==== |
90 | 75 |
|
91 | | - |
92 | | -[IMPORTANT] |
93 | | -==== |
94 | | -Running a query that returns a `VECTOR` value via a link:{neo4j-docs-base-uri}/create-applications/[driver] version < 6.0 does not yield a `VECTOR`, but rather a placeholder `MAP` value. |
95 | | -A warning will also be attached. |
96 | | -
|
97 | | -.Result of returning a `VECTOR` with a driver older than 6.0 |
98 | | -[source] |
99 | | ----- |
100 | | -+----------------------------------------------------------------+ |
101 | | -| n.vector | |
102 | | -+----------------------------------------------------------------+ |
103 | | -| {originalType: "VECTOR(1, INTEGER64)", reason: "UNKNOWN_TYPE"} | |
104 | | -+----------------------------------------------------------------+ |
105 | | -warn: One or more values returned could not be handled by this version of the driver and were replaced with placeholder map values. Please upgrade your driver! |
106 | | -03N95 (Neo.ClientNotification.UnknownType) |
107 | | ----- |
108 | | -==== |
109 | | - |
110 | | - |
111 | 76 | .Create a `VECTOR` value using parameters |
112 | 77 | ==== |
113 | 78 |
|
@@ -143,10 +108,11 @@ RETURN vector |
143 | 108 | [[store-vector-properties]] |
144 | 109 | == Store vector values as properties |
145 | 110 |
|
146 | | -To store a `VECTOR` value as a node or relationship property, use the `vector()` function together with a write clause. |
| 111 | +To store a `VECTOR` value as a node or relationship property, use the `vector()` function and a write clause. |
147 | 112 |
|
148 | 113 | [NOTE] |
149 | | -Storing `VECTOR` values on an on-prem instance requires the database to be on link:{neo4j-docs-base-uri}/operations-manual/current/database-internals/store-formats/#store-format-overview[block format]. |
| 114 | +Storing `VECTOR` values requires the database to be on link:{neo4j-docs-base-uri}/operations-manual/current/database-internals/store-formats/#store-format-overview[block format]. |
| 115 | +This is the default on Aura instances. |
150 | 116 |
|
151 | 117 | .Create a node and a `VECTOR` property |
152 | 118 | ==== |
@@ -200,22 +166,96 @@ RETURN n.vectorProp AS vectorProp |
200 | 166 |
|
201 | 167 | ==== |
202 | 168 |
|
203 | | -[NOTE] |
204 | | -It is not possible to store lists of `VECTOR` values. |
205 | 169 |
|
| 170 | +[[drivers-fallback]] |
| 171 | +== Vectors and client libraries (drivers) |
| 172 | + |
| 173 | +Working with vectors via link:{neo4j-docs-base-uri}/create-applications/[Neo4j's client libraries] behaves differently depending on the library version. |
| 174 | + |
| 175 | +- *Versions >= 6.0* -- Vectors are fully supported and mapped into client types (see the _Data types_ page of each language manual). |
| 176 | +- *Versions < 6.0* -- Vectors can be created, attached, and manipulated in queries via Cypher functions, but can be neither returned to nor created in the application. + |
| 177 | +It is possible to create and store values as shown in the xref:store-vector-properties[Store vector values as properties] examples, but _returning_ a `VECTOR` results in a placeholder `MAP` value and a warning. |
| 178 | ++ |
| 179 | +.Result of returning a `VECTOR` with a driver older than 6.0 |
| 180 | +[source] |
| 181 | +---- |
| 182 | ++----------------------------------------------------------------+ |
| 183 | +| n.vector | |
| 184 | ++----------------------------------------------------------------+ |
| 185 | +| {originalType: "VECTOR(1, INTEGER64)", reason: "UNKNOWN_TYPE"} | |
| 186 | ++----------------------------------------------------------------+ |
| 187 | +warn: One or more values returned could not be handled by this version of the driver and were replaced with placeholder map values. Please upgrade your driver! |
| 188 | +03N95 (Neo.ClientNotification.UnknownType) |
| 189 | +---- |
| 190 | + |
| 191 | + |
| 192 | +[[type-coercion]] |
| 193 | +== Type coercion |
| 194 | + |
| 195 | +_Coercion_ is the conversion of list entries of one (implicit) type into the coordinate type of the vector when the two differ. |
206 | 196 |
|
207 | | -[[genai-plugin-vector-indexes]] |
208 | | -== Vector embeddings, the GenAI plugin, and vector indexes |
| 197 | +When the coordinate type is the same as the type of the given elements, no coercion is done. |
| 198 | +When the coordinate type differs, the values are either coerced or an error is raised, as described below. |
209 | 199 |
|
210 | | -Storing vector embeddings as `VECTOR` properties with a defined coordinate type has the following benefits: |
| 200 | +*An error is raised* if a value does not fit into the coordinate type. |
| 201 | +If the coordinate type is an `INTEGER` type and all the coordinate values are `INTEGER` values, then an error is raised if and only if one of the coordinate values does not fit into the size of the specified type. |
| 202 | +The same applies to `FLOAT` vector types: if the elements are all `FLOAT` values, an error is raised only if one value does not fit into the specified type. |
211 | 203 |
|
212 | | -* `VECTOR` values are stored on Neo4j's link:{neo4j-docs-base-uri}/operations-manual/current/database-internals/store-formats/#store-format-overview[block format], ensuring efficient retrieval and computation for operations like similarity search or distance calculations. |
213 | | -* Defining a coordinate type with the `vector()` function allows `VECTOR` values to be stored more efficiently. |
214 | | -Additionally, reducing a vectors coordinate type (e.g., from `INTEGER16` to `INTEGER8`) reduces storage requirements and improves performance, provided all values remain within the range supported by the smaller type. |
| 204 | +[source, cypher, test-fail] |
| 205 | +---- |
| 206 | +RETURN VECTOR([128], 1, INT8) |
| 207 | +// 22N28: data exception - overflow error. The result of the operation 'vector()' has caused an overflow. |
| 208 | +// 22003: data exception - numeric value out of range. The numeric value 128 is outside the required range. |
| 209 | +---- |
| 210 | + |
| 211 | +*Coercion (i.e. lossy conversion) is allowed* when: |
215 | 212 |
|
216 | | -For information about how to store embeddings generated by Neo4j as `VECTOR` values with the xref:genai-integrations.adoc[GenAI plugin], see: |
| 213 | +- The list contains `INTEGER` values and the specified vector type is of a `FLOAT` type. |
| 214 | +Precision will be lost for values at the higher end of the range (see the link:https://docs.oracle.com/javase/specs/jls/se21/html/jls-5.html[Java type specification]), but an error will be raised only if the value were to overflow/underflow. + |
| 215 | ++ |
| 216 | +[source, cypher] |
| 217 | +---- |
| 218 | +RETURN VECTOR([987374677], 1, FLOAT32) |
| 219 | +// vector([9.8737466E8], 1, FLOAT32 NOT NULL) |
| 220 | +---- |
| 221 | +- The list contains `FLOAT` values and the specified type is of an `INTEGER` type. |
| 222 | +Information may be lost, as all digits after the decimal point are truncated, but an error is raised only if the value would overflow or underflow. + |
| 223 | ++ |
| 224 | +[source, cypher] |
| 225 | +---- |
| 226 | +RETURN VECTOR([1.2], 1, INT) |
| 227 | +// vector([1], 1, INTEGER NOT NULL) |
| 228 | +---- |
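The coercion rules above can be modelled outside the database. The following Python sketch is illustrative only and is not how Neo4j implements `vector()`: the function name `coerce`, the `INT_RANGES` table, and the use of two's-complement ranges for the integer types are assumptions made for the example. It covers only the default type names; aliases are not resolved.

```python
import math
import struct

# Assumed inclusive ranges (two's complement) for the integer coordinate types.
INT_RANGES = {
    "INTEGER8": (-2**7, 2**7 - 1),
    "INTEGER16": (-2**15, 2**15 - 1),
    "INTEGER32": (-2**31, 2**31 - 1),
    "INTEGER": (-2**63, 2**63 - 1),
}


def to_float32(x):
    """Round a Python float to the nearest 32-bit float.

    struct raises OverflowError if x is too large for a 32-bit float.
    """
    return struct.unpack("f", struct.pack("f", x))[0]


def coerce(value, coordinate_type):
    """Hypothetical model of coercing one list entry to a coordinate type."""
    if coordinate_type in INT_RANGES:
        low, high = INT_RANGES[coordinate_type]
        truncated = math.trunc(value)  # drops everything after the decimal point
        if not low <= truncated <= high:
            raise OverflowError(f"{value} is outside the range of {coordinate_type}")
        return truncated
    if coordinate_type == "FLOAT32":
        return to_float32(float(value))  # may lose precision for large integers
    if coordinate_type == "FLOAT":
        return float(value)
    raise ValueError(f"unknown coordinate type: {coordinate_type}")


print(coerce(1.2, "INTEGER"))          # 1
print(coerce(987374677, "FLOAT32"))    # 987374656.0
try:
    coerce(128, "INTEGER8")
except OverflowError as err:
    print("error:", err)
```

The second call shows the precision loss from the `FLOAT32` example: 32-bit floats near this magnitude are spaced 64 apart, so `987374677` is rounded to `987374656`.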
| 229 | + |
| 230 | + |
| 231 | +[[supertypes]] |
| 232 | +== Supertypes |
| 233 | + |
| 234 | +`VECTOR` is a supertype of `VECTOR<TYPE>(DIMENSION)` types. |
| 235 | +The same applies for `VECTOR` types with only a coordinate type or a dimension: |
| 236 | + |
| 237 | +- `VECTOR` with only a defined dimension is a supertype of all `VECTOR` values of that dimension, regardless of the coordinate type. |
| 238 | +For example, `VECTOR(4)` is a supertype of `VECTOR<FLOAT>(4)` and `VECTOR<INT8>(4)`. |
| 239 | +- `VECTOR` with only a defined coordinate type is a supertype of all `VECTOR` values with that coordinate type, regardless of the dimension. |
| 240 | +For example, `VECTOR<INT>` is a supertype of `VECTOR<INT>(3)` and `VECTOR<INT>(1024)`. |
| 241 | + |
| 242 | +All of these supertypes can be used in xref:expressions/predicates/type-predicate-expressions.adoc#type-predicate-vector[type predicate expressions]. |
| 243 | +For more information, see: |
| 244 | + |
| 245 | +* xref:values-and-types/ordering-equality-comparison.adoc#ordering-and-comparison[Equality, ordering, and comparison of value types -> Ordering vector types] |
| 246 | +* xref:values-and-types/property-structural-constructed.adoc#vector-type-normalization[Property, structural, and constructed values -> Vector type normalization] |
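The supertype relationship described in this section can be pictured as "every constraint the general type defines must match". The sketch below is a minimal Python model of that rule, not Neo4j code: the `VectorType` class and `is_supertype` function are hypothetical names, coordinate types are assumed to be already normalized to their default names, and a type is treated as a supertype of itself for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VectorType:
    """Hypothetical model of a VECTOR type; None means 'not constrained'."""
    coordinate_type: Optional[str] = None
    dimension: Optional[int] = None


def is_supertype(general, specific):
    """True if every constraint set on `general` is matched by `specific`."""
    return (
        general.coordinate_type in (None, specific.coordinate_type)
        and general.dimension in (None, specific.dimension)
    )


# VECTOR(4) covers VECTOR<FLOAT>(4) and VECTOR<INTEGER8>(4).
print(is_supertype(VectorType(dimension=4), VectorType("FLOAT", 4)))        # True
print(is_supertype(VectorType(dimension=4), VectorType("INTEGER8", 4)))     # True
# VECTOR<INTEGER> covers any dimension with INTEGER coordinates.
print(is_supertype(VectorType("INTEGER"), VectorType("INTEGER", 1024)))     # True
# A different dimension does not match.
print(is_supertype(VectorType(dimension=4), VectorType("FLOAT", 3)))        # False
```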
| 247 | + |
| 248 | + |
| 249 | +[[lists-embeddings-vector-indexes]] |
| 250 | +== Lists, vector embeddings, and vector indexes |
| 251 | + |
| 252 | +`VECTOR` and xref:values-and-types/lists.adoc[`LIST`] values are similar, and both can be indexed and searched using xref:indexes/semantic-indexes/vector-indexes.adoc[vector indexes], but they differ in a few key ways: |
| 253 | + |
| 254 | +- Elements in a `LIST` can be accessed individually, whereas operations on a `VECTOR` must operate on the entire `VECTOR`: it is not possible to access or slice individual elements. |
| 255 | +- Storing vector embeddings as `VECTOR` properties with a defined coordinate type allows them to be stored more efficiently. |
| 256 | +Moreover, reducing a vector's coordinate type (e.g., from `INTEGER16` to `INTEGER8`) reduces storage requirements and improves performance, provided all values remain within the range supported by the smaller type. |
| 257 | + |
| 258 | +For information about how to store embeddings as `VECTOR` values with the xref:genai-integrations.adoc[GenAI plugin], see: |
217 | 259 |
|
218 | 260 | * xref:genai-integrations.adoc#single-embedding[Generate a single embedding and store it] |
219 | 261 | * xref:genai-integrations.adoc#multiple-embeddings[Generate multiple embeddings and store them] |
220 | | - |
221 | | -Embeddings stored as `VECTOR` properties can be indexed and searched semantically using xref:indexes/semantic-indexes/vector-indexes.adoc[vector indexes]. |
|