Vector type in Cypher 5 #1402
          
     Merged
      
      
    
  
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,124 @@ | ||
| = Vectors | ||
| :description: Create and store vectors (embeddings) as properties on nodes and relationships, and use them for efficient semantic retrieval with vector indexes and the GenAI plugin. | ||
| :page-role: new-neo4j-2025.10 | ||
|  | ||
| `VECTOR` values can be stored as xref:indexes/semantic-indexes/vector-indexes.adoc#embeddings[embedding] properties on nodes and relationships, and used for efficient semantic retrieval using xref:indexes/semantic-indexes/vector-indexes.adoc[vector indexes] and the xref:genai-integrations.adoc[GenAI plugin]. | ||
|  | ||
| [IMPORTANT] | ||
| Although the `VECTOR` type is present in Cypher 5, constructing and comparing vectors is only possible via link:https://neo4j.com/docs/cypher-manual/25/values-and-types/vector/[Cypher 25 features]. | ||
| However, vectors can still be inserted into and retrieved from the database with Cypher 5 queries, provided you use a link:https://neo4j.com/docs/create-applications/[Neo4j client library] of version 6.0 or later. | ||
|  | ||
|  | ||
| [[vector-type]] | ||
| == The vector type | ||
|  | ||
| The `VECTOR` type is a fixed-length, ordered collection of numeric values (`INTEGER` or `FLOAT`) stored as a single unit. | ||
| The type of a value is defined by: | ||
|  | ||
| - *Dimension* -- The number of values it contains. | ||
| - *Coordinate type* -- The data type of the entries, determining precision and storage size. | ||
|  | ||
| .An example `VECTOR` value | ||
| [source] | ||
| ---- | ||
| vector([1.05, 0.123, 5], 3, FLOAT32) | ||
| ---- | ||
|  | ||
| In this example, `[1.05, 0.123, 5]` is the list of values, `3` its dimension, and `FLOAT32` the data type of the individual entries. + | ||
| Each number in the list can also be seen as a coordinate along one of the vector's dimensions. | ||
|  | ||
|  | ||
| [[valid-values]] | ||
| === Valid values | ||
|  | ||
| - A `VECTOR` value must have a dimension and a coordinate type. | ||
| - The dimension of a `VECTOR` value must be greater than `0` and less than or equal to `4096`. | ||
| - Vectors cannot contain lists as elements. | ||
| - Supported coordinate types are: + | ||
| + | ||
| [options="header",cols="2*<m"] | ||
| |=== | ||
| | Default name | Alias | ||
|  | ||
| | `FLOAT` | `FLOAT64` | ||
| | `FLOAT32` | | ||
| | `INTEGER` | `INT`, `INT64`, `INTEGER64`, `SIGNED INTEGER` | ||
| | `INTEGER8` | `INT8` | ||
| | `INTEGER16`| `INT16` | ||
| | `INTEGER32` | `INT32` | ||
|  | ||
| |=== | ||
|  | ||
|  | ||
| [[drivers-fallback]] | ||
| == Vectors and client libraries (drivers) | ||
|  | ||
| Working with vectors via link:{neo4j-docs-base-uri}/create-applications/[Neo4j's client libraries] behaves differently depending on the library version. | ||
|  | ||
| - *Versions >= 6.0* -- Vectors are fully supported and mapped into client types (see the _Data types_ page of each language manual). | ||
| - *Versions < 6.0* -- Returning a `VECTOR` already present in the database results in a placeholder `MAP` value and a warning. | ||
| + | ||
| .Result of returning a `VECTOR` with a driver older than 6.0 | ||
| [source] | ||
| ---- | ||
| +----------------------------------------------------------------+ | ||
| | n.vector | | ||
| +----------------------------------------------------------------+ | ||
| | {originalType: "VECTOR(1, INTEGER64)", reason: "UNKNOWN_TYPE"} | | ||
| +----------------------------------------------------------------+ | ||
| warn: One or more values returned could not be handled by this version of the driver and were replaced with placeholder map values. Please upgrade your driver! | ||
| 03N95 (Neo.ClientNotification.UnknownType) | ||
| ---- | ||
|  | ||
|  | ||
| [[type-coercion]] | ||
| == Type coercion | ||
|  | ||
| _Coercion_ is the conversion of list entries of one (implicit) type into the different coordinate type of the vector being created. | ||
|  | ||
| When the coordinate type is the same as the type of the given elements, no coercion is done. | ||
| When the coordinate type differs, coercion may be done or an error may be raised depending on the situation. | ||
|  | ||
| *An error is raised* if a value does not fit into the coordinate type. | ||
| If the coordinate type is an `INTEGER` type and all the coordinate values are `INTEGER` values, then an error is raised if and only if one of the coordinate values does not fit into the specified type. | ||
| The same applies for `FLOAT` vector types: if the elements are all `FLOAT` values then an error will only be raised if one value does not fit into the specified type. | ||
|  | ||
| *Coercion (i.e. lossy conversion) is allowed* when: | ||
|  | ||
| - The list contains `INTEGER` values and the specified vector type is of a `FLOAT` type. | ||
| Precision is lost for values at the higher end of the range (see the link:https://docs.oracle.com/javase/specs/jls/se21/html/jls-5.html[Java type specification]), but an error is raised only if a value would overflow or underflow. + | ||
| - The list contains `FLOAT` values and the specified type is of an `INTEGER` type. | ||
| Information may be lost, as all digits after the decimal point are truncated, but an error is raised only if a value would overflow or underflow. + | ||
|  | ||
|  | ||
| [[supertypes]] | ||
| == Supertypes | ||
|  | ||
| `VECTOR` is a supertype of `VECTOR<TYPE>(DIMENSION)` types. | ||
| The same applies for `VECTOR` types with only a coordinate type or a dimension: | ||
|  | ||
| - `VECTOR` with only a defined dimension is a supertype of all `VECTOR` values of that dimension, regardless of the coordinate type. | ||
| For example, `VECTOR(4)` is a supertype of `VECTOR<FLOAT>(4)` and `VECTOR<INT8>(4)`. | ||
| - `VECTOR` with only a defined coordinate type is a supertype of all `VECTOR` values with that coordinate type, regardless of the dimension. | ||
| For example, `VECTOR<INT>` is a supertype of `VECTOR<INT>(3)` and `VECTOR<INT>(1024)`. | ||
|  | ||
| All of these supertypes can be used in xref:expressions/predicates/type-predicate-expressions.adoc#type-predicate-vector[type predicate expressions]. | ||
| For more information, see: | ||
|  | ||
| * xref:values-and-types/ordering-equality-comparison.adoc#ordering-and-comparison[Equality, ordering, and comparison of value types -> Ordering vector types] | ||
| * xref:values-and-types/property-structural-constructed.adoc#vector-type-normalization[Property, structural, and constructed values -> Vector type normalization] | ||
|  | ||
|  | ||
| [[lists-embeddings-vector-indexes]] | ||
| == Lists, vector embeddings, and vector indexes | ||
|  | ||
| `VECTOR` and xref:values-and-types/lists.adoc[`LIST`] values are similar and can both be indexed and searched using xref:indexes/semantic-indexes/vector-indexes.adoc[vector indexes], but they differ in a few key ways: | ||
|  | ||
| - Elements in a `LIST` can be accessed individually, whereas operations on a `VECTOR` must operate on the entire `VECTOR`: it is not possible to access or slice individual elements. | ||
| - Storing vector embeddings as `VECTOR` properties with a defined coordinate type allows them to be stored more efficiently. | ||
| Moreover, reducing a vector's coordinate type (e.g., from `INTEGER16` to `INTEGER8`) lowers storage requirements and improves performance, provided all values remain within the range supported by the smaller type. | ||
|  | ||
| For information about how to store embeddings as `VECTOR` values with the xref:genai-integrations.adoc[GenAI plugin], see: | ||
|  | ||
| * xref:genai-integrations.adoc#single-embedding[Generate a single embedding and store it] | ||
| * xref:genai-integrations.adoc#multiple-embeddings[Generate multiple embeddings and store them] | 
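
The coercion rules in the _Type coercion_ section of the diff can be modelled with a short Python sketch. This is only an illustration of the rules as the text states them, not Neo4j's implementation: the names `coerce` and `INT_RANGES` are hypothetical, truncation toward zero is assumed from the wording "truncated", and `FLOAT32` is approximated by a round trip through IEEE 754 single precision.

```python
import struct

# Inclusive ranges of the integer coordinate types listed in the table.
# (Hypothetical helper table, not part of any Neo4j API.)
INT_RANGES = {
    "INTEGER8": (-2**7, 2**7 - 1),
    "INTEGER16": (-2**15, 2**15 - 1),
    "INTEGER32": (-2**31, 2**31 - 1),
    "INTEGER": (-2**63, 2**63 - 1),
}


def coerce(value, coordinate_type):
    """Model how a single list entry is coerced into a coordinate type."""
    if coordinate_type in INT_RANGES:
        low, high = INT_RANGES[coordinate_type]
        truncated = int(value)  # drops everything after the decimal point
        if not low <= truncated <= high:
            raise OverflowError(f"{value} does not fit into {coordinate_type}")
        return truncated
    if coordinate_type == "FLOAT32":
        # Round trip through single precision; struct raises OverflowError
        # when the value is too large for a 32-bit float.
        return struct.unpack("<f", struct.pack("<f", value))[0]
    if coordinate_type == "FLOAT":
        return float(value)  # 64-bit float; raises OverflowError if too large
    raise ValueError(f"unsupported coordinate type: {coordinate_type}")


print(coerce(2.9, "INTEGER8"))        # 2 (fraction truncated, no error)
print(coerce(16777217, "FLOAT32"))    # 16777216.0 (precision lost, no error)
try:
    coerce(128, "INTEGER8")
except OverflowError as err:
    print("error:", err)              # 128 is outside the INTEGER8 range
```

The sketch shows the distinction the section draws: lossy conversions (truncating `2.9`, rounding `16777217`) succeed silently, while a value outside the target range raises an error.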