| title | summary |
|---|---|
Vector Functions and Operators |
Learn about functions and operators available for Vector data types. |
This document lists the functions and operators available for Vector data types.
Warning:
This feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed without prior notice. If you find a bug, you can report an issue on GitHub.
Note:
This feature is in beta. It might be changed without prior notice. If you find a bug, you can report an issue on GitHub.
Note:
Vector data types and these vector functions are available on TiDB Self-Managed, {{{ .starter }}}, {{{ .essential }}}, and TiDB Cloud Dedicated. For TiDB Self-Managed and TiDB Cloud Dedicated, the TiDB version must be v8.4.0 or later (v8.5.0 or later is recommended).
The following functions are designed specifically for Vector data types.
Vector distance functions:
| Function name | Description | Supported by vector index |
|---|---|---|
VEC_L2_DISTANCE |
Calculates L2 distance (Euclidean distance) between two vectors | Yes |
VEC_COSINE_DISTANCE |
Calculates the cosine distance between two vectors | Yes |
VEC_NEGATIVE_INNER_PRODUCT |
Calculates the negative of the inner product between two vectors | No |
VEC_L1_DISTANCE |
Calculates L1 distance (Manhattan distance) between two vectors | No |
Other vector functions:
| Function name | Description |
|---|---|
VEC_DIMS |
Returns the dimension of a vector |
VEC_L2_NORM |
Calculates the L2 norm (Euclidean norm) of a vector |
VEC_FROM_TEXT |
Converts a string into a vector |
VEC_AS_TEXT |
Converts a vector into a string |
The following built-in functions and operators are extended to support operations on Vector data types.
Arithmetic operators:
| Name | Description |
|---|---|
+ |
Vector element-wise addition operator |
- |
Vector element-wise subtraction operator |
For more information about how vector arithmetic works, see Vector Data Type | Arithmetic.
Aggregate (GROUP BY) functions:
| Name | Description |
|---|---|
COUNT() |
Return a count of the number of rows returned |
COUNT(DISTINCT) |
Return the count of a number of different values |
MAX() |
Return the maximum value |
MIN() |
Return the minimum value |
Comparison functions and operators:
| Name | Description |
|---|---|
BETWEEN ... AND ... |
Check whether a value is within a range of values |
COALESCE() |
Return the first non-NULL argument |
= |
Equal operator |
<=> |
NULL-safe equal to operator |
> |
Greater than operator |
>= |
Greater than or equal operator |
GREATEST() |
Return the largest argument |
IN() |
Check whether a value is within a set of values |
IS NULL |
Test whether a value is NULL |
ISNULL() |
Test whether the argument is NULL |
LEAST() |
Return the smallest argument |
< |
Less than operator |
<= |
Less than or equal operator |
NOT BETWEEN ... AND ... |
Check whether a value is not within a range of values |
!=, <> |
Not equal operator |
NOT IN() |
Check whether a value is not within a set of values |
For more information about how vectors are compared, see Vector Data Type | Comparison.
Control flow functions:
| Name | Description |
|---|---|
CASE |
Case operator |
IF() |
If/else construct |
IFNULL() |
Null if/else construct |
NULLIF() |
Return NULL if expr1 = expr2 |
Cast functions:
| Name | Description |
|---|---|
CAST() |
Cast a value as a string or vector |
CONVERT() |
Cast a value as a string |
For more information about how to use CAST(), see Vector Data Type | Cast.
VEC_L2_DISTANCE(vector1, vector2)Calculates the L2 distance (Euclidean distance) between two vectors using the following formula:
$DISTANCE(p,q)=\sqrt {\sum \limits {i=1}^{n}{(p{i}-q_{i})^{2}}}$
The two vectors must have the same number of dimensions. Otherwise, an error is returned.
Example:
SELECT VEC_L2_DISTANCE('[0, 3]', '[4, 0]');+-------------------------------------+
| VEC_L2_DISTANCE('[0, 3]', '[4, 0]') |
+-------------------------------------+
| 5 |
+-------------------------------------+
VEC_COSINE_DISTANCE(vector1, vector2)Calculates the cosine distance between two vectors using the following formula:
$DISTANCE(p,q)=1.0 - {\frac {\sum \limits {i=1}^{n}{p{i}q_{i}}}{{\sqrt {\sum \limits {i=1}^{n}{p{i}^{2}}}}\cdot {\sqrt {\sum \limits {i=1}^{n}{q{i}^{2}}}}}}$
The two vectors must have the same number of dimensions. Otherwise, an error is returned.
For embeddings from OpenAI, it is recommended that you use this function.
Example:
SELECT VEC_COSINE_DISTANCE('[1, 1]', '[-1, -1]');+-------------------------------------------+
| VEC_COSINE_DISTANCE('[1, 1]', '[-1, -1]') |
+-------------------------------------------+
| 2 |
+-------------------------------------------+
VEC_NEGATIVE_INNER_PRODUCT(vector1, vector2)Calculates the distance by using the negative of the inner product between two vectors, using the following formula:
$DISTANCE(p,q)=- INNER_PROD(p,q)=-\sum \limits {i=1}^{n}{p{i}q_{i}}$
The two vectors must have the same number of dimensions. Otherwise, an error is returned.
Example:
SELECT VEC_NEGATIVE_INNER_PRODUCT('[1, 2]', '[3, 4]');+------------------------------------------------+
| VEC_NEGATIVE_INNER_PRODUCT('[1, 2]', '[3, 4]') |
+------------------------------------------------+
| -11 |
+------------------------------------------------+
VEC_L1_DISTANCE(vector1, vector2)Calculates the L1 distance (Manhattan distance) between two vectors using the following formula:
$DISTANCE(p,q)=\sum \limits {i=1}^{n}{|p{i}-q_{i}|}$
The two vectors must have the same number of dimensions. Otherwise, an error is returned.
Example:
SELECT VEC_L1_DISTANCE('[0, 0]', '[3, 4]');+-------------------------------------+
| VEC_L1_DISTANCE('[0, 0]', '[3, 4]') |
+-------------------------------------+
| 7 |
+-------------------------------------+
VEC_DIMS(vector)Returns the dimension of a vector.
Examples:
SELECT VEC_DIMS('[1, 2, 3]');+-----------------------+
| VEC_DIMS('[1, 2, 3]') |
+-----------------------+
| 3 |
+-----------------------+
SELECT VEC_DIMS('[]');+----------------+
| VEC_DIMS('[]') |
+----------------+
| 0 |
+----------------+
VEC_L2_NORM(vector)Calculates the L2 norm (Euclidean norm) of a vector using the following formula:
$NORM(p)=\sqrt {\sum \limits {i=1}^{n}{p{i}^{2}}}$
Example:
SELECT VEC_L2_NORM('[3, 4]');+-----------------------+
| VEC_L2_NORM('[3, 4]') |
+-----------------------+
| 5 |
+-----------------------+
VEC_FROM_TEXT(string)Converts a string into a vector. In many cases, this conversion is done implicitly, for example when inserting data into a column of the VECTOR data type. However, in expressions where implicit conversion is not supported (such as arithmetic operations on vectors), you need to explicitly call this function.
Example:
SELECT VEC_FROM_TEXT('[1, 2]') + VEC_FROM_TEXT('[3, 4]');+-------------------------------------------------+
| VEC_FROM_TEXT('[1,2]') + VEC_FROM_TEXT('[3,4]') |
+-------------------------------------------------+
| [4,6] |
+-------------------------------------------------+
VEC_AS_TEXT(vector)Converts a vector into a string.
Example:
SELECT VEC_AS_TEXT('[1.000, 2.5]');+-----------------------------+
| VEC_AS_TEXT('[1.000, 2.5]') |
+-----------------------------+
| [1,2.5] |
+-----------------------------+
The vector functions and the extended usage of built-in functions and operators over vector data types are TiDB specific, and are not supported in MySQL.