Commit 7db6e5e

Merge pull request #34800 from MashaMSFT/fixes
Updating supportability
2 parents 3e59a45 + a4d70ea commit 7db6e5e

1 file changed, +16 -14 lines changed

docs/relational-databases/vectors/vectors-sql-server.md

Lines changed: 16 additions & 14 deletions
@@ -21,12 +21,11 @@ monikerRange: "=sql-server-ver17 || =sql-server-linux-ver17 || =azuresqldb-curre
2121

2222
[!INCLUDE [sqlserver2025-asdb-asmi-fabricsqldb](../../includes/applies-to-version/sqlserver2025-asdb-asmi-fabricsqldb.md)]
2323

24-
The SQL Database Engine provides the ability to store any kind of data and run any kind of query: structured and unstructured, and to perform vector search on that data. It is a good choice for scenarios where you need to do search across all these data together, and you don't want to use a separate service for search that would complicate your architecture.
24+
The SQL Database Engine provides the ability to store any kind of data and run any kind of query: structured and unstructured, and to perform vector search on that data. It's a good choice for scenarios where you need to search across all these data together, and you don't want to use a separate service for search that would complicate your architecture.
2525

2626
> [!NOTE]
27-
> - Vector support in preview and is subject to change. Make sure to read preview usage terms in [Service Level Agreements (SLA) for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services).
28-
29-
Vector features are available in Azure SQL Managed Instance configured with the [Always-up-to-date](/azure/azure-sql/managed-instance/update-policy#always-up-to-date-update-policy) policy.
27+
> - Vector support is currently in preview and subject to change. Be sure to read preview usage terms in [Service Level Agreements (SLA) for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services).
28+
> - Vector features are available in Azure SQL Managed Instance configured with the [Always-up-to-date](/azure/azure-sql/managed-instance/update-policy#always-up-to-date-update-policy) update policy.
3029
3130
## Vectors
3231

@@ -71,13 +70,13 @@ SELECT
7170
CAST(@v AS JSON) AS j
7271
```
7372

74-
### Exact Search and Vector Distance (Exact Nearest Neighbors)
73+
### Exact search and vector distance (exact nearest neighbors)
7574

76-
Exact Search, also known as k-Nearest Neighbor (k-NN) search, involves calculating the distance between a given vector and all other vectors in a dataset, sorting the results, and selecting the closest neighbors based on a specified distance metric. This method guarantees precise retrieval of the nearest neighbors but can be computationally intensive, especially for large datasets.
75+
Exact search, also known as k-nearest neighbor (k-NN) search, involves calculating the distance between a given vector and all other vectors in a dataset, sorting the results, and selecting the closest neighbors based on a specified distance metric. This method guarantees precise retrieval of the nearest neighbors but can be computationally intensive, especially for large datasets.
7776

78-
Vector Distance functions are used to measure the closeness between vectors. Common distance metrics include Euclidean distance, cosine similarity, and dot product. These functions are essential for performing k-NN searches and ensuring accurate results.
77+
Vector distance functions are used to measure the closeness between vectors. Common distance metrics include Euclidean distance, cosine similarity, and dot product. These functions are essential for performing k-NN searches and ensuring accurate results.
7978

80-
Exact Nearest Neighbor (ENN) Vector Search performs an exhaustive distance calculation across all indexed vectors to guarantee the retrieval of the closest neighbors based on a specified distance metric. This method is precise but resource-intensive, making it suitable for smaller datasets or scenarios where accuracy is paramount.
79+
Exact nearest neighbor (ENN) vector search performs an exhaustive distance calculation across all indexed vectors to guarantee the retrieval of the closest neighbors based on a specified distance metric. This method is precise but resource-intensive, making it suitable for smaller datasets or scenarios where accuracy is paramount.
8180

8281
In the SQL Database Engine, k-NN searches can be performed using the [VECTOR_DISTANCE](../../t-sql/functions/vector-distance-transact-sql.md) function, which allows for efficient calculation of distances between vectors and facilitates the retrieval of the nearest neighbors.
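
As an illustration of what an exact search does conceptually, here is a minimal Python sketch, separate from the engine and from `VECTOR_DISTANCE` itself. It scans every vector, computes the cosine distance to the query, sorts, and keeps the `k` closest. The helper names (`cosine_distance`, `exact_knn`) and the sample data are hypothetical and chosen only for this example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance, defined here as 1 - cosine similarity.

    Assumes both vectors are non-zero and have the same length.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_knn(query, vectors, k):
    """Return the ids of the k vectors closest to `query`.

    This is an exhaustive scan: one distance calculation per stored vector,
    followed by a sort, which is why cost grows with the size of the dataset.
    """
    scored = [(cosine_distance(query, v), vid) for vid, v in vectors.items()]
    scored.sort()
    return [vid for _, vid in scored[:k]]


# Hypothetical toy dataset of 2-dimensional vectors keyed by id.
vectors = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}

nearest = exact_knn([1.0, 0.1], vectors, k=2)
print(nearest)
```

Because every stored vector is compared with the query, the result is exact, and the work is proportional to the number of vectors scanned.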
8382

@@ -92,23 +91,26 @@ ORDER BY distance
9291

9392
Using an exact search is recommended when you don't have many vectors to search on (less than 50,000 vectors as a general recommendation). The table can contain many more vectors as long as your search predicates reduce the number of vectors to use for neighbor search to 50,000 or fewer.
9493

95-
### Approximate Vector Index and Vector Search (Approximate Nearest Neighbors)
94+
### Approximate vector index and vector search (approximate nearest neighbors)
95+
96+
> [!NOTE]
97+
> Approximate vector index and vector search are in preview and currently only available in [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)].
9698
97-
Identifying all vectors close to a given query vector requires substantial resources to calculate the distance between the query vector and the vectors stored in the table. Searching for all vectors close to a given query vector involves a complete scan of the table and significant CPU usage. This is called a "K-Nearest Neighbors" or "KNN" query and returns the "k" closest vectors.
99+
Identifying all vectors close to a given query vector requires substantial resources to calculate the distance between the query vector and the vectors stored in the table. Searching for all vectors close to a given query vector involves a complete scan of the table and significant CPU usage. This is called a "K-nearest neighbors" or "k-NN" query and returns the "k" closest vectors.
98100

99101
Vectors are used to find similar data for AI models to answer user queries. This involves querying the database for the "k" vectors nearest to the query vector using distance metrics like dot (inner) product, cosine similarity, or Euclidean distance.
100102

101-
KNN queries often struggle with scalability, making it acceptable in many cases to trade off some accuracy, particularly recall, for significant speed gains. This method is known as Approximate Nearest Neighbors (ANN).
103+
K-NN queries often struggle with scalability, making it acceptable in many cases to trade off some accuracy, particularly recall, for significant speed gains. This method is known as approximate nearest neighbors (ANN).
102104

103105
Recall is an important concept that should become familiar to everyone using or planning to use vectors and embeddings. In fact, recall measures the proportion of the approximate nearest neighbors that are identified by the algorithm, compared to the exact nearest neighbors that an exhaustive search would return. Therefore, it is a good measurement of the quality of the approximation that the algorithm is doing. A perfect recall, which is equivalent to no approximation, is 1.
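
To make the definition of recall concrete, the following Python sketch compares the ids returned by an approximate search with the ids an exhaustive search would return. The function name `recall_at_k` and the sample id lists are hypothetical; this is not an engine API, only an illustration of the ratio described above.

```python
def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the exact nearest neighbors that the approximate search found."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approximate_ids)) / len(exact)


# Hypothetical results for k = 5: the approximate search missed id 5
# and returned id 9 instead.
exact_ids = [1, 2, 3, 4, 5]
approximate_ids = [1, 2, 3, 4, 9]

recall = recall_at_k(approximate_ids, exact_ids)
print(recall)
```

In this example four of the five exact neighbors were found, so the recall is 4/5. A recall of 1 means the approximate result set matches the exact one.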
104106

105-
For AI applications, the trade-off is quite reasonable. Since vector embeddings already approximate concepts, using ANN doesn't significantly affect the results, provided the recall is close to 1. This ensures that the returned results are very similar to those from KNN, while offering vastly improved performance and significantly reduced resource usage, which is highly beneficial for operational databases.
107+
For AI applications, the trade-off is quite reasonable. Since vector embeddings already approximate concepts, using ANN doesn't significantly affect the results, provided the recall is close to 1. This ensures that the returned results are very similar to those from k-NN, while offering vastly improved performance and significantly reduced resource usage, which is highly beneficial for operational databases.
106108

107109
It is important to understand that the term "index" when used referring to a [vector index](../../t-sql/statements/create-vector-index-transact-sql.md) has a different meaning than the index you are used to working with in relational databases. In fact, a vector index returns approximate results.
108110

109-
In MSSQL engine, vector indexes are based on the [DiskANN](https://www.microsoft.com/en-us/research/publication/diskann-fast-accurate-billion-point-nearest-neighbor-search-on-a-single-node) algorithm. DiskANN relies on creating a graph to navigate quickly through all the indexed vectors to find the closest match to a given vector. DiskANN is a graph-based system for indexing and searching large sets of vector data using limited computational resources. It efficiently uses SSDs and minimal memory to handle significantly more data than in-memory indices, while maintaining high queries per second (QPS) and low latency, ensuring a balance between memory, CPU and I/O usage and search performance.
111+
In the SQL Database engine, vector indexes are based on the [DiskANN](https://www.microsoft.com/en-us/research/publication/diskann-fast-accurate-billion-point-nearest-neighbor-search-on-a-single-node) algorithm. DiskANN relies on creating a graph to navigate quickly through all the indexed vectors to find the closest match to a given vector. DiskANN is a graph-based system for indexing and searching large sets of vector data using limited computational resources. It efficiently uses SSDs and minimal memory to handle significantly more data than in-memory indices, while maintaining high queries per second (QPS) and low latency, ensuring a balance between memory, CPU and I/O usage and search performance.
110112

111-
An Approximate Nearest Neighbors algorithm search can be done first creating a vector index using the [CREATE VECTOR INDEX](../../t-sql/statements/create-vector-index-transact-sql.md) T-SQL command and then using [VECTOR_SEARCH](../../t-sql/functions/vector-search-transact-sql.md) T-SQL function to run the approximate search.
113+
An approximate nearest neighbors algorithm search can be done first creating a vector index using the [CREATE VECTOR INDEX](../../t-sql/statements/create-vector-index-transact-sql.md) T-SQL command and then using [VECTOR_SEARCH](../../t-sql/functions/vector-search-transact-sql.md) T-SQL function to run the approximate search.
112114

113115
```sql
114116
DECLARE @qv VECTOR(1536) = AI_GENERATE_EMBEDDING(N'Pink Floyd music style' USE MODEL Ada2Embeddings);
