
Commit 8684784

checkpoint
1 parent 7d7dbf8 commit 8684784


4 files changed

+31
-26
lines changed

articles/search/TOC.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -147,8 +147,6 @@
147147
href: cognitive-search-working-with-skillsets.md
148148
- name: Integrated vectorization (preview)
149149
href: vector-search-integrated-vectorization.md
150-
- name: Debug sessions
151-
href: cognitive-search-debug-session.md
152150
- name: Retrieval (queries)
153151
items:
154152
- name: Full-text search
@@ -368,6 +366,8 @@
368366
href: cognitive-search-defining-skillset.md
369367
- name: Create an index projection for a secondary index
370368
href: index-projections-concept-intro.md
369+
- name: Debug sessions overview
370+
href: cognitive-search-debug-session.md
371371
- name: Debug a skillset
372372
href: cognitive-search-how-to-debug-skillset.md
373373
- name: Reference an annotation

articles/search/vector-search-how-to-create-index.md

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
---
2-
title: Add vector search
2+
title: Create a vector store
33
titleSuffix: Azure AI Search
44
description: Create or update a search index to include vector fields.
55

@@ -9,17 +9,17 @@ ms.service: cognitive-search
99
ms.custom:
1010
- ignite-2023
1111
ms.topic: how-to
12-
ms.date: 11/27/2023
12+
ms.date: 01/29/2024
1313
---
1414

15-
# Add vector fields to a search index
15+
# Create a vector store
1616

17-
In Azure AI Search, vector data is indexed as *vector fields* in a [search index](search-what-is-an-index.md).
17+
In Azure AI Search, vector data is indexed and stored as *vector fields* in a [search index](search-what-is-an-index.md).
1818

1919
Follow these steps to index vector data:
2020

2121
> [!div class="checklist"]
22-
> + Add one or more vector configurations to an index schema.
22+
> + Define a schema with one or more vector configurations that specify algorithms for indexing and search.
2323
> + Add one or more vector fields.
2424
> + Load the index with vector data [as a separate step](#load-vector-data-for-indexing), or use [integrated vectorization (preview)](vector-search-integrated-vectorization.md) for data chunking and encoding during indexing.
2525
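The first two checklist steps can be sketched as a request body for index creation. This is a minimal illustration, not the article's own sample: the index name, field names, and profile and algorithm names are hypothetical, and the property names (`vectorSearch`, `profiles`, `vectorSearchProfile`, `dimensions`) reflect my understanding of the 2023-11-01 REST API, so check them against the API reference before use.

```python
import json

# Assumption: 1536 matches text-embedding-ada-002, as stated in the prerequisites.
EMBEDDING_DIMENSIONS = 1536


def build_vector_index_schema(index_name: str) -> dict:
    """Return an index definition with one vector configuration and one vector field.

    All names below ("my-hnsw", "my-vector-profile", "contentVector") are
    hypothetical placeholders chosen for this sketch.
    """
    return {
        "name": index_name,
        "fields": [
            {"name": "id", "type": "Edm.String", "key": True},
            {"name": "content", "type": "Edm.String", "searchable": True},
            {
                # The vector field: a collection of single-precision floats.
                "name": "contentVector",
                "type": "Collection(Edm.Single)",
                "searchable": True,
                "dimensions": EMBEDDING_DIMENSIONS,
                "vectorSearchProfile": "my-vector-profile",
            },
        ],
        "vectorSearch": {
            # The vector configuration: which algorithm and similarity metric to use.
            "algorithms": [
                {
                    "name": "my-hnsw",
                    "kind": "hnsw",
                    "hnswParameters": {"metric": "cosine"},
                }
            ],
            # A profile ties a vector field to a named algorithm configuration.
            "profiles": [{"name": "my-vector-profile", "algorithm": "my-hnsw"}],
        },
    }


schema = build_vector_index_schema("hotels-vector-demo")
print(json.dumps(schema, indent=2))
```

The resulting dictionary would be sent as the JSON body of a create-index request. Loading documents with vector data remains a separate step.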
@@ -30,9 +30,9 @@ This article applies to the generally available, non-preview version of [vector
3030
3131
## Prerequisites
3232

33-
+ Azure AI Search, in any region and on any tier. Most existing services support vector search. For services created prior to January 2019, there's a small subset that support vector search. If an index containing vector fields fails to be created or updated, this is an indicator. In this situation, a new service must be created.
33+
+ Azure AI Search, in any region and on any tier. Most existing services support vector search. A small subset of services created prior to January 2019 doesn't support vector search. If creating or updating an index that contains vector fields fails, your service is likely in that subset, and you must create a new service.
3434

35-
+ Pre-existing vector embeddings in your source documents. Azure AI Search doesn't generate vectors in the generally available version of vector search. We recommend [Azure OpenAI embedding models](/azure/ai-services/openai/concepts/models#embeddings-models) but you can use any model for vectorization. For more information, see [Generate embeddings](vector-search-how-to-generate-embeddings.md).
35+
+ Pre-existing vector embeddings in your source documents. Azure AI Search doesn't generate vectors in the generally available version of the Azure SDKs and REST APIs. We recommend [Azure OpenAI embedding models](/azure/ai-services/openai/concepts/models#embeddings-models) but you can use any model for vectorization. For more information, see [Generate embeddings](vector-search-how-to-generate-embeddings.md).
3636

3737
+ You should know the dimensions limit of the model used to create the embeddings and how similarity is computed. In Azure OpenAI, for **text-embedding-ada-002**, the length of the numerical vector is 1536. Similarity is computed using `cosine`.
3838

articles/search/vector-search-overview.md

Lines changed: 22 additions & 17 deletions
@@ -9,49 +9,54 @@ ms.service: cognitive-search
99
ms.custom:
1010
- ignite-2023
1111
ms.topic: conceptual
12-
ms.date: 12/05/2023
12+
ms.date: 01/29/2024
1313
---
1414

15-
# Vector search in Azure AI Search
15+
# Vector stores and vector search in Azure AI Search
1616

17-
Vector search is an approach in information retrieval that uses numeric representations of content for search scenarios. Because the content is numeric rather than plain text, the search engine matches on vectors that are the most similar to the query, with no requirement for matching on exact terms.
17+
Vector search is an approach in information retrieval that stores numeric representations of content for search scenarios. Because the content is numeric rather than plain text, the search engine matches on vectors that are the most similar to the query, with no requirement for matching on exact terms.
1818

1919
This article is a high-level introduction to vector support in Azure AI Search. It also explains integration with other Azure services and covers [terminology and concepts](#vector-search-concepts) related to vector search development.
2020

2121
We recommend this article for background, but if you'd rather get started, follow these steps:
2222

2323
> [!div class="checklist"]
24-
> + [Generate vector embeddings](vector-search-how-to-generate-embeddings.md) before you start, or try out [integrated vectorization (preview)](vector-search-integrated-vectorization.md).
25-
> + [Add vector fields to an index](vector-search-how-to-create-index.md).
26-
> + [Load vector data](search-what-is-data-import.md) into an index using push or pull methodologies.
27-
> + [Query vector data](vector-search-how-to-query.md) using the Azure portal, REST APIs, or Azure SDK packages.
24+
> + [Provide embeddings](vector-search-how-to-generate-embeddings.md) or [generate embeddings (preview)](vector-search-integrated-vectorization.md)
25+
> + [Create a vector store](vector-search-how-to-create-index.md)
26+
> + [Run vector queries](vector-search-how-to-query.md)
2827
2928
You could also begin with the [vector quickstart](search-get-started-vector.md) or the [code samples on GitHub](https://github.com/Azure/azure-search-vector-samples).
3029

31-
Vector search is in the Azure portal and the Azure SDKs for [.NET](https://www.nuget.org/packages/Azure.Search.Documents), [Python](https://pypi.org/project/azure-search-documents), and [JavaScript](https://www.npmjs.com/package/@azure/search-documents/v/12.0.0-beta.2).
32-
3330
## What's vector search in Azure AI Search?
3431

35-
Vector search is a new capability for indexing, storing, and retrieving vector embeddings from a search index. You can use it to power similarity search, multi-modal search, recommendations engines, or applications implementing the [Retrieval Augmented Generation (RAG) architecture](https://aka.ms/what-is-rag).
32+
Vector search is a new capability for indexing, storing, and querying vector embeddings from a search index. You can use it to power similarity search, multimodal search, recommendation engines, or applications implementing the [Retrieval Augmented Generation (RAG) architecture](https://aka.ms/what-is-rag).
3633

3734
The following diagram shows the indexing and query workflows for vector search.
3835

3936
:::image type="content" source="media/vector-search-overview/vector-search-architecture-diagram-3.svg" alt-text="Architecture of vector search workflow." border="false" lightbox="media/vector-search-overview/vector-search-architecture-diagram-3-high-res.png":::
4037

41-
On the indexing side, Azure AI Search takes vector embeddings and uses a [nearest neighbors algorithm](vector-search-ranking.md) to co-locate similar vectors together in the search index (vectors about popular movies are closer than vectors about popular dog breeds).
38+
On the indexing side, Azure AI Search takes vector embeddings and uses a [nearest neighbors algorithm](vector-search-ranking.md) to place similar vectors close together in an index.
4239

43-
How you get embeddings from your source content depends on your approach and whether you can use preview features. You can vectorize or generate embeddings using models from OpenAI, Azure OpenAI, and any number of providers, over a wide range of source content including text, images, and other content types supported by the models. You can then push pre-vectorized content to [vector fields](vector-search-how-to-create-index.md) in a search index. That's the generally available approach. If you can use preview features, Azure AI Search provides [integrated data chunking and vectorization](vector-search-integrated-vectorization.md) in an indexer pipeline. You still provide the resources (endpoints and connection information), but Azure AI Search makes all of the calls and handles the transitions.
40+
How you get embeddings from your source content depends on your approach and whether you can use preview features. You can vectorize or generate embeddings using models from OpenAI, Azure OpenAI, and any number of providers, over a wide range of source content including text, images, and other content types supported by the models. You can then push pre-vectorized content to [vector fields](vector-search-how-to-create-index.md) in a vector store. That's the generally available approach. If you can use preview features, Azure AI Search offers [integrated data chunking and vectorization](vector-search-integrated-vectorization.md) in an indexer pipeline. You still provide the resources (endpoints and connection information to Azure OpenAI), but Azure AI Search makes all of the calls and handles the transitions.
4441

45-
On the query side, in your client application, collect the query input from a user. You can then add an encoding step that converts the input into a vector, and then send the vector query to your index on Azure AI Search for a similarity search. As with indexing, you can deploy the [integrated vectorization (preview)](vector-search-integrated-vectorization.md) to convert text inputs to a vector. For either approach, Azure AI Search returns documents with the requested `k` nearest neighbors (kNN) in the results.
42+
On the query side, in your client application, you collect the query input from a user, usually through a prompt workflow. You can then add an encoding step that converts the input into a vector, and then send the vector query to your index on Azure AI Search for a similarity search. As with indexing, you can deploy the [integrated vectorization (preview)](vector-search-integrated-vectorization.md) to convert the question into a vector. For either approach, Azure AI Search returns documents with the requested `k` nearest neighbors (kNN) in the results.
4643

47-
Azure AI Search supports [hybrid scenarios](hybrid-search-overview.md). You can index vector data as fields in documents alongside alphanumeric content. Vector queries can be issued singly or in combination with filters and other query types, including term queries and semantic ranking in the same search request.
44+
Azure AI Search supports [hybrid scenarios](hybrid-search-overview.md) that run vector and keyword search in parallel, returning a unified result set that often provides better results than vector or keyword search alone. For hybrid, vector and non-vector content are ingested into the same index, for queries that run side by side.
4845

4946
## Availability and pricing
5047

5148
Vector search is available as part of all Azure AI Search tiers in all regions at no extra charge.
5249

5350
Newer services created after July 1, 2023 support [higher quotas for vector indexes](vector-search-index-size.md).
5451

52+
Vector search is available in:
53+
54+
+ Azure portal using the [Import and vectorize data wizard](search-get-started-portal-import-vectors.md)
55+
+ Azure OpenAI Studio, see this [Quickstart](search-get-started-retrieval-augmented-generation.md)
56+
+ Azure AI Studio
57+
+ Azure REST APIs, [version 2023-11-01](/rest/api/searchservice/operation-groups)
58+
+ Azure SDKs for [.NET](https://www.nuget.org/packages/Azure.Search.Documents), [Python](https://pypi.org/project/azure-search-documents), and [JavaScript](https://www.npmjs.com/package/@azure/search-documents/v/12.0.0-beta.2)
59+
5560
> [!NOTE]
5661
> Some older search services created before January 1, 2019 are deployed on infrastructure that doesn't support vector workloads. If you try to add a vector field to a schema and get an error, it's a result of outdated services. In this situation, you must create a new search service to try out the vector feature.
5762
@@ -115,13 +120,13 @@ For example, documents that talk about different species of dogs would be cluste
115120

116121
### Nearest neighbors search
117122

118-
In vector search, the search engine searches through the vectors within the embedding space to identify those that are near to the query vector. This technique is called [*nearest neighbor search*](https://en.wikipedia.org/wiki/Nearest_neighbor_search). Nearest neighbors help quantify the similarity between items. A high degree of vector similarity indicates that the original data was similar too. To facilitate fast nearest neighbor search, the search engine will perform optimizations or employ data structures or data partitioning to reduce the search space. Each vector search algorithm will have different approaches to this problem, trading off different characteristics such as latency, throughput, recall, and memory. To compute similarity, similarity metrics provide the mechanism for computing this distance.
123+
In vector search, the search engine searches through the vectors within the embedding space to identify those that are near to the query vector. This technique is called [*nearest neighbor search*](https://en.wikipedia.org/wiki/Nearest_neighbor_search). Nearest neighbors help quantify the similarity between items. A high degree of vector similarity indicates that the original data was similar too. To facilitate fast nearest neighbor search, the search engine performs optimizations or employs data structures or data partitioning to reduce the search space. Each vector search algorithm has different approaches to this problem, trading off different characteristics such as latency, throughput, recall, and memory. Similarity metrics provide the mechanism for computing the distance between vectors.
119124

120125
Azure AI Search currently supports the following algorithms:
121126

122127
+ Hierarchical Navigable Small World (HNSW): HNSW is a leading ANN algorithm optimized for high-recall, low-latency applications where data distribution is unknown or can change frequently. It organizes high-dimensional data points into a hierarchical graph structure that enables fast and scalable similarity search while allowing a tunable trade-off between search accuracy and computational cost. Because the algorithm requires all data points to reside in memory for fast random access, this algorithm consumes [vector storage](vector-search-index-size.md) quota.
123128

124-
+ Exhaustive K-nearest neighbors (KNN): Calculates the distances between the query vector and all data points. It's computationally intensive, so it works best for smaller datasets. Because the algorithm doesn't require fast random access of data points, this algorithm doesn't consume vector storage quota. However, this algorithm will provide the global set of nearest neighbors.
129+
+ Exhaustive K-nearest neighbors (KNN): Calculates the distances between the query vector and all data points. It's computationally intensive, so it works best for smaller datasets. Because the algorithm doesn't require fast random access of data points, this algorithm doesn't consume vector storage quota. However, this algorithm provides the global set of nearest neighbors.
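
To make the exhaustive approach concrete, here's a minimal sketch in plain Python. It's an illustration of the general technique and not the service's implementation: it scores every stored vector against the query using cosine similarity and keeps the top `k`. The document IDs and three-dimensional vectors are made up for readability; real embeddings are much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def exhaustive_knn(query, documents, k):
    """Score every document against the query and return the k most similar.

    Cost grows linearly with the number of documents, which is why this
    approach suits smaller datasets. No extra index structure is required.
    """
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data.
documents = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
    "doc-d": [0.0, 0.0, 1.0],
}

top = exhaustive_knn([1.0, 0.05, 0.0], documents, k=2)
for doc_id, score in top:
    print(doc_id, round(score, 4))
```

Because every vector is compared, the result is the true set of nearest neighbors for the chosen metric. An approximate algorithm such as HNSW trades some of that certainty for speed.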
125130

126131
Within an index definition, you can specify one or more algorithms, and then for each vector field specify which algorithm to use:
127132

@@ -131,7 +136,7 @@ Within an index definition, you can specify one or more algorithms, and then for
131136

132137
Algorithm parameters that are used to initialize the index during index creation are immutable and can't be changed after the index is built. However, parameters that affect the query-time characteristics (`efSearch`) can be modified.
133138

134-
In addition, fields that specify HNSW algorithm also support exhaustive KNN search using the [query request](vector-search-how-to-query.md) parameter `"exhaustive": true`. The opposite isn't true however. If a field is indexed for `exhaustiveKnn`, you can't use HNSW in the query because the additional data structures that enable efficient search don’t exist.
139+
In addition, fields that specify the HNSW algorithm also support exhaustive KNN search using the [query request](vector-search-how-to-query.md) parameter `"exhaustive": true`. The opposite isn't true, however. If a field is indexed for `exhaustiveKnn`, you can't use HNSW in the query because the extra data structures that enable efficient search don’t exist.
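
As a rough sketch of how that query-time override might look, the following builds a search request body with the `exhaustive` flag set. The field name `contentVector` and the short query vector are hypothetical, and the `vectorQueries`, `kind`, `fields`, and `k` property names reflect my understanding of the 2023-11-01 query API, so verify them against the query reference.

```python
import json


def build_vector_query(query_vector, field, k, exhaustive=False):
    """Return a search request body containing a single vector query.

    Setting exhaustive=True asks for a full scan on a field that was
    indexed with HNSW, per the paragraph above.
    """
    return {
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(query_vector),
                "fields": field,  # hypothetical vector field name
                "k": k,
                "exhaustive": exhaustive,
            }
        ]
    }


# A real query vector would have as many dimensions as the indexed field.
body = build_vector_query([0.1, 0.2, 0.3], field="contentVector", k=5, exhaustive=True)
print(json.dumps(body, indent=2))
```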
135140

136141
### Approximate Nearest Neighbors
137142
