Commit 0e1b23f

Merge pull request #246694 from HeidiSteen/heidist-vectors
[azure search] vector concept updates
2 parents f9fcf18 + 11207a7 commit 0e1b23f

2 files changed

+11
-7
lines changed


articles/search/search-faq-frequently-asked-questions.yml

Lines changed: 2 additions & 2 deletions
@@ -9,7 +9,7 @@ metadata:
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: faq
-ms.date: 06/29/2023
+ms.date: 07/28/2023
 title: Cognitive Search Frequently Asked Questions
 summary: Find answers to commonly asked questions about Azure Cognitive Search.

@@ -92,7 +92,7 @@ sections:
 - question: |
 What is vector search?
 answer: |
-Vector search is a technique used in information retrieval to find similar items in a dataset based on their vector representations.
+Vector search is a technique that finds the most similar documents by comparing their vector representations. Since the goal of a vector representation is to capture the essential characteristics of an item in a numerical format, it can capture abstract concepts and identify matches even if there are no explicit matches based on keywords or tags. When a user performs a search, the query is summarized into a vector representation and the vector search engine identifies the most similar documents. To improve efficiency on large databases, vector search often provides the approximate nearest neighbors for a query vector. See [Vector search overview](vector-search-overview.md) for the specifics of Azure Cognitive Search's vector search product offering.

 - question: |
 Does Azure Cognitive Search support vector search?

articles/search/vector-search-overview.md

Lines changed: 9 additions & 5 deletions
@@ -7,15 +7,15 @@ author: robertklee
 ms.author: robertlee
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 07/10/2023
+ms.date: 07/28/2023
 ---

 # Vector search within Azure Cognitive Search

 > [!IMPORTANT]
 > Vector search is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). It's available through the Azure portal, preview REST API, and [alpha SDKs](https://github.com/Azure/cognitive-search-vector-pr#readme).

-This article is a high-level introduction to vector search in Azure Cognitive Search. It also explains integration with other Azure services and covers the core concepts you should know for vector search development.
+This article is a high-level introduction to vector search in Azure Cognitive Search. It also explains integration with other Azure services and covers [terms and concepts](#vector-search-concepts) related to vector search development.

 We recommend this article for background, but if you'd rather get started, follow these steps:

@@ -60,7 +60,7 @@ Scenarios for vector search include:

 + **Filtered vector search**. A query request can include a vector query and a [filter expression](search-filters.md). Filters apply to text and numeric fields, and are useful for including or excluding search documents based on filter criteria. Although a vector field isn't filterable itself, you can set up a filterable text or numeric field. The search engine processes the filter first, reducing the surface area of the search corpus before running the vector query.

-+ **Vector database**. Use Cognitive Search as a vector store to serve as long-term memory or an external knowledge base for Large Language Models (LLMs), or other applications.
++ **Vector database**. Use Cognitive Search as a vector store to serve as long-term memory or an external knowledge base for Large Language Models (LLMs), or other applications. For example, you can use Azure Cognitive Search as a [*vector index* in an Azure Machine Learning prompt flow](/azure/machine-learning/concept-vector-stores) for Retrieval Augmented Generation (RAG) applications.

 ## Azure integration and related services
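Reviewer illustration (not part of the commit): the filter-then-vector ordering described in the "Filtered vector search" bullet in the hunk above can be sketched in a few lines of Python. This is a toy in-memory model, not the Azure Cognitive Search API. The `docs` list, the `category` field, and the `filtered_vector_search` helper are hypothetical names chosen for the example, and cosine similarity is assumed as the similarity metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filtered_vector_search(docs, query_vector, predicate, k=2):
    """Apply the filter first, then rank only the surviving documents."""
    # Step 1: the filter runs on a non-vector field and shrinks the corpus.
    candidates = [d for d in docs if predicate(d)]
    # Step 2: the vector query scores only the filtered candidates.
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(d["vector"], query_vector),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical documents: a filterable text field plus a vector field.
docs = [
    {"id": "1", "category": "hotel", "vector": [0.9, 0.1, 0.0]},
    {"id": "2", "category": "hotel", "vector": [0.1, 0.9, 0.0]},
    {"id": "3", "category": "restaurant", "vector": [1.0, 0.0, 0.0]},
]

results = filtered_vector_search(
    docs,
    query_vector=[1.0, 0.0, 0.0],
    predicate=lambda d: d["category"] == "hotel",
    k=1,
)
print([d["id"] for d in results])
```

Document 3 is the closest vector to the query, but it never reaches the ranking step because the filter removes it first. That is the behavior the bullet describes.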

@@ -82,11 +82,15 @@ If you're new to vectors, this section explains some core concepts.

 ### About vector search

-Vector search is a method of information retrieval that aims to overcome the limitations of traditional keyword-based search. Rather than relying solely on lexical analysis and matching of individual query terms, vector search uses machine learning models to capture the meaning of words and phrases in context. This is done by representing documents and queries as vectors in a high-dimensional space, called an embedding. By capturing the intent of the query with the embedding, vector search can return more relevant results that match the user's needs, even if the exact terms aren't present in the document. Additionally, vector search can be applied to different types of content, such as images and videos, not just text. This enables new search experiences such as multi-modal search or cross-language search.
+Vector search is a method of information retrieval where documents and queries are represented as vectors instead of plain text. In vector search, machine learning models generate the vector representations of source inputs, which can be text, images, audio, or video content. Having a mathematical representation of content provides a common basis for search scenarios. If everything is a vector, a query can find a match in vector space, even if the associated original content is in different media or in a different language than the query.
+
+### Why use vector search
+
+Vectors can overcome the limitations of traditional keyword-based search by using machine learning models to capture the meaning of words and phrases in context, rather than relying solely on lexical analysis and matching of individual query terms. By capturing the intent of the query, vector search can return more relevant results that match the user's needs, even if the exact terms aren't present in the document. Additionally, vector search can be applied to different types of content, such as images and videos, not just text. This enables new search experiences such as multi-modal search or cross-language search.

 ### Embeddings and vectorization

-*Embeddings* are a specific type of vector representation created by machine learning models that capture the semantic meaning of text, or representations of other content such as images. Natural language machine learning models are trained on large amounts of data to identify patterns and relationships between words. During training, they learn to represent any input as a vector of real numbers in an intermediary step called the *encoder*. After training is complete, these language models can be modified so the intermediary vector representation becomes the model's output. The resulting embeddings are high-dimensional vectors, where words with similar meanings are closer together in the vector space, as explained in [this Azure OpenAI Service article](/azure/ai-services/openai/concepts/understand-embeddings).
+*Embeddings* are a specific type of vector representation of content or a query, created by machine learning models that capture the semantic meaning of text or representations of other content such as images. Natural language machine learning models are trained on large amounts of data to identify patterns and relationships between words. During training, they learn to represent any input as a vector of real numbers in an intermediary step called the *encoder*. After training is complete, these language models can be modified so the intermediary vector representation becomes the model's output. The resulting embeddings are high-dimensional vectors, where words with similar meanings are closer together in the vector space, as explained in [this Azure OpenAI Service article](/azure/ai-services/openai/concepts/understand-embeddings).

 The effectiveness of vector search in retrieving relevant information depends on the effectiveness of the embedding model in distilling the meaning of documents and queries into the resulting vector. The best models are well-trained on the types of data they're representing. You can evaluate existing models such as Azure OpenAI text-embedding-ada-002, bring your own model that's trained directly on the problem space, or fine-tune a general-purpose model. Azure Cognitive Search doesn't impose constraints on which model you choose, so pick the best one for your data.
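Reviewer illustration (not part of the commit): the statement in the hunk above that words with similar meanings sit closer together in vector space can be made concrete with a small Python sketch. The three-dimensional vectors below are hand-written stand-ins, not output from text-embedding-ada-002 or any real model, and cosine similarity is assumed as the closeness measure. Real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: "dog" and "puppy" point in a similar
# direction, while "car" points somewhere else entirely.
embeddings = {
    "dog": [0.90, 0.80, 0.10],
    "puppy": [0.85, 0.90, 0.15],
    "car": [0.10, 0.20, 0.95],
}

# Hypothetical embedding of the query text "canine". The word itself
# appears nowhere in the indexed terms above.
query_vector = [0.88, 0.82, 0.12]

# Score every stored vector against the query and sort best-first.
ranked = sorted(
    ((term, cosine_similarity(vec, query_vector)) for term, vec in embeddings.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

The two dog-related terms score close to 1.0 and "car" scores far lower, even though the query shares no keywords with any of them. This exhaustive comparison is only practical for small collections. As the FAQ change in this commit notes, vector search on large databases often returns approximate nearest neighbors instead.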
