articles/search/cognitive-search-concept-intro.md
+1 -1 (1 addition & 1 deletion)
@@ -17,7 +17,7 @@ ms.date: 01/30/2024
In Azure AI Search, *AI enrichment* refers to integration with [Azure AI services](/azure/ai-services/what-are-ai-services) to process content that isn't searchable in its raw form. Through enrichment, analysis and inference are used to create searchable content and structure where none previously existed.
-
Because Azure AI Search is a text and vector search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios. Source content must be textual (you can't enrich vectors), but the content created by an enrichment pipeline can be vectorized and indexed in a vector store using skills like [Text Split skill](cognitive-search-skill-textsplit.md) for chunking and [AzureOpenAIEmbedding skill](cognitive-search-skill-azure-openai-embedding.md) for encoding.
+
Because Azure AI Search is a text and vector search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios. Source content must be textual (you can't enrich vectors), but the content created by an enrichment pipeline can be vectorized and indexed in a vector index using skills like [Text Split skill](cognitive-search-skill-textsplit.md) for chunking and [AzureOpenAIEmbedding skill](cognitive-search-skill-azure-openai-embedding.md) for encoding.
AI enrichment is based on [*skills*](cognitive-search-working-with-skillsets.md).
articles/search/cognitive-search-custom-skill-web-api.md
+1 -1 (1 addition & 1 deletion)
@@ -15,7 +15,7 @@ ms.date: 03/05/2024
The **Custom Web API** skill allows you to extend AI enrichment by calling out to a Web API endpoint providing custom operations. Similar to built-in skills, a **Custom Web API** skill has inputs and outputs. Depending on the inputs, your Web API receives a JSON payload when the indexer runs, and outputs a JSON payload as a response, along with a success status code. The response is expected to have the outputs specified by your custom skill. Any other response is considered an error and no enrichments are performed. The structure of the JSON payload is described further down in this document.
-
The **Custom Web API** skill is also used in the implementation of [Azure OpenAI On Your Data](/azure/ai-services/openai/concepts/use-your-data) feature. If Azure OpenAI is [configured for role-based access](/azure/ai-services/openai/how-to/use-your-data-securely#configure-azure-openai) and you get `403 Forbidden` calls when creating the vector store, verify that Azure AI Search has a [system assigned identity](search-howto-managed-identities-data-sources.md#create-a-system-managed-identity) and runs as a [trusted service](/azure/ai-services/openai/how-to/use-your-data-securely#enable-trusted-service) on Azure OpenAI.
+
The **Custom Web API** skill is also used in the implementation of [Azure OpenAI On Your Data](/azure/ai-services/openai/concepts/use-your-data) feature. If Azure OpenAI is [configured for role-based access](/azure/ai-services/openai/how-to/use-your-data-securely#configure-azure-openai) and you get `403 Forbidden` calls when creating the vector index, verify that Azure AI Search has a [system assigned identity](search-howto-managed-identities-data-sources.md#create-a-system-managed-identity) and runs as a [trusted service](/azure/ai-services/openai/how-to/use-your-data-securely#enable-trusted-service) on Azure OpenAI.
> [!NOTE]
> The indexer retries twice for certain standard HTTP status codes returned from the Web API. These HTTP status codes are:
articles/search/cognitive-search-output-field-mapping.md
+1 -1 (1 addition & 1 deletion)
@@ -37,7 +37,7 @@ Output field mappings apply to:
+ In-memory content that's created by skills or extracted by an indexer. The source field is a node in an enriched document tree.
-
+ Search indexes. If you're populating a [knowledge store](knowledge-store-concept-intro.md), use [projections](knowledge-store-projections-examples.md) for data path configuration. If you're populating a vector store, output field mappings aren't used.
+
+ Search indexes. If you're populating a [knowledge store](knowledge-store-concept-intro.md), use [projections](knowledge-store-projections-examples.md) for data path configuration. If you're populating vector fields, output field mappings aren't used.
Output field mappings are applied after [skillset execution](cognitive-search-working-with-skillsets.md) or after document cracking if there's no associated skillset.
articles/search/samples-dotnet.md
+1 -1 (1 addition & 1 deletion)
@@ -61,7 +61,7 @@ Code samples from the Azure AI Search team demonstrate features and workflows. A
|[DotNetHowToSynonyms](https://github.com/Azure-Samples/search-dotnet-getting-started/tree/master/DotNetHowToSynonyms)|[Example: Add synonyms in C#](search-synonyms-tutorial-sdk.md)| Synonym lists are used for query expansion, providing matchable terms that are external to an index. |
|[DotNetHowToIndexers](https://github.com/Azure-Samples/search-dotnet-getting-started/tree/master/DotNetHowToIndexers)|[Tutorial: Index Azure SQL data](search-indexer-tutorial.md)| Shows how to configure an Azure SQL indexer that has a schedule, field mappings, and parameters. |
|[DotNetHowToEncryptionUsingCMK](https://github.com/Azure-Samples/search-dotnet-getting-started/tree/master/DotNetHowToEncryptionUsingCMK)|[How to configure customer-managed keys for data encryption](search-security-manage-encryption-keys.md)| Shows how to create objects that are encrypted with a Customer Key. |
-
|[DotNetVectorDemo](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet/DotNetVectorDemo)|[readme](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet/DotNetVectorDemo/readme.md)| Create, load, and query a vector store. |
+
|[DotNetVectorDemo](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet/DotNetVectorDemo)|[readme](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet/DotNetVectorDemo/readme.md)| Create, load, and query a vector index. |
|[DotNetIntegratedVectorizationDemo](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet/DotNetIntegratedVectorizationDemo)|[readme](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet/DotNetIntegratedVectorizationDemo/readme.md)| Extends the vector workflow to include skills-based automation for data chunking and embedding. |
articles/search/samples-python.md
+2 -2 (2 additions & 2 deletions)
@@ -61,8 +61,8 @@ A demo repo provides proof-of-concept source code for examples or scenarios show
| Repository | Description |
|------------|-------------|
-
|[azure-search-vector-python-sample.ipynb](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/basic-vector-workflow/azure-search-vector-python-sample.ipynb)| Uses the **azure.search.documents** library in the Azure SDK for Python to create, load, and query a vector store. |
-
|[azure-search-integrated-vectorization-sample.ipynb](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/integrated-vectorization/azure-search-integrated-vectorization-sample.ipynb)| Extends the vector store workflow to include integrated data chunking and embedding. |
+
|[azure-search-vector-python-sample.ipynb](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/basic-vector-workflow/azure-search-vector-python-sample.ipynb)| Uses the **azure.search.documents** library in the Azure SDK for Python to create, load, and query a vector index. |
+
|[azure-search-integrated-vectorization-sample.ipynb](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/integrated-vectorization/azure-search-integrated-vectorization-sample.ipynb)| Extends the vector indexing workflow to include integrated data chunking and embedding. |
|[azure-search-vector-image-index-creation-python-sample.ipynb](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/multimodal/azure-search-vector-image-index-creation-python-sample.ipynb)| Demonstrates multimodal search over text and images. |
|[azure-search-vector-python-huggingface-model-sample.ipynb](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/community-integration/hugging-face/azure-search-vector-python-huggingface-model-sample.ipynb)| Hugging Face integration. |
articles/search/vector-search-overview.md
+1 -1 (1 addition & 1 deletion)
@@ -45,7 +45,7 @@ Scenarios for vector search include:
+**Filtered vector search**. A query request can include a vector query and a [filter expression](search-filters.md). Filters apply to text and numeric fields, and are useful for metadata filters, and including or excluding search results based on filter criteria. Although a vector field isn't filterable itself, you can set up a filterable text or numeric field. The search engine can process the filter before or after the vector query executes.
-
+**Vector database**. Azure AI Search stores the data that you query over. Use it as a [pure vector store](vector-store.md) any time you need long-term memory or a knowledge base, or grounding data for [Retrieval Augmented Generation (RAG) architecture](https://aka.ms/what-is-rag), or any app that uses vectors.
+
+**Vector storage**. Azure AI Search stores the data that you query over. Use it as a [pure vector store](vector-store.md) any time you need long-term memory or a knowledge base, or grounding data for [Retrieval Augmented Generation (RAG) architecture](https://aka.ms/what-is-rag), or any app that uses vectors.
articles/search/vector-search-ranking.md
+8 -8 (8 additions & 8 deletions)
@@ -9,16 +9,16 @@ ms.service: cognitive-search
ms.custom:
- ignite-2023
ms.topic: conceptual
-
ms.date: 01/31/2024
+
ms.date: 04/12/2024
---
# Relevance in vector search
-
In vector query execution, the search engine looks for similar vectors to find the best candidates to return in search results. Depending on how you indexed the vector content, the search for relevant matches is either exhaustive, or constrained to near neighbors for faster processing. Once candidates are found, similarity metrics are used to score each result based on the strength of the match.
+
During vector query execution, the search engine looks for similar vectors to find the best candidates to return in search results. Depending on how you indexed the vector content, the search for relevant matches is either exhaustive, or constrained to near neighbors for faster processing. Once candidates are found, similarity metrics are used to score each result based on the strength of the match.
This article explains the algorithms used to find relevant matches and the similarity metrics used for scoring. It also offers tips for improving relevance if search results don't meet expectations.
-
## Scope of a vector search
+
## Algorithms used in vector search
Vector search algorithms include exhaustive k-nearest neighbors (KNN) and Hierarchical Navigable Small World (HNSW).
@@ -30,25 +30,25 @@ Only vector fields marked as `searchable` in the index, or as `searchFields` in
### When to use exhaustive KNN
-
Exhaustive KNN calculates the distances between all pairs of data points and finds the exact `k` nearest neighbors for a query point. It's intended for scenarios where high recall is of utmost importance, and users are willing to accept the trade-offs in search performance. Because it's computationally intensive, use exhaustive KNN for small to medium datasets, or when precision requirements outweigh query performance considerations.
+
Exhaustive KNN calculates the distances between all pairs of data points and finds the exact `k` nearest neighbors for a query point. It's intended for scenarios where high recall is of utmost importance, and users are willing to accept the trade-offs in query latency. Because it's computationally intensive, use exhaustive KNN for small to medium datasets, or when precision requirements outweigh query performance considerations.
Another use case is to build a dataset to evaluate approximate nearest neighbor algorithm recall. Exhaustive KNN can be used to build the ground truth set of nearest neighbors.
+
A secondary use case is to build a dataset to evaluate approximate nearest neighbor algorithm recall. Exhaustive KNN can be used to build the ground truth set of nearest neighbors.
Exhaustive KNN support is available through [2023-11-01 REST API](/rest/api/searchservice/search-service-api-versions#2023-11-01), [2023-10-01-Preview REST API](/rest/api/searchservice/search-service-api-versions#2023-10-01-Preview), and in Azure SDK client libraries that target either REST API version.
### When to use HNSW
-
During indexing, HNSW creates extra data structures for faster search, organizing data points into a hierarchical graph structure. HHNSW has several configuration parameters that can be tuned to achieve the throughput, latency, and recall objectives for your search application. For example, at query time, you can specify options for exhaustive search, even if the vector field is indexed for HNSW.
+
During indexing, HNSW creates extra data structures for faster search, organizing data points into a hierarchical graph structure. HNSW has several configuration parameters that can be tuned to achieve the throughput, latency, and recall objectives for your search application. For example, at query time, you can specify options for exhaustive search, even if the vector field is indexed for HNSW.
During query execution, HNSW enables fast neighbor queries by navigating through the graph. This approach strikes a balance between search accuracy and computational efficiency. HNSW is recommended for most scenarios due to its efficiency when searching over larger data sets.
## How nearest neighbor search works
-
Vector queries execute against an embedding space consisting of vectors generated from the same embedding model. Generally, the input value within a query request is fed into the same machine learning model that generated embeddings in the vector store. The output is a vector in the same embedding space. Since similar vectors are clustered close together, finding matches is equivalent to finding the vectors that are closest to the query vector, and returning the associated documents as the search result.
+
Vector queries execute against an embedding space consisting of vectors generated from the same embedding model. Generally, the input value within a query request is fed into the same machine learning model that generated embeddings in the vector index. The output is a vector in the same embedding space. Since similar vectors are clustered close together, finding matches is equivalent to finding the vectors that are closest to the query vector, and returning the associated documents as the search result.
For example, if a query request is about hotels, the model maps the query into a vector that exists somewhere in the cluster of vectors representing documents about hotels. Identifying which vectors are the most similar to the query, based on a similarity metric, determines which documents are the most relevant.
-
When vector fields are indexed for exhaustive KNN, the query executes against "all neighbors". For fields indexed for HNSW, the search engine uses an HNSW graph to search over a subset of nodes within the vector store.
+
When vector fields are indexed for exhaustive KNN, the query executes against "all neighbors". For fields indexed for HNSW, the search engine uses an HNSW graph to search over a subset of nodes within the vector index.
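
The "all neighbors" behavior described above, and the use of exhaustive KNN as a ground truth for measuring recall, can be illustrated with a small in-memory sketch. This is not how Azure AI Search implements either algorithm. It assumes cosine similarity as the metric, and the document IDs, toy three-dimensional embeddings, and function names (`exhaustive_knn`, `recall_at_k`) are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exhaustive_knn(query, vectors, k):
    """Score every stored vector against the query and keep the k best."""
    scored = [
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in vectors.items()
    ]
    # Highest similarity first; every vector was compared, so the result is exact.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the exact neighbors that an approximate result also found."""
    exact = set(exact_ids)
    return len(exact & set(approximate_ids)) / len(exact)


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
vectors = {
    "hotel-1": [0.9, 0.1, 0.0],
    "hotel-2": [0.8, 0.2, 0.1],
    "museum-1": [0.1, 0.9, 0.2],
    "park-1": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

# Exact top-2 neighbors, usable as ground truth.
ground_truth = [doc_id for doc_id, _ in exhaustive_knn(query, vectors, k=2)]

# Suppose an approximate search returned one correct and one incorrect neighbor.
approximate_result = ["hotel-1", "museum-1"]
recall = recall_at_k(approximate_result, ground_truth)
```

Because every stored vector is scored, the cost of `exhaustive_knn` grows linearly with the number of vectors, which is why the approach suits small to medium datasets. A graph-based algorithm such as HNSW visits only a subset of vectors, so its results can be compared against `ground_truth` with `recall_at_k`.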