Commit 55c62e7

whats new revisions for build
1 parent 9928714 commit 55c62e7

File tree

4 files changed

+17
-19
lines changed

articles/search/vector-search-how-to-index-binary-data.md

Lines changed: 3 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -12,7 +12,7 @@ ms.date: 05/21/2024
1212

1313
# Index binary data for vector search
1414

15-
Beginning with the 2024-05-01-preview REST API, Azure AI Search supports a packed binary type of `Collection(Edm.Binary)` for further reducing the storage and memory footprint of vector data. You can assign this data type to fields that store binary embeddings from models such as [Cohere's Embed v3 binary embedding models](https://cohere.com/blog/introducing-embed-v3).
15+
Beginning with the 2024-05-01-preview REST API, Azure AI Search supports a packed binary type of `Collection(Edm.Binary)` for further reducing the storage and memory footprint of vector data. You can use this data type for output from models such as [Cohere's Embed v3 binary embedding models](https://cohere.com/blog/introducing-embed-v3).
1616

1717
There are three steps to configuring an index for binary data:
1818

@@ -31,10 +31,11 @@ This article assumes you're familiar with [creating an index in Azure AI Search]
3131

3232
+ No scalar compression or integrated vectorization support.
3333
+ No Azure portal support in the Import and vectorize data wizard.
34+
+ No support for binary fields in the [AML skill](cognitive-search-aml-skill.md) that's used for integrated vectorization of models in the Azure AI Studio model catalog.
3435

3536
## Add a vector search algorithm and vector profile
3637

37-
Vector search algorithms are used to create the query navigation structures during index. For binary data fields, vector comparisons are performed using the Hamming distance. Configuration of a vector search algorithm is in a search index.
38+
Vector search algorithms are used to create the query navigation structures during indexing. For binary data fields, vector comparisons are performed using the Hamming distance metric.
3839

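To illustrate the Hamming distance comparison described in this hunk, here is a minimal Python sketch. It assumes each binary embedding is packed eight bits per byte, most significant bit first. That bit order and the helper names `pack_bits` and `hamming_distance` are illustrative assumptions, not part of the Azure AI Search API.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first (assumed order)."""
    if len(bits) % 8 != 0:
        raise ValueError("bit count must be a multiple of 8")
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | (bit & 1)
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length packed vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # XOR leaves a 1 wherever the two inputs disagree; count those bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


doc_vector = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
query_vector = pack_bits([1, 1, 1, 0, 0, 0, 1, 1])
print(hamming_distance(doc_vector, query_vector))
```

A smaller distance means the two binary vectors agree on more bits, so they are treated as more similar.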
3940
1. To add a binary field to an index, set up a [`Create or Update Index`](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true) request using the **2024-05-01-preview REST API** or the Azure portal.
4041

articles/search/vector-search-how-to-query.md

Lines changed: 7 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -12,24 +12,24 @@ ms.date: 05/21/2024
1212

1313
# Create a vector query in Azure AI Search
1414

15-
In Azure AI Search, if you [have vector fields](vector-search-how-to-create-index.md) in a search index, this article explains how to:
15+
In Azure AI Search, if you have a [vector index](vector-search-how-to-create-index.md), this article explains how to:
1616

1717
> [!div class="checklist"]
1818
> + [Query vector fields](#vector-query-request)
1919
> + [Filter a vector query](#vector-query-with-filter)
2020
> + [Query multiple vector fields at once](#multiple-vector-fields)
2121
> + [Query with integrated vectorization (preview)](#query-with-integrated-vectorization-preview)
2222
> + [Set thresholds to exclude low-scoring results (preview)](#set-thresholds-to-exclude-low-scoring-results-preview)
23-
> + [Set MaxTextSizeRecall to control the number of results](#maxtextsizerecall-for-hybrid-search-preview)
24-
> + [Vector weights (preview)](#vector-weighting-preview)
23+
> + [Set MaxTextSizeRecall to control the number of results (preview)](#maxtextsizerecall-for-hybrid-search-preview)
24+
> + [Set vector weights (preview)](#vector-weighting-preview)
2525
2626
This article uses REST for illustration. For code samples in other languages, see the [azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples) GitHub repository for end-to-end solutions that include vector queries.
2727

2828
## Prerequisites
2929

3030
+ Azure AI Search, in any region and on any tier.
3131

32-
+ [A vector index on Azure AI Search](vector-search-how-to-create-index.md). Check for a `vectorSearch` section in your index.
32+
+ [A vector index on Azure AI Search](vector-search-how-to-create-index.md). Check for a `vectorSearch` section in your index to confirm a vector index.
3333

3434
+ Visual Studio Code with a [REST client](https://marketplace.visualstudio.com/items?itemName=humao.rest-client) and sample data if you want to run these examples on your own. To get started with the REST client, see [Quickstart: Azure AI Search using REST](search-get-started-rest.md).
3535

@@ -568,7 +568,7 @@ During query execution, a vector query can only target one internal vector index
568568

569569
## Set thresholds to exclude low-scoring results (preview)
570570

571-
Because nearest neighbor search always returns the requested `k` neighbors, you might find that low scoring matches in are included just to meet the `k` number requirement.
571+
Because nearest neighbor search always returns the requested `k` neighbors, it's possible to get low scoring matches as part of meeting the `k` number requirement on search results.
572572

573573
Using the 2024-05-01-preview REST APIs, you can now add a `threshold` query parameter to exclude low-scoring search results.
574574

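The effect of a threshold on a `k` nearest neighbors result set can be sketched conceptually in Python. This is an illustration only: the function names, the sample scores, and the inclusive comparison are assumptions, and it doesn't reflect how the service applies the `threshold` parameter internally.

```python
def top_k(scored_docs, k):
    """Return the k highest-scoring (doc_id, score) pairs, best first."""
    return sorted(scored_docs, key=lambda pair: pair[1], reverse=True)[:k]


def apply_threshold(results, threshold):
    """Drop matches whose score falls below the threshold (inclusive bound assumed)."""
    return [(doc_id, score) for doc_id, score in results if score >= threshold]


# Hypothetical similarity scores for four documents.
scored = [("a", 0.91), ("b", 0.42), ("c", 0.88), ("d", 0.15)]

neighbors = top_k(scored, k=3)               # always returns 3 matches, even weak ones
filtered = apply_threshold(neighbors, 0.5)   # weak matches are excluded
print(neighbors)
print(filtered)
```

Without the threshold, document `b` is returned only because `k` is 3. With the threshold, the result set can be smaller than `k`.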
@@ -596,7 +596,7 @@ Filtering occurs before [fusing results](hybrid-search-ranking.md) from differen
596596

597597
## MaxTextSizeRecall for hybrid search (preview)
598598

599-
Add a `hybridSearch` query parameter object to specify the maximum number of documents recalled using text queries in hybrid (text and vector) search. The default is 1,000 documents, which often are more than is necessary for RAG scenarios. With this parameter, you can decrease or increase the number of results returned in hybrid queries.
599+
Add a `hybridSearch` query parameter object to specify the maximum number of documents recalled using text queries in hybrid (text and vector) search. The default is 1,000 documents. With this parameter, you can increase or decrease the number of results returned in hybrid queries.
600600

601601
```http
602602
POST https://[service-name].search.windows.net/indexes/[index-name]/docs/search?api-version=2024-05-01-Preview
@@ -622,11 +622,10 @@ POST https://[service-name].search.windows.net/indexes/[index-name]/docs/search?
622622

623623
## Vector weighting (preview)
624624

625-
Add a `weight` query parameter to specify the relative weights of each vector included in search operations. This feature is particularly useful in complex queries where two or more distinct result sets need to be combined, such as in hybrid search or multivector requests.
625+
Add a `weight` query parameter to specify the relative weight of each vector included in search operations. This feature is particularly useful in complex queries where two or more distinct result sets need to be combined, such as in hybrid search or multivector requests.
626626

627627
Weights are used when calculating the [reciprocal rank fusion](hybrid-search-ranking.md) scores of each document. The calculation multiplies the `weight` value by the rank score of the document within its respective result set.
628628

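The weighting described above can be sketched in Python as a weighted reciprocal rank fusion. This is a sketch under stated assumptions: the rank score is taken to be `1 / (k + rank)` with a constant `k` of 60, and the function name `weighted_rrf` is hypothetical. The text here only states that the weight multiplies the rank score, so treat the exact formula as illustrative.

```python
def weighted_rrf(result_sets, weights, k=60):
    """Fuse ranked lists of document IDs into one list using weighted RRF.

    result_sets: one ranked list of document IDs per query, best match first.
    weights: one multiplier per result set.
    k: smoothing constant (60 is an assumption in this sketch).
    """
    scores = {}
    for ranking, weight in zip(result_sets, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each result set contributes weight * rank score for the document.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight * (1.0 / (k + rank))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


text_results = ["a", "b", "c"]      # hypothetical text query ranking
vector_results = ["c", "a", "d"]    # hypothetical vector query ranking

equal = weighted_rrf([text_results, vector_results], [1.0, 1.0])
boosted = weighted_rrf([text_results, vector_results], [1.0, 2.0])
print([doc_id for doc_id, _ in equal])
print([doc_id for doc_id, _ in boosted])
```

With equal weights, the fused order is `a`, `c`, `b`, `d`. Doubling the weight of the vector result set moves `c` ahead of `a` and `d` ahead of `b`, because matches ranked highly by the vector query now contribute more to the fused score.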
629-
630629
```http
631630
POST https://[service-name].search.windows.net/indexes/[index-name]/docs/search?api-version=2024-05-01-Preview
632631
Content-Type: application/json

articles/search/vector-search-index-size.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -204,7 +204,8 @@ The following table summarizes the overhead percentages observed in internal tes
204204
| 96 | 4 | 20% |
205205
| 200 | 4 | 8% |
206206
| 768 | 4 | 2% |
207-
| 1536 | 4 | 1% |
207+
| 1536 | 4 | 1% |
208+
| 3072 | 4 | 0.5% |
208209

209210
These results demonstrate the relationship between dimensions, HNSW parameter `m`, and memory overhead for the HNSW algorithm.
210211

articles/search/whats-new.md

Lines changed: 5 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -23,20 +23,17 @@ ms.custom:
2323

2424
| Item                         | Type | Description |
2525
|-----------------------------|------|--------------|
26-
| [OneLake files indexer (preview)](search-how-to-index-onelake-files.md) | Feature | New indexer for OneLake files and OneLake shortcuts. If you use Microsoft Fabric and OneLake for cloud data access to Amazon Web Services (AWS) and Google data sources, use this indexer to import external data into a search index. This vectorizer is available through the Azure portal, the [2024-05-01-preview REST API](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true), and Azure SDK beta packages. |
27-
| [Vector search relevance tuning](vector-search-how-to-query.md) | Feature | Three enhancements improve vector search relevance. <br/>First, you can now set thresholds on vector search results, returning matches based on search score. <br/>Second, you can set `MaxSizeTextRecall` and `countAndFacetMode` in hybrid queries to specify the maximum number of documents that can be recalled using text query in hybrid (text and vector) search. Previously, the maximum was fixed at 1,000. If you have more matches, you can now specify a higher limit to get more results back. <br/>Third, for hybrid queries, you can set a weight on vector queries to have more or less importance than the nonvector query. |
28-
| [Binary data type](/rest/api/searchservice/supported-data-types) | Feature | `Collection(Edm.Byte)` is a new supported data type. This data type opens up integration with the [Cohere v3 binary embedding models](https://cohere.com/blog/int8-binary-embeddings) and custom binary quantization. Narrow data types lower the cost of large vector datasets. See [Index binary data for vector search](vector-search-how-to-index-binary-data.md) for more information.|
26+
| [Higher capacity and more vector quota at every tier (same billing rate)](search-limits-quotas-capacity.md#service-limits) | Infrastructure | Partition sizes are now even larger for Standard 2 (S2), Standard 3 (S3), and Standard 3 High Density (S3 HD) for all services created after April 3, 2024. If you create a new service now, you get the larger partitions. If you created a new service between April 3 and May 17, you get the larger partitions automatically. <br><br>Storage Optimized tiers (L1 and L2) also have more capacity. L1 and L2 customers must create a new service to benefit from the higher capacity. There's no in-place upgrade at this time. <br><br>Extra capacity is now available in [more regions](search-limits-quotas-capacity.md#supported-regions-with-higher-storage-limits): South Africa North​, Germany North​, Germany West Central​, Switzerland West​, East US 2 EUAP/PPE​, and Azure Government (Texas, Arizona, and Virginia).|
27+
| [OneLake files and shortcuts integration (preview)](search-how-to-index-onelake-files.md) | Feature | New indexer for OneLake files and OneLake shortcuts. If you use Microsoft Fabric and OneLake for data access to Amazon Web Services (AWS) and Google data sources, use this indexer to import external data into a search index. This indexer is available through the Azure portal, the [2024-05-01-preview REST API](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true), and Azure SDK beta packages. |
28+
| [Relevance tuning and search results customization](vector-search-how-to-query.md) | Feature | Three enhancements improve vector search relevance. <br><br>First, you can now set thresholds on vector search results to exclude low-scoring results. <br><br>Second, you can set `MaxSizeTextRecall` and `countAndFacetMode` in hybrid queries to specify the maximum number of documents that can be recalled using text query in hybrid (text and vector) search. Previously, the maximum was fixed at 1,000. If you have more matches, you can now specify a higher limit to get more results back. <br><br>Third, for hybrid queries, you can set a weight on vector queries to have more or less importance than the nonvector query. |
29+
| [Binary data support](/rest/api/searchservice/supported-data-types) | Feature | `Collection(Edm.Byte)` is a new supported data type. This data type opens up integration with the [Cohere v3 binary embedding models](https://cohere.com/blog/int8-binary-embeddings) and custom binary quantization. Narrow data types lower the cost of large vector datasets. See [Index binary data for vector search](vector-search-how-to-index-binary-data.md) for more information.|
2930
| [Azure AI Vision multimodal embeddings skill (preview)](cognitive-search-skill-vision-vectorize.md) | Skill | New skill that's bound to the [multimodal embeddings API of Azure AI Vision](../ai-services/computer-vision/concept-image-retrieval.md). You can generate embeddings for text or images during indexing. This skill is available through the Azure portal and the [2024-05-01-preview REST API](/rest/api/searchservice/operation-groups?view=rest-searchservice-2024-05-01-preview&preserve-view=true).|
3031
| [Azure AI Vision vectorizer (preview)](vector-search-vectorizer-ai-services-vision.md) | Vectorizer | New vectorizer connects to an Azure AI Vision resource using the [multimodal embeddings API](../ai-services/computer-vision/concept-image-retrieval.md) to generate embeddings at query time. This vectorizer is available through the Azure portal and the [2024-05-01-preview REST API](/rest/api/searchservice/operation-groups?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
3132
| [Azure AI Studio model catalog vectorizer (preview)](vector-search-vectorizer-azure-machine-learning-ai-studio-catalog.md) | Vectorizer | New vectorizer connects to an embedding model deployed from the [Azure AI Studio model catalog](../ai-studio/how-to/model-catalog.md). This vectorizer is available through the Azure portal and the [2024-05-01-preview REST API](/rest/api/searchservice/operation-groups?view=rest-searchservice-2024-05-01-preview&preserve-view=true). <br><br>[**How to implement integrated vectorization using models from Azure AI Studio**](vector-search-integrated-vectorization-ai-studio.md).|
3233
| [AzureOpenAIEmbedding skill (preview) supports more models on Azure OpenAI](cognitive-search-skill-azure-openai-embedding.md) | Skill | Updates to this skill add support for more embedding models on Azure OpenAI. New `dimensions` and `modelName` properties are used for specifying models. Previously, the dimensions limits were fixed at 1,536 dimensions. It's now configurable. This update is available through the Azure portal and the [2024-05-01-preview REST API](/rest/api/searchservice/operation-groups?view=rest-searchservice-2024-05-01-preview&preserve-view=true).|
33-
| [More regions having higher capacity tiers](search-limits-quotas-capacity.md#supported-regions-with-higher-storage-limits) | Infrastructure | Higher capacity tiers were announced in April 2024 for selected regions. Beginning on May 17, 2024, the list of regions now includes South Africa North​, Germany North​, Germany West Central​, Switzerland West​, East US 2 EUAP/PPE​, and Azure Government (Texas, Arizona, and Virginia).|
34-
| [More storage on L1 and L2 tiers](search-limits-quotas-capacity.md#service-limits) | Infrastructure | Also starting on May 17, 2024: higher capacity partitions are available for Storage Optimized tiers (L1 and L2). L1 and L2 customers must create a new service to benefit from the higher capacity. There's no in-place upgrade at this time. |
35-
| [More storage on S2, S3, and S3 HD tiers](search-limits-quotas-capacity.md#service-limits) | Infrastructure | Partition sizes are larger for Standard 2 (S2), Standard 3 (S3), and Standard 3 High Density (S3 HD) for all services created after April 3, 2024. If you create a new service now, you get the larger partitions. If you created a new service between April 3 and May 17, you get the larger partitions automatically. In short, you don't need to create a new S2 or S3 search service if you're already running on post-April 3 infrastructure. |
36-
| [2024-05-01-preview Search REST API](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) | API | New preview version of the Search REST APIs provides new skills and vectorizers, new binary data type, OneLake files indexer, new query parameters for vector weights, and recall. See [Upgrade REST APIs](search-api-migration.md) if you have existing code written against the 2023-07-01-preview and need to migrate to this version.|
34+
| [2024-05-01-preview Search REST API](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) | API | New preview version of the Search REST APIs provides new skills and vectorizers, new binary data type, OneLake files indexer, and new query parameters for more relevant results. See [Upgrade REST APIs](search-api-migration.md) if you have existing code written against the 2023-07-01-preview and need to migrate to this version.|
3735
| Azure SDK beta packages for new features | API | Review the changelogs of the following Azure SDK beta packages for new feature support: [Azure SDK for Python](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/CHANGELOG.md), [Azure SDK for .NET](https://github.com/Azure/azure-sdk-for-net/blob/Azure.Search.Documents_11.6.0-beta.4/sdk/search/Azure.Search.Documents/CHANGELOG.md), [Azure SDK for Java](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/search/azure-search-documents/CHANGELOG.md) |
3836
<!-- | Network security perimeter support (preview) | Feature | A network security perimeter is a new service that provides a secure perimeter for communication, and controlled access to resources outside of the perimeter. Azure AI Search is one of the eight Azure services that can run within a network security perimeter. This feature is provided by the [2024-03-01-preview Management REST API](/rest/api/searchmanagement/operation-groups?view=rest-searchmanagement-2024-03-01-preview&preserve-view=true) and the Azure portal. | -->
39-
<!-- | [Custom Web API vectorizer (preview)](vector-search-vectorizer-custom-web-api.md) | Vectorizer | Configure your search queries to call out to a Web API endpoint to generate embeddings at query time. This vectorizer is available through the [2024-05-01-preview REST API](/rest/api/searchservice/operation-groups?view=rest-searchservice-2024-05-01-preview&preserve-view=true).|-->
4037

4138
## April 2024
4239
