Commit 0480491

Updates to vector storage optimization, rescoring with original vectors
1 parent 6cd1c83 commit 0480491

5 files changed, with 140 additions and 48 deletions

articles/search/vector-search-how-to-assign-narrow-data-types.md

Lines changed: 7 additions & 10 deletions
@@ -7,18 +7,18 @@ author: heidisteen
77
ms.author: heidist
88
ms.service: azure-ai-search
99
ms.topic: how-to
10-
ms.date: 11/04/2024
10+
ms.date: 11/19/2024
1111
---
1212

13-
# Assign narrow data types
13+
# Assign narrow data types to vector fields in Azure AI Search
1414

1515
An easy way to reduce vector size is to store embeddings in a smaller data format. Most embedding models output 32-bit floating point numbers, but if you quantize your vectors, or if your embedding model supports it natively, output might be float16, int16, or int8, which is significantly smaller than float32. You can accommodate these smaller vector sizes by assigning a narrow data type to a vector field. In the vector index, narrow data types consume less storage.
1616

1717
Data types are assigned to fields in an index definition. You can use the Azure portal, the [Search REST APIs](/rest/api/searchservice/indexes/create), or an Azure SDK package that provides the feature.
1818

1919
## Prerequisites
2020

21-
- An embedding model that output small data formats.
21+
- An embedding model that outputs small data formats, such as text-embedding-3 or Cohere V3 embedding models.
2222

2323
## Supported narrow data types
2424

@@ -47,7 +47,7 @@ Data types are assigned to fields in an index definition. You can use the Azure
4747

4848
## Assign the data type
4949

50-
[Define and build the index](vector-search-how-to-create-index.md). You can use the Azure portal, [Create or Update Index (REST API)](/rest/api/searchservice/indexes/create-or-update), or an Azure SDK package for this step.
50+
[Define and build an index](vector-search-how-to-create-index.md). You can use the Azure portal, [Create or Update Index (REST API)](/rest/api/searchservice/indexes/create-or-update), or an Azure SDK package for this step.
5151

5252
This field definition uses a narrow data type, `Collection(Edm.Half)`, that can accept a float32 embedding stored as a float16 value. As is true for all vector fields, `dimensions` and `vectorSearchProfile` are set. The specifics of the `vectorSearchProfile` are immaterial to the datatype.
5353

@@ -80,12 +80,9 @@ Data types are assigned on new fields when they're created. You can't change the
8080

8181
## Check results
8282

83-
1. Verify the field content matches the data type. Assuming the vector field is marked as retrievable, use [Search explorer](search-explorer.md) or [Search - POST](/rest/api/searchservice/documents/search-post?) to return vector field content.
83+
1. Verify the field content matches the data type. Assuming the vector field is marked as `retrievable`, use [Search explorer](search-explorer.md) or [Search - POST](/rest/api/searchservice/documents/search-post?) to return vector field content.
8484

85-
1. To check vector index size, refer to the vector index size column on the Indexes page in the Azure portal or use the [GET Statistics (REST API)](/rest/api/searchservice/indexes/get-statistics) or equivalent Azure SDK method to get the size.
86-
87-
<!--
88-
Evidence of choosing the wrong data type, for example choosing `int8` for a `float32` embedding, is a field that's indexed as an array of zeros. If you encounter this problem, start over. -->
85+
1. To check vector index size, refer to the vector index size column on the **Search management > Indexes** page in the [Azure portal](https://portal.azure.com) or use the [GET Statistics (REST API)](/rest/api/searchservice/indexes/get-statistics) or equivalent Azure SDK method to get the size.
8986

9087
> [!NOTE]
91-
> The field's data type is used to create the physical data structure. If you want to change a data type later, either drop and rebuild the index, or create a second field with the new definition.
88+
> The field's data type is used to create the physical data structure. If you want to change a data type later, either [drop and rebuild the index](search-howto-reindex.md), or create a second field with the new definition.
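
As a rough illustration of why a narrow type consumes less storage, the following sketch serializes the same embedding as float32 and as float16 (the format that `Collection(Edm.Half)` accepts) and compares the raw payload sizes. It uses only the Python standard library. The sample values and the `pack_vector` helper are hypothetical, and the byte counts describe the raw numbers only, not how the search service lays out its index internally.

```python
import struct

def pack_vector(values, fmt_char):
    """Serialize a list of numbers using one struct format code per element."""
    return struct.pack(f"<{len(values)}{fmt_char}", *values)

# Hypothetical 4-dimensional embedding, used only for illustration.
embedding = [0.1234567, -0.7654321, 0.5, 0.0009765625]

as_float32 = pack_vector(embedding, "f")  # 4 bytes per dimension
as_float16 = pack_vector(embedding, "e")  # 2 bytes per dimension

# Round-trip through float16 to see how much precision the narrow type keeps.
restored = struct.unpack(f"<{len(embedding)}e", as_float16)

bytes_float32 = len(as_float32)
bytes_float16 = len(as_float16)
print(bytes_float32, bytes_float16)
print(restored)
```

The payload halves, while values that float16 cannot represent exactly come back slightly rounded. That trade-off is the reason to confirm that the embedding model's output suits the narrow type before assigning it.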

articles/search/vector-search-how-to-index-binary-data.md

Lines changed: 7 additions & 7 deletions
@@ -9,30 +9,30 @@ ms.service: azure-ai-search
99
ms.custom:
1010
- build-2024
1111
ms.topic: how-to
12-
ms.date: 08/05/2024
12+
ms.date: 11/19/2024
1313
---
1414

1515
# Index binary vectors for vector search
1616

17-
Azure AI Search supports a packed binary type of `Collection(Edm.Byte)` for further reducing the storage and memory footprint of vector data. You can use this data type for output from models such as [Cohere's Embed v3 binary embedding models](https://cohere.com/blog/introducing-embed-v3).
17+
Azure AI Search supports a packed binary type of `Collection(Edm.Byte)` for further reducing the storage and memory footprint of vector data. You can use this data type for output from models such as [Cohere's Embed v3 binary embedding models](https://cohere.com/blog/introducing-embed-v3) or any other embedding model or process that outputs vectors as binary bytes.
1818

1919
There are three steps to configuring an index for binary vectors:
2020

2121
> [!div class="checklist"]
2222
> + Add a vector search algorithm that specifies Hamming distance for binary vector comparison
2323
> + Add a vector profile that points to the algorithm
24-
> + Add the vector profile to your binary field definition
24+
> + Add a vector field of type `Collection(Edm.Byte)` and assign the Hamming distance
2525
26-
This article assumes you're familiar with [creating an index in Azure AI Search](search-how-to-create-search-index.md). It uses the REST APIs to illustrate each step, but you could also add a binary field to an index in the Azure portal.
26+
This article assumes you're familiar with [creating an index in Azure AI Search](search-how-to-create-search-index.md) and [adding vector fields](vector-search-how-to-create-index.md). It uses the REST APIs to illustrate each step, but you could also add a binary field to an index in the Azure portal or by using an Azure SDK.
2727

28-
Binary data types are generally available starting with API version 2024-07-01 and are assigned to fields using the [Create Index](/rest/api/searchservice/indexes/create) or [Create Or Update Index](/rest/api/searchservice/indexes/create-or-update) APIs.
28+
The binary data type is generally available starting with API version 2024-07-01 and is assigned to fields using the [Create Index](/rest/api/searchservice/indexes/create) or [Create Or Update Index](/rest/api/searchservice/indexes/create-or-update) APIs.
2929

3030
> [!TIP]
31-
> If you're investigating binary vector support for its smaller footprint, you might also consider the vector quantization and storage reduction features in Azure AI Search. Inputs are float32 or float16 embeddings. Output is stored data in a much smaller format. For more information, see [Assign narrow data types](vector-search-how-to-assign-narrow-data-types.md).
31+
> If you're investigating binary vector support for its smaller footprint, you might also consider the vector quantization and storage reduction features in Azure AI Search. Inputs are float32 or float16 embeddings. Output is stored data in a much smaller format. For more information, see [Compress using binary or scalar quantization](vector-search-how-to-quantization.md) and [Assign narrow data types](vector-search-how-to-assign-narrow-data-types.md).
3232
3333
## Prerequisites
3434

35-
+ Binary vectors, with 1 bit per dimension, packaged in uint8 values with 8 bits per value. These can be obtained by using models that directly generate "packaged binary" vectors, or by quantizing vectors into binary vectors client-side during indexing and searching.
35+
+ Binary vectors, with 1 bit per dimension, packaged in uint8 values with 8 bits per value. These can be obtained by using models that directly generate *packaged binary* vectors, or by quantizing vectors into binary vectors client-side during indexing and searching.
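
The client-side option in the prerequisite above can be sketched as follows: each float becomes one bit, eight bits are packed into one uint8 value, and two packed vectors are compared by Hamming distance. This is a minimal sketch under stated assumptions, not the method of any particular model. The zero threshold, the most-significant-bit-first packing order, the requirement that the dimension count be a multiple of 8, and the function names are all assumptions made for illustration. Use the packing convention of your embedding model if it defines one.

```python
def pack_bits(values):
    """Quantize floats to 1 bit each (1 if > 0) and pack 8 bits per byte.

    Assumptions: threshold at zero, most significant bit first.
    """
    if len(values) % 8 != 0:
        raise ValueError("dimension count must be a multiple of 8")
    packed = bytearray()
    for start in range(0, len(values), 8):
        byte = 0
        for value in values[start:start + 8]:
            byte = (byte << 1) | (1 if value > 0 else 0)
        packed.append(byte)
    return bytes(packed)

def hamming_distance(a, b):
    """Count the bit positions at which two packed vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Hypothetical 8-dimensional vectors, used only for illustration.
doc_vector = [0.3, -0.1, 0.8, -0.5, 0.2, 0.9, -0.7, 0.4]
query_vector = [0.1, 0.2, 0.6, -0.3, -0.4, 0.7, -0.2, 0.5]

doc_packed = pack_bits(doc_vector)
query_packed = pack_bits(query_vector)
print(list(doc_packed), list(query_packed))
print(hamming_distance(doc_packed, query_packed))
```

The same packing must be applied to documents at indexing time and to queries at search time. Otherwise the Hamming distances are not comparable.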
3636

3737
## Limitations
3838

articles/search/vector-search-how-to-quantization.md

Lines changed: 36 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,38 +1,44 @@
11
---
2-
title: Quantize vector fields
2+
title: Compress vectors using quantization
33
titleSuffix: Azure AI Search
44
description: Configure built-in scalar or binary quantization for compressing vectors on disk and in memory.
55

66
author: heidisteen
77
ms.author: heidist
88
ms.service: azure-ai-search
99
ms.topic: how-to
10-
ms.date: 11/04/2024
10+
ms.date: 11/19/2024
1111
---
1212

13-
# Use scalar or binary quantization to compress vector size
13+
# Compress vectors using scalar or binary quantization
1414

15-
Quantization is recommended for reducing vector size because it lowers both memory and disk storage requirements for float16 and float32 embeddings. To offset the effects of a smaller index, you can add oversampling and reranking over uncompressed vectors.
16-
17-
Quantization applies to vector fields receiving float-type vectors. In the examples in this article, the field's data type is `Collection(Edm.Single)` for incoming float32 embeddings, but float16 is also supported. When the vectors are received on a field with compression configured, the engine automatically performs quantization to reduce the footprint of the vector data in memory and on disk.
18-
19-
Two types of quantization are supported:
20-
21-
- Scalar quantization compresses float values into narrower data types. AI Search currently supports int8, which is 8 bits, reducing vector index size fourfold.
22-
23-
- Binary quantization converts floats into binary bits, which takes up 1 bit. This results in up to 28 times reduced vector index size.
15+
Azure AI Search supports scalar and binary quantization for reducing the size of vectors in a search index. Quantization is recommended for reducing vector size because it lowers both memory and disk storage consumption for float16 and float32 embeddings. To offset the effects of a smaller index, you can add oversampling and reranking over uncompressed vectors.
2416

2517
To use built-in quantization, follow these steps:
2618

2719
> [!div class="checklist"]
28-
> - Use [Create Index](/rest/api/searchservice/indexes/create) or [Create Or Update Index](/rest/api/searchservice/indexes/create-or-update) to specify vector compression
29-
> - Add `vectorSearch.compressions` to a search index
20+
> - Add [vector fields and a `vectorSearch` configuration](vector-search-how-to-create-index.md) to an index
21+
> - Add `vectorSearch.compressions`
3022
> - Add a `scalarQuantization` or `binaryQuantization` configuration and give it a name
3123
> - Set optional properties to mitigate the effects of lossy indexing
3224
> - Create a new vector profile that uses the named configuration
3325
> - Create a new vector field having the new vector profile
3426
> - Load the index with float32 or float16 data that's quantized during indexing with the configuration you defined
35-
> - Optionally, [query quantized data](#) using the oversampling parameter if you want to override the default
27+
> - Optionally, [query quantized data](#query-a-quantized-vector-field-using-oversampling) using the oversampling parameter if you want to override the default
28+
29+
## Prerequisites
30+
31+
- [Vector fields in a search index](vector-search-how-to-create-index.md) with a `vectorSearch` configuration, using the HNSW algorithm and a new vector profile.
32+
33+
## Supported quantization techniques
34+
35+
Quantization applies to vector fields receiving float-type vectors. In the examples in this article, the field's data type is `Collection(Edm.Single)` for incoming float32 embeddings, but float16 is also supported. When the vectors are received on a field with compression configured, the engine automatically performs quantization to reduce the footprint of the vector data in memory and on disk.
36+
37+
Two types of quantization are supported:
38+
39+
- Scalar quantization compresses float values into narrower data types. AI Search currently supports int8, which is 8 bits, reducing vector index size fourfold.
40+
41+
- Binary quantization converts each float into a single bit. This reduces vector index size by up to 28 times.
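
To make the fourfold figure for scalar quantization concrete, here is a minimal sketch that maps float values onto the int8 range and compares raw payload sizes. The article doesn't describe the engine's actual quantization method, so the min-max scaling used here is an assumption chosen for simplicity. The function names and sample values are hypothetical.

```python
import struct

def scalar_quantize(values):
    """Map floats onto int8 codes in [-128, 127] with min-max scaling (an assumption)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for a constant vector
    codes = [max(-128, min(127, round((v - lo) / scale) - 128)) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Approximate the original floats from the int8 codes."""
    return [(c + 128) * scale + lo for c in codes]

# Hypothetical vector, used only for illustration.
values = [-1.0, -0.5, 0.0, 0.5, 1.0]
codes, lo, scale = scalar_quantize(values)
restored = dequantize(codes, lo, scale)

float32_bytes = len(struct.pack(f"<{len(values)}f", *values))  # 4 bytes each
int8_bytes = len(struct.pack(f"<{len(codes)}b", *codes))       # 1 byte each
print(codes)
print(float32_bytes, int8_bytes)
```

Each restored value differs from its original by at most half a quantization step. This is the lossy effect that oversampling and reranking over uncompressed vectors are meant to offset.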
3642

3743
## Add "compressions" to a search index
3844

@@ -76,7 +82,7 @@ POST https://[servicename].search.windows.net/indexes?api-version=2024-07-01
7682

7783
**Key points**:
7884

79-
- `kind` must be set to `scalarQuantization` or `binaryQuantization`
85+
- `kind` must be set to `scalarQuantization` or `binaryQuantization`.
8086

8187
- `rerankWithOriginalVectors` uses the original, uncompressed vectors to recalculate similarity and rerank the top results returned by the initial search query. The uncompressed vectors exist in the search index even if `stored` is false. This property is optional. Default is true.
8288

@@ -227,3 +233,17 @@ POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?ap
227233
- Applies to vector fields that undergo vector compression, per the vector profile assignment.
228234

229235
- Overrides the `defaultOversampling` value or introduces oversampling at query time, even if the index's compression configuration didn't specify oversampling or reranking options.
236+
237+
<!--
238+
RESCORE WITH ORIGINAL VECTORS -- NEEDS AN H2 or H3
239+
It's used to rescore search results obtained using compressed vectors.
240+
241+
Rescore with original vectors
242+
After the initial query, rescore results using uncompressed vectors
243+
244+
For "enableRescoring", we provide true or false options. If it's true, the query first retrieves results using compressed vectors, then rescores them using uncompressed vectors.
245+
246+
Step one: Vector query executes using the compressed vectors.
247+
Step two: Query returns the top oversampling k-matches.
248+
Step three: Oversampling k-matches are rescored using the uncompressed vectors, adjusting the scores and ranking so that more relevant matches appear first.
249+
-->
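
The three steps in the draft note above can be sketched in a few lines. This is an illustrative, in-memory sketch and not the service's implementation. Sign-bit compression with Hamming distance stands in for whichever compression is configured, dot product stands in for the similarity metric, and `search_with_rescoring`, its parameters, and the sample data are all hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sign_bits(vector):
    # Stand-in compression (assumption): keep only the sign of each dimension.
    return [1 if x > 0 else 0 for x in vector]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def search_with_rescoring(query, originals, k, oversampling):
    """Retrieve with compressed vectors, then rescore with the originals."""
    compressed = {doc_id: sign_bits(v) for doc_id, v in originals.items()}
    query_bits = sign_bits(query)
    # Step one: rank all documents using the compressed vectors.
    ranked = sorted(compressed, key=lambda d: hamming(query_bits, compressed[d]))
    # Step two: keep the top k * oversampling candidates.
    candidates = ranked[: int(k * oversampling)]
    # Step three: rescore the candidates using the uncompressed vectors.
    rescored = sorted(candidates, key=lambda d: dot(query, originals[d]), reverse=True)
    return rescored[:k]

# Hypothetical data, used only for illustration.
originals = {
    "a": [0.05, 0.05, 0.05, 0.05],
    "b": [0.9, 0.3, 0.1, -0.2],
    "c": [-0.5, 0.8, 0.2, -0.1],
    "d": [0.8, -0.1, 0.3, -0.4],
}
query = [1.0, 0.2, 0.1, 0.0]

print(search_with_rescoring(query, originals, k=2, oversampling=1.0))
print(search_with_rescoring(query, originals, k=2, oversampling=2.0))
```

With these sample values, an oversampling factor of 1.0 leaves document `d` outside the candidate set, so rescoring can't recover it. A factor of 2.0 widens the candidate set enough for the uncompressed vectors to promote it.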
