Commit ec42a4d

Merge pull request #272819 from HeidiSteen/heidist-vectors
[azure search] vector index size edits
2 parents 020530c + 8997ecd commit ec42a4d

2 files changed

+47
-28
lines changed

articles/search/vector-search-index-size.md

Lines changed: 46 additions & 27 deletions
@@ -9,35 +9,33 @@ ms.service: cognitive-search
99
ms.custom:
1010
- ignite-2023
1111
ms.topic: conceptual
12-
ms.date: 04/03/2024
12+
ms.date: 04/19/2024
1313
---
1414

1515
# Vector index size and staying under limits
1616

1717
For each vector field, Azure AI Search constructs an internal vector index using the algorithm parameters specified on the field. Because Azure AI Search imposes quotas on vector index size, you should know how to estimate and monitor vector size to ensure you stay under the limits.
1818

1919
> [!NOTE]
20-
> A note about terminology. Internally, the physical data structures of a search index include raw content (used for retrieval patterns requiring non-tokenized content), inverted indexes (used for searchable text fields), and vector indexes (used for searchable vector fields). This article explains the limits for the physical vector indexes that back each of your vector fields.
20+
> A note about terminology. Internally, the physical data structures of a search index include raw content (used for retrieval patterns requiring non-tokenized content), inverted indexes (used for searchable text fields), and vector indexes (used for searchable vector fields). This article explains the limits for the internal vector indexes that back each of your vector fields.
2121
2222
> [!TIP]
23-
> [Vector quantization and storage configuration](vector-search-how-to-configure-compression-storage.md) is now in preview. You can use narrow data types, apply scalar quantization, and eliminate some storage requirements if you don't need the data.
23+
> [Vector quantization and storage configuration](vector-search-how-to-configure-compression-storage.md) is now in preview. Use capabilities like narrow data types, scalar quantization, and elimination of redundant storage to stay under vector quota and storage quota.
2424
2525
## Key points about quota and vector index size
2626

2727
+ Vector index size is measured in bytes.
2828

29-
+ There's no quota at the search index level. Instead vector quotas are enforced service-wide at the partition level. Quota varies by service tier (or `SKU`) and the service creation date, with newer services having much higher quotas per partition.
29+
+ Vector quotas are based on memory constraints. All searchable vector indexes must be loaded into memory. At the same time, there must also be sufficient memory for other runtime operations. Vector quotas exist to ensure that the overall system remains stable and balanced for all workloads.
30+
31+
+ Vector indexes are also subject to disk quota, in the sense that all indexes, vector and nonvector, are subject to disk quota. There's no separate disk quota for vector indexes.
32+
33+
+ Vector quotas are enforced on the search service as a whole, per partition, meaning that if you add partitions, vector quota goes up. Per-partition vector quotas are higher on newer services:
3034

3135
+ [Vector quota for services created after April 3, 2024](search-limits-quotas-capacity.md#vector-limits-on-services-created-after-april-3-2024-in-supported-regions)
3236
+ [Vector quota for services created between July 1, 2023 and April 3, 2024](search-limits-quotas-capacity.md#vector-limits-on-services-created-between-july-1-2023-and-april-3-2024)
3337
+ [Vector quota for services created before July 1, 2023](search-limits-quotas-capacity.md#vector-limits-on-services-created-before-july-1-2023)
3438

35-
+ Vector quotas are primarily designed around memory constraints. All searchable vector indexes must be loaded into memory. At the same time, there must also be sufficient memory for other runtime operations. Vector quotas exist to ensure that the overall system remains stable and balanced for all workloads.
36-
37-
+ Vector quotas are expressed in terms of physical storage, and physical storage is contingent upon partition size and quantity. Each tier offers increasingly powerful and larger partitions. Higher tiers and more partitions give you more vector quota to work with. In [service limits](search-limits-quotas-capacity.md#service-limits), maximum vector quotas are based on the maximum amount of physical space that all vector indexes can consume collectively, assuming all partitions are in use for that service.
38-
39-
For example, on new services in a supported region, the sum total of all vector indexes on a Basic search service can't be more than 15 GB because Basic can have up to three partitions (5-GB quota per partition). On S1, which can have up to 12 partitions, the quota for vector data is 35 GB per partition, or up to 160 GB if you allocate all 12 partitions.
40-
4139
## How to check partition size and quantity
4240

4341
If you aren't sure what your search service limits are, here are two ways to get that information:
@@ -78,31 +76,38 @@ A request for vector metrics is a data plane operation. You can use the Azure po
7876

7977
Usage information can be found on the **Overview** page's **Usage** tab. Portal pages refresh every few minutes so if you recently updated an index, wait a bit before checking results.
8078

81-
The following screenshot is for a Standard 1 (S1) tier, configured for one partition and one replica. Vector index quota, measured in megabytes, refers to the internal vector indexes created for each vector field. Overall, indexes consume almost 460 megabytes of available storage, but the vector index component takes up just 93 megabytes of the 460 used on this search service.
79+
The following screenshot is for an older Standard 1 (S1) search service, configured for one partition and one replica.
80+
81+
+ Storage quota is a disk constraint, and it's inclusive of all indexes (vector and nonvector) on a search service.
82+
+ Vector index size quota is a memory constraint. It's the amount of memory required to load all internal vector indexes created for each vector field on a search service.
83+
84+
The screenshot indicates that indexes (vector and nonvector) consume almost 460 megabytes of available disk storage. Vector indexes consume almost 93 megabytes of memory at the service level.
8285

8386
:::image type="content" source="media/vector-search-index-size/portal-vector-index-size.png" lightbox="media/vector-search-index-size/portal-vector-index-size.png" alt-text="Screenshot of the Overview page's usage tab showing vector index consumption against quota.":::
8487

85-
The tile on the Usage tab tracks vector index consumption at the search service level. If you increase or decrease search service capacity, the tile reflects the changes accordingly.
88+
Quotas for both storage and vector index size increase or decrease as you add or remove partitions. If you change the partition count, the tile shows a corresponding change in storage and vector quota.
89+
90+
> [!NOTE]
91+
> On disk, vector indexes aren't 93 megabytes. Vector indexes on disk take up about three times more space than vector indexes in memory. See [How vector fields affect disk storage](#how-vector-fields-affect-disk-storage) for details.
8692
8793
### [**REST**](#tab/rest-vector-quota)
8894

8995
Use the following data plane REST APIs (version 2023-10-01-preview, 2023-11-01, and later) for vector usage statistics:
9096

97+
+ [GET Service Statistics](/rest/api/searchservice/get-service-statistics/get-service-statistics) returns quota and usage for the search service all-up.
9198
+ [GET Index Statistics](/rest/api/searchservice/indexes/get-statistics) returns usage for a given index.
92-
+ [GET Service Statistics](/rest/api/searchservice/get-service-statistics/get-service-statistics) returns quota and usage for the search service all-up.
9399

94-
For a visual, here's the sample response for a Basic search service that has the quickstart vector search index. `storageSize` and `vectorIndexSize` are reported in bytes.
100+
Usage and quota are reported in bytes.
95101

96-
```json
97-
{
98-
"@odata.context": "https://my-demo.search.windows.net/$metadata#Microsoft.Azure.Search.V2023_11_01.IndexStatistics",
99-
"documentCount": 108,
100-
"storageSize": 5853396,
101-
"vectorIndexSize": 1342756
102-
}
102+
Here's GET Service Statistics:
103+
104+
```http
105+
GET {{baseUrl}}/servicestats?api-version=2023-11-01 HTTP/1.1
106+
Content-Type: application/json
107+
api-key: {{apiKey}}
103108
```
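As a rough illustration, the same request could be built in Python with only the standard library. The helper name `build_service_stats_request` and the placeholder endpoint and key are assumptions made for this sketch. Only the `/servicestats` path, the `api-version` value, and the `api-key` header come from the request above.

```python
from urllib.request import Request


def build_service_stats_request(base_url: str, api_key: str,
                                api_version: str = "2023-11-01") -> Request:
    """Build (but don't send) a GET Service Statistics request.

    base_url and api_key are placeholders supplied by the caller.
    """
    url = f"{base_url.rstrip('/')}/servicestats?api-version={api_version}"
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return Request(url, headers=headers, method="GET")


# Hypothetical values; substitute your own service endpoint and key.
request = build_service_stats_request("https://my-demo.search.windows.net",
                                      "<your-api-key>")
print(request.get_method(), request.full_url)
# To send it, pass the request to urllib.request.urlopen and parse the JSON body.
```

Keeping request construction separate from sending makes the URL and headers easy to inspect before any call reaches the service.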
104109

105-
Return service statistics to compare usage against available quota at the service level:
110+
The response includes metrics for `storageSize`, which doesn't distinguish between vector and nonvector indexes. The `vectorIndexSize` statistic shows usage and quota at the service level.
106111

107112
```json
108113
{
@@ -136,6 +141,24 @@ Return service statistics to compare usage against available quota at the servic
136141
}
137142
```
138143

144+
You can also send a GET Index Statistics request to get the physical size of the index on disk, plus the in-memory size of the vector fields.
145+
146+
```http
147+
GET {{baseUrl}}/indexes/vector-healthplan-idx/stats?api-version=2023-11-01 HTTP/1.1
148+
Content-Type: application/json
149+
api-key: {{apiKey}}
150+
```
151+
152+
The response includes usage information at the index level. This example is based on the index created in the [integrated vectorization quickstart](search-get-started-portal-import-vectors.md) that chunks and vectorizes health plan PDFs. Each chunk contributes to `documentCount`.
153+
154+
```json
155+
{
156+
"@odata.context": "https://my-demo.search.windows.net/$metadata#Microsoft.Azure.Search.V2023_11_01.IndexStatistics",
157+
"documentCount": 147,
158+
"storageSize": 4592870,
159+
"vectorIndexSize": 915484
160+
}
161+
```
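To make the byte counts easier to read, here's a minimal Python sketch that parses an index statistics body like the one above and converts the sizes to megabytes. The function name `summarize_index_stats` is hypothetical, and the sketch assumes decimal megabytes (1 MB = 1,000,000 bytes). The sample values are copied from the response above.

```python
import json

# Sample body copied from the GET Index Statistics response above.
response_body = """
{
  "documentCount": 147,
  "storageSize": 4592870,
  "vectorIndexSize": 915484
}
"""


def summarize_index_stats(body: str) -> dict:
    """Return sizes in megabytes plus the vector-to-storage ratio."""
    stats = json.loads(body)
    storage = stats["storageSize"]      # bytes on disk, vector and nonvector
    vector = stats["vectorIndexSize"]   # bytes in memory, vector indexes only
    return {
        "documents": stats["documentCount"],
        "storage_mb": storage / 1_000_000,   # assumes decimal megabytes
        "vector_index_mb": vector / 1_000_000,
        "vector_to_storage_ratio": vector / storage,
    }


summary = summarize_index_stats(response_body)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The ratio compares an in-memory figure with an on-disk figure, so treat it as a rough indicator, not a breakdown of disk usage.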
139162
---
140163

141164
## Factors affecting vector index size
@@ -206,8 +229,4 @@ To obtain the **vector index size**, multiply this **raw_size** by the **algorit
206229

207230
## How vector fields affect disk storage
208231

209-
Disk storage overhead of vector data is roughly three times the size of vector index size.
210-
211-
### Storage vs. vector index size quotas
212-
213-
Service storage and vector index size quotas aren't separate quotas. Vector indexes contribute to the [storage quota for the search service](search-limits-quotas-capacity.md#service-limits) as a whole. For example, if your storage quota is exhausted but there's remaining vector quota, you can't index any more documents, regardless if they're vector documents, until you scale up in partitions to increase storage quota or delete documents (either text or vector) to reduce storage usage. Similarly, if vector quota is exhausted but there's remaining storage quota, further indexing attempts fail until vector quota is freed, either by deleting some vector documents or by scaling up in partitions.
232+
Most of this article provides information about the size of vectors in memory. If you want to know about vector size on disk, the disk consumption for vector data is roughly three times the size of the vector index in memory. For example, if your `vectorIndexSize` usage is at 100 megabytes (100 million bytes), you would have used at least 300 megabytes of `storageSize` quota to accommodate your vector indexes.
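
The rule of thumb above can be sketched as a small calculation. The function name `estimate_vector_disk_bytes` is hypothetical, and the factor of three is the approximation stated in this section, not an exact multiplier.

```python
# "Roughly three times", per the text above; an approximation, not a guarantee.
DISK_TO_MEMORY_FACTOR = 3


def estimate_vector_disk_bytes(vector_index_size_bytes: int,
                               factor: float = DISK_TO_MEMORY_FACTOR) -> float:
    """Estimate disk consumption from the in-memory vector index size."""
    return vector_index_size_bytes * factor


in_memory = 100 * 1_000_000  # 100 megabytes in bytes (decimal megabytes assumed)
on_disk = estimate_vector_disk_bytes(in_memory)
print(f"{on_disk / 1_000_000:.0f} MB")  # prints "300 MB"
```

Feeding in the `vectorIndexSize` value from GET Index Statistics gives a lower-bound estimate of how much `storageSize` quota the vector data accounts for.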

includes/azure-search-limits-per-service.md

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ Currently, there's no in-place upgrade. You should [create a new search service]
4040
| Resource | Free | Basic | S1 | S2 | S3 | S3 HD | L1 | L2 |
4141
|----------|------|--------|----|----|----|------------|----|----|
4242
| Service level agreement (SLA) | No |Yes |Yes |Yes |Yes |Yes |Yes |Yes |
43-
| Storage (partition size) | 50 MB | 15 GB | 160 GB | 350 GB | 700 GB |200 GB| 1 TB | 2 TB |
43+
| Storage (partition size) | 50 MB | 15 GB | 160 GB | 350 GB | 700 GB |700 GB| 1 TB | 2 TB |
4444
| Partitions | N/A |3 |12 |12 |12 |3 |12 |12 |
4545
| Replicas | N/A | 3 |12 |12 |12 |12 |12 |12 |
4646
