articles/search/performance-benchmarks.md
+13 -13 (13 additions & 13 deletions)
@@ -8,12 +8,12 @@ ms.service: cognitive-search
 ms.custom:
 - ignite-2023
 ms.topic: conceptual
-ms.date: 01/31/2023
+ms.date: 01/19/2024
 ---

 # Azure AI Search performance benchmarks

-Azure AI Search's performance depends on a [variety of factors](search-performance-tips.md) including the size of your search service and the types of queries you're sending. To help estimate the size of search service needed for your workload, we've run several benchmarks to document the performance for different search services and configurations. These benchmarks in no way guarantee a certain level of performance from your service but can give you an idea of the performance you can expect.
+Azure AI Search's performance depends on a [variety of factors](search-performance-tips.md) including the size of your search service and the types of queries you're sending. To help estimate the size of search service needed for your workload, we've run several benchmarks to document the performance for different search services and configurations. *These benchmarks in no way guarantee a certain level of performance from your service but can give you an idea of the performance you can expect*.

 To cover a range of different use cases, we ran benchmarks for two main scenarios:
@@ -49,11 +49,12 @@ Each scenario used at least 10,000 unique queries to avoid tests being overly sk

 - **Latency** - The server's latency for a query; these numbers don't include [round trip delay (RTT)](https://en.wikipedia.org/wiki/Round-trip_delay). Values are in milliseconds (ms).

-### Disclaimer
+##Testing disclaimer

 The code we used to run these benchmarks is available on the [azure-search-performance-testing](https://github.com/Azure-Samples/azure-search-performance-testing/tree/main/other_tools) repository. It's worth noting that we observed slightly lower QPS levels with the [JMeter performance testing solution](https://github.com/Azure-Samples/azure-search-performance-testing) than in the benchmarks. The differences can be attributed to differences in the style of the tests. This speaks to the importance of making your performance tests as similar to your production workload as possible.

-These benchmarks in no way guarantee a certain level of performance from your service but can give you an idea of the performance you can expect based on your scenario.
+> [!IMPORTANT]
+> These benchmarks in no way guarantee a certain level of performance from your service but can give you an idea of the performance you can expect based on your scenario.

 If you have any questions or concerns, reach out to us at [email protected].
@@ -87,15 +88,14 @@ The following chart shows the highest query load a service could handle for an e

 #### Query latency

-Query latency varies based on the load of the service and services under higher stress will have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.
+Query latency varies based on the load of the service and services under higher stress have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.

 | Percentage of max QPS | Average latency | 25% | 75% | 90% | 95% | 99%|
 |---|---|---|---| --- | --- | --- |
 | 20% | 104 ms | 35 ms | 115 ms | 177 ms | 257 ms | 738 ms |
 | 50% | 140 ms | 47 ms | 144 ms | 241 ms | 400 ms | 1175 ms |
 | 80% | 239 ms | 77 ms | 248 ms | 466 ms | 763 ms | 1752 ms |

-
 ### S2 Performance

 #### Queries per second
@@ -106,7 +106,7 @@ The following chart shows the highest query load a service could handle for an e

 #### Query latency

-Query latency varies based on the load of the service and services under higher stress will have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.
+Query latency varies based on the load of the service and services under higher stress have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.

 | Percentage of max QPS | Average latency | 25% | 75% | 90% | 95% | 99%|
 |---|---|---|---| --- | --- | --- |
@@ -126,7 +126,7 @@ In this case, we see that adding a second partition significantly increases the

 #### Query latency

-Query latency varies based on the load of the service and services under higher stress will have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.
+Query latency varies based on the load of the service and services under higher stress have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.

 | Percentage of max QPS | Average latency | 25% | 75% | 90% | 95% | 99%|
 |---|---|---|---| --- | --- | --- |
@@ -153,7 +153,7 @@ The following chart shows the highest query load a service could handle for an e

 #### Query latency

-Query latency varies based on the load of the service and services under higher stress will have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.
+Query latency varies based on the load of the service and services under higher stress have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.

 | Percentage of max QPS | Average latency | 25% | 75% | 90% | 95% | 99%|
 |---|---|---|---| --- | --- | --- |
@@ -171,7 +171,7 @@ The following chart shows the highest query load a service could handle for an e

 #### Query latency

-Query latency varies based on the load of the service and services under higher stress will have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.
+Query latency varies based on the load of the service and services under higher stress have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.

 | Percentage of max QPS | Average latency | 25% | 75% | 90% | 95% | 99%|
 |---|---|---|---| --- | --- | --- |
@@ -189,7 +189,7 @@ The following chart shows the highest query load a service could handle for an e

 #### Query latency

-Query latency varies based on the load of the service and services under higher stress will have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.
+Query latency varies based on the load of the service and services under higher stress have a higher average query latency. The following table shows the 25th, 50th, 75th, 90th, 95th, and 99th percentiles of query latency for three different usage levels.

 | Percentage of max QPS | Average latency | 25% | 75% | 90% | 95% | 99%|
 |---|---|---|---| --- | --- | --- |
@@ -204,9 +204,9 @@ Through these benchmarks, you can get an idea of the performance Azure AI Search
 Some key take ways from these benchmarks are:

 * An S2 can typically handle at least four times the query volume as an S1
-* An S2 will typically have lower latency than an S1 at comparable query volumes
+* An S2 typically has lower latency than an S1 at comparable query volumes
 * As you add replicas, the QPS a service can handle typically scales linearly (for example, if one replica can handle 10 QPS then five replicas can usually handle 50 QPS)
-* The higher the load on the service, the higher the average latency will be
+* The higher the load on the service, the higher the average latency

 You can also see that performance can vary drastically between scenarios. If you're not getting the performance you expect, check out the [tips for better performance](search-performance-tips.md).
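
The replica takeaway in this hunk implies a simple capacity estimate. The following is a minimal sketch, assuming QPS scales linearly with replica count, which the article describes as typical rather than guaranteed. The function name `replicas_needed` and the example figures are illustrative assumptions, not part of the benchmark code.

```python
import math


def replicas_needed(target_qps: float, qps_per_replica: float) -> int:
    """Estimate how many replicas a target query load needs.

    Assumes linear scaling: n replicas handle roughly n * qps_per_replica.
    Real services can deviate from this, so treat the result as a starting
    point for your own load tests.
    """
    if target_qps < 0 or qps_per_replica <= 0:
        raise ValueError("target_qps must be >= 0 and qps_per_replica must be > 0")
    # Round up, and never report fewer than one replica.
    return max(1, math.ceil(target_qps / qps_per_replica))


if __name__ == "__main__":
    # The article's example: one replica at 10 QPS, target of 50 QPS.
    print(replicas_needed(50, 10))  # 5
```

Rounding up rather than to the nearest integer keeps the estimate on the conservative side, since latency rises as a service approaches its maximum QPS.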
articles/search/search-get-started-vector.md
+7 -7 (7 additions & 7 deletions)
@@ -16,22 +16,22 @@ ms.date: 01/19/2024

 Get started with vector search in Azure AI Search using the **2023-11-01** REST APIs that create, load, and query a search index.

-Search indexes can have vector and non-vector fields. You can create pure vector queries, or hybrid queries targeting both vector *and* textual fields configured for filters, sorts, facets, and semantic reranking.
+Search indexes can have vector and nonvector fields. You can execute pure vector queries, or hybrid queries targeting both vector *and* textual fields configured for filters, sorts, facets, and semantic reranking.

 > [!NOTE]
-> Looking for [built-in data chunking and vectorization public preview](vector-search-integrated-vectorization.md)? Try the [**Import and vectorize data** wizard](search-get-started-portal-import-vectors.md)instead.
+> The stable REST API version depends on external modules for data chunking and embedding. If you want test-drive the [built-in data chunking and vectorization (public preview)](vector-search-integrated-vectorization.md) features, try the [**Import and vectorize data** wizard](search-get-started-portal-import-vectors.md)for an end-to-end walkthrough.
++ [Sample Postman collection](https://github.com/Azure-Samples/azure-search-postman-samples/tree/main/Quickstart-vectors), with requests targeting the **2023-11-01** API version of Azure AI Search.
+
 + An Azure subscription. [Create one for free](https://azure.microsoft.com/free/).

 + Azure AI Search, in any region and on any tier. Most existing services support vector search. For a small subset of services created prior to January 2019, an index containing vector fields will fail on creation. In this situation, a new service must be created.

-For the optional [semantic ranking](semantic-search-overview.md) shown in the last example, your search service must be Basic tier or higher, with [semantic ranking enabled](semantic-how-to-enable-disable.md).
-
-+ [Sample Postman collection](https://github.com/Azure-Samples/azure-search-postman-samples/tree/main/Quickstart-vectors), with requests targeting the **2023-11-01** API version of Azure AI Search.
++ Optionally, for [semantic reranking](semantic-search-overview.md) shown in the last example, your search service must be Basic tier or higher, with [semantic ranking enabled](semantic-how-to-enable-disable.md).

 + Optionally, an [Azure OpenAI](https://aka.ms/oai/access) resource with a deployment of **text-embedding-ada-002**. The quickstart includes an optional step for generating new text embeddings, but we provide existing embeddings so that you can skip this step.
@@ -233,7 +233,7 @@ You should get a status HTTP 201 success.

 **Key points:**

-+ The `"fields"` collection includes a required key field, text and vector fields (such as `"Description"`, `"DescriptionVector"`) for keyword and vector search. Colocating vector and non-vector fields in the same index enables hybrid queries. For instance, you can combine filters, keyword search with semantic ranking, and vectors into a single query operation.
++ The `"fields"` collection includes a required key field, text and vector fields (such as `"Description"`, `"DescriptionVector"`) for keyword and vector search. Colocating vector and nonvector fields in the same index enables hybrid queries. For instance, you can combine filters, keyword search with semantic ranking, and vectors into a single query operation.

 + Vector fields must be `"type": "Collection(Edm.Single)"` with `"dimensions"` and `"vectorSearchProfile"` properties. See [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) for property descriptions.
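
The vector field requirements in the key points above can be sketched as a small Python helper that builds the field definition as a dictionary. This is an illustrative sketch only: the helper name `vector_field` and the profile name `my-vector-profile` are hypothetical, the value 1536 assumes embeddings from **text-embedding-ada-002**, and a real field definition usually needs more properties than the three named here (see Create or Update Index).

```python
import json


def vector_field(name: str, dimensions: int, profile: str) -> dict:
    """Build a vector field definition with the properties the article names.

    Only "type", "dimensions", and "vectorSearchProfile" come from the
    article; other field attributes are intentionally left out.
    """
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    return {
        "name": name,
        "type": "Collection(Edm.Single)",
        "dimensions": dimensions,
        # Must match a profile defined elsewhere in the index
        # (hypothetical name used in the example below).
        "vectorSearchProfile": profile,
    }


if __name__ == "__main__":
    field = vector_field("DescriptionVector", 1536, "my-vector-profile")
    print(json.dumps(field, indent=2))
```

The value passed as `dimensions` has to equal the length of the vectors your embedding model produces.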
@@ -476,7 +476,7 @@ The response for the vector equivalent of "classic lodging near running trails,

 ### Single vector search with filter

-You can add filters, but the filters are applied to the non-vector content in your index. In this example, the filter applies to the `"Tags"` field, filtering out any hotels that don't provide free WIFI.
+You can add filters, but the filters are applied to the nonvector content in your index. In this example, the filter applies to the `"Tags"` field, filtering out any hotels that don't provide free WIFI.

 This example sets `vectorFilterMode` to pre-query filtering, which is the default, so you don't need to set it. It's listed here for awareness because it's a newer feature.
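
As a rough illustration of the filtered query described in this hunk, the sketch below assembles a request body in Python. It assumes the **2023-11-01** query shape (`vectorQueries`, `filter`, `vectorFilterMode`). The helper name, the tag value `'free wifi'`, and the three-element vector are placeholders; a real query vector has as many elements as the field's `dimensions`.

```python
import json


def build_filtered_vector_query(vector, filter_expr: str, k: int = 7) -> dict:
    """Assemble a single-vector query body with a pre-query filter.

    The filter applies to nonvector content (here, the "Tags" field);
    the vector query targets the "DescriptionVector" field.
    """
    return {
        "filter": filter_expr,
        # "preFilter" is the default, so this line is optional; it is
        # shown explicitly to mirror the article's example.
        "vectorFilterMode": "preFilter",
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(vector),
                "fields": "DescriptionVector",
                "k": k,
            }
        ],
    }


if __name__ == "__main__":
    body = build_filtered_vector_query(
        [0.1, 0.2, 0.3],  # placeholder embedding, not a real vector
        "Tags/any(tag: tag eq 'free wifi')",  # assumed tag value
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request against the index. The endpoint and API key are omitted because they depend on your service.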