articles/cosmos-db/troubleshoot-query-performance.md
Lines changed: 42 additions & 39 deletions
@@ -4,26 +4,29 @@ description: Learn how to identify, diagnose, and troubleshoot Azure Cosmos DB S
4
4
author: ginamr
5
5
ms.service: cosmos-db
6
6
ms.topic: troubleshooting
7
-
ms.date: 01/10/2020
7
+
ms.date: 01/14/2020
8
8
ms.author: girobins
9
9
ms.subservice: cosmosdb-sql
10
10
ms.reviewer: sngun
11
11
---
12
12
# Troubleshoot query issues when using Azure Cosmos DB
13
13
14
-
This article walks through a general recommended approach for troubleshooting queries in Azure Cosmos DB. While the steps outlined in this document should not be considered a “catch all” for potential query issues, we have consolidated the most common performance tips here. You should use this document as a starting place for troubleshooting for Azure Cosmos DB’s core (SQL) API.
14
+
This article walks through a general recommended approach for troubleshooting queries in Azure Cosmos DB. While the steps outlined in this document should not be considered a “catch all” for potential query issues, we have included the most common performance tips here. You should use this document as a starting place for troubleshooting slow or expensive queries in Azure Cosmos DB’s core (SQL) API. You use [diagnostics logs](cosmosdb-monitor-resource-logs.md) to identify queries that are slow or consume significant amounts of throughput.
15
15
16
16
You can broadly categorize query optimizations in Azure Cosmos DB: Optimizations that reduce the Request Unit (RU) charge of the query and optimizations that just reduce latency. By reducing the RU charge of a query, you will almost certainly decrease latency as well.
17
+
17
18
This document will use examples that can be recreated using the [nutrition](https://github.com/CosmosDB/labs/blob/master/dotnet/setup/NutritionData.json) dataset.
18
19
19
20
### Obtaining query metrics:
20
21
21
-
When optimizing a query in Azure Cosmos DB, the first step is always to [obtain the query metrics](profile-sql-api-query.md) for your query. These are also available through the Azure Portal as shown below:
22
+
When optimizing a query in Azure Cosmos DB, the first step is always to [obtain the query metrics](profile-sql-api-query.md) for your query. These are also available through the Azure portal as shown below:
After obtaining query metrics, compare the Retrieved Document Count with the Output Document Count for your query. Use this comparison to identify the relevant sections to reference below.
26
27
28
+
The Retrieved Document Count is the number of documents that the query needed to load. The Output Document Count is the number of documents that were needed for the results of the query. If the Retrieved Document Count is significantly higher than the Output Document Count, then there was at least one part of your query that was unable to utilize the index and needed to do a scan.
29
+
27
30
You can reference the below section to understand the relevant query optimizations for your scenario:
28
31
29
32
### Query's RU charge is too high
@@ -72,7 +75,7 @@ Query:
72
75
73
76
```sql
74
77
SELECT VALUE c.description
75
-
FROMc
78
+
FROM c
76
79
WHERE UPPER(c.description) = "BABYFOOD, DESSERT, FRUIT DESSERT, WITHOUT ASCORBIC ACID, JUNIOR"
77
80
```
78
81
@@ -102,7 +105,7 @@ Client Side Metrics
102
105
Request Charge : 4,059.95 RUs
103
106
```
104
107
105
-
Retrieved Document Count (60,951) is significantly greater than Output Document Count (7) so this query needed to do a scan.
108
+
Retrieved Document Count (60,951) is significantly greater than Output Document Count (7) so this query needed to do a scan. In this case, the system function [UPPER()](sql-query-upper.md) does not utilize the index.
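To make the comparison concrete, here is a minimal Python sketch that computes how many documents were loaded per document returned, using the two counts quoted above. The helper name `scan_ratio` is hypothetical and is not part of any Azure Cosmos DB SDK; it only illustrates the arithmetic.

```python
def scan_ratio(retrieved_document_count: int, output_document_count: int) -> float:
    """Return documents loaded per document returned (hypothetical helper)."""
    if output_document_count == 0:
        # Nothing was returned: any retrieved document was loaded for no output.
        return float("inf") if retrieved_document_count > 0 else 1.0
    return retrieved_document_count / output_document_count


# Counts taken from the query metrics quoted in this article.
ratio = scan_ratio(60_951, 7)
print(f"{ratio:,.0f} documents loaded per document returned")
```

A ratio close to 1 suggests the filter was served from the index; a ratio in the thousands, as here, points to a scan.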
106
109
107
110
## Ensure that the indexing policy includes necessary paths
108
111
@@ -166,9 +169,9 @@ If the expression can be translated into a range of string values, then it can u
166
169
167
170
Here is the list of string functions that can utilize the index:
168
171
169
-
- STARTSWITH(str_expr, str_expr)
170
-
- LEFT(str_expr, num_expr) = str_expr
171
-
- SUBSTRING(str_expr, num_expr, num_expr) = str_expr, but only if first num_expr is 0
172
+
- STARTSWITH(str_expr, str_expr)
173
+
- LEFT(str_expr, num_expr) = str_expr
174
+
- SUBSTRING(str_expr, num_expr, num_expr) = str_expr, but only if first num_expr is 0
172
175
173
176
Some common system functions that do not use the index and must load each document are below:
174
177
@@ -184,7 +187,7 @@ Other parts of the query may still utilize the index despite the system function
184
187
185
188
## Optimize queries with both a filter and an ORDER BY clause
186
189
187
-
While queries with a filter and an ORDER BY clause will normally utilize a range index, they will be more efficient if they can be served from a composite index. You can observe the impact by running a query on the [nutrition](https://github.com/CosmosDB/labs/blob/master/dotnet/setup/NutritionData.json) dataset.
190
+
While queries with a filter and an ORDER BY clause will normally utilize a range index, they will be more efficient if they can be served from a composite index. In addition to modifying the indexing policy, you should add all properties in the composite index to the ORDER BY clause. This query modification will ensure that it utilizes the composite index. You can observe the impact by running a query on the [nutrition](https://github.com/CosmosDB/labs/blob/master/dotnet/setup/NutritionData.json) dataset.
188
191
189
192
### Original
190
193
@@ -214,10 +217,12 @@ Indexing policy:
214
217
215
218
### Optimized
216
219
217
-
Updated query:
220
+
Updated query (includes both properties in the ORDER BY clause):
218
221
219
222
```sql
220
-
SELECT * FROM c WHERE c.foodGroup = “Soups, Sauces, and Gravies” ORDER BY c.foodGroup, c._ts ASC
223
+
SELECT * FROM c
224
+
WHERE c.foodGroup = “Soups, Sauces, and Gravies”
225
+
ORDER BY c.foodGroup, c._ts ASC
221
226
```
222
227
223
228
Updated indexing policy:
@@ -252,30 +257,15 @@ Updated indexing policy:
252
257
253
258
## Optimize queries that use DISTINCT
254
259
255
-
It will be more efficient to find the DISTINCT set of results if the duplicate results are consecutive. Adding an ORDER BY clause to the query and a composite index will ensure that duplicate results are consecutive. If you need to ORDER BY multiple properties, add a composite index. You can observe the impact by running a query on the [nutrition](https://github.com/CosmosDB/labs/blob/master/dotnet/setup/NutritionData.json) dataset.
260
+
It will be more efficient to find the `DISTINCT` set of results if the duplicate results are consecutive. Adding an ORDER BY clause to the query and a composite index will ensure that duplicate results are consecutive. If you need to ORDER BY multiple properties, add a composite index. You can observe the impact by running a query on the [nutrition](https://github.com/CosmosDB/labs/blob/master/dotnet/setup/NutritionData.json) dataset.
256
261
257
262
### Original
258
263
259
264
Query:
260
265
261
266
```sql
262
-
SELECT DISTINCT c.foodGroup FROM c
263
-
```
264
-
265
-
Indexing policy:
266
-
267
-
```json
268
-
{
269
-
"automatic":true,
270
-
"indexingMode":"Consistent",
271
-
"includedPaths":[
272
-
{
273
-
"path":"/*"
274
-
}
275
-
],
276
-
"excludedPaths":[]
277
-
}
278
-
267
+
SELECT DISTINCT c.foodGroup
268
+
FROM c
279
269
```
280
270
281
271
**RU Charge:** 32.39 RU's
@@ -285,13 +275,15 @@ Indexing policy:
285
275
Updated query:
286
276
287
277
```sql
288
-
SELECT DISTINCT c.foodGroup FROM c ORDER BY c.foodGroup
278
+
SELECT DISTINCT c.foodGroup
279
+
FROM c
280
+
ORDER BY c.foodGroup
289
281
```
290
282
291
283
**RU Charge:** 3.38 RU's
292
284
293
285
## Optimize JOIN expressions by using a subquery
294
-
Multi-value subqueries can optimize JOIN expressions by pushing predicates after each select-many expression rather than after all cross-joins in the WHERE clause.
286
+
Multi-value subqueries can optimize `JOIN` expressions by pushing predicates after each select-many expression rather than after all cross-joins in the WHERE clause.
295
287
296
288
Consider the following query:
297
289
@@ -305,7 +297,9 @@ WHERE t.name = 'infant formula' AND (n.nutritionValue > 0
305
297
AND n.nutritionValue < 10) AND s.amount > 1
306
298
```
307
299
308
-
For this query, the index will match any document that has a tag with the name "infant formula",nutrtionValue greater than 0, and serving amount greater than 1. The JOIN expression here will perform the cross-product of all items of tags, nutrients, and servings arrays for each matching document before any filter is applied. The WHERE clause will then apply the filter predicate on each <c, t, n, s> tuple.
300
+
**RU Charge:** 167.62 RU's
301
+
302
+
For this query, the index will match any document that has a tag with the name "infant formula", nutritionValue greater than 0, and serving amount greater than 1. The JOIN expression here will perform the cross-product of all items of tags, nutrients, and servings arrays for each matching document before any filter is applied. The WHERE clause will then apply the filter predicate on each `<c, t, n, s>` tuple.
309
303
310
304
For instance, if a matching document had 10 items in each of the three arrays, it will expand to 1 x 10 x 10 x 10 (that is, 1,000) tuples. Using subqueries here can help in filtering out joined array items before joining with the next expression.
311
305
@@ -319,6 +313,8 @@ JOIN (SELECT VALUE n FROM n IN c.nutrients WHERE n.nutritionValue > 0 AND n.nutr
319
313
JOIN (SELECT VALUE s FROM s IN c.servings WHERE s.amount > 1)
320
314
```
321
315
316
+
**RU Charge:** 22.17 RU's
317
+
322
318
Assume that only one item in the tags array matches the filter, and there are five items for both nutrients and servings arrays. The JOIN expressions will then expand to 1 x 1 x 5 x 5 = 25 items, as opposed to 1,000 items in the first query.
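The tuple counts in the two scenarios can be checked with a short Python sketch. The function name `joined_tuple_count` is hypothetical; it simply multiplies the number of array items that survive into the cross-product, with the array lengths taken from the examples in the text.

```python
from math import prod


def joined_tuple_count(array_lengths: list[int]) -> int:
    """Number of tuples a chain of JOINs produces for one document.

    Each JOIN multiplies the running count by the number of array items
    it contributes, so the total is the product of the lengths.
    """
    return prod(array_lengths)


# Without subqueries: all 10 tags, 10 nutrients, and 10 servings are crossed
# before the WHERE clause filters anything.
unfiltered = joined_tuple_count([10, 10, 10])

# With subqueries: 1 matching tag, 5 nutrients, and 5 servings survive
# their filters before the next JOIN is applied.
filtered = joined_tuple_count([1, 5, 5])

print(unfiltered, filtered)  # 1000 25
```

Filtering inside each subquery shrinks every factor of the product, which is why the reduction compounds across JOINs.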
323
319
324
320
## Optimizations for queries where Retrieved Document Count is approximately equal to Output Document Count:
@@ -334,23 +330,27 @@ If you have a large number of provisioned RU’s (over 30,000) or a large amount
334
330
For example, if we create a container with the partition key foodGroup, the following queries would only need to check a single physical partition:
335
331
336
332
```sql
337
-
SELECT * FROM c WHERE c.foodGroup = “Soups, Sauces, and Gravies” and c.description = "Mushroom, oyster, raw"
333
+
SELECT * FROM c
334
+
WHERE c.foodGroup = “Soups, Sauces, and Gravies” and c.description = "Mushroom, oyster, raw"
338
335
```
339
336
340
337
These queries would also be optimized by including the partition key in the query:
341
338
342
339
```sql
343
-
SELECT * FROM c WHERE c.foodGroup IN (“Soups, Sauces, and Gravies”, “"Vegetables and Vegetable Products”) and c.description = "Mushroom, oyster, raw"
340
+
SELECT * FROM c
341
+
WHERE c.foodGroup IN (“Soups, Sauces, and Gravies”, “"Vegetables and Vegetable Products”) and c.description = "Mushroom, oyster, raw"
344
342
```
345
343
346
344
Queries that have range filters on the partition key or don’t have any filters on the partition key, will need to “fan-out” and check every physical partition’s index for results.
347
345
348
346
```sql
349
-
SELECT * FROM c WHERE c.description = "Mushroom, oyster, raw"
347
+
SELECT * FROM c
348
+
WHERE c.description = "Mushroom, oyster, raw"
350
349
```
351
350
352
351
```sql
353
-
SELECT * FROM c WHERE c.foodGroup > “Soups, Sauces, and Gravies” and c.description = "Mushroom, oyster, raw"
352
+
SELECT * FROM c
353
+
WHERE c.foodGroup > “Soups, Sauces, and Gravies” and c.description = "Mushroom, oyster, raw"
354
354
```
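As a rough illustration of why these filters behave differently, the Python sketch below models which physical partitions a query has to check. Everything in it is an assumption made for illustration: the SHA-256 based routing is a toy stand-in and not the hash function the service uses, and `physical_partition` and `partitions_to_check` are hypothetical names.

```python
import hashlib


def physical_partition(partition_key_value: str, partition_count: int) -> int:
    """Toy hash routing (assumption): map a partition key value to a partition."""
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def partitions_to_check(filter_kind: str, values: list[str], partition_count: int) -> set[int]:
    """Return the set of partitions a query must visit.

    filter_kind is "equality" or "in" for filters that name exact partition
    key values; anything else (a range filter, or no filter on the partition
    key) is treated as a fan-out to every partition.
    """
    if filter_kind in ("equality", "in"):
        return {physical_partition(v, partition_count) for v in values}
    return set(range(partition_count))


single = partitions_to_check("equality", ["Soups, Sauces, and Gravies"], 10)
fan_out = partitions_to_check("range", ["Soups, Sauces, and Gravies"], 10)
print(len(single), len(fan_out))  # 1 10
```

An equality or IN filter names exact key values, so only the partitions that own those values are visited; a range filter cannot be mapped to specific hashed values, so every partition is checked.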
355
355
356
356
## Optimize queries that have a filter on multiple properties
@@ -360,11 +360,13 @@ While queries with filters on multiple properties will normally utilize a range
360
360
Here are some examples of queries which could be optimized with a composite index:
361
361
362
362
```sql
363
-
SELECT * FROM c WHERE c.foodGroup = "Vegetables and Vegetable Products" AND c._ts = 1575503264
363
+
SELECT * FROM c
364
+
WHERE c.foodGroup = "Vegetables and Vegetable Products" AND c._ts = 1575503264
364
365
```
365
366
366
367
```sql
367
-
SELECT * FROM c WHERE c.foodGroup = "Vegetables and Vegetable Products" AND c._ts > 1575503264
368
+
SELECT * FROM c
369
+
WHERE c.foodGroup = "Vegetables and Vegetable Products" AND c._ts > 1575503264
368
370
```
369
371
370
372
Here is the relevant composite index:
@@ -404,9 +406,10 @@ Queries that are run from a different region than the Azure Cosmos DB account wi
404
406
405
407
## Increasing provisioned throughput
406
408
407
-
In Azure Cosmos DB, your provisioned throughput is measured in Request Units (RU’s). Let’s imagine you have a query that consumes 5 RU’s of throughput. For example, if you provision 1,000 RU’s, you would be able to run that query 200 times per second. If you attempted to run the query when there was not enough throughput available, Azure Cosmos DB would return an HTTP 429 error. Any of the current Core (SQL) API sdk's will automatically retry this query after waiting a brief period. Throttled requests take a longer amount of time, so increasing provisioned throughput can improve query latency. You can observe the [total number of requests throttled requests(use-metrics.md#understand-how-many-requests-are-succeeding-or-causing-errors in the Metrics blade of the Azure Portal.
409
+
In Azure Cosmos DB, your provisioned throughput is measured in Request Units (RU’s). Let’s imagine you have a query that consumes 5 RU’s of throughput. For example, if you provision 1,000 RU’s, you would be able to run that query 200 times per second. If you attempted to run the query when there was not enough throughput available, Azure Cosmos DB would return an HTTP 429 error. Any of the current Core (SQL) API sdk's will automatically retry this query after waiting a brief period. Throttled requests take a longer amount of time, so increasing provisioned throughput can improve query latency. You can observe the [total number of requests throttled requests](use-metrics.md#understand-how-many-requests-are-succeeding-or-causing-errors.md) in the Metrics blade of the Azure portal.
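The throughput arithmetic above can be written as a small Python sketch. The helper `max_queries_per_second` is a hypothetical name, and the calculation assumes the query is the only workload consuming the provisioned RUs.

```python
def max_queries_per_second(provisioned_rus: float, query_charge_rus: float) -> float:
    """Upper bound on executions per second before requests are throttled.

    Assumes no other operations share the provisioned throughput.
    """
    if query_charge_rus <= 0:
        raise ValueError("query_charge_rus must be positive")
    return provisioned_rus / query_charge_rus


# The example from the text: a 5 RU query against 1,000 provisioned RUs.
print(max_queries_per_second(1_000, 5))  # 200.0

# The unoptimized scan query earlier in this article cost 4,059.95 RUs,
# so the same 1,000 RUs cover well under one execution per second.
print(max_queries_per_second(1_000, 4_059.95) < 1)  # True
```

The second call shows why reducing a query's RU charge is usually worth trying before raising provisioned throughput.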
408
410
409
411
## Increasing MaxConcurrency
412
+
410
413
Parallel queries work by querying multiple partitions in parallel. However, data from an individual partitioned collection is fetched serially with respect to the query. So, adjusting the MaxConcurrency to the number of partitions has the maximum chance of achieving the most performant query, provided all other system conditions remain the same. If you don't know the number of partitions, you can set the MaxConcurrency (or MaxDegreesOfParallelism in older sdk versions) to a high number, and the system chooses the minimum (number of partitions, user provided input) as the maximum degree of parallelism.
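The rule described above, where the system takes the minimum of the partition count and the user-provided value, can be sketched in Python as follows. `effective_parallelism` is a hypothetical helper that mirrors the description; it is not an SDK call.

```python
def effective_parallelism(partition_count: int, max_concurrency: int) -> int:
    """Degree of parallelism actually used, per the rule described above."""
    return min(partition_count, max_concurrency)


# Setting a high value when the partition count is unknown is harmless:
print(effective_parallelism(partition_count=10, max_concurrency=100))  # 10

# Setting a value below the partition count caps the parallelism:
print(effective_parallelism(partition_count=10, max_concurrency=4))  # 4
```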