|
1 | 1 | ---
|
2 |
| -title: Monitoring and debugging with metrics in Azure Cosmos DB | Microsoft Docs |
| 2 | +title: Monitor and debug with metrics in Azure Cosmos DB | Microsoft Docs |
3 | 3 | description: Use metrics in Azure Cosmos DB to debug common issues and monitor the database.
|
4 | 4 | keywords: metrics
|
5 | 5 | services: cosmos-db
|
6 | 6 | author: kanshiG
|
7 | 7 | manager: kfile
|
8 | 8 | editor: ''
|
9 |
| - |
10 | 9 | ms.service: cosmos-db
|
11 | 10 | ms.devlang: na
|
12 | 11 | ms.topic: conceptual
|
13 |
| -ms.date: 09/25/2017 |
| 12 | +ms.date: 11/15/2018 |
14 | 13 | ms.author: govindk
|
15 |
| - |
16 | 14 | ---
|
| 15 | +# Monitor and debug with metrics in Azure Cosmos DB |
17 | 16 |
|
18 |
| -# Monitoring and debugging with metrics in Azure Cosmos DB |
19 |
| - |
20 |
| -Azure Cosmos DB provides metrics for throughput, storage, consistency, availability, and latency. The [Azure portal](https://portal.azure.com) provides an aggregated view of these metrics; for more granular metrics, both the client SDK and the [diagnostic logs](./logging.md) are available. |
| 17 | +Azure Cosmos DB provides metrics for throughput, storage, consistency, availability, and latency. The [Azure portal](https://portal.azure.com) provides an aggregated view of these metrics. For more granular metrics, both the client SDK and the [diagnostic logs](./logging.md) are available. |
21 | 18 |
|
22 |
| -This article walks through common use cases and how Azure Cosmos DB metrics can be used to analyze and debug these issues. Metrics are collected every five minutes and are retained for seven days. |
| 19 | +This article walks through common use cases and how Azure Cosmos DB metrics can be used to analyze and debug these issues. Metrics are collected every five minutes and are kept for seven days. |
23 | 20 |
|
24 |
| -## Understanding how many requests are succeeding or causing errors |
| 21 | +## Understand how many requests are succeeding or causing errors |
25 | 22 |
|
26 |
| -To get started, head to the [Azure portal](https://portal.azure.com) and navigate to the **Metrics** blade. In the blade, find the **Number of requests exceeded capacity per 1 minute** chart. This chart shows a minute by minute total requests segmented by the status code. For more information about HTTP status codes, see [HTTP Status Codes for Azure Cosmos DB](https://docs.microsoft.com/rest/api/cosmos-db/http-status-codes-for-cosmosdb). |
| 23 | +To get started, head to the [Azure portal](https://portal.azure.com) and navigate to the **Metrics** blade. In the blade, find the **Number of requests exceeded capacity per 1 minute** chart. This chart shows the total number of requests, minute by minute, segmented by status code. For more information about HTTP status codes, see [HTTP status codes for Azure Cosmos DB](https://docs.microsoft.com/rest/api/cosmos-db/http-status-codes-for-cosmosdb). |
27 | 24 |
|
28 |
| -The most common error status code is 429 (rate limiting/throttling), which means that requests to Azure Cosmos DB are exceeding the provisioned throughput. The most common solution to this is to [scale up the RUs](./set-throughput.md) for the given collection. |
| 25 | +The most common error status code is 429 (rate limiting/throttling). This error means that requests to Azure Cosmos DB are consuming more than the provisioned throughput. The most common solution to this problem is to [scale up the RUs](./set-throughput.md) for the given collection. |
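
Before scaling up, it can help to see how a client reacts to a 429. The following is a minimal sketch, not the article's own code. It assumes the .NET SDK used elsewhere in this article (`Microsoft.Azure.Documents`), and the `ThrottlingHelper` class and `ExecuteWithRetryAsync` method are hypothetical names introduced for illustration. The SDK already retries throttled requests on its own by default, so a wrapper like this only matters once those built-in retries are exhausted.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;

// Hypothetical helper (not part of the SDK): retries an operation
// when Azure Cosmos DB responds with status code 429.
public static class ThrottlingHelper
{
    public static async Task<T> ExecuteWithRetryAsync<T>(
        Func<Task<T>> operation, int maxAttempts = 5)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (DocumentClientException ex)
                when (ex.StatusCode == (HttpStatusCode)429 && attempt < maxAttempts)
            {
                // RetryAfter is the wait time suggested by the service.
                await Task.Delay(ex.RetryAfter);
            }
        }
    }
}
```

If the last attempt is also throttled, the exception propagates to the caller. Frequent 429s that reach this code are a sign that the provisioned throughput is too low for the workload.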
29 | 26 |
|
30 | 27 | 
|
31 | 28 |
|
32 |
| -## Determining the throughput distribution across partitions |
| 29 | +## Determine the throughput distribution across partitions |
33 | 30 |
|
34 |
| -Having a good cardinality of your partition keys is essential for any scalable application. To determine the throughput distribution of any partitioned container broken down by partitions, navigate to the **Metrics blade** in the [Azure portal](https://portal.azure.com). In the **Throughput** tab, the storage breakdown is shown in the **Max consumed RU/second by each physical partition** chart. The following graphic illustrates an example of a poor distribution of data as evidenced by the skewed partition on the far left. |
| 31 | +Having a good cardinality of your partition keys is essential for any scalable application. To determine the throughput distribution of any partitioned container broken down by partitions, navigate to the **Metrics** blade in the [Azure portal](https://portal.azure.com). In the **Throughput** tab, the throughput breakdown is shown in the **Max consumed RU/second by each physical partition** chart. The following graphic illustrates a poor distribution of data, as shown by the skewed partition on the far left. |
35 | 32 |
|
36 | 33 | 
|
37 | 34 |
|
38 | 35 | An uneven throughput distribution may cause *hot* partitions, which can result in throttled requests and may require repartitioning. For more information about partitioning in Azure Cosmos DB, see [Partition and scale in Azure Cosmos DB](./partition-data.md).
|
39 | 36 |
|
40 |
| -## Determining the storage distribution across partitions |
| 37 | +## Determine the storage distribution across partitions |
41 | 38 |
|
42 |
| -Having a good cardinality of your partition is essential for any scalable application. To determine the throughput distribution of any partitioned container broken down by partitions, head to the Metrics blade in the [Azure portal](https://portal.azure.com). In the Throughput tab, the storage breakdown is shown in the Max consumed RU/second by each physical partition chart. The following graphic illustrates a poor distribution of data as evidenced by the skewed partition on the far left. |
| 39 | +Having a good cardinality of your partition keys is essential for any scalable application. To determine the storage distribution of any partitioned container broken down by partitions, head to the **Metrics** blade in the [Azure portal](https://portal.azure.com). In the **Storage** tab, the storage breakdown is shown in the **Data + Index storage consumed by top partition keys** chart. The following graphic illustrates a poor distribution of data, as shown by the skewed partition on the far left. |
43 | 40 |
|
44 | 41 | 
|
45 | 42 |
|
46 |
| -You can root cause which partition key is skewing the distribution by clicking on the partition in the chart. |
| 43 | +You can identify which partition key is skewing the distribution by clicking the partition in the chart. |
47 | 44 |
|
48 | 45 | 
|
49 | 46 |
|
50 | 47 | After identifying which partition key is causing the skew in distribution, you may have to repartition your container with a more distributed partition key. For more information about partitioning in Azure Cosmos DB, see [Partition and scale in Azure Cosmos DB](./partition-data.md).
|
51 | 48 |
|
52 |
| -## Comparing data size against index size |
| 49 | +## Compare data size against index size |
53 | 50 |
|
54 |
| -In Azure Cosmos DB, the total consumed storage is the combination of both the Data size and Index size. Typically, the index size is a fraction of the data size. In the Metrics blade in the [Azure portal](https://portal.azure.com), the Storage tab showcases the breakdown of storage consumption based on data and index. |
| 51 | +In Azure Cosmos DB, the total consumed storage is the combination of the data size and the index size. Typically, the index size is a fraction of the data size. In the **Metrics** blade in the [Azure portal](https://portal.azure.com), the **Storage** tab shows the breakdown of storage consumption based on data and index. |
55 | 52 |
|
56 | 53 | ```csharp
|
57 | 54 | // Measure the document size usage (which includes the index size)
|
58 |
| -ResourceResponse<DocumentCollection> collectionInfo = await client.ReadDocumentCollectionAsync(UriFactory.CreateDocumentCollectionUri("db", "coll")); |
| 55 | +ResourceResponse<DocumentCollection> collectionInfo = await client.ReadDocumentCollectionAsync(UriFactory.CreateDocumentCollectionUri("db", "coll")); |
59 | 56 | Console.WriteLine("Document size quota: {0}, usage: {1}", collectionInfo.DocumentQuota, collectionInfo.DocumentUsage);
|
60 |
| -``` |
61 |
| -If you would like to conserve index space, you can adjust the [indexing policy](./indexing-policies.md). |
| 57 | +``` |
| 58 | + |
| 59 | +If you would like to conserve index space, you can adjust the [indexing policy](index-policy.md). |
62 | 60 |
|
63 |
| -## Debugging why queries are running slow |
| 61 | +## Debug why queries are running slow |
64 | 62 |
|
65 |
| -In the SQL API SDKs, Azure Cosmos DB provides query execution statistics. |
| 63 | +In the SQL API SDKs, Azure Cosmos DB provides query execution statistics. |
66 | 64 |
|
67 | 65 | ```csharp
|
68 | 66 | IDocumentQuery<dynamic> query = client.CreateDocumentQuery(
|
69 |
| - UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName), |
70 |
| - "SELECT * FROM c WHERE c.city = 'Seattle'", |
71 |
| - new FeedOptions |
72 |
| - { |
73 |
| - PopulateQueryMetrics = true, |
74 |
| - MaxItemCount = -1, |
75 |
| - MaxDegreeOfParallelism = -1, |
76 |
| - EnableCrossPartitionQuery = true |
| 67 | + UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName), |
| 68 | + "SELECT * FROM c WHERE c.city = 'Seattle'", |
| 69 | + new FeedOptions |
| 70 | + { |
| 71 | + PopulateQueryMetrics = true, |
| 72 | + MaxItemCount = -1, |
| 73 | + MaxDegreeOfParallelism = -1, |
| 74 | + EnableCrossPartitionQuery = true |
77 | 75 | }).AsDocumentQuery();
|
78 | 76 | FeedResponse<dynamic> result = await query.ExecuteNextAsync();
|
79 | 77 |
|
80 |
| -// Returns metrics by partition key range Id |
| 78 | +// Returns metrics by partition key range Id |
81 | 79 | IReadOnlyDictionary<string, QueryMetrics> metrics = result.QueryMetrics;
|
82 | 80 | ```
|
83 | 81 |
|
84 |
| -*QueryMetrics* provides details on how long each component of the query took to execution. The most common root cause for long running queries are scans (the query was unable to leverage the indexes), which can be resolved with a better filter condition. |
| 82 | +*QueryMetrics* provides details on how long each component of the query took to execute. The most common root cause of long-running queries is scans, meaning the query was unable to use the indexes. This problem can be resolved with a better filter condition. |
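
One way to read these statistics is to print a few fields for each partition key range. This is a hedged sketch rather than the article's own code. The `QueryMetricsPrinter` class and `Print` method are hypothetical names, and `result` is assumed to be a `FeedResponse<dynamic>` from a query run with `PopulateQueryMetrics = true`, as in the snippet above.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Hypothetical helper (not part of the SDK): prints a summary of the
// query metrics returned for each partition key range.
public static class QueryMetricsPrinter
{
    public static void Print(FeedResponse<dynamic> result)
    {
        foreach (KeyValuePair<string, QueryMetrics> pair in result.QueryMetrics)
        {
            QueryMetrics m = pair.Value;
            Console.WriteLine(
                "Range {0}: total time {1}, retrieved {2} docs, output {3} docs, index hit ratio {4:P0}",
                pair.Key,
                m.TotalTime,
                m.RetrievedDocumentCount,
                m.OutputDocumentCount,
                m.IndexHitRatio);
        }
    }
}
```

A retrieved document count far above the output document count, or a low index hit ratio, suggests the query is scanning rather than using the index.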
85 | 83 |
|
86 | 84 | ## Next steps
|
87 | 85 |
|
88 |
| -Now that you've learned how to monitor and debug issues using the metrics provided in the Azure portal, you may want to learn more about improving database performance by reading the following articles: |
| 86 | +You've now learned how to monitor and debug issues using the metrics provided in the Azure portal. You may want to learn more about improving database performance by reading the following articles: |
89 | 87 |
|
90 | 88 | * [Performance and scale testing with Azure Cosmos DB](performance-testing.md)
|
91 | 89 | * [Performance tips for Azure Cosmos DB](performance-tips.md)
|