Commit 25a2fd3

Merge branch 'horz-monitor-cosmosdb' of https://github.com/cdpark/azure-docs-pr into horizontals-cosmos-db
2 parents 9d53f90 + bc500a7 commit 25a2fd3

14 files changed (+72 -240 lines changed)

articles/cosmos-db/audit-restore-continuous.md

Lines changed: 2 additions & 2 deletions
@@ -9,10 +9,10 @@ ms.author: govindk
 ms.reviewer: mjbrown
 ---

-# Audit the point in time restore action for continuous backup mode in Azure Cosmos DB
+# Audit the point-in-time restore action for continuous backup mode in Azure Cosmos DB
 [!INCLUDE[NoSQL, MongoDB, Gremlin, Table](includes/appliesto-nosql-mongodb-gremlin-table.md)]

-Azure Cosmos DB provides you the list of all the point in time restores for continuous mode that were performed on an Azure Cosmos DB account using [Activity Logs](../azure-monitor/essentials/activity-log.md). Activity logs can be viewed for any Azure Cosmos DB account from the **Activity Logs** page in the Azure portal. The Activity Log shows all the operations that were triggered on the specific account. When a point in time restore is triggered, it shows up as `Restore Database Account` operation on the source account as well as the target account. The Activity Log for the source account can be used to audit restore events, and the activity logs on the target account can be used to get the updates about the progress of the restore.
+Azure Cosmos DB provides you with a list of all point-in-time restores for continuous mode that were performed on an Azure Cosmos DB account using [activity logs](monitor.md#activity-log). Activity logs can be viewed for any Azure Cosmos DB account from the **Activity Logs** page in the Azure portal. The activity log shows all the operations that were triggered on the specific account. When a point-in-time restore is triggered, it shows up as `Restore Database Account` operation on the source account as well as the target account. The activity log for the source account can be used to audit restore events, and the activity logs on the target account can be used to get the updates about the progress of the restore.

 ## Audit the restores that were triggered on a live database account

articles/cosmos-db/autoscale-per-partition-region.md

Lines changed: 4 additions & 6 deletions
@@ -54,14 +54,12 @@ This feature is available for new Azure Cosmos DB accounts. To enable this featu

 :::image type="content" source="media/autoscale-per-partition-region/enable-feature.png" lightbox="media/autoscale-per-partition-region/enable-feature.png" alt-text="Screenshot of the 'Per Region and Per Partition Autoscale' feature in the Azure portal.":::

-> [!IMPORTANT]
-> The feature is enabled at the account level, so all containers within the account will automatically have this capability applied. The feature is available for both shared throughput databases and containers with dedicated throughput. Provisioned throughput accounts must switch over to autoscale and then enable this feature, if interested.
-
-## Metrics
+> [!IMPORTANT]
+> The feature is enabled at the account level, so all containers within the account will automatically have this capability applied. The feature is available for both shared throughput databases and containers with dedicated throughput. Provisioned throughput accounts must switch over to autoscale and then enable this feature, if interested.

-Use Azure Monitor to analyze how the new autoscaling is being applied across partitions and regions. Filter to your desired database account and container, then filter or split by the `PhysicalPartitionID` metric. This metric shows all partitions across their various regions.
+1. Use [Azure Monitor metrics](monitor-reference.md#supported-metrics-for-microsoftdocumentdbdatabaseaccounts) to analyze how the new autoscaling is applied across partitions and regions. Filter to your desired database account and container, then filter or split by the `PhysicalPartitionID` metric. This metric shows all partitions across their various regions.

-Then, use `NormalizedRUConsumption' to see which partitions are scaling indpendently and which regions are scaling independently if applicable. You can use the 'ProvisionedThroughput' metric to see what throughput value is getting emmitted to our billing service.
+Then, use `NormalizedRUConsumption` to see which partitions and regions scale independently. You can use the `ProvisionedThroughput` metric to see what throughput value is emitted to our billing service.

 ## Requirements/Limitations

articles/cosmos-db/cassandra/error-codes-solution.md

Lines changed: 7 additions & 30 deletions
@@ -13,19 +13,19 @@ ms.custom: template-how-to
 # Server diagnostics for Azure Cosmos DB for Apache Cassandra
 [!INCLUDE[Cassandra](../includes/appliesto-cassandra.md)]

-Log Analytics is a tool in the Azure portal that helps you run server diagnostics on your API for Cassandra account. Run log queries from data collected by Azure Monitor Logs and interactively analyze their results. Records retrieved from Log Analytics queries help provide various insights into your data.
+Log Analytics is a tool in the Azure portal that helps you run server diagnostics on your API for Cassandra account.

 ## Prerequisites

-- Create a [Log Analytics Workspace](../../azure-monitor/logs/quick-create-workspace.md).
-- Create [Diagnostic Settings](../monitor-resource-logs.md).
+- Create a [Log Analytics workspace](../../azure-monitor/logs/quick-create-workspace.md).
+- Create [diagnostic settings](../monitor-resource-logs.md).
 - Start [log analytics](../../azure-monitor/logs/log-analytics-overview.md) on your API for Cassandra account.

 ## Use Log Analytics
 After you've completed the log analytics setup, you can begin to explore your logs to gain more insights.

 ### Explore Data Plane Operations
-Use the CDBCassandraRequests table to see data plane operations specifically for your API for Cassandra account. A sample query to see the topN(10) consuming request and get detailed information on each request made.
+Use the [CDBCassandraRequests table](/azure/azure-monitor/reference/tables/cdbcassandrarequests) to see data plane operations specifically for your API for Cassandra account. A sample query to see the topN(10) consuming request and get detailed information on each request made.

 ```Kusto
 CDBCassandraRequests
@@ -35,33 +35,10 @@ CDBCassandraRequests
 | take 10
 ```

-#### Error Codes and Possible Solutions
-|Status Code | Error Code | Description |
-|------------|----------------------|--------------|
-| 200 | -1 | Successful |
-| 400 | 8704 | The query is correct but an invalid syntax. |
-| 400 | 8192 | The submitted query has a syntax error. Review your query. |
-| 400 | 8960 | The query is invalid because of some configuration issue. |
-| 401 |8448 | The logged user does not have the right permissions to perform the query. |
-| 403 | 8448 | Forbidden response as the user may not have the necessary permissions to carry out the request. |
-| 404 | 5376 | A non-timeout exception during a write request as a result of response not found. |
-| 405 | 0 | Server-side Cassandra error. The error rarely occurs, open a support ticket. |
-| 408 | 4608 | Timeout during a read request. |
-| 408 | 4352 | Timeout exception during a write serviceRequest. |
-| 409 | 9216 | Attempting to create a keyspace or table that already exist. |
-| 412 | 5376 | Precondition failure. To ensure data integrity, we ensure that the write request based on the read response is true. A non-timeout write request exception is returned. |
-| 413 | 5376 | This non-timeout exception during a write request is because of payload maybe too large. Currently, there is a limit of 2MB per row. |
-| 417 | 9472 | The exception is thrown when a prepared statement is not cached on the server node. It should be transient/non-blocking. |
-| 423 | 5376 | There is a lock because a write request that is currently processing. |
-| 429 | 4097| Overload exception is as a result of RU shortage or high request rate. Probably need more RU to handle the higher volume request. In, native Cassandra this can be interpreted as one of the VMs not having enough CPU. We advise reviewing current data model to ensure that you do not have excessive skews that might be causing hot partitions. |
-| 449 | 5376 | Concurrent execution exception. This occurs to ensure only one write update at a time for a given row. |
-| 500 | 0 | Server cassandraError: something unexpected happened. This indicates a server-side bug. |
-| 503 | 4096 | Service unavailable. |
-| | 256 | This may be because of invalid connection credentials. Please check your connection credentials. |
-| | 10 | A client message triggered protocol violation. An example is query message sent before a startup one has been sent. |
+For a list of error codes and their possible solutions, see [Error codes](../monitor-reference.md#error-codes-for-cassandra).

 ### Troubleshoot Query Consumption
-The CDBPartitionKeyRUConsumption table contains details on request unit (RU) consumption for logical keys in each region within each of their physical partitions.
+The [CDBPartitionKeyRUConsumption table](/azure/azure-monitor/reference/tables/cdbpartitionkeyruconsumption) contains details on request unit (RU) consumption for logical keys in each region within each of their physical partitions.

 ```Kusto
 CDBPartitionKeyRUConsumption
@@ -70,7 +47,7 @@ CDBPartitionKeyRUConsumption
 ```

 ### Explore Control Plane Operations
-The CBDControlPlaneRequests table contains details on control plane operations, specifically for API for Cassandra accounts.
+The [CBDControlPlaneRequests table](/azure/azure-monitor/reference/tables/cdbcontrolplanerequests) contains details on control plane operations, specifically for API for Cassandra accounts.

 ```Kusto
 CDBControlPlaneRequests

articles/cosmos-db/cassandra/monitor-insights.md

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ The chart below shows if your application’s high RU consumption is because of

 :::image type="content" source="./media/monitor-insights/normalized-ru-pk-rangeid.png" alt-text="Screenshot showing normalized request unit consumption by partition key range ID.":::

-The chart below shows a breakdown of requests by different status code. Understand the meaning of the different codes for your [API for Cassandra codes](./error-codes-solution.md).
+The chart below shows a breakdown of requests by different status code. Understand the meaning of the different codes for your [API for Cassandra codes](../monitor-reference.md#error-codes-for-cassandra).

 :::image type="content" source="./media/monitor-insights/total-request-by-status-code.png" alt-text="Screenshot image of a graph showing the total request by status code for a cassandra api account.":::

articles/cosmos-db/concepts-limits.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ An Azure Cosmos DB container (or shared throughput database) using manual throug

 The current and minimum throughput of a container or a database can be retrieved from the Azure portal or the SDKs. For more information, see [Allocate throughput on containers and databases](set-throughput.md).

-The actual minimum RU/s may vary depending on your account configuration. You can use [Azure Monitor metrics](monitor.md#view-operation-level-metrics-for-azure-cosmos-db) to view the history of provisioned throughput (RU/s) and storage on a resource.
+The actual minimum RU/s might vary depending on your account configuration. You can use [Azure Monitor metrics](monitor.md#analyze-azure-cosmos-db-metrics) to view the history of provisioned throughput (RU/s) and storage on a resource.

 #### Minimum throughput on container

articles/cosmos-db/how-to-choose-offer.md

Lines changed: 6 additions & 7 deletions
@@ -46,9 +46,9 @@ Use the Azure Cosmos DB [capacity calculator](estimate-ru-with-capacity-planner.

 ### Existing applications ###

-If you have an existing application using standard (manual) provisioned throughput, you can use [Azure Monitor metrics](insights-overview.md) to determine if your traffic pattern is suitable for autoscale.
+If you have an existing application using standard (manual) provisioned throughput, you can use [Azure Monitor metrics](monitor-reference.md#metrics) to determine if your traffic pattern is suitable for autoscale.

-First, find the [normalized request unit consumption metric](monitor-normalized-request-units.md#view-the-normalized-request-unit-consumption-metric) of your database or container. Normalized utilization is a measure of how much you are currently using your standard (manual) provisioned throughput. The closer the number is to 100%, the more you are fully using your provisioned RU/s. [Learn more](monitor-normalized-request-units.md#view-the-normalized-request-unit-consumption-metric) about the metric.
+First, find the [normalized request unit consumption metric](monitor-normalized-request-units.md#view-the-normalized-request-unit-consumption-metric) of your database or container.

 Next, determine how the normalized utilization varies over time. Find the highest normalized utilization for each hour. Then, calculate the average normalized utilization across all hours. If you see that your average utilization is less than 66%, consider enabling autoscale on your database or container. In contrast, if the average utilization is greater than 66%, it's recommended to remain on standard (manual) provisioned throughput.

@@ -122,14 +122,13 @@ To calculate the average of the highest utilization across all hours:
 :::image type="content" source="media/how-to-choose-offer/variable-workload-highest-util-by-hour.png" alt-text="To see normalized RU consumption by hour, 1) Select time granularity to 1 hour; 2) Edit chart settings; 3) Select bar chart option; 4) Under Share, select Download to Excel option to calculate average across all hours. ":::

 ## Measure and monitor your usage
-Over time, after you've chosen the throughput type, you should monitor your application and make adjustments as needed.
+Over time, after you've chosen the throughput type, you should monitor your application and make adjustments as needed.

-When using autoscale, use Azure Monitor to see the provisioned autoscale max RU/s (**Autoscale Max Throughput**) and the RU/s the system is currently scaled to (**Provisioned Throughput**). Below is an example of a variable or unpredictable workload using autoscale. Note when there isn't any traffic, the system scales the RU/s to the minimum of 10% of the max RU/s, which in this case is 5000 RU/s and 50,000 RU/s, respectively.
+When using autoscale, use Azure Monitor to see the provisioned autoscale max RU/s (**Autoscale Max Throughput**) and the RU/s the system is currently scaled to (**Provisioned Throughput**).

-:::image type="content" source="media/how-to-choose-offer/autoscale-metrics-azure-monitor.png" alt-text="Example of workload using autoscale, with autoscale max RU/s of 50,000 RU/s and throughput ranging from 5000 - 50,000 RU/s":::
+The following example shows a variable or unpredictable workload using autoscale. Note when there isn't any traffic, the system scales the RU/s to the minimum of 10% of the max RU/s, which in this case is 5,000 RU/s and 50,000 RU/s, respectively.

-> [!NOTE]
-> When you use standard (manual) provisioned throughput, the **Provisioned Throughput** metric refers to what you as a user have set. When you use autoscale throughput, this metric refers to the RU/s the system is currently scaled to.
+:::image type="content" source="media/how-to-choose-offer/autoscale-metrics-azure-monitor.png" alt-text="Example of workload using autoscale, with autoscale max RU/s of 50,000 RU/s and throughput ranging from 5000 - 50,000 RU/s":::

 ## Next steps
 * Use [RU calculator](https://cosmos.azure.com/capacitycalculator/) to estimate throughput for new workloads.

articles/cosmos-db/integrated-cache.md

Lines changed: 1 addition & 13 deletions
@@ -140,19 +140,7 @@ The integrated cache has a limited storage capacity determined by the dedicated

 ## Metrics

-It's helpful to monitor some key metrics for the integrated cache. These metrics include:
-
-- `DedicatedGatewayCPUUsage` - CPU usage with Avg, Max, or Min Aggregation types for data across all dedicated gateway nodes.
-- `DedicatedGatewayAverageCPUUsage` - (Deprecated) Average CPU usage across all dedicated gateway nodes.
-- `DedicatedGatewayMaximumCPUUsage` - (Deprecated) Maximum CPU usage across all dedicated gateway nodes.
-- `DedicatedGatewayMemoryUsage` - Memory usage with Avg, Max, or Min Aggregation types for data across all dedicated gateway nodes.
-- `DedicatedGatewayAverageMemoryUsage` - (Deprecated) Average memory usage across all dedicated gateway nodes.
-- `DedicatedGatewayRequests` - Total number of dedicated gateway requests across all dedicated gateway nodes.
-- `IntegratedCacheEvictedEntriesSize` – The average amount of data evicted from the integrated cache due to LRU across all dedicated gateway nodes. This value doesn't include data that expired due to exceeding the `MaxIntegratedCacheStaleness` time.
-- `IntegratedCacheItemExpirationCount` - The average number of items that are evicted from the integrated cache due to cached point reads exceeding the `MaxIntegratedCacheStaleness` time across all dedicated gateway nodes.
-- `IntegratedCacheQueryExpirationCount` - The average number of queries that are evicted from the integrated cache due to cached queries exceeding the `MaxIntegratedCacheStaleness` time across all dedicated gateway nodes.
-- `IntegratedCacheItemHitRate` – The proportion of point reads that used the integrated cache (out of all point reads routed through the dedicated gateway with session or eventual consistency). This value is an average of integrated cache instances across all dedicated gateway nodes.
-- `IntegratedCacheQueryHitRate` – The proportion of queries that used the integrated cache (out of all queries routed through the dedicated gateway with session or eventual consistency). This value is an average of integrated cache instances across all dedicated gateway nodes.
+It's helpful to monitor some key `DedicatedGateway` and `IntegratedCache` metrics for the integrated cache. To learn about these metrics, see [Supported metrics for Microsoft.DocumentDB/DatabaseAccounts](monitor-reference.md#supported-metrics-for-microsoftdocumentdbdatabaseaccounts).

 All existing metrics are available, by default, from **Metrics** in the Azure portal (not Metrics classic):
